The Database CI/CD Best Practice with GitHub


Database change is a tricky part of the application development process: it usually involves multiple databases across different environments and cross-team collaboration. On top of that, databases are touch and go. It got us thinking: can we treat databases the same way we treat application code?

DORA (DevOps Research & Assessment) pointed out that integrating database work into the software delivery process positively contributes to continuous delivery. It’s about time to make databases a part of the CI/CD cycle.

But how does it work, really?

Critical Elements of Database CI/CD

To answer the "how", we first need to sort out the typical database change workflow. Before SQL statements can be safely applied to the database, there are two key steps: review & change.

SQL Review

This step is to make sure that the changes:

Implement the business logic accurately;
Follow database design best practices;

Here, the devs are generally responsible for the former task and the DBAs for the latter. The DevOps philosophy looks to solve this split by integrating Ops and Devs. The reality is that when an organization has DBAs, it is difficult to merge the two teams directly. One potential solution is to retain the DBA’s task while enabling dev teams to pre-review the SQL. This shift-left approach can significantly reduce the chance of release delays. Moreover, if there are no DBAs in an organization, it is even more crucial to give the dev team the capability to make sure the SQL doesn’t wreak havoc on the database.

SQL Change Execution

This step is to make sure that:

Statements are executed correctly. We don’t want wrong database connections, insufficient permissions, object name conflicts, or basic syntax errors on our hands.
All planned statements are executed. Omissions may occur when there are many scripts to be executed or if there are multiple target databases for batch execution.
The change execution process does not impact the business. Hardware resource exhaustion and locking a table for an extended period are not pleasant for the company.

To avoid change-related errors, reducing the manual aspects is also crucial: the more things are automated, the fewer chances for mistakes to happen. Pre-configured pipelines to automatically apply SQL to the databases? That sounds rad. In order to avoid affecting regular business operations negatively, various zero-downtime change techniques should be adopted, especially for databases with large datasets.
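
One way to catch the omissions mentioned above is to compare what was planned against what was reported as executed. The following is a minimal sketch in Python, not part of Bytebase; the function name, script names, and database names are all hypothetical.

```python
from itertools import product


def find_missing_executions(planned_scripts, target_databases, executed):
    """Return (script, database) pairs that were planned but never executed.

    `executed` is an iterable of (script, database) pairs reported as done.
    """
    # Every planned script is expected to run against every target database.
    planned = set(product(planned_scripts, target_databases))
    return sorted(planned - set(executed))


# Hypothetical batch: two scripts, two target databases, one execution skipped.
scripts = ["001_create_orders.sql", "002_add_index.sql"]
databases = ["orders_prod", "orders_test"]
done = {
    ("001_create_orders.sql", "orders_prod"),
    ("001_create_orders.sql", "orders_test"),
    ("002_add_index.sql", "orders_test"),
}

for script, database in find_missing_executions(scripts, databases, done):
    print(f"not executed: {script} on {database}")
```

A pipeline that tracks this automatically removes the need for anyone to tick off scripts by hand.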

Thus, the critical elements for implementing Database CI/CD are enabling dev teams to perform SQL reviews and streamlining the SQL change rollout.

SQL Review and Change Rollout with VCS Integration

Let’s first explore how to enable the dev teams to perform SQL reviews themselves.

Very few developers are experts at reviewing SQL statements for "architectural correctness", and even for senior DBAs, manual checks can be highly inefficient and error-prone. Fortunately, the industry has created various automated review tools by integrating different SQL check specifications.

However, these tools have one common problem - they are all designed for the DBAs. On the one hand, these tools often require higher operational privileges to the databases and are thus not suitable to be used directly by developers. On the other hand, developers have their IDE, and a separate external arbitrator is the last thing they need. Imagine how bad it would be when you have to copy & paste code between multiple tools.

So what should a developer-friendly SQL review tool look like?

We usually perform the traditional code review process on version control systems (VCS), and the same should apply to SQL. Therefore, SQL review tools should be integrated into the code review workflow. Bytebase has made SQL Review Action available on the GitHub Marketplace; once enabled, SQL Review Action is triggered whenever you submit a PR on GitHub.

Let’s look at how to implement streamlined SQL change rollouts.

Standalone SQL deployment tools are not uncommon. With these tools, you typically upload SQL scripts manually, proceed with the deployment via an approval flow, and then get feedback after the rollout is complete. This model accurately depicts how the developers and the DBAs work independently, and the fragmented process is one of the most common reasons for delayed releases. After all, who can guarantee that there will never be a mistake when you are constantly moving SQL scripts between multiple systems manually?

We need a more efficient and automated release process. Let’s recall the classic CI/CD workflow for application code: commit changes > code review > merge branch > auto-build > auto-deploy. Since we’ve already implemented SQL review on GitHub Actions, why can’t we include the subsequent rollout process?

Well, yes, we can!

A SQL change rollout tool for Database CI/CD should have the ability to integrate with VCS. Once your SQL scripts have been vetted and merged into the target branch, the release process is triggered, and the scripts are automatically pushed to Bytebase. Of course, the DBA can perform another sanity check before executing the SQL against the target database.

A Complete Database CI/CD Workflow

But how does it work, really? Here, we present our envisioned Database CI/CD workflow.

[Figure: database DevOps workflow]

The developer creates a Pull Request containing the SQL script;
SQL Review CI is automatically triggered to review SQL and offers suggestions via a comment in the PR;
The team leader or another peer on the dev team approves the change and merges the SQL script into the watched branch (default is the main branch);
The merge event automatically triggers the rollout pipeline in Bytebase and creates a ticket capturing the intended change;
(Optional) An approval flow is automatically matched based on the change risk and followed via Bytebase’s built-in UI;
Approved scripts are executed gradually according to the configured rollout stages;
When the rollout is completed, GitHub Actions may get notified and proceed to deploy the application.
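
The trigger in the merge step above can be pictured as a small filter: only SQL files under the watched directory, merged into the watched branch, start a rollout. The sketch below is a rough illustration in Python under stated assumptions. The event dictionary is a simplified, hypothetical shape, not GitHub's webhook payload and not Bytebase's implementation; the main branch and the bytebase/ directory are the ones this tutorial uses.

```python
from fnmatch import fnmatchcase


def sql_files_to_roll_out(event, watched_branch="main", pattern="bytebase/*.sql"):
    """Pick the files from a merge event that should start a rollout.

    `event` is a simplified, hypothetical dict such as:
        {"branch": "main", "added": [...], "modified": [...]}
    """
    # Merges into any other branch are ignored.
    if event.get("branch") != watched_branch:
        return []
    changed = event.get("added", []) + event.get("modified", [])
    # Note: in fnmatch patterns "*" also matches "/", so nested paths match too.
    return sorted(path for path in changed if fnmatchcase(path, pattern))


merge_event = {
    "branch": "main",
    "added": ["bytebase/2024092500_create_table_t2024.sql", "README.md"],
    "modified": [],
}
print(sql_files_to_roll_out(merge_event))
```

Files outside the watched directory, such as README.md here, never reach the rollout pipeline.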

Set Up Database CI/CD with GitHub in Bytebase (Free Plan)

Here's a step-by-step tutorial on how to set up this Database CI/CD with GitHub in Bytebase.

Step 1 - Run Bytebase in Docker and set the External URL generated by ngrok

ngrok is a reverse proxy tunnel. In our case, we need it to provide a public network address so that the VCS can call the Bytebase API. We use ngrok here for demonstration purposes only.

[Figure: ngrok reverse proxy]

Run Bytebase in Docker with the following command:

```
docker run --rm --init \
  --name bytebase \
  --publish 8080:8080 --pull always \
  --volume ~/.bytebase/data:/var/opt/bytebase \
  bytebase/bytebase:3.5.1
```

Once Bytebase is running in Docker, you can visit it via localhost:8080. Register an admin account, and it will be granted the workspace admin role automatically.
Log in to the ngrok Dashboard and complete the Getting Started steps to install and configure ngrok. If you want to use the same domain each time you launch ngrok, go to Cloud Edge > Domains, where you'll find the domain <<YOURS>>.ngrok-free.app linked to your account.
Run the command ngrok http --domain=<<YOURS>>.ngrok-free.app 8080 to start ngrok with your specific domain.
Log in to Bytebase and click the gear icon (Settings) on the top right. Click General under Workspace. Paste <<YOURS>>.ngrok-free.app as the External URL under the Network section and click Update.
Now you can access Bytebase via <<YOURS>>.ngrok-free.app.

Step 2 - Add GitHub.com as a Git provider in Bytebase

Visit Bytebase via your ngrok URL. Click CI/CD > GitOps, choose GitHub.com.
Follow the fine-grained personal access token link to https://github.com/settings/tokens?type=beta and click Generate new token. Fill in the fields, select the following permissions, and click Generate token.
- Contents (Read-only)
- Metadata (Read-only)
- Pull requests (Read and write)
- Webhooks (Read and write)
Copy the generated token, go back to Bytebase, and paste the token into the field and click Confirm and add. The provider is added successfully.

Step 3 - Configure a GitOps Workflow in Bytebase

Go to GitHub and create a new repository bb-gitops-2024.
Go to Bytebase and open the Sample Project. Click Integration > GitOps on the left and click Add GitOps connector. Choose GitHub.com (the Git provider you just configured) and xxxx/bb-gitops-2024 (the repository you just created).
Keep the default settings for the remaining fields and click Finish. The GitOps connector is created successfully.

Step 4 - Configure SQL Review in Prod

Go to CI/CD > SQL Review in workspace, choose Prod as the environment. Make sure a SQL review policy is attached and enabled on Prod.
Click Edit, then click the PostgreSQL tab. Make sure Enforce "NOT NULL" constraints on columns is enabled. The example in the next step relies on this rule to trigger a SQL Review warning.
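
To make the idea of such a rule concrete, here is a toy sketch in Python of what a NOT NULL check could look like. It is not Bytebase's implementation, and the function name is hypothetical. The check is deliberately literal: it flags every column whose definition lacks the words NOT NULL, even though PostgreSQL already treats PRIMARY KEY columns as non-null, and it only handles simple statements with no commas inside a column definition and no table-level constraints.

```python
import re


def columns_missing_not_null(create_table_sql):
    """Return the column names in a simple CREATE TABLE that lack NOT NULL.

    Toy parser: one column definition per comma-separated item, no commas
    inside a definition (e.g. numeric(10, 2)), no table-level constraints.
    """
    # Grab everything between the first "(" and the last ")".
    match = re.search(r"\((.*)\)", create_table_sql, flags=re.DOTALL)
    if match is None:
        return []
    missing = []
    for definition in match.group(1).split(","):
        definition = definition.strip()
        if not definition:
            continue
        # The first token of a column definition is its (possibly quoted) name.
        name = definition.split()[0].strip('"')
        if "NOT NULL" not in definition.upper():
            missing.append(name)
    return missing


statement = '''
CREATE TABLE "public"."t2024" (
      "id" integer PRIMARY KEY,
      "name" text
);
'''
print(columns_missing_not_null(statement))
```

A real review engine parses the SQL properly and reports findings with a severity level, such as the warning level used in this tutorial.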

Step 5 - Create a Pull Request and Trigger SQL Review CI

Go to bb-gitops-2024 on GitHub.com. Add a new file 2024092500_create_table_t2024.sql under bytebase/, the directory configured for the GitOps connector in Step 3. Copy the following SQL script into the file and commit the change to a new branch, which will then create a Pull Request.
```
CREATE TABLE "public"."t2024" (
      "id" integer PRIMARY KEY,
      "name" text
);
```
After a short wait, a SQL Review comment is added to the PR. As configured in the previous step, NOT NULL is a warning-level SQL Review rule.

Edit the SQL file as follows, commit it on the same branch, and merge the PR.

```
CREATE TABLE "public"."t2024" (
      "id" integer NOT NULL PRIMARY KEY,
      "name" text NOT NULL
);
```

There will be a new comment saying the PR has triggered a Bytebase rollout.
Follow the link to go to Bytebase. The issue has two stages because this project has two databases and, by default, the SQL is applied to all databases within the project. If you had merged the previous version of the SQL script, the SQL Review task run here would show a yellow warning and wait for rollout. Click Resolve to resolve the issue.
After the rollout completes, click View change to see the diff.
You may also go to a specific database page to view all its change history.
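
The file name used in this step starts with a numeric prefix followed by a description. As a rough sketch, and assuming that the leading digits act as a version and that files are meant to apply in ascending version order (an assumption made for illustration, not a rule stated in this tutorial), the ordering could be computed like this in Python. The helper names and the second file name are hypothetical.

```python
import re

# Hypothetical convention: <digits>_<description>.sql
VERSION_PATTERN = re.compile(r"^(\d+)_(.+)\.sql$")


def parse_migration_filename(filename):
    """Split a file name into (version, description) or raise ValueError."""
    match = VERSION_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected migration file name: {filename}")
    return match.group(1), match.group(2)


def order_migrations(filenames):
    """Sort file names by their numeric version prefix, lowest first."""
    return sorted(filenames, key=lambda name: int(parse_migration_filename(name)[0]))


files = [
    "2024092501_add_city_column.sql",  # hypothetical follow-up change
    "2024092500_create_table_t2024.sql",
]
print(parse_migration_filename("2024092500_create_table_t2024.sql"))
print(order_migrations(files))
```

Sorting on the integer value rather than the raw string keeps the order correct even if prefixes differ in length.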

Advanced Features (Enterprise Plan)

You may upgrade to the Enterprise plan to explore more features.

Manual Rollout

Go to Environments > Prod, find the Rollout policy section, and choose Manual rollout > Require rolling out by dedicated roles.

[Figure: manual rollout policy for the Prod environment]

Custom Approval

Go to CI/CD > Custom Approval. Set Project Owner -> DBA as Approval flow for DDL > High Risk.
Go to CI/CD > Risk Center. Click Add rule and click Load for the first template. Click Add.
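
Conceptually, these two settings work together: a risk rule assigns a level to each change, and the approval flow is looked up from the change type and that level. The Python sketch below illustrates the idea only. The risk rule (DDL in Prod counts as high risk) is a hypothetical stand-in, since the content of the template rule is not described here, and none of the names reflect Bytebase's internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    environment: str      # e.g. "Prod" or "Test"
    statement_type: str   # e.g. "DDL" or "DML"


def risk_level(change):
    """Hypothetical risk rule: DDL executed in Prod counts as high risk."""
    if change.environment == "Prod" and change.statement_type == "DDL":
        return "HIGH"
    return "LOW"


# Mirrors the setting above: DDL > High Risk uses Project Owner -> DBA.
APPROVAL_FLOWS = {("DDL", "HIGH"): ["Project Owner", "DBA"]}


def approval_flow(change):
    """Return the ordered approvers for a change, or [] if none is required."""
    return APPROVAL_FLOWS.get((change.statement_type, risk_level(change)), [])


print(approval_flow(Change("Prod", "DDL")))
print(approval_flow(Change("Test", "DDL")))
```

Changes that match no configured flow fall through with an empty approver list, which is why low-risk changes skip the approval stage in this sketch.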

Schema Drift

Bytebase has built-in schema drift detection to detect unexpected schema changes. Let's use the SQL Editor Admin Mode to simulate this.

Click the terminal icon (SQL Editor) on the top right. You'll be redirected to the SQL Editor. Click Admin mode. Everything you do in this mode is the same as connecting directly to the server, so it is not recorded by Bytebase.

Select (Prod) employee on the left, and paste and run the following script:

```
ALTER TABLE "public"."employee"
    ADD COLUMN "city" text NOT NULL DEFAULT '';
```

Go back to the Bytebase Console and click Databases > employee under Prod. Click Sync Now. After seeing the success message, refresh the page. You'll see the schema drift. You may configure auto scan on the instance detail page to avoid manual syncs.
Go to Database > Anomalies, and you'll see the Schema drift there too.
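
At its core, drift detection compares the schema a tool last recorded with the schema that is live in the database. The following Python sketch shows that comparison on plain dictionaries. It is an illustration only, not how Bytebase implements the feature; the function name and the columns other than city are hypothetical.

```python
def detect_schema_drift(recorded, live):
    """Compare two {table: {column: type}} snapshots and list the differences."""
    drift = []
    for table in sorted(set(recorded) | set(live)):
        before = recorded.get(table, {})
        after = live.get(table, {})
        # Columns present in the live database but not in the recorded schema.
        for column in sorted(set(after) - set(before)):
            drift.append(f"{table}: unexpected column {column} ({after[column]})")
        # Columns that were recorded but no longer exist.
        for column in sorted(set(before) - set(after)):
            drift.append(f"{table}: missing column {column}")
        # Columns present in both snapshots whose type differs.
        for column in sorted(set(before) & set(after)):
            if before[column] != after[column]:
                drift.append(
                    f"{table}: column {column} changed "
                    f"from {before[column]} to {after[column]}"
                )
    return drift


# Hypothetical snapshots; only the city column comes from the ALTER TABLE above.
recorded_schema = {"employee": {"emp_no": "integer", "first_name": "text"}}
live_schema = {"employee": {"emp_no": "integer", "first_name": "text", "city": "text"}}
print(detect_schema_drift(recorded_schema, live_schema))
```

A change made in Admin mode bypasses the recorded history, so it only shows up once the live schema is synced and compared.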

Summary

Now with Bytebase, you have a complete Database CI/CD workflow with GitHub. You can apply this workflow to your own project and customize it to fit your needs. If you have any questions, feel free to join the discussion on Discord.