Database CI/CD and Schema Migration with ClickHouse
A series of articles about Database CI/CD and Schema Migration with ClickHouse.
- Database CI/CD and Schema Migration with ClickHouse (this one)
- Database CI/CD and Schema Migration with ClickHouse and GitHub
ClickHouse is a fast open-source column-oriented database management system that allows generating analytical data reports in real-time using SQL queries.
This tutorial will guide you step-by-step to set up database change management for ClickHouse (Cloud and self-managed) in Bytebase. With Bytebase, a team can have a formalized review and rollout process to make ClickHouse schema change and data change.
You’ll have a GUI and the full change history. You can use Bytebase free version to finish the tutorial.
There is also a bonus section about drift detection for those advanced users if needed.
Features included
- Change Workflow
- Change History
- Drift Detection
Prerequisites
Before you start this tutorial, make sure:
- You have either of
- a ClickHouse Cloud account
- a self-managed ClickHouse running. You can follow How to Run ClickHouse with Docker.
- You have Docker installed locally.
Step 1 - Start Bytebase in Docker
-
Make sure your docker daemon is running, and start the Bytebase docker container.
-
Bytebase is running successfully in Docker, and you can visit it via
localhost:8080
. -
Visit localhost:8080 in your browser. Register the first admin account which will be granted
Workspace Admin
.
Step 2 - Add ClickHouse in Bytebase
In Bytebase, an Instance could be your on-premises MySQL instance, an AWS RDS instance etc, in this tutorial, a ClickHouse Cloud account or a self-managed instance.
-
Visit
localhost:8080
and log in asWorkspace Admin
. -
Click Add Instance.
-
Add a ClickHouse instance and click Create.
Pay attention to some fields:
Environment: choose
Test
, if you chooseProd
, issues will wait for approval by default. In this tutorial, we try to keep it simple. However, it’s all configurable.If you use Cloud version. Go to your ClickHouse Cloud account, and click View connection string.
Host or Socket and Port are in the grey box as parameters.
Username and Password are generated and stored in
clickhouse_credentials.txt
while you registered the ClickHouse Cloud account.Be sure to allow access to this service from
Anywhere
.Choose
CA Certificate
for SSL Connection, if you use macOS, open the file/etc/ssl/cert.pem
, copy the content as a whole then paste it into the CA Certificate field box. Click Test Connection to verify it’s working. -
If you use self-managed version of ClickHouse, fill in instance form as the following image and click Create.
Step 3 - Create a project with ClickHouse instance
In Bytebase, Project is the container to group logically related Databases, Issues and Users together, which is similar to the project concept in other dev tools such as Jira, GitLab. So before you deal with the database, a project must be created.
-
After the instance is created, click Projects on the top bar.
-
Click New Project to create a new project
TestClickHouse
, key isTCH
, mode isstandard
. Click Create.
Step 4 - Create a database in ClickHouse via Bytebase
In Bytebase, a Database is the one created by 'CREATE DATABASE xxx'. A database always belongs to a single Project. Issue represents a specific collaboration activity between Developer and DBA such as creating a database, altering a schema. It's similar to the issue concept in other issue management tools.
-
After the project is created. Click New DB on the project top bar.
-
Fill the form with Name -
db_demo
, Environment -Test
, and Instance -ClickHouse instance
. Click Create. -
It will create an issue “CREATE DATABASE ….” automatically. Because it’s for
Test
environment, the issue will run without waiting for your approval by default. Click Resolve, and the issue isDone
. The database is created. -
Go back to the home page by clicking Home on the left sidebar. On the home page, you can see the project, the database, and the issue you just resolved.
Step 5 - Create a table in ClickHouse via Bytebase
In Step 4, you created an issue to create a database via UI workflow and then executed it. Let’s try to create another issue to alter that database.
-
Go to project
TestClickHouse
, and click Alter Schema. -
Choose
db_demo
and click Next. It could generate a pipeline if you have different databases for different environments. -
Input SQL as follows, and click Create.
-
Bytebase will do some basic checks and then execute the SQL. Since it’s for
Test
environment, the issue is automatically approved by default. Click Resolve issue, and the issue status will becomeDone
. -
On the issue page, click View change. You will see diff for the change.
-
You can also go to Change History under the project to view the full history. Or go into a specific database to view its history.
Bonus Section - Drift Detect
This section requires you to have Enterprise Plan (you can start 14 days trial directly in the product without credit card).
Now you can see the full change history of db_demo
. However, what is Establish new baseline? When should it be used?
By adopting Bytebase, we expect teams to use Bytebase exclusively for all schema changes. Meanwhile, if someone has made ClickHouse schema change out side of Bytebase, obviously Bytebase won’t know it. And because Bytebase has recorded its own copy of schema, when Bytebase compares that with the live schema having that out-of-band schema change, it will notice a discrepancy and surface a schema drift anomaly. If that change is intended, then you should baseline the schema state again to reconcile.
In this section, you’ll be guided through this process.
-
Go to ClickHouse Cloud, click Open SQL console, and add a column
age
there. Make sure the new column is added. -
Wait for about 10 mins for Bytebase to detect the drift. Go back to Bytebase, and you can find the Schema Drift on:
database db_demo
Anomaly Center
-
Click View diff, you will see the exact drift.
-
Go to
db_demo
> Change History and click Establish new baseline to reconcile the schema. -
It will create a baseline issue. Click Resolve to mark it done.
-
Go back to
db_demo
or Anomaly Center, and you will find the drift is gone.
Summary and Next
Now you have connected ClickHouse with Bytebase, and tried out the UI workflow to do schema change. Bytebase will record the full change history for you. With Enterprise Plan, you can even have drift detection.
In the next article, you’ll try out GitOps workflow, which will store your ClickHouse schema in GitHub and trigger the change upon committing the change to the repository, to bring your ClickHouse change workflow to the next level, aka Database DevOps - Database as Code.