Snowflake offers scalable compute and storage resources, with a SQL-based interface for data manipulation and analysis. However, when integrating Continuous Integration/Continuous Deployment (CI/CD) processes with Snowflake, especially in comparison to its querying capabilities in SnowSight, several shortcomings or challenges can be identified.
No streamlined review and rollout process
A typical change workflow in Snowflake:
- A data engineer creates a schema change ticket in Jira.
- The Snowflake admin reviews the ticket, and then uses SnowSight to apply the change to the testing instance.
- The data engineer verifies the change and replies to the ticket to request the admin to apply the change to the production instance.
- The Snowflake admin uses SnowSight to apply the change to the production environment.
- The data engineer verifies the change and closes the ticket.
The above process involves back-and-forth and is error-prone. e.g. What if the admin mistakenly applies the change to prod first?
No automated SQL lint
Modern CI pipeline has adopted automatic review checks. For Snowflake, it's especially important, since dropping a column might break the downstream data pipeline. The snowflake platform doesn't provide such lint.
GitOps
Snowflake recently announced Git integration:
This includes such features as Git integration (private preview), which provides easy integration of application code with git and git workflow. Users can view, run, edit, and collaborate with assets that exist in a Git repo, right inside of Snowflake.
It would be even better to see Vercel-like experience, where when a change script is merged into a branch, it will automatically trigger a rollout pipeline with optional approval flow.
Meet Bytebase
To tackle the CI/CD challenges, Snowflake creates schemachange. The other solution is Bytebase (schemachange vs. Bytebase).
Web-based streamlined review and rollout process
Bytebase provides a review and rollout interface. It's similar to Jira, but is specifically tailored for doing database rollout. A staged rollout to propagate changes from test to prod instance.
Record schema change history.
Detect schema drift caused by unexpected change.
SQL Review and API
Bytebase provides automatic SQL review rules to detect Snowflake SQL anti-patterns. Once configured, the SQL review will be activated during the review process above. Moreover, one can call the Bytebase API from the VCS CI pipeline.
Inline Check in GitHub PR.
Summary
Bytebase brings the DevOps methodology and GitHub/GitLab-like experience to the Snowflake world. To learn more, check out the step-by-step tutorial.