Data Masking with Classification Levels
Bytebase is an open-source database DevSecOps solution for Developer, Security, DBA, and Platform Engineering teams, positioned as the GitLab for database DevSecOps.
This tutorial guides you through setting up data classification and masking using Bytebase's API.
By the end of this tutorial, you will have accomplished the following:
Prerequisites
- Docker installed
- Download the api-example repository; you'll only need the data-classification folder for this tutorial
Overview
This demo app simulates fetching data from databases connected to Bytebase and setting classification levels on it. Combined with the global masking configuration, each classification level then maps to a different degree of data masking.
Workflow
- Run a Bytebase instance and add a service account user
- Import the classification
- Configure the data masking based on the classification level
- Configure the environment variables and run the demo app
- In the demo app, set the classification and see the data masking result
Run a Bytebase instance and add a service account user
- Start Bytebase via Docker and register an account, which will be granted the Workspace Admin role. You'll need an API service account user too:
  - Go to IAM&Admin > Users&Groups and click +Add User.
  - Choose Service Account as the Type, fill in the Email with api-sample@service.bytebase.com, choose Workspace DBA as Roles, and click Confirm.
  - Copy the Service Key for later use.
Import the classification
- Go to Data Access > Data Classification and click Upload classification.
- Upload the /public/classification.json file from the data-classification folder. The imported classification then appears on the page.
Configure the data masking based on the classification level
There are two ways to configure data masking based on the classification level: via the UI or via the API.
UI
- Go to Data Access > Data Masking and click Add.
- Give it a Condition Name, e.g. Partial masking for Level 1, and click Add Condition.
- Here we only care about data in the production environment, so set Environment ID equals prod.
- Add another condition with the AND operator, and set Classification Level in Level 1.
- Choose Partial as the Masking Level and click Confirm.
- In the same way, add another masking rule for Level 2 with the Masking Level set to Full.
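The two rules configured in the UI can be pictured as data plus a matching function. The following is a minimal sketch of that logic only, not Bytebase's implementation: the names MaskingRule and resolveMaskingLevel are hypothetical, and it assumes that a column matching no rule is left unmasked.

```typescript
// Hypothetical model of the two masking rules from the UI steps.
type MaskingLevel = "NONE" | "PARTIAL" | "FULL";

interface MaskingRule {
  name: string;
  environmentId: string;          // "Environment ID equals ..."
  classificationLevels: string[]; // "Classification Level in ..."
  maskingLevel: MaskingLevel;
}

const rules: MaskingRule[] = [
  {
    name: "Partial masking for Level 1",
    environmentId: "prod",
    classificationLevels: ["Level 1"],
    maskingLevel: "PARTIAL",
  },
  {
    name: "Full masking for Level 2",
    environmentId: "prod",
    classificationLevels: ["Level 2"],
    maskingLevel: "FULL",
  },
];

// Return the masking level of the first rule whose two conditions both hold (AND).
// Assumption: when no rule matches, the column is not masked.
function resolveMaskingLevel(
  allRules: MaskingRule[],
  environmentId: string,
  classificationLevel: string,
): MaskingLevel {
  const matches = allRules.filter(function (rule) {
    return (
      rule.environmentId === environmentId &&
      rule.classificationLevels.indexOf(classificationLevel) !== -1
    );
  });
  return matches.length > 0 ? matches[0].maskingLevel : "NONE";
}
```

Under this model, a Level 2 column queried in prod resolves to full masking, while the same column in any other environment matches no rule.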
API
- Find the data masking configuration file within the data-security repository.
- Generate a token for the service account user.
- Import the data masking configuration.
- Log in to the Bytebase console and go to Data Access > Data Masking. You'll see that data masking is configured.
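The token and import steps can be sketched as two request builders. This is a sketch under stated assumptions, not a verified recipe: it assumes that the login endpoint is POST /v1/auth/login, that the service key is sent as the password, and that the global masking rule is written with a PATCH to the policy path shown. The helper names are hypothetical, and the policy path is a parameter so it can be adjusted after checking the API reference for your Bytebase version.

```typescript
// Plain description of an HTTP request, independent of any HTTP client.
interface HttpRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Join a base URL and a path without producing a double slash.
function joinUrl(baseUrl: string, path: string): string {
  return baseUrl.replace(/\/+$/, "") + path;
}

// Assumption: POST /v1/auth/login accepts { email, password },
// with the service key used as the password.
function buildLoginRequest(baseUrl: string, email: string, serviceKey: string): HttpRequest {
  return {
    url: joinUrl(baseUrl, "/v1/auth/login"),
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ email: email, password: serviceKey }),
  };
}

// Assumption: this is where the global masking rule policy is stored.
const ASSUMED_MASKING_POLICY_PATH =
  "/v1/policies/masking_rule?allow_missing=true&update_mask=payload";

// Build the request that uploads the masking configuration file contents.
function buildImportMaskingRequest(
  baseUrl: string,
  token: string,
  maskingConfigJson: string,
  policyPath: string = ASSUMED_MASKING_POLICY_PATH,
): HttpRequest {
  return {
    url: joinUrl(baseUrl, policyPath),
    method: "PATCH",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer " + token,
    },
    body: maskingConfigJson,
  };
}

// Usage with fetch (not executed here):
//   const login = buildLoginRequest(bytebaseUrl, "api-sample@service.bytebase.com", serviceKey);
//   const res = await fetch(login.url, login);
//   const token = (await res.json()).token; // assumption: the response carries a `token` field
```

Keeping the builders free of network calls makes them easy to inspect before anything is sent to a real workspace.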
Configure the environment variables and run the data-classification demo app
- Go to the data-classification folder of the api-example repository and copy the env-template.local file to .env.local. Replace the placeholders with your own values.
- Run pnpm i and pnpm run dev. The demo app is then available locally at localhost:3000.
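A quick check for placeholders that were left empty can save a confusing first run. The sketch below is illustrative only: the variable names in REQUIRED_VARS are hypothetical, so substitute the names that actually appear in env-template.local.

```typescript
// Hypothetical variable names; replace them with the ones in env-template.local.
const REQUIRED_VARS: string[] = ["BB_HOST", "BB_SERVICE_ACCOUNT", "BB_SERVICE_KEY"];

// Return the names of required variables that are unset or blank.
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: string[],
): string[] {
  return required.filter(function (name) {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Usage in Node (not executed here):
//   const missing = missingEnvVars(process.env, REQUIRED_VARS);
//   if (missing.length > 0) throw new Error("Missing env vars: " + missing.join(", "));
```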
Set the classification and see the data masking result
- In the demo app, select a table; here we select salary. You can also set a classification level for the table itself, but we skip that here.
- Choose classification 1-4 Health [Level 2] for the amount column and 1-1 Basic [Level 1] for the from_date column.
- Go to the Bytebase SQL Editor and double-click the salary table. You'll see the data is masked accordingly.
Code explanation
Fetch database schema and classification
- Bytebase lets you set classification at the table and column level, so the first step is to fetch the database schema.
- Use the API /v1/instances/${instance}/databases/${database}/metadata to fetch the database schema information. In this demo, the instance is hardcoded as test-sample-instance and the database as test-sample-database.
- The metadata response includes the database schema under schemas. It also includes the classification information under schemaConfigs.
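As a rough illustration of these steps, the sketch below builds the metadata URL and reads a column's classification out of schemaConfigs. The nested shape used here (tableConfigs, columnConfigs, classificationId) is an assumption about the response, simplified for the example, and the function names are hypothetical.

```typescript
// Assumed, simplified shape of the classification part of the metadata response.
interface ColumnConfig {
  name: string;
  classificationId?: string;
}
interface TableConfig {
  name: string;
  classificationId?: string;
  columnConfigs?: ColumnConfig[];
}
interface SchemaConfig {
  name: string;
  tableConfigs?: TableConfig[];
}
interface DatabaseMetadata {
  schemas?: unknown[];
  schemaConfigs?: SchemaConfig[];
}

// Build the metadata endpoint URL from the path given in the tutorial.
function metadataUrl(baseUrl: string, instance: string, database: string): string {
  return (
    baseUrl.replace(/\/+$/, "") +
    "/v1/instances/" + encodeURIComponent(instance) +
    "/databases/" + encodeURIComponent(database) +
    "/metadata"
  );
}

// Look up the classification id assigned to one column, if any.
function getColumnClassification(
  metadata: DatabaseMetadata,
  schema: string,
  table: string,
  column: string,
): string | undefined {
  for (const s of metadata.schemaConfigs || []) {
    if (s.name !== schema) continue;
    for (const t of s.tableConfigs || []) {
      if (t.name !== table) continue;
      for (const c of t.columnConfigs || []) {
        if (c.name === column) return c.classificationId;
      }
    }
  }
  return undefined;
}
```

A column that has never been classified simply has no entry, so the lookup returns undefined rather than failing.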
Update the schema with classification
- To update the schema with classification, use the API /v1/instances/${instance}/databases/${database}/metadata with the PATCH method.
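One way to prepare the PATCH body is to copy the fetched schemaConfigs and set the classification on a single column. The sketch below does only that transformation; the config shape and the helper names (upsertByName, setColumnClassification) are assumptions for illustration, and any extra query parameters the endpoint may require are not covered.

```typescript
// Assumed, simplified config shape (arrays are required here for brevity).
interface ColumnConfig {
  name: string;
  classificationId?: string;
}
interface TableConfig {
  name: string;
  columnConfigs: ColumnConfig[];
}
interface SchemaConfig {
  name: string;
  tableConfigs: TableConfig[];
}

// Copy a list, updating the item with the given name or appending a new one.
function upsertByName<T extends { name: string }>(
  items: T[],
  name: string,
  update: (item: T) => T,
  create: () => T,
): T[] {
  let found = false;
  const next = items.map(function (item) {
    if (item.name !== name) return item;
    found = true;
    return update(item);
  });
  if (!found) next.push(update(create()));
  return next;
}

// Return new schemaConfigs with classificationId set on one column.
// The input is not mutated; missing schema, table, or column entries are created.
function setColumnClassification(
  configs: SchemaConfig[],
  schema: string,
  table: string,
  column: string,
  classificationId: string,
): SchemaConfig[] {
  return upsertByName(
    configs,
    schema,
    (s) => ({
      ...s,
      tableConfigs: upsertByName(
        s.tableConfigs,
        table,
        (t) => ({
          ...t,
          columnConfigs: upsertByName(
            t.columnConfigs,
            column,
            (c) => ({ ...c, classificationId: classificationId }),
            () => ({ name: column }),
          ),
        }),
        () => ({ name: table, columnConfigs: [] }),
      ),
    }),
    () => ({ name: schema, tableConfigs: [] }),
  );
}

// Usage (not executed here): send the result as part of the PATCH body, e.g.
//   body: JSON.stringify({ schemaConfigs: setColumnClassification(current, "public", "salary", "amount", "1-4") })
```

Returning a fresh structure instead of mutating the fetched one keeps the original response available for comparison or rollback in the UI.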
Fetch defined classification
- Log in to Bytebase and go to Data Access > Data Classification. Upload the classification.json file. It is parsed and saved as the global classification.
- Use the API /v1/settings/bb.workspace.data-classification to fetch the defined classification, so the demo app always reflects the latest definition.
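Once the classification definition is fetched, the demo needs display labels such as 1-4 Health [Level 2]. The sketch below shows one way to derive them. The ClassificationConfig shape (a levels list plus a classification map keyed by id) is an assumption about the stored definition, and the function name is hypothetical.

```typescript
// Settings path given in the tutorial.
const CLASSIFICATION_SETTING_PATH = "/v1/settings/bb.workspace.data-classification";

// Assumed shape of one classification definition.
interface Level {
  id: string;
  title: string;
}
interface ClassificationEntry {
  id: string;
  title: string;
  levelId?: string; // entries without a level are treated as plain categories
}
interface ClassificationConfig {
  levels: Level[];
  classification: Record<string, ClassificationEntry>;
}

// Format a label like "1-4 Health [Level 2]" for a classification id.
// Returns undefined when the id is not defined in the config.
function formatClassificationLabel(
  config: ClassificationConfig,
  classificationId: string,
): string | undefined {
  const entry: ClassificationEntry | undefined = config.classification[classificationId];
  if (entry === undefined) return undefined;
  const levels = config.levels.filter(function (level) {
    return level.id === entry.levelId;
  });
  const base = entry.id + " " + entry.title;
  return levels.length > 0 ? base + " [" + levels[0].title + "]" : base;
}
```

Resolving labels from the fetched definition, rather than hardcoding them, keeps the picker consistent with whatever classification.json was last uploaded.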
Summary
Setting up data classification and masking through the Bytebase API lets you manage sensitive data consistently across your organization. Columns are classified by level, and masking rules keyed to those levels keep sensitive data protected according to your organization's security policies, while still allowing authorized users to access the data they need.