Data Classification
Data classification allows you to classify columns and apply masking to those columns via the Global Masking Rule. This allows you to manage masking policy for many columns by controlling only a small number of classifications.
In the above example, columns first_name and last_name will have Default Partial Masking applied, because:
- Columns first_name and last_name are classified as Contact Info.
- Contact Info corresponds to security level 2.
- Security level 2 applies the semantic type Default Partial Masking.
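The lookup chain described here can be sketched in a few lines of Python. This is only an illustration of the reasoning, not Bytebase's implementation: the three dictionaries and the resolve_masking function are hypothetical names, filled with the values from the example.

```python
from typing import Optional

# Hypothetical data mirroring the example; not a real Bytebase API.
COLUMN_CLASSIFICATION = {
    "first_name": "Contact Info",
    "last_name": "Contact Info",
}
CLASSIFICATION_LEVEL = {"Contact Info": 2}
GLOBAL_MASKING_RULE = {2: "Default Partial Masking"}


def resolve_masking(column: str) -> Optional[str]:
    """Follow column -> classification -> security level -> semantic type."""
    classification = COLUMN_CLASSIFICATION.get(column)
    if classification is None:
        return None  # unclassified column: no masking derived here
    level = CLASSIFICATION_LEVEL.get(classification)
    if level is None:
        return None  # classification has no security level assigned
    return GLOBAL_MASKING_RULE.get(level)
```

Because masking is attached to the security level rather than to the column, changing one entry in the level-to-masking mapping changes the masking for every column classified under that level.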
Step 1 - Define Classification
You upload a JSON file containing the classification definition. The definition contains 2 sections:
- Security levels. Usually you define 3 to 5 levels.
- Classes. You can define multi-level classes. You assign a security level to each leaf class.
Simple Classification
This is a simple classification showing the structure:
- There are 2 security levels.
- There are 2 top classes. Class 1 contains 4 sub-classes. Class 2 contains 2 sub-classes. Each subclass (leaf node) is assigned a security level.
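As a rough sketch of that structure, the Python snippet below builds a definition with 2 security levels and 2 top classes (4 and 2 sub-classes) and serializes it to JSON. The field names (title, levels, classification, id, levelId) and the hyphenated id scheme are assumptions made for illustration; check the sample file for the exact schema before uploading anything.

```python
import json

# Assumed schema: field names and id scheme are illustrative only.
definition = {
    "title": "Simple Classification",
    "levels": [
        {"id": "1", "title": "Level 1"},
        {"id": "2", "title": "Level 2"},
    ],
    "classification": {
        "1": {"id": "1", "title": "Class 1"},
        "1-1": {"id": "1-1", "title": "Sub-class 1-1", "levelId": "1"},
        "1-2": {"id": "1-2", "title": "Sub-class 1-2", "levelId": "1"},
        "1-3": {"id": "1-3", "title": "Sub-class 1-3", "levelId": "2"},
        "1-4": {"id": "1-4", "title": "Sub-class 1-4", "levelId": "2"},
        "2": {"id": "2", "title": "Class 2"},
        "2-1": {"id": "2-1", "title": "Sub-class 2-1", "levelId": "1"},
        "2-2": {"id": "2-2", "title": "Sub-class 2-2", "levelId": "2"},
    },
}


def leaves_without_level(definition: dict) -> list:
    """Return ids of leaf classes that lack a valid security level."""
    classes = definition["classification"]
    level_ids = {level["id"] for level in definition["levels"]}
    missing = []
    for class_id, entry in classes.items():
        # A class is a parent if any other id extends it with "-".
        is_parent = any(other.startswith(class_id + "-") for other in classes)
        if not is_parent and entry.get("levelId") not in level_ids:
            missing.append(class_id)
    return missing


definition_json = json.dumps(definition, indent=2)
```

Running leaves_without_level before uploading is a cheap way to catch a leaf class that was left without a security level.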
Financial Industry Classification
A comprehensive data classification for the financial industry, available in English and Chinese. It contains:
- 5 security levels.
- 14 top-categories.
- 300+ sub-categories.
Step 2 - Configure Global Masking Policy
From the Global Masking Policy, you can define the masking level for each classification level.
Step 3 - Classify Column
Manual Classification
If you turn off Sync classification from comment, then you can manually set the classification for each column.
Go to the column definition and set the classification.
Comment Classification
If you turn on Sync classification from comment (it is on by default), then the column classification is derived from the column comment.
If the comment follows the format {classification id}-{comment}, such as 1-4-2-blabla, then Bytebase extracts 1-4-2 as the classification id and assigns the column classification accordingly.
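To make the comment format concrete, here is a minimal Python sketch of one way such a comment could be split. It is not Bytebase's parser. It assumes the classification id consists of hyphen-separated digit groups, and the function name extract_classification_id is hypothetical.

```python
import re
from typing import Optional

# Assumption: the id is one or more digit groups joined by "-",
# followed by "-" and free-form comment text.
_COMMENT_PATTERN = re.compile(r"^(\d+(?:-\d+)*)-(.*)$", re.DOTALL)


def extract_classification_id(comment: str) -> Optional[str]:
    """Return the leading classification id, or None if the format does not match."""
    match = _COMMENT_PATTERN.match(comment)
    return match.group(1) if match else None
```

With this pattern, the id is the longest leading run of digit groups that is still followed by a hyphen, so 1-4-2-blabla yields 1-4-2 and a comment without a numeric prefix yields no classification. How the product treats edge cases, such as comment text that itself begins with digits, is not stated here.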
API Integration
Check API.