Data Classification

Data classification lets you classify columns and apply masking to them through the Global Masking Rule. You can then manage the masking policy for many columns by controlling only a small number of classifications.

[Figure: overview]

In the above example, the columns first_name and last_name are masked with Default Partial Masking, because:

Columns first_name and last_name are classified as Contact Info.
Contact Info corresponds to security level 2.
Security level 2 applies the semantic type Default Partial Masking.
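
The lookup chain above can be sketched as three small tables. This is only an illustration of the reasoning, not how Bytebase stores or evaluates it; the table names, the function name, and the classification ID "1-3" for Contact Info are assumptions made for the example.

```python
from typing import Optional

# Hypothetical in-memory tables; names and shapes are illustrative only.
COLUMN_CLASSIFICATION = {"first_name": "1-3", "last_name": "1-3"}
CLASSIFICATION_LEVEL = {"1-3": "2"}  # Contact Info -> security level 2 (assumed ID)
LEVEL_SEMANTIC_TYPE = {"2": "Default Partial Masking"}


def resolve_masking(column: str) -> Optional[str]:
    """Follow column -> classification -> security level -> semantic type."""
    classification_id = COLUMN_CLASSIFICATION.get(column)
    level_id = CLASSIFICATION_LEVEL.get(classification_id)
    # Returns None when any link in the chain is missing.
    return LEVEL_SEMANTIC_TYPE.get(level_id)


print(resolve_masking("first_name"))
```

Changing the single entry in LEVEL_SEMANTIC_TYPE changes the result for every column classified at that level, which is the point of managing masking through classifications.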

Step 1 - Define Classification

You upload a JSON file containing the classification definition. The definition contains 2 sections:

Security levels. Typically you define 3 to 5 levels.
Classes. You can define multi-level classes. You assign a security level to each leaf class.

Simple Classification

This is a simple classification showing the structure:

There are 2 security levels.
There are 2 top-level classes. Class 1 contains 4 sub-classes. Class 2 contains 2 sub-classes. Each sub-class (leaf node) is assigned a security level.

{
  "title": "Classification Example",
  "levels": [
    {
      "id": "1",
      "title": "Level 1",
      "description": ""
    },
    {
      "id": "2",
      "title": "Level 2",
      "description": ""
    }
  ],
  "classification": {
    "1": {
      "id": "1",
      "title": "Basic",
      "description": ""
    },
    "1-1": {
      "id": "1-1",
      "title": "Basic",
      "description": "",
      "levelId": "1"
    },
    "1-2": {
      "id": "1-2",
      "title": "Asset",
      "description": "",
      "levelId": "1"
    },
    "1-3": {
      "id": "1-3",
      "title": "Contact",
      "description": "",
      "levelId": "2"
    },
    "1-4": {
      "id": "1-4",
      "title": "Health",
      "description": "",
      "levelId": "2"
    },
    "2": {
      "id": "2",
      "title": "Relationship",
      "description": ""
    },
    "2-1": {
      "id": "2-1",
      "title": "Social",
      "description": "",
      "levelId": "1"
    },
    "2-2": {
      "id": "2-2",
      "title": "Business",
      "description": "",
      "levelId": "1"
    }
  }
}
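
Before uploading a definition, it can help to sanity-check it locally. The sketch below is one possible check, assuming the file has the shape shown above; the function name and the rules it enforces (each key matches its "id", each leaf class has a "levelId", and every "levelId" refers to a defined level) are assumptions inferred from the example, not Bytebase's own validation.

```python
import json
from typing import List


def find_problems(definition: dict) -> List[str]:
    """Return human-readable problems found in a classification definition."""
    level_ids = {level["id"] for level in definition["levels"]}
    classes = definition["classification"]
    problems = []
    for key, cls in classes.items():
        if cls.get("id") != key:
            problems.append(f"{key}: key does not match id {cls.get('id')!r}")
        # A class is treated as a leaf when no other class ID extends it with "-".
        is_leaf = not any(other.startswith(key + "-") for other in classes)
        level_id = cls.get("levelId")
        if is_leaf and level_id is None:
            problems.append(f"{key}: leaf class has no levelId")
        elif level_id is not None and level_id not in level_ids:
            problems.append(f"{key}: unknown levelId {level_id!r}")
    return problems


# A deliberately broken sample: class 1-2 points at a level that is not defined.
sample = json.loads("""
{
  "levels": [{"id": "1", "title": "Level 1", "description": ""}],
  "classification": {
    "1": {"id": "1", "title": "Basic", "description": ""},
    "1-1": {"id": "1-1", "title": "Basic", "description": "", "levelId": "1"},
    "1-2": {"id": "1-2", "title": "Contact", "description": "", "levelId": "2"}
  }
}
""")
print(find_problems(sample))
```

For a real file, replace the inline sample with json.load on the file you intend to upload.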

Financial Industry Classification

A comprehensive data classification for the financial industry, available in English and Chinese. It contains:

5 security levels.
14 top-categories.
300+ sub-categories.

Step 2 - Configure Global Masking Policy

In the Global Masking Policy, you define which masking applies to each classification level.
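
To see why a per-level policy scales, the sketch below derives the masking for every leaf class of the simple classification in Step 1 from a single level-to-masking mapping. The mapping format and names are hypothetical and do not reflect how the policy is stored in Bytebase; a level with no entry is assumed to mean no masking.

```python
from typing import Dict, Optional

# Leaf classes and their security levels, taken from the simple classification in Step 1.
CLASS_LEVELS: Dict[str, str] = {
    "1-1": "1", "1-2": "1", "1-3": "2", "1-4": "2", "2-1": "1", "2-2": "1",
}


def masking_by_class(level_policy: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Expand a per-level policy into the masking each class ends up with."""
    return {
        class_id: level_policy.get(level_id)  # None means no masking (assumption)
        for class_id, level_id in CLASS_LEVELS.items()
    }


# One policy entry covers every class assigned to level 2.
policy = {"2": "Default Partial Masking"}
table = masking_by_class(policy)
print(table)
```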

Step 3 - Classify Column

Manual Classification

If you turn off Sync classification from comment, then you can manually set the classification for each column.

Go to the column definition and set the classification.

Comment Classification

If you turn on Sync classification from comment (it is on by default), then the column classification is derived from the column comment. If the comment follows the format {classification id}-{comment}, such as 1-4-2-blabla, then Bytebase extracts 1-4-2 as the classification ID and assigns the column classification accordingly.
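
The comment convention can be illustrated with a small parser. This is a sketch of the format described above, not Bytebase's actual extraction logic; it assumes classification IDs are dash-separated numeric segments, and the pattern and function name are hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed shape: dash-separated numeric segments, then "-", then free text.
_PATTERN = re.compile(r"^(\d+(?:-\d+)*)-(.*)$", re.DOTALL)


def split_comment(comment: str) -> Tuple[Optional[str], str]:
    """Split a column comment into (classification ID, remaining comment)."""
    match = _PATTERN.match(comment)
    if match is None:
        # No classification prefix: the whole text is the comment.
        return None, comment
    return match.group(1), match.group(2)


print(split_comment("1-4-2-blabla"))
```

Because the pattern is greedy, a comment whose free text begins with digits and a dash would have those digits absorbed into the ID. Checking the extracted ID against the IDs in the uploaded definition is one way to catch that case.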

API Integration

Check API.
