Data Classification

Data classification allows you to classify columns and apply masking to those columns via the Global Masking Rule. This allows you to manage masking policy for many columns by controlling only a small number of classifications.

overview

In the above example, column first_name and last_name will be applied Default Partial Masking, because:

  • Column first_name and last_name are classified as Contact Info.
  • Contact Info corresponds to security level 2.
  • Security level 2 applies semantic type Default Partial Masking.

Step 1 - Define Classification

definition

You upload a JSON file containing the classification definition. The definition contains 2 sections:

  1. Security levels. Usually you define 3 ~ 5 levels.
  2. Classes. You can define multi-level classes. You assign a security level to each leaf class.

Simple Classification

This is a simple classification showing the structure:

  1. There are 2 security levels.
  2. There are 2 top classes. Class 1 contains 4 sub-classes. Class 2 contains 2 sub-classes. Each subclass (leaf node) is assigned a security level.
{
  "title": "Classification Example",
  "levels": [
    {
      "id": "1",
      "title": "Level 1",
      "description": ""
    },
    {
      "id": "2",
      "title": "Level 2",
      "description": ""
    }
  ],
  "classification": {
    "1": {
      "id": "1",
      "title": "Basic",
      "description": ""
    },
    "1-1": {
      "id": "1-1",
      "title": "Basic",
      "description": "",
      "levelId": "1"
    },
    "1-2": {
      "id": "1-2",
      "title": "Assert",
      "description": "",
      "levelId": "1"
    },
    "1-3": {
      "id": "1-3",
      "title": "Contact",
      "description": "",
      "levelId": "2"
    },
    "1-4": {
      "id": "1-4",
      "title": "Health",
      "description": "",
      "levelId": "2"
    },
    "2": {
      "id": "2",
      "title": "Relationship",
      "description": ""
    },
    "2-1": {
      "id": "2-1",
      "title": "Social",
      "description": "",
      "levelId": "1"
    },
    "2-2": {
      "id": "2-2",
      "title": "Business",
      "description": "",
      "levelId": "1"
    }
  }
}

Financial Industry Classification

A comprehensive data classification (English, Chinese) for the financial industry. It contains:

  • 5 security levels.
  • 14 top-categories.
  • 300+ sub-categories.

Step 2 - Configure Global Masking Policy

global

From the Global Masking Policy, you can define the masking level for each classification level.

Step 3 - Classify Column

Manual Classification

If you turn off Sync classification from comment, then you can manually set the classification for each column.

bb-classification-column-masking

Go to the column definition and set the classification.

Comment Classification

bb-classification-definition

If you turn on Sync classification from comment (by default it's on), then the column classification is derived from the comment. If the column format follows {classification id}-{comment} such as 1-4-2-blabla, then Bytebase will extract 1-4-2 as the classification id and assigns the column classification accordingly.

API Integration

Check API.

Edit this page on GitHub

Subscribe to Newsletter

By subscribing, you agree with Bytebase's Terms of Service and Privacy Policy.