Data Classification

Data classification lets you classify columns and apply masking levels to the classified columns globally. You can then manage the masking policy for many columns by controlling only a small number of classifications.

[Image: overview]

In the above example, Partial Masking is applied to the first_name and last_name columns, because:

  • The first_name and last_name columns are classified as Contact Info.
  • Contact Info corresponds to security level 2.
  • Security level 2 applies Partial Masking.
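
The lookup chain above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not Bytebase's implementation: the three dictionaries and the `resolve_masking` helper are hypothetical names, filled with the values from the example.

```python
# Hypothetical data mirroring the example above; all names are illustrative.
COLUMN_CLASSIFICATION = {"first_name": "Contact Info", "last_name": "Contact Info"}
CLASSIFICATION_LEVEL = {"Contact Info": 2}
LEVEL_MASKING = {2: "Partial Masking"}


def resolve_masking(column):
    """Follow column -> classification -> security level -> masking level."""
    classification = COLUMN_CLASSIFICATION.get(column)
    if classification is None:
        return None  # unclassified column: no classification-based masking
    level = CLASSIFICATION_LEVEL.get(classification)
    return LEVEL_MASKING.get(level)


print(resolve_masking("first_name"))
```

Changing the single entry in `LEVEL_MASKING` would change the result for every column classified as Contact Info, which is the point of managing masking through classifications.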

Step 1 - Define Classification

[Image: definition]

You upload a JSON file containing the classification definition. The definition contains 2 sections:

  1. Security levels. Usually you define 3 to 5 levels.
  2. Classes. You can define multi-level classes. You assign a security level to each leaf class.
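
Before uploading, it can help to check a definition file against the two rules above. The following is a minimal sketch, not an official validator. It assumes the field names used in the examples on this page (`levels`, `classifications`, `id`, `levelId`) and assumes that the hierarchy is expressed through id prefixes, so that `1-2` is a child of `1`. The `validate_definition` function is a hypothetical helper.

```python
import json


def validate_definition(text):
    """Return a list of problems found in a classification definition."""
    definition = json.loads(text)
    level_ids = {level["id"] for level in definition.get("levels", [])}
    classes = definition.get("classifications", [])
    ids = [c["id"] for c in classes]
    problems = []
    for c in classes:
        # Assumption: "1-2" is a child of "1", so a class is a leaf when
        # no other id starts with its id followed by "-".
        is_leaf = not any(other.startswith(c["id"] + "-") for other in ids)
        level_id = c.get("levelId")
        if is_leaf and level_id is None:
            problems.append(f"leaf class {c['id']} has no levelId")
        elif level_id is not None and level_id not in level_ids:
            problems.append(f"class {c['id']} references unknown level {level_id}")
    return problems
```

An empty list means that every leaf class has a security level and that every referenced level is defined.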

Simple Classification

This is a simple classification showing the structure:

  1. There are 2 security levels.
  2. There are 2 top-level classes. Class 1 contains 4 sub-classes. Class 2 contains 2 sub-classes. Each sub-class (leaf node) is assigned a security level.

{
  "title": "Classification Example",
  "levels": [
    {
      "id": "1",
      "title": "Level 1",
      "description": ""
    },
    {
      "id": "2",
      "title": "Level 2",
      "description": ""
    }
  ],
  "classifications": [
    {
      "id": "1",
      "title": "Basic",
      "description": ""
    },
    {
      "id": "1-1",
      "title": "Basic",
      "description": "",
      "levelId": "1"
    },
    {
      "id": "1-2",
      "title": "Assert",
      "description": "",
      "levelId": "1"
    },
    {
      "id": "1-3",
      "title": "Contact",
      "description": "",
      "levelId": "2"
    },
    {
      "id": "1-4",
      "title": "Health",
      "description": "",
      "levelId": "2"
    },
    {
      "id": "2",
      "title": "Relationship",
      "description": ""
    },
    {
      "id": "2-1",
      "title": "Social",
      "description": "",
      "levelId": "1"
    },
    {
      "id": "2-2",
      "title": "Business",
      "description": "",
      "levelId": "1"
    }
  ]
}
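
To see which classes end up at which security level, the definition can be grouped by `levelId`. The sketch below uses a trimmed copy of the example (descriptions and some classes omitted) and a hypothetical `leaves_by_level` helper. It assumes, as in the example, that only leaf classes carry a `levelId`.

```python
# Trimmed copy of the example above (descriptions and some classes omitted).
definition = {
    "levels": [
        {"id": "1", "title": "Level 1"},
        {"id": "2", "title": "Level 2"},
    ],
    "classifications": [
        {"id": "1", "title": "Basic"},
        {"id": "1-3", "title": "Contact", "levelId": "2"},
        {"id": "2", "title": "Relationship"},
        {"id": "2-1", "title": "Social", "levelId": "1"},
    ],
}


def leaves_by_level(definition):
    """Map each security level title to the titles of the classes assigned to it."""
    titles = {level["id"]: level["title"] for level in definition["levels"]}
    grouped = {title: [] for title in titles.values()}
    for c in definition["classifications"]:
        if "levelId" in c:  # only leaf classes carry a levelId in the example
            grouped[titles[c["levelId"]]].append(c["title"])
    return grouped


print(leaves_by_level(definition))
```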

Financial Industry Classification

A comprehensive data classification (English, Chinese) for the financial industry. It contains:

  • 5 security levels.
  • 14 top-level categories.
  • 300+ sub-categories.

Step 2 - Configure Global Masking Policy

[Image: global]

In the Global Masking Policy, you define the masking level for each security level of the classification.

Step 3 - Classify Column

Manual Classification

[Image: classify-manual]

Go to the column definition and set the classification.

Comment Classification

[Image: classify-from-comment]

If you turn on Sync classification from comment, the column classification is derived from the column comment. If the comment follows the format {classification id}-{comment}, such as 1-4-2-blabla, Bytebase extracts 1-4-2 as the classification id and assigns the column classification accordingly.
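
The comment format can be illustrated with a small parser. This is a sketch of the convention only, not how Bytebase parses comments. It assumes that a classification id is one or more numbers joined by hyphens, and `split_comment` is a hypothetical helper.

```python
import re

# Assumption: a classification id is one or more numbers joined by hyphens,
# followed by a hyphen and the free-text comment.
_PATTERN = re.compile(r"^(\d+(?:-\d+)*)-(.*)$", re.DOTALL)


def split_comment(comment):
    """Return (classification_id, remaining_comment); the id is None if absent."""
    match = _PATTERN.match(comment)
    if match is None:
        return None, comment
    return match.group(1), match.group(2)


print(split_comment("1-4-2-blabla"))
```

The sketch does not check whether the extracted id exists in the uploaded classification definition.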

API Integration

See the API documentation.
