Database Review Guide
Engine
InnoDB is the default storage engine for MySQL that provides transaction support. It also provides better performance for high-concurrency and low-latency scenarios, and supports online data backup and recovery. It is the preferred choice for OLTP businesses. Suggestion error level: Error
Table
Various data synchronization, comparison, and rollback tools require tables to have primary key. Suggestion error level: Error
The advantages and disadvantages of foreign key are highly controversial. Using foreign key may significantly increase the difficulty of database changes, scalability (such as sharding), etc. And may even prevent the use of some tools. Therefore, another option is to implement foreign key constraints at the application layer. Suggestion error level: Warning
For example, by requiring the "_del" suffix, it can effectively prevent accidental deletions. Suggestion error level: Error
Table name format (regex): _del$In some database engines, partitioned tables are not mature, and the use and maintenance are inconvenient. Therefore, it is more inclined to use manual data partitioning methods such as database and table sharding. Suggestion error level: Warning
This rule restricts the usage of triggers on tables. Triggers can introduce complexity and potential performance issues to database operations. By disallowing triggers, the system can maintain a simpler and more predictable behavior. Suggestion error level: Warning
Configure whether the table requires comments and the maximum comment length.
Require comment: trueRequire classification: falseMax length: 64This rule prohibits the creation of duplicate indexes on a table. Duplicate indexes consume extra storage space and can potentially reduce query performance. Suggestion error level: Warning
This rule limits the amount of data a table can hold, preventing excessive storage usage.
Maximum length: 1000It is recommended to set the charset at the database level. Setting the charset at finer granularity can bring unnecessary complexities. Suggested error level: Error.
Configure which tables are prohibited from executing DDL. Suggestion error level: Warning
Table names:Configure which tables are prohibited from executing DML. Suggestion error level: Warning
Table names:Configure the maximum number of rows in tables for which DDL can be executed. Recommended error level: warning
Max row count: 10000000The charset of the table must be specified. Suggestion error level: Warning
The collation of the table must be specified. Suggestion error level: Warning
Statement
SELECT * to fetch entire row data may cause unnecessary resource overhead and may also cause unexpected results in applications once the table adds or removes columns. Suggestion error level: Error
Queries without WHERE clause may cause huge uncessary resource overhead. Suggestion error level: Error
DMLs without WHERE clause may cause massive accidental data loss. Suggestion error level: Error
When using leading wildcard, such as "LIKE '%ABC'", the database optimizer cannot use fast index scan, and fallback to full table scan or full index scan, which may cause serious performance impact. Suggestion error level: Error
In some cases, multiple statements are required to be included in a transaction committed by the system, in order to quickly rerun in case of partial failure. Therefore, explicit "COMMIT" is not allowed. Suggestion error level: Warning
The "CASCADE" option in 'ON DELETE' can cause a large number of dependent objects to be deleted or modified, which may cause unexpected results. Suggestion error level: Error
Using the "CASCADE" option when removing a table can cause a large number of dependent objects to be deleted or modified, which may cause unexpected results. Suggestion error level: Error
If LIMIT is used in DML statements without an ORDER BY clause, the affected rows order are not fixed, which may cause data inconsistency between the primary and replica databases in some replication modes. Suggestion error level: Error
Sorting operations are extremely resource-intensive, so for update and delete operations, it is recommended to use a deterministic filtering condition as much as possible instead of using ORDER BY and LIMIT. Suggestion error level: Error
Every change to a table may cause a table-level lock and consume a large amount of resources. If there are multiple changes to the same table, they should be merged into a single change statement. Suggestion error level: Error
The "INSERT INTO table VALUES (...)" statement does not explicit list column names. Once the column order changes or columns are added or dropped, the statement may faile or generate unexpected data. Suggestion error level: Error
Randomly sorting the data to be inserted is meaningless and will only consume uncessary resources. Suggestion error level: Error
Reveal the number of rows to be inserted can help determine whether the statement meets business expectations. Suggestion error level: Warning
Maximum insert amount: 1000Reveal the number of rows to be updated or deleted can help determine whether the statement meets business expectations. Suggestion error level: Warning
Maximum affected rows: 1000When the syntax is correct, but the table name is incorrect or the permission is insufficient, it can be discovered by dry run before the actual execution. Suggestion error level: Warning
Before PostgreSQL 11, adding a column with a default value cause table locking and unable to read and write, which may cause business interruption. In PostgreSQL 11 and above, this issue has been optimized and there is no need to pay attention to this rule. Suggestion error level: Warning
Adding a CHECK constraint needs to verify the existing data and requires ACCESS EXCLUSIVE table lock. This blocks read and write, which may cause business interruption. It is recommended to add the "NOT VALID" option to validate new data and manually validate existing data after the change is completed. Suggestion error level: Warning
Adding NOT NULL constraint with default value before PostgreSQL 11 or adding NOT NULL constraint without default value requires to verify the existing data. This blocks read and write, which may cause business interruption. Suggestion error level: Warning
Full table scan is a resource-intensive operation and may cause serious performance impact. Suggestion error level: Error
If the schema is not specified, the object will be created in the default schema, which may cause unexpected results.
Failure to set the role statement properly at the beginning of a session may lead to unauthorized access or improper permissions assignment, potentially compromising data security and integrity. Suggestion error level: Warning
Filesort is a resource-intensive operation and may cause serious performance impact. Suggestion error level: Warning
Temporary tables are not recommended for use in production environments. They can cause resource contention and performance issues. Suggestion error level: Warning
The result of NULL equality comparison is always NULL, which may cause unexpected results. Suggestion error level: Warning
If you apply a function or perform a calculation on the indexed field, the database cannot use the index and has to scan the entire table instead.
SQL performance optimization strives for constant (const) access, aiming for reference (ref) level as a baseline, with acceptable performance at range level.
Maximum level: INDEXThis prevents performance degradation due to extensive comparisons and resource limitations.
Maximum count: 10Limiting the number of rows through LIMIT ensures the database processes manageable chunks, improving query execution speed. A capped LIMIT value prevents excessive memory usage, safeguarding overall system stability and preventing performance degradation.
Maximum value: 1000The more tables you join, the more complex the query becomes and the longer it takes to execute. In general, it is best to keep the number of joins to a minimum.
Maximum value: 10Large transactions can significantly impact database performance. If a large number of statements are involved and one fails, rolling back the entire transaction becomes complex. Limiting statements minimizes the potential damage caused by a single failure and simplifies rollback procedures.
Maximum value: 10If the data types, character sets of the join columns are not identical, the database may not be able to correctly identify matching rows, leading to inaccurate or incomplete results.
For optimal performance and data integrity, avoid combining DDL and DML statements within the same script.
For optimal performance and data integrity, avoid combining DDL and DML statements within the same script.
In some cases, using FIRST/AFTER to add columns will cause data reorganization (rewriting all data). Suggested error level: Warning
To prevent database changes from impacting your business, avoid using Offline DDL.
Cross-database queries increase system coupling and can lead to efficiency issues. Suggested error level: Warning
Set the maximum execution time for SQL statements. If the execution time exceeds the limit, the statement will be terminated. Suggestion error level: Warning
Specifying the ALGORITHM option in ALTER TABLE statements ensures more control over how table changes are applied, minimizing potential disruptions by avoiding full table copies or locks. It helps optimize performance and reduce downtime during schema modifications. Suggested error level: Warning.
The LOCK option in ALTER TABLE statements allows you to control the level of locking during schema changes, helping to prevent unnecessary table locks and ensuring better concurrency. Proper use of this option can significantly reduce the impact of DDL operations on active queries. Suggested error level: Warning.
Adding foreign keys needs to verify the existing data and requires SHARE ROW EXCLUSIVE table lock. This blocks write, which may cause business interruption. It is recommended to add the "NOT VALID" option to validate new data and validate existing data after the change is completed. Suggestion error level: Warning
This rule checks whether the object owner for DDL is the same as the current user.
Naming
Enforce the use of fully qualified object names. For example, “schema.table”, suggested error level: Warning
The default format is all lowercase letters, separated by underscores between words, and no more than 63 characters long, such as "abc" and "abc_def". Suggestion error level: Warning
Table name format (regex): ^[a-z]+(_[a-z]+)*$Length limit: 63The default format is all lowercase letters, separated by underscores between words, which is no more than 63 characters long, such as "abc" and "abc_def". Suggestion error level: Warning
Column name format (regex): ^[a-z]+(_[a-z]+)*$Length limit: 63The name is allowed to be empty and named by the database. If not empty, the default format is "uk_<table name>_<unique key column name combination>", which is no more than 63 characters long, such as "uk_my_table_id_name". Suggestion error level: Warning
Unique key name format: ^$|^uk_{{table}}_{{column_list}}$Length limit: 63The name is allowed to be empty and named by the database. If not empty, the default format is "pk_<table name>_<unique key column name combination>", which is no more than 63 characters long, such as "pk_my_table_id_name". Suggestion error level: Warning
Primary key name format: ^$|^pk_{{table}}_{{column_list}}$Length limit: 63The name is allowed to be empty and named by the database. If not empty, the default format is "idx_<table name>_<unique key column name combination>", which is no more than 63 characters long, such as "idx_my_table_id_name". Suggestion error level: Warning
Index name format: ^$|^idx_{{table}}_{{column_list}}$Length limit: 63The name is allowed to be empty and named by the database. If not empty, the default format is "fk_<table name>_<unique key column name combination>", which is no more than 63 characters long, such as "fk_my_table_id_name". Suggestion error level: Warning
Foreign key name format: ^$|^fk_{{referencing_table}}_{{referencing_column}}_{{referenced_table}}_{{referenced_column}}$Length limit: 63The default column name is "ID", and is no more than 63 characters long.
Auto-increment column name format (regex): ^id$Length limit: 63- Upper case: true
Column
Some common columns are helpful for better application maintenance. For example, adding a business-independent "ID" column as the primary key avoids primary key conflicts caused by business changes (such as business mergers), and in some scenarios can also bring better data insertion performance. Suggested error level: Warning
Required column names: id, created_ts, updated_ts, creator_id, updater_idAbusing column types can have serious negative effects on system maintainability and performance. For example, using "LOB" column to store large amounts of audio and video data may cause database performance to decrease, backup and recovery times to lengthen, and data synchronization tools incompatible. Suggested error level: Error
Disallow list: JSONColumns cannot have NULL value.
Modifying column types may affect system performance, maintainability, and even lead to data loss. Suggested error level: Warning
Prohibit dropping columns. Suggested error level: Warning
Prohibit dropping columns in index. Suggested error level: Error
For a 'NOT NULL' column, if a value is not assigned to the column when inserting a new row and the column does not have a default value, the database will reject the insertion of that row. Setting a default value for a new column can also ensure compatibility with legacy application. Suggested error level: Error
"CHANGE COLUMN" is unique to MySQL syntax and can be used to modify column names and other properties at the same time. However, it may cause the column name to be mistakenly changed when modifying properties. It is recommended to still use standard "RENAME" and "MODIFY" statements to distinguish between the two types of changes. Suggested error level: Error
Modifying the order of columns may cause some applications or views that depend on the default order of the original table to produce unexpected results, such as "select *". Suggested error level: Warning
MySQL's auto-increment column are generally used as business-independent primary key. Using integer types occupies less storage space and makes the primary key index structure more compact, bringing better query and DML performance. Suggested error level: Error
It is recommended to set the charset at the database level or table level. Setting the charset at finer granularity can bring unnecessary complexities. Suggested error level: Error.
Unsigned types do not store negative numbers, and the range of values that can be stored by the same type is doubled, which can avoid auto-increment columns overflow. Suggested error level: Warning
Adding comments to columns is a good development practice, but excessively long comments can decrease the readability of the schema. Suggested error level: Warning
Require comment: trueRequire classification: falseMax length: 64"CHAR" is a fixed-length type. For example, the CHAR(20) column will occupy 20 character spaces even if only one character is stored, causing waste. When the string is too long and the length is not fixed, consider using VARCHAR for MySQL and using TEXT for PostgreSQL. Suggestion error level: Error
Maximum length: 20- Maximum length: 2560
based on management requirements to limit the initial value of the auto-increment column. Suggestion error level: Warning
Initial value: 1Only columns recording the creation time of the record with "DEFAULT NOW()" and recording the update time of the record with "DEFAULT NOW() ON UPDATE" need to call function to get system time. It is meaningless and will increase resource overhead to record system time in other columns. Suggestion error level: Error
Setting default values that satisfy business logic can effectively improve the data quality of downstream analytical pipeline. This rule does not check "PRIMARY KEY", "JSON", "BLOB", "TEXT", "GEOMETRY", "AUTO_INCREMENT", "GENERATED" types. Suggestion error level: Warning
Volatile functions (e.g., clock_timestamp()) update each row with the value at the time of ALTER TABLE execution, potentially causing lengthy updates.
The charset of columns with text data types must be specified. Suggestion error level: Warning
The collation of columns with text data types must be specified. Suggestion error level: Warning
Schema
Some changes may affect running applications, such as modifying the name of database object, adding new constraints, etc. This rule can avoid careless changes that lead to the failure of existing application. Suggestion error level: Warning
Database
Deletion is only allowed when there are no tables in the database, which can greatly avoid accidental deletion. Suggested error level: Error
Index
Creating an index with duplicate columns will result in failure. Suggestion error level: Error
The "BLOB" type is usually used to store binary data and should not be used as a query condition. If an index is created on this column type by mistake, it will consume a lot of resources and cause serious performance impact. Suggestion error level: Error
Enforce the primary key type to be INT or BIGINT.
A composite index with over 5 columns does not significantly improve query performance, but it occupies a lot of space and reduces DML performance. Suggestion error level: Warning
Maximum column count: 5Although indexes can improve query performance, they also occupy a lot of space and reduce DML performance. Therefore, it is not recommended to create more than 5 indexes in a table. Suggestion error level: Warning
Maximum index count: 5The appropriate primary key type can optimize storage structure, reduce space usage, and beneficial for insert and query performance. Suggestion error level: Warning
Allow list: SERIAL, BIGSERIAL, INT, BIGINTIn PostgreSQL 11 and above, using the standard statement to create an index will cause table locking and unable to write. Using the "CONCURRENTLY" mode can avoid this problem. Suggestion error level: Warning
Different index types have different performance characteristics. For example, B-tree indexes are suitable for range queries, while hash indexes are suitable for equality queries. Suggestion error level: Warning
Allow list: BTREE, HASHRedundant index may result in performance loss and occupy additional space. For example, the index on columns (c1, c2) will be treated as redundant indexes if there is already a index on column (c1). Suggestion error level: Warning
System
The character set determines which characters can be stored in the table. Using the wrong character set may result in certain characters in the application being unable to be stored and displayed correctly, such as CJK and Emoji. Suggested error level: Error
Allow list: utf8mb4The collation determines the rules for character comparison and sorting. For example, when using a case-insensitive collation, "ABC" and "abc" will be treated as the same string in queries. Suggested error level: Error
Allow list: utf8mb4_0900_ai_ci- Maximum comment length: 64
This rule prohibits the execution of procedures within the database. System procedures often perform critical operations that could impact the stability and security of the database environment. By disallowing their execution, it helps prevent unintended changes and potential vulnerabilities. Suggested error level: Warning
This rule prohibits the creation of events within the database. System events often perform automated tasks that could affect the database environment. By disallowing their creation, it helps maintain control over database operations and prevents potential disruptions. Suggested error level: Warning
This rule prohibits the creation of views within the database. Views provide a virtual representation of data that can simplify queries and enhance data security. By disallowing their creation, it helps maintain control over database schema and prevents potential security risks. Suggested error level: Warning
This rule prohibits the creation of functions within the database. Functions provide reusable logic that can simplify queries and enhance data integrity. By disallowing their creation, it helps maintain control over database schema and prevents potential security risks. Suggested error level: Warning
This rule restricts the usage of specific functions within the database. By disallowing the use of these functions, it helps maintain data consistency and security. Suggested error level: Warning
Disallow list: RAND, UUID, SLEEP