Skip to main content

Database Guidelines for Golang Projects

Introduction

This document provides comprehensive guidelines for database design and interaction within Golang projects. It covers naming conventions, table design principles, SQL writing best practices, transaction management, and performance considerations. Adhering to these guidelines will promote consistency, maintainability, and optimal performance in your database interactions.

Scope

These guidelines primarily focus on MySQL databases, but many principles are applicable to other relational database systems as well. Specific instructions for Redis (caching layer) can be found in separate documentation.

Target Audience

These guidelines are intended for all developers working on Golang backend projects that interact with databases.

Naming Conventions

Strict adherence to naming conventions is mandatory. Review these conventions before designing database resources.

General Considerations:

  • Avoid using any MySQL keywords or reserved words for database, table, field, or index names. For example, don't use type as a field name; use user_type instead.
  • Use only English for naming and comments. Avoid non-English characters.
  • Keep names concise and descriptive.

Database (DB)

  • Format: lower_case_with_underscores with suffix _db. Example: account_db, rent_account_db, gop_txn_db, rent_admin_db
  • Prefix: Use the project code as a prefix. Example: rent_account_db
  • Region Specific: If the database is region-specific, add the region code before _db. Example: rent_admin_vn_db
  • Sharding: For sharded databases, append the shard ID as a suffix after _db, padding to 8 digits. Follow the naming conventions in the rent Database Design Guide. Example: rent_account_db_00000000, log_db_00002018
  • Length: The length of the database name should be less than 42 characters due to DNS restrictions when connecting to the database.

Table

  • Format: lower_case_with_underscores with suffix _tab. Example: user_tab
  • Sharding: For sharded tables, append the shard ID as a suffix after _tab, padding to 8 digits, starting from 0. Example: account_tab_00000000, order_tab_00001000, login_log_tab_00201801, audit_log_tab_20180101
  • Length: The length of the table name should be less than 48 characters.

Field

  • Format: lower_case_with_underscores. Example: order_id, create_time
  • Foreign Key Reference: If the field is a foreign key referencing another table's integer primary key, suffix it with _id.

Index

  • Format: lower_case_with_underscores, joined keys with prefix idx_. Example: idx_order_id, idx_key1_key2
  • Unique Index: For unique indices, use the prefix uniq_idx_. Example: uniq_idx_email

Table Design Guidelines

These guidelines are essential to follow before creating a database schema.

CategoryRuleDescription
Charset & EngineMUSTCharset: utf8mb4, Engine: InnoDB. utf8mb4 is essential for supporting a wide range of characters.
Table CollationRECOMMENDEDTable collation should typically use utf8mb4_unicode_ci (recommended for most cases) or latin1_general_ci (only for specific cases with legacy systems or performance reasons and after careful consideration).
Field CollationMUSTUse *_cs (case-sensitive) in the field name if the field requires case-sensitive collation. Example: email_cs. Be mindful of performance implications with case-sensitive collations.
Primary KeyMUSTIf the primary key is not sequential and data is not deleted frequently, create an auto-incrementing primary key column named id of type BIGINT UNSIGNED NOT NULL AUTO_INCREMENT. This aids in indexing and performance.
MUSTAvoid using foreign keys in MySQL due to performance overhead and complexities in table sharding. Handle relationships in application code.
Table ShardingRECOMMENDEDUse table sharding when data size exceeds 10GB or the number of records exceeds 10 million. Consider the sharding strategy carefully based on data access patterns.
Triggers/Stored ProceduresMUSTAvoid using events, triggers, or stored procedures. Complex logic should reside in the application layer for better maintainability and portability.
Common ColumnsMUSTColumns like user_id, shop_id, order_id, item_id must be defined as BIGINT(20) NOT NULL. Use BIGINT to accommodate large numbers.
TimestampMUSTUse INT UNSIGNED or BIGINT UNSIGNED for UNIX timestamps in seconds. If using UNIX timestamps in milliseconds, use BIGINT UNSIGNED. Choose the appropriate type based on the expected range of values.
String TypesMUSTUse CHAR for fixed-length strings and VARCHAR for variable-length strings. Use TEXT, MEDIUMTEXT, LONGTEXT, or BLOB for strings longer than 5000 ( or variable size ) characters. Match the text length to the most suitable type and minimize size growth.
RECOMMENDEDInnoDB page size: 16k. Consider separating frequently accessed columns from less frequently accessed large columns to optimize disk I/O.
Numeric TypesRECOMMENDEDUse numeric types that don't lose precision (INT, BIGINT, DECIMAL) over types that may cause precision loss (FLOAT, DOUBLE). Prefer UNSIGNED types where negative values are not required to increase the maximum representable positive value.
Default ValuesRECOMMENDEDDon't set default values unless genuinely necessary, especially when adding new fields to existing tables to prevent unnecessary schema changes during upgrades/rollbacks. Instead, handle defaults in application logic.
NullabilityRECOMMENDEDAvoid NULL unless it represents a meaningful "unknown" or "missing" state. Use NOT NULL with a default value where applicable (except for TEXT, MEDIUMTEXT, LONGTEXT, BLOB). This improves query performance and reduces complexity.
ENUM TypeMUSTDo not use ENUM type. Use VARCHAR with a CHECK constraint or a lookup table for better flexibility and maintainability.
Add ColumnMUSTWhen adding a new column, don't specify location (FIRST / AFTER). Append the new column as the last one. Explicitly specifying column order can cause issues with schema evolution.
IndexesRECOMMENDEDCreate indexes on columns with high selectivity (distinct values). Put the most selective column first in composite indexes. Avoid creating indexes on every column, as this can degrade write performance.
RECOMMENDEDCreate prefix indexes for long VARCHAR/TEXT columns. Example: INDEX idx_long_text (long_text(100)); This indexes only the first 100 characters, saving space and improving performance.
JSON ColumnsRECOMMENDEDMySQL 5.7+ supports JSON columns. Consider using them for storing semi-structured data, but be aware of performance implications. Index specific elements within the JSON document using virtual columns.
SQL Usage - QueriesMUSTAvoid implicit data-type conversion in WHERE clauses. This can prevent indexes from being used. Example: WHERE id = 123 (integer) is preferred over WHERE id = '123' (string).
MUSTDo not use INSERT ... ON DUPLICATE KEY UPDATE (can cause deadlocks in MySQL 5.7, especially older versions like 5.7.26. Fixed in later versions. Use INSERT IGNORE or separate SELECT and UPDATE statements instead to handle conflict resolution.).
MUSTAvoid using subqueries (especially correlated subqueries), WHERE NOT IN, !=, NOT LIKE (especially leading wildcards), and WHERE LIKE '%abc%'. These constructs often lead to poor performance. Explore alternatives like joins, EXISTS, and full-text search.
MUSTFor efficient pagination, avoid using OFFSET with large LIMIT values. Use cursor-based pagination (e.g., filtering based on the last seen ID or timestamp). Always include an ORDER BY clause with LIMIT. Ensure that the ORDER BY clause includes a unique column to prevent skipping or duplicating results.
RECOMMENDEDAlways use ORDER BY for consistent sorting of records when ordering is logically required. The order of records is not guaranteed without an explicit ORDER BY clause.
Cache LayerRECOMMENDEDFor user-facing applications, implement a caching layer (e.g., Redis) to reduce database load and improve response times. Implement rate limiting to prevent abuse. Consider read-write separation to direct reads to replicas and writes to the primary database instance.
Delete/UpdateRECOMMENDEDWhen writing scripts for data manipulation, ensure operations don't exceed 1000 rows per second and maintain a 1-second interval between operations. Ensure that the MySQL instance's QPS (Queries Per Second) remains below 3k/s to avoid overloading the server.
Transaction ManagementMUSTFor transactions, split large transactions into smaller batches of updates to minimize lock contention and improve concurrency.
RECOMMENDEDMinimize external dependencies (e.g., RPC calls, Codis/Copi2 operations) within transactions to reduce the risk of failures and long-running transactions.
Transaction IsolationRECOMMENDEDEnsure appropriate isolation levels for transactions, adhering to the ACID principles. The default isolation level for MySQL is REPEATABLE READ (RR). Consider using READ COMMITTED for read-heavy workloads and shorter transactions, but be aware of the possibility of non-repeatable reads.
Data ArchivingRECOMMENDEDImplement a data archiving strategy to move infrequently accessed data to a separate storage system. This will improve the performance of the primary database.
Data Masking / AnonymizationMUSTWhen dealing with sensitive data (e.g., PII), implement data masking or anonymization techniques to protect user privacy and comply with regulations.
MonitoringMUSTImplement robust monitoring systems to track database performance metrics such as query execution time, connection pool usage, and resource utilization. Set up alerts to notify administrators of potential issues.

SQL Writing Best Practices

Beyond the table design guidelines, here are some best practices for writing SQL queries:

  • Use Prepared Statements: Always use prepared statements to prevent SQL injection vulnerabilities and improve performance (by reusing query execution plans). Most Golang database drivers provide good support for prepared statements.

  • Explain Your Queries: Use EXPLAIN to analyze query execution plans and identify potential bottlenecks. Pay attention to indexes being used, the number of rows examined, and the join types.

  • Limit Result Sets: Avoid selecting all columns (SELECT *) unless absolutely necessary. Specify only the columns you need to reduce data transfer and memory usage.

  • Use Joins Wisely: Understand the different types of joins (INNER, LEFT, RIGHT, FULL) and choose the appropriate join based on the desired result set. Optimize join performance by ensuring that joined columns are indexed.

  • Optimize WHERE Clauses: Write efficient WHERE clauses that use indexes effectively. Avoid using functions or calculations in the WHERE clause that prevent index usage.

  • Batch Operations: When inserting, updating, or deleting multiple rows, use batch operations (e.g., INSERT INTO ... VALUES (...), (...), (...)) to reduce the number of round trips to the database.

  • Connection Pooling: Use connection pooling to manage database connections efficiently. This avoids the overhead of creating a new connection for each query. The database/sql package in Go provides built-in connection pooling.

Transaction Management

Transactions are crucial for maintaining data integrity.

  • Start and Commit/Rollback Explicitly: Always start transactions explicitly and either commit or rollback the transaction based on the outcome of the operations within the transaction.

  • Keep Transactions Short: Long-running transactions can lead to lock contention and performance issues. Keep transactions as short as possible by performing only the necessary operations within the transaction.

  • Handle Errors Properly: Properly handle errors within transactions and rollback the transaction if any error occurs.

  • Use Appropriate Isolation Levels: Choose the appropriate isolation level based on the application's requirements. The default isolation level (REPEATABLE READ) provides good consistency, but higher isolation levels can impact performance.

Examples

This section provides examples of common SQL patterns and how to optimize them:

Example 1: Efficient Pagination

Inefficient (using OFFSET):

SELECT * FROM users ORDER BY id LIMIT 10 OFFSET 100000;