Database_tuning
Chapter 75: Database Performance Tuning
Section titled “Chapter 75: Database Performance Tuning”Overview
Section titled “Overview”Database performance tuning is critical for DevOps and SRE roles. This chapter covers comprehensive tuning strategies for MySQL/MariaDB and PostgreSQL, including memory configuration, query optimization, indexing strategies, connection pooling, and performance monitoring. Understanding these topics is essential for maintaining high-performance database systems in production environments.
75.1 MySQL/MariaDB Tuning
Section titled “75.1 MySQL/MariaDB Tuning”Architecture Overview
Section titled “Architecture Overview”┌─────────────────────────────────────────────────────────────────────────┐│ MySQL/MariaDB ARCHITECTURE │├─────────────────────────────────────────────────────────────────────────┤│ ││ ┌─────────────────────────────────────────────────────────────────┐ ││ │ CONNECTION LAYER │ ││ ├─────────────────────────────────────────────────────────────────┤ ││ │ │ ││ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ ││ │ │ Connection │ │ Connection │ │ Connection │ │ ││ │ │ Thread 1 │ │ Thread 2 │ │ Thread N │ │ ││ │ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │ ││ │ │ │ │ │ ││ │ └────────────────┼────────────────┘ │ ││ │ ▼ │ ││ │ ┌────────────────────────┐ │ ││ │ │ Thread Pool/ │ │ ││ │ │ Connection Pool │ │ ││ │ └────────────────────────┘ │ ││ │ │ ││ └─────────────────────────────────────────────────────────────────┘ ││ │ ││ ▼ ││ ┌─────────────────────────────────────────────────────────────────┐ ││ │ SQL LAYER │ ││ ├─────────────────────────────────────────────────────────────────┤ ││ │ │ ││ │ Parser → Optimizer → Query Cache → Execution Engine │ ││ │ │ ││ │ Key Operations: │ ││ │ - Query parsing and validation │ ││ │ - Query optimization (EXPLAIN) │ ││ │ - Index selection │ ││ │ - Execution plan generation │ ││ │ │ ││ └─────────────────────────────────────────────────────────────────┘ ││ │ ││ ▼ ││ ┌─────────────────────────────────────────────────────────────────┐ ││ │ STORAGE ENGINE LAYER │ ││ ├─────────────────────────────────────────────────────────────────┤ ││ │ │ ││ │ ┌─────────────────┐ ┌─────────────────┐ │ ││ │ │ InnoDB │ │ MyISAM │ │ ││ │ │ (Default) │ │ (Legacy) │ │ ││ │ ├─────────────────┤ ├─────────────────┤ │ ││ │ │ - ACID │ │ - Full-text │ │ ││ │ │ - Row-level │ │ - Table-lock │ │ ││ │ │ - MVCC │ │ - No transaction│ │ ││ │ │ - Transaction │ │ - Fast reads │ │ ││ │ │ - Crash-safe │ │ │ │ ││ │ └─────────────────┘ └─────────────────┘ │ ││ │ │ ││ └─────────────────────────────────────────────────────────────────┘ ││ │ ││ ▼ ││ ┌─────────────────────────────────────────────────────────────────┐ ││ │ MEMORY STRUCTURES │ ││ ├─────────────────────────────────────────────────────────────────┤ ││ │ │ ││ │ ┌─────────────────┐ ┌─────────────────┐ │ ││ │ │ Buffer Pool │ │ Query Cache │ (MariaDB 10.x) │ ││ │ │ (InnoDB) │ │ (Deprecated) │ │ ││ │ │ - Data cache │ │ - Results │ │ ││ │ │ - Index cache │ │ - Query │ │ ││ │ └─────────────────┘ └─────────────────┘ │ ││ │ │ ││ │ ┌─────────────────┐ ┌─────────────────┐ │ ││ │ │ Key Buffer │ │ Sort Buffer │ │ ││ │ │ (MyISAM) │ │ │ │ ││ │ └─────────────────┘ └─────────────────┘ │ ││ │ │ ││ └─────────────────────────────────────────────────────────────────┘ ││ │└─────────────────────────────────────────────────────────────────────────┘Memory Configuration
Section titled “Memory Configuration”# ============================================================# MySQL/MariaDB Memory Configuration - postgresql.conf equivalent# ============================================================
[mysqld]# ========================================# INNODB SETTINGS (Most Important)# ========================================
# Buffer Pool - Most critical setting# Should be 70-80% of available RAM for dedicated DB serverinnodb_buffer_pool_size = 4Ginnodb_buffer_pool_instances = 8 # Multiple instances reduce contentioninnodb_buffer_pool_chunk_size = 128M
# Transaction logsinnodb_log_file_size = 512M # Larger = better write performanceinnodb_log_buffer_size = 16M # For transaction commitsinnodb_flush_log_at_trx_commit = 1 # 1 = full ACID, 2 = some data loss possible# 1: ACID compliant (default, safest)# 2: Better performance, ~1 sec data loss on crash# 0: Best performance, up to 1 sec data loss
# Thread handlinginnodb_thread_concurrency = 0 # 0 = unlimited (usually best)innodb_thread_sleep_delay = 10000 # Microseconds to sleep
# I/O settingsinnodb_io_capacity = 2000 # SSD: 7000+, HDD: 200-800innodb_io_capacity_max = 4000innodb_read_io_threads = 4 # Parallel read threadsinnodb_write_io_threads = 4 # Parallel write threadsinnodb_flush_method = O_DIRECT # Avoid double buffering (Linux)
# File formatinnodb_file_per_table = 1 # One file per tableinnodb_file_format = Barracuda # For compression
# ========================================# CONNECTION SETTINGS# ========================================
max_connections = 200 # Adjust based on expected load# Each connection uses ~256KB-1MB RAM minimum# For 200 connections + buffer pool, ensure enough RAM
# Thread cachethread_cache_size = 50 # Threads to keep cachedtable_open_cache = 4000 # Table cachetable_definition_cache = 2000 # Table definitions
# ========================================# QUERY CACHE (MySQL 5.7 - DEPRECATED in 8.0)# MariaDB 10.x still has it# ========================================# query_cache_type = 0 # Disabled in MySQL 8.0+# query_cache_size = 0 # Disabled in MySQL 8.0+# MariaDB alternative: query_cache_type = 1 (but still has overhead)
# ========================================# TEMP TABLES AND SORTS# ========================================
tmp_table_size = 64M # Max size for in-memory temp tablesmax_heap_table_size = 64M # Max size for MEMORY tablessort_buffer_size = 2M # Per-connection sort bufferjoin_buffer_size = 2M # Per-connection join bufferread_buffer_size = 2M # Sequential scan bufferread_rnd_buffer_size = 8M # Random read buffer
# ========================================# LOGGING# ========================================
slow_query_log = 1slow_query_log_file = /var/log/mysql/slow.loglong_query_time = 2 # Log queries > 2 secondslog_queries_not_using_indexes = 1 # Log queries without indexeslog_throttle_queries_not_using_indexes = 10
general_log = 0 # Disable in productiongeneral_log_file = /var/log/mysql/general.log
# Binary logging (for replication/point-in-time recovery)server_id = 1log_bin = /var/log/mysql/mysql-binbinlog_format = ROW # ROW, STATEMENT, or MIXEDbinlog_row_image = FULL # FULL, MINIMAL, NOBLOBexpire_logs_days = 7max_binlog_size = 100M
# ========================================# CHARACTER SET# ========================================
character-set-server = utf8mb4collation-server = utf8mb4_unicode_ciQuery Optimization
Section titled “Query Optimization”-- ============================================================-- MySQL QUERY OPTIMIZATION-- ============================================================
-- EXPLAIN - Analyze query executionEXPLAIN SELECT * FROM users WHERE email = 'user@example.com';EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';
-- Understanding EXPLAIN output:-- type: ALL (full scan) < index < range < ref < eq_ref < const-- key: Index being used-- rows: Estimated rows to examine-- Extra: Using filesort, Using temporary, Using index
-- Show index usageSHOW INDEX FROM users;SHOW TABLE STATUS FROM database;
-- Find missing indexes-- Enable slow query log and analyze-- Or use: EXPLAIN for slow queries
-- Query optimization examples:
-- BAD: SELECT * FROM orders WHERE YEAR(created_at) = 2024;-- GOOD: SELECT * FROM orders WHERE created_at >= '2024-01-01'-- AND created_at < '2025-01-01';
-- BAD: SELECT * FROM users WHERE email LIKE '%gmail.com';-- GOOD: Can't optimize leading wildcard-- Alternative: Full-text search or Elasticsearch
-- Index optimization:ALTER TABLE orders ADD INDEX idx_created (created_at);ALTER TABLE orders ADD INDEX idx_status_created (status, created_at);ALTER TABLE orders ADD INDEX idx_customer_order (customer_id, order_id);
-- Composite index order matters!-- Index on (a, b, c) helps queries with:-- WHERE a = x-- WHERE a = x AND b = y-- WHERE a = x AND b = y AND c = z-- But NOT: WHERE b = y OR WHERE c = z
-- Covering index (includes all needed columns)ALTER TABLE users ADD INDEX idx_covering (id, email, name);
-- Check table statisticsANALYZE TABLE users;OPTIMIZE TABLE users;Monitoring and Troubleshooting
Section titled “Monitoring and Troubleshooting”-- ============================================================-- MySQL MONITORING QUERIES-- ============================================================
-- Current connectionsSHOW PROCESSLIST;SELECT * FROM information_schema.PROCESSLIST;
-- Connection statusSHOW STATUS LIKE 'Threads_connected';SHOW STATUS LIKE 'Threads_running';SHOW STATUS LIKE 'Max_used_connections';
-- Query cache (MySQL 5.7)SHOW STATUS LIKE 'Qcache%';-- Qcache_hits, Qcache_inserts, Qcache_not_cached
-- InnoDB statusSHOW ENGINE INNODB STATUS;-- Look for: lock waits, buffer pool statistics
-- InnoDB metricsSHOW GLOBAL STATUS LIKE 'Innodb_%';
-- Table cacheSHOW GLOBAL STATUS LIKE 'Open_%';
-- Slow query analysisSELECT * FROM mysql.slow_log ORDER BY start_time DESC LIMIT 10;
-- Lock analysisSELECT * FROM information_schema.INNODB_LOCKS;SELECT * FROM information_schema.INNODB_TRX;SELECT * FROM information_schema.INNODB_LOCK_WAITS;
-- Kill problematic queryKILL QUERY 12345; -- Kill specific query in connectionKILL 12345; -- Kill entire connection75.2 PostgreSQL Tuning
Section titled “75.2 PostgreSQL Tuning”PostgreSQL Architecture
Section titled “PostgreSQL Architecture”┌─────────────────────────────────────────────────────────────────────────┐│ PostgreSQL ARCHITECTURE │├─────────────────────────────────────────────────────────────────────────┤│ ││ ┌─────────────────────────────────────────────────────────────────┐ ││ │ PROCESS MODEL │ ││ ├─────────────────────────────────────────────────────────────────┤ ││ │ │ ││ │ ┌──────────────┐ ┌──────────────┐ │ ││ │ │ Postgres │ │ Background │ │ ││ │ │ (Master) │────►│ Processes │ │ ││ │ │ │ │ - Writer │ │ ││ │ │ - Listens │ │ - WAL │ │ ││ │ │ for │ │ - Checkpt │ │ ││ │ │ requests │ │ - AutoVac │ │ ││ │ └──────┬───────┘ │ - Logger │ │ ││ │ │ └──────────────┘ │ ││ │ │ connections │ ││ │ ▼ │ ││ │ ┌─────────────────────────────────────────────────────────┐ │ ││ │ │ Per-Connection Process (Backend) │ │ ││ │ │ ┌─────────────────────────────────────────────────┐ │ │ ││ │ │ │ Parser → Analyzer → Rewriter → Planner → │ │ │ ││ │ │ │ Executor │ │ │ ││ │ │ └─────────────────────────────────────────────────┘ │ │ ││ │ └─────────────────────────────────────────────────────────┘ │ ││ │ │ ││ └─────────────────────────────────────────────────────────────────┘ ││ ││ ┌─────────────────────────────────────────────────────────────────┐ ││ │ MEMORY STRUCTURES │ ││ ├─────────────────────────────────────────────────────────────────┤ ││ │ │ ││ │ ┌─────────────────────────────────────────────────────────┐ │ ││ │ │ Shared Memory │ ││ │ ├─────────────────────────────────────────────────────────┤ │ ││ │ │ │ │ ││ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ ││ │ │ │ Shared │ │ WAL │ │ Stats │ │ │ ││ │ │ │ Buffers │ │ Buffers │ │ Shared │ │ │ ││ │ │ │ (data+idx) │ │ (writes) │ │ Memory │ │ │ ││ │ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │ ││ │ │ │ │ ││ │ └─────────────────────────────────────────────────────────┘ │ ││ │ │ ││ │ ┌─────────────────────────────────────────────────────────┐ │ ││ │ │ Per-Process Memory │ ││ │ ├─────────────────────────────────────────────────────────┤ │ ││ │ │ │ │ ││ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ ││ │ │ │ work_mem │ │ maintenance │ │ temp │ │ │ ││ │ │ │ (sort, │ │ _work_mem │ │ buffers │ │ │ ││ │ │ │ hash, │ │ (VACUUM, │ │ (complex │ │ │ ││ │ │ │ merge) │ │ CREATE │ │ queries) │ │ │ ││ │ │ │ │ │ INDEX) │ │ │ │ │ ││ │ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │ ││ │ │ │ │ ││ │ └─────────────────────────────────────────────────────────┘ │ ││ │ │ ││ └─────────────────────────────────────────────────────────────────┘ ││ │└─────────────────────────────────────────────────────────────────────────┘PostgreSQL Configuration
Section titled “PostgreSQL Configuration”# ============================================================# PostgreSQL Memory Configuration - postgresql.conf# ============================================================
# ========================================# MEMORY SETTINGS (Most Critical)# ========================================
# Shared memory - for caching table data# Should be 25% of RAM for dedicated DB servershared_buffers = 2GB
# Memory for sorting/hashing per operation# Increase for complex queries, but watch total RAMwork_mem = 16MB
# Memory for maintenance operations (VACUUM, CREATE INDEX)# Should be larger than work_memmaintenance_work_mem = 256MB
# Memory for temporary tablestemp_buffers = 8MB
# Effective cache size - hint to planner# Set to 75% of available RAMeffective_cache_size = 6GB
# ========================================# CONNECTION SETTINGS# ========================================
# Max connections - balance between connections and RAM# Each connection uses work_mem + overheadmax_connections = 100
# Connection pooling (recommended: PgBouncer)# pg_hba.conf for connection auth
# ========================================# WRITE AHEAD LOG (WAL) SETTINGS# ========================================
# WAL level: minimal, replica, logicalwal_level = replica # For replication# minimal: Basic crash recovery# replica: Point-in-time recovery + streaming# logical: Logical replication
# WAL buffers - critical for write performancewal_buffers = 16MB # 3% of shared_buffers
# Checkpointscheckpoint_completion_target = 0.9 # Spread checkpoint writescheckpoint_timeout = 10min # Time between checkpointsmin_wal_size = 1GB # Minimum WAL to keepmax_wal_size = 2GB # WAL max before checkpoint
# Synchronous commitsynchronous_commit = on # ACID: on, off, local, remote_write# on: Wait for WAL (safest)# off: Better performance, potential data loss# remote_write: Wait for replica to receive
# ========================================# QUERY PLANNER# ========================================
# Random page cost - lower for SSDsrandom_page_cost = 1.1 # SSD: 1.1, HDD: 4.0effective_io_concurrency = 200 # Parallel I/O (SSD)
# Parallel queriesmax_worker_processes = 8 # CPU coresmax_parallel_workers_per_gather = 4 # Parallel workersmax_parallel_workers = 8max_parallel_maintenance_workers = 4
# Statisticsdefault_statistics_target = 100 # Planner accuracy (100-10000)
# ========================================# LOGGING# ========================================
log_destination = 'stderr'logging_collector = onlog_directory = 'log'log_filename = 'postgresql-%Y-%m-%d.log'log_rotation_age = 1dlog_rotation_size = 100MB
# Slow query logginglog_min_duration_statement = 1000 # Log queries > 1 secondlog_connections = onlog_disconnections = onlog_lock_waits = on # Log lock waits > deadlock_timeoutlog_temp_files = 0 # Log temp file usage
# Statement statisticstrack_activities = ontrack_counts = ontrack_io_timing = ontrack_functions = none # or pl for functions
# ========================================# AUTOVACUUM# ========================================
autovacuum = onautovacuum_max_workers = 3autovacuum_naptime = 1minautovacuum_vacuum_threshold = 50autovacuum_analyze_threshold = 50autovacuum_vacuum_scale_factor = 0.2 # Vac when 20% updatedautovacuum_analyze_scale_factor = 0.1 # Analyze when 10% changedQuery Optimization
Section titled “Query Optimization”-- ============================================================-- PostgreSQL QUERY OPTIMIZATION-- ============================================================
-- EXPLAIN - Query plan analysisEXPLAIN SELECT * FROM users WHERE email = 'user@example.com';EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';
-- Understanding the output:-- Seq Scan: Full table scan (bad for large tables)-- Index Scan: Uses index (good)-- Index Only Scan: Index covers all columns (best)-- Bitmap Scan: Uses bitmap index (good for large datasets)-- Hash Join, Merge Join, Nested Loop: Join strategies
-- Index creationCREATE INDEX idx_users_email ON users(email);CREATE INDEX idx_orders_status_date ON orders(status, created_at);CREATE INDEX idx_users_email ON users(email) WHERE status = 'active';
-- Partial index (smaller, faster)CREATE INDEX idx_orders_completed ON orders(created_at) WHERE status = 'completed';
-- Covering index (INCLUDE)CREATE INDEX idx_users_email_name ON users(email) INCLUDE (name, phone);
-- BRIN index (for time-series, append-only)CREATE INDEX idx_logs_time ON logs USING BRIN(created_at);
-- Reindex for performanceREINDEX INDEX idx_users_email;REINDEX TABLE users;
-- Table statisticsVACUUM ANALYZE users;-- VACUUM: Reclaims space, updates statistics-- ANALYZE: Updates statistics (needed for planner)
-- Check index usageSELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetchFROM pg_stat_user_indexesORDER BY idx_scan DESC;
-- Find unused indexesSELECT indexrelname, idx_scanFROM pg_stat_user_indexesWHERE idx_scan = 0AND indexrelname NOT LIKE '%pkey%';
-- Find missing indexes-- Enable auto_explain and check logs-- Or use: pg_stat_statementsCREATE EXTENSION pg_stat_statements;SELECT query, calls, mean_time, total_timeFROM pg_stat_statementsORDER BY mean_time DESC LIMIT 10;Monitoring
Section titled “Monitoring”-- ============================================================-- PostgreSQL MONITORING QUERIES-- ============================================================
-- Current activitySELECT * FROM pg_stat_activity;-- Useful for finding long-running queries
-- Database-level statsSELECT * FROM pg_stat_database;
-- Table statisticsSELECT * FROM pg_stat_user_tables;-- seq_scan, idx_scan, n_tup_ins/upd/del, n_live_tup
-- Index statisticsSELECT * FROM pg_stat_user_indexes;
-- Cache hit ratio (should be > 95%)SELECT sum(heap_blks_read) as heap_read, sum(heap_blks_hit) as heap_hit, round(sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) * 100, 2) as ratioFROM pg_statio_user_tables;
-- Index hit ratioSELECT sum(idx_blks_read) as idx_read, sum(idx_blks_hit) as idx_hit, round(sum(idx_blks_hit) / (sum(idx_blks_hit) + sum(idx_blks_read)) * 100, 2) as ratioFROM pg_statio_user_indexes;
-- Table sizesSELECT relname, pg_size_pretty(pg_total_relation_size(relid))FROM pg_catalog.pg_statio_user_tablesORDER BY pg_total_relation_size(relid) DESC;
-- Lock analysisSELECT * FROM pg_locks;SELECT pg_blocking_pids(pid) as blocked_by FROM pg_stat_activity;
-- WAL usageSELECT * FROM pg_stat_bgwriter;-- checkpoints_timed, checkpoints_req, buffers_checkpoint
-- ConfigurationSHOW max_connections;SHOW shared_buffers;SHOW work_mem;75.3 Connection Pooling
Section titled “75.3 Connection Pooling”Connection Pooling Architecture
Section titled “Connection Pooling Architecture”┌─────────────────────────────────────────────────────────────────────────┐│ CONNECTION POOLING │├─────────────────────────────────────────────────────────────────────────┤│ ││ WITHOUT POOLING: ││ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ││ │Request │ ───► │ Conn 1 │ ───► │ Conn 2 │ ───► │ Conn 3 │ ... ││ │ 1 │ │ Create │ │ Create │ │ Create │ ││ └────────┘ └────────┘ └────────┘ └────────┘ ││ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ││ │Request │ ───► │ Close │ ───► │ Close │ ───► │ Close │ ││ │ 2 │ │ Create │ │ Create │ │ Create │ ││ └────────┘ └────────┘ └────────┘ └────────┘ ││ ││ Each request: Create connection → Query → Close connection ││ Overhead: ~50-500ms per connection ││ ││ WITH POOLING: ││ ┌────────┐ ┌────────────────────────────────────────┐ ││ │Request │ ───► │ Connection Pool │ ││ │ 1 │ │ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │ ││ └────────┘ │ │ P1 │ │ P2 │ │ P3 │ │ P4 │ │ ││ │ └────┘ └────┘ └────┘ └────┘ │ ││ ┌────────┐ │ (reused) (reused) │ ││ │Request │ ───► │ │ ││ │ 2 │ └────────────────────────────────────────┘ ││ └────────┘ ││ ││ Connections created once, reused for multiple requests ││ │└─────────────────────────────────────────────────────────────────────────┘PgBouncer Configuration
Section titled “PgBouncer Configuration”# ============================================================# PgBouncer Configuration# ============================================================
# pgbouncer.ini
[databases]mydb = host=127.0.0.1 port=5432 dbname=mydb
[pgbouncer]listen_addr = 127.0.0.1listen_port = 6432auth_type = md5auth_file = /etc/pgbouncer/userlist.txt
# Pool sizespool_mode = transaction # transaction, session, statementmax_client_conn = 1000 # Max connections to pgbouncerdefault_pool_size = 20 # Connections per user/databasemin_pool_size = 5 # Always keep this manyreserve_pool_size = 5 # Extra connections for burstsreserve_pool_timeout = 5 # Seconds before extra connection
# Timeoutsserver_login_retry = 3server_idle_timeout = 600server_lifetime = 3600server_round_robin = 1
# Performanceserver_check_delay = 30query_timeout = 0 # 0 = disabledclient_login_timeout = 60
# Logginglog_connections = 0log_disconnections = 0log_pooler_errors = 1
# ============================================================# pg_hba.conf (for PgBouncer)# ========================================# host all all 127.0.0.1/32 md5MySQL Proxy/Connection Pool
Section titled “MySQL Proxy/Connection Pool”# ============================================================# MySQL Connection Pooling Options# ============================================================
# Option 1: MySQL Enterprise Pool (MySQL 8.0)# Built-in connection pool in MySQL 8.0.13+
# Option 2: ProxySQL# Create /etc/proxysql.cnf
datadir="/var/lib/proxysql"admin_variables={ admin_credentials="admin:admin" mysql_ifaces="127.0.0.1:6032;/var/lib/proxysql/proxysql.db"}
mysql_variables={ listeners=( { addr="0.0.0.0" port=6033 schema="mydb" hostgroup_id=1 } ) max_connections=1000 default_schema="mydb"}
mysql_servers =( { address="127.0.0.1" port=3306 hostgroup_id=1 status="ONLINE" })
mysql_users =( { username="app_user" password="password" default_hostgroup=1 })
# Option 3: MariaDB MaxScale# Similar to ProxySQL
# Option 4: Application-level pooling# - HikariCP (Java)# - PGBouncer alternative: MySQL doesn't have native pooling75.4 Indexing Strategies
Section titled “75.4 Indexing Strategies”Index Design Guidelines
Section titled “Index Design Guidelines”┌─────────────────────────────────────────────────────────────────────────┐│ INDEXING STRATEGIES │├─────────────────────────────────────────────────────────────────────────┤│ ││ ┌─────────────────────────────────────────────────────────────────┐ ││ │ WHEN TO INDEX │ ││ ├─────────────────────────────────────────────────────────────────┤ ││ │ │ ││ │ ✓ WHERE clauses: column used in filtering │ ││ │ ✓ JOIN conditions: columns used to join tables │ ││ │ ✓ ORDER BY: columns used for sorting │ ││ │ ✓ GROUP BY: columns used for grouping │ ││ │ ✓ DISTINCT: columns with DISTINCT │ ││ │ ✓ Aggregate functions: columns in MIN/MAX/COUNT │ ││ │ │ ││ │ ✗ Low cardinality: (e.g., boolean flags) │ ││ │ ✗ Rarely queried columns │ ││ │ ✗ Tables with mostly writes (indexes slow writes) │ ││ │ │ ││ └─────────────────────────────────────────────────────────────────┘ ││ ││ ┌─────────────────────────────────────────────────────────────────┐ ││ │ INDEX COLUMN ORDER │ ││ ├─────────────────────────────────────────────────────────────────┤ ││ │ │ ││ │ Index: (A, B, C) │ ││ │ │ ││ │ Query: │ Uses Index? │ ││ │ ──────────────────────────┼─────────────────────────────── │ ││ │ WHERE A = x │ ✓ Uses A │ ││ │ WHERE A = x AND B = y │ ✓ Uses A, B │ ││ │ WHERE A = x AND B = y │ ✓ Uses A, B, C │ ││ │ AND C = z │ │ ││ │ WHERE B = y │ ✗ Can't use (A not specified) │ ││ │ WHERE A = x AND C = z │ ✗ Uses A only (C not consecutive)│ ││ │ │ ││ └─────────────────────────────────────────────────────────────────┘ ││ ││ ┌─────────────────────────────────────────────────────────────────┐ ││ │ INDEX TYPES │ ││ ├─────────────────────────────────────────────────────────────────┤ ││ │ │ ││ │ B-Tree (Default) │ ││ │ - =, <, >, <=, >=, BETWEEN, LIKE 'prefix%' │ ││ │ │ ││ │ Hash │ ││ │ - = only (no range queries) │ ││ │ - Memory tables │ ││ │ │ ││ │ GiST (PostgreSQL) │ ││ │ - Spatial, full-text, range types │ ││ │ │ ││ │ GIN (PostgreSQL) │ ││ │ - Inverted indexes, arrays, JSONB │ ││ │ │ ││ │ BRIN (PostgreSQL) │ ││ │ - Block range indexes, time-series data │ ││ │ │ ││ │ Full-Text │ ││ │ - Text search in MySQL/MariaDB and PostgreSQL │ ││ │ │ ││ └─────────────────────────────────────────────────────────────────┘ ││ │└─────────────────────────────────────────────────────────────────────────┘75.5 Interview Questions
Section titled “75.5 Interview Questions”┌─────────────────────────────────────────────────────────────────────────┐│ DATABASE TUNING INTERVIEW QUESTIONS │├─────────────────────────────────────────────────────────────────────────┤ │Q1: What is the most important MySQL/InnoDB configuration setting? │ │A1: │innodb_buffer_pool_size is the most critical setting: │- Should be 70-80% of available RAM for dedicated DB server │- Caches both data and indexes │- Reduces disk I/O significantly │- Formula: Available RAM - (max_connections × work_mem) │- For 8GB RAM, use ~6GB for buffer pool │ │─────────────────────────────────────────────────────────────────────────┤ │Q2: Explain the difference between MySQL and PostgreSQL. │ │A2: │| Feature | MySQL | PostgreSQL | │|-------------------|----------------------|------------------------| || Type | MySQL AB → Oracle | Open source (PGDG) | || ACID | InnoDB: Full | Full | || Concurrency | Per-table locks | MVCC | || Indexes | B-tree, full-text, | B-tree, GiST, GIN, | || | spatial, hash | BRIN, Hash | || JSON | JSON, JSONB | JSONB (native) | || Query optimizer | Simple | Cost-based (advanced) | || Replication | Binlog | WAL-based | || Extensions | Limited | Rich (PostGIS, etc.) | || Strengths | Simplicity, speed | Features, standards | | │─────────────────────────────────────────────────────────────────────────┤ │Q3: How do you optimize a slow SQL query? │ │A3: │1. Run EXPLAIN to analyze execution plan │2. Check for full table scans (Seq Scan) │3. Add appropriate indexes │4. Verify index usage in query │5. Check for function usage in WHERE (can't use index) │6. Look for implicit type conversions │7. Check JOIN order (PostgreSQL optimizer handles, MySQL doesn't) │8. Review query structure - avoid SELECT * │9. Consider query rewrite │10. Use covering indexes if needed │ │─────────────────────────────────────────────────────────────────────────┤ │Q4: What are the different isolation levels? │ │A4: │- READ UNCOMMITTED: Can read uncommitted data (dirty reads) │- READ COMMITTED: Only see committed data (Oracle, PostgreSQL) │- REPEATABLE READ: Same query returns same results (MySQL default) │- SERIALIZABLE: Highest isolation, prevents all anomalies │ │MySQL/InnoDB: Uses REPEATABLE READ by default │PostgreSQL: Uses READ COMMITTED by default │ │─────────────────────────────────────────────────────────────────────────┤ │Q5: What is the difference between a covering index and a regular index?│ │A5: │Regular index: Contains indexed column values + pointer to row │Covering index: Contains all columns needed by the query │- Query can be satisfied entirely from the index │- "Index Only Scan" in PostgreSQL │- "Using index" in MySQL EXPLAIN │ │Example: │Query: SELECT id, email FROM users WHERE email = 'x'; │Index: CREATE INDEX idx_email_id ON users(email, id); -- Covering │ │─────────────────────────────────────────────────────────────────────────┤ │Q6: How does connection pooling improve performance? │ │A6: │- Eliminates connection overhead (create/teardown ~50-500ms) │- Limits total connections to database (prevents overload) │- Reuses existing connections (saves memory) │- Manages connection lifecycle (health checks, reconnection) │- Tools: PgBouncer (PostgreSQL), ProxySQL, HikariCP (app-level) │ │─────────────────────────────────────────────────────────────────────────┤ │Q7: What is the difference between shared_buffers and work_mem? │ │A7: │- shared_buffers: │ - Shared memory, used by all connections │ - Caches table data and indexes │ - Set to 25% of RAM │ │- work_mem: │ - Per-operation memory (sort, hash, merge) │ - Each query can use multiple work_mem (multiple sorts) │ - Set based on query complexity and available RAM │ - Too low = slow sorts, too high = OOM │ │─────────────────────────────────────────────────────────────────────────┤ │Q8: How do you monitor PostgreSQL query performance? │ │A8: │- pg_stat_statements: Track query performance │ CREATE EXTENSION pg_stat_statements; │ SELECT query, calls, mean_time FROM pg_stat_statements; │ │- EXPLAIN ANALYZE: Execute and analyze query │- pg_stat_activity: Current running queries │- pg_stat_user_tables: Table access statistics │- pg_stat_user_indexes: Index usage statistics │- Cache hit ratios: pg_statio_* views │- auto_explain: Log slow query plans │ │─────────────────────────────────────────────────────────────────────────┤ │Q9: What is MVCC and how does it affect database performance? │ │A9: │MVCC (Multi-Version Concurrency Control): │- Allows concurrent reads without blocking writes │- Each transaction sees a snapshot of data │- Writers don't block readers │- PostgreSQL uses MVCC heavily (visible in pg_stat_activity) │- VACUUM needed to reclaim old row versions │- Trade-off: Table bloat, VACUUM overhead │ │─────────────────────────────────────────────────────────────────────────┤ │Q10: What is the difference between InnoDB and MyISAM? │ │A10: │| Feature | InnoDB | MyISAM | ||----------------|---------------------|----------------------------| || Transaction | ACID | No | || Locking | Row-level | Table-level | || Crash recovery | Automatic | Requires repair | || Foreign keys | Supported | No | || Full-text | Supported (5.6+) | Native | || Speed | Better for writes | Better for reads | || Storage | Tablespace | Files per table | || Use case | Production | Read-only, archival | | │└─────────────────────────────────────────────────────────────────────────┘Quick Reference
Section titled “Quick Reference”# MySQLSHOW VARIABLES LIKE '%buffer%'; # Buffer settingsSHOW STATUS LIKE 'Innodb_%'; # InnoDB statsEXPLAIN SELECT ... # Query plan
# PostgreSQLSHOW shared_buffers;EXPLAIN ANALYZE SELECT ... # Query plan with timingSELECT pg_stat_statements_reset(); # Reset statsVACUUM ANALYZE; # Reclaim space + update stats
# CommonEXPLAIN <query> # Analyze querySHOW INDEXES FROM table; # Show indexesSHOW PROCESSLIST; # Active connectionsKILL <connection_id>; # Kill querySummary
Section titled “Summary”- MySQL: Buffer pool is critical (70-80% RAM), use InnoDB
- PostgreSQL: shared_buffers (25% RAM), work_mem per query
- Query Optimization: EXPLAIN is key, add appropriate indexes
- Connection Pooling: Use PgBouncer or ProxySQL
- Monitoring: Use built-in stats views and slow query logs
Next Chapter
Section titled “Next Chapter”Last Updated: February 2026