Database_tuning

Chapter 75: Database Performance Tuning

Overview

Database performance tuning is critical for DevOps and SRE roles. This chapter covers comprehensive tuning strategies for MySQL/MariaDB and PostgreSQL, including memory configuration, query optimization, indexing strategies, connection pooling, and performance monitoring. Understanding these topics is essential for maintaining high-performance database systems in production environments.

75.1 MySQL/MariaDB Tuning

Architecture Overview

┌─────────────────────────────────────────────────────────────────────────┐
│                    MySQL/MariaDB ARCHITECTURE                           │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                     CONNECTION LAYER                           │   │
│   ├─────────────────────────────────────────────────────────────────┤   │
│   │                                                                  │   │
│   │   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │   │
│   │   │  Connection │  │  Connection │  │  Connection │          │   │
│   │   │  Thread 1  │  │  Thread 2   │  │  Thread N   │          │   │
│   │   └──────┬──────┘  └──────┬──────┘  └──────┬──────┘          │   │
│   │          │                │                │                   │   │
│   │          └────────────────┼────────────────┘                   │   │
│   │                           ▼                                      │   │
│   │              ┌────────────────────────┐                         │   │
│   │              │    Thread Pool/        │                         │   │
│   │              │    Connection Pool     │                         │   │
│   │              └────────────────────────┘                         │   │
│   │                                                                  │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                   │                                      │
│                                   ▼                                      │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                      SQL LAYER                                 │   │
│   ├─────────────────────────────────────────────────────────────────┤   │
│   │                                                                  │   │
│   │   Parser → Optimizer → Query Cache → Execution Engine         │   │
│   │                                                                  │   │
│   │   Key Operations:                                               │   │
│   │   - Query parsing and validation                                │   │
│   │   - Query optimization (EXPLAIN)                               │   │
│   │   - Index selection                                             │   │
│   │   - Execution plan generation                                  │   │
│   │                                                                  │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                   │                                      │
│                                   ▼                                      │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                    STORAGE ENGINE LAYER                        │   │
│   ├─────────────────────────────────────────────────────────────────┤   │
│   │                                                                  │   │
│   │   ┌─────────────────┐    ┌─────────────────┐                   │   │
│   │   │     InnoDB      │    │      MyISAM    │                   │   │
│   │   │   (Default)     │    │    (Legacy)    │                   │   │
│   │   ├─────────────────┤    ├─────────────────┤                   │   │
│   │   │ - ACID          │    │ - Full-text    │                   │   │
│   │   │ - Row-level     │    │ - Table-lock   │                   │   │
│   │   │ - MVCC          │    │ - No transaction│                   │   │
│   │   │ - Transaction   │    │ - Fast reads   │                   │   │
│   │   │ - Crash-safe   │    │                 │                   │   │
│   │   └─────────────────┘    └─────────────────┘                   │   │
│   │                                                                  │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                   │                                      │
│                                   ▼                                      │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                    MEMORY STRUCTURES                           │   │
│   ├─────────────────────────────────────────────────────────────────┤   │
│   │                                                                  │   │
│   │   ┌─────────────────┐  ┌─────────────────┐                     │   │
│   │   │  Buffer Pool   │  │  Query Cache   │  (MariaDB 10.x)     │   │
│   │   │  (InnoDB)      │  │  (Deprecated)  │                     │   │
│   │   │  - Data cache  │  │  - Results     │                     │   │
│   │   │  - Index cache │  │  - Query       │                     │   │
│   │   └─────────────────┘  └─────────────────┘                     │   │
│   │                                                                  │   │
│   │   ┌─────────────────┐  ┌─────────────────┐                     │   │
│   │   │  Key Buffer     │  │   Sort Buffer   │                     │   │
│   │   │  (MyISAM)       │  │                 │                     │   │
│   │   └─────────────────┘  └─────────────────┘                     │   │
│   │                                                                  │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Memory Configuration

# ============================================================
# MySQL/MariaDB Memory Configuration - postgresql.conf equivalent
# ============================================================

[mysqld]
# ========================================
# INNODB SETTINGS (Most Important)
# ========================================

# Buffer Pool - Most critical setting
# Should be 70-80% of available RAM for dedicated DB server
innodb_buffer_pool_size = 4G
innodb_buffer_pool_instances = 8           # Multiple instances reduce contention
innodb_buffer_pool_chunk_size = 128M

# Transaction logs
innodb_log_file_size = 512M              # Larger = better write performance
innodb_log_buffer_size = 16M             # For transaction commits
innodb_flush_log_at_trx_commit = 1       # 1 = full ACID, 2 = some data loss possible
# 1: ACID compliant (default, safest)
# 2: Better performance, ~1 sec data loss on crash
# 0: Best performance, up to 1 sec data loss

# Thread handling
innodb_thread_concurrency = 0             # 0 = unlimited (usually best)
innodb_thread_sleep_delay = 10000         # Microseconds to sleep

# I/O settings
innodb_io_capacity = 2000                # SSD: 7000+, HDD: 200-800
innodb_io_capacity_max = 4000
innodb_read_io_threads = 4                # Parallel read threads
innodb_write_io_threads = 4               # Parallel write threads
innodb_flush_method = O_DIRECT           # Avoid double buffering (Linux)

# File format
innodb_file_per_table = 1                # One file per table
innodb_file_format = Barracuda           # For compression

# ========================================
# CONNECTION SETTINGS
# ========================================

max_connections = 200                    # Adjust based on expected load
# Each connection uses ~256KB-1MB RAM minimum
# For 200 connections + buffer pool, ensure enough RAM

# Thread cache
thread_cache_size = 50                   # Threads to keep cached
table_open_cache = 4000                  # Table cache
table_definition_cache = 2000           # Table definitions

# ========================================
# QUERY CACHE (MySQL 5.7 - DEPRECATED in 8.0)
# MariaDB 10.x still has it
# ========================================
# query_cache_type = 0                   # Disabled in MySQL 8.0+
# query_cache_size = 0                  # Disabled in MySQL 8.0+
# MariaDB alternative: query_cache_type = 1 (but still has overhead)

# ========================================
# TEMP TABLES AND SORTS
# ========================================

tmp_table_size = 64M                    # Max size for in-memory temp tables
max_heap_table_size = 64M               # Max size for MEMORY tables
sort_buffer_size = 2M                   # Per-connection sort buffer
join_buffer_size = 2M                   # Per-connection join buffer
read_buffer_size = 2M                  # Sequential scan buffer
read_rnd_buffer_size = 8M              # Random read buffer

# ========================================
# LOGGING
# ========================================

slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 2                     # Log queries > 2 seconds
log_queries_not_using_indexes = 1      # Log queries without indexes
log_throttle_queries_not_using_indexes = 10

general_log = 0                         # Disable in production
general_log_file = /var/log/mysql/general.log

# Binary logging (for replication/point-in-time recovery)
server_id = 1
log_bin = /var/log/mysql/mysql-bin
binlog_format = ROW                     # ROW, STATEMENT, or MIXED
binlog_row_image = FULL                # FULL, MINIMAL, NOBLOB
expire_logs_days = 7
max_binlog_size = 100M

# ========================================
# CHARACTER SET
# ========================================

character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

Query Optimization

-- ============================================================
-- MySQL QUERY OPTIMIZATION
-- ============================================================

-- EXPLAIN - Analyze query execution
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';

-- Understanding EXPLAIN output:
-- type: ALL (full scan) < index < range < ref < eq_ref < const
-- key: Index being used
-- rows: Estimated rows to examine
-- Extra: Using filesort, Using temporary, Using index

-- Show index usage
SHOW INDEX FROM users;
SHOW TABLE STATUS FROM database;

-- Find missing indexes
-- Enable slow query log and analyze
-- Or use: EXPLAIN for slow queries

-- Query optimization examples:

-- BAD: SELECT * FROM orders WHERE YEAR(created_at) = 2024;
-- GOOD: SELECT * FROM orders WHERE created_at >= '2024-01-01'
--       AND created_at < '2025-01-01';

-- BAD: SELECT * FROM users WHERE email LIKE '%gmail.com';
-- GOOD: Can't optimize leading wildcard
-- Alternative: Full-text search or Elasticsearch

-- Index optimization:
ALTER TABLE orders ADD INDEX idx_created (created_at);
ALTER TABLE orders ADD INDEX idx_status_created (status, created_at);
ALTER TABLE orders ADD INDEX idx_customer_order (customer_id, order_id);

-- Composite index order matters!
-- Index on (a, b, c) helps queries with:
-- WHERE a = x
-- WHERE a = x AND b = y
-- WHERE a = x AND b = y AND c = z
-- But NOT: WHERE b = y OR WHERE c = z

-- Covering index (includes all needed columns)
ALTER TABLE users ADD INDEX idx_covering (id, email, name);

-- Check table statistics
ANALYZE TABLE users;
OPTIMIZE TABLE users;

Monitoring and Troubleshooting

-- ============================================================
-- MySQL MONITORING QUERIES
-- ============================================================

-- Current connections
SHOW PROCESSLIST;
SELECT * FROM information_schema.PROCESSLIST;

-- Connection status
SHOW STATUS LIKE 'Threads_connected';
SHOW STATUS LIKE 'Threads_running';
SHOW STATUS LIKE 'Max_used_connections';

-- Query cache (MySQL 5.7)
SHOW STATUS LIKE 'Qcache%';
-- Qcache_hits, Qcache_inserts, Qcache_not_cached

-- InnoDB status
SHOW ENGINE INNODB STATUS;
-- Look for: lock waits, buffer pool statistics

-- InnoDB metrics
SHOW GLOBAL STATUS LIKE 'Innodb_%';

-- Table cache
SHOW GLOBAL STATUS LIKE 'Open_%';

-- Slow query analysis
SELECT * FROM mysql.slow_log ORDER BY start_time DESC LIMIT 10;

-- Lock analysis
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM information_schema.INNODB_TRX;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;

-- Kill problematic query
KILL QUERY 12345;  -- Kill specific query in connection
KILL 12345;        -- Kill entire connection

75.2 PostgreSQL Tuning

PostgreSQL Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                   PostgreSQL ARCHITECTURE                               │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                      PROCESS MODEL                              │   │
│   ├─────────────────────────────────────────────────────────────────┤   │
│   │                                                                  │   │
│   │   ┌──────────────┐     ┌──────────────┐                        │   │
│   │   │   Postgres   │     │  Background  │                        │   │
│   │   │   (Master)  │────►│  Processes   │                        │   │
│   │   │              │     │  - Writer   │                        │   │
│   │   │   - Listens │     │  - WAL      │                        │   │
│   │   │     for     │     │  - Checkpt  │                        │   │
│   │   │   requests  │     │  - AutoVac  │                        │   │
│   │   └──────┬───────┘     │  - Logger   │                        │   │
│   │          │             └──────────────┘                         │   │
│   │          │ connections                                          │   │
│   │          ▼                                                       │   │
│   │   ┌─────────────────────────────────────────────────────────┐   │   │
│   │   │           Per-Connection Process (Backend)             │   │   │
│   │   │   ┌─────────────────────────────────────────────────┐   │   │   │
│   │   │   │  Parser → Analyzer → Rewriter → Planner →     │   │   │   │
│   │   │   │  Executor                                    │   │   │   │
│   │   │   └─────────────────────────────────────────────────┘   │   │   │
│   │   └─────────────────────────────────────────────────────────┘   │   │
│   │                                                                  │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                    MEMORY STRUCTURES                            │   │
│   ├─────────────────────────────────────────────────────────────────┤   │
│   │                                                                  │   │
│   │   ┌─────────────────────────────────────────────────────────┐   │   │
│   │   │                    Shared Memory                        │   │
│   │   ├─────────────────────────────────────────────────────────┤   │   │
│   │   │                                                          │   │   │
│   │   │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │   │   │
│   │   │  │   Shared    │  │    WAL     │  │   Stats    │    │   │   │
│   │   │  │   Buffers   │  │   Buffers   │  │   Shared   │    │   │   │
│   │   │  │  (data+idx) │  │   (writes)  │  │   Memory   │    │   │   │
│   │   │  └─────────────┘  └─────────────┘  └─────────────┘    │   │   │
│   │   │                                                          │   │   │
│   │   └─────────────────────────────────────────────────────────┘   │   │
│   │                                                                  │   │
│   │   ┌─────────────────────────────────────────────────────────┐   │   │
│   │   │                    Per-Process Memory                    │   │
│   │   ├─────────────────────────────────────────────────────────┤   │   │
│   │   │                                                          │   │   │
│   │   │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │   │   │
│   │   │  │  work_mem  │  │ maintenance │  │   temp     │    │   │   │
│   │   │  │  (sort,    │  │ _work_mem  │  │  buffers   │    │   │   │
│   │   │  │   hash,    │  │  (VACUUM,  │  │ (complex   │    │   │   │
│   │   │  │   merge)   │  │   CREATE   │  │  queries)  │    │   │   │
│   │   │  │            │  │   INDEX)   │  │            │    │   │   │
│   │   │  └─────────────┘  └─────────────┘  └─────────────┘    │   │   │
│   │   │                                                          │   │   │
│   │   └─────────────────────────────────────────────────────────┘   │   │
│   │                                                                  │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

PostgreSQL Configuration

# ============================================================
# PostgreSQL Memory Configuration - postgresql.conf
# ============================================================

# ========================================
# MEMORY SETTINGS (Most Critical)
# ========================================

# Shared memory - for caching table data
# Should be 25% of RAM for dedicated DB server
shared_buffers = 2GB

# Memory for sorting/hashing per operation
# Increase for complex queries, but watch total RAM
work_mem = 16MB

# Memory for maintenance operations (VACUUM, CREATE INDEX)
# Should be larger than work_mem
maintenance_work_mem = 256MB

# Memory for temporary tables
temp_buffers = 8MB

# Effective cache size - hint to planner
# Set to 75% of available RAM
effective_cache_size = 6GB

# ========================================
# CONNECTION SETTINGS
# ========================================

# Max connections - balance between connections and RAM
# Each connection uses work_mem + overhead
max_connections = 100

# Connection pooling (recommended: PgBouncer)
# pg_hba.conf for connection auth

# ========================================
# WRITE AHEAD LOG (WAL) SETTINGS
# ========================================

# WAL level: minimal, replica, logical
wal_level = replica                     # For replication
# minimal: Basic crash recovery
# replica: Point-in-time recovery + streaming
# logical: Logical replication

# WAL buffers - critical for write performance
wal_buffers = 16MB                     # 3% of shared_buffers

# Checkpoints
checkpoint_completion_target = 0.9     # Spread checkpoint writes
checkpoint_timeout = 10min            # Time between checkpoints
min_wal_size = 1GB                    # Minimum WAL to keep
max_wal_size = 2GB                    # WAL max before checkpoint

# Synchronous commit
synchronous_commit = on                # ACID: on, off, local, remote_write
# on: Wait for WAL (safest)
# off: Better performance, potential data loss
# remote_write: Wait for replica to receive

# ========================================
# QUERY PLANNER
# ========================================

# Random page cost - lower for SSDs
random_page_cost = 1.1                 # SSD: 1.1, HDD: 4.0
effective_io_concurrency = 200         # Parallel I/O (SSD)

# Parallel queries
max_worker_processes = 8               # CPU cores
max_parallel_workers_per_gather = 4   # Parallel workers
max_parallel_workers = 8
max_parallel_maintenance_workers = 4

# Statistics
default_statistics_target = 100         # Planner accuracy (100-10000)

# ========================================
# LOGGING
# ========================================

log_destination = 'stderr'
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%Y-%m-%d.log'
log_rotation_age = 1d
log_rotation_size = 100MB

# Slow query logging
log_min_duration_statement = 1000      # Log queries > 1 second
log_connections = on
log_disconnections = on
log_lock_waits = on                    # Log lock waits > deadlock_timeout
log_temp_files = 0                     # Log temp file usage

# Statement statistics
track_activities = on
track_counts = on
track_io_timing = on
track_functions = none                  # or pl for functions

# ========================================
# AUTOVACUUM
# ========================================

autovacuum = on
autovacuum_max_workers = 3
autovacuum_naptime = 1min
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50
autovacuum_vacuum_scale_factor = 0.2   # Vac when 20% updated
autovacuum_analyze_scale_factor = 0.1  # Analyze when 10% changed

Query Optimization

-- ============================================================
-- PostgreSQL QUERY OPTIMIZATION
-- ============================================================

-- EXPLAIN - Query plan analysis
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';

-- Understanding the output:
-- Seq Scan: Full table scan (bad for large tables)
-- Index Scan: Uses index (good)
-- Index Only Scan: Index covers all columns (best)
-- Bitmap Scan: Uses bitmap index (good for large datasets)
-- Hash Join, Merge Join, Nested Loop: Join strategies

-- Index creation
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_status_date ON orders(status, created_at);
CREATE INDEX idx_users_email ON users(email) WHERE status = 'active';

-- Partial index (smaller, faster)
CREATE INDEX idx_orders_completed ON orders(created_at)
    WHERE status = 'completed';

-- Covering index (INCLUDE)
CREATE INDEX idx_users_email_name ON users(email) INCLUDE (name, phone);

-- BRIN index (for time-series, append-only)
CREATE INDEX idx_logs_time ON logs USING BRIN(created_at);

-- Reindex for performance
REINDEX INDEX idx_users_email;
REINDEX TABLE users;

-- Table statistics
VACUUM ANALYZE users;
-- VACUUM: Reclaims space, updates statistics
-- ANALYZE: Updates statistics (needed for planner)

-- Check index usage
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC;

-- Find unused indexes
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE '%pkey%';

-- Find missing indexes
-- Enable auto_explain and check logs
-- Or use: pg_stat_statements
CREATE EXTENSION pg_stat_statements;
SELECT query, calls, mean_time, total_time
FROM pg_stat_statements
ORDER BY mean_time DESC LIMIT 10;

Monitoring

-- ============================================================
-- PostgreSQL MONITORING QUERIES
-- ============================================================

-- Current activity
SELECT * FROM pg_stat_activity;
-- Useful for finding long-running queries

-- Database-level stats
SELECT * FROM pg_stat_database;

-- Table statistics
SELECT * FROM pg_stat_user_tables;
-- seq_scan, idx_scan, n_tup_ins/upd/del, n_live_tup

-- Index statistics
SELECT * FROM pg_stat_user_indexes;

-- Cache hit ratio (should be > 95%)
SELECT
    sum(heap_blks_read) as heap_read,
    sum(heap_blks_hit) as heap_hit,
    round(sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) * 100, 2) as ratio
FROM pg_statio_user_tables;

-- Index hit ratio
SELECT
    sum(idx_blks_read) as idx_read,
    sum(idx_blks_hit) as idx_hit,
    round(sum(idx_blks_hit) / (sum(idx_blks_hit) + sum(idx_blks_read)) * 100, 2) as ratio
FROM pg_statio_user_indexes;

-- Table sizes
SELECT relname, pg_size_pretty(pg_total_relation_size(relid))
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;

-- Lock analysis
SELECT * FROM pg_locks;
SELECT pg_blocking_pids(pid) as blocked_by FROM pg_stat_activity;

-- WAL usage
SELECT * FROM pg_stat_bgwriter;
-- checkpoints_timed, checkpoints_req, buffers_checkpoint

-- Configuration
SHOW max_connections;
SHOW shared_buffers;
SHOW work_mem;

75.3 Connection Pooling

Connection Pooling Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                   CONNECTION POOLING                                    │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   WITHOUT POOLING:                                                     │
│   ┌────────┐      ┌────────┐      ┌────────┐      ┌────────┐        │
│   │Request │ ───► │ Conn 1 │ ───► │ Conn 2 │ ───► │ Conn 3 │  ...   │
│   │   1   │      │ Create │      │ Create │      │ Create │        │
│   └────────┘      └────────┘      └────────┘      └────────┘        │
│   ┌────────┐      ┌────────┐      ┌────────┐      ┌────────┐        │
│   │Request │ ───► │ Close  │ ───► │ Close  │ ───► │ Close  │        │
│   │   2   │      │ Create │      │ Create │      │ Create │        │
│   └────────┘      └────────┘      └────────┘      └────────┘        │
│                                                                         │
│   Each request: Create connection → Query → Close connection         │
│   Overhead: ~50-500ms per connection                                  │
│                                                                         │
│   WITH POOLING:                                                        │
│   ┌────────┐      ┌────────────────────────────────────────┐         │
│   │Request │ ───► │         Connection Pool               │         │
│   │   1   │      │  ┌────┐  ┌────┐  ┌────┐  ┌────┐     │         │
│   └────────┘      │  │ P1 │  │ P2 │  │ P3 │  │ P4 │     │         │
│                   │  └────┘  └────┘  └────┘  └────┘     │         │
│   ┌────────┐      │   (reused)    (reused)              │         │
│   │Request │ ───► │                                        │         │
│   │   2   │      └────────────────────────────────────────┘         │
│   └────────┘                                                         │
│                                                                         │
│   Connections created once, reused for multiple requests              │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

PgBouncer Configuration

# ============================================================
# PgBouncer Configuration
# ============================================================

# pgbouncer.ini

[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

# Pool sizes
pool_mode = transaction            # transaction, session, statement
max_client_conn = 1000            # Max connections to pgbouncer
default_pool_size = 20            # Connections per user/database
min_pool_size = 5                # Always keep this many
reserve_pool_size = 5             # Extra connections for bursts
reserve_pool_timeout = 5          # Seconds before extra connection

# Timeouts
server_login_retry = 3
server_idle_timeout = 600
server_lifetime = 3600
server_round_robin = 1

# Performance
server_check_delay = 30
query_timeout = 0                 # 0 = disabled
client_login_timeout = 60

# Logging
log_connections = 0
log_disconnections = 0
log_pooler_errors = 1

# ============================================================
# pg_hba.conf (for PgBouncer)
# ========================================
# host all all 127.0.0.1/32 md5

MySQL Proxy/Connection Pool

# ============================================================
# MySQL Connection Pooling Options
# ============================================================

# Option 1: MySQL Enterprise Pool (MySQL 8.0)
# Built-in connection pool in MySQL 8.0.13+

# Option 2: ProxySQL
# Create /etc/proxysql.cnf

datadir="/var/lib/proxysql"
admin_variables=
{
    admin_credentials="admin:admin"
    mysql_ifaces="127.0.0.1:6032;/var/lib/proxysql/proxysql.db"
}

mysql_variables=
{
    listeners=(
        {
            addr="0.0.0.0"
            port=6033
            schema="mydb"
            hostgroup_id=1
        }
    )
    max_connections=1000
    default_schema="mydb"
}

mysql_servers =
(
    {
        address="127.0.0.1"
        port=3306
        hostgroup_id=1
        status="ONLINE"
    }
)

mysql_users =
(
    {
        username="app_user"
        password="password"
        default_hostgroup=1
    }
)

# Option 3: MariaDB MaxScale
# Similar to ProxySQL

# Option 4: Application-level pooling
# - HikariCP (Java)
# - PGBouncer alternative: MySQL doesn't have native pooling

75.4 Indexing Strategies

Index Design Guidelines

┌─────────────────────────────────────────────────────────────────────────┐
│                       INDEXING STRATEGIES                               │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                    WHEN TO INDEX                                 │   │
│   ├─────────────────────────────────────────────────────────────────┤   │
│   │                                                                  │   │
│   │  ✓ WHERE clauses: column used in filtering                     │   │
│   │  ✓ JOIN conditions: columns used to join tables                │   │
│   │  ✓ ORDER BY: columns used for sorting                           │   │
│   │  ✓ GROUP BY: columns used for grouping                         │   │
│   │  ✓ DISTINCT: columns with DISTINCT                             │   │
│   │  ✓ Aggregate functions: columns in MIN/MAX/COUNT               │   │
│   │                                                                  │   │
│   │  ✗ Low cardinality: (e.g., boolean flags)                     │   │
│   │  ✗ Rarely queried columns                                      │   │
│   │  ✗ Tables with mostly writes (indexes slow writes)            │   │
│   │                                                                  │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                 INDEX COLUMN ORDER                              │   │
│   ├─────────────────────────────────────────────────────────────────┤   │
│   │                                                                  │   │
│   │  Index: (A, B, C)                                              │   │
│   │                                                                  │   │
│   │  Query:                    │  Uses Index?                      │   │
│   │  ──────────────────────────┼───────────────────────────────    │   │
│   │  WHERE A = x               │  ✓ Uses A                          │   │
│   │  WHERE A = x AND B = y    │  ✓ Uses A, B                       │   │
│   │  WHERE A = x AND B = y    │  ✓ Uses A, B, C                    │   │
│   │       AND C = z           │                                    │   │
│   │  WHERE B = y               │  ✗ Can't use (A not specified)    │   │
│   │  WHERE A = x AND C = z    │  ✗ Uses A only (C not consecutive)│   │
│   │                                                                  │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                   INDEX TYPES                                   │   │
│   ├─────────────────────────────────────────────────────────────────┤   │
│   │                                                                  │   │
│   │  B-Tree (Default)                                               │   │
│   │  - =, <, >, <=, >=, BETWEEN, LIKE 'prefix%'                    │   │
│   │                                                                  │   │
│   │  Hash                                                           │   │
│   │  - = only (no range queries)                                    │   │
│   │  - Memory tables                                                │   │
│   │                                                                  │   │
│   │  GiST (PostgreSQL)                                              │   │
│   │  - Spatial, full-text, range types                             │   │
│   │                                                                  │   │
│   │  GIN (PostgreSQL)                                               │   │
│   │  - Inverted indexes, arrays, JSONB                             │   │
│   │                                                                  │   │
│   │  BRIN (PostgreSQL)                                              │   │
│   │  - Block range indexes, time-series data                      │   │
│   │                                                                  │   │
│   │  Full-Text                                                      │   │
│   │  - Text search in MySQL/MariaDB and PostgreSQL               │   │
│   │                                                                  │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

75.5 Interview Questions

┌─────────────────────────────────────────────────────────────────────────┐
│                 DATABASE TUNING INTERVIEW QUESTIONS                      │
├─────────────────────────────────────────────────────────────────────────┤
                                                                         │
Q1: What is the most important MySQL/InnoDB configuration setting?       │
                                                                         │
A1:                                                                       │
innodb_buffer_pool_size is the most critical setting:                  │
- Should be 70-80% of available RAM for dedicated DB server           │
- Caches both data and indexes                                         │
- Reduces disk I/O significantly                                        │
- Formula: Available RAM - (max_connections × work_mem)                │
- For 8GB RAM, use ~6GB for buffer pool                               │
                                                                         │
─────────────────────────────────────────────────────────────────────────┤
                                                                         │
Q2: Explain the difference between MySQL and PostgreSQL.               │
                                                                         │
A2:                                                                       │
| Feature           | MySQL                 | PostgreSQL              |  │
|-------------------|----------------------|------------------------|  |
| Type              | MySQL AB → Oracle    | Open source (PGDG)    |  |
| ACID              | InnoDB: Full         | Full                  |  |
| Concurrency       | Per-table locks      | MVCC                  |  |
| Indexes           | B-tree, full-text,   | B-tree, GiST, GIN,    |  |
|                   | spatial, hash         | BRIN, Hash            |  |
| JSON              | JSON, JSONB          | JSONB (native)        |  |
| Query optimizer   | Simple               | Cost-based (advanced) |  |
| Replication       | Binlog               | WAL-based             |  |
| Extensions        | Limited              | Rich (PostGIS, etc.)  |  |
| Strengths         | Simplicity, speed    | Features, standards   |  |
                                                                         │
─────────────────────────────────────────────────────────────────────────┤
                                                                         │
Q3: How do you optimize a slow SQL query?                               │
                                                                         │
A3:                                                                       │
1. Run EXPLAIN to analyze execution plan                               │
2. Check for full table scans (Seq Scan)                              │
3. Add appropriate indexes                                             │
4. Verify index usage in query                                         │
5. Check for function usage in WHERE (can't use index)               │
6. Look for implicit type conversions                                  │
7. Check JOIN order (PostgreSQL optimizer handles, MySQL doesn't)    │
8. Review query structure - avoid SELECT *                           │
9. Consider query rewrite                                              │
10. Use covering indexes if needed                                     │
                                                                         │
─────────────────────────────────────────────────────────────────────────┤
                                                                         │
Q4: What are the different isolation levels?                           │
                                                                         │
A4:                                                                       │
- READ UNCOMMITTED: Can read uncommitted data (dirty reads)           │
- READ COMMITTED: Only see committed data (Oracle, PostgreSQL)       │
- REPEATABLE READ: Same query returns same results (MySQL default)   │
- SERIALIZABLE: Highest isolation, prevents all anomalies            │
                                                                         │
MySQL/InnoDB: Uses REPEATABLE READ by default                         │
PostgreSQL: Uses READ COMMITTED by default                            │
                                                                         │
─────────────────────────────────────────────────────────────────────────┤
                                                                         │
Q5: What is the difference between a covering index and a regular index?│
                                                                         │
A5:                                                                       │
Regular index: Contains indexed column values + pointer to row         │
Covering index: Contains all columns needed by the query              │
- Query can be satisfied entirely from the index                     │
- "Index Only Scan" in PostgreSQL                                     │
- "Using index" in MySQL EXPLAIN                                       │
                                                                         │
Example:                                                               │
Query: SELECT id, email FROM users WHERE email = 'x';                │
Index: CREATE INDEX idx_email_id ON users(email, id);  -- Covering  │
                                                                         │
─────────────────────────────────────────────────────────────────────────┤
                                                                         │
Q6: How does connection pooling improve performance?                  │
                                                                         │
A6:                                                                       │
- Eliminates connection overhead (create/teardown ~50-500ms)          │
- Limits total connections to database (prevents overload)           │
- Reuses existing connections (saves memory)                          │
- Manages connection lifecycle (health checks, reconnection)         │
- Tools: PgBouncer (PostgreSQL), ProxySQL, HikariCP (app-level)     │
                                                                         │
─────────────────────────────────────────────────────────────────────────┤
                                                                         │
Q7: What is the difference between shared_buffers and work_mem?        │
                                                                         │
A7:                                                                       │
- shared_buffers:                                                       │
  - Shared memory, used by all connections                            │
  - Caches table data and indexes                                     │
  - Set to 25% of RAM                                                 │
                                                                         │
- work_mem:                                                             │
  - Per-operation memory (sort, hash, merge)                         │
  - Each query can use multiple work_mem (multiple sorts)           │
  - Set based on query complexity and available RAM                  │
  - Too low = slow sorts, too high = OOM                             │
                                                                         │
─────────────────────────────────────────────────────────────────────────┤
                                                                         │
Q8: How do you monitor PostgreSQL query performance?                  │
                                                                         │
A8:                                                                       │
- pg_stat_statements: Track query performance                        │
  CREATE EXTENSION pg_stat_statements;                               │
  SELECT query, calls, mean_time FROM pg_stat_statements;            │
                                                                         │
- EXPLAIN ANALYZE: Execute and analyze query                         │
- pg_stat_activity: Current running queries                          │
- pg_stat_user_tables: Table access statistics                      │
- pg_stat_user_indexes: Index usage statistics                      │
- Cache hit ratios: pg_statio_* views                                │
- auto_explain: Log slow query plans                                │
                                                                         │
─────────────────────────────────────────────────────────────────────────┤
                                                                         │
Q9: What is MVCC and how does it affect database performance?        │
                                                                         │
A9:                                                                       │
MVCC (Multi-Version Concurrency Control):                            │
- Allows concurrent reads without blocking writes                  │
- Each transaction sees a snapshot of data                           │
- Writers don't block readers                                         │
- PostgreSQL uses MVCC heavily (visible in pg_stat_activity)         │
- VACUUM needed to reclaim old row versions                          │
- Trade-off: Table bloat, VACUUM overhead                            │
                                                                         │
─────────────────────────────────────────────────────────────────────────┤
                                                                         │
Q10: What is the difference between InnoDB and MyISAM?                 │
                                                                         │
A10:                                                                      │
| Feature        | InnoDB               | MyISAM                     |  |
|----------------|---------------------|----------------------------| |
| Transaction    | ACID                 | No                        |  |
| Locking        | Row-level            | Table-level               |  |
| Crash recovery | Automatic            | Requires repair           |  |
| Foreign keys   | Supported            | No                        |  |
| Full-text      | Supported (5.6+)     | Native                    |  |
| Speed          | Better for writes    | Better for reads         |  |
| Storage        | Tablespace           | Files per table          |  |
| Use case       | Production           | Read-only, archival      |  |
                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Quick Reference

# MySQL
SHOW VARIABLES LIKE '%buffer%';              # Buffer settings
SHOW STATUS LIKE 'Innodb_%';                 # InnoDB stats
EXPLAIN SELECT ...                           # Query plan

# PostgreSQL
SHOW shared_buffers;
EXPLAIN ANALYZE SELECT ...                  # Query plan with timing
SELECT pg_stat_statements_reset();          # Reset stats
VACUUM ANALYZE;                             # Reclaim space + update stats

# Common
EXPLAIN <query>                             # Analyze query
SHOW INDEXES FROM table;                    # Show indexes
SHOW PROCESSLIST;                           # Active connections
KILL <connection_id>;                       # Kill query

Summary

MySQL: Buffer pool is critical (70-80% RAM), use InnoDB
PostgreSQL: shared_buffers (25% RAM), work_mem per query
Query Optimization: EXPLAIN is key, add appropriate indexes
Connection Pooling: Use PgBouncer or ProxySQL
Monitoring: Use built-in stats views and slow query logs

Next Chapter

Chapter 76: BIND DNS Server

Last Updated: February 2026