Postgresql
Chapter 72: PostgreSQL Administration
Section titled “Chapter 72: PostgreSQL Administration”Comprehensive Database Management
Section titled “Comprehensive Database Management”72.1 PostgreSQL Architecture
Section titled “72.1 PostgreSQL Architecture”How PostgreSQL Works
Section titled “How PostgreSQL Works”PostgreSQL is an advanced, open-source relational database system known for reliability, feature richness, and performance. It follows a client-server architecture with a multi-process model.
PostgreSQL Architecture+------------------------------------------------------------------+| || PostgreSQL Process Model || || +-------------------------------------------------------------+|| | Client Applications ||| | psql | pgAdmin | Django | Node.js | Java | etc. ||| +--------------------------+----------------------------------+|| | || v || +-------------------------------------------------------------+|| | PostgreSQL Server (Postmaster) ||| | +------------------------------------------------------+ ||| | | Shared Memory | || | | - Shared Buffer Cache | || | | - WAL Buffer | || | | - Lock Manager | || | +------------------------------------------------------+ || | || | +------------------------------------------------------+ ||| | | Background Workers | || | | - Writer Process (WAL writer) | || | | - Checkpointer Process | || | | - Autovacuum Launchers | || | | - Statistics Collector | || | | - Log Writer | || | | - Archiver | || | +------------------------------------------------------+ || | || +---------------------------+-----------------------------------+| | || +---------------------------v-----------------------------------+|| | Backend Processes (Per Connection) ||| | +----------+ +----------+ +----------+ +----------+ ||| | | Backend | | Backend | | Backend | | Backend | ||| | | 1 | | 2 | | 3 | | N | ||| | +----------+ +----------+ +----------+ +----------+ ||| | Each client connection gets its own backend process ||| +-------------------------------------------------------------+|| || Storage Layer || +-------------------------------------------------------------+|| | Data Files | WAL Files | Tablespaces | CLOG | Statistics ||| +-------------------------------------------------------------+|| |+------------------------------------------------------------------+Key Components
Section titled “Key Components” PostgreSQL Components+------------------------------------------------------------------+| || Component | Description || -------------------|--------------------------------------------|| Postmaster | Main daemon, accepts connections || Shared Memory | Caches, buffers shared across processes || WAL Writer | Writes Write-Ahead Logging || Checkpoint | Writes dirty pages to disk || Autovacuum | Automatic vacuum and analyze || Stats Collector | Collects usage statistics || Archiver | Archives WAL files for backup || Background Worker | Extension framework for background tasks || pg_stat_* views | System statistics views || || Storage: || +----------------------------------------------------------+ || | /var/lib/postgresql/<version>/main/ | || | base/ - Database files | || | pg_wal/ - Write-Ahead Log | || | pg_xact/ - Transaction status | || | global/ - Cluster-wide tables | || | pg_dynshmem/- Dynamic shared memory | || | pg_notify/ - LISTEN/NOTIFY | || | pg_serial/ - Serializable information | || | pg_snapshots/- Exported snapshots | || | pg_subtrans/- Subtransaction status | || | pg_tblspc/ - Tablespace symbolic links | || | pg_twophase/- Prepared transactions | || +----------------------------------------------------------+ || |+------------------------------------------------------------------+72.2 Installation and Setup
Section titled “72.2 Installation and Setup”Installing PostgreSQL
Section titled “Installing PostgreSQL”# Debian/Ubuntusudo apt updatesudo apt install postgresql postgresql-contrib
# RHEL/CentOS/Fedorasudo dnf install postgresql-server postgresql-contrib
# Initialize database (RHEL)sudo postgresql-setup --initdb# orsudo /usr/pgsql-14/bin/postgresql-setup initdb
# Arch Linuxsudo pacman -S postgresqlsudo su - postgres -c "initdb -D /var/lib/postgres/data"
# Start servicesudo systemctl start postgresqlsudo systemctl enable postgresql
# Check statussudo systemctl status postgresqlps aux | grep postgresInitial Configuration
Section titled “Initial Configuration”# Connect as postgres usersudo -u postgres psql# orsudo -u postgres psql -d postgres
# Or connect as current user (if peer auth configured)psql
# Show versionSELECT version();
# Show current databaseSELECT current_database();
# List databases\l\list
# List users\du\du+
# Exit\q72.3 User and Role Management
Section titled “72.3 User and Role Management”Creating and Managing Users
Section titled “Creating and Managing Users”-- Create user (role with login privilege)CREATE USER myuser WITH PASSWORD 'secure_password';
-- Create superuserCREATE USER myadmin WITH PASSWORD 'secure_password' SUPERUSER;
-- Create role with login and passwordCREATE ROLE readonly WITH LOGIN PASSWORD 'password' VALID UNTIL '2025-12-31';
-- Alter userALTER USER myuser WITH PASSWORD 'new_password';
-- Grant privilegesGRANT ALL PRIVILEGES ON DATABASE mydb TO myuser;GRANT ALL ON SCHEMA public TO myuser;
-- Grant role to userGRANT readonly TO myuser;
-- Revoke privilegesREVOKE ALL PRIVILEGES ON DATABASE mydb FROM myuser;
-- Drop userDROP USER myuser;
-- Lock user (prevent login)ALTER USER myuser WITH NOLOGIN;
-- Unlock userALTER USER myuser WITH LOGIN;Authentication Configuration
Section titled “Authentication Configuration”# /etc/postgresql/<version>/main/pg_hba.conf
# Local connectionslocal all all peer# For psql -U username (without sudo)
# For sudo -u postgres psql (peer auth - default)local all postgres peerlocal all all peer
# IPv4 local connectionshost all all 127.0.0.1/32 scram-sha-256
# IPv6 local connectionshost all all ::1/128 scram-sha-256
# Remote connections (MD5 - password auth)host all all 0.0.0.0/0 md5host all all ::/0 md5
# Specific networkhost myapp myuser 192.168.1.0/24 scram-sha-256
# Replicationhost replication replica 192.168.1.0/24 scram-sha-25672.4 Database Management
Section titled “72.4 Database Management”Creating and Managing Databases
Section titled “Creating and Managing Databases”-- Create databaseCREATE DATABASE myapp;
-- Create database with ownerCREATE DATABASE myapp OWNER myuser;
-- Create database with encodingCREATE DATABASE myapp OWNER myuser ENCODING 'UTF8';
-- Create database with templateCREATE DATABASE myapp TEMPLATE template0;
-- Drop databaseDROP DATABASE myapp;
-- Drop database (with IF EXISTS)DROP DATABASE IF EXISTS myapp;
-- Rename databaseALTER DATABASE myapp RENAME TO newdb;
-- Change ownerALTER DATABASE myapp OWNER TO newowner;
-- Disconnect all users before dropping-- In pg_stat_activity, see connected sessions-- Then: SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'mydb';Schema Management
Section titled “Schema Management”-- Default schema is public-- Create schemaCREATE SCHEMA myapp;
-- Create schema with ownerCREATE SCHEMA myapp AUTHORIZATION myuser;
-- Grant usage on schemaGRANT USAGE ON SCHEMA myapp TO myuser;
-- Grant all on schemaGRANT ALL ON SCHEMA myapp TO myuser;
-- Create objects in schemaCREATE TABLE myapp.users (...);
-- Search pathSHOW search_path;SET search_path TO myapp, public;
-- Drop schema (cascade drops all objects)DROP SCHEMA myapp CASCADE;72.5 Table Management
Section titled “72.5 Table Management”Creating Tables
Section titled “Creating Tables”-- Basic tableCREATE TABLE users ( id SERIAL PRIMARY KEY, username VARCHAR(50) UNIQUE NOT NULL, email VARCHAR(100) UNIQUE NOT NULL, password_hash VARCHAR(255) NOT NULL, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);
-- Table with constraintsCREATE TABLE orders ( id SERIAL PRIMARY KEY, user_id INTEGER NOT NULL REFERENCES users(id), total DECIMAL(10,2) NOT NULL, status VARCHAR(20) DEFAULT 'pending', created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, CONSTRAINT positive_total CHECK (total >= 0));
-- Partitioned table (PostgreSQL 10+)CREATE TABLE orders ( id SERIAL, created_at DATE, amount DECIMAL(10,2)) PARTITION BY RANGE (created_at);
CREATE TABLE orders_2024_01 PARTITION OF orders FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
-- Table with indexesCREATE TABLE products ( id SERIAL PRIMARY KEY, name VARCHAR(100), category_id INTEGER, price DECIMAL(10,2));
CREATE INDEX idx_products_category ON products(category_id);CREATE INDEX idx_products_name ON products(name);CREATE INDEX idx_products_price ON products(price) WHERE price > 0;Managing Tables
Section titled “Managing Tables”-- Add columnALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Drop columnALTER TABLE users DROP COLUMN phone;
-- Rename columnALTER TABLE users RENAME COLUMN phone TO mobile;
-- Change column typeALTER TABLE users ALTER COLUMN mobile TYPE VARCHAR(30);
-- Add constraintALTER TABLE orders ADD CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES users(id);
-- Rename tableALTER TABLE users RENAME TO app_users;
-- Truncate tableTRUNCATE TABLE app_users CASCADE;
-- Drop tableDROP TABLE orders;72.6 Queries and Data Manipulation
Section titled “72.6 Queries and Data Manipulation”Basic Queries
Section titled “Basic Queries”-- SELECTSELECT * FROM users;SELECT id, username, email FROM users;SELECT * FROM users WHERE id = 1;
-- DISTINCTSELECT DISTINCT category FROM products;
-- ORDER BYSELECT * FROM users ORDER BY created_at DESC;
-- LIMIT/OFFSETSELECT * FROM users LIMIT 10 OFFSET 20;
-- LIKESELECT * FROM users WHERE email LIKE '%@example.com';
-- INSELECT * FROM users WHERE id IN (1, 2, 3);
-- BETWEENSELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';
-- AggregatesSELECT COUNT(*) FROM users;SELECT SUM(amount) FROM orders;SELECT AVG(price) FROM products;SELECT MIN(price), MAX(price) FROM products;
-- GROUP BYSELECT user_id, COUNT(*) FROM orders GROUP BY user_id;SELECT category, COUNT(*) FROM products GROUP BY HAVING COUNT(*) > 5;
-- JOINsSELECT o.id, u.username, o.totalFROM orders oJOIN users u ON o.user_id = u.id;
SELECT * FROM users LEFT JOIN orders ON users.id = orders.user_id;Advanced Queries
Section titled “Advanced Queries”-- SubqueriesSELECT * FROM users WHERE id IN ( SELECT user_id FROM orders WHERE total > 100);
-- Common Table Expression (CTE)WITH recent_orders AS ( SELECT * FROM orders WHERE created_at > '2024-01-01')SELECT u.username, COUNT(ro.id) as order_countFROM users uLEFT JOIN recent_orders ro ON u.id = ro.user_idGROUP BY u.username;
-- Window FunctionsSELECT username, created_at, ROW_NUMBER() OVER (ORDER BY created_at) as row_num, RANK() OVER (ORDER BY created_at) as rank, LAG(username) OVER (ORDER BY created_at) as prev_userFROM users;
-- CASESELECT name, price, CASE WHEN price > 100 THEN 'expensive' WHEN price > 50 THEN 'moderate' ELSE 'cheap' END as price_categoryFROM products;
-- Upsert (PostgreSQL 9.5+)INSERT INTO users (username, email)VALUES ('john', 'john@example.com')ON CONFLICT (username)DO UPDATE SET email = EXCLUDED.email;72.7 Indexes
Section titled “72.7 Indexes”Index Types and Usage
Section titled “Index Types and Usage”-- B-tree index (default)CREATE INDEX idx_users_email ON users(email);
-- Unique indexCREATE UNIQUE INDEX idx_users_username ON users(username);
-- Composite indexCREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Partial indexCREATE INDEX idx_active_users ON users(email) WHERE status = 'active';
-- Index on expressionCREATE INDEX idx_users_lower_email ON users(LOWER(email));
-- GIN index (for JSON, arrays, full-text)CREATE INDEX idx_products_data ON products USING GIN(data);
-- GiST index (for spatial data)CREATE INDEX idx_locations ON locations USING GIST(geom);
-- BRIN index (for time-series)CREATE INDEX idx_orders_date ON orders USING BRIN(created_at);
-- Drop indexDROP INDEX idx_users_email;
-- ReindexREINDEX TABLE users;REINDEX DATABASE mydb;72.8 Backup and Restore
Section titled “72.8 Backup and Restore”pg_dump
Section titled “pg_dump”# Backup single databasepg_dump -U postgres -Fc mydb > mydb.dump# -Fc = custom format (compressed)# -Fp = plain SQL# -Fd = directory format# -Ft = tar format
# Backup with compressionpg_dump -U postgres -Fc mydb | gzip > mydb.sql.gz
# Backup plain SQLpg_dump -U postgres mydb > mydb.sql
# Backup schema onlypg_dump -U postgres -s mydb > mydb_schema.sql
# Backup data onlypg_dump -U postgres -a mydb > mydb_data.sql
# Backup specific tablespg_dump -U postgres -t users -t orders mydb > tables.sql
# Backup with DROP statementspg_dump -U postgres --clean mydb > mydb.sqlpg_restore
Section titled “pg_restore”# Restore custom formatpg_restore -U postgres -d mydb mydb.dump
# Restore to different databasepg_restore -U postgres -d newdb mydb.dump
# Restore with CREATE DATABASEpg_restore -U postgres --create --dbname=mydb mydb.dump
# Restore schema onlypg_restore -U postgres -d mydb --schema-only mydb.dump
# Restore data onlypg_restore -U postgres -d mydb --data-only mydb.dump
# Drop objects before restorepg_restore -U postgres -d mydb --clean mydb.dumppg_dumpall
Section titled “pg_dumpall”# Backup all databases and rolespg_dumpall -U postgres > all_dbs.sql
# Backup only rolespg_dumpall -U postgres --roles-only > roles.sql
# Backup only tablespacespg_dumpall -U postgres --tablespaces-only > tablespaces.sql
# Restore allpsql -U postgres -f all_dbs.sqlPoint-in-Time Recovery
Section titled “Point-in-Time Recovery”# Base backuppg_basebackup -U postgres -D /var/lib/postgresql/14/main -Xf -P
# Configure recovery# /etc/postgresql/14/main/postgresql.confrestore_command = 'cp /archive/%f %p'recovery_target_time = '2024-01-15 10:00:00 UTC'
# Trigger recoverytouch /var/lib/postgresql/14/main/recovery.signal
# Or for replicatouch /var/lib/postgresql/14/main/standby.signal72.9 Performance Optimization
Section titled “72.9 Performance Optimization”Configuration Tuning
Section titled “Configuration Tuning”# Memoryshared_buffers = 256MB # 25% of RAMeffective_cache_size = 768MB # 75% of RAMwork_mem = 64MB # Per sort operationmaintenance_work_mem = 128MB # For VACUUM, CREATE INDEX
# Query plannerrandom_page_cost = 1.1 # SSD: 1.1, HDD: 4.0effective_io_concurrency = 200 # Parallel I/Odefault_statistics_target = 100 # Planner statistics
# Concurrencymax_connections = 100max_worker_processes = 8max_parallel_workers_per_gather = 4
# WALwal_buffers = 16MBmin_wal_size = 1GBmax_wal_size = 4GB
# Logginglog_min_duration_statement = 1000 # Log queries > 1slog_connections = onlog_disconnections = onlog_lock_waits = on
# Autovacuumautovacuum = onautovacuum_max_workers = 3autovacuum_naptime = 1minQuery Analysis
Section titled “Query Analysis”-- EXPLAIN ANALYZEEXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 1;
-- EXPLAIN (BUFFERS, FORMAT JSON)EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)SELECT * FROM orders WHERE status = 'pending';
-- Query statisticsSELECT query, calls, total_time, mean_time, rowsFROM pg_stat_statementsORDER BY total_time DESCLIMIT 10;
-- Table statisticsSELECT * FROM pg_stat_user_tablesORDER BY seq_scan DESC;
-- Index usageSELECT * FROM pg_stat_user_indexesORDER BY idx_scan DESC;
-- Slow queries (requires pg_stat_statements)-- Enable in postgresql.conf:-- shared_preload_libraries = 'pg_stat_statements'-- CREATE EXTENSION pg_stat_statements;72.10 Replication
Section titled “72.10 Replication”Streaming Replication
Section titled “Streaming Replication”# On primary (postgresql.conf)wal_level = replicamax_wal_senders = 3max_replication_slots = 3wal_keep_size = 1GB
# On primary (pg_hba.conf)host replication replica 192.168.1.0/24 scram-sha-256
# On replica# Clone from primarypg_basebackup -U replication -h primary-host -D /var/lib/postgresql/14/main -Xs -P
# Create replication slotSELECT * FROM pg_create_physical_replication_slot('replica_slot');
# Configure replica (postgresql.conf)primary_conninfo = 'host=primary-host port=5432 user=replication'
# Start replicasudo systemctl start postgresqlReplication Monitoring
Section titled “Replication Monitoring”-- On primarySELECT * FROM pg_stat_replication;
-- On replicaSELECT * FROM pg_stat_wal_receiver;
-- Check replication lagSELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
-- Replication slotsSELECT * FROM pg_replication_slots;
-- Show slots on primarySELECT slot_name, plugin, slot_type, active FROM pg_replication_slots;72.11 High Availability
Section titled “72.11 High Availability”pgpool-II
Section titled “pgpool-II”# Installsudo apt install pgpool2
# Configure pgpool.confbackend_hostname0 = 'primary'backend_port0 = 5432backend_weight0 = 1backend_hostname1 = 'replica1'backend_port1 = 5432backend_weight1 = 1
# Start pgpoolsudo systemctl start pgpool2Patroni
Section titled “Patroni”scope: postgresnamespace: /servicename: postgresql0
restapi: listen: 0.0.0.0:8008 connect_address: postgresql0.example.com:8008
etcd: hosts: etcd0.example.com:2379
bootstrap: dcs: postgresql: parameters: wal_level: replica max_wal_senders: 10 max_replication_slots: 10 slots: patroni: type: physical
postgresql: data_dir: /data/postgresql pgpass: /tmp/pgpass parameters: hot_standby: "on" wal_keep_size: 1GB72.12 Troubleshooting
Section titled “72.12 Troubleshooting”Common Issues
Section titled “Common Issues”-- Check connectionsSELECT count(*) FROM pg_stat_activity;
-- Kill idle connectionsSELECT pg_terminate_backend(pid)FROM pg_stat_activityWHERE state = 'idle'AND query_start < now() - interval '10 minutes';
-- Check locksSELECT pg_blocking_pids(pid) as blocked_by, pid, usename, query, state, wait_event_typeFROM pg_stat_activityWHERE state != 'idle';
-- Find long queriesSELECT pid, now() - pg_stat_activity.query_start as duration, queryFROM pg_stat_activityWHERE state = 'active'AND now() - pg_stat_activity.query_start > interval '5 minutes';
-- Check database sizeSELECT pg_database.datname, pg_size_pretty(pg_database_size(pg_database.datname))FROM pg_database;
-- Check table sizeSELECT relname, pg_size_pretty(pg_total_relation_size(relid))FROM pg_catalog.pg_statio_user_tablesORDER BY pg_total_relation_size(relid) DESCLIMIT 10;
-- Check index sizeSELECT indexrelname, pg_size_pretty(pg_relation_size(indexrelid))FROM pg_stat_user_indexesORDER BY pg_relation_size(indexrelid) DESCLIMIT 10;Log Analysis
Section titled “Log Analysis”# View logstail -f /var/log/postgresql/postgresql-14-main.log
# Enable query logging# postgresql.conflog_statement = 'all'log_duration = onlog_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
# Find slow queriesgrep "duration:" /var/log/postgresql/postgresql-14-main.log | sort -k4 -n | tail -2072.13 Interview Questions
Section titled “72.13 Interview Questions”Basic Questions
Section titled “Basic Questions”-
What is PostgreSQL?
- Advanced open-source relational database system
-
How do you connect to PostgreSQL?
psql -U username -d database
-
What is the difference between a user and a role?
- Role with LOGIN attribute is a user
-
How do you create a database?
- CREATE DATABASE name;
-
What is pg_dump used for?
- Backing up PostgreSQL databases
Intermediate Questions
Section titled “Intermediate Questions”-
Explain the different join types in PostgreSQL
- INNER, LEFT, RIGHT, FULL OUTER, CROSS
-
What is a primary key vs unique constraint?
- Primary: one per table, not null; unique: can have multiple, can be null
-
How does indexing improve performance?
- Creates data structures for faster lookups
-
What is VACUUM in PostgreSQL?
- Reclaims storage and updates statistics
-
Explain ACID properties
- Atomicity, Consistency, Isolation, Durability
Advanced Questions
Section titled “Advanced Questions”-
What is Write-Ahead Logging (WAL)?
- Transactions logged before being applied
-
How does streaming replication work?
- Replica connects to primary and receives WAL in real-time
-
What is a partitioned table?
- Table divided into smaller pieces based on range/list/hash
-
How do you optimize a slow query?
- EXPLAIN ANALYZE, indexes, configuration tuning
-
What is pg_stat_statements?
- Extension for tracking query execution statistics
Summary
Section titled “Summary”PostgreSQL is a powerful, enterprise-grade database:
Quick Reference+------------------------------------------------------------------+| || Connection: || +----------------------------------------------------------+ || | psql -U user -d dbname | || | sudo -u postgres psql | || +----------------------------------------------------------+ || || Key Commands: || +----------------------------------------------------------+ || | \l | List databases | || | \du | List users | || | \dt | List tables | || | \d table | Describe table | || | \q | Quit | || +----------------------------------------------------------+ || || Backup/Restore: || +----------------------------------------------------------+ || | pg_dump -U user dbname > backup.sql | || | psql -U user dbname < backup.sql | || | pg_dumpall -U postgres > all.sql | || +----------------------------------------------------------+ || || Performance: || +----------------------------------------------------------+ || | EXPLAIN ANALYZE query; | || | CREATE INDEX idx ON table(col); | || | VACUUM ANALYZE table; | || +----------------------------------------------------------+ || |+------------------------------------------------------------------+