Skip to content

Delete_data

Safely Removing Data from PostgreSQL Tables

Section titled “Safely Removing Data from PostgreSQL Tables”

The DELETE statement removes rows from a table. Understanding MVCC (Multi-Version Concurrency Control) is important - DELETE marks rows as deleted but doesn’t immediately reclaim space.

DELETE vs TRUNCATE vs DROP
========================================================================
┌─────────────────────────────────────────────────────────────────────┐
│ Operation │ Effect │ Use Case │
├───────────────┼────────────────────────────┼──────────────────────┤
│ DELETE │ Remove rows (MVCC) │ Conditional removal│
├───────────────┼────────────────────────────┼──────────────────────┤
│ TRUNCATE │ Fast removal, resets seq │ Empty entire table │
├───────────────┼────────────────────────────┼──────────────────────┤
│ DROP TABLE │ Remove table completely │ Remove table + data│
└───────────────┴────────────────────────────┴──────────────────────┘
MVCC Impact:
=============
DELETE marks rows as deleted but keeps them in the table until VACUUM
runs to reclaim the space.
✓ DELETE is MVCC-safe (concurrent readers see original data)
✓ Can be rolled back in transactions
✓ Supports WHERE conditions
✗ Space not immediately reclaimed
✗ Slower than TRUNCATE for large tables
========================================================================

-- Sample table
CREATE TABLE logs (
id SERIAL PRIMARY KEY,
message TEXT,
severity VARCHAR(20),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO logs (message, severity) VALUES
('User logged in', 'info'),
('Error occurred', 'error'),
('Warning: low memory', 'warning'),
('System started', 'info');
-- Delete with WHERE clause (always use WHERE!)
DELETE FROM logs WHERE severity = 'warning';
-- Delete specific row
DELETE FROM logs WHERE id = 1;
-- Delete multiple rows
DELETE FROM logs WHERE severity IN ('info', 'warning');
-- Delete with pattern matching
DELETE FROM logs WHERE message LIKE '%password%';
-- Delete with subquery
DELETE FROM logs WHERE severity = 'error'
AND created_at < (SELECT created_at FROM logs ORDER BY created_at DESC LIMIT 1);

-- Return deleted rows
DELETE FROM logs WHERE severity = 'error'
RETURNING id, message, created_at;
-- Useful for audit trails
DELETE FROM orders WHERE status = 'cancelled'
RETURNING id, user_id, total_amount, deleted_at;
-- After deletion, you know what was removed
-- Example application pattern:
-- 1. DELETE ... RETURNING *
-- 2. Use returned data to notify user, update caches, etc.
-- Delete and count
DELETE FROM users WHERE is_active = FALSE
RETURNING id, username;
-- In applications, returns affected row count
-- Example output: DELETE 10

-- DELETE with FROM (PostgreSQL syntax)
-- Equivalent to DELETE with subquery
-- Delete using another table
DELETE FROM logs l
USING users u
WHERE l.user_id = u.id
AND u.is_active = FALSE;
-- More efficient than:
-- DELETE FROM logs WHERE user_id IN (SELECT id FROM users WHERE is_active = FALSE)
-- Delete using complex join
DELETE FROM order_items oi
USING orders o
WHERE oi.order_id = o.id
AND o.status = 'cancelled';
-- Delete with multiple tables
DELETE FROM logs l
USING events e, users u
WHERE l.event_id = e.id
AND l.user_id = u.id
AND e.type = 'deleted'
AND u.is_active = FALSE;

-- Using CTE to identify rows
WITH old_logs AS (
SELECT id FROM logs WHERE created_at < NOW() - INTERVAL '90 days'
)
DELETE FROM logs WHERE id IN (SELECT id FROM old_logs)
RETURNING id;
-- CTE with aggregation
WITH stats AS (
SELECT severity, COUNT(*) as cnt
FROM logs
GROUP BY severity
HAVING COUNT(*) > 100
)
DELETE FROM logs WHERE severity IN (SELECT severity FROM stats)
RETURNING severity, id;
-- Complex delete with multiple CTEs
WITH
to_keep AS (
SELECT id FROM logs WHERE severity = 'critical'
),
to_archive AS (
SELECT * FROM logs WHERE id NOT IN (SELECT id FROM to_keep)
)
DELETE FROM logs WHERE id IN (SELECT id FROM to_archive)
RETURNING id, message;
-- Using recursive CTE
WITH RECURSIVE user_org AS (
-- Start with specific user
SELECT id, name, manager_id, 1 as level
FROM employees WHERE name = 'John Doe'
UNION ALL
-- Find all subordinates
SELECT e.id, e.name, e.manager_id, uo.level + 1
FROM employees e
JOIN user_org uo ON e.manager_id = uo.id
)
DELETE FROM employees
WHERE id IN (SELECT id FROM user_org WHERE level > 1);

-- TRUNCATE is faster than DELETE
-- Removes all rows, resets sequences
-- Basic TRUNCATE
TRUNCATE TABLE logs;
-- TRUNCATE multiple tables
TRUNCATE TABLE orders, order_items, products;
-- RESTART IDENTITY - resets auto-increment sequences
TRUNCATE TABLE users RESTART IDENTITY;
-- CONTINUE IDENTITY (default) - keeps current sequence value
TRUNCATE TABLE logs CONTINUE IDENTITY;
-- TRUNCATE with CASCADE (truncates dependent tables)
TRUNCATE TABLE users CASCADE;
-- TRUNCATE with RESTRICT (default - fails if foreign keys)
TRUNCATE TABLE users RESTRICT;
-- PostgreSQL 13+: TRUNCATE with identity
-- Automatically handles foreign key relationships
TRUNCATE TABLE orders CASCADE;

-- DELETE: Remove specific rows
DELETE FROM table WHERE condition;
-- ✓ Row-level deletion
-- ✓ Can use WHERE
-- ✓ Can RETURNING
-- ✓ Can be rolled back
-- ✓ Fires triggers
-- ✗ Slower for large datasets
-- TRUNCATE: Remove all rows
TRUNCATE TABLE table_name;
-- ✗ Cannot use WHERE
-- ✓ Very fast (O(1) vs O(n))
-- ✓ Resets sequences (optional)
-- ✓ Can RETURNING (PostgreSQL 14+)
-- ✗ Can be rolled back (within transaction)
-- ✗ Fires triggers (only for 14+)
-- DROP: Remove entire table
DROP TABLE table_name;
-- ✗ Removes table completely
-- ✗ Cannot recover (unless from backup)
-- ✓ Removes all data and structure
-- When to use each:
-- DELETE: Remove specific rows based on conditions
-- TRUNCATE: Empty entire table quickly
-- DROP: Remove table entirely (development only!)

-- DELETE within transaction
BEGIN;
-- Save point before risky delete
SAVEPOINT before_cleanup;
DELETE FROM logs WHERE created_at < '2023-01-01';
-- Check affected rows
-- If satisfied:
COMMIT;
-- If not satisfied:
ROLLBACK TO SAVEPOINT before_cleanup;
-- Safe delete with verification
BEGIN;
-- First, see what will be deleted
SELECT COUNT(*) as rows_to_delete
FROM logs WHERE severity = 'debug';
-- If acceptable, proceed
DELETE FROM logs WHERE severity = 'debug';
COMMIT;
-- Using advisory locks for safe concurrent deletes
SELECT pg_advisory_lock(12345); -- Acquire lock
DELETE FROM inventory WHERE product_id = 1 AND quantity > 0;
SELECT pg_advisory_unlock(12345); -- Release lock

-- Check existing foreign keys
SELECT
tc.table_name,
kcu.column_name,
ccu.table_name AS foreign_table_name,
ccu.column_name AS foreign_column_name,
rc.delete_rule
FROM information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS kcu
ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage AS ccu
ON ccu.constraint_name = tc.constraint_name
JOIN information_schema.referential_constraints AS rc
ON rc.constraint_name = tc.constraint_name
WHERE tc.constraint_type = 'FOREIGN KEY';
-- Delete order matters with foreign keys
-- First delete child records
DELETE FROM order_items WHERE order_id IN (SELECT id FROM orders WHERE user_id = 1);
-- Then delete parent records
DELETE FROM orders WHERE user_id = 1;
-- Finally delete users
DELETE FROM users WHERE id = 1;
-- Or use CASCADE
-- (Add ON DELETE CASCADE to foreign key constraint)

-- Instead of hard delete, use soft delete
-- Add is_deleted column
ALTER TABLE users ADD COLUMN is_deleted BOOLEAN DEFAULT FALSE;
-- Use soft delete
UPDATE users SET is_deleted = TRUE WHERE id = 1;
-- Query only active records
SELECT * FROM users WHERE is_deleted = FALSE;
-- Create view for convenience
CREATE VIEW active_users AS
SELECT * FROM users WHERE is_deleted = FALSE;
-- Add index for performance
CREATE INDEX idx_users_not_deleted ON users(id) WHERE is_deleted = FALSE;
-- When you really need to delete
-- Create archive table
CREATE TABLE users_archive (LIKE users INCLUDING ALL);
-- Move to archive
INSERT INTO users_archive
SELECT * FROM users WHERE is_deleted = TRUE;
-- Remove from main table
DELETE FROM users WHERE is_deleted = TRUE;

DELETE Best Practices
========================================================================
1. Always Use WHERE Clause
┌──────────────────────────────────────────────────────────────┐
│ ❌ DELETE FROM users; │
│ (Deletes ALL rows!) │
│ │
│ ✓ DELETE FROM users WHERE id = 1; │
└──────────────────────────────────────────────────────────────┘
2. Test with SELECT First
┌──────────────────────────────────────────────────────────────┐
│ -- Always test: │
│ SELECT * FROM table WHERE condition; │
│ -- Then: │
│ DELETE FROM table WHERE condition; │
└──────────────────────────────────────────────────────────────┘
3. Use Soft Deletes for Critical Data
┌──────────────────────────────────────────────────────────────┐
│ • Use is_deleted flag for audit-critical tables │
│ • Keep historical data accessible │
│ • Add index for active records only │
└──────────────────────────────────────────────────────────────┘
4. Use TRUNCATE for Bulk Removal
┌──────────────────────────────────────────────────────────────┐
│ • Faster than DELETE for removing all rows │
│ • Use RESTART IDENTITY to reset auto-increment │
│ • Consider CASCADE for dependent tables │
└──────────────────────────────────────────────────────────────┘
5. Handle Foreign Keys
┌──────────────────────────────────────────────────────────────┐
│ • Delete child records before parent records │
│ • Use ON DELETE CASCADE for automated deletion │
│ • Or manually enforce referential integrity │
└──────────────────────────────────────────────────────────────┘
6. Use RETURNING
┌──────────────────────────────────────────────────────────────┐
│ • Track what was deleted for audit/complete operations │
│ • Useful for cascade operations │
└──────────────────────────────────────────────────────────────┘
========================================================================

Chapter 12: Understanding JOINs


Last Updated: February 2026