Pruning
Chapter 30: Pruning & Data Management
Overview
Pruning is an essential technique for managing blockchain node storage requirements. As a blockchain grows, its historical data accumulates continuously, so a node that keeps everything needs ever more disk space. Pruning lets a node discard data it no longer needs while remaining fully functional for most use cases.
30.1 Types of Blockchain Nodes
    ┌────────────────────────────────────────────────────────┐
    │                 BLOCKCHAIN NODE TYPES                  │
    ├────────────────────────────────────────────────────────┤
    │                                                        │
    │    ┌──────────────────────────────────────────────┐    │
    │    │                  FULL NODE                   │    │
    │    │  - Stores all blocks and state               │    │
    │    │  - Can verify any transaction                │    │
    │    │  - Storage: ~1-1.2 TB (Ethereum)             │    │
    │    │  - Can prune old state data                  │    │
    │    └──────────────────────────────────────────────┘    │
    │                           │                            │
    │                           ▼                            │
    │    ┌──────────────────────────────────────────────┐    │
    │    │                  LIGHT NODE                  │    │
    │    │  - Stores block headers only                 │    │
    │    │  - Validates consensus                       │    │
    │    │  - Storage: ~50 MB                           │    │
    │    │  - Requests state data on demand             │    │
    │    └──────────────────────────────────────────────┘    │
    │                           │                            │
    │                           ▼                            │
    │    ┌──────────────────────────────────────────────┐    │
    │    │                 ARCHIVE NODE                 │    │
    │    │  - Stores complete blockchain history        │    │
    │    │  - Enables historical state queries          │    │
    │    │  - Storage: ~12+ TB (Ethereum)               │    │
    │    │  - Cannot prune (needs all data)             │    │
    │    └──────────────────────────────────────────────┘    │
    │                                                        │
    └────────────────────────────────────────────────────────┘
Storage Requirements Comparison
| Node Type | Storage (Ethereum Mainnet) | Use Case |
|---|---|---|
| Pruned Full Node | ~500-800 GB | Staking, DApps, development |
| Full Node | ~1-1.2 TB | Maximum compatibility |
| Archive Node | ~12+ TB | Historical analysis, indexers |
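The figures in the table can be turned into a quick capacity check. The sketch below is illustrative only: the function name `node_types_for_disk` is hypothetical, the thresholds are the upper bounds from the table (with 1 TB taken as 1000 GB), and it leaves no headroom for future chain growth, which a real deployment would need.

```shell
#!/bin/sh
# Sketch: list which node types from the table fit on a disk of a given size.
# Thresholds (GB) are the upper bounds quoted in the table above.

node_types_for_disk() (
  disk_gb="$1"
  fits=""
  [ "$disk_gb" -ge 800 ]   && fits="${fits} pruned-full"
  [ "$disk_gb" -ge 1200 ]  && fits="${fits} full"
  [ "$disk_gb" -ge 12000 ] && fits="${fits} archive"
  if [ -z "$fits" ]; then
    echo "none"
  else
    echo "${fits# }"   # drop the leading space
  fi
)

node_types_for_disk 2000
```

With a 2000 GB disk the function reports that a pruned full node and a full node fit, while an archive node does not.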
30.2 Pruning Types Explained
Geth Pruning Modes
    ┌────────────────────────────────────────────────────────────────┐
    │                       GETH PRUNING TYPES                       │
    ├────────────────────────────────────────────────────────────────┤
    │                                                                │
    │  DEFAULT (Automatic):                                          │
    │  ━━━━━━━━━━━━━━━━━━━━                                          │
    │                                                                │
    │  - Prunes state trie automatically during normal operation     │
    │  - Keeps recent state accessible                               │
    │  - Runs in background                                          │
    │  - Minimal performance impact                                  │
    │                                                                │
    │  PRUNE ANCIENT STORE:                                          │
    │  ━━━━━━━━━━━━━━━━━━━━                                          │
    │                                                                │
    │  - Removes ancient block data                                  │
    │  - Keeps recent blocks for reorg handling                      │
    │  - geth --pruneancientstore                                    │
    │    (client-specific flag; confirm with geth --help)            │
    │                                                                │
    │  MANUAL PRUNING:                                               │
    │  ━━━━━━━━━━━━━━━                                               │
    │                                                                │
    │  - Stop the node                                               │
    │  - geth snapshot prune-state (offline state pruning)           │
    │  - geth removedb deletes the whole database (full resync)      │
    │                                                                │
    └────────────────────────────────────────────────────────────────┘
Erigon Pruning Options
| Flag | Description | Example |
|---|---|---|
| --prune h | Prune state history (change sets, history indices) | --prune h |
| --prune r | Prune receipts and logs | --prune r |
| --prune t | Prune the transaction-hash lookup index | --prune t |
| --prune c | Prune call traces | --prune c |
| --prune.h.older | Prune history older than N blocks (keep the most recent N) | --prune h --prune.h.older 90000 |
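To see how the category letters and the retention window combine, the following sketch assembles the flag string and converts a block count into days. Both function names (`erigon_prune_args`, `blocks_to_days`) are hypothetical helpers, not part of Erigon, and the conversion assumes a fixed 12-second block time.

```shell
#!/bin/sh
# Sketch: build Erigon prune flags and estimate the retention window.
# Only the --prune and --prune.<letter>.older flags come from the table above.

# Print "--prune <letters>" followed by one --prune.<letter>.older flag per letter.
erigon_prune_args() (
  categories="$1"
  keep_blocks="$2"
  args="--prune ${categories}"
  rest="$categories"
  while [ -n "$rest" ]; do
    letter="${rest%"${rest#?}"}"   # first character of $rest
    rest="${rest#?}"               # remaining characters
    args="${args} --prune.${letter}.older ${keep_blocks}"
  done
  printf '%s\n' "$args"
)

# Convert a block count to days (block time in seconds, default 12: an assumption).
blocks_to_days() (
  blocks="$1"
  block_time="${2:-12}"
  awk -v b="$blocks" -v s="$block_time" 'BEGIN { printf "%.1f\n", b * s / 86400 }'
)

erigon_prune_args htc 90000
blocks_to_days 90000
```

For 90000 blocks at 12 seconds each, the window works out to 90000 x 12 / 86400 = 12.5 days.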
30.3 Implementing Pruning
Geth Configuration
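    # Automatic pruning during operation
    geth \
      --mainnet \
      --syncmode "snap" \
      --cache 8192 \
      --datadir /data/ethereum

    # Manual (offline) pruning
    # 1. Stop the node
    systemctl stop geth

    # 2. Choose ONE of the following.
    # 2a. Prune stale state in place (hash-based state scheme only)
    geth snapshot prune-state --datadir /data/ethereum

    # 2b. Delete the database and resync from scratch.
    #     removedb removes the whole chain database, not selected data.
    geth removedb --datadir /data/ethereum

    # removedb asks for confirmation before deleting, for example:
    # Remove database? [y/n]
    # y
    # Remove ancient database? [y/n]
    # y

    # 3. Restart node
    systemctl start geth
Erigon Configuration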
    # Erigon with pruning
    erigon \
      --chain mainnet \
      --datadir /data/erigon \
      --prune htc \
      --prune.h.older 90000 \
      --prune.t.older 90000 \
      --prune.c.older 90000

    # Explanation (Erigon 2.x flags):
    # h = state history (change sets, history indices)
    # t = transaction-hash lookup index
    # c = call traces
    # r = receipts and logs (not enabled in this example)
    # older 90000 = keep roughly the last 12.5 days of data (90000 blocks x 12 s)
Pruning Configuration File
    # Illustrative layout only: key names differ between clients and
    # versions, so check your client's documentation before using it.
    [Eth]
    # Enable automatic pruning
    Pruning = true
    # Pruning threshold (blocks)
    PruningThreshold = 8192

    [Database]
    # Keep ancient data (blocks)
    KeepBlocks = 90000

    [RPC]
    # Limit storage for RPC
    # ...
30.4 Storage Optimization Strategies
Database Engine Selection
    ┌────────────────────────────────────────────────────────────────┐
    │                    DATABASE ENGINE OPTIONS                     │
    ├────────────────────────────────────────────────────────────────┤
    │                                                                │
    │  LEVELDB (legacy Geth default):                                │
    │  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━                                │
    │  - Mature and stable                                           │
    │  - Good for most use cases                                     │
    │  - Single-threaded write                                       │
    │                                                                │
    │  PEBBLE (default for new Geth databases):                      │
    │  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━                      │
    │  - Written in Go                                               │
    │  - Better concurrent write performance                         │
    │  - Recommended for high-throughput nodes                       │
    │  - Erigon uses MDBX, not Pebble                                │
    │                                                                │
    │  ROCKSDB (Nethermind):                                         │
    │  ━━━━━━━━━━━━━━━━━━━━━                                         │
    │  - C++ based                                                   │
    │  - Excellent performance                                       │
    │  - Used by Nethermind and Polygon                              │
    │                                                                │
    └────────────────────────────────────────────────────────────────┘
Disk Space Management
    # Monitor disk usage
    df -h

    # Check specific directory
    du -sh /data/ethereum/

    # View detailed breakdown
    du -h --max-depth=2 /data/ethereum/
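The usage column printed by df carries a trailing percent sign, so a threshold test has to compare it as a number rather than as text (as text, "9%" sorts after "80"). The sketch below shows one way to do that; the function names `usage_pct_from_df` and `disk_above_threshold` are hypothetical, and the sample line is made-up data standing in for real `df -P /data` output. On a live system the first function would be fed with `df -P /data | usage_pct_from_df`.

```shell
#!/bin/sh
# Sketch: numeric disk-usage threshold check built on `df -P` output.

# Read `df -P` output on stdin and print the use% of the first filesystem
# as a plain integer (the "%" suffix is stripped).
usage_pct_from_df() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 + 0 }'
}

# Succeed (exit status 0) when usage is strictly above the threshold (default 80).
disk_above_threshold() (
  pct="$1"
  threshold="${2:-80}"
  [ "$pct" -gt "$threshold" ]
)

# Made-up sample in `df -P` layout, used instead of a real filesystem.
sample='Filesystem     1024-blocks       Used Available Capacity Mounted on
/dev/sda1       1952559608 1659675667 292883941      85% /data'

pct=$(printf '%s\n' "$sample" | usage_pct_from_df)
if disk_above_threshold "$pct" 80; then
  echo "Warning: disk usage ${pct}%"
fi
```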
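    # Set up alerts
    # Add to /etc/crontab (system crontab entries need a user field, here root).
    # $5+0 turns "85%" into the number 85 so the comparison is numeric.
    # Note: some mail implementations still send an empty message when nothing is printed.
    0 * * * * root df -P /data | awk 'NR==2 && $5+0 > 80 {print "Warning: Disk usage " $5}' | mail -s "Disk Alert" admin@example.com
30.5 Pruning Best Practices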
When to Prune
    ┌────────────────────────────────────────────────────────────────┐
    │                    WHEN TO PRUNE YOUR NODE                     │
    ├────────────────────────────────────────────────────────────────┤
    │                                                                │
    │  RECOMMENDED:                                                  │
    │  ━━━━━━━━━━━━                                                  │
    │                                                                │
    │  ✅ During low-traffic periods                                 │
    │  ✅ When disk space is running low                             │
    │  ✅ During scheduled maintenance windows                       │
    │  ✅ Before node upgrades                                       │
    │                                                                │
    │  AVOID:                                                        │
    │  ━━━━━━                                                        │
    │  ❌ During initial sync                                        │
    │  ❌ During network upgrades                                    │
    │  ❌ When node is under heavy load                              │
    │  ❌ Right after a chain reorg                                  │
    │                                                                │
    │  WARNING:                                                      │
    │  ━━━━━━━━                                                      │
    │  - Always back up your node data before manual pruning         │
    │  - Archive nodes cannot be pruned                              │
    │  - Pruning may take several hours                              │
    │                                                                │
    └────────────────────────────────────────────────────────────────┘
Automation Scripts
    #!/bin/bash
    # Check disk space (use% of /data as an integer)
    DISK_USAGE=$(df -P /data | awk 'NR==2 {print $5+0}')

    if [ "$DISK_USAGE" -gt 80 ]; then
        echo "Disk usage at ${DISK_USAGE}%, starting pruning..."

        # Stop node
        systemctl stop geth

        # Run offline state pruning (hash-based state scheme only).
        # This can take several hours; removedb is not used here because
        # it deletes the whole database and prompts for confirmation.
        geth snapshot prune-state --datadir /data/ethereum

        # Restart node
        systemctl start geth

        echo "Pruning complete"
    else
        echo "Disk usage at ${DISK_USAGE}%, no pruning needed"
    fi
30.6 Interview Questions
| Question | Answer |
|---|---|
| What is blockchain pruning? | Removing unnecessary historical data while maintaining node functionality |
| What data gets pruned? | Old state trie, receipts, logs, call traces |
| Can archive nodes prune? | No, archive nodes must keep all historical data |
| What’s the difference between full and pruned nodes? | Pruned nodes only keep recent state, full nodes keep more history |
| How much storage does a pruned Ethereum node use? | ~500-800 GB |
Summary
- Pruning is essential for managing storage growth
- Different clients have different pruning options
- Regular pruning prevents disk space issues
- Never prune during sync or upgrades
- Monitor disk space and automate pruning
Next Chapter
In Chapter 31: Node Security Fundamentals, we’ll explore node security.
Last Updated: 2026-02-20