Pruning
Chapter 30: Pruning & Data Management
Overview
Pruning is an essential technique for managing blockchain node storage requirements. Historical data accumulates with every block, so a node's storage footprint grows continuously. Pruning lets a node remove data it no longer needs while keeping full functionality for most use cases.
30.1 Types of Blockchain Nodes
Blockchain node types:

FULL NODE
- Stores all blocks and current state
- Can verify any transaction
- Storage: ~1-1.2 TB (Ethereum)
- Can prune old state data

LIGHT NODE
- Stores block headers only
- Validates consensus
- Storage: ~50 MB
- Requests state data on demand

ARCHIVE NODE
- Stores complete blockchain history
- Enables historical state queries
- Storage: ~12+ TB (Ethereum)
- Cannot prune (needs all data)

Storage Requirements Comparison
| Node Type | Storage (Ethereum Mainnet) | Use Case |
|---|---|---|
| Pruned Full Node | ~500-800 GB | Staking, DApps, development |
| Full Node | ~1-1.2 TB | Maximum compatibility |
| Archive Node | ~12+ TB | Historical analysis, indexers |
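
As a rough planning aid, the figures in the table can be turned into a quick capacity check. The sketch below is illustrative only: `node_types_that_fit` is a hypothetical helper, and the thresholds are simply the upper ends of the ranges above (the lower bound for the archive row), so they will drift as the chain grows.

```shell
#!/bin/bash
# Planning thresholds in GB, copied from the storage table above.
# They are assumptions for illustration, not guarantees.
PRUNED_FULL_GB=800    # upper end of ~500-800 GB
FULL_GB=1200          # upper end of ~1-1.2 TB
ARCHIVE_GB=12000      # lower bound of ~12+ TB

# node_types_that_fit AVAILABLE_GB   (hypothetical helper)
# Prints one line per node type whose threshold fits in AVAILABLE_GB.
node_types_that_fit() {
  local available_gb=$1
  if [ "$available_gb" -ge "$PRUNED_FULL_GB" ]; then echo "pruned-full"; fi
  if [ "$available_gb" -ge "$FULL_GB" ]; then echo "full"; fi
  if [ "$available_gb" -ge "$ARCHIVE_GB" ]; then echo "archive"; fi
  return 0
}

# Example: a 2 TB volume with 2000 GB available
node_types_that_fit 2000
```

On systems with GNU coreutils, the available space could be fed in with something like `df -BG --output=avail /data | tail -1 | tr -dc '0-9'`. Leave extra headroom beyond these thresholds, since the database keeps growing between pruning runs.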
30.2 Pruning Types Explained
Geth Pruning Modes
DEFAULT (automatic):
- Prunes the state trie automatically during normal operation
- Keeps recent state accessible
- Runs in the background
- Minimal performance impact

PRUNE ANCIENT STORE:
- Removes ancient block data
- Keeps recent blocks for reorg handling
- geth --pruneancientstore (this flag is not listed in upstream go-ethereum's help output; confirm it with `geth --help` for your build before relying on it)

MANUAL (offline) PRUNING:
- Stop the node
- Run `geth snapshot prune-state` (applies to the hash-based state scheme; the path-based scheme prunes state online)
- Note that `geth removedb` is not a pruning command: it deletes whole databases, asking for confirmation per database, and forces a resync

Erigon Pruning Options
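These flags apply to Erigon 2.x; Erigon 3 replaces them with `--prune.mode`.

| Flag | Description | Example |
|---|---|---|
| `--prune h` | Prune state history (change sets, history indices) | `--prune h` |
| `--prune r` | Prune receipts and logs | `--prune r` |
| `--prune t` | Prune the transaction-hash lookup index | `--prune t` |
| `--prune c` | Prune call traces | `--prune c` |
| `--prune.h.older` | Keep history newer than N blocks | `--prune h --prune.h.older 90000` |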
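
When choosing N for a block-count retention flag such as `--prune.h.older`, it helps to convert between blocks and wall-clock time. The sketch below assumes Ethereum's 12-second post-merge slot time. The helper names are hypothetical, and missed slots make real retention windows slightly longer than the estimate.

```shell
#!/bin/bash
# Assumed seconds per block (post-merge Ethereum mainnet slot time).
SECONDS_PER_BLOCK=12

# blocks_to_days BLOCKS   (hypothetical helper)
# Prints the approximate number of days covered by BLOCKS blocks.
blocks_to_days() {
  awk -v blocks="$1" -v spb="$SECONDS_PER_BLOCK" \
    'BEGIN { printf "%.1f\n", blocks * spb / 86400 }'
}

# days_to_blocks DAYS   (hypothetical helper)
# Prints the number of blocks produced in DAYS whole days.
days_to_blocks() {
  echo $(( $1 * 86400 / SECONDS_PER_BLOCK ))
}

blocks_to_days 90000   # prints 12.5
days_to_blocks 30      # prints 216000
```

So a value of 90000 keeps roughly 12.5 days of history, and a 30-day window would need about 216000 blocks.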
30.3 Implementing Pruning
Geth Configuration
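
    # Automatic pruning during operation
    geth \
      --mainnet \
      --syncmode "snap" \
      --cache 8192 \
      --datadir /data/ethereum

    # Manual (offline) state pruning
    # 1. Stop the node
    systemctl stop geth

    # 2. Prune stale state (hash-based state scheme only; path-based
    #    databases prune online and do not need this step)
    geth snapshot prune-state --datadir /data/ethereum

    # 3. Restart the node
    systemctl start geth

    # Not pruning: `geth removedb --datadir /data/ethereum` deletes the
    # databases and forces a resync. It asks for confirmation per database,
    # along the lines of (exact wording varies by version):
    #   Remove database? [y/n]
    #   Remove ancient database? [y/n]

Erigon Configuration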

    # Erigon 2.x with pruning
    erigon \
      --chain mainnet \
      --datadir /data/erigon \
      --prune htc \
      --prune.h.older 90000 \
      --prune.t.older 90000 \
      --prune.c.older 90000

    # Explanation:
    # h = state history (change sets, history indices)
    # t = transaction-hash lookup index
    # c = call traces
    # older 90000 = keep roughly the last 12.5 days of data (90,000 blocks x 12 s)

Pruning Configuration File
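
    # Illustrative layout only. This file does not name its client, and key
    # names vary by client and version, so check them against your client's
    # documentation (for Geth, `geth dumpconfig` prints the accepted keys).
    [Eth]
    # Enable automatic pruning
    Pruning = true
    # Pruning threshold (blocks)
    PruningThreshold = 8192

    [Database]
    # Keep ancient data (blocks)
    KeepBlocks = 90000

    [RPC]
    # Limit storage for RPC
    # ...

30.4 Storage Optimization Strategies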
Database Engine Selection
LEVELDB (Geth's original backend):
- Mature and stable
- Good for most use cases
- Single-threaded write

PEBBLE (default for new Geth databases in recent releases; selectable with `--db.engine`):
- Written in Go
- Better concurrent write performance
- Recommended for high-throughput nodes
- Erigon does not use Pebble; it stores data in MDBX

ROCKSDB (Nethermind):
- C++ based
- Excellent performance
- Used by Nethermind

Disk Space Management

    # Monitor disk usage
    df -h

    # Check specific directory
    du -sh /data/ethereum/

    # View detailed breakdown
    du -h --max-depth=2 /data/ethereum/

    # Set up alerts
    # Add to /etc/crontab (system crontab entries need a user field, here root).
    # The usage column is converted to a number before comparing, and mail is
    # sent only when usage is above 80.
    0 * * * * root usage=$(df -P /data | awk 'NR==2 {print $5+0}'); [ "$usage" -gt 80 ] && echo "Warning: disk usage at ${usage} percent" | mail -s "Disk Alert" admin@example.com

30.5 Pruning Best Practices
When to Prune
RECOMMENDED:
- ✅ During low-traffic periods
- ✅ When disk space is running low
- ✅ During scheduled maintenance windows
- ✅ Before node upgrades

AVOID:
- ❌ During initial sync
- ❌ During network upgrades
- ❌ When the node is under heavy load
- ❌ Right after a chain reorg

WARNING:
- Always back up your node data before manual pruning
- Archive nodes cannot be pruned
- Pruning may take several hours

Automation Scripts

    #!/bin/bash
    set -euo pipefail

    # Check disk space; $5+0 turns a value like "85%" into the integer 85
    DISK_USAGE=$(df -P /data | awk 'NR==2 {print $5+0}')

    if [ "$DISK_USAGE" -gt 80 ]; then
      echo "Disk usage at ${DISK_USAGE}%, starting pruning..."

      # Stop node
      systemctl stop geth

      # Run offline state pruning (hash-based state scheme only).
      # Do not use `geth removedb` here: it deletes the databases.
      if geth snapshot prune-state --datadir /data/ethereum; then
        echo "Pruning complete"
      else
        echo "Pruning failed; check the Geth logs" >&2
      fi

      # Restart node whether or not pruning succeeded
      systemctl start geth
    else
      echo "Disk usage at ${DISK_USAGE}%, no pruning needed"
    fi

30.6 Interview Questions
| Question | Answer |
|---|---|
| What is blockchain pruning? | Removing unnecessary historical data while maintaining node functionality |
| What data gets pruned? | Old state trie, receipts, logs, call traces |
| Can archive nodes prune? | No, archive nodes must keep all historical data |
| What's the difference between full and pruned nodes? | Pruned nodes keep only recent state; full nodes keep more history |
| How much storage does a pruned Ethereum node use? | ~500-800 GB |
Summary
- Pruning is essential for managing storage growth
- Different clients have different pruning options
- Regular pruning prevents disk space issues
- Never prune during initial sync or network upgrades
- Monitor disk space and automate pruning
Next Chapter
In Chapter 31: Node Security Fundamentals, we'll explore node security.
Last Updated: 2026-02-20