Chapter 46: Sync Problems & Solutions
Overview
Synchronization issues are among the most common problems encountered when running blockchain nodes. A node that fails to sync properly cannot participate in the network, serve up-to-date RPC requests, or validate transactions. Knowing the main sync failure modes, their symptoms, and their fixes is essential for keeping blockchain infrastructure healthy. This chapter provides troubleshooting guidance for Ethereum nodes; much of it carries over to other blockchain networks.
46.1 Understanding Blockchain Synchronization
How Sync Works
┌─────────────────────────────────────────────────────────────┐
│                   BLOCKCHAIN SYNC PROCESS                   │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐      │
│  │   Network   │───►│  Discovery  │───►│  Download   │      │
│  │    Peers    │    │   Blocks    │    │   Headers   │      │
│  └─────────────┘    └─────────────┘    └─────────────┘      │
│                                               │             │
│                                               ▼             │
│                                        ┌─────────────┐      │
│                                        │   Execute   │      │
│                                        │    State    │      │
│                                        └─────────────┘      │
│                                               │             │
│                                               ▼             │
│                                        ┌─────────────┐      │
│                                        │   Verify    │      │
│                                        │   Results   │      │
│                                        └─────────────┘      │
│                                               │             │
│                                               ▼             │
│                                        ┌─────────────┐      │
│                                        │  Finalized  │      │
│                                        │    State    │      │
│                                        └─────────────┘      │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Sync Stages
| Stage | Description | Duration |
|---|---|---|
| Header Sync | Download block headers | Minutes to hours |
| Body Sync | Download block bodies | Hours |
| State Sync | Download state trie | Hours to days |
| Catchup | Near HEAD, executing | Minutes |
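
The stages above can be tracked with a rough progress figure derived from `eth_syncing`. The sketch below is illustrative only: `sync_progress` and `blocks_remaining` are hypothetical helper names, and the arguments are assumed to be the 0x-prefixed `currentBlock` and `highestBlock` values from the RPC response. It measures block progress only, so a node still downloading state can report a high percentage while being far from usable.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any client): rough block-level sync progress.

# sync_progress <currentBlock_hex> <highestBlock_hex>
# Prints the integer percentage of blocks processed so far.
sync_progress() (
  current=$(( $1 ))   # shell arithmetic accepts 0x-prefixed hex
  highest=$(( $2 ))
  if [ "$highest" -le 0 ]; then
    echo 0
  else
    echo $(( current * 100 / highest ))
  fi
)

# blocks_remaining <currentBlock_hex> <highestBlock_hex>
# Prints how many blocks are still missing (never negative).
blocks_remaining() (
  diff=$(( $2 - $1 ))
  if [ "$diff" -lt 0 ]; then diff=0; fi
  echo "$diff"
)

# Example with the block numbers used later in this chapter
# (0x11a49a0 = 18500000, 0x11a4a04 = 18500100):
#   sync_progress    0x11a49a0 0x11a4a04   # prints 99
#   blocks_remaining 0x11a49a0 0x11a4a04   # prints 100
```

Integer division keeps the helper dependency-free; for finer resolution the same values could be passed to `awk` or `bc`.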
46.2 Stuck Sync
Symptoms
┌─────────────────────────────────────────┐
│          STUCK SYNC INDICATORS          │
├─────────────────────────────────────────┤
│                                         │
│  ✓ Block height not progressing         │
│  ✓ Peer count appears healthy           │
│  ✓ No error messages in logs            │
│  ✓ CPU usage normal                     │
│  ✓ Memory usage normal                  │
│                                         │
│  Example:                               │
│  Current Block: 18500000                │
│  Target Block:  18500100                │
│  (No change for 30+ minutes)            │
│                                         │
└─────────────────────────────────────────┘

Diagnosis Commands
# Check current block
curl -X POST http://localhost:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'

# Check sync status
curl -X POST http://localhost:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'

# Check peer count
curl -X POST http://localhost:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}'

# View recent logs
journalctl -u geth --since "30 minutes ago" | tail -50

Root Causes
| Cause | Description | Probability |
|---|---|---|
| Database corruption | LevelDB/RocksDB corruption | High |
| Peer issues | Connected to stale peers | Medium |
| Consensus issues | Fork in chain | Low |
| Resource constraints | Memory/disk pressure | Medium |
| Network partition | Isolated from network | Low |
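
Before working through the causes above, it helps to confirm that the head is actually stuck rather than just slow. The following is a minimal sketch, assuming two `eth_blockNumber` results sampled some minutes apart; `head_advanced` and `stall_report` are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: compare two eth_blockNumber samples (0x-prefixed hex).

# head_advanced <earlier_hex> <later_hex>
# Succeeds (exit 0) only if the later sample is a higher block number.
head_advanced() (
  [ $(( $2 )) -gt $(( $1 )) ]
)

# stall_report <earlier_hex> <later_hex> <minutes_between_samples>
# Prints a one-line verdict for the sampling window.
stall_report() (
  if head_advanced "$1" "$2"; then
    echo "progressing: +$(( $2 - $1 )) blocks in $3 min"
  else
    echo "stalled: no new blocks in $3 min (block $(( $2 )))"
  fi
)

# Possible usage against a local node (requires curl and jq):
#   get_block() {
#     curl -s -X POST http://localhost:8545 \
#       -H "Content-Type: application/json" \
#       -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
#       | jq -r '.result'
#   }
#   first=$(get_block); sleep 1800; second=$(get_block)
#   stall_report "$first" "$second" 30
```

A "stalled" verdict with a healthy peer count matches the symptom list in this section and points toward the database or resource causes rather than networking.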
Solutions
1. Fresh Sync (Recommended)
# Stop the node
sudo systemctl stop geth

# Back up important data (keystore only)
cp -r /data/ethereum/keystore /backup/keystore

# Remove chaindata
rm -rf /data/ethereum/geth/chaindata

# Remove state data
rm -rf /data/ethereum/geth/triecache

# Optional: remove snapshots if corrupted
rm -rf /data/ethereum/geth/snapshots

# Start the node
sudo systemctl start geth

# Monitor sync progress
geth attach http://localhost:8545
> eth.syncing

2. Try Different Bootnodes
# Ethereum Mainnet Bootnodes
# NOTE: the records below are reproduced as given and read like placeholders;
# substitute real enode:// or enr: records from your client's documentation.
geth --bootnodes \
"enr:-KG4QOtcLhT1LioJW5XHmhLGr9jnoJ5XF8J8p TzW7yGrqDzoP3z6E1T5C9LwK3uK8Q6G7F9B2M1K3W8=-BMgBFYHr7tJ3z6E1T5C9LwK3uK8Q6G7F9B2M1K3W8@bootnode1.mainnet.ethdisco.net:30303,\
enr:-Ly4QFn-6sJ8tJ3z6E1T5C9LwK3uK8Q6G7F9B2M1K3W8=-BMgBFYHr7tJ3z6E1T5C9LwK3uK8Q6G7F9B2M1K3W8@bootnode2.mainnet.ethdisco.net:30303,\
enr:-Ku4QO7sJ8tJ3z6E1T5C9LwK3uK8Q6G7F9B2M1K3W8=-BMgBFYHr7tJ3z6E1T5C9LwK3uK8Q6G7F9B2M1K3W8@bootnode3.mainnet.ethdisco.net:30303"

3. Check System Resources
# Check memory
free -h

# Check disk space
df -h

# Check disk I/O
iostat -x 5

# Check CPU
top
htop

# Check for OOM killer
dmesg | grep -i "out of memory"

4. Reset Database
# Use geth removedb utility
geth removedb --datadir /data/ethereum

# Or specify individual databases
# (flag names differ between Geth releases; confirm with `geth removedb --help`)
geth removedb --datadir /data/ethereum --chaindata
geth removedb --datadir /data/ethereum --ancient

46.3 Slow Sync
Symptoms
┌─────────────────────────────────────────┐
│          SLOW SYNC INDICATORS           │
├─────────────────────────────────────────┤
│                                         │
│  ✓ Sync progressing but very slowly     │
│  ✓ Block height increases slowly        │
│  ✓ High resource usage                  │
│  ✓ Many peer disconnections             │
│                                         │
│  Example Progress:                      │
│  Before: 100 blocks/hour                │
│  After: 10 blocks/hour                  │
│                                         │
└─────────────────────────────────────────┘

Performance Comparison
| Sync Mode | Speed | Resource Usage |
|---|---|---|
| Full Sync | Slowest | High |
| Fast Sync | Medium | Medium |
| Snap Sync | Fast | Medium |
| Light Sync | Fastest | Low |
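
To put a number on "slow", the block rate and remaining time can be estimated from two block-height samples. This is a rough sketch under simple assumptions: `sync_rate` and `sync_eta_hours` are hypothetical helper names, block numbers are passed as decimals, and the rate is assumed to stay constant, which real syncs rarely honor.

```shell
#!/bin/sh
# Hypothetical helpers: estimate sync speed and time remaining.

# sync_rate <start_block> <end_block> <elapsed_seconds>
# Prints the observed rate in blocks per hour (integer).
sync_rate() (
  if [ "$3" -le 0 ]; then
    echo 0
    exit 0
  fi
  echo $(( ($2 - $1) * 3600 / $3 ))
)

# sync_eta_hours <current_block> <target_block> <blocks_per_hour>
# Prints the hours left, rounded up, or "unknown" when the rate is zero.
sync_eta_hours() (
  if [ "$3" -le 0 ]; then
    echo "unknown"
    exit 0
  fi
  remaining=$(( $2 - $1 ))
  if [ "$remaining" -le 0 ]; then
    echo 0
    exit 0
  fi
  echo $(( (remaining + $3 - 1) / $3 ))
)

# Example with the figures from the symptom box:
#   sync_eta_hours 18500000 18500100 100   # prints 1
#   sync_eta_hours 18500000 18500100 10    # prints 10
```

Tracking the rate over several windows makes it easier to tell whether a change such as a larger cache or more peers actually helped.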
Solutions
1. Increase Peer Count
# Increase max peers
geth --maxpeers 100 --datadir /data/ethereum

# Check current peer count
geth attach http://localhost:8545
> net.peerCount

2. Optimize Cache
# Increase cache for 32GB+ RAM system
geth --cache 8192 --datadir /data/ethereum

# For 64GB+ system
geth --cache 16384 --datadir /data/ethereum

# Check current cache usage
geth attach http://localhost:8545
> debug.getBlockChainInfo()

3. Use Faster Storage
# Use NVMe instead of SATA SSD
# Check current storage type
lsblk -o NAME,TYPE,SIZE,MODEL

# Benchmark storage
fio --name=seqread --readonly --filename=/tmp/fiotest --ioengine=libaio --iodepth=1 --bs=4M --direct=1 --size=1G --numjobs=1 --runtime=60 --time_based

# Recommended: Samsung 990 Pro, WD Black SN850X

4. Switch to Snap Sync
# Use snap sync (default in recent Geth versions)
geth --syncmode snap --datadir /data/ethereum

# Verify sync mode
geth attach http://localhost:8545
> eth.syncing

46.4 No Progress (Not Syncing)
Diagnosis
# 1. Check if actually syncing
curl -X POST http://localhost:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'

# Response should show:
# - currentBlock
# - highestBlock
# - knownStates
# - pulledStates

# If not syncing, response is "false"

# 2. Check network connectivity
curl -X POST http://localhost:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"net_listening","params":[],"id":1}'

# 3. Check peers
curl -X POST http://localhost:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"admin_peers","params":[],"id":1}'

Solutions
1. Verify Network Connection
# Test internet connectivity
ping -c 4 8.8.8.8
ping -c 4 google.com

# Test specific ports
nc -zv bootnode.mainnet.ethdisco.net 30303

# Check firewall
sudo ufw status
sudo iptables -L -n

2. Check Logs for Errors
# View recent errors
journalctl -u geth --since "1 hour ago" | grep -i "error"

# Check for specific issues
journalctl -u geth | grep -i "no peers"
journalctl -u geth | grep -i "connection refused"
journalctl -u geth | grep -i "timeout"

3. Restart with Fresh State
# Stop node
sudo systemctl stop geth

# Backup state (optional)
cp -r /data/ethereum/geth/chaindata /backup/chaindata-backup

# Remove state
rm -rf /data/ethereum/geth/chaindata
rm -rf /data/ethereum/geth/triecache

# Restart with new peers
sudo systemctl start geth

46.5 Fork Issues
Detecting Forks
# Check if you're on a fork
# Compare your block with external block explorers

# Get your current block
curl -X POST http://localhost:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'

# Compare with:
# - https://etherscan.io/blocks
# - https://blockchain.info/

# Check for reorgs
geth attach http://localhost:8545
> debug.traceBlockByNumber(eth.blockNumber)

Resolving Fork Issues
# 1. Check which fork you're on
geth attach http://localhost:8545
> eth.getBlock(eth.blockNumber).hash

# 2. Compare with canonical chain
# Etherscan API
curl "https://api.etherscan.io/api?module=block&action=getblocknobytime&timestamp=$(date +%s)&closest=before"

# 3. If on wrong fork, resync
sudo systemctl stop geth
rm -rf /data/ethereum/geth/chaindata
sudo systemctl start geth

46.6 Era Merge / History Issues
Understanding Erigon/Post-Merge Issues
After The Merge, syncing became more complex:
┌─────────────────────────────────────────────────────────────┐
│                 POST-MERGE SYNC CHALLENGES                  │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Execution Layer (EL) ◄──► Consensus Layer (CL)             │
│                                                             │
│  Issues:                                                    │
│  ✓ CL node must be running for EL to sync                   │
│  ✓ EL needs authenticated connection to CL                  │
│  ✓ Different sync modes for EL vs CL                        │
│  ✓ Beacon chain data required                               │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Fix EL/CL Sync Issues
# 1. Ensure both EL and CL are running
# (check whichever consensus client you actually run)
systemctl status geth
systemctl status lighthouse
systemctl status prysm
systemctl status teku

# 2. Check EL/CL connection
geth attach http://localhost:8545
> debug.getExecutionEnginePayload()

# 3. Check JWT secret
cat /data/ethereum/geth/jwtsecret
# Should match CL configuration

# 4. If using Lighthouse (beacon node subcommand):
lighthouse bn --execution-endpoint http://localhost:8551 \
  --execution-jwt /data/ethereum/geth/jwtsecret

# 5. Check for sync progress
curl -X GET http://localhost:5052/eth/v1/node/syncing

46.7 Erigon-Specific Sync Issues
Erigon Common Problems
# Erigon requires specific flags
erigon \
  --datadir /data/ethereum \
  --chain mainnet \
  --http \
  --http.addr 0.0.0.0 \
  --http.port 8545 \
  --http.api eth,net,debug,trace,txpool \
  --ws

# Check Erigon sync status
curl -X POST http://localhost:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'

# Erigon flags for sync issues
# --prune.mode=minimal   # Keep only recent history (smallest footprint)
# --prune.mode=archive   # Full archive
# --externalcl           # Use external consensus client

46.8 Validator Sync Issues
Lighthouse/Prysm Specific
# Check Lighthouse sync status
curl -X GET http://localhost:5052/eth/v1/node/syncing | jq

# Check the latest beacon block headers
curl -X GET http://localhost:5052/eth/v1/beacon/headers | jq

# Check proposer duties for an epoch (1200 here)
curl -X GET http://localhost:5052/eth/v1/validator/duties/proposer/1200 | jq

# Check beacon node health
# (the endpoint answers with an HTTP status code and no body: 200 = synced, 206 = syncing)
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:5052/eth/v1/node/health

Fix Validator Sync
# Lighthouse subcommands and flags change between releases;
# confirm the commands in steps 1 and 3 with `lighthouse bn --help`.

# 1. Check if beacon chain is synced
lighthouse bn --testnet=mainnet status

# 2. Check EL connection
# The engine port (8551) expects an HS256 JWT signed with the secret in jwtsecret,
# not the raw secret itself; TOKEN is assumed to hold such a token.
curl -X POST http://localhost:8551 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'

# 3. Reset beacon chain data
lighthouse bn --testnet=mainnet db clear

# 4. Restart
systemctl restart lighthouse

46.9 Network-Specific Issues
Testnet vs Mainnet
# Testnet Sync (Sepolia)
geth --sepolia \
  --syncmode snap \
  --bootnodes "enr:-KG4QO..."

# Testnet Sync (Goerli - deprecated)
geth --goerli \
  --syncmode snap

# Mainnet Sync
geth --mainnet \
  --syncmode snap

Private Network Issues
# Check genesis file matches network
jq '.config' /data/ethereum/geth/genesis.json

# Verify network ID
curl -X POST http://localhost:8545 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"net_version","params":[],"id":1}'

# Check if bootnodes are correct (the Geth flag is --networkid)
geth --bootnodes "enr:..." --networkid 12345

46.10 Monitoring & Prevention
Prometheus Alerts
groups:
  - name: sync_alerts
    rules:
      - alert: NodeNotSyncing
        expr: ethereum_block_height{job="geth"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node {{ $labels.instance }} is not syncing"

      - alert: SyncStalled
        expr: rate(ethereum_block_height[5m]) == 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Node {{ $labels.instance }} sync stalled"

      - alert: SyncTooSlow
        expr: rate(ethereum_block_height[5m]) < 10
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Node {{ $labels.instance }} sync too slow"

      - alert: NodeBehindNetwork
        expr: (ethereum_network_block - ethereum_block_height) > 100
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Node {{ $labels.instance }} is behind network by 100+ blocks"

Health Check Script
#!/bin/bash
RPC_URL="http://localhost:8545"
MAX_BLOCK_DIFF=10

# Get current block (eth_blockNumber returns a 0x-prefixed hex string)
CURRENT_HEX=$(curl -s -X POST "$RPC_URL" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
  | jq -r '.result')
CURRENT=$((CURRENT_HEX))  # bash arithmetic converts the hex value to decimal

# Get network block (from a block explorer; Etherscan may require an API key)
NETWORK_BLOCK=$(curl -s "https://api.etherscan.io/api?module=block&action=getblocknobytime&timestamp=$(date +%s)&closest=before" \
  | jq -r '.result')

DIFF=$((NETWORK_BLOCK - CURRENT))

if [ "$DIFF" -gt "$MAX_BLOCK_DIFF" ]; then
  echo "WARNING: Node is $DIFF blocks behind network"
  exit 1
else
  echo "OK: Node is synced (diff: $DIFF)"
  exit 0
fi

46.11 Quick Reference
Common Commands
| Issue | Solution |
|---|---|
| Stuck at block X | rm -rf /data/ethereum/geth/chaindata && systemctl restart geth |
| No peers | Add bootnodes, check firewall |
| Slow sync | Increase peers, use snap sync, upgrade hardware |
| Database corruption | geth removedb then resync |
| Fork issues | Compare with Etherscan, resync if needed |
| Post-Merge issues | Ensure CL node running, check JWT |
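
For the "check JWT" row, note that the Engine API authenticates requests with an HS256 JWT that carries an `iat` (issued-at) claim and is signed with the hex secret from the jwtsecret file. The following is a minimal sketch of building such a token with `openssl`; `b64url` and `make_engine_jwt` are hypothetical helper names, and the path in the usage comment is the one used elsewhere in this chapter.

```shell
#!/bin/sh
# Hypothetical helpers: build an HS256 JWT for the authenticated engine port.
# Requires openssl.

# b64url: base64url-encode stdin (URL-safe alphabet, no padding).
b64url() {
  openssl base64 -A | tr '+/' '-_' | tr -d '='
}

# make_engine_jwt <hex_secret> [iat_unix_seconds]
# Prints header.payload.signature; iat defaults to the current time.
make_engine_jwt() (
  secret_hex="${1#0x}"              # tolerate an optional 0x prefix
  iat="${2:-$(date +%s)}"
  header=$(printf '%s' '{"alg":"HS256","typ":"JWT"}' | b64url)
  payload=$(printf '{"iat":%s}' "$iat" | b64url)
  signature=$(printf '%s.%s' "$header" "$payload" \
    | openssl dgst -sha256 -mac HMAC -macopt "hexkey:${secret_hex}" -binary \
    | b64url)
  printf '%s.%s.%s\n' "$header" "$payload" "$signature"
)

# Possible usage against a local execution client:
#   TOKEN=$(make_engine_jwt "$(cat /data/ethereum/geth/jwtsecret)")
#   curl -s -X POST http://localhost:8551 \
#     -H "Content-Type: application/json" \
#     -H "Authorization: Bearer $TOKEN" \
#     -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}'
```

A rejected token usually means the EL and CL were started with different secret files, or that the system clock is far enough off for the `iat` claim to be refused.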
Emergency Commands
# Nuclear option - delete everything except keys
# (this also deletes /data/ethereum/geth/jwtsecret, so EL and CL need a matching secret again afterwards)
systemctl stop geth
rm -rf /data/ethereum/geth/*
cp -r /backup/keystore /data/ethereum/keystore
systemctl start geth

Summary
- Stuck sync is often solved by fresh sync or database reset
- Slow sync requires hardware upgrades or configuration tuning
- No progress is usually network or resource related
- Post-Merge requires both EL and CL to be synced
- Monitoring is critical for early problem detection
- Prevention: Regular health checks, adequate resources, updated software
Next Chapter
In Chapter 47: Memory & CPU Optimization, we’ll explore performance optimization techniques.
Last Updated: 2026-02-22