
Backup

Data backup and recovery is a critical aspect of blockchain node operations. Proper backup strategies protect against data loss due to hardware failures, software bugs, human error, or security incidents. This chapter provides comprehensive guidance on what to back up, how to back it up, and how to recover from various failure scenarios.


┌─────────────────────────────────────────────────────────┐
│                 BACKUP PRIORITY MATRIX                  │
├─────────────────────────────────────────────────────────┤
│                                                         │
│ CRITICAL (Must Backup):                                 │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ • Keystore files (private keys)                     │ │
│ │ • Validator keys                                    │ │
│ │ • Mnemonic phrases                                  │ │
│ │ • Node private key                                  │ │
│ └─────────────────────────────────────────────────────┘ │
│                                                         │
│ HIGH (Should Backup):                                   │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ • Configuration files                               │ │
│ │ • JWT secret (for CL-EL communication)              │ │
│ │ • BLS keys (Ethereum 2)                             │ │
│ │ • Database (if practical)                           │ │
│ └─────────────────────────────────────────────────────┘ │
│                                                         │
│ MEDIUM (Nice to Have):                                  │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ • Historical blockchain data                        │ │
│ │ • Snapshots                                         │ │
│ │ • State data                                        │ │
│ └─────────────────────────────────────────────────────┘ │
│                                                         │
│ LOW (Not Needed):                                       │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ • Peer connections                                  │ │
│ │ • Cache data                                        │ │
│ │ • Log files                                         │ │
│ └─────────────────────────────────────────────────────┘ │
│                                                         │
└─────────────────────────────────────────────────────────┘
| Data           | Priority | Size     | Backup Method               |
|----------------|----------|----------|-----------------------------|
| Keystore       | Critical | < 1 MB   | Encrypted, multiple copies  |
| Validator keys | Critical | < 1 MB   | Hardware wallet + encrypted |
| nodekey        | High     | 32 bytes | Single encrypted copy       |
| JWT secret     | High     | 32 bytes | Same as validator           |
| Config files   | High     | < 1 MB   | Version control             |
| Database       | Medium   | 1-12 TB  | Incremental, offsite        |
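The matrix and table above can be encoded in a small helper that classifies a path before it is added to a backup set. This is a sketch: the `backup_priority` function and its path patterns are illustrative assumptions, not part of any client; adapt them to your own directory layout.

```shell
#!/bin/bash
# Classify a node data path per the backup priority matrix.
# Patterns are illustrative; critical patterns are matched first.
backup_priority() {
    case "$1" in
        *keystore*|*priv_validator_key*|*mnemonic*|*nodekey*) echo "critical" ;;
        *config*|*jwtsecret*|*.toml)                          echo "high" ;;
        *chaindata*|*snapshot*|*state*)                       echo "medium" ;;
        *)                                                    echo "low" ;;
    esac
}

backup_priority "/data/ethereum/keystore/UTC--2024"   # critical
backup_priority "/data/ethereum/geth/config.toml"     # high
backup_priority "/data/ethereum/geth/chaindata"       # medium
```

Pattern order matters: a keystore path inside a config directory should still be treated as critical, so the most sensitive patterns are tested first.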

```bash
# Create backup directory
BACKUP_DIR="/backup/ethereum/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

# Backup keystore (CRITICAL)
tar -czf "$BACKUP_DIR/keystore.tar.gz" -C /data/ethereum/ keystore/

# Backup node key
cp /data/ethereum/geth/nodekey "$BACKUP_DIR/nodekey"

# Backup JWT secret
cp /data/ethereum/geth/jwtsecret "$BACKUP_DIR/jwtsecret"

# Backup configuration
tar -czf "$BACKUP_DIR/config.tar.gz" -C /data/ethereum/ geth/

# Verify backup
ls -la "$BACKUP_DIR/"

# Encrypt critical backups
gpg --symmetric --cipher-algo AES256 "$BACKUP_DIR/keystore.tar.gz"
gpg --symmetric --cipher-algo AES256 "$BACKUP_DIR/nodekey"
```
backup_critical.sh
```bash
#!/bin/bash
set -euo pipefail

# Configuration
BACKUP_DIR="/backup/ethereum"
SOURCE_DATA="/data/ethereum"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30

# Create timestamped backup
mkdir -p "$BACKUP_DIR/$DATE"
echo "=== Starting Backup: $DATE ==="

# 1. Backup keystore (critical)
echo "Backing up keystore..."
tar -czf "$BACKUP_DIR/$DATE/keystore.tar.gz" \
    -C "$SOURCE_DATA" keystore/

# 2. Backup node key
echo "Backing up node key..."
if [ -f "$SOURCE_DATA/geth/nodekey" ]; then
    cp "$SOURCE_DATA/geth/nodekey" "$BACKUP_DIR/$DATE/nodekey"
fi

# 3. Backup JWT secret
echo "Backing up JWT secret..."
if [ -f "$SOURCE_DATA/geth/jwtsecret" ]; then
    cp "$SOURCE_DATA/geth/jwtsecret" "$BACKUP_DIR/$DATE/jwtsecret"
fi

# 4. Backup configuration
echo "Backing up configuration..."
tar -czf "$BACKUP_DIR/$DATE/config.tar.gz" \
    -C "$SOURCE_DATA" geth/config.toml

# 5. Verify
echo "Verifying backup..."
for file in "$BACKUP_DIR/$DATE"/*; do
    if [ -f "$file" ]; then
        SIZE=$(du -h "$file" | cut -f1)
        echo "$file ($SIZE)"
    fi
done

# 6. Clean old backups
# -mindepth 1 keeps find from ever matching (and deleting) $BACKUP_DIR itself
echo "Cleaning old backups..."
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +"$RETENTION_DAYS" -exec rm -rf {} \;

# 7. Create backup manifest
cat > "$BACKUP_DIR/$DATE/manifest.txt" << EOF
Backup Date: $DATE
Hostname: $(hostname)
Backup Contents:
- keystore.tar.gz
- nodekey
- jwtsecret
- config.tar.gz
Node Version: $(geth version 2>/dev/null || echo "unknown")
EOF

echo "=== Backup Complete ==="
echo "Backup location: $BACKUP_DIR/$DATE"
```
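The manifest records what was backed up, but not whether the files are still intact. A small addition can write SHA-256 checksums alongside each backup so later verification detects corruption or truncation. The `generate_checksums` and `verify_checksums` helpers below are our own illustrative names, not part of the script above:

```shell
#!/bin/bash
# Write a SHA256SUMS file next to the backup contents, verify it later.
generate_checksums() {
    # Checksum every regular file in the backup dir except the sums file itself.
    ( cd "$1" && find . -maxdepth 1 -type f ! -name SHA256SUMS \
        -exec sha256sum {} + > SHA256SUMS )
}

verify_checksums() {
    # Non-zero exit status means at least one file no longer matches.
    ( cd "$1" && sha256sum -c --quiet SHA256SUMS )
}

# Example against a scratch directory:
DIR=$(mktemp -d)
echo "dummy keystore" > "$DIR/keystore.tar.gz"
generate_checksums "$DIR"
verify_checksums "$DIR" && echo "backup intact"
```

Calling `generate_checksums "$BACKUP_DIR/$DATE"` as step 8 of the script above would make `verify_backup.sh` checks stronger than the `tar -tzf` structural test alone.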
```bash
# Create backup directory
mkdir -p /backup/gaia

# Backup validator key (CRITICAL; never run two nodes with the same
# priv_validator_key.json -- that risks double-signing and slashing)
cp ~/.gaia/config/priv_validator_key.json /backup/gaia/validator_key.json

# Backup node key
cp ~/.gaia/config/node_key.json /backup/gaia/node_key.json

# Backup gentx
cp -r ~/.gaia/config/gentx /backup/gaia/gentx/

# Backup Cosmos SDK app state (optional)
cp -r ~/.gaia/data /backup/gaia/data/

# Create compressed backup
tar -czf gaia_backup_$(date +%Y%m%d).tar.gz -C /backup gaia/
```

backup_incremental.sh
```bash
#!/bin/bash
# For large databases, use rsync for incremental backup.
# Stop the node first: copying a live database yields an inconsistent backup.
SOURCE="/data/ethereum/geth"
DEST="/backup/ethereum/database"

# Create destination
mkdir -p "$DEST"

# Use rsync for incremental backup
rsync -avh --progress \
    --exclude='*.log' \
    --exclude='*.tmp' \
    "$SOURCE/chaindata" "$DEST/"

# Create snapshot backup
tar -czf /backup/ethereum/chaindata_$(date +%Y%m%d).tar.gz \
    -C /data/ethereum/geth chaindata/
```
```bash
# For pruned nodes (~100GB), a full snapshot is practical
# (stop the node first so the archive is consistent)
tar -czf /backup/ethereum/full_snapshot_$(date +%Y%m%d).tar.gz \
    -C /data/ethereum geth/
```

backup_to_s3.sh
```bash
#!/bin/bash
# Configure AWS S3
S3_BUCKET="ethereum-backups"
DATE=$(date +%Y%m%d)

# Create encrypted archive and stream it straight to S3
# (openssl prompts for a passphrase; use -pass for unattended runs)
tar -czf - /data/ethereum/keystore | \
    openssl enc -aes-256-cbc -salt -pbkdf2 | \
    aws s3 cp - "s3://$S3_BUCKET/keystore_$DATE.tar.enc"

# Verify upload
aws s3 ls "s3://$S3_BUCKET/"
```
```bash
# Install rclone
curl https://rclone.org/install.sh | sudo bash

# Configure rclone: set up a "crypt" remote for client-side encryption
# (rclone encrypts via crypt remotes, not a command-line flag)
rclone config

# Copy to the encrypted remote
rclone copy /data/ethereum/keystore remote:backups/keystore \
  --exclude "*.log" \
  -v

# Schedule with cron (e.g. daily at 02:00)
# 0 2 * * * /path/to/backup_script.sh
```

restore.sh
```bash
#!/bin/bash
BACKUP_FILE=$1

if [ -z "$BACKUP_FILE" ]; then
    echo "Usage: $0 <backup_file>"
    exit 1
fi

echo "=== Starting Recovery ==="

# Stop node
echo "Stopping node..."
sudo systemctl stop geth

# Verify backup integrity
echo "Verifying backup..."
if file "$BACKUP_FILE" | grep -q "gzip"; then
    echo "Valid gzip archive"
else
    echo "ERROR: Invalid backup file"
    exit 1
fi

# Extract to temporary location
echo "Extracting backup..."
mkdir -p /tmp/restore
tar -xzf "$BACKUP_FILE" -C /tmp/restore

# Restore keystore
echo "Restoring keystore..."
sudo cp /tmp/restore/keystore/* /data/ethereum/keystore/
sudo chown -R ethereum:ethereum /data/ethereum/keystore

# Restore node key (if present)
if [ -f "/tmp/restore/nodekey" ]; then
    echo "Restoring node key..."
    sudo cp /tmp/restore/nodekey /data/ethereum/geth/nodekey
fi

# Restore config
if [ -f "/tmp/restore/config.toml" ]; then
    echo "Restoring configuration..."
    sudo cp /tmp/restore/config.toml /data/ethereum/geth/config.toml
fi

# Clean up
rm -rf /tmp/restore

# Start node
echo "Starting node..."
sudo systemctl start geth

# Verify (query the node non-interactively over RPC)
sleep 30
echo "Verifying node status..."
if geth attach --exec 'eth.blockNumber' http://localhost:8545 > /dev/null 2>&1; then
    echo "✓ Node restored successfully"
else
    echo "⚠ Warning: Node may need attention"
fi

echo "=== Recovery Complete ==="
```
```bash
# Recovery from complete failure

# 1. Install fresh OS

# 2. Install node software (Geth ships via the ethereum/ethereum PPA on Ubuntu)
sudo add-apt-repository -y ppa:ethereum/ethereum
sudo apt-get update
sudo apt-get install -y geth

# 3. Restore from backup
./restore.sh /backup/ethereum/keystore_20240101.tar.gz

# 4. Start node
sudo systemctl start geth

# 5. Verify
journalctl -fu geth
geth attach http://localhost:8545
> eth.blockNumber
```

┌─────────────────────────────────────────────────────────┐
│              VALIDATOR RECOVERY PROCEDURE               │
├─────────────────────────────────────────────────────────┤
│                                                         │
│ Scenario: Validator hardware failure                    │
│                                                         │
│ 1. New server ready                                     │
│ 2. Install execution client (Geth)                      │
│ 3. Install consensus client (Lighthouse)                │
│ 4. Restore from backup:                                 │
│    - Keystore password                                  │
│    - Validator keystore files                           │
│    - BLS keys                                           │
│    - JWT secret                                         │
│ 5. Start both clients                                   │
│ 6. Verify validator is active                           │
│ 7. Monitor attestation duties                           │
│                                                         │
└─────────────────────────────────────────────────────────┘
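Step 6 can be automated against the standard beacon node API (`/eth/v1/beacon/states/head/validators/{pubkey}`), which Lighthouse serves on port 5052 by default. The sketch below assumes that port; the `validator_is_active` helper and its grep-based parsing are ours, and a real JSON parser such as `jq` would be preferable in production:

```shell
#!/bin/bash
# Check whether a validator is active via the standard beacon node API.
validator_is_active() {
    # $1 is the JSON returned by, e.g.:
    #   curl -s http://localhost:5052/eth/v1/beacon/states/head/validators/<pubkey>
    echo "$1" | grep -q '"status" *: *"active_ongoing"'
}

# Example response fragment (illustrative):
RESPONSE='{"data":{"index":"12345","status":"active_ongoing"}}'
if validator_is_active "$RESPONSE"; then
    echo "validator is active"
fi
```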
```bash
# If all validator keys are lost, regenerate them from the mnemonic.
# The staking-deposit-cli "existing-mnemonic" command rebuilds the same
# keystores deterministically (it prompts for the 24-word mnemonic):
./deposit existing-mnemonic \
  --chain mainnet \
  --validator_start_index 0 \
  --num_validators 1

# This regenerates your validator keystores under ./validator_keys/
```

verify_backup.sh
```bash
#!/bin/bash
BACKUP_DIR="/backup/ethereum"

echo "=== Backup Verification ==="

# Check backup files exist
echo "Checking backup files..."
ls -la "$BACKUP_DIR"/*/keystore.tar.gz

# Verify archive integrity (three most recent backups)
echo "Verifying archive integrity..."
for backup in $(ls -d "$BACKUP_DIR"/*/ | tail -3); do
    echo "Testing: $backup"
    if tar -tzf "$backup/keystore.tar.gz" > /dev/null 2>&1; then
        echo "$backup is valid"
    else
        echo "$backup is corrupted!"
    fi
done

# Test extraction (to temp)
echo "Testing extraction..."
mkdir -p /tmp/verify
tar -xzf "$BACKUP_DIR/$(ls -t "$BACKUP_DIR" | head -1)/keystore.tar.gz" -C /tmp/verify
ls /tmp/verify/keystore/
rm -rf /tmp/verify

echo "=== Verification Complete ==="
```

disaster_recovery_plan.yml
```yaml
---
recovery_time_objective: 4 hours
recovery_point_objective: 24 hours

critical_systems:
  - name: Validator
    backup_frequency: hourly
    recovery_steps:
      - Restore from latest keystore backup
      - Reinstall validator software
      - Configure and start
      - Verify validator status
  - name: RPC Node
    backup_frequency: daily
    recovery_steps:
      - Provision new server
      - Install node software
      - Restore config
      - Sync from snapshot

data_retention:
  keystore: permanent
  config: 90 days
  database: 14 days
  logs: 30 days
```
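A recovery point objective is only met if backups actually keep pace with it. The sketch below fails loudly when the newest backup is older than the RPO; the `/backup/ethereum` path and 24-hour figure mirror the plan above, the `newest_age_hours` helper is our own, and `stat -c %Y` assumes GNU coreutils:

```shell
#!/bin/bash
# Alert when the newest backup exceeds the recovery point objective.
RPO_HOURS=24

newest_age_hours() {
    # Age, in whole hours, of the most recently modified entry under $1.
    local newest
    newest=$(ls -t "$1" | head -1)
    echo $(( ( $(date +%s) - $(stat -c %Y "$1/$newest") ) / 3600 ))
}

BACKUP_ROOT=${BACKUP_ROOT:-/backup/ethereum}
if [ -d "$BACKUP_ROOT" ]; then
    AGE=$(newest_age_hours "$BACKUP_ROOT")
    if [ "$AGE" -gt "$RPO_HOURS" ]; then
        echo "ALERT: newest backup is ${AGE}h old (RPO: ${RPO_HOURS}h)"
    else
        echo "OK: newest backup is ${AGE}h old"
    fi
fi
```

Run from cron, this turns the RPO line in the plan into a monitored invariant rather than a statement of intent.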
# Disaster Recovery Runbook

## Scenario: Complete Server Failure

### Detection

- Monitoring alert: Node unreachable
- Manual report from team

### Immediate Actions (0-30 min)

1. Acknowledge alert
2. Verify server is down
3. Check backup availability

### Recovery (30 min - 2 hours)

1. Provision replacement server
2. Install OS and dependencies
3. Restore from backup
4. Start services
5. Verify functionality

### Post-Recovery (2-4 hours)

1. Monitor node health
2. Verify sync status
3. Confirm RPC service
4. Update stakeholders
5. Document incident
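The "verify sync status" and "confirm RPC service" steps in the runbook can be scripted against the node's JSON-RPC `eth_syncing` method, which returns `false` once sync is complete. A sketch, with our own `sync_state` helper and grep-based parsing (use a JSON parser in production):

```shell
#!/bin/bash
# Post-recovery: report sync state from an eth_syncing JSON-RPC response.
sync_state() {
    # $1 is the JSON-RPC response to eth_syncing, e.g. from:
    #   curl -s -X POST -H 'Content-Type: application/json' \
    #     --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}' \
    #     http://localhost:8545
    if echo "$1" | grep -q '"result" *: *false'; then
        echo "synced"
    else
        echo "syncing"
    fi
}

# Example: a fully synced node returns result:false
sync_state '{"jsonrpc":"2.0","id":1,"result":false}'   # synced
```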

  • Backup priorities: Keys > Config > Database
  • Automate backups: Use cron for regular backups
  • Encrypt backups: Use GPG or cloud encryption
  • Offsite storage: Keep copies in different locations
  • Test restore: Regularly verify backup integrity
  • Document procedures: Have clear recovery runbooks
  • Validator-specific: Extra precautions for validator keys
  • DR planning: Plan for complete site failure

In Chapter 43: Chain Reorganizations, we’ll look at how nodes detect and handle reorgs.


Last Updated: 2026-02-22