Memory Performance
Chapter 64: Memory Performance
Comprehensive Linux Memory Performance Monitoring and Tuning
Why This Matters in DevOps/SRE
Memory performance is critical for DevOps and SRE roles because applications rely on proper memory management to function efficiently. Memory leaks, OOM (Out of Memory) kills, and excessive swapping can cause service outages, slow response times, and customer-impacting incidents.
Memory in a DevOps/SRE context:
- Incident response: "Server unresponsive" → check memory → OOM killer active → identify the memory hog → restart the service or scale up
- Proactive monitoring: memory usage > 80% → alert on-call → scale the application; swap usage rising → investigate the leak → schedule a fix
- Capacity planning: current 16GB/app, projected growth 3x, plan for a 64GB node
Real-world scenarios:
- Pod eviction in Kubernetes: When a node runs out of memory, the kubelet evicts pods, causing service disruptions
- Database OOM: MySQL/PostgreSQL getting killed by OOM killer due to large queries
- Java heap issues: JVM consuming excessive RSS beyond heap size
- Redis memory fragmentation: Causing unexpected memory pressure
57.1 Linux Memory Architecture
Memory Management Overview
Linux memory architecture, from top to bottom:
- Virtual memory: each process (Process A, Process B) has its own virtual address space (0-4GB, as on a 32-bit system)
- Page tables (MMU): translate virtual addresses to physical addresses
- Physical memory: page cache (file data), buffers (block devices), free pages, slab (kernel), and process RSS
- Swap: anonymous (process) pages moved out of RAM, plus the swap cache
Memory Types in Linux
| Type | Description | Location |
|---|---|---|
| RSS | Resident Set Size - actual physical memory used | Physical RAM |
| VSZ | Virtual Size - total virtual memory allocated | Virtual |
| Page Cache | Cached file content from disk | Physical RAM |
| Buffers | Block device buffers | Physical RAM |
| Slab | Kernel data structures | Physical RAM |
| Swap | Pages moved to disk | Swap space |
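The categories in the table map onto fields of /proc/meminfo. As a minimal sketch (the function name `meminfo_summary` and the output labels are made up for illustration), the following reads a meminfo-style file and prints the page cache, buffers, slab, and used swap in MiB:

```bash
# Summarise a meminfo-style file into the categories from the table.
# Usage: meminfo_summary [file]   (defaults to /proc/meminfo)
# meminfo_summary is a hypothetical helper, not a standard tool.
meminfo_summary() {
    awk '
        # Each line looks like "Key:   value kB"; strip the colon, keep the number.
        { sub(":", "", $1); kb[$1] = $2 }
        END {
            printf "page_cache_mib=%d\n", kb["Cached"] / 1024
            printf "buffers_mib=%d\n", kb["Buffers"] / 1024
            printf "slab_mib=%d\n", kb["Slab"] / 1024
            printf "swap_used_mib=%d\n", (kb["SwapTotal"] - kb["SwapFree"]) / 1024
        }
    ' "${1:-/proc/meminfo}"
}
```

Calling `meminfo_summary` with no argument reads the live /proc/meminfo; passing a saved copy makes it easy to compare snapshots taken before and after an incident.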
57.2 Memory Information Commands
Viewing Memory
# Basic memory info
free -h
# Output:
#                total        used        free      shared  buff/cache   available
# Mem:            15Gi       4.5Gi       8.2Gi       200Mi       2.3Gi        10Gi
# Swap:          2.0Gi          0B       2.0Gi

# Detailed memory info
cat /proc/meminfo
# MemTotal:       16384000 kB
# MemFree:         8500000 kB
# MemAvailable:   10500000 kB
# Buffers:          240000 kB
# Cached:          2400000 kB
# SwapCached:        10000 kB
# Active:          3200000 kB
# Inactive:        1800000 kB
# SwapTotal:       2097152 kB
# SwapFree:        2097152 kB
# Dirty:              1000 kB
# Writeback:           100 kB

# Memory usage in MiB
free -m

# Continuous monitoring (refresh every 3 seconds)
free -h -s 3

# Human-readable, wide output (-w) with a totals row (-t)
free -hwt

Detailed Memory Information
# Hardware memory info
sudo dmidecode -t memory

# Memory devices (DMI type 17)
sudo dmidecode -t 17

# NUMA memory info
numactl --hardware
numactl --show

# Free pages per zone and block order (shows fragmentation)
cat /proc/buddyinfo

# VM memory stats
cat /proc/vmstat

# Memory pressure (PSI, kernel 4.20+)
cat /proc/pressure/memory

57.3 Memory Monitoring Tools
top and htop
# Memory usage with top
top
# Press M to sort by memory
# Press m to toggle memory display

# htop with memory
htop
# Press F6, then choose the memory column (%MEM) to sort by it

# Memory columns in top
# VIRT - Virtual memory
# RES  - Resident (physical) memory
# SHR  - Shared memory
# %MEM - Memory percentage

vmstat and sar
# Memory and swap statistics, one sample per second
vmstat 1

# Detailed memory stats
vmstat -s        # counters and totals
sudo vmstat -m   # slab caches (reads /proc/slabinfo)

# Memory stats with sar
sar -r 1 5
# kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
#   8000000   8000000      50%    200000  2400000  1000000      6%

# Swap space utilization
sar -S 1 5

# Paging statistics
sar -B 1 5

# Extended memory statistics, including slab (kbslab)
sar -r ALL 1 5

Process Memory Analysis
# Memory by process
ps aux --sort=-%mem | head -20

# Detailed process memory (12345 is an example PID)
pmap -x 12345
# Shows each memory mapping for the process

# Process memory in tree format
ps -eo pid,ppid,%mem,%cpu,comm --forest | head -20

# Memory used by one user's processes
ps -U username -o pid,vsz,rss,pmem,comm

# Shared memory segments
ipcs -m

# IPC limits (shared memory, semaphores, message queues)
ipcs -l

slabtop
# Kernel slab allocator (usually needs root)
sudo slabtop
# Refreshes every 3 seconds by default; change with -d <seconds>

# Options
sudo slabtop -s c   # Sort by cache size
sudo slabtop -s o   # Sort by object count

57.4 Memory Tuning Parameters
sysctl Parameters
# View current values
sysctl vm.swappiness
sysctl vm.vfs_cache_pressure
sysctl vm.overcommit_memory
sysctl vm.overcommit_ratio

# Add to /etc/sysctl.d/99-memory.conf:

# Swappiness (0-100, or 0-200 since kernel 5.8; lower = less swapping)
vm.swappiness = 10

# Dentry/inode cache reclaim pressure (higher = reclaim these caches sooner)
vm.vfs_cache_pressure = 50

# Memory overcommit (0=heuristic, 1=always, 2=never)
vm.overcommit_memory = 0

# Overcommit ratio in percent (only used when overcommit_memory=2)
vm.overcommit_ratio = 50

# Minimum free memory
vm.min_free_kbytes = 65536

# Dirty page ratios
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
vm.dirty_expire_centisecs = 3000

# Apply changes (plain "sysctl -p" only reads /etc/sysctl.conf)
sudo sysctl --system

Swappiness Guide
| Value | Behavior | Best For |
|---|---|---|
| 0 | Swap only as a last resort before OOM | Desktop with SSD, enough RAM |
| 10 | Minimal swapping | Most servers |
| 30 | Default (some distros) | General use |
| 60 | Kernel default; swaps more readily | Systems with limited RAM |
| 100 | Anonymous and file pages reclaimed with equal weight | Rarely used |
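Before changing swappiness, it helps to confirm whether the host is actually swapping. The sketch below compares the `pswpin` and `pswpout` counters from two saved copies of /proc/vmstat; the function name `swap_delta` and the snapshot file paths are illustrative assumptions:

```bash
# Print pages swapped in/out between two /proc/vmstat snapshots.
# Usage: swap_delta <before_file> <after_file>
# swap_delta is a hypothetical helper, not a standard tool.
swap_delta() {
    awk '
        $1 == "pswpin" || $1 == "pswpout" {
            # FNR == NR is only true while reading the first file.
            if (FNR == NR) before[$1] = $2
            else           after[$1]  = $2
        }
        END {
            printf "pswpin_delta=%d pswpout_delta=%d\n",
                   after["pswpin"] - before["pswpin"],
                   after["pswpout"] - before["pswpout"]
        }
    ' "$1" "$2"
}
```

A typical run would be `cat /proc/vmstat > /tmp/vm.a; sleep 10; cat /proc/vmstat > /tmp/vm.b; swap_delta /tmp/vm.a /tmp/vm.b`. Deltas that stay at zero suggest swappiness is not the bottleneck.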
Huge Pages
# View huge pages
grep Huge /proc/meminfo

# Set number of huge pages
sudo sysctl vm.nr_hugepages=128

# Make persistent
echo "vm.nr_hugepages = 128" | sudo tee -a /etc/sysctl.d/99-hugepages.conf

# Applications must opt in to huge pages
# PostgreSQL, Oracle: shared memory (PostgreSQL: huge_pages = on, backing shared_buffers)
# Java: -XX:+UseLargePages

Dropping Caches
# Sync first so dirty pages are written out and can be dropped
sync

# Drop page cache only
echo 1 | sudo tee /proc/sys/vm/drop_caches

# Drop dentries and inodes
echo 2 | sudo tee /proc/sys/vm/drop_caches

# Drop all caches
echo 3 | sudo tee /proc/sys/vm/drop_caches

# Add to root's crontab for regular clearing (not recommended in production)
# 0 3 * * * /bin/sync; /bin/echo 3 > /proc/sys/vm/drop_caches

57.5 Swap Management
Creating Swap Space
# Create a 2GB swap file
sudo dd if=/dev/zero of=/swapfile bs=1M count=2048
# or (faster, but not usable for swap on every filesystem)
sudo fallocate -l 2G /swapfile

# Set permissions
sudo chmod 600 /swapfile

# Format as swap
sudo mkswap /swapfile

# Enable swap
sudo swapon /swapfile

# Verify
swapon --show
free -h

# Add to /etc/fstab for persistence
# /swapfile none swap sw 0 0

Swap Partition
# Create swap partition with fdisk
sudo fdisk /dev/sda
# n (new), p (primary), +2G, t (type), 82 (Linux swap)

# Format
sudo mkswap /dev/sda3

# Enable
sudo swapon /dev/sda3

# Add to /etc/fstab
# /dev/sda3 none swap sw 0 0

Managing Swap
# Disable swap
sudo swapoff /swapfile

# Enable swap
sudo swapon /swapfile

# Priority (higher = used first)
sudo swapon -p 100 /swapfile

# In /etc/fstab:
# /swapfile none swap sw,pri=100 0 0

# Check swap usage
swapon --show
cat /proc/swaps

# Remove swap file (also delete its /etc/fstab entry)
sudo swapoff /swapfile
sudo rm /swapfile

57.6 NUMA Management
NUMA Tuning
# Check NUMA configuration
numactl --hardware

# Show current NUMA policy
numactl --show

# Run a command on a specific node
numactl --cpunodebind=0 --membind=0 <command>

# Interleave memory across nodes
numactl --interleave=all <command>

# Move a running process's pages from node 0 to node 1
# (numactl only sets policy at launch; migratepages ships with the numactl package)
sudo migratepages <pid> 0 1

Zone Reclaim
# Disable zone reclaim (for NUMA)
sudo sysctl -w vm.zone_reclaim_mode=0

# Make persistent
echo "vm.zone_reclaim_mode = 0" | sudo tee -a /etc/sysctl.d/99-numa.conf

57.7 Memory Issues and Troubleshooting
Out of Memory (OOM)
# Check OOM events
sudo dmesg | grep -i "out of memory"
sudo dmesg | grep -i "killed process"

# View OOM history from the kernel log in the journal
sudo journalctl -k | grep -i "killed process"

# Check recent OOM messages
sudo dmesg | tail -100 | grep -i oom

# Check vm.overcommit_memory setting
sysctl vm.overcommit_memory

Finding Memory Hogs
# Top memory consumers
ps aux --sort=-%mem | head -10

# Detailed process memory
pmap -x $(pgrep -f processname)

# Per-process memory with the owning user
ps -eo user,pcpu,pmem,rss,comm --sort=-rss | head

# Shared memory
ipcs -m

# Deleted files that are still held open (they keep consuming disk or tmpfs memory)
lsof +L1 | head

# Check for memory leaks
watch -n 1 'ps -eo pid,vsz,rss,pmem,comm --sort=-rss | head'
# Watch for steadily increasing RSS

Checking Memory Pressure
# System-wide memory pressure (PSI, kernel 4.20+)
cat /proc/pressure/memory

# some avg10=0.00 avg60=0.00 avg300=0.00 total=0
# full avg10=0.00 avg60=0.00 avg300=0.00 total=0

# I/O pressure from the same PSI interface (rises when the system thrashes on swap)
cat /proc/pressure/io

# Per-cgroup memory settings and usage
cgget -g memory <cgroup_name>
# cgroup v2 also exposes PSI per cgroup:
# cat /sys/fs/cgroup/<cgroup_name>/memory.pressure

Troubleshooting Low Memory
# Check what's using memory
free -h
cat /proc/meminfo

# Process memory
ps aux --sort=-%mem | head

# Kernel memory
sudo slabtop

# Page cache: drop it only as a last resort, and sync first
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches

# Swap usage
swapon --show
vmstat 1

57.8 Production Memory Configuration
Database Server
# Memory settings
vm.swappiness = 10
vm.overcommit_memory = 2
vm.overcommit_ratio = 80

# Huge pages
vm.nr_hugepages = 256

# Dentry/inode cache reclaim
vm.vfs_cache_pressure = 50

# Min free
vm.min_free_kbytes = 65536

# Apply (after saving the settings under /etc/sysctl.d/)
sudo sysctl --system

Web Server
vm.swappiness = 60
vm.vfs_cache_pressure = 100
vm.overcommit_memory = 0
vm.dirty_ratio = 60
vm.dirty_background_ratio = 10

High Performance Computing
vm.swappiness = 3
vm.overcommit_memory = 2
vm.overcommit_ratio = 50
vm.nr_hugepages = 1024
vm.min_free_kbytes = 1048576
vm.zone_reclaim_mode = 0

57.9 Interview Questions
Q1: What is the difference between RSS and VSZ?
Answer:
- RSS (Resident Set Size): Physical memory currently used by the process
- VSZ (Virtual Size): Total virtual memory mapped by the process, including pages that are swapped out or were allocated but never touched
RSS is the physical RAM in use right now. VSZ also counts memory that is on disk (swap) or not yet backed by any physical page, so it is usually much larger than RSS.
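Both figures can be read straight from /proc/<pid>/status, where the kernel reports them as `VmSize` and `VmRSS`. A minimal sketch, assuming the helper names `rss_vsz_file` and `rss_vsz` (both invented for this example):

```bash
# Print VSZ and RSS (in kB) from a /proc/<pid>/status-style file.
# rss_vsz_file and rss_vsz are hypothetical helpers, not standard tools.
rss_vsz_file() {
    awk '
        $1 == "VmSize:" { vsz = $2 }   # total virtual size
        $1 == "VmRSS:"  { rss = $2 }   # resident (physical) size
        END { printf "vsz_kb=%d rss_kb=%d\n", vsz, rss }
    ' "$1"
}

# Convenience wrapper for a live process.
rss_vsz() { rss_vsz_file "/proc/$1/status"; }
```

For example, `rss_vsz $$` reports the figures for the current shell. Kernel threads have no `VmSize` or `VmRSS` lines, so they print zeros.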
Q2: What is swappiness and what value should you use?
Answer:
vm.swappiness controls how aggressively the kernel swaps pages to disk:
- Range: 0-100 (0-200 since kernel 5.8)
- Higher values = more aggressive swapping
- Lower values = prefer keeping data in RAM
Recommended values:
- 10-30 for servers with ample RAM
- 1-10 for databases (they manage their own caches); 0 makes OOM kills more likely under pressure
- 60 (the kernel default) for systems with limited RAM
Q3: How do you check memory usage in Linux?
Answer:
# Basic
free -h
cat /proc/meminfo

# Per process
ps aux --sort=-%mem
top
htop

# Detailed per process
pmap -x <pid>

# Over time
vmstat 1
sar -r 1

Q4: What is the OOM killer?
Answer: The Out-of-Memory (OOM) killer is a Linux kernel mechanism that kills processes when memory is exhausted. It:
- Calculates which process to kill (based on oom_score)
- Kills the selected process to free memory
- Logs the event to the kernel log (visible in dmesg)
The oom_score is influenced by:
- Memory usage (resident memory, swap, and page tables)
- The oom_score_adj tunable (-1000 exempts a process, +1000 makes it the preferred victim)
- Process age and nice value on old kernels only; modern kernels no longer use them
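The kernel exposes each process's current score in /proc/<pid>/oom_score, so the likely victims can be listed ahead of time. A minimal sketch follows; the function name `oom_candidates` and its optional proc-root argument (useful for testing against a copy) are assumptions made for this example:

```bash
# List the processes the OOM killer would currently pick first.
# Usage: oom_candidates [proc_root] [count]   (defaults: /proc, 5)
# Output columns: oom_score oom_score_adj pid command
# oom_candidates is a hypothetical helper, not a standard tool.
oom_candidates() {
    local root="${1:-/proc}" count="${2:-5}" dir pid score adj name
    for dir in "$root"/[0-9]*; do
        # Processes can exit mid-scan, so skip anything unreadable.
        [ -r "$dir/oom_score" ] || continue
        pid="${dir##*/}"
        score=$(cat "$dir/oom_score" 2>/dev/null) || continue
        adj=$(cat "$dir/oom_score_adj" 2>/dev/null) || adj="?"
        name=$(cat "$dir/comm" 2>/dev/null) || name="?"
        printf '%s %s %s %s\n' "$score" "$adj" "$pid" "$name"
    done | sort -k1,1nr | head -n "$count"
}
```

To protect a critical service, lower its adjustment, for example `echo -500 | sudo tee /proc/<pid>/oom_score_adj`, then re-run the listing to confirm its score dropped.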
Q5: What is memory overcommit?
Answer: Memory overcommit allows processes to allocate more virtual memory than is physically available. Modes:
- 0 (Heuristic): Kernel refuses only obviously excessive allocations (default)
- 1 (Always): Always allow overcommit (dangerous)
- 2 (Never): Never overcommit (commit limit = swap + RAM × overcommit_ratio / 100)
Databases typically use mode 2 for predictable behavior.
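The mode 2 limit is simple arithmetic, so it is worth computing before switching modes. The sketch below assumes no huge pages are reserved (the kernel subtracts them from RAM first); the function name `commit_limit_kb` is made up for this example:

```bash
# Commit limit (kB) under vm.overcommit_memory=2, ignoring huge pages:
#   swap + RAM * overcommit_ratio / 100
# Usage: commit_limit_kb <mem_total_kb> <swap_total_kb> <overcommit_ratio>
# commit_limit_kb is a hypothetical helper, not a standard tool.
commit_limit_kb() {
    local mem_kb="$1" swap_kb="$2" ratio="$3"
    echo $(( $swap_kb + $mem_kb * $ratio / 100 ))
}
```

With the 16384000 kB of RAM and 2097152 kB of swap from the earlier /proc/meminfo sample, `commit_limit_kb 16384000 2097152 50` gives 10289152 kB. On a live system, compare the result with `grep -E 'CommitLimit|Committed_AS' /proc/meminfo`; if Committed_AS is already near the limit, switching to mode 2 will make allocations fail.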
Q6: How do you create a swap file?
Answer:
# Create 2GB swap file
sudo dd if=/dev/zero of=/swapfile bs=1M count=2048
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Add to /etc/fstab
# /swapfile none swap sw 0 0

Q7: What is page cache?
Answer: Page cache is memory used by the kernel to cache file content from disk. Benefits:
- Reduces disk I/O
- Improves file read performance
- Absorbs writes: dirty pages are written back to disk asynchronously
Managed by the kernel automatically. Writeback is tuned with the vm.dirty_* sysctls; vm.vfs_cache_pressure tunes reclaim of the related dentry and inode caches, not the page cache itself.
Q8: How do you diagnose a memory leak?
Answer:
- Monitor memory over time: watch -n 1 'ps -eo pid,vsz,rss,pmem,comm --sort=-rss'
- Compare RSS at different times; a steady increase indicates a leak
- Check for growing memory in a specific process
- Use tools like:
  - valgrind for application-level analysis
  - pmap to see memory mappings
  - /proc/<pid>/smaps for detailed per-mapping figures
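Comparing RSS over time can be scripted rather than watched by eye. In the sketch below, `sample_rss` records a process's `VmRSS` at intervals and `rss_trend` summarises the samples; both names are hypothetical, and "rising in most steps" is only a hint of a leak, not proof:

```bash
# Sample a process's RSS (kB) every <interval> seconds, <count> times.
# sample_rss and rss_trend are hypothetical helpers, not standard tools.
sample_rss() {
    local pid="$1" interval="$2" count="$3" i=0
    while [ "$i" -lt "$count" ]; do
        awk '$1 == "VmRSS:" { print $2 }' "/proc/$pid/status" || return 1
        i=$((i + 1))
        sleep "$interval"
    done
}

# Summarise RSS samples read from stdin, one value per line.
rss_trend() {
    awk '
        NR == 1 { first = $1 }
        NR > 1 && $1 > prev { rising++ }   # count steps where RSS grew
        { prev = $1; last = $1 }
        END {
            if (NR == 0) exit 1
            printf "first_kb=%d last_kb=%d growth_kb=%d rising_steps=%d/%d\n",
                   first, last, last - first, rising, NR - 1
        }
    '
}
```

A run such as `sample_rss <pid> 60 30 | rss_trend` covers half an hour. Growth that never levels off under steady load is the pattern worth taking to valgrind or smaps.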
Quick Reference
Commands
# View memory
free -h
cat /proc/meminfo

# Monitor
vmstat 1
sar -r 1
top
htop

# Process
ps aux --sort=-%mem
pmap -x <pid>

# Swap
swapon --show
sudo swapon <path>    # enable
sudo swapoff <path>   # disable

# Tune
sudo sysctl -w vm.swappiness=10
echo 3 | sudo tee /proc/sys/vm/drop_caches

Key Parameters
| Parameter | Default | Recommended | Description |
|---|---|---|---|
| vm.swappiness | 60 | 10-30 | Swap tendency |
| vm.overcommit_memory | 0 | 0/2 | Overcommit mode |
| vm.vfs_cache_pressure | 100 | 50 | Dentry/inode cache reclaim |
| vm.dirty_ratio | 20 | 15-40 | Dirty page % |
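The recommended ranges in the table can be checked against a running host. This is a minimal sketch: the function name `check_vm_param` and its optional fourth argument (an alternative directory, handy for testing) are assumptions, and the ranges you pass in should reflect your own workload:

```bash
# Report whether a vm.* sysctl falls inside a recommended range.
# Usage: check_vm_param <name> <min> <max> [sysctl_dir]   (dir defaults to /proc/sys/vm)
# check_vm_param is a hypothetical helper, not a standard tool.
check_vm_param() {
    local name="$1" min="$2" max="$3" root="${4:-/proc/sys/vm}" value
    value=$(cat "$root/$name" 2>/dev/null) || { echo "$name: unreadable"; return 2; }
    if [ "$value" -ge "$min" ] && [ "$value" -le "$max" ]; then
        echo "$name=$value ok"
    else
        echo "$name=$value outside $min-$max"
        return 1
    fi
}
```

For example, `check_vm_param swappiness 10 30` and `check_vm_param dirty_ratio 15 40` mirror two rows of the table. The non-zero exit status makes the check usable in a configuration audit script.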
Common Mistakes & Anti-Patterns
1. Ignoring Memory Pressure Until OOM
❌ WRONG: Only reacting when services start crashing
# Waiting for OOM before taking action
# dmesg shows: "Out of memory: Killed process..."
sudo dmesg | tail -20 # Only checking AFTER crash
✅ CORRECT: Proactive monitoring with alerts
#!/bin/bash
# Alert at 80% memory usage
THRESHOLD=80
MEM_USED=$(free | awk '/^Mem:/ {printf "%.0f", $3/$2 * 100}')
if [ "$MEM_USED" -gt "$THRESHOLD" ]; then
    # Send the alert as a JSON payload; the inner quotes must be escaped
    curl -X POST "https://hooks.slack.com/services/..." \
        -H 'Content-Type: application/json' \
        -d "{\"text\": \":warning: Memory at ${MEM_USED}% on $(hostname)\"}"
fi

2. Setting swappiness to 0
❌ WRONG: Setting swappiness to 0 to "disable" swap
# Avoid this: swap stays enabled, but the kernel delays using it and OOM kills arrive faster
sysctl -w vm.swappiness=0
✅ CORRECT: Lower but not zero, depends on workload
# For databases, reduce but don't eliminate
sudo sysctl -w vm.swappiness=10

# Make permanent
echo 'vm.swappiness=10' | sudo tee -a /etc/sysctl.conf

3. Not Understanding RSS vs Cache
❌ WRONG: Treating cache as used memory
# Wrong interpretation - cache is reclaimable
$ free -h
               total        used        free      shared  buff/cache   available
Mem:            31Gi        28Gi       1.0Gi       200Mi       800Mi       2.0Gi
# This looks like only 1GB free, panic!
✅ CORRECT: Read the available column, which already counts free memory plus reclaimable cache
# Correct interpretation
$ free -h
               total        used        free      shared  buff/cache   available
Mem:            31Gi        28Gi       1.0Gi       200Mi       800Mi       2.0Gi
# available (2.0Gi) is the kernel's estimate of what new processes can get without swapping.
# It already includes free (1.0Gi), so do not add the two together.

4. Clearing Cache at Wrong Time
❌ WRONG: Clearing cache during production hours
# NEVER do this in production - causes massive I/O spike
echo 3 > /proc/sys/vm/drop_caches
✅ CORRECT: Schedule during maintenance window if needed
# Only in maintenance window, after notifying team (root's crontab)
# 2am Sunday morning
0 2 * * 0 sync && echo 3 > /proc/sys/vm/drop_caches

5. Overcommitting Memory Without Limits
❌ WRONG: No memory limits on containers
# docker-compose.yml - no memory limit
services:
  app:
    image: myapp:latest
    # Missing: mem_limit
✅ CORRECT: Set appropriate limits
# docker-compose.yml - with memory limits
services:
  app:
    image: myapp:latest
    mem_limit: 512m
    mem_reservation: 256m
    deploy:
      resources:
        limits:
          memory: 512M
        reservations:
          memory: 256M

Summary
In this chapter, you learned:
- ✅ Linux memory architecture
- ✅ Memory information commands
- ✅ Memory monitoring tools (top, vmstat, sar)
- ✅ Memory tuning parameters
- ✅ Swap management
- ✅ NUMA management
- ✅ Troubleshooting memory issues
- ✅ Production configurations
- ✅ Interview questions and answers
Next Chapter
Chapter 53: Disk I/O Performance
Last Updated: February 2026