Chapter 89: Core Dump Analysis

Overview

Core dumps are memory snapshots of crashed processes, and they are often the only evidence left after a production failure. When a process is killed by a fatal signal, the kernel can write a core dump that records the process's memory, CPU registers, and thread stacks at the moment of the crash. A debugger uses these to reconstruct stack traces and variable values. This chapter covers enabling core dumps, analyzing them with gdb and other tools, kernel core dumps, and production best practices.

89.1 Enabling Core Dumps

Core Dump Fundamentals

┌─────────────────────────────────────────────────────────────────────────┐
│                    CORE DUMP BASICS                                     │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   What is a Core Dump?                                                  │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                                                                 │   │
│   │  A core dump is a file containing:                              │   │
│   │  - Memory image of the process (stack, heap, mapped data)       │   │
│   │  - CPU register values at time of crash                         │   │
│   │  - Stack contents of every thread (source of backtraces)        │   │
│   │  - Information about loaded libraries                           │   │
│   │  - Process metadata (PID, user, etc.)                           │   │
│   │                                                                 │   │
│   │  Generated when a process:                                      │   │
│   │  - Crashes with a signal (SIGSEGV, SIGABRT, etc.)               │   │
│   │  - Receives a signal whose default action is core               │   │
│   │  - Calls abort(), which raises SIGABRT                          │   │
│   │                                                                 │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
│   Core Dump Types:                                                      │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                                                                 │   │
│   │  1. Process Core Dump (traditional)                             │   │
│   │     - Single file, one process's memory only                    │   │
│   │                                                                 │   │
│   │  2. Kernel Core Dump (vmcore)                                   │   │
│   │     - Entire kernel memory + processes                          │   │
│   │     - Used for kernel crashes                                   │   │
│   │                                                                 │   │
│   │  3. On-Demand Core Dump (gcore)                                 │   │
│   │     - Snapshots a running process without killing it            │   │
│   │                                                                 │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
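
On Linux, a process core dump is an ELF file whose header type is ET_CORE, which is how tools such as file and gdb recognize it. The sketch below checks only that header. The function names is_elf_core and file_is_core are illustrative and not part of any library.

```python
import struct

ET_CORE = 4  # e_type value reserved for core files in the ELF specification


def is_elf_core(header: bytes) -> bool:
    """Return True if the first 18 bytes look like an ELF core file header."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return False
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian
    if header[5] == 1:
        fmt = "<H"
    elif header[5] == 2:
        fmt = ">H"
    else:
        return False
    # e_type is the 16-bit field right after the 16-byte e_ident array
    (e_type,) = struct.unpack_from(fmt, header, 16)
    return e_type == ET_CORE


def file_is_core(path: str) -> bool:
    """Read just enough of a file to classify it."""
    with open(path, "rb") as f:
        return is_elf_core(f.read(18))
```

A check like this can filter a crash directory before handing files to gdb, since a truncated or unrelated file fails the test immediately.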

System Configuration

# ============================================================
# ENABLING CORE DUMPS - SYSTEM CONFIGURATION
# ============================================================

# /etc/security/limits.conf
# Set core dump size limits for all users
* soft core unlimited
* hard core unlimited
root soft core unlimited
root hard core unlimited

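# /etc/sysctl.conf - Kernel settings
# Core pattern (where to save, what name)
# Keep comments on their own lines: sysctl.conf only recognizes
# comments at the start of a line, so a trailing comment becomes
# part of the value.
# %p = process ID
kernel.core_pattern = core.%p
# Append .PID to the name when core_pattern contains no %p
kernel.core_uses_pid = 1

# Alternative patterns:
# core.%e.%p    - executable name + PID
# core.%t       - time of dump (seconds since the Unix epoch)
# core.%h       - hostname
# |/path/to/handler - a leading | pipes the dump to a program

# Compression: there is no kernel sysctl that compresses core files.
# Pipe the dump to a handler such as systemd-coredump instead
# (Compress=yes in /etc/systemd/coredump.conf).

# Size limit: also not a sysctl. The maximum core size is the
# RLIMIT_CORE resource limit (ulimit -c, limits.conf, LimitCORE=).
# A limit of 0 disables core files; "unlimited" removes the cap.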

# Apply without reboot
sudo sysctl -p

# Check current settings
sysctl kernel.core_pattern
sysctl kernel.core_uses_pid

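# System-wide default for services started by systemd
# (limits.conf is applied by PAM to login sessions, not to services)
# /etc/systemd/system.conf
[Manager]
DefaultLimitCORE=infinity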
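
To see what file name a given core_pattern would produce, the expansion can be approximated in a few lines. This is a minimal sketch that models only %p, %e, %t, %h, and %%. The function name expand_core_pattern and its parameters are illustrative, and the kernel supports more specifiers than are handled here.

```python
import socket
import time


def expand_core_pattern(pattern, pid, exe, timestamp=None, hostname=None):
    """Approximate the kernel's expansion of %p, %e, %t, %h and %%."""
    values = {
        "p": str(pid),
        # The kernel substitutes the process "comm" name, which is
        # limited to 15 characters.
        "e": exe[:15],
        "t": str(int(time.time() if timestamp is None else timestamp)),
        "h": hostname if hostname is not None else socket.gethostname(),
        "%": "%",
    }
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%":
            if i + 1 < len(pattern):
                # Specifiers this sketch does not model are dropped
                out.append(values.get(pattern[i + 1], ""))
            # A lone % at the end of the pattern is dropped as well
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(expand_core_pattern("/var/crash/core.%e.%p.%t", 5678, "myapp",
                          timestamp=1700000000))
# prints /var/crash/core.myapp.5678.1700000000
```

Including %e and %t in the pattern keeps dumps from different programs and different crashes from overwriting each other.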

Process Level Configuration

# ============================================================
# PROCESS-LEVEL CORE DUMP CONFIGURATION
# ============================================================

# In shell - enable for current session
ulimit -c unlimited

# In shell script
#!/bin/bash
ulimit -c unlimited
./my_program

# In Python
import resource
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

# In C/C++ program
#include <sys/resource.h>

struct rlimit rlim;
rlim.rlim_cur = RLIM_INFINITY;
rlim.rlim_max = RLIM_INFINITY;
setrlimit(RLIMIT_CORE, &rlim);

# Using prctl (Linux-specific)
#include <sys/prctl.h>
/* Restore the dumpable flag, which the kernel clears after a
   setuid-style credential change */
prctl(PR_SET_DUMPABLE, 1);

# Disable core dumps for sensitive processes
prctl(PR_SET_DUMPABLE, 0);

# Check the core size limit of a running process
grep "Max core file size" /proc/<PID>/limits
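
The same check can be scripted. The sketch below parses the "Max core file size" row of /proc/<PID>/limits and reads the current process's own limit through the resource module. The helper names are illustrative. It assumes a plain-file core_pattern: when the pattern pipes to a handler, the handler decides how the limit is honored.

```python
import resource


def parse_core_limit(limits_text):
    """Return (soft, hard) from the 'Max core file size' row, or None."""
    label = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(label):
            fields = line[len(label):].split()
            return fields[0], fields[1]
    return None


def core_dumps_allowed(soft):
    """With a plain-file core_pattern, a soft limit of 0 suppresses core files."""
    return soft != "0"


def current_process_limit():
    """Soft and hard RLIMIT_CORE of this process as strings."""
    def show(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    return show(soft), show(hard)


def limit_for_pid(pid):
    """Read and parse /proc/<pid>/limits (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_core_limit(f.read())
```

Checking the limit of the running service, rather than of an interactive shell, matters because services do not inherit limits set in a login session.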

Application Configuration

# ============================================================
# APPLICATION CORE DUMP CONFIGURATION
# ============================================================

# For system services (systemd)
# /etc/systemd/system/myapp.service
[Service]
LimitCORE=infinity
WorkingDirectory=/path/to/app
# Debug malloc issues (unit files do not allow trailing comments)
Environment="MALLOC_CHECK_=3"

# For nginx
# /etc/nginx/nginx.conf
worker_rlimit_core 100M;
working_directory /var/crash;

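# For PostgreSQL
# postgresql.conf has no core dump setting; kernel.core_pattern is a
# sysctl. Allow core files when starting the server (the data path
# below is a placeholder), or set LimitCORE= in the service unit:
pg_ctl start -c -D /path/to/data
# With a relative core_pattern, cores land in the data directory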

# For Java applications
# Heap dumps vs core dumps
# -XX:+HeapDumpOnOutOfMemoryError
# -XX:HeapDumpPath=/var/log/heapdump.hprof

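# For Python
# ulimit must be set before starting the interpreter
# faulthandler prints the Python traceback when a fatal signal arrives
# (app.py is a placeholder for your script)
python -X faulthandler app.py
# Or: PYTHONFAULTHANDLER=1 python app.py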
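
faulthandler can also be enabled from inside the application, which allows the traceback to go to a dedicated file instead of stderr. The following is a minimal sketch: the function names and the log path are illustrative, while the faulthandler calls are from the standard library.

```python
import faulthandler
import tempfile


def enable_crash_tracebacks(log_path):
    """Enable faulthandler so fatal signals write Python tracebacks to log_path."""
    log_file = open(log_path, "w")
    # all_threads=True dumps every thread, not only the one that crashed
    faulthandler.enable(file=log_file, all_threads=True)
    # The file must stay open for as long as faulthandler is enabled,
    # so the caller keeps the returned handle alive.
    return log_file


def snapshot_traceback():
    """Dump the current thread's traceback on demand and return it as text."""
    with tempfile.TemporaryFile(mode="w+") as tmp:
        faulthandler.dump_traceback(file=tmp, all_threads=False)
        tmp.seek(0)
        return tmp.read()
```

The traceback complements a core dump rather than replacing it: the core shows the C-level state of the interpreter, while faulthandler shows which Python code was running.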

89.2 Analyzing Core Dumps

Using GDB

# ============================================================
# ANALYZING CORE DUMPS WITH GDB
# ============================================================

# Load core dump with executable
gdb ./program core.1234

# Or specify both separately
gdb -c core.1234 ./program

# Essential GDB Commands:

# Backtrace
bt                  # Quick backtrace
bt full             # Backtrace with local variables
bt 20               # Show the innermost 20 frames
thread apply all bt # All threads

# Frame navigation
frame 3             # Switch to frame 3
up 2                # Go up 2 frames
down 1              # Go down 1 frame
info frame          # Current frame info

# Variables and memory
print variable_name
print *pointer
print array[0]@10   # Print 10 elements
x/100x &buffer      # Examine 100 words in hex
x/s string          # Examine a NUL-terminated string
x/i &function       # Disassemble the first instruction of a function

# Registers
info registers
info all-registers

# Threads
info threads
thread apply all bt
thread 2
bt

# Source code
list                # Show source around PC
list function_name
list 10,20         # Lines 10-20

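# Breakpoints (live sessions only; a core file cannot be resumed)
info breakpoints
delete 1

# Reproducing the crash in a live session
# run starts a fresh process under gdb; it does not continue the core
set disassemble-next-line on
run

# Useful commands
info signals        # Signal handling table
info proc           # Process info (may be limited for core files)
info files          # Sections of the executable and core file
info sharedlibrary  # Shared libraries loaded at crash time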
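
For repeated triage, the commands above can be run non-interactively with gdb's --batch and -ex options. The sketch below only assembles and runs that command line. The function names and the default command list are illustrative choices.

```python
import shutil
import subprocess


def gdb_batch_command(program, core,
                      commands=("bt", "info registers", "thread apply all bt")):
    """Build an argv list that runs gdb non-interactively against a core file."""
    argv = ["gdb", "--batch", "-q"]
    for cmd in commands:
        # Each -ex runs one gdb command, in order, after the core is loaded
        argv += ["-ex", cmd]
    argv += [program, core]
    return argv


def run_gdb_report(program, core):
    """Run the batch command and return gdb's output, or None if gdb is missing."""
    if shutil.which("gdb") is None:
        return None
    result = subprocess.run(
        gdb_batch_command(program, core),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )
    return result.stdout
```

Saving the returned text next to the core file gives a first-look report that can be attached to a ticket before anyone opens an interactive session.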

Practical Analysis Examples

# ============================================================
# PRACTICAL CORE DUMP ANALYSIS
# ============================================================

# Example: Debugging a crash
gdb ./myapp core.5678

# 1. Quick overview
(gdb) bt
#0  0x00007f8a12345678 in ?? () from /lib/x86_64-linux-gnu/libc.so.6

# 2. Find the crash location
(gdb) info registers rip
#rip            0x555555554a2a <main+42>

# 3. Examine variables at crash
(gdb) info locals
#i = 10
#buffer = 0x0

# 4. Check the backtrace fully
(gdb) bt full

# 5. Find null pointer
(gdb) print buffer
#$1 = 0x0

# 6. Go up to see where buffer was set
(gdb) up
(gdb) list

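# Common crash patterns (as gdb reports them when loading a core file):

# Null pointer dereference
# Program terminated with signal SIGSEGV, Segmentation fault.
# #0  0x0000000000000000 in ?? ()
# (a frame at address 0 means a call through a null function pointer;
#  a null data access faults inside a normal function instead)

# Buffer overflow (stack protector or heap check calls abort)
# Program terminated with signal SIGABRT, Aborted.
# #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50

# Double free (printed by glibc to stderr before it aborts)
# *** glibc detected *** ./program: double free or corruption
# Newer glibc: free(): double free detected in tcache 2

# Use after free
# No fixed signature: often SIGSEGV or SIGABRT inside malloc()/free().
# "Invalid free() / delete / delete[]" is a Valgrind report, not gdb output.

# Stack overflow
# Program terminated with signal SIGSEGV, Segmentation fault.
# bt shows thousands of repeated frames (runaway recursion)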
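
When many dumps arrive, a first pass can sort them by fatal signal. The sketch below extracts the signal from the "Program terminated with signal" line that gdb prints for a core file and attaches a rough hint. The LIKELY_CAUSES table is an illustrative starting point for triage, not a diagnosis, and the function names are hypothetical.

```python
import re

# Illustrative hints only; the real cause must be confirmed in the backtrace.
LIKELY_CAUSES = {
    "SIGSEGV": "invalid memory access (null or dangling pointer, stack overflow)",
    "SIGABRT": "abort() called (failed assertion, heap corruption detected by glibc)",
    "SIGBUS": "misaligned access or access beyond the end of a mapped file",
    "SIGFPE": "arithmetic error such as integer division by zero",
    "SIGILL": "illegal instruction (corrupted code or a jump to a bad address)",
}

_TERMINATED = re.compile(
    r"Program terminated with signal (\w+), (.+?)\.?$", re.MULTILINE
)


def fatal_signal(gdb_output):
    """Extract (signal, description) from gdb's core-file banner, or None."""
    match = _TERMINATED.search(gdb_output)
    if match is None:
        return None
    return match.group(1), match.group(2)


def triage_hint(gdb_output):
    """Return a one-line hint for the fatal signal found in gdb's output."""
    found = fatal_signal(gdb_output)
    if found is None:
        return "no termination banner found"
    name, _description = found
    return LIKELY_CAUSES.get(name, "no hint recorded for " + name)
```

The hint narrows where to look first; the backtrace and variable inspection described above still decide the actual root cause.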

Using crash Utility

# ============================================================
# ANALYZING KERNEL CORE DUMPS WITH CRASH
# ============================================================

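# Install crash and kernel debug info
# Debian/Ubuntu (debug symbols come from the matching
# linux-image-*-dbg or -dbgsym package):
sudo apt install crash
# RHEL/Fedora:
sudo dnf install crash
sudo debuginfo-install kernel

# Analyze kernel crash dump
# vmlinux must be the debug build that matches the crashed kernel
sudo crash vmlinux /var/crash/vmcore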

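# Common crash commands:
crash> ps                     # List processes
crash> bt                     # Backtrace of the current context
crash> bt -a                  # Backtrace of the active task on each CPU
crash> kmem -i                # Memory usage summary
crash> log                    # Kernel log buffer
crash> files                  # Open files of the current context
crash> net                    # Network devices (net -s: sockets of current context)
crash> waitq <address>        # Tasks blocked on a wait queue
crash> struct task_struct     # Structure definition (add an address to dump one)
crash> task <PID>             # task_struct contents for a process
crash> set <PID>              # Make a process the current context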

# Analyze specific
crash> ps | grep <name>      # Find process
crash> rd <address> 20       # Read 20 longs from addr
crash> dis <address>         # Disassemble

# Modules and interrupts
crash> mod                   # Loaded kernel modules
crash> irq                   # IRQ data

89.3 Production Best Practices

Configuration Checklist

# ============================================================
# PRODUCTION CORE DUMP CONFIGURATION CHECKLIST
# ============================================================

# 1. System Configuration
# Enable core dumps system-wide
# /etc/security/limits.conf
* soft core unlimited
* hard core unlimited

# /etc/sysctl.conf
kernel.core_pattern = /var/crash/core.%e.%p.%t
kernel.core_uses_pid = 1

# 2. Storage Location
mkdir -p /var/crash
chmod 1777 /var/crash

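# 3. Log Rotation
# /etc/logrotate.d/coredumps
# Each core file already has a unique name, so age-based cleanup
# (e.g. find /var/crash -name 'core.*' -mtime +7 -delete) is a
# simpler alternative to rotation.
/var/crash/core.* {
    daily
    rotate 5
    compress
    delaycompress
    missingok
    notifempty
}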

# 4. Application Configuration
# Set ulimit in systemd service
[Service]
LimitCORE=infinity

# 5. Monitoring
# Alert if core dumps appear
find /var/crash -type f -mtime -1 -ls

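# 6. Security
# Core files can contain passwords, keys, and customer data
# The kernel creates core files with mode 0600 (owner and root only);
# keep that restriction when copying or archiving dumps
# Keep fs.suid_dumpable = 0 (the default) so setuid programs do not dump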
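
The monitoring and retention steps of the checklist can be sketched as two small functions: one reports core files modified within a time window (the equivalent of find -mtime -1), the other deletes files past a retention age. The function names and the core. file name prefix are assumptions that match the core_pattern used in this checklist.

```python
import os
import time


def recent_core_files(directory, max_age_seconds=86400, now=None):
    """List core.* files in directory modified within max_age_seconds."""
    now = time.time() if now is None else now
    found = []
    for entry in os.scandir(directory):
        if not entry.is_file(follow_symlinks=False):
            continue
        if not entry.name.startswith("core."):
            continue
        if now - entry.stat(follow_symlinks=False).st_mtime <= max_age_seconds:
            found.append(entry.path)
    return sorted(found)


def remove_old_core_files(directory, max_age_seconds, now=None):
    """Delete core.* files older than max_age_seconds; return the removed paths."""
    now = time.time() if now is None else now
    removed = []
    # Materialize the listing first so deletion does not disturb iteration
    for entry in list(os.scandir(directory)):
        if not entry.is_file(follow_symlinks=False):
            continue
        if not entry.name.startswith("core."):
            continue
        if now - entry.stat(follow_symlinks=False).st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.path)
    return sorted(removed)
```

A scheduled job could call recent_core_files to raise an alert and remove_old_core_files to keep the crash directory from filling the disk.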

Core Dump Analysis Workflow

┌─────────────────────────────────────────────────────────────────────────┐
│                  CORE DUMP ANALYSIS WORKFLOW                            │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Step 1: Collect                                              │   │
│   │  - Locate core dump file                                      │   │
│   │  - Get matching binary/executable                             │   │
│   │  - Gather related logs                                        │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                    │                                    │
│                                    ▼                                    │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Step 2: Initial Analysis                                     │   │
│   │  - Load in gdb: gdb ./program core                           │   │
│   │  - Get backtrace: bt                                          │   │
│   │  - Identify crash location                                     │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                    │                                    │
│                                    ▼                                    │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Step 3: Deep Dive                                           │   │
│   │  - Examine variables                                          │   │
│   │  - Check register values                                      │   │
│   │  - Analyze call chain                                         │   │
│   │  - Identify root cause                                         │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                    │                                    │
│                                    ▼                                    │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  Step 4: Fix and Prevent                                     │   │
│   │  - Implement fix                                              │   │
│   │  - Add tests                                                  │   │
│   │  - Add monitoring                                             │   │
│   │  - Update code                                                │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

89.4 Interview Questions

Q1: What is a core dump?

A1:
A core dump is a file containing:
- Memory snapshot of a process at crash time
- CPU register values
- Stack contents of every thread, from which backtraces are rebuilt
- Information about loaded shared libraries
It is used for post-mortem debugging of crashes.

─────────────────────────────────────────────────────────────────────────

Q2: How do you enable core dumps in Linux?

A2:
1. Login sessions: /etc/security/limits.conf - set core unlimited
2. Kernel: sysctl kernel.core_pattern sets the file name and location
3. Shell: ulimit -c unlimited
4. Program: setrlimit(RLIMIT_CORE, ...) in code
5. Systemd: LimitCORE=infinity in service file

─────────────────────────────────────────────────────────────────────────

Q3: What does kernel.core_pattern mean?

A3:
Defines where and how core dumps are saved:
- %p = PID
- %e = executable name
- %t = time of dump (seconds since the Unix epoch)
- %h = hostname
- /var/crash/ = directory (if the pattern starts with /)
- Leading | = pipe the dump to a program (e.g. systemd-coredump)

─────────────────────────────────────────────────────────────────────────

Q4: How do you analyze a core dump with gdb?

A4:
gdb ./program core.1234
- bt: backtrace
- bt full: backtrace with local variables
- frame N: switch to frame N
- print var: print variable
- x/100x addr: examine memory
- info registers: register values
- thread apply all bt: all threads

─────────────────────────────────────────────────────────────────────────

Q5: What are the differences between process and kernel core dumps?

A5:
- Process core dump: single process memory only
- Kernel core dump (vmcore): entire kernel memory + all processes
- Kernel dumps are captured by kdump, which boots a capture kernel via kexec
- Kernel dumps are analyzed with the crash utility rather than plain gdb

─────────────────────────────────────────────────────────────────────────

Q6: How do you prevent core dumps in production for security?

A6:
- Set the core size limit to 0 (ulimit -c 0, LimitCORE=0)
- Use prctl(PR_SET_DUMPABLE, 0) for sensitive processes
- Keep fs.suid_dumpable = 0 so setuid programs do not dump
- If dumps are kept, store them in a location with restricted permissions
- Consider systemd-coredump (managed with coredumpctl) for central storage
- Exclude sensitive memory regions with madvise(MADV_DONTDUMP)

─────────────────────────────────────────────────────────────────────────

Q7: What is the difference between gcore and gdb -c?

A7:
- gdb -c: analyzes an existing core dump
- gcore: creates a core dump of a running process
- gcore <PID>: generates a core without killing the process
- Useful for debugging live processes or hung programs

─────────────────────────────────────────────────────────────────────────

Q8: How do you debug a core dump without debug symbols?

A8:
- Rebuild the same source with -g and the same compiler options
- Use separate debug symbol packages
- Install debuginfo packages (Fedora/RHEL)
- Use objdump -d to map crash addresses to instructions
- Function names may be missing, but addresses and registers are still useful

─────────────────────────────────────────────────────────────────────────

Q9: What is systemd-coredump and how does it work?

A9:
- systemd-coredump is systemd's core dump handler
- Receives core dumps through a piped kernel.core_pattern
- Stores them in /var/lib/systemd/coredump/
- coredumpctl lists, extracts, and opens them in a debugger
- Can limit size, compress, and store metadata
- Configured in /etc/systemd/coredump.conf

─────────────────────────────────────────────────────────────────────────

Q10: How do you debug multi-threaded core dumps?

A10:
- info threads: list all threads
- thread N: switch to thread N
- thread apply all bt: backtrace for all threads
- thread apply all bt full: local variables in every thread's frames
- Look for deadlock patterns (threads blocked in lock calls) in backtraces
- Check which thread received the signal (the thread gdb selects on load)

Quick Reference

# Enable core dumps
ulimit -c unlimited
sudo sysctl -w kernel.core_pattern=/var/crash/core.%p

# Generate core dump of running process
gcore <PID>

# Analyze core dump
gdb ./program core.1234

# Common gdb commands
bt                  # Backtrace
bt full             # Full backtrace
frame N             # Switch frame
print var           # Print variable
x/100x &buffer     # Examine memory

# Kernel crash analysis
sudo crash vmlinux vmcore
crash> bt
crash> ps

# List core dumps
coredumpctl list

Summary

Core Dump: Memory snapshot of crashed process
Enable: ulimit, sysctl, systemd service configuration
Analyze: gdb for user space, crash for kernel
Commands: bt, print, x/, frame, thread
Production: Configure storage, rotation, security

Next Chapter

Chapter 90: Hardware Diagnostics

Last Updated: February 2026