
Chapter 23: Process Management in Bash

Overview

Process management is a critical skill for DevOps engineers and system administrators. This chapter covers how to monitor, control, and manage processes in Linux using bash scripts.

Understanding Processes

What is a Process?

A process is an instance of a running program. Each process has:

PID (Process ID) - Unique identifier
PPID (Parent Process ID) - Parent process
UID/GID - User and group IDs
Status - Running, sleeping, stopped, zombie
Resources - Memory, CPU, file descriptors
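
On Linux, most of these attributes can be read directly from /proc/<pid>/status. A minimal sketch, assuming a Linux /proc filesystem; proc_summary is a hypothetical helper name, not a standard command:

```shell
#!/usr/bin/env bash
# proc_summary [PID] (hypothetical helper): print identity, state, and
# resource fields for one process. Defaults to the current shell.
proc_summary() {
    local pid="${1:-$$}"
    # /proc/<pid>/status holds one "Key: value" pair per line.
    grep -E '^(Name|State|Pid|PPid|Uid|Gid|VmRSS|FDSize):' "/proc/$pid/status"
}

proc_summary "$$"
```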

Process States

┌────────────────────────────────────────────────────────────────┐
│                    Process States                               │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│   ┌─────┐                                                     │
│   │ New │  Process being created                              │
│   └──┬──┘                                                     │
│      │                                                        │
│      ▼                                                        │
│   ┌──────────┐      ┌─────────────┐                          │
│   │ Running  │ ──► │ Terminated  │                          │
│   │ (R)      │      │ (Z)         │                          │
│   └────┬─────┘      └─────────────┘                          │
│        │                                                        │
│   ┌────┴─────┐                                                 │
│   │          │                                                 │
│   ▼          ▼                                                 │
│ ┌────┐    ┌────┐                                               │
│ │Sleep│    │Stop│                                              │
│ │(S) │    │(T) │                                              │
│ └────┘    └────┘                                               │
│                                                                │
│   R - Running/Runnable                                         │
│   S - Interruptible Sleep (waiting for event)                 │
│   D - Uninterruptible Sleep (usually I/O)                     │
│   T - Stopped (by signal)                                      │
│   Z - Zombie (terminated but not reaped)                      │
│                                                                │
└────────────────────────────────────────────────────────────────┘
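
To see how these states are distributed on a live system, the state letter can be read from /proc/<pid>/stat, where it follows the command name in parentheses. A rough sketch, assuming Linux; count_states is a hypothetical name. Recent kernels may also report I for idle kernel threads.

```shell
#!/usr/bin/env bash
# count_states (hypothetical helper): count processes per state letter.
count_states() {
    local f state
    for f in /proc/[0-9]*/stat; do
        # The state letter is the first field after the closing ")" of the
        # command name. Processes may exit while we iterate, so skip errors.
        state=$(sed -nE 's/^.*\) ([A-Za-z]).*$/\1/p' "$f" 2>/dev/null) || continue
        if [ -n "$state" ]; then
            echo "$state"
        fi
    done | sort | uniq -c
}

count_states
```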

Viewing Processes

ps Command

# Basic ps output
ps

# Full format
ps -ef
ps aux

# Show threads
ps -eLf
ps -eT

# Custom output
ps -eo pid,ppid,cmd,%mem,%cpu

# Sort by memory
ps aux --sort=-%mem

# Sort by CPU
ps aux --sort=-%cpu

# Show process tree
ps -ef --forest

# Show user's processes
ps -U username

top Command

# Interactive process viewer
top

# Batch mode (useful for scripts)
top -bn1

# Refresh every 5 seconds and exit after 3 iterations
top -d 5 -n 3

# Show only specific user
top -u username

# Show specific processes (top -p expects a comma-separated PID list)
top -p "$(pgrep -d, -f process_name)"

# Sort by memory usage
top -o %MEM

htop (Enhanced top)

# Interactive process manager (install on Arch Linux: sudo pacman -S htop)
htop

# Set the refresh delay in tenths of a second (10 = 1 second)
htop -d 10

# Filter processes
htop -F nginx

Other Process Viewers

# Continuous process monitoring ([n]ginx keeps grep from matching itself)
watch -n 1 "ps aux | grep '[n]ginx'"

# Tree view
pstree

# Find process by name
pgrep -a nginx

# Print only the PIDs of a running program
pidof nginx

Process Information Commands

/proc Filesystem

# Process status
cat /proc/$PID/status

# Process command line (arguments are NUL-separated)
tr '\0' ' ' < /proc/$PID/cmdline; echo

# Process environment (entries are NUL-separated)
tr '\0' '\n' < /proc/$PID/environ

# Process file descriptors
ls -la /proc/$PID/fd

# Process limits
cat /proc/$PID/limits
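
The cmdline and environ files separate their entries with NUL bytes rather than spaces or newlines, so scripts usually translate them before use. A small sketch, assuming Linux; proc_cmdline and proc_fd_count are hypothetical helper names:

```shell
#!/usr/bin/env bash
# proc_cmdline PID: print the command line with NULs turned into spaces.
proc_cmdline() {
    tr '\0' ' ' < "/proc/$1/cmdline" | sed 's/ $//'
    echo
}

# proc_fd_count PID: count the file descriptors the process has open.
proc_fd_count() {
    ls "/proc/$1/fd" | wc -l
}

proc_cmdline "$$"
proc_fd_count "$$"
```

Reading the fd directory of a process owned by another user normally requires root.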

Starting Processes

Running in Foreground

# Standard foreground process
nginx

# With output redirection (no trailing &, so the shell waits for it)
./script.sh > output.log 2>&1

Running in Background

# Start process in background
./long_running_script.sh &

# Start with nohup (immune to hangup)
nohup ./script.sh &

# Start with redirect
nohup ./script.sh > output.log 2>&1 &

# Using setsid (new session)
setsid ./script.sh &

# Using disown (remove from shell)
./script.sh &
disown
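
In scripts it is often useful to record the PID of a detached process so that it can be checked or stopped later. One possible sketch: start_detached is a hypothetical helper, and the log and PID file paths are placeholders.

```shell
#!/usr/bin/env bash
# start_detached LOG CMD [ARGS...] (hypothetical helper)
# Runs CMD under nohup with all output sent to LOG, and prints its PID.
start_detached() {
    local log="$1"
    shift
    nohup "$@" > "$log" 2>&1 < /dev/null &
    echo "$!"
}

pid=$(start_detached /tmp/demo.log sleep 2)
echo "$pid" > /tmp/demo.pid

# kill -0 sends no signal; it only tests whether the PID can be signalled.
if kill -0 "$pid" 2>/dev/null; then
    echo "started with PID $pid"
fi
```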

Process Control Commands

kill - Send Signals

# Kill by PID
kill 1234

# Kill by process name
pkill nginx

# Kill all processes by name
killall nginx

# Kill with specific signal
kill -TERM 1234      # Graceful termination (15)
kill -KILL 1234      # Force kill (9)
kill -HUP 1234       # Reload configuration (1)
kill -INT 1234       # Interrupt (2)

# Common signals
kill -SIGHUP 1234    # Reload
kill -SIGTERM 1234   # Graceful stop
kill -SIGKILL 1234   # Force stop
kill -SIGSTOP 1234   # Pause process (cannot be caught or ignored)
kill -SIGCONT 1234   # Continue stopped process
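
A common pattern is to ask politely with SIGTERM and escalate to SIGKILL only if the process does not exit in time. A sketch of that pattern; stop_gracefully is a hypothetical helper name and the 10-second default is an arbitrary choice:

```shell
#!/usr/bin/env bash
# stop_gracefully PID [TIMEOUT_SECONDS] (hypothetical helper)
# Sends SIGTERM, waits up to TIMEOUT seconds, then falls back to SIGKILL.
# Returns 0 if the process exited after SIGTERM, 1 if SIGKILL was needed.
stop_gracefully() {
    local pid="$1" timeout="${2:-10}" waited=0
    # If the signal cannot be delivered, the process is already gone
    # (or is not ours to signal), so there is nothing left to do.
    kill -TERM "$pid" 2>/dev/null || return 0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage: stop_gracefully 1234 15
```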

killall - Kill by Name

# Kill all processes by name
killall nginx

# Require an exact match, even for names longer than 15 characters
killall --exact nginx

# Kill processes owned by user
killall -u username nginx

# Interactive
killall -i nginx

pkill - Kill by Pattern

# Kill by name
pkill nginx

# Kill by pattern
pkill -f "python.*script"

# Kill by user
pkill -u username

# Kill with signal
pkill -9 -f pattern

Job Control

Background and Foreground

# Start in background
./script.sh &

# List jobs
jobs

# Bring to foreground
fg %1

# Suspend the foreground job by pressing Ctrl+Z, then resume it in the background
bg

# Bring specific job
fg %2
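
Job control is mostly an interactive feature, but the jobs builtin is also handy in scripts for keeping track of background work. A brief sketch, assuming bash; kill_all_jobs is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# kill_all_jobs (hypothetical helper): terminate every background job
# started by this shell, then reap them.
kill_all_jobs() {
    local pids
    pids=$(jobs -p)                 # one PID per background job
    [ -n "$pids" ] || return 0
    kill $pids 2>/dev/null          # unquoted on purpose: one argument per PID
    wait 2>/dev/null                # collect the jobs so no zombies remain
    return 0
}

sleep 30 &
sleep 30 &
echo "running jobs: $(jobs -pr | wc -l)"
kill_all_jobs
echo "running jobs: $(jobs -pr | wc -l)"
```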

Managing Multiple Processes

Wait for Process

# Wait for specific PID
wait $PID

# Wait for background jobs
wait

# Run a command with a time limit (SIGTERM is sent after 30 seconds)
timeout 30 ./script.sh
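
wait also returns the exit status of the process it waited for, which lets a script detect failed background work, and GNU timeout reports a time-out with exit status 124. A short sketch; the subshells with exit stand in for real commands:

```shell
#!/usr/bin/env bash
# Start two background tasks and remember their PIDs.
(sleep 1; exit 0) &
pid_ok=$!
(sleep 1; exit 3) &
pid_fail=$!

# wait PID returns the exit status of that process.
status_ok=0
wait "$pid_ok" || status_ok=$?
status_fail=0
wait "$pid_fail" || status_fail=$?
echo "first task: $status_ok, second task: $status_fail"

# timeout exits with 124 when the time limit is reached.
rc=0
timeout 1 sleep 5 || rc=$?
echo "timeout exit status: $rc"
```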

Parallel Execution in Loops

# Run commands in parallel
for file in *.txt; do
    process "$file" &
done
wait

# Parallel processing with a limit of 4 jobs (sem is part of GNU parallel)
for file in *.txt; do
    sem -j 4 ./process.sh "$file"
done
sem --wait
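
Where GNU parallel (which provides sem) is not installed, bash 4.3 or later can cap concurrency on its own with wait -n, which returns as soon as any one background job finishes. A sketch under that assumption; run_limited and work are hypothetical names, and work stands in for the real task:

```shell
#!/usr/bin/env bash
# work ITEM: placeholder task (hypothetical); replace with real processing.
work() {
    sleep 0.2
    echo "processed $1"
}

# run_limited MAX ITEM...: run work on each ITEM, at most MAX at a time.
run_limited() {
    local max="$1" item
    shift
    for item in "$@"; do
        # While MAX jobs are running, block until one of them exits.
        while [ "$(jobs -pr | wc -l)" -ge "$max" ]; do
            wait -n || true
        done
        work "$item" &
    done
    wait    # wait for the remaining jobs
}

run_limited 2 a b c d e
```

Output order depends on scheduling, so it should not be relied on.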

Process Monitoring Scripts

Monitor Process Existence

#!/usr/bin/env bash
# Check if process is running

PROCESS_NAME="${1:-nginx}"

if pgrep -x "$PROCESS_NAME" > /dev/null; then
    echo "$PROCESS_NAME is running"
    exit 0
else
    echo "$PROCESS_NAME is not running"
    exit 1
fi
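
Matching by name can hit unrelated processes that happen to share it. When the service writes a PID file, checking that exact PID is more precise. A sketch, assuming a PID file that contains a single PID; is_running is a hypothetical helper and the path in the usage comment is a placeholder:

```shell
#!/usr/bin/env bash
# is_running PIDFILE (hypothetical helper)
# Succeeds if PIDFILE exists and names a process that is still alive.
is_running() {
    local pidfile="$1" pid
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric content before passing it to kill.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 checks existence and permission without sending anything.
    kill -0 "$pid" 2>/dev/null
}

# Usage (path is an assumption): is_running /run/nginx.pid && echo "running"
```

kill -0 also fails for a live process owned by another user, so run the check as that user or as root.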

Process Monitor with Restart

#!/usr/bin/env bash
# Monitor and restart process

PROCESS_NAME="${1:-nginx}"
RESTART_DELAY=5

while true; do
    if pgrep -x "$PROCESS_NAME" > /dev/null; then
        echo "$(date): $PROCESS_NAME is running"
    else
        echo "$(date): $PROCESS_NAME stopped, restarting..."
        "$PROCESS_NAME" &
    fi
    sleep "$RESTART_DELAY"
done
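
A restart loop with no upper bound can spin forever on a process that crashes at startup. One way to bound it, sketched below, runs the command in the foreground and gives up after a fixed number of restarts. supervise is a hypothetical helper, not a standard tool; for real services a supervisor such as systemd is usually the better choice.

```shell
#!/usr/bin/env bash
# supervise MAX_RESTARTS DELAY CMD [ARGS...] (hypothetical helper)
# Runs CMD in the foreground and restarts it whenever it exits non-zero.
# Returns 0 once CMD exits cleanly, 1 after MAX_RESTARTS failed restarts.
supervise() {
    local max_restarts="$1" delay="$2" attempt=0
    shift 2
    until "$@"; do
        attempt=$((attempt + 1))
        if [ "$attempt" -gt "$max_restarts" ]; then
            echo "$(date): giving up on '$*' after $max_restarts restarts" >&2
            return 1
        fi
        echo "$(date): '$*' exited, restart $attempt/$max_restarts" >&2
        sleep "$delay"
    done
    return 0
}

supervise 2 0 false || echo "command kept failing"
```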

Resource Usage Monitor

#!/usr/bin/env bash
# Monitor CPU and memory usage

PID="${1:-$$}"

while true; do
    if [[ -d "/proc/$PID" ]]; then
        # Get CPU and memory
        cpu=$(ps -p "$PID" -o %cpu= | tr -d ' ')
        mem=$(ps -p "$PID" -o %mem= | tr -d ' ')

        echo "$(date): CPU=$cpu% MEM=$mem%"
    else
        echo "Process $PID not found"
        exit 1
    fi
    sleep 5
done
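
The %mem column is relative to total RAM. For an absolute figure, the resident set size can be read from /proc/<pid>/status. A small sketch, assuming Linux; rss_kb is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# rss_kb PID: print the resident set size in kB (empty for kernel threads).
rss_kb() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

echo "This shell uses $(rss_kb "$$") kB of resident memory"
```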

Real-World DevOps Examples

Docker Process Management

#!/usr/bin/env bash
# Manage Docker containers

CONTAINER_NAME="myapp"

# Check if running
if docker ps --format '{{.Names}}' | grep -q "^${CONTAINER_NAME}$"; then
    echo "Container is running"
else
    echo "Container not running, starting..."
    docker start "$CONTAINER_NAME"
fi

# Restart with health check
restart_container() {
    local container="$1"
    docker restart "$container"

    # Wait for health check
    local max_attempts=30
    local attempt=0

    while [ $attempt -lt $max_attempts ]; do
        # -x matches the whole line, so "unhealthy" is not mistaken for "healthy"
        if docker inspect --format='{{.State.Health.Status}}' "$container" 2>/dev/null | grep -qx "healthy"; then
            echo "Container healthy"
            return 0
        fi
        sleep 2
        ((attempt++))
    done

    echo "Health check failed"
    return 1
}
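
The polling loop in restart_container is an instance of a general pattern: retry a check until it passes or the attempts run out. It could be factored out roughly as follows; wait_until is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# wait_until ATTEMPTS DELAY CMD [ARGS...] (hypothetical helper)
# Runs CMD until it succeeds, sleeping DELAY seconds between tries.
# Returns 0 on the first success, 1 if every attempt fails.
wait_until() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Skip the pause after the final attempt.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage (file path is a placeholder): wait_until 30 2 test -e /tmp/ready
```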

Kubernetes Process Management

#!/usr/bin/env bash
# Manage Kubernetes pods

NAMESPACE="${1:-default}"

# Get pod status
kubectl get pods -n "$NAMESPACE" -o wide

# Watch pod status
kubectl get pods -n "$NAMESPACE" --watch

# Get process info in pod
kubectl exec -it pod-name -n "$NAMESPACE" -- ps aux

# Debug pod issues with an interactive ephemeral container
kubectl debug -it pod-name -n "$NAMESPACE" --image=busybox -- sh

Systemd Service Management

#!/usr/bin/env bash
# Manage systemd services

SERVICE_NAME="nginx.service"

# Check status
systemctl status $SERVICE_NAME

# Start/Stop/Restart
systemctl start $SERVICE_NAME
systemctl stop $SERVICE_NAME
systemctl restart $SERVICE_NAME

# Enable/Disable at boot
systemctl enable $SERVICE_NAME
systemctl disable $SERVICE_NAME

# View logs
journalctl -u $SERVICE_NAME -f

# Restart on failure (systemd)
# Add to service file:
# Restart=on-failure
# RestartSec=5

Process CPU Affinity

#!/usr/bin/env bash
# Set CPU affinity

PID="$1"
CPU_LIST="0,1,2,3"

# Set CPU affinity
taskset -cp $CPU_LIST $PID

# View CPU affinity
taskset -p $PID

# Run on specific CPU
taskset -c 0 ./process.sh

Process Priority (Nice/ionice)

#!/usr/bin/env bash
# Set process priority

# Nice value (lower = higher priority, -20 to 19)
nice -n 10 ./script.sh        # Lower priority
nice -n -5 ./script.sh        # Higher priority (requires root)

# I/O priority (best-effort, idle, real-time)
ionice -c 3 -p $PID           # Idle priority
ionice -c 2 -n 7 -p $PID      # Best-effort, lowest priority

# Run with both
nice -n 10 ionice -c 3 ./script.sh
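
Run without arguments, nice prints the current niceness, which gives a quick way to confirm that an adjustment took effect. A minimal check:

```shell
#!/usr/bin/env bash
base=$(nice)                 # niceness of the current shell, usually 0
child=$(nice -n 5 nice)      # niceness seen by a command started with +5

echo "shell niceness: $base"
echo "child niceness: $child"
```

The adjustment is relative to the caller's niceness and is capped at 19.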

Process Limits

ulimit

# View all limits
ulimit -a

# Set max processes
ulimit -u 4096

# Set max file descriptors
ulimit -n 8192

# Remove the virtual memory limit (use a value in KiB to set one)
ulimit -v unlimited

# Persistent limits (add to /etc/security/limits.conf)
# username  soft  nofile  8192
# username  hard  nofile  16384
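
ulimit changes apply to the current shell and everything it starts afterwards. To confine a limit to a single command, it can be set inside a subshell. A sketch; with_fd_limit is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# with_fd_limit N CMD [ARGS...] (hypothetical helper)
# Runs CMD with a soft open-file limit of N; the calling shell is unaffected.
with_fd_limit() {
    local limit="$1"
    shift
    (
        ulimit -S -n "$limit" || exit 1
        "$@"
    )
}

with_fd_limit 64 bash -c 'echo "inside: $(ulimit -S -n)"'
echo "outside: $(ulimit -S -n)"
```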

Zombie and Orphan Processes

Finding Zombies

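# Find zombie processes (STAT column starts with Z)
ps aux | awk 'NR == 1 || $8 ~ /^Z/'

# More detailed
ps -eo pid,ppid,state,cmd | awk 'NR == 1 || $3 == "Z"'

# System-wide, straight from /proc
cat /proc/[0-9]*/stat 2>/dev/null | awk '$3 == "Z" {print $1}'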

Killing Zombie Processes

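# Zombies are already dead and cannot be killed; signal the parent instead so
# that it reaps them or exits. Try SIGTERM first and use -9 only as a last resort.
ps -eo ppid=,state= | awk '$2 == "Z" {print $1}' | sort -u | xargs -r kill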

# Or restart parent
systemctl restart parent_service
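
To see what a zombie looks like without waiting for one to appear, it can be created on purpose. In the sketch below a short-lived child is started and its parent then replaces itself with sleep, which never calls wait, so the child stays a zombie until the parent exits. list_zombies is a hypothetical helper; the approach assumes Linux /proc.

```shell
#!/usr/bin/env bash
# list_zombies (hypothetical helper): print "PID PPID NAME" for every
# process whose state is Z.
list_zombies() {
    local f
    for f in /proc/[0-9]*/status; do
        awk '/^Name:/  {name = $2}
             /^State:/ {state = $2}
             /^Pid:/   {pid = $2}
             /^PPid:/  {ppid = $2}
             END       {if (state == "Z") print pid, ppid, name}' "$f" 2>/dev/null || true
    done
}

# The inner shell forks "sleep 0.1", then execs "sleep 3". The exec'd sleep
# never reaps its child, so the child is a zombie for roughly three seconds.
bash -c 'sleep 0.1 & exec sleep 3' &
sleep 1
list_zombies
wait
```

Once the parent exits, the zombie is adopted by init (or the nearest subreaper) and reaped.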

Advanced Process Operations

strace - Trace System Calls

# Trace process
strace -p $PID

# Trace and save to file
strace -o output.txt -p $PID

# Trace specific system calls (modern libc usually opens files via openat)
strace -e trace=open,openat,read,write -p $PID

# Follow child processes
strace -f -p $PID

ltrace - Trace Library Calls

# Trace library calls
ltrace -p $PID

# Output to file
ltrace -o output.txt -p $PID

Summary

In this chapter, you learned:

✅ Understanding Linux processes and states
✅ Viewing processes (ps, top, htop)
✅ /proc filesystem
✅ Starting processes (foreground/background)
✅ Process control (kill, killall, pkill)
✅ Job control
✅ Process monitoring scripts
✅ Docker/Kubernetes process management
✅ Systemd service management
✅ Process priority and limits
✅ Finding and handling zombie processes
✅ System call tracing

Next Steps

Continue to the next chapter to learn about Signals and Traps.
