
Process

Process management is a critical skill for DevOps engineers and system administrators. This chapter covers how to monitor, control, and manage processes in Linux using bash scripts.


A process is an instance of a running program. Each process has:

  • PID (Process ID) - Unique identifier
  • PPID (Parent Process ID) - Parent process
  • UID/GID - User and group IDs
  • Status - Running, sleeping, stopped, zombie
  • Resources - Memory, CPU, file descriptors
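
To make these attributes concrete, the following minimal sketch prints them for the current shell. It assumes a Linux system with /proc mounted; `proc_attrs` is a hypothetical helper name, not a standard command.

```shell
#!/usr/bin/env bash
# proc_attrs is a hypothetical helper name used for illustration only.
# It prints selected attributes of a process from /proc/<pid>/status (Linux).
proc_attrs() {
    local pid="${1:-$$}"
    # Name, State, Pid, PPid, Uid, Gid and FDSize are fields of this file
    grep -E '^(Name|State|Pid|PPid|Uid|Gid|FDSize):' "/proc/$pid/status"
}

# Show the attributes of the shell running this script
proc_attrs "$$"
```

The `Uid` and `Gid` lines each list four values: real, effective, saved, and filesystem IDs.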
┌────────────────────────────────────────────────────────────────┐
│                         Process States                         │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│     ┌─────┐                                                    │
│     │ New │  Process being created                             │
│     └──┬──┘                                                    │
│        │                                                       │
│        ▼                                                       │
│     ┌──────────┐     ┌─────────────┐                           │
│     │ Running  │ ──► │ Terminated  │                           │
│     │   (R)    │     │     (Z)     │                           │
│     └────┬─────┘     └─────────────┘                           │
│          │                                                     │
│     ┌────┴─────┐                                               │
│     │          │                                               │
│     ▼          ▼                                               │
│  ┌─────┐    ┌─────┐                                            │
│  │Sleep│    │Stop │                                            │
│  │ (S) │    │ (T) │                                            │
│  └─────┘    └─────┘                                            │
│                                                                │
│     R - Running/Runnable                                       │
│     S - Interruptible Sleep (waiting for event)                │
│     D - Uninterruptible Sleep (usually I/O)                    │
│     T - Stopped (by signal)                                    │
│     Z - Zombie (terminated but not reaped)                     │
│                                                                │
└────────────────────────────────────────────────────────────────┘
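
The state letters above can be read programmatically. The sketch below is one possible way to do it, assuming Linux /proc; `describe_state` is a hypothetical helper that simply mirrors the legend.

```shell
#!/usr/bin/env bash
# describe_state is a hypothetical helper: maps a state letter to its description.
describe_state() {
    case "$1" in
        R) echo "Running/Runnable" ;;
        S) echo "Interruptible Sleep" ;;
        D) echo "Uninterruptible Sleep" ;;
        T) echo "Stopped" ;;
        Z) echo "Zombie" ;;
        *) echo "Other ($1)" ;;
    esac
}

# The second field of the "State:" line in /proc/<pid>/status is the state letter
state=$(awk '/^State:/ {print $2}' "/proc/$$/status")
echo "Shell $$ is in state $state: $(describe_state "$state")"
```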

# Basic ps output
ps
# Full format
ps -ef
ps aux
# Show threads
ps -eLf
ps -eT
# Custom output
ps -eo pid,ppid,cmd,%mem,%cpu
# Sort by memory
ps aux --sort=-%mem
# Sort by CPU
ps aux --sort=-%cpu
# Show process tree
ps -ef --forest
# Show user's processes
ps -U username
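
The custom-output and sort options combine well inside scripts. As a small sketch (assuming the procps version of `ps`; `top_mem` is a hypothetical function name), this lists the processes using the most memory:

```shell
#!/usr/bin/env bash
# top_mem is a hypothetical helper: prints the N processes using the most memory.
top_mem() {
    local n="${1:-5}"
    # A trailing "=" after each column name suppresses the header line;
    # --sort=-%mem orders the rows from largest to smallest memory share
    ps -eo pid=,%mem=,comm= --sort=-%mem | head -n "$n"
}

top_mem 5
```

Dropping the headers keeps the output easy to feed into `awk` or a `while read` loop.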
# Interactive process viewer
top
# Batch mode (useful for scripts)
top -bn1
# Top with specific refresh
top -d 5 -n 3
# Show only specific user
top -u username
# Show specific process (-p takes a comma-separated PID list)
top -p "$(pgrep -d, -f process_name)"
# Sort by memory usage
top -o %MEM
# Interactive process manager (install: sudo pacman -S htop)
htop
# Set the refresh delay (in tenths of a second; 10 = 1 second)
htop -d 10
# Filter processes
htop -F nginx
# Continuous process monitoring
watch -n 1 'ps aux | grep nginx'
# Tree view
pstree
# Find process by name
pgrep -a nginx
# Print only the PIDs of a running program
pidof nginx

# Process status
cat /proc/$PID/status
# Process command line (arguments are NUL-separated)
tr '\0' ' ' < /proc/$PID/cmdline; echo
# Process environment (entries are NUL-separated)
tr '\0' '\n' < /proc/$PID/environ
# Process file descriptors
ls -la /proc/$PID/fd
# Process limits
cat /proc/$PID/limits
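
Because `cmdline` separates arguments with NUL bytes, a script has to split on NUL to recover them. The following is a minimal sketch of that, assuming Linux /proc; `print_args` and `count_fds` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# print_args is a hypothetical helper: prints each argument of a process on its own line.
print_args() {
    local pid="${1:-$$}" arg
    # -d '' makes read split on NUL bytes, the separator used in /proc/<pid>/cmdline
    while IFS= read -r -d '' arg; do
        printf '%s\n' "$arg"
    done < "/proc/$pid/cmdline"
}

# count_fds is a hypothetical helper: counts the open file descriptors of a process.
count_fds() {
    local pid="${1:-$$}"
    ls "/proc/$pid/fd" | wc -l
}

print_args "$$"
echo "open file descriptors: $(count_fds "$$")"
```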

# Standard foreground process
nginx
# With output redirection (stdout and stderr to a file)
./script.sh > output.log 2>&1
# Start process in background
./long_running_script.sh &
# Start with nohup (immune to hangup)
nohup ./script.sh &
# Start with redirect
nohup ./script.sh > output.log 2>&1 &
# Using setsid (new session)
setsid ./script.sh &
# Using disown (remove from shell)
./script.sh &
disown
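
A script usually needs the PID of a job it started in the background. The sketch below shows one common pattern: capture `$!` and probe the process with `kill -0`. The name `is_alive` is hypothetical, and `sleep 30` stands in for a real long-running command.

```shell
#!/usr/bin/env bash
# is_alive is a hypothetical helper name.
# kill -0 delivers no signal; it only reports whether the PID exists and may be signalled.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

# Start a background job; $! holds the PID of the most recent one
sleep 30 &
job_pid=$!

if is_alive "$job_pid"; then
    echo "job $job_pid is running"
fi

# Clean up the demo job and reap it
kill "$job_pid"
wait "$job_pid" 2>/dev/null || true
is_alive "$job_pid" || echo "job $job_pid has exited"
```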

# Kill by PID
kill 1234
# Kill by process name
pkill nginx
# Kill all processes by name
killall nginx
# Kill with specific signal
kill -TERM 1234 # Graceful termination (15)
kill -KILL 1234 # Force kill (9)
kill -HUP 1234 # Reload configuration (1)
kill -INT 1234 # Interrupt (2)
# Common signals
kill -SIGHUP 1234 # Reload
kill -SIGTERM 1234 # Graceful stop
kill -SIGKILL 1234 # Force stop
kill -SIGSTOP 1234 # Pause process (cannot be caught or ignored)
kill -SIGCONT 1234 # Continue stopped process
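
In practice SIGTERM and SIGKILL are often combined: ask politely first, then force. The sketch below illustrates that escalation under stated assumptions; `stop_gracefully` is a hypothetical helper and the 5-second default timeout is arbitrary.

```shell
#!/usr/bin/env bash
# stop_gracefully is a hypothetical helper: SIGTERM first, SIGKILL after a timeout.
# Usage: stop_gracefully <pid> [timeout_seconds]
stop_gracefully() {
    local pid="$1" timeout="${2:-5}" waited=0
    # If the signal cannot be delivered, the process is already gone
    kill -TERM "$pid" 2>/dev/null || return 0
    # Note: kill -0 also succeeds for a zombie that has not been reaped yet
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "PID $pid did not exit after SIGTERM, sending SIGKILL"
            kill -KILL "$pid" 2>/dev/null || true
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "PID $pid exited after SIGTERM"
}
```

For example, `stop_gracefully 1234 10` would give PID 1234 ten seconds to shut down cleanly before it is killed.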
# Kill all processes by name
killall nginx
# Kill by exact name (require an exact match for very long names)
killall --exact nginx
# Kill processes owned by user
killall -u username nginx
# Interactive
killall -i nginx
# Kill by name
pkill nginx
# Kill by pattern
pkill -f "python.*script"
# Kill by user
pkill -u username
# Kill with signal
pkill -9 -f pattern

# Start in background
./script.sh &
# List jobs
jobs
# Bring to foreground
fg %1
# Suspend the foreground job with Ctrl+Z, then resume it in the background
bg
# Bring specific job
fg %2

# Wait for specific PID (must be a child of this shell)
wait "$PID"
# Wait for all background jobs
wait
# Run a command with a time limit (exits with status 124 on timeout)
timeout 30 ./script.sh
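
`wait` also returns the exit status of the job it waited for, which lets a script tell which background job failed. A minimal sketch, where the two subshells are stand-ins for real work:

```shell
#!/usr/bin/env bash
# Start two background jobs with different exit codes
( sleep 0.2; exit 0 ) &
ok_pid=$!
( sleep 0.1; exit 3 ) &
bad_pid=$!

# wait <pid> returns the exit status of that job, even if it has already finished
ok_status=0
wait "$ok_pid" || ok_status=$?
bad_status=0
wait "$bad_pid" || bad_status=$?

echo "job $ok_pid exited with $ok_status, job $bad_pid exited with $bad_status"
```

The `|| status=$?` form keeps the script working even when it runs under `set -e`.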
# Run commands in parallel
for file in *.txt; do
    process "$file" &
done
wait
# Parallel processing with limit (sem is part of GNU parallel)
for file in *.txt; do
    sem -j 4 ./process.sh "$file"
done
sem --wait
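
When GNU parallel is not installed, a similar limit can be approximated with the `wait -n` builtin. This is a sketch that assumes bash 4.3 or newer; `run_limited` and `work` are hypothetical names, and `work` only simulates a task.

```shell
#!/usr/bin/env bash
# run_limited is a hypothetical helper: runs "<cmd> <item>" for every item,
# with at most <max> jobs in flight. Requires bash 4.3+ for `wait -n`.
# Usage: run_limited <max> <cmd> <item>...
run_limited() {
    local max="$1" cmd="$2"
    shift 2
    local running=0 item
    for item in "$@"; do
        "$cmd" "$item" &
        running=$((running + 1))
        if [ "$running" -ge "$max" ]; then
            wait -n || true          # block until any one job finishes
            running=$((running - 1))
        fi
    done
    wait                             # wait for the remaining jobs
}

# Placeholder task used only for this demonstration
work() { sleep 0.2; echo "done $1"; }

run_limited 2 work a b c d
```

The order of the output lines is not deterministic, since jobs finish independently.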

#!/usr/bin/env bash
# Check if process is running
PROCESS_NAME="${1:-nginx}"
if pgrep -x "$PROCESS_NAME" > /dev/null; then
    echo "$PROCESS_NAME is running"
    exit 0
else
    echo "$PROCESS_NAME is not running"
    exit 1
fi
#!/usr/bin/env bash
# Monitor and restart process
PROCESS_NAME="${1:-nginx}"
RESTART_DELAY=5
while true; do
    if pgrep -x "$PROCESS_NAME" > /dev/null; then
        echo "$(date): $PROCESS_NAME is running"
    else
        echo "$(date): $PROCESS_NAME stopped, restarting..."
        # Quote the name so it is always run as a single command word
        "$PROCESS_NAME" &
    fi
    sleep "$RESTART_DELAY"
done
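
A restart loop like the one above never gives up, which can hide a process that crashes immediately on every start. One possible variation is to cap the number of restarts. The sketch below assumes the command runs in the foreground; `supervise` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# supervise is a hypothetical helper: reruns a command until it succeeds
# or the restart budget is used up.
# Usage: supervise <max_restarts> <command> [args...]
supervise() {
    local max_restarts="$1"
    shift
    local attempt=0
    until "$@"; do
        attempt=$((attempt + 1))
        if [ "$attempt" -gt "$max_restarts" ]; then
            echo "giving up after $max_restarts restarts" >&2
            return 1
        fi
        echo "command failed, restart $attempt of $max_restarts" >&2
        # RESTART_DELAY is read from the environment; 1 second if unset
        sleep "${RESTART_DELAY:-1}"
    done
}
```

For example, `supervise 3 ./server.sh` (a placeholder script name) would run the script once and restart it at most three times.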
#!/usr/bin/env bash
# Monitor CPU and memory usage
PID="${1:-$$}"
while true; do
    if [[ -d "/proc/$PID" ]]; then
        # Get CPU and memory (the trailing "=" suppresses the column header)
        cpu=$(ps -p "$PID" -o %cpu= | tr -d ' ')
        mem=$(ps -p "$PID" -o %mem= | tr -d ' ')
        echo "$(date): CPU=${cpu}% MEM=${mem}%"
    else
        echo "Process $PID not found"
        exit 1
    fi
    sleep 5
done
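
Monitoring becomes more useful when it compares a reading against a threshold. As a rough sketch (assuming Linux /proc; `rss_kb` and `over_limit` are hypothetical helpers), the resident memory of a process can be checked like this:

```shell
#!/usr/bin/env bash
# rss_kb is a hypothetical helper: prints the resident set size of a process in kB,
# taken from the VmRSS line of /proc/<pid>/status.
rss_kb() {
    awk '/^VmRSS:/ {print $2}' "/proc/${1:-$$}/status"
}

# over_limit is a hypothetical helper: succeeds if the process uses more than <limit_kb>.
# Usage: over_limit <pid> <limit_kb>
over_limit() {
    local pid="$1" limit_kb="$2" rss
    rss=$(rss_kb "$pid") || return 2
    [ -n "$rss" ] && [ "$rss" -gt "$limit_kb" ]
}

echo "This shell uses $(rss_kb "$$") kB of resident memory"
if over_limit "$$" 1048576; then
    echo "Warning: more than 1 GiB resident"
fi
```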

#!/usr/bin/env bash
# Manage Docker containers
CONTAINER_NAME="myapp"
# Check if running (-x matches the whole line, -F treats the name literally)
if docker ps --format '{{.Names}}' | grep -qxF "$CONTAINER_NAME"; then
    echo "Container is running"
else
    echo "Container not running, starting..."
    docker start "$CONTAINER_NAME"
fi
# Restart with health check
restart_container() {
    local container="$1"
    docker restart "$container"
    # Wait for health check (requires a HEALTHCHECK on the container)
    local max_attempts=30
    local attempt=0
    local status
    while [ "$attempt" -lt "$max_attempts" ]; do
        # Compare exactly: a substring match on "healthy" would also accept "unhealthy"
        status=$(docker inspect --format='{{.State.Health.Status}}' "$container" 2>/dev/null)
        if [ "$status" = "healthy" ]; then
            echo "Container healthy"
            return 0
        fi
        sleep 2
        attempt=$((attempt + 1))
    done
    echo "Health check failed"
    return 1
}
#!/usr/bin/env bash
# Manage Kubernetes pods
NAMESPACE="${1:-default}"
# Get pod status
kubectl get pods -n "$NAMESPACE" -o wide
# Watch pod status (blocks until interrupted with Ctrl+C)
kubectl get pods -n "$NAMESPACE" --watch
# Get process info in pod (replace pod-name with a real pod)
kubectl exec pod-name -n "$NAMESPACE" -- ps aux
# Debug pod issues with an interactive ephemeral container
kubectl debug -it pod-name -n "$NAMESPACE" --image=busybox -- sh
#!/usr/bin/env bash
# Manage systemd services
SERVICE_NAME="nginx.service"
# Check status
systemctl status "$SERVICE_NAME"
# Start/Stop/Restart (require root privileges)
systemctl start "$SERVICE_NAME"
systemctl stop "$SERVICE_NAME"
systemctl restart "$SERVICE_NAME"
# Enable/Disable at boot
systemctl enable "$SERVICE_NAME"
systemctl disable "$SERVICE_NAME"
# View logs (-f follows new entries until interrupted)
journalctl -u "$SERVICE_NAME" -f
# Restart on failure (systemd)
# Add to the [Service] section of the unit file:
# Restart=on-failure
# RestartSec=5
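
To show where the restart settings sit in a unit file, the sketch below writes a complete example unit to a scratch directory. Everything specific in it is a placeholder: `write_unit` is a hypothetical helper, and `myapp` and `/usr/local/bin/myapp` are made-up names.

```shell
#!/usr/bin/env bash
# write_unit is a hypothetical helper; myapp and its path are placeholder names.
write_unit() {
    local unit_path="$1"
    cat > "$unit_path" <<'EOF'
[Unit]
Description=Example service (placeholder)

[Service]
ExecStart=/usr/local/bin/myapp
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
}

# Write to a scratch location for review. Installing the file under
# /etc/systemd/system and running `systemctl daemon-reload` would be a
# separate, privileged step.
unit_file="$(mktemp -d)/myapp.service"
write_unit "$unit_file"
cat "$unit_file"
```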
#!/usr/bin/env bash
# Manage CPU affinity
PID="$1"
CPU_LIST="0,1,2,3"
# Set CPU affinity of a running process
taskset -cp "$CPU_LIST" "$PID"
# View CPU affinity (printed as a hexadecimal mask; add -c for a CPU list)
taskset -p "$PID"
# Run on specific CPU
taskset -c 0 ./process.sh
#!/usr/bin/env bash
# Set process priority
PID="$1"
# Nice value (lower = higher priority, -20 to 19)
nice -n 10 ./script.sh # Lower priority
nice -n -5 ./script.sh # Higher priority (requires root)
# I/O priority classes: 1 = real-time, 2 = best-effort, 3 = idle
ionice -c 3 -p "$PID" # Idle priority
ionice -c 2 -n 7 -p "$PID" # Best-effort, lowest priority
# Run with both
nice -n 10 ionice -c 3 ./script.sh
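
The effect of `nice` can be observed directly. This small sketch assumes Linux /proc and relies on field 19 of `/proc/<pid>/stat` being the nice value, as documented in proc(5); the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Field 19 of /proc/<pid>/stat is the nice value (see proc(5)).
# awk reads /proc/self/stat, so it reports the niceness of the awk process itself.
base=$(awk '{print $19}' /proc/self/stat)

# nice -n 5 adds 5 to the inherited niceness (capped at 19)
niced=$(nice -n 5 awk '{print $19}' /proc/self/stat)

echo "inherited niceness: $base, under 'nice -n 5': $niced"
```

The adjustment is relative: a child started with `nice -n 5` from a shell that already runs at niceness 10 ends up at 15.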

# View all limits
ulimit -a
# Set max processes
ulimit -u 4096
# Set max file descriptors
ulimit -n 8192
# Set max virtual memory (address space) size
ulimit -v unlimited
# Persistent limits (add to /etc/security/limits.conf)
# username soft nofile 8192
# username hard nofile 16384
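
Limits set with `ulimit` apply to the current shell and everything it starts afterwards. To restrict a single command, the limit can be changed inside a subshell, as in this sketch (the value 64 is arbitrary and assumes the hard limit is at least that high):

```shell
#!/usr/bin/env bash
# Soft limit on open files before any change
before=$(ulimit -S -n)

# The command substitution runs in a subshell, so the lowered
# soft limit exists only inside it
inside=$( ulimit -S -n 64; ulimit -S -n )

# The parent shell keeps its original limit
after=$(ulimit -S -n)

echo "before=$before inside=$inside after=$after"
```

The same pattern works with parentheses, for example `( ulimit -S -n 64; ./script.sh )`.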

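# Find zombie processes (STAT is the 8th column of ps aux)
ps aux | awk '$8 ~ /^Z/'
# More detailed
ps -eo pid,ppid,state,cmd | awk '$3 == "Z"'
# Directly from /proc (field 3 of stat is the state)
cat /proc/[0-9]*/stat 2>/dev/null | awk '$3=="Z" {print $1}'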
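# Zombies are already dead and cannot be killed; the parent must reap them
# List the parent PIDs of all zombies
ps -eo ppid=,state= | awk '$2 == "Z" {print $1}' | sort -u
# Terminate a parent so that init adopts and reaps its zombies
# (set PARENT_PID to a PID listed above; escalate to -9 only if SIGTERM fails)
kill -TERM "$PARENT_PID"
# Or restart parent
systemctl restart parent_service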
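
Since the fix always involves the parent, it helps to list each zombie together with its parent PID. A minimal sketch, assuming the procps version of `ps`; `list_zombies` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# list_zombies is a hypothetical helper: prints "PID PPID COMMAND" for every zombie.
list_zombies() {
    # The trailing "=" drops the header; field 3 is the one-letter state
    ps -eo pid=,ppid=,state=,comm= | awk '$3 == "Z" {print $1, $2, $4}'
}

list_zombies
```

An empty result means there are currently no zombies. A zombie can be produced for testing with `( sleep 0 & exec sleep 3 ) &`: the subshell starts a child and then replaces itself with `sleep 3`, which never reaps that child.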

# Trace process
strace -p $PID
# Trace and save to file
strace -o output.txt -p $PID
# Trace specific system calls (modern libc opens files via openat)
strace -e trace=open,openat,read,write -p $PID
# Follow child processes
strace -f -p $PID
# Trace library calls
ltrace -p $PID
# Output to file
ltrace -o output.txt -p $PID

In this chapter, you learned:

  • ✅ Understanding Linux processes and states
  • ✅ Viewing processes (ps, top, htop)
  • ✅ /proc filesystem
  • ✅ Starting processes (foreground/background)
  • ✅ Process control (kill, killall, pkill)
  • ✅ Job control
  • ✅ Process monitoring scripts
  • ✅ Docker/Kubernetes process management
  • ✅ Systemd service management
  • ✅ Process priority and limits
  • ✅ Finding and handling zombie processes
  • ✅ System call tracing

Continue to the next chapter to learn about Signals and Traps.

