
Chapter 42: Process Substitution in Bash

Overview

Process substitution lets you pass the output of a command (or the input of one) to another program as if it were a file. Use it wherever a program expects a filename but the data comes from a command, which is a common situation in DevOps automation scripts.

Understanding Process Substitution

What is Process Substitution?

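Process substitution, written <(command) or >(command), runs the command asynchronously and connects its output or input to a pipe. Bash replaces the expression with a path to that pipe, typically /dev/fd/63 on Linux (or a named pipe on systems without /dev/fd), so any program that accepts a filename can read from or write to the command.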

┌────────────────────────────────────────────────────────────────┐
│                   Process Substitution Flow                    │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│   <(command)    →    /dev/fd/63 (pipe)                         │
│                                                                │
│   command ──► output ──► /dev/fd/63 ──► another_command        │
│                                                                │
│   >(command)    ←    /dev/fd/63 (pipe)                         │
│                                                                │
│   command ◄── input ◄── /dev/fd/63 ◄── another_command         │
│                                                                │
│   Key benefits:                                                │
│   - No temporary files needed                                  │
│   - Works with commands expecting file arguments              │
│   - Enables advanced data processing                           │
│                                                                │
└────────────────────────────────────────────────────────────────┘
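
To see the substitution at work, the short sketch below prints the path Bash hands to the command and then reads through it. The variable names are illustrative, and the exact path depends on the system (Linux typically shows /dev/fd/63).

```bash
#!/usr/bin/env bash
# Print the path that replaces the substitution (system dependent).
substituted_path=$(echo <(true))
echo "Bash passed: $substituted_path"

# Any program that takes a filename can read the command's output through it.
content=$(cat <(printf 'hello from a pipe\n'))
echo "cat read: $content"
```

The command inside <(...) never sees a filename of its own; only the consumer (echo and cat here) receives the path.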

Basic Syntax

Input Process Substitution

# Use command output as file input
diff <(command1) <(command2)

# Compare file with command output
diff /etc/passwd <(getent passwd)

# View differences
diff <(ls /bin) <(ls /usr/bin)
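
A self-contained sketch of the same pattern follows. The two functions old_list and new_list are made up for illustration and stand in for any pair of commands whose output you want to compare.

```bash
#!/usr/bin/env bash
# Two made-up inventories stand in for real command output.
old_list() { printf '%s\n' apple banana cherry; }
new_list() { printf '%s\n' apple cherry date; }

# diff exits with status 1 when the inputs differ, so "|| true" keeps
# the script going if it runs under "set -e".
changes=$(diff <(old_list) <(new_list) || true)
echo "$changes"
```

Lines starting with < appear only in the first input, and lines starting with > only in the second.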

Output Process Substitution

# Give a command a writable "file" that feeds another command's stdin
command >(consumer_command)

# Example - send the same data to two commands with tee
echo "data" | tee >(command1) >(command2) > /dev/null

# Combined input and output
command <(producer_command) >(consumer_command)
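
The sketch below shows one way to fan a stream out to two consumers with tee. The sample data and the labels added by sed are assumptions made for the example.

```bash
#!/usr/bin/env bash
# Feed three sample lines to two consumers at once.
# Each >(...) inherits the enclosing stdout, so the command substitution
# collects both results and returns only after both consumers finish.
report=$(
    printf '%s\n' alpha beta gamma |
        tee >(grep -c '' | sed 's/^/total lines: /') \
            >(grep -c 'l' | sed 's/^/lines with l: /') \
            > /dev/null
)
echo "$report"
```

The consumers run concurrently, so the order of the two report lines can vary from run to run.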

Common Use Cases

Comparing Outputs

#!/usr/bin/env bash
# Compare current and expected file contents

# Using diff with process substitution
diff <(sort /etc/passwd) <(getent passwd | sort)

# Compare directory listings
diff <(ls /bin | sort) <(ls /usr/bin | sort)

# Compare the set of running commands five seconds apart
diff <(ps -eo comm= | sort -u) <(sleep 5; ps -eo comm= | sort -u)

While Loop with Command Output

#!/usr/bin/env bash
# Read from command output using process substitution

# ✅ Good - preserves spaces in filenames
while IFS= read -r line; do
    echo "Processing: $line"
done < <(find . -type f -name "*.txt")

# ❌ Bad - word splitting issues
for file in $(find . -type f); do
    echo "Processing: $file"  # Breaks on spaces
done
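
Besides handling spaces, reading with < <(...) keeps the loop in the current shell, so variables set inside it are still there afterwards; with "find ... | while" the loop runs in a subshell and they are lost. The sketch below assumes GNU or BSD find and sort (for -print0 and -z) and uses a throwaway directory with made-up filenames.

```bash
#!/usr/bin/env bash
# Create a scratch directory with one awkward filename.
tmpdir=$(mktemp -d)
touch "$tmpdir/plain.txt" "$tmpdir/with space.txt"

# NUL-delimited names survive spaces and newlines; the array is filled
# in the current shell because the loop is not on the right of a pipe.
found=()
while IFS= read -r -d '' file; do
    found+=("${file##*/}")
done < <(find "$tmpdir" -type f -name '*.txt' -print0 | sort -z)

printf 'Found %d files\n' "${#found[@]}"
printf '  %s\n' "${found[@]}"
rm -rf "$tmpdir"
```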

Multiple File Processing

#!/usr/bin/env bash
# Process multiple sources simultaneously

# Combine outputs from multiple commands
cat <(echo "=== Header ===") \
    data.txt \
    <(echo "=== Footer ===") \
    > combined.txt

# Merge two files (sorted on the fly) and drop duplicate lines
sort -m <(sort file1.txt) <(sort file2.txt) | uniq
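
Tools that require sorted input on two files are a natural fit. As a small sketch, comm -12 prints only the lines common to both inputs; the fruit lists are sample data invented for the example.

```bash
#!/usr/bin/env bash
# comm needs both inputs sorted, so each list is sorted inside its own
# substitution. -12 suppresses the lines unique to either side.
common=$(comm -12 <(printf '%s\n' pear apple fig | sort) \
                  <(printf '%s\n' fig kiwi apple | sort))
echo "$common"
```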

Practical Examples

Compare Disk Usage

#!/usr/bin/env bash
# Compare disk usage between directories

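# Get current disk usage in KiB, with paths relative to /home and sorted
# by name (du -h sizes such as 1.5G do not sort correctly with sort -n)
(cd /home && du -sk -- * 2>/dev/null | sort -k2) > /tmp/now.txt

# Get the same listing for the backup
(cd /backup/home && du -sk -- * 2>/dev/null | sort -k2) > /tmp/then.txt

# Show differences
diff /tmp/now.txt /tmp/then.txt

# Or use process substitution directly, with no temp files
diff <(cd /home && du -sk -- * 2>/dev/null | sort -k2) \
     <(cd /backup/home && du -sk -- * 2>/dev/null | sort -k2)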
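
The same idea can check which files are missing from a backup. This is a minimal sketch on two throwaway directories; list_files and the directory layout are hypothetical.

```bash
#!/usr/bin/env bash
# Build two scratch directories that stand in for a live tree and its backup.
workdir=$(mktemp -d)
mkdir -p "$workdir/live" "$workdir/backup"
touch "$workdir/live/a.txt" "$workdir/live/b.txt" "$workdir/backup/a.txt"

# List files relative to each root so identical trees give identical output.
list_files() { (cd "$1" && find . -type f | sort); }

missing=$(diff <(list_files "$workdir/live") <(list_files "$workdir/backup") || true)
echo "$missing"
rm -rf "$workdir"
```

Listing paths relative to each root matters: with absolute paths every line would differ and the diff would be useless.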

Parallel Processing

#!/usr/bin/env bash
# Process files in parallel using xargs

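# Placeholder worker; replace the body with the real per-file work
process_file() {
    printf 'Processing: %s\n' "$1"
}
export -f process_file   # make the function visible to the bash started by xargs

# Run up to 4 workers at a time
find . -name "*.txt" -print0 | xargs -0 -P 4 -I {} bash -c 'process_file "$@"' _ {}

# With process substitution: one background job per file (no limit on job count)
while IFS= read -r file; do
    process_file "$file" &
done < <(find . -name "*.txt")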

wait
echo "All files processed"
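
A plain wait does not tell you whether any job failed. Because the loop above runs in the current shell, it can record each job's PID and check them one by one. The sketch below uses a hypothetical worker named work that fails for the input "bad".

```bash
#!/usr/bin/env bash
# Hypothetical worker: succeeds unless its argument is "bad".
work() { [[ $1 != bad ]]; }

# Start one background job per input line and remember its PID.
pids=()
while IFS= read -r item; do
    work "$item" &
    pids+=("$!")
done < <(printf '%s\n' one bad three)

# wait PID returns that job's exit status, so failures can be counted.
failures=0
for pid in "${pids[@]}"; do
    wait "$pid" || failures=$((failures + 1))
done
echo "failures: $failures"
```

With a pipe feeding the loop instead, the pids array would be filled in a subshell and be empty by the time the second loop runs.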

Filter Output

#!/usr/bin/env bash
# Filter command output using process substitution

# Using grep with process substitution (some_command is a placeholder)
grep "ERROR" <(some_command 2>&1)

# Combining multiple filters
cat <(command1 | grep "filter1") <(command2 | grep "filter2")

# Real example - combine logs
cat <(systemctl status nginx | grep Active) \
    <(tail -n 20 /var/log/nginx/error.log)
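
Process substitution is most useful for filtering when the filter itself takes a file argument. As a sketch, grep -f reads its patterns from a file, which here is generated by a command; the log text and the severity list are invented for the example.

```bash
#!/usr/bin/env bash
# Sample log lines standing in for real command output.
log=$'INFO start\nERROR disk full\nWARN slow\nFATAL crash'

# -F treats patterns as fixed strings; -f reads one pattern per line
# from the "file" produced by printf.
matches=$(grep -F -f <(printf '%s\n' ERROR FATAL) <<<"$log")
echo "$matches"
```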

Advanced Techniques

Temporary Files Alternative

#!/usr/bin/env bash
# Process substitution vs temp files

# Using process substitution (no temp files needed)
diff <(sort file1.txt) <(sort file2.txt)

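# Bash exposes each substitution as a pipe under /dev/fd/
# No disk space used; data streams through memory
# Both sort commands run at the same time

# When to use temp files:
# - The consumer seeks in its input (a pipe cannot be rewound)
# - Need to reuse data multiple times
# - The producer's exit status must be checked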
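
The sketch below illustrates the reuse limitation. read_twice is a hypothetical consumer that opens its argument two times; the behaviour shown for the pipe case is what Linux with /dev/fd gives and may differ on other systems.

```bash
#!/usr/bin/env bash
# Hypothetical consumer that reads the same path twice.
read_twice() {
    first_pass=$(cat "$1")
    second_pass=$(cat "$1")
}

# With a substitution, the second read finds the pipe already drained.
read_twice <(echo "data")
pipe_second=$second_pass
echo "pipe:      first='$first_pass' second='$pipe_second'"

# A temp file can be reread; remove it when done.
tmpfile=$(mktemp)
echo "data" > "$tmpfile"
read_twice "$tmpfile"
echo "temp file: first='$first_pass' second='$second_pass'"
rm -f "$tmpfile"
```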

Pipeline with Multiple Stages

#!/usr/bin/env bash
# Multi-stage processing with process substitution

# Combine and process multiple streams
cat <(command1 | grep "pattern") <(command2 | sort) | \
    awk '{print $1, $2}' | \
    sort -u

# Real example - analyze logs from multiple sources
cat <(journalctl -u nginx -n 50 --no-pager) \
    <(tail -n 50 /var/log/nginx/error.log) | \
    grep ERROR | \
    sort | uniq -c | sort -rn

Database Queries

#!/usr/bin/env bash
# Compare database states using process substitution

# Compare query results
diff <(mysql -u root -e "SELECT * FROM users" mydb | sort) \
     <(mysql -u root -e "SELECT * FROM users" mydb_bak | sort)

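# Export and compare schemas
# (mysqldump -T writes files into a directory and prints nothing to diff)
diff <(mysqldump -u root --no-data --skip-comments mydb) \
     <(mysqldump -u root --no-data --skip-comments mydb_bak)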

Debugging Process Substitution

Viewing File Descriptors

# List open file descriptors
ls -la /proc/$$/fd

# For process substitution
echo <(echo test)
# Shows a path such as /dev/fd/63

# View what that path points to (a pipe)
ls -l <(echo test)
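
A script can also check what kind of path it was given. The sketch below uses the -p and -f file tests; describe_path is a hypothetical helper, and the result for the substitution assumes a system where Bash backs it with a pipe, as Linux does.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report whether a path is a pipe or a regular file.
describe_path() {
    if [[ -p $1 ]]; then
        echo "$1 is a pipe"
    elif [[ -f $1 ]]; then
        echo "$1 is a regular file"
    else
        echo "$1 is something else"
    fi
}

tmpfile=$(mktemp)
kind_of_substitution=$(describe_path <(true))
kind_of_file=$(describe_path "$tmpfile")
echo "$kind_of_substitution"
echo "$kind_of_file"
rm -f "$tmpfile"
```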

Troubleshooting

#!/usr/bin/env bash
# Debug process substitution issues

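# Check if the running shell supports process substitution
# (the test runs in a subshell so a syntax error does not abort the script)
if (eval 'cat <(true)') >/dev/null 2>&1; then
    echo "Process substitution supported"
else
    echo "Process substitution NOT supported"
fi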

# Debug: trace what happens
bash -x script.sh

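# Common issues:
# 1. "syntax error near unexpected token" - script is run by sh, not bash
# 2. Command seeks in its input or reads it twice - use a temp file
# 3. Permission denied or "No such file" - /dev/fd is not passed on to the
#    command (for example under sudo, which closes extra file descriptors)
# 4. Broken pipe - the reader exited early; handle in script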
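
One more pitfall is that a failure inside the substitution does not show up in $?. The sketch below demonstrates this with a hypothetical producer, flaky_source, and shows one workaround: run the producer first so its status can be checked.

```bash
#!/usr/bin/env bash
# Hypothetical producer that prints something and then fails.
flaky_source() { echo "partial output"; return 3; }

# $? reflects cat, which succeeded; the producer's failure is invisible.
cat <(flaky_source) > /dev/null
status_with_substitution=$?

# Workaround: capture the output first and test the status directly.
captured=$(flaky_source)
status_with_capture=$?

echo "via substitution: $status_with_substitution"
echo "via capture:      $status_with_capture"
```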

DevOps Examples

Log Analysis

#!/usr/bin/env bash
# Analyze logs from multiple sources

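# Follow both logs and show errors as they arrive (GNU tail and grep options).
# Avoid cat <(tail -f a) <(tail -f b): cat reads its arguments in order,
# and the first tail -f never ends, so the second log is never read.
grep --line-buffered ERROR \
    <(tail -q -f /var/log/nginx/access.log /var/log/nginx/error.log)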

# Aggregate metrics
cat <(docker logs container1 2>&1) \
    <(docker logs container2 2>&1) | \
    grep ERROR | \
    sort | uniq -c

Configuration Comparison

#!/usr/bin/env bash
# Compare configurations

# Current vs default (nginx -T prints test messages on stderr; discard them)
diff <(nginx -T 2>/dev/null | grep -v '^[[:space:]]*#') \
     /etc/nginx/default.conf

# Production vs staging
diff <(aws s3 cp s3://bucket/prod/config -) \
     <(aws s3 cp s3://bucket/staging/config -)
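
Raw configuration diffs are often noisy because of comments and spacing. The sketch below normalizes both sides before comparing; the normalize function and the two sample configurations are assumptions made for the example.

```bash
#!/usr/bin/env bash
# Drop comment and blank lines, then squeeze runs of whitespace.
normalize() {
    grep -v -E '^[[:space:]]*(#|$)' |
        sed -E 's/[[:space:]]+/ /g; s/^ //; s/ $//'
}

# Sample configurations standing in for production and staging.
prod_config=$'# production\nworker_processes   4;\n\nkeepalive_timeout 65;'
staging_config=$'worker_processes 4;\nkeepalive_timeout 30;'

config_diff=$(diff <(normalize <<<"$prod_config") \
                   <(normalize <<<"$staging_config") || true)
echo "$config_diff"
```

Only the keepalive_timeout line differs after normalization; the comment and the extra spaces no longer show up in the diff.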

Summary

In this chapter, you learned:

✅ What process substitution is and how it works
✅ Basic syntax: <() and >()
✅ Comparing command outputs
✅ Reading from process substitution
✅ Parallel processing techniques
✅ Advanced techniques and troubleshooting
✅ DevOps real-world examples

Next Steps

Continue to the next chapter to learn about Performance Optimization.
