Chapter 13: String Manipulation

Overview

String manipulation is a core skill for Bash scripting. This chapter covers string length and concatenation, substring extraction, pattern matching, search and replace, case conversion, and more advanced operations such as regular expression matching and Base64 encoding, using Bash built-ins and external tools (sed, awk). These techniques come up constantly when processing log files, parsing configuration files, and manipulating text data in DevOps workflows.

Basic String Operations

String Length

#!/usr/bin/env bash
text="Hello, World!"

# Get string length
echo "${#text}"  # 13

# Alternative using expr ("length" is a GNU extension, not POSIX)
length=$(expr length "$text")
echo "$length"   # 13

String Concatenation

#!/usr/bin/env bash
str1="Hello"
str2="World"

# Concatenate
result="$str1 $str2"
echo "$result"  # Hello World

# Append
str1+=" World"
echo "$str1"    # Hello World

Substring Operations

Substring Extraction

#!/usr/bin/env bash
text="Hello, World!"

# From position 0, length 5
echo "${text:0:5}"    # Hello

# From position 7, length 5
echo "${text:7:5}"    # World

# From position to end
echo "${text:7}"      # World!

# Last N characters (the space before the minus sign is required)
echo "${text: -6}"    # World!

# Using expr substr (1-based index; GNU expr only)
expr substr "$text" 1 5   # Hello
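
Offsets and lengths can also be negative, which makes it possible to slice relative to the end of the string. The following is a minimal sketch; the variable names are illustrative, and the negative-length form assumes Bash 4.2 or newer.

```shell
#!/usr/bin/env bash
text="Hello, World!"

# Negative offset with a length: start 6 characters from the end, take 5.
# The space before the minus sign keeps bash from parsing this as the
# ${var:-default} expansion.
from_end="${text: -6:5}"

# Negative length (Bash 4.2+): start at offset 7, stop 1 character before the end.
without_last="${text:7:-1}"

echo "$from_end"       # World
echo "$without_last"   # World
```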

Pattern Matching

Remove Pattern

#!/usr/bin/env bash
filename="report-2024-01-15.csv"

# Remove shortest match from beginning
echo "${filename#*-}"   # 2024-01-15.csv

# Remove longest match from beginning
echo "${filename##*-}"  # 15.csv

# Remove shortest match from end
echo "${filename%.*}"   # report-2024-01-15

# Remove longest match from end
echo "${filename%%.*}"  # report-2024-01-15 (same as % here: the name contains only one dot)
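
A common use of these four operators is splitting a file path into directory, file name, and extensions without calling dirname or basename. The sketch below uses a hypothetical path chosen so that the shortest and longest matches differ.

```shell
#!/usr/bin/env bash
path="/var/log/nginx/access.log.gz"   # hypothetical example path

dir="${path%/*}"         # shortest "/*" from the end   -> /var/log/nginx
file="${path##*/}"       # longest "*/" from the start  -> access.log.gz
stem="${file%%.*}"       # longest ".*" from the end    -> access
all_ext="${file#*.}"     # shortest "*." from the start -> log.gz
last_ext="${file##*.}"   # longest "*." from the start  -> gz

printf '%s\n' "$dir" "$file" "$stem" "$all_ext" "$last_ext"
```

Note that a path without any slash would leave `dir` equal to the whole path, so a real script would need to check for that case first.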

Replace Pattern

#!/usr/bin/env bash
text="Hello World"

# Replace first occurrence
echo "${text/World/Universe}"  # Hello Universe

# Replace all occurrences
echo "${text//l/L}"           # HeLLo WorLd

# Replace if at beginning
echo "${text/#Hello/Hi}"     # Hi World

# Replace if at end
echo "${text/%World/Everyone}" # Hello Everyone

Case Conversion

Convert Case (Bash 4+)

#!/usr/bin/env bash
text="hello world"

# First character uppercase
echo "${text^}"   # Hello world

# All characters uppercase
echo "${text^^}"  # HELLO WORLD

# First character lowercase
text="HELLO WORLD"
echo "${text,}"   # hELLO WORLD

# All characters lowercase
echo "${text,,}"  # hello world
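
The case operators above are unavailable in Bash 3.x, which is still the default /bin/bash on macOS. A possible fallback, sketched here with hypothetical helper names (to_upper, to_lower, capitalize), delegates the conversion to tr.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for shells without ${var^^} and ${var,,}.
to_upper() { printf '%s' "$1" | tr '[:lower:]' '[:upper:]'; }
to_lower() { printf '%s' "$1" | tr '[:upper:]' '[:lower:]'; }

# Uppercase only the first character; the rest is left unchanged.
capitalize() {
    local first="${1:0:1}" rest="${1:1}"
    printf '%s%s' "$(to_upper "$first")" "$rest"
}

to_upper "Hello World";   echo   # HELLO WORLD
to_lower "HELLO WORLD";   echo   # hello world
capitalize "hello world"; echo   # Hello world
```

Each call spawns a tr process, so the built-in operators remain preferable inside tight loops when Bash 4+ is guaranteed.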

Word Manipulation

#!/usr/bin/env bash
sentence="the quick brown fox jumps over the lazy dog"

# Remove first word
echo "${sentence#* }"  # quick brown fox jumps over the lazy dog

# Remove last word
echo "${sentence% *}"   # the quick brown fox jumps over the lazy

# Get first word
echo "${sentence%% *}"  # the

# Get last word
echo "${sentence##* }"  # dog

Practical Examples

Example 1: Parse Log File Name

#!/usr/bin/env bash
logfile="application-2024-01-15.log"

# Extract components
filename="${logfile%.*}"      # application-2024-01-15
extension="${logfile##*.}"    # log
date_part="${filename#*-}"    # 2024-01-15 (shortest "*-" removed from the start)
base_name="${filename%%-*}"   # application (longest "-*" removed from the end)

echo "Log file: $logfile"
echo "Base name: $base_name"
echo "Date: $date_part"
echo "Extension: $extension"

Example 2: Extract Version from String

#!/usr/bin/env bash
# Parse various version formats
version_str="app-v1.2.3-beta"

# Extract the trailing suffix with parameter expansion
suffix="${version_str##*-}"   # beta

# Extract the version number (-P requires GNU grep built with PCRE support)
version=$(echo "$version_str" | grep -oP '\d+\.\d+\.\d+' || echo "unknown")
echo "Version: $version"      # Version: 1.2.3

# Alternative: remove everything except digits and dots
clean_version="${version_str//[^0-9.]/}"
echo "Clean: $clean_version"  # Clean: 1.2.3
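
Where GNU grep is not available, the same extraction can be done with the built-in =~ operator. This is a minimal sketch; the names version_re, major, minor, and patch are illustrative, and the pattern uses [0-9] because POSIX extended regular expressions have no \d.

```shell
#!/usr/bin/env bash
version_str="app-v1.2.3-beta"

# Three dot-separated numbers, each captured in its own group.
version_re='([0-9]+)\.([0-9]+)\.([0-9]+)'

version="unknown" major="" minor="" patch=""
if [[ $version_str =~ $version_re ]]; then
    version="${BASH_REMATCH[0]}"   # whole match
    major="${BASH_REMATCH[1]}"
    minor="${BASH_REMATCH[2]}"
    patch="${BASH_REMATCH[3]}"
fi

echo "Version: $version (major=$major, minor=$minor, patch=$patch)"
# Version: 1.2.3 (major=1, minor=2, patch=3)
```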

Example 3: Sanitize Input

#!/usr/bin/env bash
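# Remove special characters (single quotes keep $, ! and similar characters literal)
input='user@domain.com!#$%^&*()'

# Keep only alphanumeric
sanitized="${input//[^a-zA-Z0-9]/}"
echo "Alphanumeric: $sanitized"   # Alphanumeric: userdomaincom

# Remove spaces
spaced="a b  c"
no_spaces="${spaced// /}"
echo "No spaces: $no_spaces"      # No spaces: abc

# Trim leading and trailing whitespace
# (xargs also collapses inner runs of spaces and fails on quotes and backslashes)
padded="   hello world   "
trimmed=$(echo "$padded" | xargs)
echo "Trimmed: $trimmed"          # Trimmed: hello world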
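
Trimming can also be done with parameter expansion alone, which avoids both the extra process and the quoting pitfalls of xargs. The function name trim below is a hypothetical helper, not a Bash built-in.

```shell
#!/usr/bin/env bash
# trim: remove leading and trailing whitespace, keep inner whitespace intact.
trim() {
    local s="$1"
    # ${s%%[![:space:]]*} is the leading whitespace run; strip it from the front.
    s="${s#"${s%%[![:space:]]*}"}"
    # ${s##*[![:space:]]} is the trailing whitespace run; strip it from the back.
    s="${s%"${s##*[![:space:]]}"}"
    printf '%s' "$s"
}

raw="   hello   world  "
echo "[$(trim "$raw")]"   # [hello   world]
```

Unlike the xargs approach, the inner run of spaces is preserved and input containing quotes or backslashes passes through unchanged.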

Example 4: URL Parsing

#!/usr/bin/env bash
url="https://example.com:8080/path/to/resource?key=value&foo=bar"

# Extract protocol
protocol="${url%%://*}"
echo "Protocol: $protocol"

# Extract host and port
remain="${url#*://}"
host_port="${remain%%/*}"
echo "Host:Port: $host_port"

# Extract path
path="/${remain#*/}"
path="${path%%\?*}"
echo "Path: $path"

# Extract query string (empty when the URL has no "?")
query=""
[[ $remain == *\?* ]] && query="${remain#*\?}"
echo "Query: $query"
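
The host:port pair and the query string can be broken down further with the same operators. The sketch below is illustrative only: it assumes a plain host name (an IPv6 literal such as [::1]:8080 would need different handling) and performs no URL decoding.

```shell
#!/usr/bin/env bash
url="https://example.com:8080/path/to/resource?key=value&foo=bar"

remain="${url#*://}"
host_port="${remain%%[/?]*}"   # stop at the first "/" or "?"

# Split host and port; the port stays empty when none is given.
host="${host_port%%:*}"
port=""
[[ $host_port == *:* ]] && port="${host_port##*:}"

echo "Host: $host"   # Host: example.com
echo "Port: $port"   # Port: 8080

# Split the query string on "&", then each pair on the first "=".
query="${url#*\?}"
IFS='&' read -ra pairs <<< "$query"
for pair in "${pairs[@]}"; do
    echo "${pair%%=*} -> ${pair#*=}"
done
# key -> value
# foo -> bar
```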

Example 5: String Padding

#!/usr/bin/env bash
# Left pad with zeros
num=42
printf "%05d\n" "$num"   # 00042

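# Right pad with spaces (brackets make the padding visible)
printf "[%-10s]\n" "hello"  # [hello     ]

# Using bash parameter expansion: append only as much filler as is needed
text="hello"
filler="----------"                 # 10 characters = target width
echo "${text}${filler:${#text}}"    # hello-----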
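
For padding with an arbitrary fill character, one option is to let printf generate a run of spaces and then substitute the fill character. The helpers repeat_char, pad_left, and pad_right below are hypothetical names, and the fill is assumed to be a single plain character such as 0, - or a dot.

```shell
#!/usr/bin/env bash
# repeat_char COUNT FILL: print FILL repeated COUNT times (COUNT must be > 0).
repeat_char() {
    local count="$1" fill="$2" padding
    printf -v padding '%*s' "$count" ''   # COUNT spaces
    printf '%s' "${padding// /$fill}"     # swap each space for FILL
}

# pad_left STRING WIDTH [FILL]: pad on the left up to WIDTH characters.
pad_left() {
    local s="$1" width="$2" fill="${3:- }"
    local n=$(( width - ${#s} ))
    (( n > 0 )) && s="$(repeat_char "$n" "$fill")$s"
    printf '%s\n' "$s"
}

# pad_right STRING WIDTH [FILL]: pad on the right up to WIDTH characters.
pad_right() {
    local s="$1" width="$2" fill="${3:- }"
    local n=$(( width - ${#s} ))
    (( n > 0 )) && s="$s$(repeat_char "$n" "$fill")"
    printf '%s\n' "$s"
}

pad_left 42 5 0         # 00042
pad_right hello 10 -    # hello-----
pad_left 08 5 0         # 00008
```

Because the value is treated as a string, input such as 08 is padded as-is, whereas printf "%05d" would reject it as an invalid octal number.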

Example 6: Split String

#!/usr/bin/env bash
# Split by delimiter
IFS=',' read -ra parts <<< "apple,banana,cherry,date"

echo "Parts: ${#parts[@]}"
for part in "${parts[@]}"; do
    echo "  - $part"
done

# Split by whitespace
read -ra words <<< "one two three four"
echo "Words: ${words[*]}"   # [*] joins the elements into one string
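
The reverse operation, joining array elements back into one string, relies on the fact that "$*" joins its arguments with the first character of IFS. The function name join_by is a hypothetical helper, and this sketch supports only a single-character delimiter.

```shell
#!/usr/bin/env bash
# join_by DELIM ITEM...: join the items with a single-character delimiter.
join_by() {
    local IFS="$1"   # local, so the caller's IFS is not modified
    shift
    printf '%s\n' "$*"
}

csv="apple,banana,cherry"
IFS=',' read -ra fruits <<< "$csv"   # split
fruits+=("date")                     # modify
joined="$(join_by ';' "${fruits[@]}")"

echo "$joined"   # apple;banana;cherry;date
```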

External Tools for String Manipulation

Using sed

#!/usr/bin/env bash
text="The quick brown fox jumps over the lazy dog"

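# Replace first occurrence
echo "$text" | sed 's/brown/red/'     # The quick red fox jumps over the lazy dog

# Replace all occurrences (case-sensitive, so the leading "The" is untouched)
echo "$text" | sed 's/the/a/g'        # The quick brown fox jumps over a lazy dog

# Delete a pattern from the line
echo "$text" | sed 's/quick //'       # The brown fox jumps over the lazy dog

# Delete every line that matches a pattern (prints nothing here)
echo "$text" | sed '/quick/d'

# Extract by pattern
echo "$text" | sed -n 's/.*\(quick\).*/\1/p'   # quick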

Using awk

#!/usr/bin/env bash
text="apple,banana,cherry"

# Split and print the second field
echo "$text" | awk -F',' '{print $2}'   # banana

# Print all fields, one per line
echo "$text" | awk -F',' '{for(i=1;i<=NF;i++) print $i}'

# String length
echo "hello" | awk '{print length($0)}'  # 5

Advanced Operations

Regular Expression Matching

#!/usr/bin/env bash
email="user@example.com"

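# Using the [[ ... =~ ... ]] operator (a rough format check, not full address validation)
# Keeping the pattern in a variable and expanding it unquoted avoids quoting pitfalls
email_re='^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
if [[ $email =~ $email_re ]]; then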
    echo "Valid email"
else
    echo "Invalid email"
fi

# Extract match
if [[ "$email" =~ ^([a-zA-Z0-9._%+-]+)@ ]]; then
    echo "Username: ${BASH_REMATCH[1]}"
fi
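
Capture groups are also handy for the configuration parsing mentioned in the overview. The sketch below pulls a key and a value out of a key = value line; the sample line and the names config_re, key, and value are illustrative assumptions, not a general config file parser.

```shell
#!/usr/bin/env bash
line="  max_connections = 100   # hypothetical config line"

# Group 1: identifier; group 2: value up to the next whitespace or "#".
config_re='^[[:space:]]*([A-Za-z_][A-Za-z0-9_]*)[[:space:]]*=[[:space:]]*([^#[:space:]]+)'

key="" value=""
if [[ $line =~ $config_re ]]; then
    key="${BASH_REMATCH[1]}"
    value="${BASH_REMATCH[2]}"
fi

echo "key=$key value=$value"   # key=max_connections value=100
```

Values that contain spaces or quotes are out of scope for this pattern and would need a more careful grammar.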

Base64 Encoding/Decoding

#!/usr/bin/env bash
# Encode
echo "Hello World" | base64
# Output: SGVsbG8gV29ybGQK

# Decode
echo "SGVsbG8gV29ybGQK" | base64 -d

# File encoding (redirection works with both GNU and BSD/macOS base64;
# GNU base64 has no -o option)
base64 < input.txt > output.b64
base64 -d < output.b64 > decoded.txt
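
The trailing K in the encoded output above comes from the newline that echo appends. When the exact bytes matter (tokens, credentials), printf '%s' avoids it. This is a small round-trip sketch; it assumes a base64 that accepts -d for decoding, which holds for GNU coreutils and recent macOS releases (older macOS versions used -D).

```shell
#!/usr/bin/env bash
# printf '%s' writes the string without a trailing newline.
encoded=$(printf '%s' "Hello World" | base64)
echo "$encoded"    # SGVsbG8gV29ybGQ=

decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"    # Hello World
```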

Summary

In this chapter, you learned about:

✅ String length and concatenation
✅ Substring extraction
✅ Pattern matching and removal
✅ String replacement
✅ Case conversion
✅ Word manipulation
✅ Practical DevOps examples
✅ External tools (sed, awk)
✅ Regular expression matching
✅ Base64 encoding/decoding

Exercises

Level 1: Basics

Get the length of a string
Extract substring from a string
Convert string to uppercase

Level 2: Intermediate

Parse a log filename to extract components
Validate an email address using regex
Parse a URL into components

Level 3: Advanced

Implement a string padding function
Create a CSV parser
Build a URL encoder/decoder

Next Steps

Continue to the next chapter to learn about input/output operations.