
Cgroups & Namespaces

Linux Resource Isolation and Virtualization


Control groups (cgroups) are a Linux kernel feature that allows processes to be organized into hierarchical groups and have resource limits, accounting, and isolation applied to them. Cgroups are the foundation for containerization technologies like Docker and system services managed by systemd.

┌────────────────────────────────────────────────────────────────────────┐
│ CGROUPS ARCHITECTURE │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ LINUX KERNEL │ │
│ │ │ │
│ │ ┌──────────────────────────────────────────────────────────┐ │ │
│ │ │ cgroup v2 (unified hierarchy) │ │ │
│ │ │ │ │ │
│ │ │ /sys/fs/cgroup/ │ │ │
│ │ │ ├── cgroup.controllers │ │ │
│ │ │ ├── cgroup.subtree_control │ │ │
│ │ │ ├── cgroup.procs │ │ │
│ │ │ │ │ │ │
│ │ │ ├── system.slice/ │ │ │
│ │ │ ├── user.slice/ │ │ │
│ │ │ │ │ │ │
│ │ │ ├── docker/ │ │ │
│ │ │ │ └── abc123.../ │ │ │
│ │ │ │ │ │ │
│ │ │ └── mygroup/ │ │ │
│ │ │ ├── cgroup.procs │ │ │
│ │ │ ├── cpu.max │ │ │
│ │ │ ├── memory.max │ │ │
│ │ │ └── io.max │ │ │
│ │ └──────────────────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ User Space │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ systemd, Docker, containerd, runc, cgcreate, cgset │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────┘
Feature             cgroup v1                       cgroup v2
------------------  ------------------------------  ------------------------------------
Hierarchy           Multiple separate hierarchies   Single unified hierarchy
Controllers         Mounted separately              Unified under one mount
Process management  Per-controller tasks file       Single cgroup.procs
Thread model        Per-thread placement possible   Process-level (threaded mode opt-in)
Performance         More overhead                   Better performance
Compatibility       Legacy support                  Modern, recommended
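A quick way to tell which version a host is running is the filesystem type mounted at /sys/fs/cgroup (a minimal check, assuming the standard mount point):

```shell
# cgroup2fs => pure v2 (unified hierarchy); tmpfs => v1 or hybrid layout
stat -fc %T /sys/fs/cgroup/
```

On a v2 host this prints cgroup2fs; on a legacy or hybrid host it prints tmpfs, with per-controller directories mounted beneath it.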
┌────────────────────────────────────────────────────────────────────────┐
│ CGROUP CONTROLLERS │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┬──────────────────────────────────────────────────┐ │
│ │ Controller │ Purpose │ │
│ ├─────────────┼──────────────────────────────────────────────────┤ │
│ │ cpu │ CPU time scheduling │ │
│ │ cpuacct │ CPU usage accounting │ │
│ │ memory │ Memory allocation and usage │ │
│ │ io │ Block I/O control │ │
│ │ blkio │ Block I/O (v1 legacy) │ │
│ │ pids │ Process count limits │ │
│ │ devices │ Device access control │ │
│ │ net_cls │ Network traffic classification (v1) │ │
│ │ net_prio │ Network priority (v1) │ │
│ │ perf_event │ Performance monitoring │ │
│ │ freezer │ Suspend/resume processes │ │
│ │ hugetlb │ Huge pages limits │ │
│ │ rdma │ RDMA/IB resources │ │
│ └─────────────┴──────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────┘

Terminal window
# Check mounted cgroup filesystems
mount | grep cgroup
# List controllers (v1-style view)
cat /proc/cgroups
# Check if running v2: a unified hierarchy exposes cgroup.controllers at the root
ls -la /sys/fs/cgroup/
# Check the current process's cgroup ("0::/..." means v2)
cat /proc/self/cgroup
# Check systemd cgroup
systemd-cgls
Terminal window
# Create a new cgroup
sudo mkdir -p /sys/fs/cgroup/mylimit
# Add current shell to cgroup
echo $$ | sudo tee /sys/fs/cgroup/mylimit/cgroup.procs
# Or add specific process
echo <PID> | sudo tee /sys/fs/cgroup/mylimit/cgroup.procs
# List processes in cgroup
cat /sys/fs/cgroup/mylimit/cgroup.procs
# Verify which controllers are available in this cgroup
cat /sys/fs/cgroup/mylimit/cgroup.controllers
# Enable controllers for mylimit by writing to the PARENT's subtree_control
# (note the "+" prefix; "-" disables)
echo "+cpu +memory" | sudo tee /sys/fs/cgroup/cgroup.subtree_control
# Remove cgroup (must be empty)
sudo rmdir /sys/fs/cgroup/mylimit
Terminal window
# Set CPU limit (v2)
# Format: "$QUOTA $PERIOD" in microseconds, or "max" for unlimited
echo "50000 100000" | sudo tee /sys/fs/cgroup/mylimit/cpu.max
# 50000us out of every 100000us = 50% of one CPU
# (there is no percentage interface in v2; express percentages via cpu.max)
# Set CPU weight (relative share under contention)
echo 512 | sudo tee /sys/fs/cgroup/mylimit/cpu.weight
# Default is 100, range 1-10000
# Check CPU usage
cat /sys/fs/cgroup/mylimit/cpu.stat
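Since cpu.max takes an absolute quota rather than a percentage, a tiny helper can do the conversion; cpu_max_for is a hypothetical name, and it assumes the default 100000 microsecond period:

```shell
cpu_max_for() {   # hypothetical helper: cpu_max_for <percent-of-one-cpu>
  local pct=$1 period=100000
  echo "$(( pct * period / 100 )) $period"
}

cpu_max_for 50                       # prints: 50000 100000
# echo "$(cpu_max_for 50)" | sudo tee /sys/fs/cgroup/mylimit/cpu.max
```

Values over 100 are valid on multi-core machines (e.g. 200 means two full CPUs).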
Terminal window
# Set memory limit (hard limit)
echo "512M" | sudo tee /sys/fs/cgroup/mylimit/memory.max
# Set memory limit (soft limit, triggers reclaim)
echo "256M" | sudo tee /sys/fs/cgroup/mylimit/memory.high
# Disable memory limits
echo "max" | sudo tee /sys/fs/cgroup/mylimit/memory.max
# Check memory usage
cat /sys/fs/cgroup/mylimit/memory.current
cat /sys/fs/cgroup/mylimit/memory.stat
# Set swap limit (v2) - a single value: bytes, a K/M/G suffix, or "max"
echo "512M" | sudo tee /sys/fs/cgroup/mylimit/memory.swap.max
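memory.current and memory.stat report raw bytes, while the limits above use suffixes, so comparing them needs a conversion; to_bytes is a hypothetical helper using binary (1024-based) units:

```shell
to_bytes() {   # hypothetical helper: to_bytes 512M -> bytes
  local n=${1%[KMG]} unit=${1##*[0-9]}
  case $unit in
    K) echo $(( n * 1024 )) ;;
    M) echo $(( n * 1024 * 1024 )) ;;
    G) echo $(( n * 1024 * 1024 * 1024 )) ;;
    *) echo "$1" ;;                 # no suffix: already plain bytes
  esac
}

to_bytes 512M                       # prints: 536870912
# e.g. check current usage against the limit:
# [ "$(cat /sys/fs/cgroup/mylimit/memory.current)" -lt "$(to_bytes 512M)" ]
```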
Terminal window
# Set I/O limit (v2)
# io.max format: "MAJ:MIN rbps=N wbps=N riops=N wiops=N" ("max" = unlimited)
echo "8:0 rbps=104857600 wbps=104857600" | sudo tee /sys/fs/cgroup/mylimit/io.max
# io.weight (relative share): "default N" or "MAJ:MIN N", range 1-10000
echo "8:0 100" | sudo tee /sys/fs/cgroup/mylimit/io.weight
# Check I/O statistics
cat /sys/fs/cgroup/mylimit/io.stat
# Find a device's major:minor numbers
lsblk
Terminal window
# Set max number of processes
echo 100 | sudo tee /sys/fs/cgroup/mylimit/pids.max
# The limit applies to the whole subtree (child cgroups count too)
# Check current process count
cat /sys/fs/cgroup/mylimit/pids.current
# Count fork failures caused by hitting the limit
cat /sys/fs/cgroup/mylimit/pids.events
Terminal window
# Device access control is a v1 controller; the devices.* files live under
# the v1 hierarchy (in cgroup v2, device filtering is done with eBPF programs)
# Entry format: TYPE MAJ:MIN ACCESS, where TYPE is "a" (all), "c" (char),
# "b" (block) and ACCESS is any of r/w/m
# Allow /dev/null (char device 1:3)
echo "c 1:3 rwm" | sudo tee /sys/fs/cgroup/devices/mylimit/devices.allow
# Deny a block device
echo "b 8:0 rwm" | sudo tee /sys/fs/cgroup/devices/mylimit/devices.deny
# List allowed devices
cat /sys/fs/cgroup/devices/mylimit/devices.list

Terminal window
# Install tools (Debian/Ubuntu)
sudo apt-get install cgroup-tools
# Install (RHEL/CentOS)
sudo yum install libcgroup-tools
# Create cgroup with multiple controllers
sudo cgcreate -g cpu,memory,io:/mylimit
# List cgroups
ls /sys/fs/cgroup/
sudo lscgroup
Terminal window
# Set CPU limit (cgroup v1)
sudo cgset -r cpu.cfs_quota_us=50000 mylimit
sudo cgset -r cpu.cfs_period_us=100000 mylimit
# Set memory limit
sudo cgset -r memory.limit_in_bytes=512M mylimit
# Set IO limit
sudo cgset -r blkio.throttle.write_bps_device="8:0 104857600" mylimit
# Run command in cgroup
sudo cgexec -g cpu,memory:mylimit /bin/bash
# Set parameters after creation
sudo cgset -r cpu.shares=512 mylimit
# Delete cgroup
sudo cgdelete cpu,memory:/mylimit

Terminal window
# Run command with memory limit (transient scope)
systemd-run --scope -p MemoryMax=512M -p CPUWeight=200 /bin/bash
# Run with CPU limit
systemd-run --scope -p CPUQuota=50% /bin/yes
# Run with multiple limits
systemd-run --scope \
-p MemoryMax=1G \
-p CPUQuota=50% \
-p IOWeight=200 \
-p TasksMax=50 \
/bin/bash
# Check status
systemctl status run-xxxxx.scope
# View resource usage
systemctl show run-xxxxx.scope
Terminal window
# Create drop-in for service
sudo mkdir -p /etc/systemd/system/myservice.service.d
# Add resource limits
sudo tee /etc/systemd/system/myservice.service.d/limits.conf << 'EOF'
[Service]
MemoryMax=1G
CPUQuota=50%
TasksMax=100
IOWeight=200
EOF
# Reload and restart
sudo systemctl daemon-reload
sudo systemctl restart myservice
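It's worth confirming that the drop-in actually took effect; systemctl show prints the active values (note that CPUQuota surfaces under the property name CPUQuotaPerSecUSec, and MemoryMax is reported in bytes):

```shell
# Query the unit's active resource-control properties
systemctl show myservice -p MemoryMax -p CPUQuotaPerSecUSec -p TasksMax
```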

Namespaces partition kernel resources so that each namespace appears to have its own isolated instance of that resource. This provides isolation between processes, enabling containers to have their own view of system resources.

┌────────────────────────────────────────────────────────────────────────┐
│ NAMESPACES ARCHITECTURE │
├────────────────────────────────────────────────────────────────────────┤
│ │
│ Each namespace type wraps/isolates specific kernel resources: │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Host System │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ PID Namespace │ │ Net Namespace │ │ Mount NS │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ PID 1 │ │ eth0 │ │ / │ │ │
│ │ │ PID 2 │ │ lo │ │ /proc │ │ │
│ │ │ PID 3 │ │ routes │ │ /sys │ │ │
│ │ │ ... │ │ ... │ │ ... │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ User NS │ │ IPC Namespace│ │ UTS Namespace│ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ UID 0 │ │ mqueue │ │ hostname │ │ │
│ │ │ GID 0 │ │ shm │ │ domain │ │ │
│ │ │ capabilities │ │ semaphores │ │ │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────────┘ │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
│ Container Process: │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Sees its own PID 1, network stack, mount points, etc. │ │
│ │ Can be UID 0 inside while UID is 1000 on host │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────┘
Namespace  Flag             Isolates                             Use Case
---------  ---------------  -----------------------------------  -----------------------
PID        CLONE_NEWPID     Process IDs                          Process isolation
Network    CLONE_NEWNET     Network devices, stacks              Container networking
Mount      CLONE_NEWNS      Mount points                         Filesystem isolation
User       CLONE_NEWUSER    UIDs, GIDs, capabilities             Unprivileged containers
IPC        CLONE_NEWIPC     POSIX message queues, shared memory  Process communication
UTS        CLONE_NEWUTS     Hostname, domain name                Container identity
Cgroup     CLONE_NEWCGROUP  Cgroup root directory                Hide host cgroup tree
Time       CLONE_NEWTIME    Boot and monotonic clocks            Virtualization

Terminal window
# Create new PID namespace
sudo unshare --pid --fork --mount-proc /bin/bash
# Create network namespace
sudo ip netns add mynet
ip netns list
# Execute in network namespace
sudo ip netns exec mynet /bin/bash
ip link show
# Create mount namespace
sudo unshare --mount /bin/bash
# Mount changes now isolated
# Create user namespace (no root needed)
unshare --user --map-root-user /bin/bash
# The shell appears as UID 0, mapped to your real UID on the host
# Create UTS namespace
sudo unshare --uts /bin/bash
hostname container-host
# Create IPC namespace
sudo unshare --ipc /bin/bash
# Combine namespaces (--fork is required with --pid)
sudo unshare --pid --fork --mount --net --user /bin/bash
# Create all namespaces
unshare --all
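To see the PID isolation from the commands above in action, this sketch runs a one-liner in a fresh PID namespace; inside it, the shell sees itself as PID 1:

```shell
# --mount-proc remounts /proc so ps reflects the new namespace (needs root)
sudo unshare --pid --fork --mount-proc sh -c 'echo "my pid: $$"; ps -e'
```

The echo reports PID 1, and ps lists only the processes created inside the namespace.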
Terminal window
# List network namespaces
ip netns list
# Add network namespace
sudo ip netns add web
sudo ip netns add db
# Execute commands in namespace
sudo ip netns exec web ip link
sudo ip netns exec web ping 8.8.8.8
# Create virtual ethernet pair
sudo ip link add veth0 type veth peer name veth1
# Move one end to namespace
sudo ip link set veth1 netns web
# Configure in namespace
sudo ip netns exec web ip addr add 10.0.0.2/24 dev veth1
sudo ip netns exec web ip link set veth1 up
# Configure the host end
sudo ip addr add 10.0.0.1/24 dev veth0
sudo ip link set veth0 up
# Delete namespace
sudo ip netns delete web
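With both ends configured, connectivity can be verified in each direction (a sketch assuming the web namespace and the 10.0.0.0/24 addresses from the commands above):

```shell
# Bring up loopback inside the namespace (an often-forgotten step)
sudo ip netns exec web ip link set lo up
# Host -> namespace
ping -c 1 10.0.0.2
# Namespace -> host
sudo ip netns exec web ping -c 1 10.0.0.1
```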
Terminal window
# View namespace for current process
ls -la /proc/self/ns/
# View namespaces for specific process
ls -la /proc/<PID>/ns/
# Check process namespace
ls -la /proc/self/ns/ | grep -E "pid|net|mnt|user|ipc|uts"
# Read namespace inode to compare
stat /proc/self/ns/pid
# Check if two processes share a namespace (same inode = same namespace)
readlink /proc/<PID1>/ns/net /proc/<PID2>/ns/net
Terminal window
# Enter namespaces of a running process using nsenter (needs root)
sudo nsenter -t <PID> -n /bin/bash # Network namespace
sudo nsenter -t <PID> -m /bin/bash # Mount namespace
sudo nsenter -t <PID> -p /bin/bash # PID namespace
sudo nsenter -t <PID> -u /bin/bash # UTS namespace
sudo nsenter -t <PID> -i /bin/bash # IPC namespace
sudo nsenter -t <PID> -U /bin/bash # User namespace
# Enter several at once
sudo nsenter -t <PID> -m -u -i -n -p /bin/bash

#!/bin/bash
# Create isolated container with both cgroups and namespaces
# Create cgroup (assumes cpu, memory and cpuset controllers are enabled
# in the root cgroup.subtree_control)
sudo mkdir -p /sys/fs/cgroup/container
echo 512M | sudo tee /sys/fs/cgroup/container/memory.max
echo 50 | sudo tee /sys/fs/cgroup/container/cpu.weight
# Pin to CPUs 0-1 and memory node 0 (cpuset controller)
echo "0-1" | sudo tee /sys/fs/cgroup/container/cpuset.cpus
echo "0" | sudo tee /sys/fs/cgroup/container/cpuset.mems
# Put the current shell (and its children) into the cgroup
echo $$ | sudo tee /sys/fs/cgroup/container/cgroup.procs
# Create network namespace
sudo ip netns add container_net
# Create the remaining namespaces and run a shell inside them
sudo unshare --mount --pid --fork --net --user --cgroup /bin/bash << 'CMDS'
# Inside container
hostname container
ip link set lo up
echo "Container running with limits:"
cat /sys/fs/cgroup/container/memory.max
cat /sys/fs/cgroup/container/cpu.weight
/bin/bash
CMDS

Docker uses both cgroups and namespaces:

Terminal window
# Docker automatically creates cgroups and namespaces
docker run -d --name mycontainer nginx
# Check cgroup for container
docker inspect mycontainer | grep -i cgroup
# Check namespaces (get the container's host PID first)
ls -la /proc/$(docker inspect -f '{{.State.Pid}}' mycontainer)/ns/
# Resource limits (cgroups)
docker run -d --name limited \
--memory=512m \
--cpus=0.5 \
--cpu-shares=512 \
nginx
# Check limits
docker stats mycontainer

#!/bin/bash
# Limit a process using systemd-run
# Memory limited
systemd-run --scope -p MemoryMax=512M \
-p MemorySwapMax=256M \
/bin/bash -c 'while true; do echo "Running"; sleep 1; done'
# CPU limited
systemd-run --scope -p CPUQuota=50% \
/bin/bash -c 'while true; do echo $((i++)); done'
# IO limited
systemd-run --scope -p IOWeight=100 \
/bin/bash -c 'dd if=/dev/zero of=/tmp/test bs=1M count=100'
/etc/systemd/system/resource-limits.service
[Unit]
Description=Application with Resource Limits
[Service]
Type=simple
ExecStart=/usr/bin/myapp
MemoryMax=1G
MemoryHigh=512M
CPUQuota=50%
IOWeight=200
TasksMax=50
[Install]
WantedBy=multi-user.target
#!/bin/bash
# Create isolated environment for testing
# Create user namespace
unshare --user --map-root-user /bin/bash
# Or with systemd
systemd-run --user --scope -p MemoryMax=100M /bin/bash
# Check isolation
id
# Shows UID 0 inside, mapped to your actual user outside
# Note: with only a user namespace, /proc/1 is still the host's init;
# PID isolation additionally requires --pid --fork --mount-proc
cat /proc/1/cmdline

Terminal window
# cgroup filesystem not mounted
sudo mount -t cgroup2 none /sys/fs/cgroup
# Permission denied with cgroups
# Ensure you are root or have the needed capabilities
# Namespace operation not permitted
# Check your capabilities
capsh --print
# unshare fails
# Check the kernel config (e.g. CONFIG_NAMESPACES, CONFIG_USER_NS)
grep -E 'CONFIG_NAMESPACES|CONFIG_USER_NS' /boot/config-$(uname -r)
# Cannot create network namespace
# Check that iproute2 is installed
# cgroup memory limit not working
# Ensure you are writing to the correct cgroup version's files
Terminal window
# Check cgroup for process
cat /proc/<PID>/cgroup
# Check systemd cgroup
systemd-cgls | grep <PID>
# Check namespace
ls -la /proc/<PID>/ns/
# Check capabilities
cat /proc/<PID>/status | grep Cap
# View cgroup resource usage
cat /sys/fs/cgroup/mylimit/memory.stat
cat /sys/fs/cgroup/mylimit/cpu.stat
# Watch cgroup events (e.g. OOM kills); these files support poll/inotify
cat /sys/fs/cgroup/mylimit/memory.events
cat /sys/fs/cgroup/mylimit/cgroup.events

Q1: What is the difference between cgroups and namespaces?


Answer:

  • cgroups (control groups): Limit and account for resources (CPU, memory, I/O, etc.). They control how much of a resource a process can use.
  • Namespaces: Isolate global system resources so processes see different views. They control what resources a process can see/access.

Together they form the foundation of Linux containers: namespaces provide isolation, while cgroups provide resource limits.

Q2: What are the different types of namespaces in Linux?


Answer:

  1. PID - Process ID isolation (each container has PID 1)
  2. Network - Network devices, stacks, ports
  3. Mount - Mount points and filesystems
  4. User - User and group ID mapping
  5. IPC - POSIX message queues, shared memory, semaphores
  6. UTS - Hostname and domain name
  7. Time - System time (newer)

Q3: How do you limit CPU usage for a process in Linux?


Answer: Using cgroups v2:

Terminal window
# Create cgroup (as root)
mkdir /sys/fs/cgroup/limit
echo "50000 100000" > /sys/fs/cgroup/limit/cpu.max
# Add process
echo <PID> > /sys/fs/cgroup/limit/cgroup.procs

Or with systemd:

Terminal window
systemd-run --scope -p CPUQuota=50% /bin/bash

Q4: How does Docker use cgroups and namespaces?


Answer: When you run a container:

  1. Docker creates a new cgroup for the container
  2. Applies resource limits (memory, CPU, etc.) to the cgroup
  3. Creates namespaces for isolation
  4. The container process runs inside these cgroup and namespace limits

Each container gets its own cgroup hierarchy and set of namespaces.
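This can be observed from the host; a sketch assuming a running container named mycontainer:

```shell
# Find the container's PID on the host, then inspect its cgroup and namespaces
pid=$(docker inspect -f '{{.State.Pid}}' mycontainer)
cat /proc/"$pid"/cgroup        # shows the per-container cgroup path
ls -la /proc/"$pid"/ns/        # namespace links differ from a host process's
```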

Q5: What is the difference between cgroup v1 and v2?


Answer:

  • v1: Multiple separate hierarchies per controller, less efficient
  • v2: Single unified hierarchy for all controllers, better performance, simpler structure

Most modern systems use v2. You can check with:

Terminal window
mount | grep cgroup
ls /sys/fs/cgroup/

Q6: How do you create a network namespace and configure it?


Answer:

Terminal window
# Create namespace
sudo ip netns add mynet
# Add interfaces
sudo ip link add veth0 type veth peer name veth1
sudo ip link set veth1 netns mynet
# Configure in namespace
sudo ip netns exec mynet ip addr add 10.0.0.2/24 dev veth1
sudo ip netns exec mynet ip link set veth1 up
sudo ip netns exec mynet ip link set lo up
# Execute in namespace
sudo ip netns exec mynet /bin/bash

Q7: What are the security implications of using namespaces?


Answer: Namespaces provide isolation but not complete security:

  • Not a security boundary by themselves (user namespace is special)
  • Need additional security (AppArmor, SELinux, seccomp)
  • Container escapes can still occur
  • Shared kernel means kernel vulnerabilities affect all containers

Best practice: Use defense in depth - namespaces + cgroups + AppArmor/SELinux + seccomp

Q8: How do you limit memory for a process using cgroups?


Answer: Using cgroups v2:

Terminal window
# Create cgroup (as root)
mkdir /sys/fs/cgroup/limit
# Set memory limit
echo 512M > /sys/fs/cgroup/limit/memory.max
# Add process
echo <PID> > /sys/fs/cgroup/limit/cgroup.procs

Or with systemd:

Terminal window
systemd-run --scope -p MemoryMax=512M /bin/bash

Terminal window
# Create cgroup (v2)
mkdir /sys/fs/cgroup/group
echo $$ > /sys/fs/cgroup/group/cgroup.procs
# Set limits
echo 500M > /sys/fs/cgroup/group/memory.max
echo "50000 100000" > /sys/fs/cgroup/group/cpu.max
# Using systemd
systemd-run --scope -p MemoryMax=512M -p CPUQuota=50%
# Create namespaces
unshare --pid --fork /bin/bash
ip netns add mynet
# Enter namespace
nsenter -t <PID> -n /bin/bash
Path                                   Purpose
-------------------------------------  -------------------------
/sys/fs/cgroup/                        cgroup v2 mount point
/proc/self/cgroup                      Process cgroup membership
/proc/self/ns/                         Process namespaces
/sys/fs/cgroup/…/cgroup.procs          Processes in a cgroup
/sys/fs/cgroup/…/cgroup.controllers    Available controllers

In this chapter, you learned:

  • ✅ cgroups architecture (v1 vs v2)
  • ✅ Creating and managing cgroups
  • ✅ Resource limits (CPU, memory, I/O, processes)
  • ✅ systemd integration
  • ✅ Namespace types and purposes
  • ✅ Creating and using namespaces
  • ✅ Network namespaces in detail
  • ✅ Combining cgroups and namespaces
  • ✅ Practical examples
  • ✅ Interview questions and answers

Chapter 84: eBPF


Last Updated: February 2026