Linux Practical Interview Questions (1501-1750)

Q1501: How do you implement SELinux policies?

Answer:

# Check SELinux status
getenforce
sestatus
# SELinux contexts
# View file contexts
ls -Z /var/www/html
ls -Zd /var/www/html
# View process contexts
ps auxZ | grep nginx
# Change context
chcon -t httpd_sys_content_t /var/www/html/file.html
semanage fcontext -a -t httpd_sys_content_t "/web(/.*)?"
restorecon -Rv /web
# Boolean values
getsebool -a
setsebool -P httpd_can_network_connect on
# Create custom policy module
# 1. Generate Type Enforcement file
# myapp.te
module myapp 1.0;
require {
type httpd_t;
type myapp_log_t;
class file { read write };
}
allow httpd_t myapp_log_t:file { read write };
# 2. Compile and install
checkmodule -M -m -o myapp.mod myapp.te
semodule_package -o myapp.pp -m myapp.mod
semodule -i myapp.pp

Q1502: How do you configure AppArmor profiles?

Answer:

# Install AppArmor
apt install apparmor apparmor-utils
# View profiles
aa-status
ls /etc/apparmor.d/
# Create profile
aa-genprof /usr/bin/myapp
# Profile syntax
# /etc/apparmor.d/usr.bin.myapp
#include <tunables/global>
/usr/bin/myapp {
#include <abstractions/base>
#include <abstractions/bash>
# Allow read /etc
/etc/** r,
# Allow write to log
/var/log/myapp/* w,
# Deny access
deny /etc/shadow r,
deny /var/log/secure w,
# Network
network inet stream,
}
# Enable/disable
aa-disable /usr/bin/myapp
aa-enforce /usr/bin/myapp
aa-complain /usr/bin/myapp
# Reload
apparmor_parser -r /etc/apparmor.d/usr.bin.myapp

Q1503: How do you implement Linux capabilities?

Answer:

# View capabilities
# File capabilities
getcap -r /usr/bin/
# Process capabilities
cat /proc/$$/status | grep Cap
# Set file capabilities
setcap 'cap_net_raw+ep' /usr/bin/ping
getcap /usr/bin/ping
# Remove capabilities
setcap -r /usr/bin/ping
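# Grant capabilities to specific users at login
# Using pam_cap
# /etc/security/capability.conf
# none root
# cap_net_raw user1
# cap_net_admin user2
# Set capabilities in code
# In C (link with -lcap)
#include <sys/capability.h>
cap_value_t cap_list[] = { CAP_NET_RAW };
cap_t caps = cap_get_proc();
/* Raise CAP_NET_RAW in the effective set; it must already be in the permitted set */
cap_set_flag(caps, CAP_EFFECTIVE, 1, cap_list, CAP_SET);
cap_set_proc(caps);
cap_free(caps);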
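
The `Cap*` lines in `/proc/<pid>/status` are hexadecimal bitmasks, so checking for one capability means testing a single bit. The sketch below assumes the capability numbering from `<linux/capability.h>`, where `CAP_NET_RAW` is number 13; `has_cap` is a hypothetical helper name, not part of libcap.

```shell
# has_cap MASK BIT: succeed if capability number BIT is set in the hex MASK.
# has_cap is a hypothetical helper; MASK may carry an optional 0x prefix.
has_cap() {
    _mask=$(( 0x${1#0x} ))
    [ $(( (_mask >> $2) & 1 )) -eq 1 ]
}

# CAP_NET_RAW is capability number 13 in <linux/capability.h>.
CAP_NET_RAW=13

# Read the effective set of the current shell; fall back to an empty mask.
cap_eff=$(awk '/^CapEff:/ {print $2}' "/proc/$$/status" 2>/dev/null)
if has_cap "${cap_eff:-0}" "$CAP_NET_RAW"; then
    echo "CAP_NET_RAW is effective"
else
    echo "CAP_NET_RAW is not effective"
fi
```

Where libcap is installed, `capsh --decode=<mask>` prints the full list of names for a mask.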

Q1504: How do you secure Linux system services?

Answer:

# Disable unnecessary services
systemctl mask service_name
systemctl disable service_name
# View active services
systemctl list-units --type=service --state=running
# Secure SSH
# /etc/ssh/sshd_config
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
MaxAuthTries 3
ClientAliveInterval 300
X11Forwarding no
AllowUsers user1 user2
DenyUsers root
# Secure Cron
# /etc/cron.allow (only these users)
# /etc/cron.deny (deny these users)
# Secure at
# /etc/at.allow
# Secure system limits
# /etc/security/limits.conf
* hard locks 100
* soft nproc 512

Q1505: How do you implement user authentication security?

Answer:

# Configure PAM
# /etc/pam.d/common-auth
auth required pam_tally2.so deny=3 unlock_time=600 onerr=fail
# Password policy
# /etc/pam.d/common-password
password required pam_pwhistory.so remember=5
password [default=1] pam_permit.so
password requisite pam_cracklib.so try_first_pass retry=3 minlen=12 dcredit=-1 ucredit=-1 lcredit=-1 ocredit=-1
# Set password expiry
# /etc/login.defs
PASS_MAX_DAYS 90
PASS_MIN_DAYS 1
PASS_WARN_AGE 14
# For user
passwd -x 90 -w 14 -n 1 username
chage -M 90 -W 14 username
# View aging
chage -l username
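
The `pam_cracklib` line above asks for at least 12 characters with at least one digit, one uppercase letter, one lowercase letter and one other character. A minimal sketch of those length and class rules follows; `check_password` is a hypothetical helper, and it does not reproduce the dictionary or similarity checks that cracklib also performs.

```shell
# check_password PASSWORD: apply the length and character-class rules from
# minlen=12 dcredit=-1 ucredit=-1 lcredit=-1 ocredit=-1.
# Hypothetical helper for illustration; not a replacement for PAM.
check_password() {
    pw=$1
    if [ "${#pw}" -lt 12 ]; then
        echo "too short"; return 1
    fi
    case $pw in *[[:digit:]]*) ;; *) echo "needs a digit"; return 1 ;; esac
    case $pw in *[[:upper:]]*) ;; *) echo "needs an uppercase letter"; return 1 ;; esac
    case $pw in *[[:lower:]]*) ;; *) echo "needs a lowercase letter"; return 1 ;; esac
    case $pw in *[![:alnum:]]*) ;; *) echo "needs a non-alphanumeric character"; return 1 ;; esac
    echo "ok"
}

check_password 'Correct-Horse-42'
```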

Q1506: How do you configure firewall rules?

Answer:

# Basic iptables rules
# Flush existing rules
iptables -F
iptables -X
iptables -t nat -F
iptables -t mangle -F
# Default policies
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT ACCEPT
# Allow loopback
iptables -A INPUT -i lo -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT
# Allow established connections
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
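# Allow SSH (drop a source on its 4th new connection within 60 seconds)
iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -m recent --set
iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -m recent --update --seconds 60 --hitcount 4 -j DROP
iptables -A INPUT -p tcp --dport 22 -j ACCEPT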
# Allow HTTP/HTTPS
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -j ACCEPT
# Save rules
iptables-save > /etc/iptables/rules.v4
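
The per-port ACCEPT rules above all follow one pattern, so they can be generated from a port list and reviewed before anything is applied. A minimal sketch, assuming TCP-only services; `gen_allow_rules` is a hypothetical helper that only prints commands and never calls iptables itself.

```shell
# gen_allow_rules PORT...: print one ACCEPT rule per TCP port.
# Hypothetical helper; output is meant to be reviewed, then piped to sh as root.
gen_allow_rules() {
    for port in "$@"; do
        case $port in
            ''|*[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
        esac
        if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
            echo "port out of range: $port" >&2; return 1
        fi
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
}

# Dry run: show the rules for HTTP and HTTPS.
gen_allow_rules 80 443
```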

Q1507: How do you implement network segmentation?

Answer:

# Create network namespaces
ip netns add dmz
ip netns add internal
# Configure VLANs
ip link add link eth0 name eth0.100 type vlan id 100
ip addr add 192.168.100.1/24 dev eth0.100
ip link set eth0.100 up
# Bridge isolation
ip link add name br-dmz type bridge
ip link set eth1 master br-dmz
ip link set eth2 master br-dmz
# iptables zone-based firewall
iptables -N DMZ-ZONE
iptables -N INTERNAL-ZONE
iptables -N EXTERNAL-ZONE
# DMZ rules
iptables -A DMZ-ZONE -p tcp --dport 80 -j ACCEPT
iptables -A DMZ-ZONE -p tcp --dport 443 -j ACCEPT
iptables -A DMZ-ZONE -j REJECT
# Internal rules
iptables -A INTERNAL-ZONE -j ACCEPT
# MASQUERADE is only valid in the nat table's POSTROUTING chain
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

Answer:

# Install Snort
apt install snort
# Configure
# /etc/snort/snort.conf
ipvar HOME_NET 192.168.1.0/24
ipvar EXTERNAL_NET !$HOME_NET
# Custom rules
# /etc/snort/rules/local.rules
# Alert on ICMP
alert icmp any any -> $HOME_NET any (msg:"ICMP Ping"; sid:1000001; rev:1;)
# Alert on SSH attempts
alert tcp any any -> $HOME_NET 22 (msg:"SSH Connection Attempt"; \
flow:to_server,established; content:"SSH"; nocase; sid:1000002; rev:1;)
# Alert on port scan
alert tcp any any -> $HOME_NET any (msg:"Port Scan"; \
flow:to_server; detection_filter:track by_src,count 5,seconds 10; \
sid:1000003; rev:1;)
# Run snort
snort -c /etc/snort/snort.conf -i eth0
# Suricata (modern alternative)
apt install suricata
suricata -c /etc/suricata/suricata.yaml -i eth0

Q1509: How do you implement DDoS protection?

Answer:

# Rate limiting with iptables
# Limit connections per IP
iptables -A INPUT -p tcp --dport 80 -m conntrack --ctstate NEW \
-m recent --set
iptables -A INPUT -p tcp --dport 80 -m conntrack --ctstate NEW \
-m recent --update --seconds 60 --hitcount 20 -j DROP
# Limit ICMP
iptables -A INPUT -p icmp --icmp-type echo-request \
-m hashlimit --hashlimit-above 1/sec --hashlimit-burst 4 \
--hashlimit-htable-size 100000 --hashlimit-mode srcip \
--hashlimit-name icmp_limit -j DROP
# SYN flood protection
# /etc/sysctl.conf
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_synack_retries=2
net.ipv4.tcp_max_syn_backlog=4096
# Application layer
# Nginx rate limiting
limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
limit_req zone=general burst=20 nodelay;

Answer:

# WireGuard setup
# Generate keys
wg genkey | tee private.key | wg pubkey > public.key
# Server configuration
# /etc/wireguard/wg0.conf
[Interface]
PrivateKey = <server-private-key>
Address = 10.0.0.1/24
ListenPort = 51820
PostUp = iptables -A FORWARD -i wg0 -j ACCEPT
PostUp = iptables -A FORWARD -o wg0 -j ACCEPT
PostUp = iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
PostDown = iptables -D FORWARD -i wg0 -j ACCEPT
PostDown = iptables -D FORWARD -o wg0 -j ACCEPT
PostDown = iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE
[Peer]
PublicKey = <client-public-key>
AllowedIPs = 10.0.0.2/32
# Client configuration
[Interface]
PrivateKey = <client-private-key>
Address = 10.0.0.2/24
[Peer]
PublicKey = <server-public-key>
Endpoint = server.example.com:51820
AllowedIPs = 0.0.0.0/0
PersistentKeepalive = 25
# Bring up the tunnel
wg-quick up wg0

Q1511: How do you secure kernel parameters?

Answer:

# /etc/sysctl.conf
# Network security
net.ipv4.conf.all.rp_filter=1
net.ipv4.conf.default.rp_filter=1
net.ipv4.icmp_echo_ignore_broadcasts=1
net.ipv4.icmp_ignore_bogus_error_responses=1
net.ipv4.conf.all.accept_redirects=0
net.ipv4.conf.default.accept_redirects=0
net.ipv4.conf.all.send_redirects=0
net.ipv4.conf.default.send_redirects=0
net.ipv4.conf.all.accept_source_route=0
net.ipv4.conf.default.accept_source_route=0
net.ipv4.tcp_timestamps=0
# Kernel security
kernel.dmesg_restrict=1
kernel.kptr_restrict=2
kernel.yama.ptrace_scope=2
kernel.sysrq=0
# Memory protection
vm.mmap_min_addr=65536
vm.swappiness=10
# Apply
sysctl -p
sysctl --system
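
After applying the settings it helps to confirm that the running kernel matches the file. The sketch below assumes `key=value` lines without spaces, as in the listing above; `audit_sysctl` and the `SYSCTL_GET` override are hypothetical names, the latter existing only so the function can be exercised without reading live kernel state.

```shell
# audit_sysctl: read "key=value" lines on stdin and compare each with the
# running value. Returns non-zero if any setting differs.
# SYSCTL_GET is a hypothetical hook; by default the values come from "sysctl -n".
audit_sysctl() {
    _rc=0
    while IFS='=' read -r key want; do
        # Skip blank lines and comments.
        case $key in ''|'#'*) continue ;; esac
        have=$(${SYSCTL_GET:-sysctl -n} "$key" 2>/dev/null) || have=""
        if [ "$have" = "$want" ]; then
            echo "OK   $key=$have"
        else
            echo "FAIL $key: want $want, have ${have:-unset}"
            _rc=1
        fi
    done
    return $_rc
}
```

A typical call would be `audit_sysctl < /etc/sysctl.conf`, run as root so that restricted keys are readable.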

Answer:

# Install grsecurity
# Option 1: Use precompiled kernel
wget https://kernels.org/pub/linux/kernel/v4.x/linux-4.14.12-grsec.tar.xz
# Option 2: Use PaX/gradm
apt install paxctl gradm
# PaX flags (uppercase enables, lowercase disables)
paxctl -C /usr/bin/nginx # Create the PT_PAX_FLAGS header
paxctl -M /usr/bin/nginx # Enable MPROTECT
paxctl -S /usr/bin/nginx # Enable SEGMEXEC
paxctl -R /usr/bin/nginx # Enable RANDMMAP
# gradm configuration
# /etc/gradm/admin
# admin:password:0:0
# Enable learning mode
gradm -L /etc/gradm/learning
# Run application in learning mode
gradm -L /etc/gradm/learning -E /usr/bin/nginx
# Compile rules
gradm -F -O /etc/gradm/default.policies
# Enable
gradm -e nginx

Q1513: How do you implement mandatory access control?

Answer:

# SELinux configuration
# /etc/selinux/config
SELINUX=enforcing
SELINUXTYPE=targeted
# Create custom policy
# myapp.te
policy_module(myapp, 1.0)
type myapp_t;
type myapp_exec_t;
# Declares myapp_t as a daemon domain in system_r, entered from init via myapp_exec_t
init_daemon_domain(myapp_t, myapp_exec_t)
# Compile and install
make -f /usr/share/selinux/devel/Makefile myapp.pp
semodule -i myapp.pp
# AppArmor configuration
# Already covered in previous question
# SMACK (Simplified Mandatory Access Control Kernel)
# Enable in kernel
# CONFIG_SECURITY_SMACK=y
# Configure
# /etc/smack/accesses
# Format: subject object access
_ _ r
root myapp rw
myapp _ rw

Q1514: How do you implement container security?

Answer:

# Docker security
# Run without privileges
docker run --rm -it --cap-drop ALL --user 1000:1000 nginx
# Read-only root filesystem
docker run --rm -it --read-only nginx
# Resource limits
docker run --rm -it --memory=256m --cpus=0.5 nginx
# Network isolation
docker run --rm -it --network none nginx
# SELinux/AppArmor
docker run --rm -it --security-opt apparmor=docker-default nginx
# Seccomp profile (Docker's default profile applies when no option is given)
docker run --rm -it --security-opt seccomp=/path/to/profile.json nginx
# Rootless Docker
# Install
apt install docker-ce-rootless-extras
# Setup
dockerd-rootless-setuptool.sh install
# Verify
docker info
# Check capabilities (requires capsh inside the image)
docker run --rm -it nginx capsh --print

Answer:

# UEFI secure boot
# Check status
mokutil --sb-state
# Enroll keys
mokutil --import key.der
# GRUB password
# Generate hash
grub-mkpasswd-pbkdf2
# Add to /etc/grub.d/40_custom
set superusers="admin"
password_pbkdf2 admin grub.pbkdf2.sha512...hash...
# Rebuild GRUB
update-grub
# Disable USB storage (prevents the usb-storage module from loading)
# /etc/modprobe.d/blacklist-usb.conf
install usb-storage /bin/true
# Boot kernel parameters
# /etc/default/grub
GRUB_CMDLINE_LINUX="lockdown=integrity"
# TPM measured boot
# Install
apt install tpm2-tools
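# Read the measured PCR values
tpm2_pcrread
# Quote selected PCRs (ak.ctx is a previously created attestation key context)
tpm2_quote -c ak.ctx -l sha256:0,1,2,3 -q abc123 -m quote.msg -s quote.sig -g sha256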

Q1516: How do you configure advanced routing?

Answer:

# Policy routing
# Add table
echo "200 wan2" >> /etc/iproute2/rt_tables
# Add route
ip route add default via 192.168.2.1 dev eth1 table wan2
# Add rule
ip rule add from 192.168.2.10 table wan2
ip rule add to 192.168.2.0/24 table wan2
# NAT with iptables
# SNAT
iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth0 -j SNAT --to-source 203.0.113.10
# DNAT
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 8080 -j DNAT --to-destination 192.168.1.10:80
# Masquerade
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
# Multi-path routing
ip route add default scope global nexthop via 192.168.1.1 dev eth0 weight 1 \
nexthop via 192.168.2.1 dev eth1 weight 1

Q1517: How do you configure network bonding for high availability?

Answer:

# Load bonding module
modprobe bonding mode=1 miimon=100
# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BONDING_OPTS="mode=1 miimon=100 primary=eth0"
IPADDR=192.168.1.10
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
ONBOOT=yes
# /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
# /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
MASTER=bond0
SLAVE=yes
ONBOOT=yes
# Mode 4 (LACP)
# /etc/sysconfig/network-scripts/ifcfg-bond0
BONDING_OPTS="mode=4 miimon=100 lacp_rate=1"
# Monitor
cat /proc/net/bonding/bond0
# ethtool
ethtool -S bond0

Q1518: How do you configure IPv6 security?

Answer:

# /etc/sysctl.conf
# Disable IPv6 (only if it is not in use)
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1
# Or via GRUB
# GRUB_CMDLINE_LINUX="ipv6.disable=1"
# IPv6 firewall rules
ip6tables -F
ip6tables -P INPUT DROP
ip6tables -P FORWARD DROP
ip6tables -P OUTPUT ACCEPT
# Allow established
ip6tables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# Allow ICMPv6
ip6tables -A INPUT -p ipv6-icmp -j ACCEPT
# Allow SSH
ip6tables -A INPUT -p tcp --dport 22 -j ACCEPT
# Block routing header
ip6tables -A INPUT -m rt --rt-type 0 -j DROP
# RA guard
# On switch or router
# Configure Router Advertisement filtering

Q1519: How do you implement Quality of Service?

Answer:

# Traffic control with tc
# Create qdisc
tc qdisc add dev eth0 root handle 1: htb default 10
# Create classes
tc class add dev eth0 parent 1: classid 1:10 htb rate 10Mbit ceil 10Mbit
tc class add dev eth0 parent 1: classid 1:20 htb rate 5Mbit ceil 5Mbit
# Filter traffic
tc filter add dev eth0 parent 1: protocol all prio 1 u32 match ip dst 192.168.1.10 flowid 1:20
# Example: Prioritize SSH (alternative root qdisc; band 1:1 is dequeued first)
tc qdisc add dev eth0 root handle 1: prio
tc filter add dev eth0 parent 1: protocol ip prio 10 u32 match ip dport 22 0xffff flowid 1:1
tc filter add dev eth0 parent 1: protocol ip prio 20 u32 match ip sport 22 0xffff flowid 1:1
# View
tc qdisc show dev eth0
tc class show dev eth0
tc filter show dev eth0

Answer:

# DNSSEC with BIND
# Enable validation in named.conf
dnssec-validation auto;
# Sign zone (generate a KSK and a ZSK first)
dnssec-keygen -a RSASHA256 -b 2048 -f KSK -n ZONE example.com
dnssec-keygen -a RSASHA256 -b 2048 -n ZONE example.com
dnssec-signzone -S -o example.com db.example.com
# Configure resolver
# /etc/bind/named.conf.options
# (dnssec-enable and dnssec-lookaside are obsolete in current BIND releases)
options {
dnssec-validation auto;
};
# Test DNSSEC
dig +dnssec example.com
dig +cd secure.example.com
# Unbound configuration
# /etc/unbound/unbound.conf
server:
val-log-level: 2
harden-glue: yes
harden-dnssec-stripped: yes
use-caps-for-id: yes
# Query validation
drill -S example.com

Q1521: How do you analyze CPU performance?

Answer:

# CPU info
lscpu
cat /proc/cpuinfo
# CPU usage over time
sar -u 1
# Per-CPU usage
mpstat -P ALL 1
# Process CPU usage
top
ps aux --sort=-%cpu
pidstat -p <pid> 1
# CPU steal (virtualization)
vmstat 1
# Scheduler
# View process priority
ps -eo pid,ni,pri,comm
# CPU affinity
taskset -c 0-3 program
taskset -p 0xF <pid>
# Check CPU frequency
cpupower frequency-info
cpupower frequency-set -g performance
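
`taskset -c` takes a CPU list while `taskset -p` takes a hexadecimal mask, and the two forms above describe the same set (`0-3` is `0xF`). A minimal sketch of the conversion, assuming fewer than 63 CPUs so the mask fits in shell arithmetic; `cpulist_to_mask` is a hypothetical helper name.

```shell
# cpulist_to_mask LIST: convert a CPU list such as "0-3,8" to a hex mask.
# Hypothetical helper; each CPU n sets bit n of the mask.
cpulist_to_mask() {
    _mask=0
    _old_ifs=$IFS
    IFS=','
    for _part in $1; do
        case $_part in
            *-*) _lo=${_part%-*}; _hi=${_part#*-} ;;
            *)   _lo=$_part; _hi=$_part ;;
        esac
        _cpu=$_lo
        while [ "$_cpu" -le "$_hi" ]; do
            _mask=$(( _mask | (1 << _cpu) ))
            _cpu=$(( _cpu + 1 ))
        done
    done
    IFS=$_old_ifs
    printf '0x%x\n' "$_mask"
}

cpulist_to_mask 0-3   # prints 0xf
```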

Q1522: How do you analyze memory performance?

Answer:

# Memory info
free -h
cat /proc/meminfo
# Memory usage over time
vmstat 1
sar -B 1
# Per-process memory
ps aux --sort=-%mem
pmap -x <pid>
cat /proc/<pid>/status | grep -i vm
# Memory allocation issues
# Check for OOM
dmesg | grep -i "out of memory"
cat /var/log/syslog | grep -i oom
# Slab info
slabtop
# Huge pages
cat /proc/meminfo | grep -i huge
# Transparent huge pages
cat /sys/kernel/mm/transparent_hugepage/enabled
# Memory pressure
cat /proc/pressure/memory
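
The `MemTotal` and `MemAvailable` fields of `/proc/meminfo` give a quick estimate of how much memory is genuinely in use. A minimal sketch that reads the meminfo format on stdin; `mem_used_pct` is a hypothetical helper, and the result is truncated to a whole percentage.

```shell
# mem_used_pct: read /proc/meminfo-style text on stdin and print the
# percentage of memory that is not available, as an integer.
# Hypothetical helper; fails if MemTotal is missing or zero.
mem_used_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END {
             if (total > 0) printf "%d\n", (total - avail) * 100 / total
             else exit 1
         }'
}

if [ -r /proc/meminfo ]; then
    mem_used_pct < /proc/meminfo
fi
```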

Q1523: How do you analyze I/O performance?

Answer:

# I/O statistics
iostat -xz 1
sar -d 1
# Per-process I/O
iotop
pidstat -d 1
# Block device info
lsblk
blkid
# I/O scheduler
cat /sys/block/sda/queue/scheduler
# Multi-queue kernels name it mq-deadline; older kernels use deadline
echo mq-deadline > /sys/block/sda/queue/scheduler
# Queue depth
cat /sys/block/sda/queue/nr_requests
# Check for I/O waits
vmstat 1
# File system performance
# Read-ahead
cat /sys/block/sda/queue/read_ahead_kb
# Trace I/O
blktrace -d /dev/sda -o trace
blkparse -i trace

Q1524: How do you analyze network performance?

Answer:

# Network statistics
netstat -s
ss -s
# Per-interface statistics
ip -s link
netstat -i
# Connection states
ss -tan state established
ss -tan state time-wait
# Bandwidth monitoring
iftop
nethogs
# Packet capture
tcpdump -i eth0
tcpdump -i eth0 -w capture.pcap
# Network latency
ping -c 4 host
traceroute host
mtr host
# TCP analysis
# TCP retransmits
netstat -s | grep -i retrans
# Connection tracking
conntrack -L
# Socket statistics
ss -tulpn
lsof -i
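
Rather than running `ss` once per state, the states can be tallied in a single pass. A minimal sketch, assuming the usual `ss -tan` layout with one header line and the state in the first column; `count_states` is a hypothetical helper name.

```shell
# count_states: read "ss -tan" output on stdin and print "STATE COUNT" lines.
# Hypothetical helper; the first line is treated as the header and skipped.
count_states() {
    awk 'NR > 1 { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}

if command -v ss >/dev/null 2>&1; then
    ss -tan | count_states
fi
```

A large TIME-WAIT or SYN-RECV count in this summary is often the first hint of connection churn or a SYN flood.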

Q1525: How do you use performance profiling tools?

Answer:

# perf
perf record -g -p <pid>
perf report
perf top
# Flame graph
# Install
git clone https://github.com/brendangregg/FlameGraph.git
# Generate
perf record -F 99 -g -p <pid>
perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > flame.svg
# System-wide profiling
perf record -a -g -- sleep 10
perf report
# Valgrind
valgrind --tool=cachegrind ./program
cg_annotate cachegrind.out.*
# gprof
gcc -pg -g program.c -o program
./program
gprof program gmon.out > analysis.txt
# strace
strace -c -p <pid>
strace -T -tt -p <pid>

Answer:

# Create RAID5
mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[b-e]1
# Create RAID10
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[b-e]1
# Create RAID1 with spare
mdadm --create /dev/md0 --level=1 --raid-devices=2 --spare-devices=1 /dev/sd[b-d]1
# Monitor
mdadm --detail /dev/md0
cat /proc/mdstat
# Add to /etc/mdadm.conf
mdadm --examine --scan >> /etc/mdadm.conf
# Manage (a member must be marked failed before it can be removed)
mdadm /dev/md0 --fail /dev/sdb1
mdadm /dev/md0 --remove /dev/sdb1
mdadm /dev/md0 --add /dev/sdf1
# Stop/start
mdadm --stop /dev/md0
mdadm --assemble --scan
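
The RAID levels above trade capacity for redundancy in different ways. A rough sketch of the usable capacity for equally sized members, ignoring metadata overhead and assuming the default two-copy layout for RAID 10; `raid_usable` is a hypothetical helper name.

```shell
# raid_usable LEVEL DEVICES SIZE: print usable capacity in the unit of SIZE.
# Hypothetical helper; assumes equal-sized members and no spares.
raid_usable() {
    case $1 in
        0)  echo $(( $2 * $3 )) ;;          # striping, no redundancy
        1)  echo "$3" ;;                    # every member holds a full copy
        5)  echo $(( ($2 - 1) * $3 )) ;;    # one member's worth of parity
        6)  echo $(( ($2 - 2) * $3 )) ;;    # two members' worth of parity
        10) echo $(( $2 * $3 / 2 )) ;;      # two copies of each stripe
        *)  echo "unsupported level: $1" >&2; return 1 ;;
    esac
}

raid_usable 5 4 1000   # four 1000 GB members in RAID 5: prints 3000
```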

Answer:

# Create physical volumes
pvcreate /dev/sdb1 /dev/sdc1
pvdisplay
# Create volume group
vgcreate vg_data /dev/sdb1
vgextend vg_data /dev/sdc1
vgdisplay
# Move extents between physical volumes in the same volume group
pvmove /dev/sdb1 /dev/sdc1
# Create logical volume
lvcreate -L 10G -n lv_data vg_data
lvcreate -l 100%FREE -n lv_backup vg_data
# Create thin pool
lvcreate -L 100G --thinpool vg_data/thin_pool
lvcreate -V 10G --thin -n lv_thin vg_data/thin_pool
# Snapshot
lvcreate -s -L 5G -n lv_snap vg_data/lv_data
# Resize (-r also resizes the filesystem; XFS cannot be shrunk)
lvextend -r -L +10G /dev/vg_data/lv_data
lvreduce -r -L -5G /dev/vg_data/lv_data
# Remove
lvremove /dev/vg_data/lv_data
vgremove vg_data
pvremove /dev/sdb1

Q1528: How do you configure encrypted filesystems?

Answer:

# LUKS encryption
cryptsetup luksFormat /dev/sdb1
cryptsetup luksOpen /dev/sdb1 encrypted
mkfs.xfs /dev/mapper/encrypted
# Add key
cryptsetup luksAddKey /dev/sdb1
# Backup header
cryptsetup luksHeaderBackup /dev/sdb1 --header-backup-file header.img
# Auto unlock
# /etc/crypttab
encrypted /dev/sdb1 none luks
# /etc/fstab
/dev/mapper/encrypted /mnt/data xfs defaults 0 2
# eCryptfs
mount -t ecryptfs /backup /encrypted
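# Or use fscrypt (native ext4 encryption)
mkfs.ext4 -O encrypt /dev/sda1
mount /dev/sda1 /mnt
fscrypt setup
fscrypt setup /mnt
# fscrypt encrypts an empty directory, not the filesystem root
mkdir /mnt/private
fscrypt encrypt /mnt/private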

Answer:

# Server configuration
# /etc/exports
/data 192.168.1.0/24(rw,sync,no_subtree_check,root_squash,sec=krb5p)
/secure 192.168.1.0/24(rw,sync,sec=sys)
# Kerberized NFS
# Server
# /etc/exports
/data gss/krb5p(rw,sync,no_subtree_check)
# Export
exportfs -av
# Client
mount -t nfs -o sec=krb5p server:/data /mnt
# Security options
# sec=sys - UID/GID mapping
# sec=krb5 - Authentication only
# sec=krb5i - Integrity
# sec=krb5p - Privacy
# Firewall
# Allow NFS
iptables -A INPUT -p tcp --dport 2049 -j ACCEPT
iptables -A INPUT -p udp --dport 2049 -j ACCEPT
# Test
nfsstat -c
showmount -e server

Answer:

# Enable quota
# /etc/fstab
/dev/sda1 /home ext4 usrquota,grpquota 0 2
# Remount
mount -o remount /home
# Initialize quota
quotacheck -cug /home
# Enable quota
quotaon /home
# Set user quota
edquota -u username
# Edit soft/hard limits
# Set group quota
edquota -g groupname
# View quota
quota -u username
quota -g groupname
repquota -a
# Copy quota template
edquota -p template_user new_user
# Email reports
# /etc/cron.daily/quotas
quotacheck -avug
repquota -a | mail -s "Quota Report" admin@example.com

Q1531: How do you configure Apache advanced?

Answer:

# Virtual host with SSL
<VirtualHost *:443>
ServerName example.com
DocumentRoot /var/www/html
SSLEngine on
SSLCertificateFile /etc/ssl/certs/server.crt
SSLCertificateKeyFile /etc/ssl/private/server.key
SSLCertificateChainFile /etc/ssl/certs/ca.crt
<Directory /var/www/html>
Options -Indexes +FollowSymLinks
AllowOverride None
Require all granted
</Directory>
# Security headers
Header always set X-Frame-Options "SAMEORIGIN"
Header always set X-Content-Type-Options "nosniff"
Header always set X-XSS-Protection "1; mode=block"
# Performance
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5
# Compression
AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css application/javascript
</VirtualHost>
# Load balancer
<Proxy balancer://mycluster>
BalancerMember http://192.168.1.10:8080 route=node1
BalancerMember http://192.168.1.11:8080 route=node2
ProxySet lbmethod=byrequests
</Proxy>
ProxyPass / balancer://mycluster/
ProxyPassReverse / balancer://mycluster/

Q1532: How do you configure Nginx advanced?

Answer:

# Worker configuration
worker_processes auto;
worker_rlimit_nofile 65535;
events {
worker_connections 65535;
use epoll;
multi_accept on;
}
http {
# Logging
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;
error_log /var/log/nginx/error.log warn;
# Performance
open_file_cache max=10000 inactive=30s;
open_file_cache_valid 60s;
open_file_cache_min_uses 2;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
# Upstream with health check
upstream backend {
least_conn;
server 192.168.1.10:8080 max_fails=3 fail_timeout=30s;
server 192.168.1.11:8080 max_fails=3 fail_timeout=30s;
keepalive 32;
}
server {
location / {
proxy_pass http://backend;
proxy_http_version 1.1;
proxy_set_header Connection "";
}
}
}

Q1533: How do you configure PostgreSQL high availability?

Answer:

# Streaming replication
# /etc/postgresql/14/main/postgresql.conf
wal_level = replica
max_wal_senders = 3
wal_keep_size = 1GB
hot_standby = on
# Master: /etc/postgresql/14/main/pg_hba.conf
host replication replicator 192.168.1.0/24 md5
# Create replication user
psql -c "CREATE USER replicator REPLICATION LOGIN PASSWORD 'secret';"
# Backup on replica
pg_basebackup -h master -D /var/lib/postgresql/14/main -U replicator -P -Xs
# Replica: /etc/postgresql/14/main/postgresql.conf
hot_standby = on
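# Replica: /etc/postgresql/14/main/postgresql.conf (recovery.conf was removed in PostgreSQL 12)
primary_conninfo = 'host=master port=5432 user=replicator password=secret'
promote_trigger_file = '/tmp/promote'
# Replica: mark the data directory as a standby
touch /var/lib/postgresql/14/main/standby.signal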
# pgBouncer for connection pooling
# /etc/pgbouncer/pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb
[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20

Q1534: How do you configure Redis Sentinel?

Answer:

# Sentinel configuration
# /etc/redis/sentinel.conf
port 26379
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
# Start sentinel
redis-sentinel /etc/redis/sentinel.conf
# Client connection
# Python example
from redis.sentinel import Sentinel
sentinel = Sentinel([('localhost', 26379)], socket_timeout=0.1)
master = sentinel.master_for('mymaster', socket_timeout=0.1)
slave = sentinel.slave_for('mymaster', socket_timeout=0.1)
# Commands
redis-cli -p 26379 INFO SENTINEL
redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster
# Failover
# Sentinel automatically promotes replica to master
# Old master becomes replica when back online

Q1535: How do you configure MySQL Cluster?

Answer:

# MySQL NDB Cluster
# Install
apt install mysql-cluster-community-server
# Cluster configuration (kept on the management node)
# /etc/mysql/config.ini
[ndb_mgmd]
NodeId=1
HostName=192.168.1.10
DataDir=/var/lib/mysql-cluster
# Data nodes
[ndbd]
NodeId=2
HostName=192.168.1.11
DataDir=/var/lib/mysql-cluster
[ndbd]
NodeId=3
HostName=192.168.1.12
DataDir=/var/lib/mysql-cluster
# SQL node
[mysqld]
NodeId=4
# SQL node settings in /etc/mysql/my.cnf
# [mysqld]
# ndbcluster
# ndb-connectstring=192.168.1.10
# Start management node
ndb_mgmd -f /etc/mysql/config.ini
# Start data nodes (--initial only on the first start)
ndbd --initial --ndb-connectstring=192.168.1.10
# Start SQL node
mysqld --ndbcluster --ndb-connectstring=192.168.1.10
# Check status
ndb_mgm -e show

Answer:

# Create encrypted file
ansible-vault create secrets.yml
# Encrypt existing file
ansible-vault encrypt secrets.yml
# Edit encrypted file
ansible-vault edit secrets.yml
# View encrypted file
ansible-vault view secrets.yml
# Decrypt file
ansible-vault decrypt secrets.yml
# Change password
ansible-vault rekey secrets.yml
# Use in playbook
# site.yml
- hosts: all
  vars_files:
    - secrets.yml
  tasks:
    - name: Create user
      user:
        name: "{{ db_user }}"
        # The user module expects a hashed password
        password: "{{ db_password | password_hash('sha512') }}"
# Run with vault password
ansible-playbook site.yml --ask-vault-pass
# or
ansible-playbook site.yml --vault-password-file ~/.vault_pass.txt

Answer:

# Create role structure
ansible-galaxy init nginx
# Role structure
# nginx/
# ├── defaults/
# │ └── main.yml
# ├── handlers/
# │ └── main.yml
# ├── meta/
# │ └── main.yml
# ├── tasks/
# │ └── main.yml
# ├── templates/
# │ └── nginx.conf.j2
# ├── tests/
# │ ├── inventory
# │ └── test.yml
# └── vars/
# └── main.yml
# defaults/main.yml
nginx_port: 80
nginx_workers: 4
# tasks/main.yml
- name: Install nginx
  apt:
    name: nginx
    state: present
- name: Configure nginx
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: restart nginx
# handlers/main.yml
- name: restart nginx
  service:
    name: nginx
    state: restarted
# Use role
# playbook.yml
- hosts: webservers
  roles:
    - nginx

Answer:

# Module structure
# modules/
# ├── ec2/
# │ ├── main.tf
# │ ├── variables.tf
# │ └── outputs.tf
# └── vpc/
# ├── main.tf
# ├── variables.tf
# └── outputs.tf
# ec2/variables.tf
variable "instance_type" {
description = "EC2 instance type"
type = string
default = "t3.micro"
}
variable "ami_id" {
description = "AMI ID"
type = string
}
# ec2/outputs.tf
output "instance_id" {
value = aws_instance.this.id
}
# Main configuration
# main.tf
module "vpc" {
source = "./modules/vpc"
cidr_block = "10.0.0.0/16"
}
module "ec2" {
source = "./modules/ec2"
ami_id = "ami-0c55b159cbfafe1f0"
instance_type = "t3.micro"
vpc_id = module.vpc.vpc_id
}

Answer:

# Cookbook structure
# myapp/
# ├── metadata.rb
# ├── recipes/
# │ └── default.rb
# ├── templates/
# │ └── config.erb
# └── attributes/
# └── default.rb
# metadata.rb
name 'myapp'
version '1.0.0'
depends 'nginx'
# attributes/default.rb
default['myapp']['port'] = 8080
default['myapp']['workers'] = 4
# recipes/default.rb
package 'myapp'
template '/etc/myapp/config.yml' do
source 'config.erb'
mode '0644'
variables(
port: node['myapp']['port']
)
end
service 'myapp' do
action [:enable, :start]
end
# Use cookbook
# Run list
chef-client -r "recipe[myapp]"

Answer:

# Module structure
# nginx/
# ├── manifests/
# │ ├── init.pp
# │ └── config.pp
# ├── templates/
# │ └── nginx.conf.erb
# └── files/
# └── index.html
# manifests/init.pp
class nginx {
package { 'nginx':
ensure => installed,
}
service { 'nginx':
ensure => running,
enable => true,
hasrestart => true,
}
}
# manifests/config.pp
class nginx::config inherits nginx {
file { '/etc/nginx/nginx.conf':
ensure => file,
content => template('nginx/nginx.conf.erb'),
require => Package['nginx'],
notify => Service['nginx'],
}
}
# Use module
# site.pp
node 'webserver.example.com' {
include nginx
include nginx::config
}

Q1541: How do you configure Kubernetes networking?

Answer:

# CNI plugins
# Install Flannel
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# Install Calico
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
# Network policies
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
EOF
# Allow specific traffic
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
spec:
  podSelector:
    matchLabels:
      app: frontend
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: backend
EOF
# Service mesh (Istio)
istioctl install --set profile=demo
kubectl label namespace default istio-injection=enabled

Q1542: How do you configure Kubernetes storage?

Answer:

# PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  hostPath:
    path: /mnt/data
---
# PersistentVolumeClaim (storageClassName must match the PV to bind to it)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
# StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: regional-pd
---
# Use in Pod
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myapp
      image: nginx
      volumeMounts:
        - name: my-storage
          mountPath: /data
  volumes:
    - name: my-storage
      persistentVolumeClaim:
        claimName: my-pvc

Q1543: How do you configure Kubernetes security?

Answer:

# RBAC
kubectl create serviceaccount myapp
kubectl create role myapp-reader --verb=get,list --resource=pods
kubectl create rolebinding myapp-reader-binding --role=myapp-reader --serviceaccount=default:myapp
# Use service account in pod
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  serviceAccountName: myapp
  containers:
    - name: myapp
      image: nginx
# Pod Security Policy (removed in Kubernetes 1.25; newer clusters use Pod Security Admission)
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAsNonRoot
  fsGroup:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
# Network policies
# See previous question
# Secrets
kubectl create secret generic mysecret \
--from-literal=username=admin \
--from-literal=password=secret

Q1544: How do you configure Helm workflows?

Answer:

# Create chart
helm create myapp
# Add dependencies
# Chart.yaml
dependencies:
  - name: nginx
    version: "1.0.0"
    repository: "https://charts.bitnami.com/bitnami"
# Fetch dependencies
# update: resolve versions from Chart.yaml and refresh Chart.lock
helm dependency update
# build: rebuild charts/ from the existing Chart.lock
helm dependency build
# Template functions
# values.yaml
replicaCount: 3
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "myapp.fullname" . }}
spec:
  replicas: {{ .Values.replicaCount }}
# Common functions
{{ .Values.image.repository }}:{{ .Values.image.tag }}
{{ include "myapp.fullname" . }}
{{ .Release.Name }}
{{ .Release.Namespace }}
# Hooks
# Declared as annotations on a template manifest
apiVersion: v1
kind: Pod
metadata:
  name: backup
  annotations:
    "helm.sh/hook": pre-install
    "helm.sh/hook-weight": "10"

Q1545: How do you implement GitOps with ArgoCD?

Answer:

# Install ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Get password
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
# Create application
kubectl apply -f - <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/repo.git
    targetRevision: HEAD
    path: k8s/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF
# Sync
argocd app sync myapp
argocd app get myapp
# Sync waves
# Add annotations to resources
# metadata:
# annotations:
# argocd.argoproj.io/sync-wave: "1"

Q1546: How do you troubleshoot kernel issues?

Answer:

# Kernel messages
dmesg
dmesg | tail -100
# Kernel panic
# Enable kdump
apt install kdump-tools
kdump-config load
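# Test kdump (WARNING: crashes the running kernel immediately; test systems only)
echo c > /proc/sysrq-trigger
# Analyze crash (needs a vmlinux with debug symbols; paths vary by distribution)
crash /usr/lib/debug/boot/vmlinux-$(uname -r) /var/crash/<timestamp>/vmcore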
# Kernel config
zcat /proc/config.gz
# or
cat /boot/config-$(uname -r)
# System calls
strace -c program
strace -f program
# ftrace
echo function > /sys/kernel/debug/tracing/current_tracer
cat /sys/kernel/debug/tracing/trace
# perf
perf record -g program
perf report

Q1547: How do you troubleshoot network issues?

Answer:

# Interface status
ip link show
ip addr show
ethtool eth0
# Routing
ip route
ip route get 8.8.8.8
# DNS
dig example.com
getent hosts example.com
# Connectivity
ping -c 4 8.8.8.8
traceroute 8.8.8.8
# Port status
netstat -tulpn
ss -tulpn
# Capture
tcpdump -i eth0 host 192.168.1.1
tcpdump -i eth0 port 80
# Firewall
iptables -L -n -v
iptables -t nat -L -n -v
# TCP issues
# Retransmits
netstat -s | grep -i retrans
# Connection states
ss -tan state time-wait
# ARP
ip neigh show
arp -a

Q1548: How do you troubleshoot storage issues?

Answer:

# Disk usage
df -h
df -i
# Find large files
find / -type f -size +100M 2>/dev/null | head -20
# I/O stats
iostat -xz 1
sar -d 1
# Mount issues
mount
cat /proc/mounts
# Filesystem check
fsck -n /dev/sda1
# LVM issues
lvs
pvs
vgs
lvdisplay
# NFS issues
showmount -e server
mount -v server:/share /mnt
# SMART status
smartctl -a /dev/sda
smartctl -H /dev/sda
# Lsof for deleted files
lsof +L1

Q1549: How do you troubleshoot service issues?

Answer:

# Service status
systemctl status service
systemctl list-failed
# Logs
journalctl -u service -n 50
journalctl -u service --since "1 hour ago"
journalctl -xe
# Process
ps auxf | grep service
lsof -p $(pgrep -f service)
# Configuration
apachectl configtest
nginx -t
# Dependencies
systemctl list-dependencies service
systemctl is-active service
# Resources
cat /proc/$(pgrep -f service)/limits
# Network
netstat -tulpn | grep service
# Environment
cat /proc/$(pgrep -f service)/environ | tr '\0' '\n'
# Cgroups
systemd-cgls | grep service

Q1550: How do you debug application issues?

Answer:

# Core dumps
# Enable
ulimit -c unlimited
# /etc/security/limits.conf
* soft core unlimited
# Generate core
gcore <pid>
# Analyze
gdb program core
(gdb) bt
(gdb) info threads
# Memory leaks
valgrind --leak-check=full program
# Performance profiling
perf record -g program
perf report
# Python debugging
python -m pdb program.py
python -m cProfile program.py
# Java debugging
jstack <pid>
# JDK 8 only; on JDK 9+ use: jhsdb jmap --heap --pid <pid>
jmap -heap <pid>
jmap -dump:format=b,file=heap.bin <pid>
# Node.js debugging
node --inspect program.js
chrome://inspect

Q1551: How do you implement zero-downtime restarts?

Answer:

# Nginx graceful reload
nginx -s reload
# or
systemctl reload nginx
# HAProxy
# Reload without downtime
systemctl reload haproxy
# Application with SIGTERM handling
# In application code
import signal
import sys
def sigterm_handler(signum, frame):
    # Stop accepting new connections
    # Wait for existing connections to complete
    # Then exit
    print("Shutting down gracefully...")
    sys.exit(0)
signal.signal(signal.SIGTERM, sigterm_handler)
# Kubernetes rolling update
kubectl set image deployment/myapp myapp=myapp:v2
kubectl rollout status deployment/myapp
# Rollback if needed
kubectl rollout undo deployment/myapp

Q1552: How do you implement feature toggles?

Answer:

# Simple feature toggle
class FeatureToggle:
    def __init__(self):
        self.features = {}
    def enable(self, feature):
        self.features[feature] = True
    def disable(self, feature):
        self.features[feature] = False
    def is_enabled(self, feature):
        return self.features.get(feature, False)
# Usage
toggle = FeatureToggle()
toggle.enable('new_ui')
if toggle.is_enabled('new_ui'):
    show_new_ui()
else:
    show_old_ui()
# Environment-based
import os
if os.getenv('FEATURE_NEW_UI') == '1':
    show_new_ui()
# Database-backed (db is a placeholder for your database client)
def is_feature_enabled(feature_name):
    result = db.query("SELECT enabled FROM features WHERE name = ?", feature_name)
    return result.enabled if result else False
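
The toggles above are all-or-nothing. A common extension is a percentage rollout, where each user is hashed into a stable bucket so the same user always sees the same variant. The sketch below is one possible way to do that; the function names `rollout_bucket` and `is_enabled_for` are illustrative assumptions, not part of any library.

```python
import hashlib


def rollout_bucket(feature: str, user_id: str) -> int:
    """Map (feature, user) to a stable bucket in the range 0..99."""
    # hashlib is used instead of hash() because hash() is randomized per process
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled_for(feature: str, user_id: str, percent: int) -> bool:
    """Enable the feature for roughly `percent` percent of users."""
    return rollout_bucket(feature, user_id) < percent
```

Because the bucket depends only on the feature name and user ID, raising `percent` from 10 to 50 keeps every user who already had the feature enabled.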

Q1553: How do you implement circuit breaker pattern?

Answer:

import time
class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failures = 0
        self.last_failure_time = None
        self.state = "CLOSED"  # CLOSED, OPEN, HALF_OPEN
    def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            # After the timeout, let one trial call through (HALF_OPEN)
            if time.time() - self.last_failure_time > self.timeout:
                self.state = "HALF_OPEN"
            else:
                raise Exception("Circuit breaker is OPEN")
        try:
            result = func(*args, **kwargs)
            self._on_success()
            return result
        except Exception:
            self._on_failure()
            raise
    def _on_success(self):
        self.failures = 0
        self.state = "CLOSED"
    def _on_failure(self):
        self.failures += 1
        self.last_failure_time = time.time()
        if self.failures >= self.failure_threshold:
            self.state = "OPEN"
# Usage
breaker = CircuitBreaker()
result = breaker.call(risky_api_call)
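
The OPEN to HALF_OPEN transition depends on wall-clock time, which makes it awkward to test. One option, sketched below, is to inject the clock as a parameter. `ClockedCircuitBreaker` and `CircuitOpenError` are hypothetical names for this illustration, and reopening immediately on a failed trial call is a design assumption rather than a requirement of the pattern.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class ClockedCircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.clock = clock  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None
        self.state = "CLOSED"

    def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            if self.clock() - self.opened_at >= self.timeout:
                self.state = "HALF_OPEN"  # allow one trial call
            else:
                raise CircuitOpenError("circuit is open")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        self.failures = 0
        self.state = "CLOSED"

    def _on_failure(self):
        self.failures += 1
        # A failed trial call in HALF_OPEN reopens the circuit immediately
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"
            self.opened_at = self.clock()
```

In a test, passing `clock=lambda: fake_now[0]` lets the timeout elapse without sleeping.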

Q1554: How do you implement rate limiting?

Answer:

import time
from collections import defaultdict
class RateLimiter:
    def __init__(self, max_requests, time_window):
        self.max_requests = max_requests
        self.time_window = time_window
        self.requests = defaultdict(list)
    def is_allowed(self, key):
        now = time.time()
        # Remove old requests
        self.requests[key] = [
            req_time for req_time in self.requests[key]
            if now - req_time < self.time_window
        ]
        if len(self.requests[key]) >= self.max_requests:
            return False
        self.requests[key].append(now)
        return True
# Usage (Flask)
from flask import Flask, request
app = Flask(__name__)
limiter = RateLimiter(100, 60)
@app.route('/api')
def api():
    if not limiter.is_allowed(request.remote_addr):
        return "Too many requests", 429
    # Process request
    return "OK"
# Redis-based (distributed, fixed window)
import redis
class RedisRateLimiter:
    def __init__(self, redis_client, max_requests, time_window):
        self.redis = redis_client
        self.max_requests = max_requests
        self.time_window = time_window
    def is_allowed(self, key):
        current = self.redis.incr(key)
        if current == 1:
            # INCR and EXPIRE are separate commands; if the client dies
            # between them, the key never expires
            self.redis.expire(key, self.time_window)
        return current <= self.max_requests
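
The limiters above are a sliding log and a fixed window. Another widely used approach is the token bucket, which permits short bursts while capping the long-run rate. The following is a minimal single-process sketch; the class name `TokenBucket` and the injectable `clock` parameter are assumptions made for illustration.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock        # injectable for testing
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

For per-client limiting, one bucket per key (for example in a dictionary keyed by client address) would be needed; this sketch is not thread-safe.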

Q1555: How do you implement service discovery?

Answer:

# Consul
# Install
apt install consul
# Configuration
# /etc/consul/config.json
{
"datacenter": "dc1",
"data_dir": "/var/consul",
"ui_config": {
"enabled": true
},
"retry_join": ["provider=aws tag_key=consul tag_value=server"],
"server": true,
"bootstrap_expect": 3
}
# Register service
# /etc/consul/service.json
{
"service": {
"name": "web",
"port": 80,
"check": {
"http": "http://localhost:80/health",
"interval": "10s"
}
}
}
# DNS interface
# Query service
dig @127.0.0.1 -p 8600 web.service.consul
# HTTP API
curl http://127.0.0.1:8500/v1/catalog/service/web
# Register in code
import consul
c = consul.Consul()
# Register service
c.agent.service.register(
'web',
service_id='web-1',
port=80,
check=consul.Check.http('http://localhost:80/health', '10s')
)

Q1556: How do you implement observability?

Answer:

# Distributed tracing with Jaeger
# Client integration
# Python (the Jaeger exporter packages are deprecated in favor of OTLP in recent OpenTelemetry releases)
from opentelemetry import trace
from opentelemetry.exporter.jaeger.thrift import JaegerExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
trace.set_tracer_provider(TracerProvider())
jaeger_exporter = JaegerExporter(
    agent_host_name="jaeger",
    agent_port=6831,
)
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(jaeger_exporter)
)
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("operation") as span:
    span.set_attribute("key", "value")
    # Do work
# Metrics with Prometheus
from prometheus_client import Counter, generate_latest
requests_total = Counter('requests_total', 'Total requests')
@app.route('/')
def hello():
    requests_total.inc()
    return 'Hello'
# Export metrics
@app.route('/metrics')
def metrics():
    return generate_latest()
# Structured logging
import logging
import json
logger = logging.getLogger(__name__)
logger.info("Request processed", extra={
    "user_id": user.id,
    "duration_ms": duration
})

Q1557: How do you implement chaos engineering?

Answer:

# Chaos Mesh
# Install
helm repo add chaos-mesh https://charts.chaos-mesh.org
helm install chaos-mesh chaos-mesh/chaos-mesh -n chaos-mesh --create-namespace
# Pod failure experiment
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-failure
spec:
  action: pod-failure
  mode: one
  duration: 60s
  selector:
    namespaces:
      - default
    labelSelectors:
      app: myapp
# Network chaos
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-delay
spec:
  action: delay
  mode: one
  duration: 60s
  selector:
    namespaces:
      - default
  delay:
    latency: 100ms
# Litmus
# Install
helm repo add litmuschaos https://litmuschaos.github.io/litmus-helm
helm install litmuschaos litmuschaos/litmus
# Use with AWS
# Simulate EC2 instance termination
aws ec2 terminate-instances --instance-ids i-1234567890abcdef0

Q1558: How do you implement multi-tenancy?

Answer:

# Kubernetes namespaces
kubectl create namespace tenant1
kubectl create namespace tenant2
# Resource quotas
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota
  namespace: tenant1
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
# Limit ranges
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-limits
  namespace: tenant1
spec:
  limits:
    - max:
        cpu: "2"
        memory: "4Gi"
      min:
        cpu: "100m"
        memory: "128Mi"
      type: Container
# RBAC (admin is a built-in ClusterRole, so bind it with --clusterrole)
kubectl create rolebinding tenant1-admin \
  --clusterrole=admin \
  --user=user1 \
  --namespace=tenant1
# Network policies
# The namespaceSelector below matches namespace labels, so label the namespace first
kubectl label namespace tenant1 tenant=tenant1
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cross-namespace
  namespace: tenant1
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              tenant: tenant1
EOF

Q1559: How do you implement disaster recovery?

Answer:

# Backup Kubernetes
# ETCD backup
ETCDCTL_API=3 etcdctl snapshot save backup.db \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
# Restore ETCD (offline operation: writes a new data directory; no endpoint or certificates needed)
ETCDCTL_API=3 etcdctl snapshot restore backup.db \
  --data-dir=/var/lib/etcd-restore
# Then point etcd at the restored data directory and restart it
# Velero (Kubernetes backup)
# Install with the velero CLI (provider, plugin version, bucket, region and credentials are placeholders)
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:<version> \
  --bucket <bucket> \
  --backup-location-config region=<region> \
  --secret-file ./credentials-velero
# Backup
velero backup create backup-2024-01-01 --include-namespaces default
# Restore
velero restore create --from-backup backup-2024-01-01
# Schedule backups (standard cron syntax: daily at 02:00)
velero schedule create daily --schedule="0 2 * * *"
# Database backup
mysqldump -u root -p mydb > backup.sql
pg_dump -U postgres mydb > backup.sql
# Object storage
aws s3 sync /data s3://bucket/backup/

Q1560: How do you implement security scanning?

Answer:

# Container scanning
# Trivy
trivy image myimage:latest
trivy image --severity HIGH,CRITICAL myimage:latest
trivy image --exit-code 1 --severity CRITICAL myimage:latest
# Clair
docker run -p 5432:5432 -d quay.io/coreos/clair:latest
clair-scanner myimage
# Infrastructure scanning
# Kube-bench
kube-bench run --targets node
# Kube-hunter (run as a pod inside the cluster)
kubectl run kube-hunter --rm -it --restart=Never --image=aquasec/kube-hunter -- --pod
# SAST
# Bandit (Python)
bandit -r myapp/
# Semgrep
semgrep --config=auto mycode/
# DAST
# OWASP ZAP
zap-baseline.py -t https://myapp.example.com
# Secret scanning
# TruffleHog
trufflehog filesystem myrepo/
# gitleaks (v8 syntax; older releases used --path)
gitleaks detect --source mydir --verbose

Q1561: How do you implement backup strategy?

Answer:

# 3-2-1 backup rule
# 3 copies of data
# 2 different storage types
# 1 offsite copy
# Backup types
# Full backup
tar -czf full-backup-$(date +%Y%m%d).tar.gz /data
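# Incremental backup (GNU tar snapshot file)
# The first run with a new snapshot file is a full (level 0) backup;
# later runs with the same file archive only what changed since the previous run
tar -czf backup-$(date +%Y%m%d).tar.gz -g /var/log/backup.snar /data
# Differential backup
# Archive files changed since the date of the last full backup
tar -czf differential-$(date +%Y%m%d).tar.gz -N "2024-01-01" /data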
# Database backup
mysqldump -u root -p --all-databases > all-databases-$(date +%Y%m%d).sql
pg_dumpall -U postgres > all-databases-$(date +%Y%m%d).sql
# Automated backup script
#!/bin/bash
# Abort on errors, unset variables, and failed pipeline stages
set -euo pipefail
BACKUP_DIR="/backup"
DATE=$(date +%Y%m%d)
# Database
mysqldump -u root mydb | gzip > "$BACKUP_DIR/mydb-$DATE.sql.gz"
# Files
tar -czf "$BACKUP_DIR/files-$DATE.tar.gz" /data
# Retention: delete backups older than 30 days
find "$BACKUP_DIR" -type f -mtime +30 -delete
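
A flat 30-day cutoff discards every older backup. If longer-term restore points are wanted, the retention decision can be made explicit. The sketch below roughly mirrors the 30-day rule and additionally keeps first-of-month backups; that monthly rule and the function name `backups_to_delete` are assumptions for illustration, not part of the script above.

```python
from datetime import date, timedelta


def backups_to_delete(backup_dates, today, keep_days=30, keep_monthly=True):
    """Return the backup dates that fall outside the retention policy."""
    cutoff = today - timedelta(days=keep_days)
    doomed = []
    for d in sorted(backup_dates):
        if d >= cutoff:
            continue  # still inside the retention window
        if keep_monthly and d.day == 1:
            continue  # keep first-of-month backups as long-term copies
        doomed.append(d)
    return doomed
```

Computing the list first and deleting in a separate step makes it easy to log or dry-run the policy before removing anything.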

Q1562: How do you implement monitoring strategy?

Answer:

# Prometheus + Grafana
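# Install (the old stable/prometheus-operator chart is deprecated; use kube-prometheus-stack)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack \
  --set grafana.service.type=LoadBalancer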
# Define metrics
# node_exporter
# - node_cpu_seconds_total
# - node_memory_MemTotal_bytes
# - node_filesystem_size_bytes
# Custom application metrics
from prometheus_client import Counter, Histogram
requests_total = Counter('app_requests_total', 'Total requests')
# The documentation string is a required argument
processing_duration = Histogram('app_processing_duration_seconds', 'Request processing duration in seconds')
# Alerting rules
# prometheus.rules
groups:
  - name: node
    rules:
      - alert: HighCPU
        expr: 100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
# AlertManager
# alertmanager.yaml
route:
  group_by: ['alertname']
  receiver: 'team'
receivers:
  - name: 'team'
    email_configs:
      - to: 'team@example.com'
    slack_configs:
      - api_url: 'https://hooks.slack.com/...'

Q1563: How do you implement logging strategy?

Answer:

# ELK Stack
# Filebeat
filebeat.inputs:
  - type: log
    paths:
      - /var/log/*.log
    fields:
      type: syslog
    fields_under_root: true
output.logstash:
  hosts: ["logstash:5044"]
# Logstash
input { beats { port => 5044 } }
filter {
if [type] == "syslog" {
grok {
match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:hostname} %{DATA:program}: %{GREEDYDATA:message}" }
}
}
}
output {
elasticsearch {
hosts => ["elasticsearch:9200"]
index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
}
}
# Kibana
# Create index pattern
# Create dashboards
# Structured logging
# JSON format
import logging
import json
class JSONFormatter(logging.Formatter):
    def format(self, record):
        log_data = {
            'timestamp': self.formatTime(record),
            'level': record.levelname,
            'message': record.getMessage(),
            'module': record.module
        }
        return json.dumps(log_data)
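
A formatter only takes effect once it is attached to a handler. The sketch below shows one way to wire it up and to carry request-specific fields into the JSON output. Passing them as `extra={"context": {...}}` is a convention assumed for this example, and `build_logger` is a hypothetical helper.

```python
import io
import json
import logging


class JSONFormatter(logging.Formatter):
    def format(self, record):
        log_data = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            "module": record.module,
        }
        # Assumed convention: callers pass extra={"context": {...}}
        context = getattr(record, "context", None)
        if context:
            log_data.update(context)
        return json.dumps(log_data)


def build_logger(name, stream):
    """Create a logger that writes one JSON object per line to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JSONFormatter())
    logger.handlers = [handler]
    return logger


stream = io.StringIO()  # stands in for sys.stdout or a log file
log = build_logger("myapp", stream)
log.info("Request processed", extra={"context": {"user_id": 42, "duration_ms": 12.5}})
```

Writing one JSON object per line keeps the output easy for Filebeat or Logstash to parse without grok patterns.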

Q1564: How do you implement incident response?

Answer:

# Incident response plan
# 1. Detection
# Monitor alerts -> PagerDuty -> On-call engineer
# 2. Assessment
# Check severity -> Determine impact
# 3. Communication
# Create incident channel
# Update status page
# 4. Mitigation
# Stop bleeding
# Restore service
# 5. Resolution
# Fix root cause
# Deploy fix
# Runbook example
# Runbook: Database Connection Issues
# 1. Check database status
# systemctl status postgresql
# 2. Check connections
# psql -c "SELECT count(*) FROM pg_stat_activity"
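# 3. Check slow queries
# psql -c "SELECT pid, now() - query_start AS runtime, query FROM pg_stat_activity WHERE state = 'active' ORDER BY runtime DESC LIMIT 10"
# 4. Kill long-running queries (active ones only, never your own session)
# SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'active' AND pid <> pg_backend_pid() AND query_start < NOW() - INTERVAL '5 minutes';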
# 5. If needed, restart database
# systemctl restart postgresql
# Post-incident
# 1. Document timeline
# 2. Identify root cause
# 3. Implement fix
# 4. Review and improve

Q1565: How do you implement capacity planning?

Answer:

# Metrics collection
# CPU
sar -u 1 60 > cpu_usage.csv
# Memory
sar -r 1 60 > memory_usage.csv
# I/O
sar -d 1 60 > io_usage.csv
# Network
sar -n DEV 1 60 > network_usage.csv
# Analysis
# Growth rate
# (current_value - past_value) / days_between
# Capacity planning formula
# CPU: (peak_usage * growth_factor * buffer) / cores
# Memory: (peak_usage * growth_factor * buffer) / available
# Disk: (current_usage * (1 + growth_rate)^years)
# Network: peak_bandwidth * redundancy_factor
# Tools
# Google SRE capacity planning
# Horizontal Pod Autoscaler metrics
kubectl autoscale deployment myapp --cpu-percent=80 --min=2 --max=10
# Vertical Pod Autoscaler
kubectl apply -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"
EOF
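
The growth-rate and disk formulas listed earlier in this answer can be turned into a few small helpers. This is a sketch that assumes linear growth for the days-until-full estimate and compound annual growth for the projection; the function names are illustrative.

```python
import math


def daily_growth(current, past, days_between):
    """Growth rate: (current_value - past_value) / days_between."""
    return (current - past) / days_between


def days_until_full(current, capacity, growth_per_day):
    """Days of headroom left, assuming linear growth."""
    if growth_per_day <= 0:
        return math.inf  # usage is flat or shrinking
    return max(0.0, (capacity - current) / growth_per_day)


def projected_disk(current, annual_growth_rate, years):
    """Disk: current_usage * (1 + growth_rate)^years."""
    return current * (1 + annual_growth_rate) ** years
```

For example, a volume that grew from 640 GB to 700 GB over 30 days grows 2 GB per day, which leaves 150 days before a 1000 GB volume fills.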

Q1566: How do you design highly available systems?

Answer:

# HA architecture
# Load balancer -> Web servers -> Database (primary + replica)
# \-> Cache (Redis Sentinel)
# \-> Message queue (Kafka/RabbitMQ cluster)
# Keepalived + HAProxy
# /etc/keepalived/keepalived.conf
# Define the check script before the vrrp_instance that tracks it
vrrp_script check_haproxy {
    # Exits 0 while an haproxy process exists
    script "pkill -0 haproxy"
    interval 2
    weight 2
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        check_haproxy
    }
}
# HAProxy backend
backend web
    balance roundrobin
    option httpchk
    http-check expect status 200
    server web1 192.168.1.10:80 check inter 2000 fall 3 rise 2
    # "backup" means web2 only receives traffic when web1 is down
    server web2 192.168.1.11:80 check inter 2000 fall 3 rise 2 backup
# Database HA
# See PostgreSQL replication earlier
# DNS failover
# Route 53 health checks
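
When comparing HA designs, it helps to estimate what redundancy buys. The sketch below uses the standard formulas for components in series and in parallel, under the assumption that failures are independent (real systems often share failure modes, so treat the results as an upper bound). The function names are illustrative.

```python
def serial_availability(*components):
    """All components must be up: multiply their availabilities."""
    result = 1.0
    for availability in components:
        result *= availability
    return result


def redundant_availability(single, replicas):
    """At least one of `replicas` identical components must be up."""
    return 1.0 - (1.0 - single) ** replicas


def downtime_minutes_per_year(availability):
    """Expected downtime in minutes over a 365-day year."""
    return (1.0 - availability) * 365 * 24 * 60
```

Under these assumptions, two 99% web servers behind a load balancer give about 99.99% for that tier, while chaining two 99.9% tiers in series lowers the total to about 99.8%.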

Q1567: How do you design scalable systems?

Answer:

# Horizontal scaling
# Add more instances behind load balancer
# Auto-scaling based on metrics
# Vertical scaling
# Increase instance size
# Usually requires a restart or failover
# Database scaling
# Read replicas
# Sharding
# Partitioning
# Cache scaling
# Redis cluster mode
redis-cli --cluster create 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 \
127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006 --cluster-replicas 1
# Message queue scaling
# Kafka
# Partition across brokers
# Replicate for redundancy
# CDN
# CloudFront, Cloudflare
# Cache static assets at edge
# Stateless application design
# Store sessions in Redis
# Store files in S3
# Database for persistent data
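
Sharding needs a deterministic way to map a key to a shard. A minimal sketch of hash-based (modulo) sharding follows; `shard_for` and `moved_keys` are hypothetical helper names. Note that simple modulo sharding remaps many keys whenever the shard count changes, which is why consistent hashing is often preferred for clusters that resize.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Return a stable shard index in 0..shard_count-1 for the key."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # hashlib gives the same result in every process, unlike built-in hash()
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def moved_keys(keys, old_count: int, new_count: int) -> int:
    """Count keys whose shard changes when the shard count changes."""
    return sum(1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count))
```

Running `moved_keys` against a sample of real keys before resharding gives a rough idea of how much data would have to migrate.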

Q1568: How do you design secure systems?

Answer:

# Defense in depth
# 1. Network security
# - Firewalls
# - Network segmentation
# - VPN
# 2. Application security
# - Input validation
# - Output encoding
# - Parameterized queries
# - Security headers
# 3. Data security
# - Encryption at rest
# - Encryption in transit
# - Key management
# - Backup encryption
# 4. Identity and access
# - RBAC
# - MFA
# - Least privilege
# - Regular access review
# 5. Monitoring
# - SIEM
# - IDS/IPS
# - Vulnerability scanning
# - Penetration testing
# Compliance
# - GDPR, HIPAA, PCI-DSS
# - Audit logging
# - Data retention policies

Q1569: How do you implement immutable infrastructure?

Answer:

# Packer
# Build immutable images
packer build template.json
# No SSH access in production
# Use Systems Manager Session Manager
# Cloud-init for configuration
#cloud-config
package_update: true
packages:
- nginx
# Container-based deployment
# Never modify running containers
# Rebuild and redeploy
# Infrastructure as Code
# Terraform
terraform apply -var-file=prod.tfvars
# GitOps
# ArgoCD
argocd app sync myapp
# Blue-green deployments
# Deploy to new environment
# Switch traffic
# Keep old environment for rollback

Q1570: How do you implement cost optimization?

Answer:

# Right-sizing
# Use smaller instances
# Monitor utilization
# Reserved instances
# For steady-state workloads
# Spot instances
# For batch jobs
# With checkpointing
# Autoscaling
# Scale down during off-hours
# Storage optimization
# Use appropriate storage classes
# Delete unused data
# Implement lifecycle policies
# Network optimization
# Use private subnets
# Use VPC endpoints
# Use CDN for static content
# Cost monitoring
# AWS Cost Explorer
# Budget alerts
# Tools
# cloud-custodian
# Filter and take action on resources
# -s names the output directory, not a file
custodian run -s output policy.yml

Q1571: How do you implement CI/CD pipelines?

Answer:

# .gitlab-ci.yml
stages:
  - build
  - test
  - security
  - deploy
variables:
  DOCKER_DRIVER: overlay2
build:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  script:
    # Authenticate before pushing (assumes the GitLab container registry)
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t $IMAGE:$CI_COMMIT_SHA .
    - docker push $IMAGE:$CI_COMMIT_SHA
test:
  stage: test
  image: $IMAGE:$CI_COMMIT_SHA
  script:
    - npm test
    - npm run lint
  coverage: '/Coverage: \d+\.\d+%/'
security:
  stage: security
  image: aquasec/trivy:latest
  script:
    - trivy image --exit-code 0 --severity HIGH,CRITICAL $IMAGE:$CI_COMMIT_SHA
  allow_failure: true
deploy-staging:
  stage: deploy
  script:
    - kubectl set image deployment/myapp myapp=$IMAGE:$CI_COMMIT_SHA
    - kubectl rollout status deployment/myapp
  environment:
    name: staging
deploy-production:
  stage: deploy
  script:
    - kubectl set image deployment/myapp myapp=$IMAGE:$CI_COMMIT_SHA
    - kubectl rollout status deployment/myapp
  environment:
    name: production
  when: manual
  only:
    - main

Q1572: How do you implement infrastructure testing?

Answer:

# Infrastructure as Code testing
# terraform validate
terraform validate
terraform plan -out=tfplan
# Terratest (Go)
package test
import (
"testing"
"github.com/gruntwork-io/terratest/modules/terraform"
)
func TestTerraform(t *testing.T) {
terraformOptions := &terraform.Options{
TerraformDir: "../examples/basic",
}
defer terraform.Destroy(t, terraformOptions)
terraform.InitAndApply(t, terraformOptions)
}
# InSpec
# controls/server.rb
control 'server-01' do
impact 1.0
title 'Server should be configured properly'
describe package('nginx') do
it { should be_installed }
end
describe service('nginx') do
it { should be_running }
it { should be_enabled }
end
end
# Run
inspec exec profile/

Q1573: How do you implement secret management?

Answer:

# HashiCorp Vault
# Start the server
vault server -config=config.hcl
# Enable secrets engine (KV version 2, which serves reads under secret/data/...)
vault secrets enable -path=secret kv-v2
# Write secret
vault kv put secret/myapp/db password=secretpassword
# Read secret
vault kv get secret/myapp/db
# Use with Kubernetes
# Install Vault Agent Injector
helm repo add hashicorp https://helm.releases.hashicorp.com
helm install vault hashicorp/vault \
  --set "injector.enabled=true"
# Annotate pod
# metadata:
# annotations:
# vault.hashicorp.com/agent-inject: "true"
# vault.hashicorp.com/role: "myapp"
# vault.hashicorp.com/agent-inject-secret-db: "secret/data/myapp/db"
# Use in application
# Read from /vault/secrets/db file
# AWS Secrets Manager
aws secretsmanager create-secret \
--name myapp/db \
--secret-string '{"username":"admin","password":"secret"}'
# Kubernetes Secrets
kubectl create secret generic myapp-secrets \
--from-literal=username=admin \
--from-literal=password=secret

Q1574: How do you implement a service mesh with Istio?

Answer:

# Istio installation
istioctl install --set profile=demo
# Deploy application
kubectl apply -f myapp.yaml
# Enable mutual TLS
kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
spec:
  mtls:
    mode: STRICT
EOF
# Traffic management
# Subsets v1 and v2 must be defined in a DestinationRule for host myapp
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
    - myapp
  http:
    - route:
        - destination:
            host: myapp
            subset: v1
          weight: 90
        - destination:
            host: myapp
            subset: v2
          weight: 10
# Observability
# Enable tracing
istioctl install --set meshConfig.enableTracing=true
# View dashboards
istioctl dashboard kiali

Q1575: How do you implement edge computing?

Answer:

# K3s (lightweight Kubernetes)
curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" sh -
# KubeEdge
# Cloud node
helm install cloudcore kubeedge/cloudcore --namespace kubeedge
# Edge node
# Install edgecore
wget https://github.com/kubeedge/kubeedge/releases/download/v1.12.0/kubeedge_1.12.0_linux_amd64.tar.gz
tar -xzf kubeedge_1.12.0_linux_amd64.tar.gz
# Run edgecore
edgecore --config=/etc/kubeedge/config/edgecore.yaml
# Deploy to edge
kubectl apply -f deployment.yaml
# Use case: IoT
# Collect sensor data at edge
# Process locally
# Send aggregated data to cloud

Q1576: How do you handle production incidents?

Answer:

# Incident response workflow
# 1. Detection
# - Alerts from monitoring
# - User reports
# 2. Triage
# - Assess severity (SEV1-4)
# - Identify impact
# - Determine if customer-facing
# 3. Communication
# - Create incident channel
# - Update status page
# - Notify stakeholders
# 4. Mitigation
# - Stop bleeding (rollbacks, traffic shift)
# - Apply fix
# 5. Resolution
# - Verify fix
# - Confirm recovery
# 6. Post-mortem
# - Document timeline
# - Identify root cause (5 whys)
# - Action items
# Example incident
# Database down
# 1. Check status
# systemctl status postgresql
# 2. Attempt restart
# systemctl restart postgresql
# 3. If failed, promote replica
# pg_ctl promote -D /var/lib/postgresql/data
# 4. Verify
# psql -c "SELECT 1"
# 5. Document

Q1577: How do you perform root cause analysis?

Answer:

# 5 Whys Analysis
# Problem: API response time increased
# Why 1: Database queries slow
# Why 2: Missing index
# Why 3: New feature added without proper schema review
# Why 4: Code review didn't catch it
# Why 5: Process doesn't require schema review
# Corrective Action: Implement schema review in CI/CD
# Tools for RCA
# Logs
journalctl -u service -n 100
# Metrics
# Compare before/after
sar -q
# Traces
# Jaeger, Zipkin
# Dumps
# Core files, heap dumps
# Timeline
# Create incident timeline
# 14:00 - Alert triggered
# 14:05 - On-call acknowledged
# 14:10 - Root cause identified
# 14:15 - Fix deployed
# 14:20 - Service recovered
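
A timeline like the one above can be turned into response metrics such as time to acknowledge and time to recover. The sketch below parses the same five entries; it assumes all events happen on the same day, and `minutes_between` is an illustrative helper name.

```python
from datetime import datetime

TIMELINE = [
    ("14:00", "Alert triggered"),
    ("14:05", "On-call acknowledged"),
    ("14:10", "Root cause identified"),
    ("14:15", "Fix deployed"),
    ("14:20", "Service recovered"),
]


def minutes_between(timeline, start_event, end_event):
    """Minutes elapsed between two named events (same-day timestamps assumed)."""
    times = {event: datetime.strptime(clock, "%H:%M") for clock, event in timeline}
    return (times[end_event] - times[start_event]).total_seconds() / 60


time_to_acknowledge = minutes_between(TIMELINE, "Alert triggered", "On-call acknowledged")
time_to_recover = minutes_between(TIMELINE, "Alert triggered", "Service recovered")
```

Tracking these two numbers across incidents shows whether alerting and mitigation are getting faster over time.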

Q1578: How do you optimize cloud costs?

Answer:

# Cost optimization strategies
# 1. Right-sizing instances
# Use CloudWatch metrics
aws ec2 describe-instance-types --instance-types t3.micro
# 2. Reserved instances
# For predictable workloads
# 3. Spot instances
# For fault-tolerant workloads
# 4. Autoscaling
# Scale in when not needed
# 5. Storage lifecycle
# Move cold data to Glacier
aws s3 ls
aws s3api put-bucket-lifecycle-configuration --bucket mybucket \
--lifecycle-configuration file://lifecycle.json
# 6. Delete unused resources
# Find unattached volumes
aws ec2 describe-volumes --filters Name=status,Values=available
# 7. Use managed services
# RDS, Lambda instead of EC2
# 8. Budget alerts
aws budgets create-budget \
--account-id 123456789012 \
--budget file://budget.json

Q1579: How do you implement compliance?

Answer:

# Compliance frameworks
# SOC 2, PCI-DSS, HIPAA, GDPR
# Audit logging
# Enable auditd
systemctl enable --now auditd
# Rules
# /etc/audit/audit.rules
-w /etc/passwd -p wa -k passwd_changes
-w /etc/shadow -p wa -k shadow_changes
-w /etc/sudoers -p wa -k sudoers_changes
# Review logs
aureport -f
ausearch -k passwd_changes
# Vulnerability scanning
# OpenVAS, Nessus, Qualys
# Penetration testing
# Annual third-party pen tests
# Data encryption
# At rest
# LUKS, TDE
# In transit
# TLS 1.2+
# Access reviews
# Quarterly user access review
# Documentation
# Policies and procedures
# Evidence collection
# Compliance reports

Q1580: How do you design disaster recovery?

Answer:

# DR strategies
# RTO (Recovery Time Objective)
# RPO (Recovery Point Objective)
# Strategy comparison
# Backup & Restore
# - RTO: Hours
# - RPO: Days
# Pilot Light
# - RTO: Minutes to hours
# - RPO: Hours
# Warm Standby
# - RTO: Minutes
# - RPO: Minutes
# Multi-Region Active-Active
# - RTO: Near zero
# - RPO: Near zero
# Implementation
# 1. Backup data
# mysqldump --all-databases | aws s3 cp - s3://bucket/backup.sql
# 2. Replicate data
# PostgreSQL streaming replication to DR region
# 3. Infrastructure as Code (recreate the environment in the DR region)
# terraform init
# terraform apply
# 4. Regular DR testing
# Quarterly DR tests
# Document results
# 5. Runbook
# Document recovery procedures

Q1581: How do you handle zero-downtime deployment?

Answer:

# Blue-green deployment
# Deploy to green environment
# Test green
# Switch traffic
# Monitor
# If issues, rollback to blue
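# Rolling deployment
# Update pods gradually (kubectl rolling-update was removed; Deployments roll automatically)
kubectl set image deployment/myapp myapp=myapp:v2
kubectl rollout status deployment/myapp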
# Canary deployment
# Route 10% to new version
# Monitor metrics
# Gradually increase
# Rollback if issues
# Kubernetes (Argo Rollouts; selector and pod template omitted for brevity)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
spec:
  replicas: 10
  strategy:
    canary:
      maxSurge: "25%"
      maxUnavailable: 0
      steps:
        - setWeight: 10
        - pause: {duration: 10m}
        - setWeight: 30
        - pause: {duration: 10m}
        - setWeight: 50
        - pause: {duration: 10m}
        - setWeight: 100
# Feature flags
# See earlier question
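
The canary steps describe "monitor metrics, gradually increase, rollback if issues" without saying how that decision is made. A minimal sketch of such a gate follows, reusing the 10/30/50/100 weights. The error-rate comparison, the `tolerance` default, and the function name `next_action` are assumptions for illustration; real setups usually rely on an analysis tool rather than hand-written logic.

```python
CANARY_STEPS = [10, 30, 50, 100]  # traffic weights, in percent


def next_action(current_weight, canary_error_rate, baseline_error_rate, tolerance=0.01):
    """Decide whether to roll back, promote to the next weight, or finish."""
    # Roll back if the canary is clearly worse than the stable version
    if canary_error_rate > baseline_error_rate + tolerance:
        return ("rollback", 0)
    for step in CANARY_STEPS:
        if step > current_weight:
            return ("promote", step)
    return ("done", 100)
```

Calling this after each pause with fresh metrics walks the rollout through the weights or aborts it at the first sign of regression.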

Q1582: How do you handle database migrations?

Answer:

# Zero-downtime migrations
# 1. Add new column (nullable)
ALTER TABLE users ADD COLUMN new_field VARCHAR(255);
# 2. Write to both columns
# Application code change
# 3. Backfill data
UPDATE users SET new_field = old_field;
# 4. Make new column NOT NULL
# MySQL
ALTER TABLE users MODIFY COLUMN new_field VARCHAR(255) NOT NULL;
# PostgreSQL
ALTER TABLE users ALTER COLUMN new_field SET NOT NULL;
# 5. Remove old column
ALTER TABLE users DROP COLUMN old_field;
# For PostgreSQL
# Use online (non-blocking) operations where available
# Create index concurrently
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
# For MySQL
# Use pt-online-schema-change (DSN format: D=<database>,t=<table>)
pt-online-schema-change D=mydb,t=users --alter "ADD COLUMN new_field VARCHAR(255)" \
  --execute
# Rollback plan
# Keep old column
# Dual write
# Test thoroughly
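
The backfill step is the one most likely to hold locks for a long time on a large table, so it is usually run in small batches. The sketch below shows the idea in Python, using the standard-library `sqlite3` module as a stand-in for the production database. The function name `backfill_in_batches` and the batch size are assumptions for illustration; the table and column names follow the example above.

```python
import sqlite3


def backfill_in_batches(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Copy old_field into new_field in small transactions.

    Returns the total number of rows updated. Rows whose old_field is NULL
    are skipped so the loop always terminates.
    """
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE users SET new_field = old_field "
            "WHERE id IN (SELECT id FROM users "
            "             WHERE new_field IS NULL AND old_field IS NOT NULL "
            "             LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # short transactions keep lock times low
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
```

Against MySQL or PostgreSQL the same loop applies, with the driver and the batching query adjusted to that database.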

Q1583: How do you handle capacity emergencies?

Answer:

Terminal window
# Emergency response
# 1. Immediate mitigation
# Scale up
kubectl scale deployment myapp --replicas=20
# Add capacity
# In AWS
aws autoscaling set-desired-capacity \
--auto-scaling-group-name my-asg \
--desired-capacity 10
# 2. Identify root cause
# Check metrics
# Check logs
# Common issues
# - Traffic spike
# - Slow query
# - Memory leak
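# 3. Short-term fix
# Clear cache (only if the cache itself is the problem; a cold cache shifts load to the database)
redis-cli FLUSHALL
# Kill expensive queries
# PostgreSQL (active queries only, excluding this session)
SELECT pg_terminate_backend(pid) FROM pg_stat_activity
WHERE state = 'active'
AND pid <> pg_backend_pid()
AND query_start < NOW() - INTERVAL '5 minutes';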
# 4. Long-term fix
# Optimize code
# Add capacity
# Implement caching
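
When scaling up during an emergency, the replica count can be estimated from the current request rate instead of guessed. The following is a rough Python sketch; `replicas_needed`, the 30% headroom, and the per-replica throughput figure are all assumptions that would come from load tests in practice.

```python
import math


def replicas_needed(current_rps: float,
                    rps_per_replica: float,
                    headroom: float = 0.3,
                    min_replicas: int = 2) -> int:
    """Estimate the replica count for a given load.

    headroom is the fraction of spare capacity to keep (0.3 = 30%).
    """
    if rps_per_replica <= 0:
        raise ValueError("rps_per_replica must be positive")
    required = current_rps * (1 + headroom) / rps_per_replica
    return max(min_replicas, math.ceil(required))
```

The result can be passed to `kubectl scale deployment myapp --replicas=<n>`.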

Q1584: How do you handle security incidents?

Answer:

Terminal window
# Security incident response
# 1. Detection
# - SIEM alerts
# - IDS alerts
# - User reports
# 2. Containment
# Isolate affected systems
# iptables -I INPUT -s attacker_ip -j DROP
# iptables -I OUTPUT -d attacker_ip -j DROP
# 3. Investigation
# Collect evidence
# tcpdump -i eth0 -w capture.pcap
# Forensics
# 4. Eradication
# Remove malware
# Patch vulnerability
# Reset compromised credentials
# 5. Recovery
# Restore from clean backup
# Verify system integrity
# 6. Lessons learned
# Document incident
# Update security controls
# Tools
# - chkrootkit
# - rkhunter
# - ClamAV
# - OSSEC

Answer:

Terminal window
# Data corruption response
# 1. Identify corruption
# Check logs
# Verify checksums
# md5sum
# 2. Stop writes
# Read-only mount
# mount -o remount,ro /data
# 3. Restore from backup
# Find last good backup
# Restore
# mysql -u root -p mydb < backup.sql
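# 4. Point-in-time recovery
# PostgreSQL
# Restore a base backup, then replay archived WAL up to the target time.
# pg_restore cannot do this; set the following in postgresql.conf and create recovery.signal
# (the archive path is a placeholder):
# restore_command = 'cp /path/to/wal_archive/%f %p'
# recovery_target_time = '2024-01-01 12:00:00'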
# 5. Verify integrity
# Check application data
# Run database checks
# 6. Prevention
# Enable checksums
# Regular backups
# Monitoring

Answer:

Terminal window
# Network outage response
# 1. Verify outage
# ping gateway
# ping 8.8.8.8
# 2. Check interfaces
# ip link
# ip addr
# 3. Check DNS
# cat /etc/resolv.conf
# nslookup example.com
# 4. Check routes
# ip route
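# 5. Recovery steps
# Reset network (service name depends on the distribution)
systemctl restart networking         # Debian/Ubuntu with ifupdown
# systemctl restart NetworkManager   # RHEL/Fedora and most desktops
# Or
# ip link set eth0 down
# ip link set eth0 up
# For DNS issues
# resolvectl flush-caches
# For cloud
# AWS
aws ec2 describe-instance-status --instance-ids i-xxx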
# 6. Contact provider
# If not resolvable internally

Q1587: How do you handle performance degradation?

Answer:

Terminal window
# Performance troubleshooting
# 1. Identify symptoms
# Check metrics
# top
# iostat 1
# 2. Locate bottleneck
# CPU bound?
top
ps aux --sort=-%cpu
# Memory bound?
free -h
vmstat 1
# I/O bound?
iostat -xz 1
# Network bound?
iftop
nethogs
# 3. Fix
# CPU: Scale, optimize code
# Memory: Add RAM, fix leaks
# I/O: Use faster storage
# Network: Reduce payload sizes and round trips, add bandwidth
# 4. Verify
# Monitor metrics
# Compare before/after

Q1588: How do you handle authentication failures?

Answer:

Terminal window
# Authentication troubleshooting
# 1. Check logs
journalctl -u sshd | tail -50    # the unit is named "ssh" on Debian/Ubuntu
tail -f /var/log/auth.log        # /var/log/secure on RHEL-based systems
# 2. Verify user exists
getent passwd username
id username
# 3. Check SSH configuration
# /etc/ssh/sshd_config
# PasswordAuthentication yes
# PubkeyAuthentication yes
# AllowUsers username
# 4. Test authentication
# SSH with debug
ssh -vvv user@host
# 5. Reset password
passwd username
# 6. Check PAM
# /etc/pam.d/sshd
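# 7. For LDAP
# Check connectivity (a search base is required)
ldapsearch -x -D "cn=admin,dc=example,dc=com" -W -b "dc=example,dc=com"
# Check sssd (stop the service first, then run it in the foreground with debug output)
sssd -i -d 9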

Answer:

Terminal window
# Storage full response
# 1. Find large files
du -sh /*
du -sh /var/*
du -sh /var/log/*
# 2. Find large directories
du -ah / | sort -rh | head -20
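# 3. Clean logs
journalctl --vacuum-size=100M
# Delete only rotated, compressed logs older than 30 days
find /var/log -type f -name "*.gz" -mtime +30 -delete
# 4. Clean tmp (remove old files only; running services may still use recent ones)
find /tmp -xdev -type f -mtime +7 -delete
find /var/tmp -xdev -type f -mtime +30 -delete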
# 5. Clean package cache
apt clean
yum clean all
# 6. Docker cleanup
docker system prune -a
# 7. Find deleted files still open
lsof +L1
# 8. Extend storage
# Add volume
# Add to LVM
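
The "find large files" step can also be scripted when `du` output is too noisy. Below is a small Python sketch that walks a directory tree and returns the largest regular files. The function name `largest_files` is hypothetical, and symbolic links are skipped on the assumption that they should not be counted twice.

```python
import heapq
import os
from typing import List, Tuple


def largest_files(root: str, limit: int = 20) -> List[Tuple[int, str]]:
    """Return up to `limit` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue  # do not follow or count symlinks
                sizes.append((os.path.getsize(path), path))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return heapq.nlargest(limit, sizes)
```

Note that files deleted while still held open by a process do not appear here; `lsof +L1` remains the way to find those.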

Answer:

Terminal window
# Kernel panic response
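# 1. Verify panic
# Check kernel logs from the previous boot (dmesg only covers the current boot;
# this requires a persistent journal)
journalctl -k -b -1 | tail -100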
# 2. Configure kdump
apt install kdump-tools
# 3. Analyze crash
# /var/crash/
crash /var/crash/202401011200/vmcore /usr/lib/debug/boot/vmlinux-$(uname -r)
# 4. Common causes
# - Hardware failure (RAM, disk)
# - Driver issues
# - OOM
# - Kernel bugs
# 5. Fixes
# Update kernel
# Disable problematic driver
# Add RAM
# Fix OOM settings
# 6. Prevention
# Monitor resources
# Keep kernel updated
# Use hardware from compatibility list

Q1591: How do you implement infrastructure monitoring?

Answer:

Terminal window
# Infrastructure monitoring
# Prometheus + Grafana
# Node exporter
node_exporter --collector.filesystem.mount-points-exclude="^/(sys|proc|run)"
# Custom metrics
# Python client
from prometheus_client import Counter
requests_total = Counter('app_requests_total', 'Total requests')
# Alert rules
- alert: HighCPU
  expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
  for: 5m
- alert: HighMemory
  expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 90
# Dashboards
# Import from Grafana.com
# Logs
# ELK Stack
# Loki + Grafana

Q1592: How do you implement application monitoring?

Answer:

Terminal window
# APM (Application Performance Monitoring)
# Jaeger (tracing via OpenTelemetry)
# Python
from opentelemetry import trace
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("operation") as span:
    span.set_attribute("key", "value")
# Prometheus metrics (each metric needs a name and a help string)
from prometheus_client import Counter, Histogram, Gauge
request_count = Counter('http_requests_total', 'Total HTTP requests')
request_duration = Histogram('http_request_duration_seconds', 'HTTP request duration in seconds')
active_users = Gauge('active_users', 'Number of active users')
# Health checks
# Kubernetes liveness/readiness probes
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

Answer:

Terminal window
# Log analysis
# ELK Stack
# Elasticsearch
# Logstash
# Kibana
# Loki
# Grafana + Loki
# Structured logging
# JSON format
import json
import logging
class JSONFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            'timestamp': self.formatTime(record),
            'level': record.levelname,
            'message': record.getMessage(),
            'module': record.module
        })
# Log levels
# DEBUG - Detailed info
# INFO - Confirmation
# WARNING - Something unexpected
# ERROR - Serious problem
# CRITICAL - Very serious problem
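# Analysis queries
# Find errors
grep -i error /var/log/app.log
# Count by hour (assumes lines start with "YYYY-MM-DD HH:MM:SS")
awk '{print $1, substr($2, 1, 2)}' /var/log/app.log | sort | uniq -c
# Slow requests (assumes $request_time is the last field of the nginx log_format;
# in the default combined format, field 9 is the status code)
awk '$NF > 5 {print}' /var/log/nginx/access.log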
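
The same per-hour count can be done in Python when the logic outgrows a one-liner. This sketch assumes plain-text lines of the form `YYYY-MM-DD HH:MM:SS LEVEL message`; both that layout and the function name `errors_per_hour` are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable


def errors_per_hour(lines: Iterable[str]) -> Counter:
    """Count ERROR lines per hour bucket, keyed as 'YYYY-MM-DD HH:00'."""
    counts: Counter = Counter()
    for line in lines:
        parts = line.split(maxsplit=3)
        if len(parts) < 3:
            continue  # not in the expected format; ignore
        date, time_of_day, level = parts[0], parts[1], parts[2]
        if level == "ERROR":
            counts[f"{date} {time_of_day[:2]}:00"] += 1
    return counts
```

To run it over a file, pass the open file object: `errors_per_hour(open("/var/log/app.log"))`.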

Answer:

Terminal window
# Alerting
# Prometheus + AlertManager
# alertmanager.yaml
route:
  group_by: ['alertname']
  receiver: 'team'
  group_wait: 10s
  group_interval: 10s
receivers:
- name: 'team'
  email_configs:
  - to: 'team@example.com'
  slack_configs:
  - api_url: 'https://hooks.slack.com/...'
    channel: '#alerts'
# PagerDuty integration
- name: 'pagerduty'
  pagerduty_configs:
  - service_key: 'KEY'
# Best practices
# 1. Alert on symptoms, not causes
# 2. Set appropriate thresholds
# 3. Avoid alert fatigue
# 4. Have runbooks
# 5. Test alerts regularly

Q1595: How do you implement backup verification?

Answer:

Terminal window
# Backup verification
# 1. Test restoration
# Restore to test environment
mysql -u root -p test < backup.sql
psql -U postgres test < backup.sql
# 2. Automated verification
#!/bin/bash
BACKUP_FILE=$1
# Verify backup file exists
if [ ! -f "$BACKUP_FILE" ]; then
    echo "Backup file not found"
    exit 1
fi
# Verify file size (GNU stat; on BSD/macOS use: stat -f%z)
SIZE=$(stat -c %s "$BACKUP_FILE")
if [ "$SIZE" -lt 1000 ]; then
    echo "Backup file too small"
    exit 1
fi
# Verify file integrity (exit non-zero if the check fails)
if [[ "$BACKUP_FILE" == *.gz ]]; then
    gzip -t "$BACKUP_FILE" || exit 1
elif [[ "$BACKUP_FILE" == *.sql ]]; then
    head -1 "$BACKUP_FILE" | grep -q "MySQL" || exit 1
fi
# Verify database can be restored
# (Run in isolated environment)
# Report status
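
The same checks (file exists, plausible size, archive integrity) can be written as a function that returns a status string, which is easier to wire into monitoring. This is a minimal Python sketch; `verify_backup`, the status strings, and the 1000-byte minimum are assumptions mirroring the shell script above.

```python
import gzip
import os
import zlib


def verify_backup(path: str, min_bytes: int = 1000) -> str:
    """Return 'ok', 'missing', 'too small', or 'corrupt' for a backup file."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) < min_bytes:
        return "too small"
    if path.endswith(".gz"):
        try:
            # Reading the whole stream forces the CRC and length checks.
            with gzip.open(path, "rb") as fh:
                while fh.read(1024 * 1024):
                    pass
        except (OSError, EOFError, zlib.error):
            return "corrupt"
    return "ok"
```

A passing integrity check does not prove the dump is restorable; a periodic test restore is still needed.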

Q1596: How do you design multi-region architecture?

Answer:

Terminal window
# Multi-region design
# DNS failover
# Route 53 health checks
aws route53 create-health-check \
--caller-reference "health-check-$(date +%s)" \
--health-check-config '{"Type":"HTTPS","FullyQualifiedDomainName":"example.com","Port":443,"ResourcePath":"/health"}'
# Database replication
# PostgreSQL
# Primary in us-east-1
# Replica in us-west-2
# Object storage
# S3 cross-region replication
aws s3api put-bucket-replication \
--bucket source-bucket \
--replication-configuration file://replication.json
# Cache
# ElastiCache Global Datastore for Redis
aws elasticache create-global-replication-group \
--global-replication-group-id-suffix my-global \
--primary-replication-group-id primary-id
# CDN
# CloudFront
aws cloudfront create-distribution \
--origin-domain-name mybucket.s3.amazonaws.com
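# Traffic management
# Global Accelerator (the API is served from us-west-2; the name is a placeholder)
aws globalaccelerator create-accelerator --name my-accelerator --region us-west-2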

Answer:

Terminal window
# Zero trust architecture
# 1. Identity verification
# MFA everywhere
# Conditional access policies
# 2. Network segmentation
# Micro-segmentation
# Private links
# Service mesh
# 3. Device trust
# Endpoint detection
# Mobile Device Management
# 4. Application security
# OAuth 2.0
# JWT validation
# 5. Data protection
# Encryption everywhere
# Implementation
# Kubernetes network policies
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
EOF
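# Service mesh mTLS
# Istio (there is no "strict" install profile; enforce mTLS with a PeerAuthentication)
istioctl install --set profile=default
kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
EOF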
# BeyondCorp
# Access proxy
# No VPN needed

Q1598: How do you implement chaos engineering?

Answer:

Terminal window
# Chaos engineering
# 1. Define steady state
# 2. Hypothesize
# 3. Run experiment
# 4. Observe
# 5. Fix
# Tools
# Chaos Monkey (Netflix)
# Litmus
# Chaos Mesh
# Example: Kill random pod
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: random-pod-kill
spec:
  action: pod-kill
  mode: one              # one randomly chosen pod matching the selector
  selector:
    namespaces:
    - default            # placeholder namespace
# Example: Network delay
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-latency
spec:
  action: delay
  mode: one
  selector:
    namespaces:
    - default            # placeholder namespace
  duration: 60s
  delay:
    latency: 100ms
# Runbook
# Document expected behavior
# Monitor during experiment
# Have rollback plan

Answer:

Terminal window
# GitOps
# 1. Store all configs in Git
# 2. Use CI/CD to apply changes
# 3. Automated drift detection
# ArgoCD
# Application definition
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
spec:
  project: default
  source:
    repoURL: https://github.com/org/repo.git
    targetRevision: HEAD
    path: k8s/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
# Flux
# Install
flux install
# Create source
flux create source git myapp \
--url=https://github.com/org/repo \
--branch=main
# Create kustomization
flux create kustomization myapp \
--source=myapp \
--path=./k8s/production

Q1600: How do you implement cost governance?

Answer:

Terminal window
# Cost governance
# 1. Tagging strategy
# All resources must have tags
# - Team: team-name
# - Project: project-name
# - CostCenter: cost-center
# - Environment: prod/staging/dev
# 2. Budgets
# Set budgets per team/project
aws budgets create-budget \
--account-id 123456789012 \
--budget file://budget.json
# 3. Rightsizing
# Use AWS Compute Optimizer
aws compute-optimizer get-ec2-instance-recommendations
# 4. Reserved capacity
# For steady workloads
# Purchase reserved instances
# 5. Use spot
# For fault-tolerant workloads
# 6. Delete unused resources
# Find unattached volumes
aws ec2 describe-volumes --filters Name=status,Values=available
# 7. Regular review
# Weekly cost review meetings
# Track spend trends
# 8. Showback/Chargeback
# Report costs by team

Q1601: How do you implement compliance automation?

Answer:

Terminal window
# Compliance automation
# Open Policy Agent (OPA)
# Gatekeeper
# Prevents non-compliant resources
# Policy example
package kubernetes.admission
deny[msg] {
    input.request.kind.kind == "Deployment"
    input.request.object.spec.replicas > 10
    msg = "Cannot have more than 10 replicas"
}
# Install Gatekeeper (pin a release tag instead of master for production)
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/deploy/gatekeeper.yaml
# InSpec
# Compliance as code
# controls/nginx.rb
control 'nginx-01' do
  impact 1.0
  title 'Nginx should be configured securely'
  describe service('nginx') do
    it { should be_running }
  end
  # A hardened config hides the version banner, so the directive must be present
  describe file('/etc/nginx/nginx.conf') do
    its('content') { should match /server_tokens off;/ }
  end
end
# Run
inspec exec compliance/

Q1602: How do you implement disaster recovery automation?

Answer:

# DR automation
# 1. Backup automation
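#!/bin/bash
# Automated backup
BACKUP_DATE=$(date +%Y%m%d)
# Database backup (credentials come from ~/.my.cnf; -p would prompt interactively.
# A shell redirect cannot write to s3://, so the dump is piped to aws s3 cp.)
mysqldump -u root mydb | gzip | aws s3 cp - s3://bucket/backup-$BACKUP_DATE.sql.gz
# File backup
tar -czf - /data | aws s3 cp - s3://bucket/data-$BACKUP_DATE.tar.gz
# Retention (object names are the 4th column of "aws s3 ls";
# an S3 lifecycle rule is a simpler alternative)
aws s3 ls s3://bucket/ | awk '{print $4}' | while read -r key; do
    stamp=$(echo "$key" | grep -oP '\d{8}')
    if [[ -n "$stamp" && "$stamp" < $(date -d '30 days ago' +%Y%m%d) ]]; then
        aws s3 rm "s3://bucket/$key"
    fi
done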
# 2. DR playbook
# Documented runbooks
# Regular testing
# 3. Automated failover
# DNS failover
# Route 53 health checks + failover record
aws route53 change-resource-record-sets \
--hosted-zone-id Z1234567890 \
--change-batch file://failover.json
# Database failover
# Automatic replica promotion
# Connection string update
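
The retention rule in the backup script (delete anything whose embedded date is more than 30 days old) is easy to get wrong in shell, so it can help to isolate it as a pure function and test it. A minimal Python sketch follows; `expired_keys` and the `YYYYMMDD`-in-the-name convention are assumptions carried over from the script above.

```python
import re
from datetime import date, datetime, timedelta
from typing import Iterable, List

DATE_RE = re.compile(r"(\d{8})")


def expired_keys(keys: Iterable[str], today: date,
                 retention_days: int = 30) -> List[str]:
    """Return the keys whose embedded YYYYMMDD date is older than the cutoff."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for key in keys:
        match = DATE_RE.search(key)
        if not match:
            continue  # no date in the name: never delete automatically
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that are not a valid date
        if stamp < cutoff:
            expired.append(key)
    return expired
```

Keys without a recognizable date are deliberately left alone, so an unexpected object name never leads to a deletion.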

Q1603: How do you implement capacity management?

Answer:

Terminal window
# Capacity management
# 1. Monitor utilization
# CPU, Memory, Storage, Network
# 2. Trend analysis
# Weekly reviews
# Growth rate calculation
# 3. Forecasting
# Extrapolate utilization trends; for spend, AWS Cost Explorer provides forecasts
# aws ce get-cost-forecast
# 4. Planning
# Add capacity before hitting limits
# 5. Optimization
# Right-size instances
# Use savings plans
# Kubernetes
# Vertical Pod Autoscaler
kubectl apply -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"
EOF
# Horizontal Pod Autoscaler
kubectl autoscale deployment myapp \
--cpu-percent=80 --min=2 --max=10

Q1604: How do you implement reliability engineering?

Answer:

Terminal window
# Reliability engineering
# SRE principles
# 1. SLOs (Service Level Objectives)
# - Availability: 99.9%
# - Latency: p99 < 200ms
# 2. Error budgets
# 100% - SLO = error budget
# If budget exhausted, freeze features
# 3. Toil reduction
# Automate manual tasks
# 4. Post-mortems
# Blameless
# Focus on process improvement
# 5. Releases
# Canary deployments
# Feature flags
# 6. Circuit breakers
# See earlier
# 7. Bulkheads
# Isolate failures
# 8. Self-healing
# Restart failed pods
# Replace unhealthy nodes
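
The error-budget rule above (100% minus the SLO) becomes concrete once it is converted into minutes of allowed downtime. A short Python sketch of that arithmetic follows; the function names and the 30-day window are assumptions for illustration.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a window."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in (0, 100]")
    window_minutes = window_days * 24 * 60
    return (100.0 - slo_percent) / 100.0 * window_minutes


def budget_remaining(slo_percent: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Minutes of budget left; a negative value means the budget is exhausted."""
    return error_budget_minutes(slo_percent, window_days) - downtime_minutes
```

For a 99.9% SLO over 30 days this gives roughly 43 minutes, which is the figure to compare against accumulated downtime before deciding on a feature freeze.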

Q1605: How do you implement SRE practices?

Answer:

Terminal window
# SRE practices
# Error budgets
# https://sre.google/sre-book/availability-table/
# Toil management
# Identify
# Quantify
# Automate
# Eliminate
# Observability
# Metrics
# Logs
# Traces
# Incident management
# On-call rotation
# Runbooks
# Post-mortems
# Change management
# Canary releases
# Gradual rollouts
# SRE tools
# Prometheus
# Grafana
# Jaeger
# Loki
# On-call
# PagerDuty
# OpsGenie
# Automation
# Ansible
# Terraform
# Kubernetes

Q1606: How do you optimize Linux for cloud?

Answer:

# Cloud-optimized Linux
# Ubuntu Pro for AWS
# AWS-optimized kernel
# FIPS compliance
# Livepatch
# Cloud-specific optimizations
# Use instance store for temp data
# Use EBS for persistent data
# Network optimization
# ENA (Elastic Network Adapter)
# Use enhanced networking
# Storage optimization
# Use NVMe for high I/O
# Use EBS gp3 for balance
# CloudWatch Agent
# Install
# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
# Configure
# /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/metrics.json
{
  "metrics": {
    "namespace": "CustomNamespace",
    "metrics_collected": {
      "cpu": {
        "measurement": ["cpu_usage_idle"]
      },
      "mem": {
        "measurement": ["mem_used_percent"]
      }
    }
  }
}

Answer:

Terminal window
# FinOps
# Cloud financial management
# 1. Visibility
# Tag all resources
# Use cost explorer
# 2. Optimization
# Right-sizing
# Reservations
# Spot instances
# 3. Accountability
# Showback to teams
# Budgets
# Tools
# AWS Cost Explorer
# GCP Cloud Billing
# Azure Cost Management
# FinOps workflow
# 1. Inform
# Show costs by team
# Dashboards
# 2. Optimize
# Right-size resources
# Use savings plans
# 3. Operate
# Monitor daily spend
# Alerts
# Automation
# List running instances with launch time (a starting point for spotting idle resources)
aws ec2 describe-instances --filters "Name=instance-state-name,Values=running" \
--query 'Reservations[*].Instances[*].[InstanceId,Tags[?Key==`Name`].Value|[0],LaunchTime]' \
--output table

Q1608: How do you implement platform engineering?

Answer:

Terminal window
# Platform engineering
# Internal Developer Platform (IDP)
# Self-service
# Components
# 1. CI/CD pipelines
# GitHub Actions
# GitLab CI
# 2. Service catalog
# Backstage
# Port
# 3. Infrastructure templates
# Terraform modules
# Helm charts
# 4. Observability
# Unified dashboards
# 5. Security
# Policy enforcement
# Implementation
# Platform team builds tools
# Developers consume
# Benefits
# Faster deployments
# Consistency
# Security
# Reduced cognitive load
# Backstage
# Create catalog
# Service templates
# Documentation

Q1609: How do you implement developer experience?

Answer:

Terminal window
# Developer experience
# 1. Local development
# Docker Compose
# localstack
# 2. Documentation
# OpenAPI specs
# Swagger UI
# 3. IDE integration
# LSP servers
# Debugging
# 4. Testing
# Fast feedback
# Unit tests
# Integration tests
# 5. Deployment
# Simple commands
# kubectl
# ArgoCD
# Example: Developer workflow
# 1. Clone repo
# 2. Make changes
# 3. Run tests locally
# 4. Push to branch
# 5. CI runs tests
# 6. Merge to main
# 7. CD deploys
# Self-service
# Create environment
# Deploy app
# View logs
# Scale application

Q1610: How do you implement cloud security?

Answer:

Terminal window
# Cloud security
# Shared responsibility model
# Identity
# IAM with least privilege
# MFA everywhere
# Network
# VPC with private subnets
# Security groups
# NACLs
# WAF
# Data
# Encryption at rest
# Encryption in transit
# Key management
# Compliance
# Regular audits
# Vulnerability scanning
# Penetration testing
# Tools
# AWS GuardDuty
# AWS Config
# AWS Security Hub
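# Example: AWS security
# Enable CloudTrail (a new trail does not record events until logging is started)
aws cloudtrail create-trail --name my-trail \
--s3-bucket-name mybucket
aws cloudtrail start-logging --name my-trail
# Enable GuardDuty
aws guardduty create-detector --enable
# Enable Security Hub
aws securityhub enable-security-hub
# In an AWS Organization, designate the delegated administrator account
aws securityhub enable-organization-admin-account \
--admin-account-id 123456789012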

Q1611: How do you implement Kubernetes security?

Answer:

Terminal window
# Kubernetes security
# 1. RBAC
# Least privilege
kubectl create role pod-reader --verb=get,list --resource=pods
kubectl create rolebinding read-pods --role=pod-reader --user=dev
# 2. Network policies
# Default deny
kubectl apply -f network-policy.yaml
# 3. Pod security
# Pod security standards
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
# 4. Secrets management
# Prefer Vault or AWS Secrets Manager; native Secrets are only base64-encoded
# unless encryption at rest is enabled
kubectl create secret generic mysecret \
--from-literal=key=value
# 5. Image scanning
trivy image myimage:latest
# 6. Runtime security
# Falco
falco -r rules/myrules.yaml
# 7. API server security
# Disable anonymous auth
# Enable RBAC
# Use TLS

Q1612: How do you implement data protection?

Answer:

Terminal window
# Data protection
# 1. Classification
# Public, Internal, Confidential, Restricted
# 2. Encryption
# At rest
cryptsetup luksFormat /dev/sdb1
# In transit
# TLS 1.2+
# 3. Access control
# IAM policies
# Database permissions
# 4. Backup
# Regular backups
# Test restoration
# Offsite backup
# 5. Monitoring
# Audit logs
# Alerts on suspicious access
# 6. Data loss prevention
# Block sensitive data exfiltration
# Tools
# AWS Macie
# GCP DLP
# Microsoft Purview (formerly Azure Purview)

Q1613: How do you implement supply chain security?

Answer:

Terminal window
# Supply chain security
# 1. Dependency scanning
# Snyk
# Dependabot
# 2. Container scanning
# Trivy
# Clair
# 3. SBOM (Software Bill of Materials)
# Generate SBOM
syft myimage:latest
# Sign artifacts
# Cosign
cosign sign myimage:latest
# Verify
cosign verify myimage:latest
# 4. SLSA compliance
# Build provenance
# GitHub Actions
# Tekton
# 5. Secure build pipeline
# No external dependencies at build time
# Use pinned versions
# Scan for secrets

Q1614: How do you implement incident management?

Answer:

Terminal window
# Incident management
# 1. Detection
# Monitoring alerts
# User reports
# 2. Response
# Acknowledge
# Assess severity
# Mitigate
# 3. Communication
# Status page
# Stakeholder updates
# 4. Resolution
# Fix root cause
# Verify recovery
# 5. Post-incident
# Blameless post-mortem
# Action items
# Tools
# PagerDuty
# OpsGenie
# VictorOps (now Splunk On-Call)
# Runbook example
# Runbook: Database Connection Issues
# 1. Check database status
# systemctl status postgresql
# 2. Check connections
# psql -c "SELECT count(*) FROM pg_stat_activity"
# 3. Restart if needed
# systemctl restart postgresql

Q1615: How do you implement change management?

Answer:

Terminal window
# Change management
# 1. Request
# JIRA ticket
# RFC (Request for Change)
# 2. Review
# Technical review
# Security review
# 3. Approval
# Manager approval
# CAB (Change Advisory Board)
# 4. Implementation
# Schedule change window
# Implement change
# 5. Verification
# Test in staging
# Monitor in production
# 6. Documentation
# Update runbooks
# Document lessons learned
# 7. Emergency changes
# Expedited process
# Post-implementation review
# Tools
# ServiceNow
# Jira Service Management
# GitHub PRs

Q1616: How do you design a highly available web application?

Answer:

Terminal window
# Architecture components
# 1. Load balancer (HAProxy/ALB)
# 2. Web servers (multiple)
# 3. Application servers (multiple)
# 4. Database (primary + replica)
# 5. Cache (Redis Sentinel/Cluster)
# 6. Message queue (Kafka cluster)
# 7. CDN for static content
# 8. Object storage (S3)
# Implementation
# Multi-AZ deployment
# Auto-scaling groups
# Health checks
# Graceful degradation
# DNS
# Route 53 with health checks
# Database
# PostgreSQL with streaming replication
# Caching
# Redis with Sentinel or Cluster
# Monitoring
# Comprehensive observability
# DR
# Multi-region deployment

Q1617: How do you troubleshoot a slow database?

Answer:

Terminal window
# Database troubleshooting
# 1. Check system resources
# CPU, Memory, I/O
# 2. Check database stats
# PostgreSQL
# pg_stat_activity
# pg_stat_statements
# MySQL
# SHOW PROCESSLIST;
# SHOW STATUS;
# 3. Check slow queries
# PostgreSQL
# pg_stat_statements
# EXPLAIN ANALYZE
# MySQL
# SHOW PROCESSLIST
# EXPLAIN
# 4. Check indexes
# PostgreSQL
# \d table_name
# MySQL
# SHOW INDEX FROM table
# 5. Fixes
# Add indexes
# Optimize queries
# Tune configuration
# Scale horizontally
# Add read replicas

Q1618: How do you design a backup strategy?

Answer:

Terminal window
# Backup strategy
# 1. RPO/RTO definition
# Recovery Point Objective
# Recovery Time Objective
# 2. Backup types
# Full
# Incremental
# Differential
# 3. Frequency
# Full: Weekly
# Incremental: Daily
# Transaction logs: Every 15 minutes
# 4. Retention
# Daily: 30 days
# Weekly: 12 weeks
# Monthly: 12 months
# Yearly: 7 years
# 5. Testing
# Monthly restoration tests
# Document procedures
# 6. Offsite
# Cross-region replication
# Different cloud provider
# 7. Automation
# Cron jobs
# CI/CD pipelines

Answer:

Terminal window
# Linux security
# 1. Updates
# Regular patching
# 2. Firewall
# iptables/firewalld
# 3. SELinux/AppArmor
# Enable and configure
# 4. Users
# Disable root login
# SSH keys only
# Strong passwords
# 5. Services
# Disable unused services
# 6. Network
# Harden kernel parameters
# Disable IP forwarding
# Rate limiting
# 7. Monitoring
# Audit logging
# IDS
# 8. Encryption
# Full disk encryption
# TLS everywhere

Q1620: How do you design a monitoring system?

Answer:

Terminal window
# Monitoring system design
# 1. Metrics
# Prometheus
# Node exporter
# Application metrics
# 2. Logs
# ELK Stack or Loki
# 3. Traces
# Jaeger or Zipkin
# 4. Alerting
# Prometheus AlertManager
# PagerDuty integration
# 5. Dashboards
# Grafana
# 6. SLOs
# Define error budgets
# 7. Runbooks
# Document responses
# 8. On-call
# Rotation schedule

Q1621: How do you optimize Linux performance?

Answer:

Terminal window
# Linux optimization
# 1. CPU
# Tune scheduler
# Process affinity
# Priority adjustment
# 2. Memory
# Swappiness
# Cache tuning
# Huge pages
# 3. I/O
# I/O scheduler
# Filesystem choice
# Mount options
# SSD optimization
# 4. Network
# Buffer sizes
# TCP tuning
# Offloading
# 5. Kernel
# Update regularly
# Tune parameters
# 6. Applications
# Profiling
# Optimization
# Tools
# perf
# sysbench
# fio
# iperf

Q1622: How do you design a disaster recovery plan?

Answer:

Terminal window
# DR planning
# 1. Risk assessment
# Identify critical systems
# RTO/RPO requirements
# 2. Strategy
# Backup & Restore
# Pilot Light
# Warm Standby
# Multi-region
# 3. Implementation
# Automated backups
# Replication
# Infrastructure as Code
# 4. Testing
# Regular DR tests
# Document results
# 5. Documentation
# Runbooks
# Contact list
# 6. Communication
# Stakeholder notification
# Status updates

Q1623: How do you implement zero-downtime deployments?

Answer:

Terminal window
# Zero-downtime deployment
# 1. Load balancer
# Health checks
# Graceful removal
# 2. Application
# Signal handling
# Graceful shutdown
# 3. Database
# Schema migrations
# Backward compatibility
# 4. Strategies
# Rolling update
# Blue-green
# Canary
# Feature flags
# 5. Rollback plan
# Quick rollback capability
# 6. Testing
# Load testing
# Chaos engineering

Q1624: How do you handle capacity planning?

Answer:

Terminal window
# Capacity planning
# 1. Current state
# Measure utilization
# 2. Trends
# Analyze growth
# 3. Forecasting
# Predict future needs
# 4. Planning
# Add capacity proactively
# 5. Optimization
# Right-size resources
# Use automation
# Metrics
# CPU
# Memory
# Disk
# Network
# Application-specific
# Tools
# Prometheus
# Grafana
# AWS Compute Optimizer
# Azure Advisor
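
The trend and forecasting steps can be approximated with a straight-line fit over recent usage samples. The sketch below estimates how many days remain before a resource is full. The function name `days_until_full` and the assumption of one sample per day are illustrative, and a linear model is only a first approximation for real growth.

```python
from typing import Optional, Sequence


def days_until_full(daily_usage: Sequence[float],
                    capacity: float) -> Optional[float]:
    """Estimate days until usage reaches capacity using a least-squares line.

    daily_usage holds one measurement per day, oldest first.
    Returns None when usage is flat or shrinking.
    """
    n = len(daily_usage)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    numerator = sum((x - mean_x) * (y - mean_y)
                    for x, y in enumerate(daily_usage))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    slope = numerator / denominator  # growth per day
    if slope <= 0:
        return None
    return (capacity - daily_usage[-1]) / slope
```

Feeding it disk usage in GB from the last few weeks gives a rough lead time for ordering or provisioning capacity.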

Answer:

Terminal window
# Compliance implementation
# 1. Framework
# SOC 2, PCI-DSS, HIPAA, GDPR
# 2. Controls
# Access control
# Encryption
# Monitoring
# Auditing
# 3. Automation
# Policy as Code
# OPA/Gatekeeper
# 4. Evidence
# Automated collection
# Documentation
# 5. Training
# Security awareness
# 6. Testing
# Vulnerability scans
# Penetration tests
# 7. Remediation
# Track findings
# Fix issues

Answer:

Terminal window
# Designing for scale
# 1. Horizontal scaling
# Stateless applications
# Load balancers
# Auto-scaling
# 2. Database scaling
# Read replicas
# Sharding
# Partitioning
# Caching
# 3. Caching
# Multi-layer
# Redis/Memcached
# 4. Asynchronous
# Message queues
# Event-driven
# 5. CDN
# Static content
# 6. Optimization
# Profiling
# Database tuning
# 7. Monitoring
# Early detection

Q1627: How do you implement observability?

Answer:

Terminal window
# Observability
# 1. Metrics
# Prometheus
# Custom metrics
# 2. Logs
# Structured logging
# ELK/Loki
# 3. Traces
# Distributed tracing
# 4. Correlation
# Trace IDs
# Request IDs
# 5. Alerting
# Based on SLOs
# 6. Dashboards
# Service overview
# Troubleshooting
# 7. Post-mortems
# Blameless analysis
# Implementation
# OpenTelemetry
# Many tools

Q1628: How do you secure containerized applications?

Answer:

Terminal window
# Container security
# 1. Images
# Minimal base
# No secrets in images
# Scan for vulnerabilities
# 2. Runtime
# Non-root user
# Read-only root
# Resource limits
# 3. Network
# Network policies
# Service mesh
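# 4. Orchestrator
# RBAC
# Pod Security Admission (PodSecurityPolicy was removed in Kubernetes 1.25)
# 5. Secrets
# Use secrets manager
# Avoid passing secrets as environment variables; mount them as files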
# Tools
# Trivy
# Falco
# OPA

Q1629: How do you implement infrastructure as code?

Answer:

Terminal window
# Infrastructure as Code
# 1. Version control
# Git
# 2. Modules
# Reusable components
# 3. State management
# Remote state
# State locking
# 4. Testing
# Validate
# Plan
# 5. CI/CD
# Automated deployment
# 6. Drift detection
# Detect changes
# Tools
# Terraform
# Pulumi
# CloudFormation
# Ansible

Q1630: How do you manage secrets in CI/CD?

Answer:

Terminal window
# Secrets in CI/CD
# 1. Never commit secrets
# 2. Use secrets management
# HashiCorp Vault
# AWS Secrets Manager
# Azure Key Vault
# 3. Environment variables
# Inject at runtime
# 4. CI/CD integration
# GitHub Secrets
# GitLab CI variables
# 5. Rotation
# Auto-rotate secrets
# 6. Audit
# Log access

Q1631: How do you design a secure network?

Answer:

Terminal window
# Secure network design
# 1. Segmentation
# DMZ
# Internal
# Database
# 2. Firewall
# Whitelist approach
# Default deny
# 3. Encryption
# TLS everywhere
# VPN for access
# 4. Monitoring
# IDS/IPS
# NetFlow
# 5. DDoS protection
# CDN
# WAF
# Rate limiting

Q1632: How do you handle database failover?

Answer:

Terminal window
# Database failover
# 1. Automatic detection
# Health checks
# 2. Failover process
# Promote replica
# Update DNS
# 3. Application handling
# Connection retry
# Circuit breakers
# 4. Monitoring
# Alert on failover
# 5. Testing
# Regular drills
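
On the application side, connection retry during a failover is usually implemented with exponential backoff so that clients do not hammer the newly promoted primary. A minimal Python sketch follows; `call_with_retry`, the attempt count, and the base delay are assumptions, and the `sleep` parameter exists only so the function can be tested without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(operation: Callable[[], T],
                    attempts: int = 5,
                    base_delay: float = 0.1,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying on ConnectionError with exponential backoff.

    Delays are base_delay, 2*base_delay, 4*base_delay, and so on.
    The last failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

In production code, adding random jitter to each delay helps avoid synchronized retries from many clients.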

Answer:

Terminal window
# Caching strategy
# 1. CDN
# Static assets
# 2. Application cache
# Redis
# Memcached
# 3. Database cache
# Query cache
# Buffer pool
# 4. Browser cache
# Headers
# 5. Invalidation
# TTL
# Cache busting
# Patterns

Q1634: How do you design for high availability?

Answer:

Terminal window
# High availability design
# 1. Redundancy
# Multiple AZs
# Multiple regions
# 2. Load balancing
# Health checks
# Failover
# 3. Data replication
# Synchronous
# Asynchronous
# 4. Monitoring
# Fast detection
# 5. Automation
# Self-healing
# 6. Testing
# Chaos engineering

Answer:

Terminal window
# Kubernetes security
# 1. RBAC
# Least privilege
# 2. Network policies
# Default deny
# 3. Pod security
# Standards
# 4. Secrets
# External
# 5. Images
# Scanning
# 6. Runtime
# Falco
# 7. Updates
# Regular

Answer:

Terminal window
# API security
# 1. Authentication
# OAuth 2.0
# JWT
# 2. Authorization
# RBAC
# Scopes
# 3. Rate limiting
# Throttling
# 4. Input validation
# Sanitization
# 5. TLS
# Encryption
# 6. Monitoring
# Anomaly detection

Answer:

Terminal window
# Logging implementation
# 1. Format
# JSON
# Structured
# 2. Levels
# DEBUG, INFO, WARN, ERROR
# 3. Correlation
# Trace IDs
# 4. Rotation
# Logrotate
# 5. Aggregation
# ELK/Loki
# 6. Retention
# Policy
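
Structured JSON output, levels, and trace IDs from the outline above can be combined in a small helper for shell scripts. This is a sketch only: `log_json` and its field names (`ts`, `level`, `trace_id`, `msg`) are assumptions, and it escapes only backslashes and double quotes, not control characters.

```shell
# log_json LEVEL TRACE_ID MESSAGE
# Emits one JSON object per line with a UTC timestamp.
log_json() {
  local level="$1" trace_id="$2" message="$3"
  # Escape backslashes and double quotes so the line stays valid JSON.
  # Newlines and other control characters are NOT handled here.
  message=$(printf '%s' "$message" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"ts":"%s","level":"%s","trace_id":"%s","msg":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$trace_id" "$message"
}

# Example:
# log_json INFO 4f2a9c "backup finished"
```

One object per line keeps the output easy to ship to an aggregator such as ELK or Loki, and the shared `trace_id` lets entries from different services be joined.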

Answer:

Terminal window
# Security design
# 1. Defense in depth
# Multiple layers
# 2. Least privilege
# Minimize access
# 3. Zero trust
# Verify always
# 4. Encryption
# Everywhere
# 5. Monitoring
# Continuous
# 6. Automation
# Respond fast

Q1639: How do you implement incident response?

Answer:

Terminal window
# Incident response
# 1. Preparation
# Runbooks
# Tools
# 2. Detection
# Alerts
# 3. Containment
# Isolate
# 4. Eradication
# Fix
# 5. Recovery
# Restore
# 6. Lessons learned
# Post-mortem

Answer:

Terminal window
# Cost optimization
# 1. Right-sizing
# Match needs
# 2. Reservations
# Steady state
# 3. Spot
# Fault-tolerant
# 4. Automation
# Scale down
# 5. Cleanup
# Unused resources
# 6. Monitoring
# Alerts

Q1641: How do you implement change automation?

Answer:

Terminal window
# Change automation
# 1. GitOps
# All changes in Git
# 2. CI/CD
# Automated testing
# 3. Approval gates
# Manual steps
# 4. Rollback
# Automatic
# 5. Monitoring
# Quick detection

Answer:

Terminal window
# Design for failure
# 1. Redundancy
# Multiple copies
# 2. Graceful degradation
# Partial service
# 3. Circuit breakers
# Prevent cascade
# 4. Bulkheads
# Isolate
# 5. Recovery
# Fast
# 6. Testing
# Chaos

Q1643: How do you implement access control?

Answer:

Terminal window
# Access control
# 1. Authentication
# MFA
# 2. Authorization
# RBAC
# 3. Least privilege
# Minimal access
# 4. Audit
# Log access
# 5. Review
# Regular

Answer:

Terminal window
# Data security
# 1. Classification
# Sensitivity
# 2. Encryption
# At rest
# In transit
# 3. Access control
# Need to know
# 4. Backup
# Encrypted
# 5. Monitoring
# Audit

Answer:

Terminal window
# API design
# 1. REST
# Resources
# HTTP verbs
# 2. Versioning
# URL path
# 3. Error handling
# Consistent
# 4. Pagination
# Large sets
# 5. Rate limiting
# Throttle
# 6. Documentation
# OpenAPI

Answer:

Terminal window
# Service mesh
# 1. Traffic management
# Routing
# 2. Security
# mTLS
# 3. Observability
# Tracing
# 4. Resilience
# Retries
# Tools
# Istio
# Linkerd
# Consul Connect

Answer:

Terminal window
# Database optimization
# 1. Indexing
# Proper indexes
# 2. Query optimization
# EXPLAIN
# 3. Caching
# Use cache
# 4. Connection pooling
# Pool
# 5. Scaling
# Read replicas
# Sharding
# 6. Configuration
# Tune parameters

Q1648: How do you implement secrets management?

Answer:

Terminal window
# Secrets management
# 1. Centralized
# Vault
# 2. Rotation
# Auto
# 3. Audit
# Log access
# 4. Encryption
# Encrypt
# 5. Access control
# Least privilege

Answer:

Terminal window
# Disaster recovery
# 1. Backup
# Regular
# 2. Replication
# Cross-region
# 3. Automation
# Fast recovery
# 4. Testing
# Regular
# 5. Documentation
# Runbooks

Q1650: How do you implement observability?

Answer:

Terminal window
# Observability
# 1. Metrics
# Prometheus
# 2. Logs
# ELK
# 3. Traces
# Jaeger
# 4. Correlation
# Trace IDs
# 5. Alerting
# SLO-based

Answer:

Terminal window
# Kernel upgrade
# 1. Test in staging
# 2. Check compatibility
# 3. Backup
# 4. Schedule window
# 5. Apply
# 6. Monitor
# 7. Rollback plan

Q1652: How do you design multi-tenant systems?

Answer:

Terminal window
# Multi-tenancy
# 1. Isolation
# Namespaces
# RBAC
# 2. Quotas
# Resources
# 3. Billing
# Usage tracking
# 4. Data separation
# Logical/physical
# 5. Network
# Segmentation

Q1653: How do you implement edge computing?

Answer:

Terminal window
# Edge computing
# 1. Lightweight K8s
# K3s
# 2. Data processing
# Local first
# 3. Sync
# Periodic
# 4. Security
# Edge-specific
# 5. Management
# Centralized

Q1654: How do you optimize Linux for containers?

Answer:

Terminal window
# Container optimization
# 1. OS
# Minimal OS
# 2. Kernel
# Tuned for containers
# 3. Storage
# Overlay2
# 4. Network
# CNI
# 5. Runtime
# containerd
# 6. Security
# Hardened

Answer:

Terminal window
# GDPR compliance
# 1. Data minimization
# Collect less
# 2. Consent
# Explicit
# 3. Right to erasure
# Delete capability
# 4. Portability
# Export data
# 5. Breach notification
# Process
# 6. DPO
# Appoint

Q1656: How do you implement zero-downtime patching?

Answer:

Terminal window
# Zero-downtime patching
# 1. Blue-green
# Two environments
# 2. Canary
# Gradual
# 3. Rolling
# One by one
# 4. Health checks
# Before switch
# 5. Rollback
# Quick
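
The rolling strategy with health checks can be sketched as a loop that patches one host at a time and stops at the first failure. `rolling_patch` is a hypothetical helper; the patch and health-check commands are supplied by the caller and are assumptions, not part of any specific tool.

```shell
# rolling_patch PATCH_CMD HEALTH_CMD HOST...
# Patches hosts one at a time and stops at the first host that fails
# to patch or fails its health check, so the remaining hosts stay untouched.
rolling_patch() {
  local patch_cmd="$1" health_cmd="$2" host
  shift 2
  for host in "$@"; do
    echo "patching ${host}"
    if ! "$patch_cmd" "$host"; then
      echo "patch failed on ${host}, stopping rollout" >&2
      return 1
    fi
    if ! "$health_cmd" "$host"; then
      echo "health check failed on ${host}, stopping rollout" >&2
      return 1
    fi
  done
  echo "rollout complete"
}

# Example with caller-defined functions (names are placeholders):
# rolling_patch patch_host check_host web1 web2 web3
```

Stopping early limits the blast radius to a single host, which is what makes a quick rollback practical.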

Answer:

Terminal window
# IoT architecture
# 1. Edge
# Local processing
# 2. Protocol
# MQTT
# 3. Security
# Device auth
# 4. Scale
# Millions
# 5. OTA updates
# Secure

Answer:

Terminal window
# RBAC implementation
# 1. Roles
# Define
# 2. Permissions
# Map
# 3. Assignment
# Users
# 4. Audit
# Regular review
# 5. Tools
# LDAP integration

Q1659: How do you optimize network performance?

Answer:

Terminal window
# Network optimization
# 1. Offloading
# Hardware
# 2. Buffer tuning
# TCP
# 3. Compression
# Negotiated via Accept-Encoding (gzip/Brotli)
# 4. CDN
# Static
# 5. Keepalive
# HTTP

Answer:

Terminal window
# Mobile optimization
# 1. API design
# Efficient
# 2. Compression
# gzip/Brotli
# 3. Caching
# Aggressive
# 4. Offline
# PWA
# 5. Security
# Certificate pinning

Q1661: How do you implement chaos engineering?

Answer:

Terminal window
# Chaos engineering
# 1. Define steady state
# What works
# 2. Hypothesize
# What will fail
# 3. Experiment
# Inject failure
# 4. Learn
# Observe
# 5. Improve
# Fix
# Tools
# Chaos Mesh
# Litmus
# Gremlin

Q1662: How do you implement immutable infrastructure?

Answer:

Terminal window
# Immutable infrastructure
# 1. Images
# Pre-built
# 2. No changes
# Rebuild
# 3. Versioned
# All
# 4. Rollback
# Previous image
# 5. Tools
# Packer
# Container

Q1663: How do you design for high performance?

Answer:

Terminal window
# High performance design
# 1. Profiling
# Find bottleneck
# 2. Optimization
# Targeted
# 3. Caching
# Multi-layer
# 4. Async
# Non-blocking
# 5. Scaling
# Horizontal

Answer:

Terminal window
# Multi-cloud strategy
# 1. Abstraction
# Terraform
# 2. Portability
# Container
# 3. Vendor lock-in
# Avoid
# 4. Data
# Strategy
# 5. Operations
# Unified

Q1665: How do you implement cost allocation?

Answer:

Terminal window
# Cost allocation
# 1. Tagging
# All resources
# 2. Tracking
# By team/project
# 3. Reporting
# Regular
# 4. Budgets
# Alerts
# 5. Accountability
# Showback

Q1666: How do you design for compliance automation?

Answer:

Terminal window
# Compliance automation
# 1. Policy as code
# OPA
# 2. Scanning
# Automated
# 3. Evidence
# Auto-collect
# 4. Remediation
# Auto-fix
# 5. Audit
# Regular

Q1667: How do you implement API rate limiting?

Answer:

Terminal window
# API rate limiting
# 1. Algorithm
# Token bucket or leaky bucket
# 2. Per-user
# By key
# 3. Headers
# Rate limit
# 4. Response
# 429 Too Many Requests, with a Retry-After header
# 5. Throttling
# Graceful
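
To make the token bucket concrete, here is a minimal single-client sketch in shell. The capacity and refill rate are illustrative values, `bucket_allow` is a hypothetical name, and the current time is passed in explicitly so the behavior is easy to reason about; a real service would keep one bucket per API key in a shared store.

```shell
# Token bucket state for one client, kept in shell variables for brevity.
BUCKET_CAPACITY=5      # maximum burst size (illustrative)
BUCKET_REFILL_RATE=1   # tokens added per second (illustrative)
bucket_tokens=$BUCKET_CAPACITY
bucket_last_refill=0

# bucket_allow NOW_SECONDS: returns 0 if the request may proceed,
# 1 if the caller should answer with HTTP 429.
# Must be called in the current shell (not in a subshell) to keep state.
bucket_allow() {
  local now="$1" elapsed
  elapsed=$(( now - bucket_last_refill ))
  if [ "$elapsed" -gt 0 ]; then
    # Refill in proportion to elapsed time, capped at the bucket capacity.
    bucket_tokens=$(( bucket_tokens + elapsed * BUCKET_REFILL_RATE ))
    if [ "$bucket_tokens" -gt "$BUCKET_CAPACITY" ]; then
      bucket_tokens=$BUCKET_CAPACITY
    fi
    bucket_last_refill=$now
  fi
  if [ "$bucket_tokens" -gt 0 ]; then
    bucket_tokens=$(( bucket_tokens - 1 ))
    return 0
  fi
  return 1
}

# Example:
# if bucket_allow "$(date +%s)"; then echo "200"; else echo "429"; fi
```

The bucket allows short bursts up to the capacity while holding the long-run rate at the refill rate, which is the main difference from a fixed-window counter.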

Q1668: How do you design for IoT security?

Answer:

Terminal window
# IoT security
# 1. Device identity
# Certificates
# 2. OTA updates
# Signed
# 3. Network
# Segmentation
# 4. Data
# Encryption
# 5. Monitoring
# Anomaly

Q1669: How do you implement infrastructure monitoring?

Answer:

Terminal window
# Infrastructure monitoring
# 1. Metrics
# Collect
# 2. Storage
# Time-series
# 3. Visualization
# Dashboards
# 4. Alerting
# Thresholds
# 5. Analysis
# Trends

Q1670: How do you implement database sharding?

Answer:

Terminal window
# Database sharding
# 1. Key strategy
# Choose shard key
# 2. Routing
# Application
# 3. Rebalancing
# Plan
# 4. Cross-shard
# Minimize
# 5. Monitoring
# Performance
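
Application-side routing by shard key can be illustrated with a simple hash-modulo function. This sketch assumes modulo placement over a fixed shard count; `shard_for` is a hypothetical helper, and the CRC from `cksum` stands in for whatever hash the application actually uses.

```shell
# shard_for KEY SHARD_COUNT: map a shard key to a shard index in
# the range 0..SHARD_COUNT-1 using the CRC printed by POSIX cksum.
shard_for() {
  local key="$1" shard_count="$2" crc
  crc=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
  echo $(( crc % shard_count ))
}

# Example (the key format is a placeholder):
# shard_for "user:42" 4
```

Modulo placement moves most keys when the shard count changes, which is why the rebalancing step needs a plan up front; consistent hashing or a lookup table are common alternatives.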

Answer:

Terminal window
# 5G optimization
# 1. Edge computing
# Local processing
# 2. Network slicing
# Dedicated
# 3. Low latency
# Optimization
# 4. Massive IoT
# Scale

Q1672: How do you implement service discovery?

Answer:

Terminal window
# Service discovery
# 1. DNS
# Consul
# 2. Health checks
# Registration
# 3. Load balancing
# Client-side
# 4. Failover
# Automatic

Q1673: How do you optimize web performance?

Answer:

Terminal window
# Web performance
# 1. CDN
# Static assets
# 2. Compression
# gzip/Brotli
# 3. Caching
# Headers
# 4. Minification
# CSS/JS
# 5. Images
# Optimization

Q1674: How do you implement backup verification?

Answer:

Terminal window
# Backup verification
# 1. Test restore
# Regular
# 2. Automation
# Script
# 3. Checksums
# Verify
# 4. Documentation
# Procedures
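
The test-restore and checksum steps can be scripted together: record checksums when the backup is taken, then restore into a scratch directory and compare. This is a sketch assuming GNU coreutils, GNU findutils, and a gzip-compressed tar archive created with `tar -czf ARCHIVE -C DIR .`; `make_manifest` and `verify_backup` are hypothetical names.

```shell
# make_manifest DIR MANIFEST: record a SHA-256 checksum for every file in DIR.
# Paths are stored relative to DIR (for example ./etc/app.conf).
make_manifest() {
  local dir="$1" manifest="$2"
  ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# verify_backup ARCHIVE MANIFEST: restore ARCHIVE into a scratch directory
# and check every file listed in MANIFEST against the restored copy.
verify_backup() {
  local archive="$1" manifest="$2" scratch status
  scratch=$(mktemp -d) || return 1
  if tar -xzf "$archive" -C "$scratch" \
     && ( cd "$scratch" && sha256sum --quiet -c ) < "$manifest"; then
    status=0
  else
    status=1
  fi
  rm -rf "$scratch"
  return "$status"
}

# Example (paths are placeholders):
# make_manifest /srv/app /var/backups/app.sha256
# tar -czf /var/backups/app.tar.gz -C /srv/app .
# verify_backup /var/backups/app.tar.gz /var/backups/app.sha256
```

A checksum match proves the files restore intact; it does not prove the application can start from them, so periodic full restore drills are still needed.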

Answer:

Terminal window
# Privacy design
# 1. Data minimization
# Collect less
# 2. Encryption
# Strong
# 3. Access control
# Strict
# 4. Audit
# Logging
# 5. Retention
# Policy

Q1676: How do you implement auto-remediation?

Answer:

Terminal window
# Auto-remediation
# 1. Detection
# Alerts
# 2. Classification
# Severity
# 3. Action
# Runbook
# 4. Automation
# Scripts
# 5. Verification
# Confirm fix
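
The detect, act, verify sequence above can be sketched as a small wrapper. `remediate` is a hypothetical helper; the check and fix commands are supplied by the caller, and the systemd unit name in the example is a placeholder.

```shell
# remediate CHECK_CMD FIX_CMD
# Returns 0 when the check passes (before or after the fix),
# 1 when the fix fails or does not restore health, so the caller can escalate.
remediate() {
  local check_cmd="$1" fix_cmd="$2"
  if "$check_cmd"; then
    return 0    # healthy, nothing to do
  fi
  echo "check failed, running remediation" >&2
  "$fix_cmd" || { echo "remediation command failed, escalate" >&2; return 1; }
  # Verification: never assume the fix worked, run the check again.
  if "$check_cmd"; then
    echo "remediation verified" >&2
    return 0
  fi
  echo "still unhealthy after remediation, escalate" >&2
  return 1
}

# Example with caller-defined functions (myapp.service is a placeholder):
# app_is_up() { systemctl is-active --quiet myapp.service; }
# restart_app() { systemctl restart myapp.service; }
# remediate app_is_up restart_app || echo "page the on-call"
```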

Answer:

Terminal window
# Storage optimization
# 1. Tiering
# Hot/cold
# 2. Compression
# Deduplication
# 3. Lifecycle
# Policies
# 4. Monitoring
# Usage
# 5. Cleanup
# Regular

Answer:

Terminal window
# MFA implementation
# 1. Factors
# Multiple
# 2. Methods
# TOTP/Push
# 3. Rollout
# Gradual
# 4. Backup
# Recovery codes
# 5. Enforcement
# Policy

Answer:

Terminal window
# Resilience design
# 1. Redundancy
# Multiple
# 2. Fault tolerance
# Graceful
# 3. Recovery
# Fast
# 4. Testing
# Chaos
# 5. Monitoring
# Real-time

Q1680: How do you implement cost reporting?

Answer:

Terminal window
# Cost reporting
# 1. Tagging
# Comprehensive
# 2. Collection
# Automated
# 3. Analysis
# By team
# 4. Visualization
# Dashboards
# 5. Actions
# Optimization

Answer:

Terminal window
# IoT data management
# 1. Collection
# MQTT/HTTP
# 2. Processing
# Stream
# 3. Storage
# Time-series
# 4. Analysis
# Real-time
# 5. Retention
# Policy

Q1682: How do you implement service catalog?

Answer:

Terminal window
# Service catalog
# 1. Self-service
# Portal
# 2. Standardization
# Templates
# 3. Governance
# Approval
# 4. Documentation
# Auto-generated

Q1683: How do you optimize database queries?

Answer:

Terminal window
# Query optimization
# 1. EXPLAIN
# Analyze
# 2. Indexing
# Strategic
# 3. Rewriting
# Equivalent
# 4. Caching
# Result caching (MySQL's built-in query cache was removed in 8.0)
# 5. Profiling
# Slow queries

Answer:

Terminal window
# API gateway
# 1. Routing
# Path-based
# 2. Authentication
# JWT
# 3. Rate limiting
# Quotas
# 4. Caching
# Response
# 5. Monitoring
# Usage

Answer:

Terminal window
# Compliance design
# 1. Controls
# Framework
# 2. Automation
# Policy
# 3. Evidence
# Collection
# 4. Monitoring
# Continuous
# 5. Audit
# Regular

Q1686: How do you implement incident automation?

Answer:

Terminal window
# Incident automation
# 1. Detection
# Automated
# 2. Triage
# Classification
# 3. Response
# Runbooks
# 4. Escalation
# Rules
# 5. Resolution
# Tracking

Answer:

Terminal window
# Kubernetes optimization
# 1. Resources
# Requests/limits
# 2. Scheduling
# Affinity
# 3. Networking
# CNI
# 4. Storage
# Classes
# 5. Autoscaling
# HPA/VPA

Q1688: How do you implement data governance?

Answer:

Terminal window
# Data governance
# 1. Classification
# Sensitivity
# 2. Ownership
# Clear
# 3. Quality
# Rules
# 4. Lineage
# Tracking
# 5. Compliance
# Policy

Q1689: How do you design for ML infrastructure?

Answer:

Terminal window
# ML infrastructure
# 1. Data pipeline
# ETL
# 2. Training
# Distributed
# 3. Serving
# Model serving
# 4. Monitoring
# Drift
# 5. MLOps
# Automation

Q1690: How do you implement cloud governance?

Answer:

Terminal window
# Cloud governance
# 1. Policies
# Guardrails
# 2. Tagging
# Standards
# 3. Cost control
# Budgets
# 4. Security
# Baseline
# 5. Compliance
# Audit

Q1691: How do you design for edge security?

Answer:

Terminal window
# Edge security
# 1. Device auth
# Certificates
# 2. Data encryption
# TLS
# 3. Network
# Segmentation
# 4. Updates
# Signed
# 5. Monitoring
# Centralized

Q1692: How do you implement container orchestration?

Answer:

Terminal window
# Container orchestration
# 1. Scheduling
# Placement
# 2. Scaling
# Auto
# 3. Networking
# Service mesh
# 4. Storage
# CSI
# 5. Security
# Policies

Q1693: How do you optimize network latency?

Answer:

Terminal window
# Network latency optimization
# 1. CDN
# Geographic
# 2. Caching
# Multi-layer
# 3. Compression
# gzip/Brotli
# 4. HTTP/2
# Multiplexing
# 5. DNS
# Anycast

Q1694: How do you implement data protection?

Answer:

Terminal window
# Data protection
# 1. Encryption
# At rest/transit
# 2. Access control
# RBAC
# 3. Backup
# Automated
# 4. Monitoring
# Audit
# 5. Incident
# Response

Q1695: How do you design for real-time processing?

Answer:

Terminal window
# Real-time processing
# 1. Stream processing
# Kafka/Spark
# 2. Low latency
# Optimization
# 3. Scalability
# Horizontal
# 4. Monitoring
# Metrics
# 5. Backpressure
# Handling

Q1696: How do you implement application security?

Answer:

Terminal window
# Application security
# 1. SDLC
# Secure
# 2. SAST/DAST
# Scanning
# 3. Dependencies
# Scanning
# 4. Runtime
# Protection
# 5. Training
# Developers

Q1697: How do you optimize Linux for databases?

Answer:

Terminal window
# Linux database optimization
# 1. Filesystem
# XFS/ext4
# 2. I/O scheduler
# mq-deadline/none (deadline/noop on pre-5.0 kernels)
# 3. Memory
# Huge pages
# 4. Network
# Buffer sizes
# 5. Disk
# SSD/NVMe
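
Before changing the I/O scheduler it helps to see what each device currently uses. The kernel lists the available schedulers in `/sys/block/<dev>/queue/scheduler` and marks the active one with square brackets. The helpers below are a sketch; `active_scheduler` and `show_schedulers` are hypothetical names.

```shell
# active_scheduler SCHEDULER_LINE: extract the bracketed (active) entry from
# the contents of /sys/block/<dev>/queue/scheduler.
# If the line contains no brackets, the whole line is returned unchanged.
active_scheduler() {
  local line="$1"
  line=${line#*\[}
  echo "${line%%\]*}"
}

# show_schedulers: print the active scheduler for every block device.
show_schedulers() {
  local path dev
  for path in /sys/block/*/queue/scheduler; do
    [ -r "$path" ] || continue
    dev=${path#/sys/block/}
    dev=${dev%%/*}
    echo "${dev}: $(active_scheduler "$(cat "$path")")"
  done
}

# Example:
# show_schedulers
# To switch a device at runtime, as root (sda is a placeholder device):
# echo mq-deadline > /sys/block/sda/queue/scheduler
```

A change written through sysfs does not survive a reboot; persisting it is typically done with a udev rule or a tuning profile, depending on the distribution.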

Q1698: How do you implement data retention?

Answer:

Terminal window
# Data retention
# 1. Policy
# Defined
# 2. Classification
# By type
# 3. Automation
# Scripts
# 4. Compliance
# Legal holds
# 5. Verification
# Regular

Q1699: How do you design for compliance reporting?

Answer:

Terminal window
# Compliance reporting
# 1. Evidence
# Automated
# 2. Framework
# Mapping
# 3. Controls
# Validation
# 4. Audit
# Support
# 5. Remediation
# Tracking

Q1700: How do you implement Kubernetes networking?

Answer:

Terminal window
# Kubernetes networking
# 1. CNI plugin
# Calico/Flannel
# 2. Network policies
# Segmentation
# 3. Services
# Types
# 4. Ingress
# Controller
# 5. DNS
# CoreDNS

Q1701: How do you optimize database connections?

Answer:

Terminal window
# Database connection optimization
# 1. Pooling
# Connection pool
# 2. Sizing
# Pool size
# 3. Timeouts
# Configure
# 4. Monitoring
# Active connections
# 5. Tuning
# Database config

Q1702: How do you implement backup automation?

Answer:

Terminal window
# Backup automation
# 1. Scheduling
# Cron
# 2. Retention
# Policy
# 3. Verification
# Test restore
# 4. Offsite
# Replication
# 5. Monitoring
# Alerts
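
Scheduling and retention are often handled by a cron entry plus a pruning step. The sketch below assumes backups are stored as `*.tar.gz` files in a single directory; `prune_backups`, the script paths, and the 14-day retention are placeholders.

```shell
# prune_backups DIR KEEP_DAYS: delete *.tar.gz archives directly inside DIR
# whose modification time is more than KEEP_DAYS days old.
# Deleted paths are printed so the cron log shows what was removed.
prune_backups() {
  local dir="$1" keep_days="$2"
  find "$dir" -maxdepth 1 -type f -name '*.tar.gz' -mtime "+${keep_days}" -print -delete
}

# Example crontab entry (paths are placeholders): back up at 02:00, then prune.
# 0 2 * * * /usr/local/bin/backup.sh && /usr/local/bin/prune.sh /var/backups/app 14
```

Chaining with `&&` means old archives are only pruned after a new backup succeeded, so a broken backup job cannot silently age out the last good copy.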

Q1703: How do you design for regulatory compliance?

Answer:

Terminal window
# Regulatory compliance
# 1. Assessment
# Gap analysis
# 2. Controls
# Implementation
# 3. Monitoring
# Continuous
# 4. Documentation
# Evidence
# 5. Audit
# Support

Q1704: How do you implement service level objectives?

Answer:

Terminal window
# SLO implementation
# 1. Define
# Metrics
# 2. Measurement
# Collection
# 3. Alerting
# Budget
# 4. Reporting
# Regular
# 5. Improvement
# Action
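
The "Budget" under alerting refers to the error budget: the number of failures the SLO tolerates over the measurement window. A worked sketch, assuming a request-based availability SLO; `error_budget_remaining` is a hypothetical helper.

```shell
# error_budget_remaining TOTAL_REQUESTS FAILED_REQUESTS SLO_PERCENT
# Prints the share of the error budget that is still unspent, in percent.
# A negative value means the budget is overspent.
error_budget_remaining() {
  LC_ALL=C awk -v total="$1" -v failed="$2" -v slo="$3" 'BEGIN {
    allowed = total * (100 - slo) / 100   # failures the SLO tolerates
    if (allowed <= 0) { print "0.0"; exit }
    printf "%.1f\n", (allowed - failed) / allowed * 100
  }'
}

# Example: 1000 requests, 5 failures, 99% SLO.
# The SLO tolerates 10 failures, so half the budget is left.
# error_budget_remaining 1000 5 99
```

Alerting on how fast this number drops (the burn rate) tends to be less noisy than alerting on individual errors.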

Answer:

Terminal window
# Linux storage optimization
# 1. Filesystem
# Choice
# 2. Mount options
# Tuning
# 3. LVM
# Flexible
# 4. RAID
# Configuration
# 5. Monitoring
# I/O

Q1706: How do you implement network segmentation?

Answer:

Terminal window
# Network segmentation
# 1. VLANs
# Isolation
# 2. Firewalls
# Zones
# 3. Zero trust
# Micro-segmentation
# 4. Monitoring
# Traffic
# 5. Compliance
# Audit

Q1707: How do you design for ML model serving?

Answer:

Terminal window
# ML model serving
# 1. Framework
# TensorFlow Serving
# 2. Scaling
# Horizontal
# 3. A/B testing
# Canary
# 4. Monitoring
# Drift
# 5. Updates
# Rolling

Q1708: How do you implement vulnerability management?

Answer:

Terminal window
# Vulnerability management
# 1. Scanning
# Regular
# 2. Prioritization
# Severity
# 3. Remediation
# Process
# 4. Verification
# Rescan
# 5. Reporting
# Metrics

Q1709: How do you optimize web application security?

Answer:

Terminal window
# Web application security
# 1. WAF
# Deploy
# 2. Headers
# Security
# 3. Input validation
# Sanitization
# 4. SQL injection
# Prevention
# 5. XSS
# Protection

Q1710: How do you design for compliance automation?

Answer:

Terminal window
# Compliance automation
# 1. Policy as code
# OPA
# 2. Scanning
# Continuous
# 3. Remediation
# Auto
# 4. Evidence
# Collection
# 5. Reporting
# Automated

Q1711: How do you implement incident communication?

Answer:

Terminal window
# Incident communication
# 1. Stakeholders
# Identification
# 2. Status page
# Updates
# 3. Channels
# Multiple
# 4. Timing
# Regular
# 5. Post-incident
# Communication

Q1712: How do you optimize Kubernetes resources?

Answer:

Terminal window
# Kubernetes resource optimization
# 1. Requests
# Set appropriately
# 2. Limits
# Configure
# 3. HPA
# Auto-scale
# 4. VPA
# Recommendations
# 5. Monitoring
# Usage

Q1713: How do you implement data classification?

Answer:

Terminal window
# Data classification
# 1. Categories
# Public, Internal, Confidential
# 2. Labeling
# Automatic
# 3. Policies
# Based on class
# 4. Training
# Awareness
# 5. Auditing
# Regular

Q1714: How do you design for regulatory requirements?

Answer:

Terminal window
# Regulatory requirements
# 1. Framework
# Selection
# 2. Controls
# Implementation
# 3. Monitoring
# Continuous
# 4. Evidence
# Automated
# 5. Audit
# Support

Q1715: How do you implement cost allocation tags?

Answer:

Terminal window
# Cost allocation tags
# 1. Tagging policy
# Required tags
# 2. Enforcement
# Service control policies (AWS SCPs)
# 3. Reporting
# By tag
# 4. Alerts
# Budget
# 5. Optimization
# Action

Q1716: How do you optimize Linux for networking?

Answer:

Terminal window
# Linux network optimization
# 1. Buffer sizes
# Tuning
# 2. Offloading
# Enable
# 3. TCP
# Parameters
# 4. Queue
# Tuning
# 5. Monitoring
# Metrics
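
Buffer, TCP, and queue tuning is normally applied through a `sysctl.d` fragment. The values below are illustrative starting points only, not recommendations for any particular workload; `render_net_sysctl`, the file name, and the interface name are placeholders.

```shell
# render_net_sysctl: print a sysctl.d fragment that raises socket buffer
# limits and queue depths. Values are illustrative, not tuned for any host.
render_net_sysctl() {
  cat <<'EOF'
# Illustrative starting values only; measure before and after changing them.
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 5000
net.core.somaxconn = 4096
EOF
}

# Typical use, as root (the file name is a placeholder):
# render_net_sysctl > /etc/sysctl.d/90-net-tuning.conf && sysctl --system
# Inspect NIC offload settings (eth0 is a placeholder interface name):
# ethtool -k eth0
```

Larger buffers mainly help high-bandwidth, high-latency links; on a low-latency LAN the defaults are often already adequate, so compare throughput and retransmit metrics before and after.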

Q1717: How do you implement service mesh security?

Answer:

Terminal window
# Service mesh security
# 1. mTLS
# Enable
# 2. Authorization
# Policies
# 3. Encryption
# Automatic
# 4. Audit
# Logging
# 5. Updates
# Regular

Q1718: How do you design for disaster recovery testing?

Answer:

Terminal window
# DR testing
# 1. Schedule
# Regular
# 2. Scope
# Defined
# 3. Documentation
# Runbooks
# 4. Validation
# Success
# 5. Improvements
# Action items

Q1719: How do you implement API versioning?

Answer:

Terminal window
# API versioning
# 1. Strategy
# URL path
# 2. Deprecation
# Policy
# 3. Documentation
# OpenAPI (Swagger)
# 4. Migration
# Guide
# 5. Support
# Timeline

Q1720: How do you optimize container images?

Answer:

Terminal window
# Container image optimization
# 1. Base image
# Minimal
# 2. Layers
# Reduce
# 3. Caching
# Build cache
# 4. Multi-stage
# Build
# 5. Scanning
# Security

Q1721: How do you implement compliance monitoring?

Answer:

Terminal window
# Compliance monitoring
# 1. Controls
# Continuous
# 2. Alerts
# Deviation
# 3. Reporting
# Regular
# 4. Remediation
# Tracking
# 5. Audit
# Support

Q1722: How do you design for data pipelines?

Answer:

Terminal window
# Data pipeline design
# 1. Source
# Connectors
# 2. Processing
# ETL/ELT
# 3. Quality
# Validation
# 4. Destination
# Storage
# 5. Monitoring
# Alerts

Q1723: How do you implement zero trust network?

Answer:

Terminal window
# Zero trust network
# 1. Verify
# Always
# 2. Least privilege
# Access
# 3. Micro-segmentation
# Network
# 4. Encryption
# All traffic
# 5. Monitoring
# Continuous

Q1724: How do you optimize Linux for high availability?

Answer:

Terminal window
# Linux HA optimization
# 1. Keepalived
# Configure
# 2. HAProxy
# Tune
# 3. Health checks
# Configure
# 4. Monitoring
# Comprehensive
# 5. Testing
# Regular

Q1725: How do you implement security automation?

Answer:

Terminal window
# Security automation
# 1. Scanning
# Automated
# 2. Remediation
# Auto-fix
# 3. Response
# Playbooks
# 4. Integration
# CI/CD
# 5. Monitoring
# Continuous

Q1726: How do you design for event-driven architecture?

Answer:

Terminal window
# Event-driven architecture
# 1. Event sourcing
# Design
# 2. Message broker
# Kafka
# 3. Consumers
# Scaling
# 4. Idempotency
# Handle
# 5. Monitoring
# Events

Q1727: How do you implement infrastructure testing?

Answer:

Terminal window
# Infrastructure testing
# 1. Validation
# Terraform
# 2. Integration
# Test Kitchen
# 3. Compliance
# InSpec
# 4. Security
# Scanning
# 5. Chaos
# Engineering

Answer:

Terminal window
# DevOps optimization
# 1. CI/CD
# Optimize
# 2. Automation
# Everything
# 3. Monitoring
# Feedback
# 4. Collaboration
# Teams
# 5. Culture
# Improvement

Q1729: How do you implement data encryption?

Answer:

Terminal window
# Data encryption
# 1. At rest
# LUKS
# 2. In transit
# TLS
# 3. Application
# Field-level
# 4. Keys
# Management
# 5. Rotation
# Policy

Q1730: How do you design for incident recovery?

Answer:

Terminal window
# Incident recovery
# 1. Detection
# Fast
# 2. Containment
# Quick
# 3. Eradication
# Complete
# 4. Recovery
# Fast
# 5. Post-incident
# Learning

Q1731: How do you implement container security scanning?

Answer:

Terminal window
# Container security scanning
# 1. Build time
# Scan images
# 2. Registry
# Scan stored
# 3. Runtime
# Scan running
# 4. Policies
# Define
# 5. Automation
# CI/CD

Q1732: How do you optimize Linux for virtualization?

Answer:

Terminal window
# Linux virtualization optimization
# 1. CPU
# Pinning
# 2. Memory
# Overcommit
# 3. Network
# Para-virtual
# 4. Storage
# VirtIO
# 5. Monitoring
# Per-VM

Q1733: How do you implement access certification?

Answer:

Terminal window
# Access certification
# 1. Review schedule
# Quarterly
# 2. Certification
# Campaign
# 3. Remediation
# Tasks
# 4. Exceptions
# Approval
# 5. Reporting
# Audit

Q1734: How do you design for data recovery?

Answer:

Terminal window
# Data recovery
# 1. Backups
# Multiple
# 2. Point in time
# Capability
# 3. Testing
# Regular
# 4. Documentation
# Procedures
# 5. Team
# Training

Q1735: How do you implement API authentication?

Answer:

Terminal window
# API authentication
# 1. OAuth 2.0
# Implement
# 2. JWT
# Tokens
# 3. API keys
# Management
# 4. Rotation
# Policy
# 5. Monitoring
# Usage

Q1736: How do you optimize database indexing?

Answer:

Terminal window
# Database indexing
# 1. Identify
# Slow queries
# 2. Analyze
# EXPLAIN
# 3. Create
# Appropriate
# 4. Composite
# Order
# 5. Maintenance
# Rebuild

Q1737: How do you implement incident triage?

Answer:

Terminal window
# Incident triage
# 1. Classification
# Severity
# 2. Impact
# Assessment
# 3. Prioritization
# Order
# 4. Assignment
# Owner
# 5. Escalation
# Path

Q1738: How do you design for cloud migration?

Answer:

Terminal window
# Cloud migration
# 1. Assessment
# Discovery
# 2. Planning
# Strategy
# 3. Migration
# Execute
# 4. Validation
# Testing
# 5. Optimization
# Post-migration

Q1739: How do you implement security policies?

Answer:

Terminal window
# Security policies
# 1. Framework
# Define