Linux Practical Interview Questions (1501-1750)

Linux System Security

Q1501: How do you implement SELinux policies?

Answer:

# Check SELinux status
getenforce
sestatus

# SELinux contexts
# View file contexts
ls -Z /var/www/html
ls -Zd /var/www/html

# View process contexts
ps auxZ | grep nginx

# Change context (chcon does not survive a relabel; semanage fcontext plus restorecon is persistent)
chcon -t httpd_sys_content_t /var/www/html/file.html
semanage fcontext -a -t httpd_sys_content_t "/web(/.*)?"
restorecon -Rv /web

# Boolean values
getsebool -a
setsebool -P httpd_can_network_connect on

# Create custom policy module
# 1. Generate Type Enforcement file
# myapp.te
module myapp 1.0;
require {
    type httpd_t;
    type myapp_log_t;
    class file { read write };
}
allow httpd_t myapp_log_t:file { read write };

# 2. Compile and install
checkmodule -M -m -o myapp.mod myapp.te
semodule_package -o myapp.pp -m myapp.mod
semodule -i myapp.pp
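
The contexts printed by ls -Z and ps auxZ above have the form user:role:type:level. A minimal sketch of splitting one apart in POSIX shell, assuming a four-field context; selinux_field is a hypothetical helper name, not part of any SELinux tool:

```shell
# Print one field of an SELinux context (user:role:type:level).
# The level may contain colons itself (for example s0:c0.c1023), so only
# the first three colons are treated as separators.
selinux_field() (
    context=$1
    field=$2
    user=${context%%:*};  rest=${context#*:}
    role=${rest%%:*};     rest=${rest#*:}
    type=${rest%%:*};     level=${rest#*:}
    case $field in
        user)  printf '%s\n' "$user" ;;
        role)  printf '%s\n' "$role" ;;
        type)  printf '%s\n' "$type" ;;
        level) printf '%s\n' "$level" ;;
        *)     exit 1 ;;
    esac
)

selinux_field "system_u:object_r:httpd_sys_content_t:s0" type
```

The type field is the one that type enforcement rules such as the allow statement in myapp.te refer to.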

Q1502: How do you configure AppArmor profiles?

Answer:

# Install AppArmor
apt install apparmor apparmor-utils

# View profiles
aa-status
ls /etc/apparmor.d/

# Create profile
aa-genprof /usr/bin/myapp

# Profile syntax
# /etc/apparmor.d/usr.bin.myapp
#include <tunables/global>
/usr/bin/myapp {
    #include <abstractions/base>
    #include <abstractions/bash>

    # Allow read /etc
    /etc/** r,

    # Allow write to log
    /var/log/myapp/* w,

    # Deny access
    deny /etc/shadow r,
    deny /var/log/secure w,

    # Network
    network inet stream,
}

# Enable/disable
aa-disable /usr/bin/myapp
aa-enforce /usr/bin/myapp
aa-complain /usr/bin/myapp

# Reload
apparmor_parser -r /etc/apparmor.d/usr.bin.myapp

Q1503: How do you implement Linux capabilities?

Answer:

# View capabilities
# File capabilities
getcap -r /usr/bin/

# Process capabilities
cat /proc/$$/status | grep Cap

# Set file capabilities
setcap 'cap_net_raw+ep' /usr/bin/ping
getcap /usr/bin/ping

# Remove capabilities
setcap -r /usr/bin/ping

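# Grant capabilities to users at login (pam_cap)
# /etc/security/capability.conf
# cap_net_raw user1
# cap_net_admin user2
# none *

# Set capabilities in code (libcap, link with -lcap)
# In C
#include <sys/capability.h>
cap_value_t cap_list[] = { CAP_NET_RAW };
cap_t caps = cap_get_proc();
/* Raise CAP_NET_RAW in the effective set; it must already be in the permitted set */
cap_set_flag(caps, CAP_EFFECTIVE, 1, cap_list, CAP_SET);
cap_set_proc(caps);
cap_free(caps);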
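
The Cap* lines in /proc/<pid>/status are hexadecimal bitmasks, one bit per capability. A small sketch of testing a single bit, assuming the mask fits in the shell's signed 64-bit arithmetic; has_cap is a hypothetical helper, and the bit numbers come from linux/capability.h (CAP_NET_ADMIN is 12, CAP_NET_RAW is 13):

```shell
# Succeed if capability number $2 is set in the hex mask $1
# (a CapEff, CapPrm or CapInh value without the 0x prefix).
has_cap() (
    mask=$((0x$1))
    bit=$2
    [ $(( (mask >> bit) & 1 )) -eq 1 ]
)

# Example: does the current process hold CAP_NET_RAW in its effective set?
if [ -r /proc/self/status ]; then
    capeff=$(awk '/^CapEff:/ { print $2 }' /proc/self/status)
    if [ -n "$capeff" ] && has_cap "$capeff" 13; then
        echo "CAP_NET_RAW is effective"
    else
        echo "CAP_NET_RAW is not effective"
    fi
fi
```

Where libcap is installed, capsh --decode=<mask> prints the full list of names for a mask.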

Q1504: How do you secure Linux system services?

Answer:

# Disable unnecessary services
systemctl mask service_name
systemctl disable service_name

# View active services
systemctl list-units --type=service --state=running

# Secure SSH
# /etc/ssh/sshd_config
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
MaxAuthTries 3
ClientAliveInterval 300
X11Forwarding no
AllowUsers user1 user2
DenyUsers root

# Secure Cron
# /etc/cron.allow (only these users)
# /etc/cron.deny (deny these users)

# Secure at
# /etc/at.allow

# Secure system limits
# /etc/security/limits.conf
* hard locks 100
* soft nproc 512
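
The SSH settings above can be checked against a configuration file with a short script. This is a sketch only: sshd_audit is a hypothetical helper, it reads a single file, and it ignores Include directives and Match blocks. For the effective configuration, sshd -T is the authoritative source.

```shell
# Print every hardened setting that is missing or different in file $1.
# sshd keywords are case-insensitive and the first occurrence wins.
sshd_audit() (
    file=$1
    status=0
    while read -r key want; do
        have=$(awk -v k="$key" 'tolower($1) == tolower(k) { print $2; exit }' "$file")
        if [ "$have" != "$want" ]; then
            printf '%s: expected %s, found %s\n' "$key" "$want" "${have:-<unset>}"
            status=1
        fi
    done <<EOF
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
X11Forwarding no
EOF
    exit $status
)

# Usage: sshd_audit /etc/ssh/sshd_config
```

The function exits non-zero when any setting deviates, so it can gate a deployment step.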

Q1505: How do you implement user authentication security?

Answer:

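# Configure PAM (lock the account after repeated failed logins)
# pam_tally2 was removed in Linux-PAM 1.5; current releases use pam_faillock instead
auth required pam_tally2.so deny=3 unlock_time=600 onerr=fail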

# Password policy
# /etc/pam.d/common-password
password required pam_pwhistory.so remember=5
password [default=1] pam_permit.so
password requisite pam_cracklib.so try_first_pass retry=3 minlen=12 dcredit=-1 ucredit=-1 lcredit=-1 ocredit=-1

# Set password expiry
# /etc/login.defs
PASS_MAX_DAYS 90
PASS_MIN_DAYS 1
PASS_WARN_AGE 14

# For user
passwd -x 90 -w 14 -n 1 username
chage -M 90 -W 14 username

# View aging
chage -l username
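
The aging values that chage -l reports are stored in /etc/shadow as day counts since 1970-01-01. A minimal sketch of the expiry arithmetic, assuming the standard shadow(5) field order; days_until_expiry is a hypothetical helper:

```shell
# Days until the password in a shadow(5) entry expires.
# $1 = one line in /etc/shadow format
# $2 = today, as days since 1970-01-01
# Prints "never" when no usable maximum age is set, a negative
# number when the password has already expired.
days_until_expiry() (
    entry=$1
    today=$2
    lastchg=$(printf '%s\n' "$entry" | cut -d: -f3)   # field 3: last change
    max=$(printf '%s\n' "$entry" | cut -d: -f5)       # field 5: maximum age
    if [ -z "$lastchg" ] || [ -z "$max" ] || [ "$max" -ge 99999 ]; then
        echo never
    else
        echo $(( lastchg + max - today ))
    fi
)

# Usage (as root):
#   today=$(( $(date +%s) / 86400 ))
#   days_until_expiry "$(getent shadow username)" "$today"
```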

Linux Network Security

Q1506: How do you configure firewall rules?

Answer:

# Basic iptables rules
# Flush existing rules
iptables -F
iptables -X
iptables -t nat -F
iptables -t mangle -F

# Default policies
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT ACCEPT

# Allow loopback
iptables -A INPUT -i lo -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT

# Allow established connections
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

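# Allow SSH (drop a source that opens more than 3 new connections within 60 seconds)
iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -m recent --set
iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -m recent --update --seconds 60 --hitcount 4 -j DROP
iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -j ACCEPT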

# Allow HTTP/HTTPS
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -j ACCEPT

# Save rules
iptables-save > /etc/iptables/rules.v4
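
With a default DROP policy, a mistyped rule can lock out a remote session, so it can help to print rules for review before applying them. A dry-run sketch under that assumption; allow_tcp_ports is a hypothetical helper that only prints commands and never calls iptables itself:

```shell
# Print (do not execute) one ACCEPT rule per valid TCP port argument.
# Invalid arguments are reported on stderr and skipped.
allow_tcp_ports() {
    for port in "$@"; do
        case $port in
            ''|*[!0-9]*)
                echo "skipping invalid port: $port" >&2
                continue ;;
        esac
        if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
            echo "skipping out-of-range port: $port" >&2
            continue
        fi
        printf 'iptables -A INPUT -p tcp --dport %s -j ACCEPT\n' "$port"
    done
}

allow_tcp_ports 80 443
```

After review, the output can be piped to sh as root to apply the rules.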

Q1507: How do you implement network segmentation?

Answer:

# Create network namespaces
ip netns add dmz
ip netns add internal

# Configure VLANs
ip link add link eth0 name eth0.100 type vlan id 100
ip addr add 192.168.100.1/24 dev eth0.100
ip link set eth0.100 up

# Bridge isolation
ip link add name br-dmz type bridge
ip link set eth1 master br-dmz
ip link set eth2 master br-dmz

# iptables zone-based firewall
iptables -N DMZ-ZONE
iptables -N INTERNAL-ZONE
iptables -N EXTERNAL-ZONE

# DMZ rules
iptables -A DMZ-ZONE -p tcp --dport 80 -j ACCEPT
iptables -A DMZ-ZONE -p tcp --dport 443 -j ACCEPT
iptables -A DMZ-ZONE -j REJECT

# Internal rules
iptables -A INTERNAL-ZONE -j ACCEPT
# MASQUERADE is only valid in the POSTROUTING chain of the nat table
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

Q1508: How do you configure IDS/IPS?

Answer:

# Install Snort
apt install snort

# Configure
# /etc/snort/snort.conf
ipvar HOME_NET 192.168.1.0/24
ipvar EXTERNAL_NET !$HOME_NET

# Custom rules
# /etc/snort/rules/local.rules
# Alert on ICMP
alert icmp any any -> $HOME_NET any (msg:"ICMP Ping"; sid:1000001; rev:1;)

# Alert on SSH attempts
alert tcp any any -> $HOME_NET 22 (msg:"SSH Connection Attempt"; \
    flow:to_server,established; content:"SSH"; nocase; sid:1000002; rev:1;)

# Alert on port scan
alert tcp any any -> $HOME_NET any (msg:"Port Scan"; \
    flow:to_server; detection_filter:track by_src,count 5,seconds 10; \
    sid:1000003; rev:1;)

# Run snort
snort -c /etc/snort/snort.conf -i eth0

# Suricata (modern alternative)
apt install suricata
suricata -c /etc/suricata/suricata.yaml -i eth0

Q1509: How do you implement DDoS protection?

Answer:

# Rate limiting with iptables
# Limit connections per IP
iptables -A INPUT -p tcp --dport 80 -m conntrack --ctstate NEW \
    -m recent --set
iptables -A INPUT -p tcp --dport 80 -m conntrack --ctstate NEW \
    -m recent --update --seconds 60 --hitcount 20 -j DROP

# Limit ICMP
iptables -A INPUT -p icmp --icmp-type echo-request \
    -m hashlimit --hashlimit-above 1/sec --hashlimit-burst 4 \
    --hashlimit-htable-size 100000 --hashlimit-mode srcip \
    --hashlimit-name icmp_limit -j DROP

# SYN flood protection
# /etc/sysctl.conf
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_synack_retries=2
net.ipv4.tcp_max_syn_backlog=4096

# Application layer
# Nginx rate limiting
# In the http block:
limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
# In a server or location block:
limit_req zone=general burst=20 nodelay;

Q1510: How do you configure VPN security?

Answer:

# WireGuard setup
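# Generate keys (umask keeps the private key unreadable by other users)
umask 077
wg genkey | tee private.key | wg pubkey > public.key

# Server configuration (forwarding client traffic also requires net.ipv4.ip_forward=1)
# /etc/wireguard/wg0.conf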
[Interface]
PrivateKey = <server-private-key>
Address = 10.0.0.1/24
ListenPort = 51820
PostUp = iptables -A FORWARD -i wg0 -j ACCEPT
PostUp = iptables -A FORWARD -o wg0 -j ACCEPT
PostUp = iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
PostDown = iptables -D FORWARD -i wg0 -j ACCEPT
PostDown = iptables -D FORWARD -o wg0 -j ACCEPT
PostDown = iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE

[Peer]
PublicKey = <client-public-key>
AllowedIPs = 10.0.0.2/32

# Client configuration
[Interface]
PrivateKey = <client-private-key>
Address = 10.0.0.2/24

[Peer]
PublicKey = <server-public-key>
Endpoint = server.example.com:51820
AllowedIPs = 0.0.0.0/0
PersistentKeepalive = 25

wg-quick up wg0
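
Each additional client needs its own [Peer] stanza in the server's wg0.conf with a unique /32 address. A sketch that prints such a stanza, assuming the 10.0.0.0/24 tunnel network used above; wg_peer_stanza is a hypothetical helper and the key shown is a placeholder:

```shell
# Print a server-side [Peer] stanza for one client.
# $1 = client public key, $2 = host number (2..254) inside 10.0.0.0/24
wg_peer_stanza() (
    pubkey=$1
    host=$2
    case $host in
        ''|*[!0-9]*) exit 1 ;;
    esac
    if [ "$host" -lt 2 ] || [ "$host" -gt 254 ]; then
        exit 1
    fi
    printf '[Peer]\nPublicKey = %s\nAllowedIPs = 10.0.0.%s/32\n' "$pubkey" "$host"
)

wg_peer_stanza '<client-public-key>' 2
```

Host number 1 is rejected because it is the server's own tunnel address in this layout.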

Linux Kernel Hardening

Q1511: How do you secure kernel parameters?

Answer:

# /etc/sysctl.conf (or a file under /etc/sysctl.d/)
# Network security
net.ipv4.conf.all.rp_filter=1
net.ipv4.conf.default.rp_filter=1
net.ipv4.icmp_echo_ignore_broadcasts=1
net.ipv4.icmp_ignore_bogus_error_responses=1
net.ipv4.conf.all.accept_redirects=0
net.ipv4.conf.default.accept_redirects=0
net.ipv4.conf.all.send_redirects=0
net.ipv4.conf.default.send_redirects=0
net.ipv4.conf.all.accept_source_route=0
net.ipv4.conf.default.accept_source_route=0
net.ipv4.tcp_timestamps=0

# Kernel security
kernel.dmesg_restrict=1
kernel.kptr_restrict=2
kernel.yama.ptrace_scope=2
kernel.sysrq=0

# Memory protection
vm.mmap_min_addr=65536
vm.swappiness=10

# Apply
sysctl -p          # reload /etc/sysctl.conf only
sysctl --system    # reload all configuration files, including /etc/sysctl.d/
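
Whether the running kernel matches the intended settings can be checked by comparing key=value lines with the files under /proc/sys. A sketch under these assumptions: lines have no spaces around the equals sign, values are single fields, and interface names contain no dots. sysctl_audit is a hypothetical helper; its optional argument replaces /proc/sys, which makes it testable without root.

```shell
# Read "key=value" lines on stdin and compare them with live values.
# $1 = root of the sysctl tree (default /proc/sys)
# Prints one line per mismatch and exits non-zero if any were found.
sysctl_audit() (
    root=${1:-/proc/sys}
    status=0
    while IFS='=' read -r key want; do
        case $key in
            ''|'#'*) continue ;;          # skip blank lines and comments
        esac
        path=$root/$(printf '%s' "$key" | tr . /)
        if [ -r "$path" ]; then
            have=$(cat "$path")
        else
            have='<missing>'
        fi
        if [ "$have" != "$want" ]; then
            printf '%s: expected %s, found %s\n' "$key" "$want" "$have"
            status=1
        fi
    done
    exit $status
)

# Usage: sysctl_audit < /etc/sysctl.conf
```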

Q1512: How do you implement grsecurity?

Answer:

# Install grsecurity
# grsecurity is a kernel patch set; since 2017 it has been distributed to subscribers only
# Option 1: Use precompiled kernel
wget https://kernels.org/pub/linux/kernel/v4.x/linux-4.14.12-grsec.tar.xz

# Option 2: Use PaX/gradm
apt install paxctl gradm

# PaX flags (uppercase enables a feature, lowercase disables it)
paxctl -C /usr/bin/nginx   # Create the PT_PAX_FLAGS program header
paxctl -M /usr/bin/nginx   # Enable MPROTECT
paxctl -S /usr/bin/nginx   # Enable SEGMEXEC
paxctl -R /usr/bin/nginx   # Enable RANDMMAP

# gradm configuration
# Set the RBAC admin password
gradm -P

# Enable learning mode (full-system learning)
gradm -F -L /etc/grsec/learning.logs
# Run the application normally while learning is active, then disable RBAC
gradm -D

# Compile rules (generate a policy from the learning log)
gradm -F -L /etc/grsec/learning.logs -O /etc/grsec/policy

# Enable
gradm -E

Q1513: How do you implement mandatory access control?

Answer:

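# SELinux configuration
# /etc/selinux/config
SELINUX=enforcing
SELINUXTYPE=targeted

# Create custom policy
# myapp.te
policy_module(myapp, 1.0)
type myapp_t;
type myapp_exec_t;
# Declare a daemon domain that init enters by executing myapp_exec_t
# (the macro sets up the role and the type transition from init's domain)
init_daemon_domain(myapp_t, myapp_exec_t)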

# Compile and install
make -f /usr/share/selinux/devel/Makefile myapp.pp
semodule -i myapp.pp

# AppArmor configuration
# See Q1502

# SMACK (Simplified Mandatory Access Control)
# Enable in kernel
# CONFIG_SECURITY_SMACK=y

# Configure
# /etc/smack/accesses
# Format: subject object access
_ _ r
root myapp rw
myapp _ rw

Q1514: How do you implement container security?

Answer:

# Docker security
# Run without privileges
docker run --rm -it --cap-drop ALL --user 1000:1000 nginx

# Read-only root filesystem
docker run --rm -it --read-only nginx

# Resource limits
docker run --rm -it --memory=256m --cpus=0.5 nginx

# Network isolation
docker run --rm -it --network none nginx

# SELinux/AppArmor
docker run --rm -it --security-opt apparmor=docker-default nginx

# Seccomp profile (Docker applies its built-in default when no profile is given)
docker run --rm -it --security-opt seccomp=/path/to/profile.json nginx

# Rootless Docker
# Install
apt install docker-ce-rootless-extras

# Setup (run as the unprivileged user)
dockerd-rootless-setuptool.sh install

# Verify
docker info

# Check capabilities (the image must provide capsh)
docker run --rm -it nginx capsh --print

Q1515: How do you secure boot process?

Answer:

# UEFI secure boot
# Check status
mokutil --sb-state

# Enroll keys
mokutil --import key.der

# GRUB password
# Generate hash
grub-mkpasswd-pbkdf2
# Add to /etc/grub.d/40_custom
set superusers="admin"
password_pbkdf2 admin grub.pbkdf2.sha512...hash...

# Rebuild GRUB
update-grub

# Disable USB storage (booting from USB must also be disabled in the firmware setup)
# /etc/modprobe.d/blacklist-usb.conf
install usb-storage /bin/true

# Boot kernel parameters
# /etc/default/grub
GRUB_CMDLINE_LINUX="lockdown=integrity module.sig_enforce=1"

# TPM measured boot
# Install
apt install tpm2-tools

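# Read the PCR values recorded during measured boot
tpm2_pcrread

# Verify (quote selected PCRs with an attestation key; key.ctx is a placeholder)
tpm2_quote -c key.ctx -l sha256:0,1,2,3 -q 6d792071756f7465 -m quote.msg -s quote.sig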

Linux Advanced Networking

Q1516: How do you configure advanced routing?

Answer:

# Policy routing
# Add table
echo "200 wan2" >> /etc/iproute2/rt_tables

# Add route
ip route add default via 192.168.2.1 dev eth1 table wan2

# Add rule
ip rule add from 192.168.2.10 table wan2
ip rule add to 192.168.2.0/24 table wan2

# NAT with iptables
# SNAT
iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth0 -j SNAT --to-source 203.0.113.10

# DNAT
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 8080 -j DNAT --to-destination 192.168.1.10:80

# Masquerade
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

# Multi-path routing
ip route add default scope global nexthop via 192.168.1.1 dev eth0 weight 1 \
    nexthop via 192.168.2.1 dev eth1 weight 1

Q1517: How do you configure network bonding for high availability?

Answer:

# Load bonding module
modprobe bonding mode=1 miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BONDING_OPTS="mode=1 miimon=100 primary=eth0"
IPADDR=192.168.1.10
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
MASTER=bond0
SLAVE=yes
ONBOOT=yes

# Mode 4 (LACP)
# /etc/sysconfig/network-scripts/ifcfg-bond0
BONDING_OPTS="mode=4 miimon=100 lacp_rate=1"

# Monitor
cat /proc/net/bonding/bond0

# ethtool
ethtool -S bond0

Q1518: How do you configure IPv6 security?

Answer:

# Disable IPv6 (only when it is not used)
# /etc/sysctl.conf
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1

# Or via GRUB
# GRUB_CMDLINE_LINUX="ipv6.disable=1"

# IPv6 firewall rules
ip6tables -F
ip6tables -P INPUT DROP
ip6tables -P FORWARD DROP
ip6tables -P OUTPUT ACCEPT

# Allow established
ip6tables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# Allow ICMPv6
ip6tables -A INPUT -p ipv6-icmp -j ACCEPT

# Allow SSH
ip6tables -A INPUT -p tcp --dport 22 -j ACCEPT

# Block routing header
ip6tables -A INPUT -m rt --rt-type 0 -j DROP

# RA guard
# On switch or router
# Configure Router Advertisement filtering

Q1519: How do you implement Quality of Service?

Answer:

# Traffic control with tc
# Create qdisc
tc qdisc add dev eth0 root handle 1: htb default 10

# Create classes
tc class add dev eth0 parent 1: classid 1:10 htb rate 10Mbit ceil 10Mbit
tc class add dev eth0 parent 1: classid 1:20 htb rate 5Mbit ceil 5Mbit

# Filter traffic
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 match ip dst 192.168.1.10/32 flowid 1:20

# Example: Prioritize SSH (alternative setup; remove the HTB root qdisc first)
tc qdisc del dev eth0 root
tc qdisc add dev eth0 root handle 1: prio
# Band 1:1 is the highest-priority band of a prio qdisc
tc filter add dev eth0 parent 1: protocol ip prio 10 u32 match ip dport 22 0xffff flowid 1:1
tc filter add dev eth0 parent 1: protocol ip prio 20 u32 match ip sport 22 0xffff flowid 1:1

# View
tc -s qdisc show dev eth0
tc -s class show dev eth0
tc filter show dev eth0

Q1520: How do you configure DNS security?

Answer:

# DNSSEC with BIND
# Enable in named.conf
dnssec-validation auto;
# dnssec-lookaside and dnssec-enable are obsolete and no longer accepted by current BIND releases

# Sign zone (one key-signing key and one zone-signing key)
dnssec-keygen -a RSASHA256 -b 2048 -f KSK -n ZONE example.com
dnssec-keygen -a RSASHA256 -b 2048 -n ZONE example.com
dnssec-signzone -S -o example.com db.example.com

# Configure resolver
# /etc/bind/named.conf.options
options {
    dnssec-validation auto;
};

# Test DNSSEC
dig +dnssec example.com
dig +cd secure.example.com

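# Unbound configuration
# /etc/unbound/unbound.conf
server:
    # Trust anchor path varies by distribution
    auto-trust-anchor-file: "/var/lib/unbound/root.key"
    val-log-level: 2
    harden-glue: yes
    harden-dnssec-stripped: yes
    use-caps-for-id: yes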

# Query validation
drill -S example.com

Linux Performance Analysis

Q1521: How do you analyze CPU performance?

Answer:

# CPU info
lscpu
cat /proc/cpuinfo

# CPU usage over time
mpstat 1
sar -u 1

# Per-CPU usage
mpstat -P ALL 1

# Process CPU usage
top
ps aux --sort=-%cpu
pidstat -p <pid> 1

# CPU steal (virtualization)
vmstat 1

# Scheduler
# View process priority
ps -eo pid,ni,pri,comm

# CPU affinity
taskset -c 0-3 program
taskset -p 0xF <pid>

# Check CPU frequency
cpupower frequency-info
cpupower frequency-set -g performance

Q1522: How do you analyze memory performance?

Answer:

# Memory info
free -h
cat /proc/meminfo

# Memory usage over time
vmstat 1
sar -r 1   # memory utilization
sar -B 1   # paging activity

# Per-process memory
ps aux --sort=-%mem
pmap -x <pid>
cat /proc/<pid>/status | grep -i vm

# Memory allocation issues
# Check for OOM
dmesg | grep -i "out of memory"
cat /var/log/syslog | grep -i oom

# Slab info
slabtop

# Huge pages
cat /proc/meminfo | grep -i huge

# Transparent huge pages
cat /sys/kernel/mm/transparent_hugepage/enabled

# Memory pressure
cat /proc/pressure/memory
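
The "available" column of free comes from MemAvailable in /proc/meminfo. A short sketch that turns it into a used percentage, assuming a kernel that reports MemAvailable (3.14 or later); mem_used_pct is a hypothetical helper, and its optional argument allows reading a saved copy of meminfo:

```shell
# Percentage of memory that is not available for new allocations.
# $1 = meminfo file (default /proc/meminfo)
mem_used_pct() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END {
            if (total > 0) printf "%d\n", (total - avail) * 100 / total
            else exit 1
        }' "${1:-/proc/meminfo}"
}

# Usage: mem_used_pct
```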

Q1523: How do you analyze I/O performance?

Answer:

# I/O statistics
iostat -xz 1
sar -d 1

# Per-process I/O
iotop
pidstat -d 1

# Block device info
lsblk
blkid

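# I/O scheduler (the active one is shown in brackets)
cat /sys/block/sda/queue/scheduler
# Multi-queue kernels (5.0 and later) offer mq-deadline; older kernels call it deadline
echo mq-deadline > /sys/block/sda/queue/scheduler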

# Queue depth
cat /sys/block/sda/queue/nr_requests

# Check for I/O waits
vmstat 1

# File system performance
# Read-ahead
cat /sys/block/sda/queue/read_ahead_kb

# Trace I/O
blktrace -d /dev/sda -o trace
blkparse -i trace

Q1524: How do you analyze network performance?

Answer:

# Network statistics
netstat -s
ss -s

# Per-interface statistics
ip -s link
netstat -i

# Connection states
ss -tan state established
ss -tan state time-wait

# Bandwidth monitoring
iftop
nethogs

# Packet capture
tcpdump -i eth0
tcpdump -i eth0 -w capture.pcap

# Network latency
ping -c 4 host
traceroute host
mtr host

# TCP analysis
# TCP retransmits
netstat -s | grep -i retrans

# Connection tracking
conntrack -L

# Socket statistics
ss -tulpn
lsof -i

Q1525: How do you use performance profiling tools?

Answer:

# perf
perf record -g -p <pid>
perf report
perf top

# Flame graph
# Install
git clone https://github.com/brendangregg/FlameGraph.git

# Generate
perf record -F 99 -g -p <pid>
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > flame.svg

# System-wide profiling
perf record -a -g -- sleep 10
perf report

# Valgrind
valgrind --tool=cachegrind ./program
cg_annotate cachegrind.out.*

# gprof
gcc -pg -g program.c -o program
./program
gprof program gmon.out > analysis.txt

# strace
strace -c -p <pid>
strace -T -tt -p <pid>

Linux Storage Advanced

Q1526: How do you configure RAID?

Answer:

# Create RAID5
mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[b-e]1

# Create RAID10
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[b-e]1

# Create RAID1 with spare
mdadm --create /dev/md0 --level=1 --raid-devices=2 --spare-devices=1 /dev/sd[b-d]1

# Monitor
mdadm --detail /dev/md0
cat /proc/mdstat

# Add to /etc/mdadm.conf
mdadm --examine --scan >> /etc/mdadm.conf

# Manage (a member must be marked failed before it can be removed)
mdadm /dev/md0 --fail /dev/sdb1
mdadm /dev/md0 --remove /dev/sdb1
mdadm /dev/md0 --add /dev/sdf1

# Stop/start
mdadm --stop /dev/md0
mdadm --assemble /dev/md0
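
The RAID levels above trade capacity for redundancy in different ways. A sketch of the usable-capacity arithmetic for equally sized members, ignoring metadata overhead and spares and assuming two copies for RAID10; raid_usable is a hypothetical helper:

```shell
# Approximate usable capacity of an md array.
# $1 = RAID level (0, 1, 5, 6, 10)
# $2 = number of active devices (spares excluded)
# $3 = size of one device, in any unit
raid_usable() (
    level=$1
    n=$2
    size=$3
    case $level in
        0)  echo $(( n * size )) ;;
        1)  echo "$size" ;;
        5)  [ "$n" -ge 3 ] || exit 1; echo $(( (n - 1) * size )) ;;
        6)  [ "$n" -ge 4 ] || exit 1; echo $(( (n - 2) * size )) ;;
        10) echo $(( n * size / 2 )) ;;
        *)  exit 1 ;;
    esac
)

raid_usable 5 4 1000
```

With four 1000 GB members, RAID5 yields about 3000 GB and RAID10 about 2000 GB.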

Q1527: How do you configure LVM?

Answer:

# Create physical volume
pvcreate /dev/sdb1
pvdisplay
pvmove /dev/sdb1 /dev/sdc1

# Create volume group
vgcreate vg_data /dev/sdb1
vgextend vg_data /dev/sdc1
vgdisplay

# Create logical volume
lvcreate -L 10G -n lv_data vg_data
lvcreate -l 100%FREE -n lv_backup vg_data

# Create thin pool
lvcreate -L 100G --thinpool vg_data/thin_pool
lvcreate -V 10G --thin -n lv_thin vg_data/thin_pool

# Snapshot
lvcreate -s -L 5G -n lv_snap vg_data/lv_data

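# Resize (-r resizes the filesystem together with the LV)
lvextend -r -L +10G /dev/vg_data/lv_data
# Shrinking works for ext4 but not for XFS; back up first
lvreduce -r -L -5G /dev/vg_data/lv_data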

# Remove
lvremove /dev/vg_data/lv_data
vgremove vg_data
pvremove /dev/sdb1

Q1528: How do you configure encrypted filesystems?

Answer:

# LUKS encryption
cryptsetup luksFormat /dev/sdb1
cryptsetup luksOpen /dev/sdb1 encrypted
mkfs.xfs /dev/mapper/encrypted

# Add key
cryptsetup luksAddKey /dev/sdb1

# Backup header
cryptsetup luksHeaderBackup /dev/sdb1 --header-backup-file header.img

# Unlock at boot (none prompts for the passphrase; use a key file path for unattended unlock)
# /etc/crypttab
encrypted /dev/sdb1 none luks

# /etc/fstab
/dev/mapper/encrypted /mnt/data xfs defaults 0 2

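# eCryptfs
mount -t ecryptfs /backup /encrypted
# Or use fscrypt (native ext4 encryption; the target directory must be empty)
mkfs.ext4 -O encrypt /dev/sda1
mount /dev/sda1 /mnt
fscrypt setup
fscrypt setup /mnt
mkdir /mnt/private
fscrypt encrypt /mnt/private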

Q1529: How do you configure NFS security?

Answer:

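# Server configuration
# /etc/exports
# root_squash (the default) maps remote root to an unprivileged user; avoid no_root_squash
/data 192.168.1.0/24(rw,sync,no_subtree_check,root_squash,sec=krb5p)
/secure 192.168.1.0/24(rw,sync,no_subtree_check,sec=sys)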

# Kerberized NFS
# Server
# /etc/exports
/data gss/krb5p(rw,sync,no_subtree_check)

# Export
exportfs -av

# Client
mount -t nfs -o sec=krb5p server:/data /mnt

# Security options
# sec=sys - UID/GID mapping
# sec=krb5 - Authentication only
# sec=krb5i - Integrity
# sec=krb5p - Privacy

# Firewall
# Allow NFS
iptables -A INPUT -p tcp --dport 2049 -j ACCEPT
iptables -A INPUT -p udp --dport 2049 -j ACCEPT

# Test
nfsstat -c
showmount -e server

Q1530: How do you configure disk quotas?

Answer:

# Enable quota
# /etc/fstab
/dev/sda1 /home ext4 defaults,usrquota,grpquota 0 2

# Remount
mount -o remount /home

# Initialize quota
quotacheck -cug /home

# Enable quota
quotaon /home

# Set user quota
edquota -u username
# Edit soft/hard limits

# Set group quota
edquota -g groupname

# View quota
quota -u username
quota -g groupname
repquota -a

# Copy quota template
edquota -p template_user new_user

# Email reports
# /etc/cron.daily/quotas
quotacheck -avug
repquota -a | mail -s "Quota Report" admin@example.com

Linux Services Configuration

Q1531: How do you configure Apache advanced?

Answer:

# Virtual host with SSL
<VirtualHost *:443>
    ServerName example.com
    DocumentRoot /var/www/html

    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/server.crt
    SSLCertificateKeyFile /etc/ssl/private/server.key
    SSLCertificateChainFile /etc/ssl/certs/ca.crt

    <Directory /var/www/html>
        Options -Indexes +FollowSymLinks
        AllowOverride None
        Require all granted
    </Directory>

    # Security headers
    Header always set X-Frame-Options "SAMEORIGIN"
    Header always set X-Content-Type-Options "nosniff"
    Header always set X-XSS-Protection "1; mode=block"

    # Performance
    KeepAlive On
    MaxKeepAliveRequests 100
    KeepAliveTimeout 5

    # Compression
    AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css application/javascript
</VirtualHost>

# Load balancer
<Proxy balancer://mycluster>
    BalancerMember http://192.168.1.10:8080 route=node1
    BalancerMember http://192.168.1.11:8080 route=node2
    ProxySet lbmethod=byrequests
</Proxy>

ProxyPass / balancer://mycluster/
ProxyPassReverse / balancer://mycluster/

Q1532: How do you configure Nginx advanced?

Answer:

# Worker configuration
worker_processes auto;
worker_rlimit_nofile 65535;

events {
    worker_connections 65535;
    use epoll;
    multi_accept on;
}

http {
    # Logging
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;
    error_log /var/log/nginx/error.log warn;

    # Performance
    open_file_cache max=10000 inactive=30s;
    open_file_cache_valid 60s;
    open_file_cache_min_uses 2;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;

    # Upstream with health check
    upstream backend {
        least_conn;
        server 192.168.1.10:8080 max_fails=3 fail_timeout=30s;
        server 192.168.1.11:8080 max_fails=3 fail_timeout=30s;
        keepalive 32;
    }

    server {
        location / {
            proxy_pass http://backend;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }
}

Q1533: How do you configure PostgreSQL high availability?

Answer:

# Streaming replication
# Master: /etc/postgresql/14/main/postgresql.conf
wal_level = replica
max_wal_senders = 3
wal_keep_size = 1GB
hot_standby = on

# Master: /etc/postgresql/14/main/pg_hba.conf
host replication replicator 192.168.1.0/24 md5

# Create replication user
psql -c "CREATE USER replicator REPLICATION LOGIN PASSWORD 'secret';"

# Backup on replica (-R writes standby.signal and primary_conninfo)
pg_basebackup -h master -D /var/lib/postgresql/14/main -U replicator -P -Xs -R

# Replica: /etc/postgresql/14/main/postgresql.conf
# PostgreSQL 12+ no longer reads recovery.conf; an empty standby.signal file
# in the data directory replaces standby_mode = on
hot_standby = on
primary_conninfo = 'host=master port=5432 user=replicator password=secret'
promote_trigger_file = '/tmp/promote'

# pgBouncer for connection pooling
# /etc/pgbouncer/pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20

Q1534: How do you configure Redis Sentinel?

Answer:

# Sentinel configuration
# /etc/redis/sentinel.conf (run at least three Sentinels so that a quorum of 2 is meaningful)
port 26379
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1

# Start sentinel
redis-sentinel /etc/redis/sentinel.conf

# Client connection
# Python example
from redis.sentinel import Sentinel
sentinel = Sentinel([('localhost', 26379)], socket_timeout=0.1)
master = sentinel.master_for('mymaster', socket_timeout=0.1)
slave = sentinel.slave_for('mymaster', socket_timeout=0.1)

# Commands
redis-cli -p 26379 INFO SENTINEL
redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster

# Failover
# Sentinel automatically promotes replica to master
# Old master becomes replica when back online
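
The quorum in "sentinel monitor" only controls when the master is declared down. Starting the failover additionally needs authorization from a majority of all Sentinels. A sketch of that arithmetic; sentinel_majority and can_failover are hypothetical helpers:

```shell
# Majority of N Sentinels.
sentinel_majority() {
    echo $(( $1 / 2 + 1 ))
}

# Succeed if a failover can proceed.
# $1 = total Sentinels, $2 = Sentinels that can talk to each other, $3 = quorum
can_failover() (
    total=$1
    reachable=$2
    quorum=$3
    [ "$reachable" -ge "$quorum" ] &&
        [ "$reachable" -ge "$(sentinel_majority "$total")" ]
)

if can_failover 3 2 2; then
    echo "3 Sentinels, 2 reachable, quorum 2: failover possible"
fi
```

With five Sentinels and a quorum of 2, two reachable Sentinels can mark the master as down but cannot fail over, because the majority is three.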

Q1535: How do you configure MySQL Cluster?

Answer:

# MySQL NDB Cluster
# Install
apt install mysql-cluster-community-server

# Cluster configuration (read by the management node)
# /etc/mysql/config.ini
[ndbd default]
NoOfReplicas=2

[ndb_mgmd]
NodeId=1
HostName=192.168.1.10
DataDir=/var/lib/mysql-cluster

# Data nodes
[ndbd]
NodeId=2
HostName=192.168.1.11
DataDir=/var/lib/mysql-cluster

[ndbd]
NodeId=3
HostName=192.168.1.12
DataDir=/var/lib/mysql-cluster

# SQL node
[mysqld]
NodeId=4

# Data and SQL nodes: /etc/mysql/my.cnf
[mysqld]
ndbcluster
ndb-connectstring=192.168.1.10

[mysql_cluster]
ndb-connectstring=192.168.1.10

# Start management node
ndb_mgmd -f /etc/mysql/config.ini

# Start data nodes
ndbd --initial

# Start SQL node
mysqld --ndbcluster

# Check status
ndb_mgm -e show

Linux Automation Advanced

Q1536: How do you use Ansible Vault?

Answer:

# Create encrypted file
ansible-vault create secrets.yml

# Encrypt existing file
ansible-vault encrypt secrets.yml

# Edit encrypted file
ansible-vault edit secrets.yml

# View encrypted file
ansible-vault view secrets.yml

# Decrypt file
ansible-vault decrypt secrets.yml

# Change password
ansible-vault rekey secrets.yml

# Use in playbook
# playbook.yml
- hosts: all
  vars_files:
    - secrets.yml
  tasks:
    - name: Create user
      user:
        name: "{{ db_user }}"
        # The user module expects a password hash, not plain text
        password: "{{ db_password | password_hash('sha512') }}"

# Run with vault password
ansible-playbook playbook.yml --ask-vault-pass
# or
ansible-playbook playbook.yml --vault-password-file ~/.vault_pass.txt

Q1537: How do you use Ansible roles?

Answer:

# Create role structure
ansible-galaxy init nginx

# Role structure
# nginx/
# ├── defaults/
# │   └── main.yml
# ├── handlers/
# │   └── main.yml
# ├── meta/
# │   └── main.yml
# ├── tasks/
# │   └── main.yml
# ├── templates/
# │   └── nginx.conf.j2
# ├── tests/
# │   ├── inventory
# │   └── test.yml
# └── vars/
#     └── main.yml

# defaults/main.yml
nginx_port: 80
nginx_workers: 4

# tasks/main.yml
- name: Install nginx
  apt:
    name: nginx
    state: present

- name: Configure nginx
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: restart nginx

# handlers/main.yml
- name: restart nginx
  service:
    name: nginx
    state: restarted

# Use role
# playbook.yml
- hosts: webservers
  roles:
    - nginx

Q1538: How do you use Terraform modules?

Answer:

# Module structure
# modules/
# ├── ec2/
# │   ├── main.tf
# │   ├── variables.tf
# │   └── outputs.tf
# └── vpc/
#     ├── main.tf
#     ├── variables.tf
#     └── outputs.tf

# ec2/variables.tf
variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}

variable "ami_id" {
  description = "AMI ID"
  type        = string
}

# ec2/outputs.tf
output "instance_id" {
  value = aws_instance.this.id
}

# Main configuration
# main.tf
module "vpc" {
  source = "./modules/vpc"

  cidr_block = "10.0.0.0/16"
}

module "ec2" {
  source = "./modules/ec2"

  ami_id        = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"

  # Requires a vpc_id variable in modules/ec2 and a vpc_id output in modules/vpc
  vpc_id = module.vpc.vpc_id
}

Q1539: How do you use Chef cookbooks?

Answer:

# Cookbook structure
# ├── metadata.rb
# ├── recipes/
# │   └── default.rb
# ├── templates/
# │   └── config.erb
# └── attributes/
#     └── default.rb

# metadata.rb
name 'myapp'
version '1.0.0'
depends 'nginx'

# attributes/default.rb
default['myapp']['port'] = 8080
default['myapp']['workers'] = 4

# recipes/default.rb
package 'myapp'

template '/etc/myapp/config.yml' do
  source 'config.erb'
  mode '0644'
  variables(
    port: node['myapp']['port']
  )
end

service 'myapp' do
  action [:enable, :start]
end

# Use cookbook
# Run list
chef-client -r "recipe[myapp]"

Q1540: How do you use Puppet modules?

Answer:

# Module structure
# ├── manifests/
# │   ├── init.pp
# │   └── config.pp
# ├── templates/
# │   └── nginx.conf.erb
# └── files/
#     └── index.html

# manifests/init.pp
class nginx {
  package { 'nginx':
    ensure => installed,
  }

  service { 'nginx':
    ensure     => running,
    enable     => true,
    hasrestart => true,
  }
}

# manifests/config.pp
class nginx::config inherits nginx {
  file { '/etc/nginx/nginx.conf':
    ensure  => file,
    content => template('nginx/nginx.conf.erb'),
    require => Package['nginx'],
    notify  => Service['nginx'],
  }
}

# Use module
# site.pp
node 'webserver.example.com' {
  include nginx
  include nginx::config
}

Linux Cloud Native

Q1541: How do you configure Kubernetes networking?

Answer:

# CNI plugins (install one per cluster)
# Install Flannel
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

# Install Calico
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

# Network policies
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
EOF

# Allow specific traffic
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
spec:
  podSelector:
    matchLabels:
      app: frontend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
EOF

# Service mesh (Istio)
istioctl install --set profile=demo
kubectl label namespace default istio-injection=enabled

Q1542: How do you configure Kubernetes storage?

Answer:

# PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  hostPath:
    path: /mnt/data

---
# PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  # Must match the PersistentVolume's class for the claim to bind to it
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

---
# StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: regional-pd

---
# Use in Pod
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: myapp
    image: nginx
    volumeMounts:
    - name: my-storage
      mountPath: /data
  volumes:
  - name: my-storage
    persistentVolumeClaim:
      claimName: my-pvc

Q1543: How do you configure Kubernetes security?

Answer:

# RBAC
kubectl create serviceaccount myapp
kubectl create role myapp-reader --verb=get,list --resource=pods
kubectl create rolebinding myapp-reader-binding --role=myapp-reader --serviceaccount=default:myapp

# Use service account in pod
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  serviceAccountName: myapp
  containers:
  - name: myapp
    image: nginx

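# Pod Security Policy (removed in Kubernetes 1.25; newer clusters use Pod Security Admission)
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAsNonRoot
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny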

# Network policies
# See Q1541

# Secrets
kubectl create secret generic mysecret \
  --from-literal=username=admin \
  --from-literal=password=secret

Q1544: How do you configure Helm workflows?

Answer:

# Create chart
helm create myapp

# Add dependencies
# Chart.yaml
dependencies:
  - name: nginx
    version: "1.0.0"
    repository: "https://charts.bitnami.com/bitnami"

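# Install with dependencies
helm dependency update    # resolve Chart.yaml dependencies into charts/ and write Chart.lock
helm dependency build     # rebuild charts/ from an existing Chart.lock
helm install myapp .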

# Template functions
# values.yaml
replicaCount: 3

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "myapp.fullname" . }}
spec:
  replicas: {{ .Values.replicaCount }}

# Common functions
{{ .Values.image.repository }}:{{ .Values.image.tag }}
{{ include "myapp.fullname" . }}
{{ .Release.Name }}
{{ .Release.Namespace }}

# Hooks (declared as annotations on a template resource)
apiVersion: v1
kind: Pod
metadata:
  name: backup
  annotations:
    "helm.sh/hook": pre-install
    "helm.sh/hook-weight": "10"

Q1545: How do you implement GitOps with ArgoCD?

Answer:

# Install ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Get password
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d

# Create application
kubectl apply -f - <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/repo.git
    targetRevision: HEAD
    path: k8s/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF

# Sync
argocd app sync myapp
argocd app get myapp

# Sync waves
# Add annotations to resources
# metadata:
#   annotations:
#     argocd.argoproj.io/sync-wave: "1"

Linux Troubleshooting Advanced

Q1546: How do you debug kernel issues?

Answer:

# Kernel messages
dmesg
dmesg | tail -100

# Kernel panic
# Enable kdump
apt install kdump-tools
kdump-config load

# Test kdump
echo c > /proc/sysrq-trigger

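# Analyze crash (needs a vmlinux with debug symbols; paths and dump names vary by distribution)
crash /usr/lib/debug/boot/vmlinux-$(uname -r) /var/crash/<timestamp>/vmcore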

# Kernel config
zcat /proc/config.gz
# or
cat /boot/config-$(uname -r)

# System calls
strace -c program
strace -f program

# ftrace
echo function > /sys/kernel/debug/tracing/current_tracer
cat /sys/kernel/debug/tracing/trace

# perf
perf record -g program
perf report

Q1547: How do you debug network issues?

Answer:

# Interface status
ip link show
ip addr show
ethtool eth0

# Routing
ip route
ip route get 8.8.8.8

# DNS
dig example.com
getent hosts example.com

# Connectivity
ping -c 4 8.8.8.8
traceroute 8.8.8.8

# Port status
netstat -tulpn
ss -tulpn

# Capture
tcpdump -i eth0 host 192.168.1.1
tcpdump -i eth0 port 80

# Firewall
iptables -L -n -v
iptables -t nat -L -n -v

# TCP issues
# Retransmits
netstat -s | grep -i retrans
# Connection states
ss -tan state time-wait

# ARP
ip neigh show
arp -a

Q1548: How do you debug storage issues?

Answer:

# Disk usage
df -h
df -i

# Find large files
find / -type f -size +100M 2>/dev/null | head -20

# I/O stats
iostat -xz 1
sar -d 1

# Mount issues
mount
cat /proc/mounts

# Filesystem check
fsck -n /dev/sda1

# LVM issues
lvs
pvs
vgs
lvdisplay

# NFS issues
showmount -e server
mount -v server:/share /mnt

# SMART status
smartctl -a /dev/sda
smartctl -H /dev/sda

# Lsof for deleted files
lsof +L1

Q1549: How do you debug service issues?

Answer:

# Service status
systemctl status service
systemctl list-failed

# Logs
journalctl -u service -n 50
journalctl -u service --since "1 hour ago"
journalctl -xe

# Process
ps auxf | grep service
lsof -p "$(pgrep -d, -f service)"   # -d, joins multiple PIDs with commas

# Configuration (most daemons ship their own syntax check)
apachectl configtest
nginx -t

# Dependencies
systemctl list-dependencies service
systemctl is-active service

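# Resources (use the unit's main PID; pgrep -f may match several processes)
cat /proc/$(systemctl show -p MainPID --value service)/limits

# Network
netstat -tulpn | grep service

# Environment
cat /proc/$(systemctl show -p MainPID --value service)/environ | tr '\0' '\n'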

# Cgroups
systemd-cgls | grep service

Q1550: How do you debug application issues?

Answer:

# Core dumps
# Enable
ulimit -c unlimited

# /etc/security/limits.conf
* soft core unlimited

# Generate core
gcore <pid>

# Analyze
gdb program core
(gdb) bt
(gdb) info threads

# Memory leaks
valgrind --leak-check=full program

# Performance profiling
perf record -g program
perf report

# Python debugging
python -m pdb program.py
python -m cProfile program.py

# Java debugging
jstack <pid>
jmap -heap <pid>                 # JDK 8 and earlier
jhsdb jmap --heap --pid <pid>    # JDK 9+
jmap -dump:format=b,file=heap.bin <pid>

# Node.js debugging
node --inspect program.js
chrome://inspect

Linux Advanced Topics

Q1551: How do you implement zero-downtime restarts?

Answer:

# Nginx graceful reload
nginx -s reload
# or
systemctl reload nginx

# HAProxy
# Reload without downtime
systemctl reload haproxy

# Application with SIGTERM handling
# In application code
import signal
import sys

def sigterm_handler(signum, frame):
    # Stop accepting new connections
    # Wait for existing connections to complete
    # Then exit
    print("Shutting down gracefully...")
    sys.exit(0)

signal.signal(signal.SIGTERM, sigterm_handler)
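
# Graceful drain sketch
The handler above prints and exits at once, while its comments describe stopping new work and letting existing work finish. The following is a minimal sketch of that idea, assuming jobs are plain callables held in an in-process queue; the names accept, drain and install_handler are hypothetical helpers, not part of any framework.

```python
import queue
import signal
import threading

shutdown_requested = threading.Event()
pending = queue.Queue()

def sigterm_handler(signum, frame):
    # Only set a flag; the main loop decides when it is safe to exit
    shutdown_requested.set()

def install_handler():
    # Must be called from the main thread
    signal.signal(signal.SIGTERM, sigterm_handler)

def accept(job):
    """Queue a job unless shutdown has started; return True if accepted."""
    if shutdown_requested.is_set():
        return False
    pending.put(job)
    return True

def drain():
    """Run every job accepted before shutdown and return their results."""
    results = []
    while True:
        try:
            job = pending.get_nowait()
        except queue.Empty:
            return results
        results.append(job())
```

A service would call install_handler() at startup, reject requests once accept() returns False, and call drain() before exiting.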

# Kubernetes rolling update
kubectl set image deployment/myapp myapp=myapp:v2
kubectl rollout status deployment/myapp

# Rollback if needed
kubectl rollout undo deployment/myapp

Q1552: How do you implement feature toggles?

Answer:

# Simple feature toggle
class FeatureToggle:
    def __init__(self):
        self.features = {}

    def enable(self, feature):
        self.features[feature] = True

    def disable(self, feature):
        self.features[feature] = False

    def is_enabled(self, feature):
        return self.features.get(feature, False)

# Usage
toggle = FeatureToggle()
toggle.enable('new_ui')

if toggle.is_enabled('new_ui'):
    show_new_ui()
else:
    show_old_ui()

# Environment-based
import os
if os.getenv('FEATURE_NEW_UI') == '1':
    show_new_ui()

# Database-backed
def is_feature_enabled(feature_name):
    result = db.query("SELECT enabled FROM features WHERE name = ?", feature_name)
    return result.enabled if result else False

Q1553: How do you implement circuit breaker pattern?

Answer:

import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failures = 0
        self.last_failure_time = None
        self.state = "CLOSED"  # CLOSED, OPEN, HALF_OPEN

    def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            if time.time() - self.last_failure_time > self.timeout:
                self.state = "HALF_OPEN"
            else:
                raise Exception("Circuit breaker is OPEN")

        try:
            result = func(*args, **kwargs)
            self._on_success()
            return result
        except Exception:
            self._on_failure()
            raise

    def _on_success(self):
        self.failures = 0
        self.state = "CLOSED"

    def _on_failure(self):
        self.failures += 1
        self.last_failure_time = time.time()
        if self.failures >= self.failure_threshold:
            self.state = "OPEN"

# Usage
breaker = CircuitBreaker()
result = breaker.call(risky_api_call)

Q1554: How do you implement rate limiting?

Answer:

import time
from collections import defaultdict

class RateLimiter:
    def __init__(self, max_requests, time_window):
        self.max_requests = max_requests
        self.time_window = time_window
        self.requests = defaultdict(list)

    def is_allowed(self, key):
        now = time.time()
        # Remove old requests
        self.requests[key] = [
            req_time for req_time in self.requests[key]
            if now - req_time < self.time_window
        ]

        if len(self.requests[key]) >= self.max_requests:
            return False

        self.requests[key].append(now)
        return True

# Usage (Flask)
from flask import Flask, request

app = Flask(__name__)
limiter = RateLimiter(100, 60)

@app.route('/api')
def api():
    if not limiter.is_allowed(request.remote_addr):
        return "Too many requests", 429

    # Process request
    return "OK"

# Redis-based (distributed)
import redis

class RedisRateLimiter:
    def __init__(self, redis_client, max_requests, time_window):
        self.redis = redis_client
        self.max_requests = max_requests
        self.time_window = time_window

    def is_allowed(self, key):
        current = self.redis.incr(key)
        if current == 1:
            self.redis.expire(key, self.time_window)
        return current <= self.max_requests

Q1555: How do you implement service discovery?

Answer:

# Consul
# Install
apt install consul

# Configuration
# /etc/consul/config.json
{
  "datacenter": "dc1",
  "data_dir": "/var/consul",
  "ui_config": {
    "enabled": true
  },
  "retry_join": ["provider=aws tag_key=consul tag_value=server"],
  "server": true,
  "bootstrap_expect": 3
}

# Register service
# /etc/consul/service.json
{
  "service": {
    "name": "web",
    "port": 80,
    "check": {
      "http": "http://localhost:80/health",
      "interval": "10s"
    }
  }
}

# DNS interface
# Query service
dig @127.0.0.1 -p 8600 web.service.consul

# HTTP API
curl http://127.0.0.1:8500/v1/catalog/service/web

# Register in code
import consul
c = consul.Consul()

# Register service
c.agent.service.register(
    'web',
    service_id='web-1',
    port=80,
    check=consul.Check.http('http://localhost:80/health', '10s')
)

Linux Expert Topics

Q1556: How do you implement observability?

Answer:

# Distributed tracing with Jaeger
# Client integration
# Python
from opentelemetry import trace
from opentelemetry.exporter.jaeger.thrift import JaegerExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

trace.set_tracer_provider(TracerProvider())
jaeger_exporter = JaegerExporter(
    agent_host_name="jaeger",
    agent_port=6831,
)
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(jaeger_exporter)
)

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("operation") as span:
    span.set_attribute("key", "value")
    # Do work

# Metrics with Prometheus
from prometheus_client import Counter, generate_latest

requests_total = Counter('requests_total', 'Total requests')

@app.route('/')
def hello():
    requests_total.inc()
    return 'Hello'

# Export metrics
@app.route('/metrics')
def metrics():
    return generate_latest()

# Logging structured
import logging
import json

logger = logging.getLogger(__name__)
logger.info("Request processed", extra={
    "user_id": user.id,
    "duration_ms": duration
})

Q1557: How do you implement chaos engineering?

Answer:

# Chaos Mesh
# Install
helm repo add chaos-mesh https://charts.chaos-mesh.org
helm install chaos-mesh chaos-mesh/chaos-mesh -n chaos-mesh --create-namespace

# Pod failure experiment
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: pod-failure
spec:
  action: pod-failure
  mode: one
  duration: 60s
  selector:
    namespaces:
      - default
    labelSelectors:
      app: myapp

# Network chaos
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-delay
spec:
  action: delay
  mode: one
  duration: 60s
  selector:
    namespaces:
      - default
  delay:
    latency: 100ms

# Litmus
# Install
helm repo add litmuschaos https://litmuschaos.github.io/litmus-helm
helm install litmuschaos litmuschaos/litmus

# Use with AWS
# Simulate EC2 instance termination
aws ec2 terminate-instances --instance-ids i-1234567890abcdef0

Q1558: How do you implement multi-tenancy?

Answer:

# Kubernetes namespaces
kubectl create namespace tenant1
kubectl create namespace tenant2

# Resource quotas
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota
  namespace: tenant1
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"

# Limit ranges
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-limits
  namespace: tenant1
spec:
  limits:
  - max:
      cpu: "2"
      memory: "4Gi"
    min:
      cpu: "100m"
      memory: "128Mi"
    type: Container

# RBAC (admin is a built-in ClusterRole, so bind it with --clusterrole)
kubectl create rolebinding tenant1-admin \
  --clusterrole=admin \
  --user=user1 \
  --namespace=tenant1

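# Network policies
# The namespace must carry the matching label first:
# kubectl label namespace tenant1 tenant=tenant1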
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cross-namespace
  namespace: tenant1
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          tenant: tenant1
EOF

Q1559: How do you implement disaster recovery?

Answer:

# Backup Kubernetes
# ETCD backup
ETCDCTL_API=3 etcdctl snapshot save backup.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Restore ETCD (restore works on a local snapshot file, so no endpoints or certs;
# afterwards point etcd at the new data directory)
ETCDCTL_API=3 etcdctl snapshot restore backup.db \
  --data-dir=/var/lib/etcd-restore

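# Velero (Kubernetes backup)
# Install with the velero CLI (AWS shown; bucket, region and credentials file are placeholders)
velero install --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.6.0 \
  --bucket <bucket> \
  --backup-location-config region=<region> \
  --secret-file ./credentials-velero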

# Backup
velero backup create backup-2024-01-01 --include-namespaces default

# Restore
velero restore create --from-backup backup-2024-01-01

# Schedule backups (standard cron syntax: daily at 02:00)
velero schedule create daily --schedule="0 2 * * *"

# Database backup
mysqldump -u root -p mydb > backup.sql
pg_dump -U postgres mydb > backup.sql

# Object storage
aws s3 sync /data s3://bucket/backup/

Q1560: How do you implement security scanning?

Answer:

# Container scanning
# Trivy
trivy image myimage:latest
trivy image --severity HIGH,CRITICAL myimage:latest
trivy image --exit-code 1 --severity CRITICAL myimage:latest

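# Clair (local setup as used with clair-scanner; the Clair API listens on 6060)
docker run -d --name db arminc/clair-db:latest
docker run -d -p 6060:6060 --link db:postgres --name clair arminc/clair-local-scan:latest
clair-scanner --ip <host-ip> myimage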

# Infrastructure scanning
# Kube-bench
kube-bench run --targets node

# Kube-hunter (run as a pod inside the cluster)
kubectl run kube-hunter --rm -it --restart=Never --image=aquasec/kube-hunter -- --pod

# SAST
# Bandit (Python)
bandit -r myapp/

# Semgrep
semgrep --config=auto mycode/

# DAST
# OWASP ZAP
zap-baseline.py -t https://myapp.example.com

# Secret scanning
# TruffleHog
trufflehog filesystem myrepo/

# gitleaks (v8 syntax)
gitleaks detect --source=mydir --verbose

Linux Best Practices

Q1561: How do you implement backup strategy?

Answer:

# 3-2-1 backup rule
# 3 copies of data
# 2 different storage types
# 1 offsite copy
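
# 3-2-1 check sketch
The rule above can be expressed as a small check over an inventory of copies. This is only an illustrative sketch: the BackupCopy type, its fields and the example inventory are assumptions made for the example, and the production copy is counted as one of the three.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str      # e.g. "disk", "tape", "object-storage"
    offsite: bool

def satisfies_3_2_1(copies):
    """Return True if the copies meet the 3-2-1 rule."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite

copies = [
    BackupCopy("production", "disk", offsite=False),
    BackupCopy("nas-snapshot", "disk", offsite=False),
    BackupCopy("s3-archive", "object-storage", offsite=True),
]
print(satisfies_3_2_1(copies))  # True
```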

# Backup types
# Full backup
tar -czf full-backup-$(date +%Y%m%d).tar.gz /data

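# Incremental backup
# The first run creates a full (level 0) backup; later runs with the same
# snapshot file (-g) archive only what changed since the previous run
tar -czf backup-$(date +%Y%m%d).tar.gz -g /var/log/backup.snar /data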

# Differential backup
# After first full backup
tar -czf differential-$(date +%Y%m%d).tar.gz -N "2024-01-01" /data

# Database backup
mysqldump -u root -p --all-databases > all-databases-$(date +%Y%m%d).sql
pg_dumpall -U postgres > all-databases-$(date +%Y%m%d).sql

# Automated backup script
#!/bin/bash
set -euo pipefail   # stop on errors, unset variables and failed pipes
BACKUP_DIR="/backup"
DATE=$(date +%Y%m%d)

# Database
mysqldump -u root mydb | gzip > "$BACKUP_DIR/mydb-$DATE.sql.gz"

# Files
tar -czf "$BACKUP_DIR/files-$DATE.tar.gz" /data

# Retention (delete backups older than 30 days)
find "$BACKUP_DIR" -type f -mtime +30 -delete

Q1562: How do you implement monitoring strategy?

Answer:

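# Prometheus + Grafana
# Install (the old stable/prometheus-operator chart is deprecated)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack \
  --set grafana.service.type=LoadBalancer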

# Define metrics
# node_exporter
# - node_cpu_seconds_total
# - node_memory_MemTotal_bytes
# - node_filesystem_size_bytes

# Custom application metrics
from prometheus_client import Counter, Gauge, Histogram

requests_total = Counter('app_requests_total', 'Total requests')
processing_duration = Histogram('app_processing_duration_seconds', 'Request processing time in seconds')

# Alerting rules
# prometheus.rules
- alert: HighCPU
  expr: 100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
  for: 5m
  labels:
    severity: warning

# AlertManager
# alertmanager.yaml
route:
  group_by: ['alertname']
  receiver: 'team'
receivers:
- name: 'team'
  email_configs:
  - to: 'team@example.com'
  slack_configs:
  - api_url: 'https://hooks.slack.com/...'

Q1563: How do you implement logging strategy?

Answer:

# ELK Stack
# Filebeat
filebeat.inputs:
- type: log
  paths:
    - /var/log/*.log
  fields:
    type: syslog
  fields_under_root: true

output.logstash:
  hosts: ["logstash:5044"]

# Logstash
input { beats { port => 5044 } }
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:hostname} %{DATA:program}: %{GREEDYDATA:message}" }
      overwrite => ["message"]
    }
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
}

# Kibana
# Create index pattern
# Create dashboards

# Structured logging
# JSON format
import logging
import json

class JSONFormatter(logging.Formatter):
    def format(self, record):
        log_data = {
            'timestamp': self.formatTime(record),
            'level': record.levelname,
            'message': record.getMessage(),
            'module': record.module
        }
        return json.dumps(log_data)

Q1564: How do you implement incident response?

Answer:

# Incident response plan
# 1. Detection
# Monitor alerts -> PagerDuty -> On-call engineer

# 2. Assessment
# Check severity -> Determine impact

# 3. Communication
# Create incident channel
# Update status page

# 4. Mitigation
# Stop bleeding
# Restore service

# 5. Resolution
# Fix root cause
# Deploy fix

# Runbook example
# Runbook: Database Connection Issues
# 1. Check database status
#    systemctl status postgresql
# 2. Check connections
#    psql -c "SELECT count(*) FROM pg_stat_activity"
# 3. Check slow queries
#    psql -c "SELECT * FROM pg_stat_activity WHERE state != 'idle' LIMIT 10"
# 4. Kill long-running queries
#    SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE query_start < NOW() - INTERVAL '5 minutes';
# 5. If needed, restart database
#    systemctl restart postgresql

# Post-incident
# 1. Document timeline
# 2. Identify root cause
# 3. Implement fix
# 4. Review and improve

Q1565: How do you implement capacity planning?

Answer:

# Metrics collection (sar prints whitespace-aligned text, not CSV)
# CPU
sar -u 1 60 > cpu_usage.txt
# Memory
sar -r 1 60 > memory_usage.txt
# I/O
sar -d 1 60 > io_usage.txt
# Network
sar -n DEV 1 60 > network_usage.txt

# Analysis
# Growth rate
# (current_value - past_value) / days_between

# Capacity planning formula
# CPU: (peak_usage * growth_factor * buffer) / cores
# Memory: (peak_usage * growth_factor * buffer) / available
# Disk: (current_usage * (1 + growth_rate)^years)
# Network: peak_bandwidth * redundancy_factor
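
# Growth projection sketch
The growth-rate and disk formulas above translate directly into a few helper functions. This is a rough sketch with made-up numbers; days_until_full is an extra helper derived from the linear growth rate and is not one of the formulas listed above.

```python
def daily_growth(current_value, past_value, days_between):
    """Linear growth per day: (current - past) / days."""
    return (current_value - past_value) / days_between

def days_until_full(capacity, current_value, growth_per_day):
    """Days left before capacity is reached at linear growth; None if not growing."""
    if growth_per_day <= 0:
        return None
    return (capacity - current_value) / growth_per_day

def projected_disk(current_usage, growth_rate, years):
    """Compound projection: current_usage * (1 + growth_rate) ** years."""
    return current_usage * (1 + growth_rate) ** years

# Example values in GB (illustrative only)
growth = daily_growth(current_value=600, past_value=540, days_between=30)
print(growth)                               # 2.0
print(days_until_full(1000, 600, growth))   # 200.0
print(round(projected_disk(600, 0.5, 2)))   # 1350
```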

# Tools
# Google SRE capacity planning
# Horizontal Pod Autoscaler metrics
kubectl autoscale deployment myapp --cpu-percent=80 --min=2 --max=10

# Vertical Pod Autoscaler
kubectl apply -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"
EOF

Linux Expert Level

Q1566: How do you design highly available systems?

Answer:

# HA architecture
# Load balancer -> Web servers -> Database (primary + replica)
#             \-> Cache (Redis Sentinel)
#             \-> Message queue (Kafka/RabbitMQ cluster)

# Keepalived + HAProxy
# /etc/keepalived/keepalived.conf
# Define the vrrp_script before the vrrp_instance that tracks it
vrrp_script check_haproxy {
    script "pkill -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100

    virtual_ipaddress {
        192.168.1.100
    }

    track_script {
        check_haproxy
    }
}

# HAProxy backend
backend web
    balance roundrobin
    option httpchk
    http-check expect status 200
    server web1 192.168.1.10:80 check inter 2000 fall 3 rise 2
    server web2 192.168.1.11:80 check inter 2000 fall 3 rise 2 backup

# Database HA
# See PostgreSQL replication earlier

# DNS failover
# Route 53 health checks

Q1567: How do you design scalable systems?

Answer:

# Horizontal scaling
# Add more instances behind load balancer
# Auto-scaling based on metrics

# Vertical scaling
# Increase instance size
# Usually requires a restart, so plan for brief downtime

# Database scaling
# Read replicas
# Sharding
# Partitioning

# Cache scaling
# Redis cluster mode
redis-cli --cluster create 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 \
  127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006 --cluster-replicas 1

# Message queue scaling
# Kafka
# Partition across brokers
# Replicate for redundancy

# CDN
# CloudFront, Cloudflare
# Cache static assets at edge

# Stateless application design
# Store sessions in Redis
# Store files in S3
# Database for persistent data

Q1568: How do you design secure systems?

Answer:

# Defense in depth
# 1. Network security
# - Firewalls
# - Network segmentation
# - VPN

# 2. Application security
# - Input validation
# - Output encoding
# - Parameterized queries
# - Security headers

# 3. Data security
# - Encryption at rest
# - Encryption in transit
# - Key management
# - Backup encryption

# 4. Identity and access
# - RBAC
# - MFA
# - Least privilege
# - Regular access review

# 5. Monitoring
# - SIEM
# - IDS/IPS
# - Vulnerability scanning
# - Penetration testing

# Compliance
# - GDPR, HIPAA, PCI-DSS
# - Audit logging
# - Data retention policies

Q1569: How do you implement immutable infrastructure?

Answer:

# Packer
# Build immutable images
packer build template.json

# No SSH access in production
# Use Systems Manager Session Manager

# Cloud-init for configuration
#cloud-config
package_update: true
packages:
  - nginx

# Container-based deployment
# Never modify running containers
# Rebuild and redeploy

# Infrastructure as Code
# Terraform
terraform apply -var-file=prod.tfvars

# GitOps
# ArgoCD
argocd app sync myapp

# Blue-green deployments
# Deploy to new environment
# Switch traffic
# Keep old environment for rollback

Q1570: How do you implement cost optimization?

Answer:

# Right-sizing
# Monitor utilization
# Downsize over-provisioned instances

# Reserved instances
# For steady-state workloads

# Spot instances
# For batch jobs
# With checkpointing

# Autoscaling
# Scale down during off-hours

# Storage optimization
# Use appropriate storage classes
# Delete unused data
# Implement lifecycle policies

# Network optimization
# Use private subnets
# Use VPC endpoints
# Use CDN for static content

# Cost monitoring
# AWS Cost Explorer
# Budget alerts

# Tools
# cloud-custodian
# Filter and take action on resources (-s names the output directory)
custodian run -s output policy.yml

Linux DevOps Advanced

Q1571: How do you implement CI/CD pipelines?

Answer:

# .gitlab-ci.yml (GitLab CI)
stages:
  - build
  - test
  - security
  - deploy

variables:
  DOCKER_DRIVER: overlay2

build:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker build -t $IMAGE:$CI_COMMIT_SHA .
    - docker push $IMAGE:$CI_COMMIT_SHA

test:
  stage: test
  image: $IMAGE:$CI_COMMIT_SHA
  script:
    - npm test
    - npm run lint
  coverage: '/Coverage: \d+\.\d+%/'

security:
  stage: security
  image:
    name: aquasec/trivy:latest
    entrypoint: [""]
  script:
    - trivy image --exit-code 0 --severity HIGH,CRITICAL $IMAGE:$CI_COMMIT_SHA
  allow_failure: true

deploy-staging:
  stage: deploy
  script:
    - kubectl set image deployment/myapp myapp=$IMAGE:$CI_COMMIT_SHA
    - kubectl rollout status deployment/myapp
  environment:
    name: staging

deploy-production:
  stage: deploy
  script:
    - kubectl set image deployment/myapp myapp=$IMAGE:$CI_COMMIT_SHA
    - kubectl rollout status deployment/myapp
  environment:
    name: production
  when: manual
  only:
    - main

Q1572: How do you implement infrastructure testing?

Answer:

# Infrastructure as Code testing
# terraform validate
terraform validate
terraform plan -out=tfplan

# Terratest (Go)
package test

import (
    "testing"
    "github.com/gruntwork-io/terratest/modules/terraform"
)

func TestTerraform(t *testing.T) {
    terraformOptions := &terraform.Options{
        TerraformDir: "../examples/basic",
    }

    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)
}

# InSpec
# controls/server.rb
control 'server-01' do
  impact 1.0
  title 'Server should be configured properly'

  describe package('nginx') do
    it { should be_installed }
  end

  describe service('nginx') do
    it { should be_running }
    it { should be_enabled }
  end
end

# Run
inspec exec profile/

Q1573: How do you implement secret management?

Answer:

# HashiCorp Vault
# Start server
vault server -config=config.hcl

# Enable secrets engine (KV v2, which matches the secret/data/... path in the pod annotation)
vault secrets enable -path=secret kv-v2

# Write secret
vault kv put secret/myapp/db password=secretpassword

# Read secret
vault kv get secret/myapp/db

# Use with Kubernetes
# Install Vault Agent Injector
helm repo add hashicorp https://helm.releases.hashicorp.com
helm install vault hashicorp/vault \
  --set "injector.enabled=true"

# Annotate pod
# metadata:
#   annotations:
#     vault.hashicorp.com/agent-inject: "true"
#     vault.hashicorp.com/role: "myapp"
#     vault.hashicorp.com/agent-inject-secret-db: "secret/data/myapp/db"

# Use in application
# Read from /vault/secrets/db file

# AWS Secrets Manager
aws secretsmanager create-secret \
  --name myapp/db \
  --secret-string '{"username":"admin","password":"secret"}'

# Kubernetes Secrets
kubectl create secret generic myapp-secrets \
  --from-literal=username=admin \
  --from-literal=password=secret

Q1574: How do you implement service mesh?

Answer:

# Istio installation
istioctl install --set profile=demo

# Deploy application
kubectl apply -f myapp.yaml

# Enable mutual TLS
kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
spec:
  mtls:
    mode: STRICT
EOF

# Traffic management (subsets v1 and v2 must be defined in a DestinationRule)
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp
  http:
  - route:
    - destination:
        host: myapp
        subset: v1
      weight: 90
    - destination:
        host: myapp
        subset: v2
      weight: 10

# Observability
# Enable tracing
istioctl install --set profile=demo --set meshConfig.enableTracing=true

# View dashboards
istioctl dashboard kiali

Q1575: How do you implement edge computing?

Answer:

# K3s (lightweight Kubernetes)
curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" sh -

# KubeEdge
# Cloud node
helm install cloudcore kubeedge/cloudcore --namespace kubeedge

# Edge node
# Install edgecore
wget https://github.com/kubeedge/kubeedge/releases/download/v1.12.0/kubeedge_1.12.0_linux_amd64.tar.gz
tar -xzf kubeedge_1.12.0_linux_amd64.tar.gz

# Run edgecore
edgecore --config=/etc/kubeedge/config/edgecore.yaml

# Deploy to edge
kubectl apply -f deployment.yaml

# Use case: IoT
# Collect sensor data at edge
# Process locally
# Send aggregated data to cloud

Linux Expert Scenarios

Q1576: How do you handle production incidents?

Answer:

# Incident response workflow
# 1. Detection
# - Alerts from monitoring
# - User reports

# 2. Triage
# - Assess severity (SEV1-4)
# - Identify impact
# - Determine if customer-facing

# 3. Communication
# - Create incident channel
# - Update status page
# - Notify stakeholders

# 4. Mitigation
# - Stop bleeding (rollbacks, traffic shift)
# - Apply fix

# 5. Resolution
# - Verify fix
# - Confirm recovery

# 6. Post-mortem
# - Document timeline
# - Identify root cause (5 whys)
# - Action items

# Example incident
# Database down
# 1. Check status
# systemctl status postgresql
# 2. Attempt restart
# systemctl restart postgresql
# 3. If failed, promote replica
# pg_ctl promote -D /var/lib/postgresql/data
# 4. Verify
# psql -c "SELECT 1"
# 5. Document

Q1577: How do you perform root cause analysis?

Answer:

# 5 Whys Analysis
# Problem: API response time increased
# Why 1: Database queries slow
# Why 2: Missing index
# Why 3: New feature added without proper schema review
# Why 4: Code review didn't catch it
# Why 5: Process doesn't require schema review

# Corrective Action: Implement schema review in CI/CD

# Tools for RCA
# Logs
journalctl -u service -n 100

# Metrics
# Compare before/after
sar -q

# Traces
# Jaeger, Zipkin

# Dumps
# Core files, heap dumps

# Timeline
# Create incident timeline
# 14:00 - Alert triggered
# 14:05 - On-call acknowledged
# 14:10 - Root cause identified
# 14:15 - Fix deployed
# 14:20 - Service recovered
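
# Timeline metrics sketch
A timeline like the one above is enough to derive basic response metrics. The following sketch computes time to acknowledge and time to recover from the same five entries; the metric names are common shorthand rather than anything defined in this answer.

```python
from datetime import datetime

timeline = [
    ("14:00", "Alert triggered"),
    ("14:05", "On-call acknowledged"),
    ("14:10", "Root cause identified"),
    ("14:15", "Fix deployed"),
    ("14:20", "Service recovered"),
]

def minutes_between(start, end):
    """Minutes between two HH:MM stamps on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)

events = {label: stamp for stamp, label in timeline}
time_to_acknowledge = minutes_between(events["Alert triggered"], events["On-call acknowledged"])
time_to_recover = minutes_between(events["Alert triggered"], events["Service recovered"])
print(time_to_acknowledge, time_to_recover)  # 5 20
```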

Q1578: How do you optimize cloud costs?

Answer:

# Cost optimization strategies
# 1. Right-sizing instances
# Use CloudWatch metrics
aws ec2 describe-instance-types --instance-types t3.micro

# 2. Reserved instances
# For predictable workloads

# 3. Spot instances
# For fault-tolerant workloads

# 4. Autoscaling
# Scale in when not needed

# 5. Storage lifecycle
# Move cold data to Glacier
aws s3 ls
aws s3api put-bucket-lifecycle-configuration --bucket mybucket \
  --lifecycle-configuration file://lifecycle.json

# 6. Delete unused resources
# Find unattached volumes
aws ec2 describe-volumes --filters Name=status,Values=available

# 7. Use managed services
# RDS, Lambda instead of EC2

# 8. Budget alerts
aws budgets create-budget \
  --account-id 123456789012 \
  --budget file://budget.json

Q1579: How do you implement compliance?

Answer:

# Compliance frameworks
# SOC 2, PCI-DSS, HIPAA, GDPR

# Audit logging
# Enable auditd
systemctl enable --now auditd

# Rules
# /etc/audit/audit.rules
-w /etc/passwd -p wa -k passwd_changes
-w /etc/shadow -p wa -k shadow_changes
-w /etc/sudoers -p wa -k sudoers_changes

# Review logs
aureport -f
ausearch -k passwd_changes

# Vulnerability scanning
# OpenVAS, Nessus, Qualys

# Penetration testing
# Annual third-party pen tests

# Data encryption
# At rest
# LUKS, TDE

# In transit
# TLS 1.2+

# Access reviews
# Quarterly user access review

# Documentation
# Policies and procedures
# Evidence collection
# Compliance reports

Q1580: How do you design disaster recovery?

Answer:

# DR strategies
# RTO (Recovery Time Objective)
# RPO (Recovery Point Objective)

# Strategy comparison
# Backup & Restore
# - RTO: Hours
# - RPO: Days

# Pilot Light
# - RTO: Minutes to hours
# - RPO: Hours

# Warm Standby
# - RTO: Minutes
# - RPO: Minutes

# Multi-Region Active-Active
# - RTO: Near zero
# - RPO: Near zero
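
# Strategy selection sketch
The comparison above can be turned into a simple lookup: pick the cheapest strategy whose RTO and RPO fit the requirement. The numeric bounds below are illustrative assumptions chosen to match the rough "hours", "minutes" and "near zero" tiers; they are not figures from any standard.

```python
# (name, worst-case RTO in minutes, worst-case RPO in minutes), cheapest first.
# All numbers are assumed values for illustration.
STRATEGIES = [
    ("Backup & Restore", 24 * 60, 7 * 24 * 60),
    ("Pilot Light", 4 * 60, 12 * 60),
    ("Warm Standby", 30, 30),
    ("Multi-Region Active-Active", 1, 1),
]

def cheapest_strategy(required_rto_min, required_rpo_min):
    """Return the first (cheapest) strategy that meets both targets, or None."""
    for name, rto, rpo in STRATEGIES:
        if rto <= required_rto_min and rpo <= required_rpo_min:
            return name
    return None

print(cheapest_strategy(required_rto_min=60, required_rpo_min=60))  # Warm Standby
```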

# Implementation
# 1. Backup data
# mysqldump --all-databases | aws s3 cp - s3://bucket/backup.sql

# 2. Replicate data
# PostgreSQL streaming replication to DR region

# 3. Infrastructure as Code
# terraform init
# terraform apply

# 4. Regular DR testing
# Quarterly DR tests
# Document results

# 5. Runbook
# Document recovery procedures

Linux Expert Scenarios

Q1581: How do you handle zero-downtime deployment?

Answer:

# Blue-green deployment
# Deploy to green environment
# Test green
# Switch traffic
# Monitor
# If issues, rollback to blue

# Rolling deployment
# Update a few pods at a time (kubectl rolling-update was removed in 1.18)
kubectl set image deployment/myapp myapp=myapp:v2
kubectl rollout status deployment/myapp

# Canary deployment
# Route 10% to new version
# Monitor metrics
# Gradually increase
# Rollback if issues
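
# Canary routing sketch
One way to send a fixed share of traffic to the new version is to hash a stable key such as the user ID into 100 buckets. This is a minimal sketch of that idea, not how any particular load balancer or mesh does it; the function names and the user IDs are made up for the example.

```python
import hashlib

def bucket(user_id):
    """Map a user ID to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100

def route(user_id, canary_percent):
    """Return "canary" for roughly canary_percent of users, else "stable"."""
    return "canary" if bucket(user_id) < canary_percent else "stable"

users = [f"user-{i}" for i in range(10_000)]
share = sum(route(u, 10) == "canary" for u in users) / len(users)
print(f"canary share at 10%: {share:.1%}")
```

Because the bucket depends only on the user ID, a user who lands on the canary at 10% stays there when the weight is raised to 30% or 50%.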

# Kubernetes (Argo Rollouts; selector and pod template omitted)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
spec:
  replicas: 10
  strategy:
    canary:
      maxSurge: "25%"
      maxUnavailable: 0
      steps:
      - setWeight: 10
      - pause: {duration: 10m}
      - setWeight: 30
      - pause: {duration: 10m}
      - setWeight: 50
      - pause: {duration: 10m}
      - setWeight: 100

# Feature flags
# See earlier question

Q1582: How do you handle database migrations?

Answer:

# Zero-downtime migrations
# 1. Add new column (nullable)
ALTER TABLE users ADD COLUMN new_field VARCHAR(255);

# 2. Write to both columns
# Application code change

# 3. Backfill data
UPDATE users SET new_field = old_field;

# 4. Make new column NOT NULL
ALTER TABLE users MODIFY COLUMN new_field VARCHAR(255) NOT NULL;

# 5. Remove old column
ALTER TABLE users DROP COLUMN old_field;
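
# Batched backfill sketch
On a large table, the single UPDATE in step 3 can hold locks for a long time, so the backfill is often done in small batches. The sketch below shows the pattern with SQLite from the standard library; the id primary key is an assumption, and the exact SQL (LIMIT inside a subquery) differs between database engines.

```python
import sqlite3

def backfill_in_batches(conn, batch_size=1000):
    """Copy old_field into new_field in small batches; return rows updated."""
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE users SET new_field = old_field
            WHERE id IN (
                SELECT id FROM users
                WHERE new_field IS NULL AND old_field IS NOT NULL
                LIMIT ?
            )
            """,
            (batch_size,),
        )
        conn.commit()  # short transactions keep lock time low
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

# Demo on an in-memory database with 2500 rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, old_field TEXT, new_field TEXT)")
conn.executemany("INSERT INTO users (old_field) VALUES (?)", [(f"v{i}",) for i in range(2500)])
print(backfill_in_batches(conn))  # 2500
```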

# For PostgreSQL
# Avoid long table locks
# Create index concurrently
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);

# For MySQL
# Use pt-online-schema-change (DSN: D=database, t=table)
pt-online-schema-change --alter "ADD COLUMN new_field VARCHAR(255)" \
  D=mydb,t=users --execute

# Rollback plan
# Keep old column
# Dual write
# Test thoroughly

Q1583: How do you handle capacity emergencies?

Answer:

# Emergency response
# 1. Immediate mitigation
# Scale up
kubectl scale deployment myapp --replicas=20

# Add capacity
# In AWS
aws autoscaling set-desired-capacity \
  --auto-scaling-group-name my-asg \
  --desired-capacity 10

# 2. Identify root cause
# Check metrics
# Check logs
# Common issues
# - Traffic spike
# - Slow query
# - Memory leak

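# 3. Short-term fix
# Clear cache (caution: FLUSHALL empties every Redis database and
# can push the full load back onto the backend)
redis-cli FLUSHALL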

# Kill expensive queries
# PostgreSQL (active queries only, never the current session)
SELECT pg_terminate_backend(pid) FROM pg_stat_activity
  WHERE state = 'active'
    AND pid <> pg_backend_pid()
    AND query_start < NOW() - INTERVAL '5 minutes';

# 4. Long-term fix
# Optimize code
# Add capacity
# Implement caching

Q1584: How do you handle security incidents?

Answer:

# Security incident response
# 1. Detection
# - SIEM alerts
# - IDS alerts
# - User reports

# 2. Containment
# Isolate affected systems
# iptables -I INPUT -s attacker_ip -j DROP
# iptables -I OUTPUT -d attacker_ip -j DROP

# 3. Investigation
# Collect evidence
# tcpdump -i eth0 -w capture.pcap
# Forensics

# 4. Eradication
# Remove malware
# Patch vulnerability
# Reset compromised credentials

# 5. Recovery
# Restore from clean backup
# Verify system integrity

# 6. Lessons learned
# Document incident
# Update security controls

# Tools
# - chkrootkit
# - rkhunter
# - ClamAV
# - OSSEC

Q1585: How do you handle data corruption?

Answer:

# Data corruption response
# 1. Identify corruption
# Check logs
# Verify checksums against known-good values
# sha256sum (md5sum only detects accidental corruption)

# 2. Stop writes
# Read-only mount
# mount -o remount,ro /data

# 3. Restore from backup
# Find last good backup
# Restore
# mysql -u root -p mydb < backup.sql

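# 4. Point-in-time recovery
# PostgreSQL (pg_restore cannot do this; PITR replays archived WAL)
# Restore a base backup, then set in postgresql.conf:
# restore_command = 'cp /path/to/wal_archive/%f %p'
# recovery_target_time = '2024-01-01 12:00:00'
# touch $PGDATA/recovery.signal (PostgreSQL 12+), then start the server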

# 5. Verify integrity
# Check application data
# Run database checks

# 6. Prevention
# Enable checksums
# Regular backups
# Monitoring
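
The "verify checksums" step only works if known-good checksums were recorded before the corruption happened. A minimal sketch of that idea follows, using only the standard library; the manifest layout (a dict of path to hex digest) and the function names are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(paths):
    """Record a known-good checksum for each file as {path: hex digest}."""
    return {str(p): sha256_of(p) for p in paths}


def find_corrupted(manifest):
    """Return paths that are missing or whose checksum no longer matches."""
    bad = []
    for path, expected in manifest.items():
        if not Path(path).is_file() or sha256_of(path) != expected:
            bad.append(path)
    return sorted(bad)
```

In practice the manifest would be stored away from the data it describes, so that one failure cannot damage both.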

Q1586: How do you handle network outages?

Answer:

# Network outage response
# 1. Verify outage
# ping gateway
# ping 8.8.8.8

# 2. Check interfaces
# ip link
# ip addr

# 3. Check DNS
# cat /etc/resolv.conf
# nslookup example.com

# 4. Check routes
# ip route

# 5. Recovery steps
# Reset network (Debian/Ubuntu with ifupdown; elsewhere restart
# NetworkManager or systemd-networkd instead)
systemctl restart networking

# Or
# ip link set eth0 down
# ip link set eth0 up

# For DNS issues
# resolvectl flush-caches (older releases: systemd-resolve --flush-caches)

# For cloud
# AWS
aws ec2 describe-instance-status --instance-ids i-xxx

# 6. Contact provider
# If not resolvable internally

Q1587: How do you handle performance degradation?

Answer:

# Performance troubleshooting
# 1. Identify symptoms
# Check metrics
# top
# iostat 1

# 2. Locate bottleneck
# CPU bound?
top
ps aux --sort=-%cpu

# Memory bound?
free -h
vmstat 1

# I/O bound?
iostat -xz 1

# Network bound?
iftop
nethogs

# 3. Fix
# CPU: Scale, optimize code
# Memory: Add RAM, fix leaks
# I/O: Use faster storage
# Network: Reduce payload sizes, add bandwidth, cache closer to clients

# 4. Verify
# Monitor metrics
# Compare before/after

Q1588: How do you handle authentication failures?

Answer:

# Authentication troubleshooting
# 1. Check logs
journalctl -u sshd | tail -50
tail -f /var/log/auth.log

# 2. Verify user exists
getent passwd username
id username

# 3. Check SSH configuration
# /etc/ssh/sshd_config
# PasswordAuthentication yes
# PubkeyAuthentication yes
# AllowUsers username

# 4. Test authentication
# SSH with debug
ssh -vvv user@host

# 5. Reset password
passwd username

# 6. Check PAM
# /etc/pam.d/sshd

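# 7. For LDAP
# Check connectivity (server URI and base DN are placeholders)
ldapsearch -x -H ldap://ldap.example.com -D "cn=admin,dc=example,dc=com" -W \
  -b "dc=example,dc=com" "(uid=username)"

# Check sssd (foreground, highest numeric debug level)
sssd -i -d 9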

Q1589: How do you handle storage full?

Answer:

# Storage full response
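# 1. Find large directories (-x stays on one filesystem)
df -h
du -xsh /* 2>/dev/null
du -xsh /var/*
du -xsh /var/log/*

# 2. Find the largest files and directories
du -ahx / 2>/dev/null | sort -rh | head -20

# 3. Clean logs
journalctl --vacuum-size=100M
find /var/log -type f -name "*.gz" -mtime +30 -delete

# 4. Clean tmp (age-based, so files in active use are left alone)
find /tmp -xdev -type f -mtime +7 -delete
find /var/tmp -xdev -type f -mtime +30 -delete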

# 5. Clean package cache
apt clean
yum clean all

# 6. Docker cleanup
docker system prune -a

# 7. Find deleted files still open
lsof +L1

# 8. Extend storage
# Add volume
# Add to LVM

Q1590: How do you handle kernel panic?

Answer:

# Kernel panic response
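# 1. Verify panic
# Check logs from the previous boot (dmesg only covers the current boot;
# this needs a persistent journal)
journalctl -k -b -1 | tail -100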

# 2. Configure kdump
apt install kdump-tools

# 3. Analyze crash
# /var/crash/
crash /var/crash/202401011200/vmcore /usr/lib/debug/boot/vmlinux-$(uname -r)

# 4. Common causes
# - Hardware failure (RAM, disk)
# - Driver issues
# - OOM
# - Kernel bugs

# 5. Fixes
# Update kernel
# Disable problematic driver
# Add RAM
# Fix OOM settings

# 6. Prevention
# Monitor resources
# Keep kernel updated
# Use hardware from compatibility list

Q1591: How do you implement infrastructure monitoring?

Answer:

# Infrastructure monitoring
# Prometheus + Grafana
# Node exporter
node_exporter --collector.filesystem.mount-points-exclude="^/(sys|proc|run)"

# Custom metrics
# Python client
from prometheus_client import Counter

requests_total = Counter('app_requests_total', 'Total requests')

# Alert rules
- alert: HighCPU
  expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
  for: 5m

- alert: HighMemory
  expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 90
  for: 5m

# Dashboards
# Import from Grafana.com

# Logs
# ELK Stack
# Loki + Grafana

Q1592: How do you implement application monitoring?

Answer:

# APM (Application Performance Monitoring)
# Jaeger
# Python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("operation") as span:
    span.set_attribute("key", "value")

# Prometheus metrics
from prometheus_client import Counter, Histogram, Gauge

request_count = Counter('http_requests_total', 'Total HTTP requests')
request_duration = Histogram('http_request_duration_seconds', 'HTTP request duration in seconds')
active_users = Gauge('active_users', 'Number of active users')

# Health checks
# Kubernetes liveness/readiness probes
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

Q1593: How do you implement log analysis?

Answer:

# Log analysis
# ELK Stack
# Elasticsearch
# Logstash
# Kibana

# Loki
# Grafana + Loki

# Structured logging
# JSON format
import json
import logging

class JSONFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            'timestamp': self.formatTime(record),
            'level': record.levelname,
            'message': record.getMessage(),
            'module': record.module
        })

# Log levels
# DEBUG - Detailed info
# INFO - Confirmation
# WARNING - Something unexpected
# ERROR - Serious problem
# CRITICAL - Very serious problem

# Analysis queries
# Find errors
grep -i error /var/log/app.log

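# Count by hour (assumes lines start with "YYYY-MM-DD HH:MM:SS")
awk '{print $1, substr($2, 1, 2)}' /var/log/app.log | sort | uniq -c

# Slow requests (assumes log_format ends with $request_time;
# in the default combined format $9 is the status code)
awk '$NF > 5 {print}' /var/log/nginx/access.log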
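
The two queries above can also be written in Python when the logic needs to grow. This is a minimal sketch under two assumptions: application log lines begin with a "YYYY-MM-DD HH:MM:SS" timestamp, and the nginx access log has been configured to end each line with the request time in seconds. The function names are illustrative.

```python
from collections import Counter


def count_by_hour(lines):
    """Count log lines per hour, keyed as 'YYYY-MM-DD HH'."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 2 or len(parts[1]) < 2:
            continue  # no timestamp on this line
        counts[f"{parts[0]} {parts[1][:2]}"] += 1
    return counts


def slow_requests(lines, threshold_seconds=5.0):
    """Return access-log lines whose last field (request time) exceeds the threshold."""
    slow = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        try:
            elapsed = float(fields[-1])
        except ValueError:
            continue  # last field is not numeric; the log format differs
        if elapsed > threshold_seconds:
            slow.append(line)
    return slow
```

Both functions accept any iterable of strings, so an open file object can be passed directly.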

Q1594: How do you implement alerting?

Answer:

# Alerting
# Prometheus + AlertManager
# alertmanager.yaml
route:
  group_by: ['alertname']
  receiver: 'team'
  group_wait: 10s
  group_interval: 10s

receivers:
- name: 'team'
  email_configs:
  - to: 'team@example.com'
  slack_configs:
  - api_url: 'https://hooks.slack.com/...'
    channel: '#alerts'

# PagerDuty integration
- name: 'pagerduty'
  pagerduty_configs:
  - service_key: 'KEY'

# Best practices
# 1. Alert on symptoms, not causes
# 2. Set appropriate thresholds
# 3. Avoid alert fatigue
# 4. Have runbooks
# 5. Test alerts regularly

Q1595: How do you implement backup verification?

Answer:

# Backup verification
# 1. Test restoration
# Restore to test environment
mysql -u root -p test < backup.sql
psql -U postgres test < backup.sql

# 2. Automated verification
#!/bin/bash
BACKUP_FILE=$1

# Verify backup file exists
if [ ! -f "$BACKUP_FILE" ]; then
    echo "Backup file not found"
    exit 1
fi

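# Verify file size (GNU stat; BSD/macOS uses stat -f%z)
SIZE=$(stat -c%s "$BACKUP_FILE")
if [ "$SIZE" -lt 1000 ]; then
    echo "Backup file too small"
    exit 1
fi

# Verify file integrity
if [[ "$BACKUP_FILE" == *.gz ]]; then
    gzip -t "$BACKUP_FILE" || { echo "Corrupt gzip archive"; exit 1; }
elif [[ "$BACKUP_FILE" == *.sql ]]; then
    # mysqldump output starts with a "-- MySQL dump" header
    head -1 "$BACKUP_FILE" | grep -q "MySQL" || { echo "Unexpected dump header"; exit 1; }
fi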

# Verify database can be restored
# (Run in isolated environment)
# Report status

Linux Expert Advanced

Q1596: How do you design multi-region architecture?

Answer:

# Multi-region design
# DNS failover
# Route 53 health checks
aws route53 create-health-check --caller-reference "$(date +%s)" --health-check-config '{"Type":"HTTPS","FullyQualifiedDomainName":"example.com","Port":443,"ResourcePath":"/health"}'

# Database replication
# PostgreSQL
# Primary in us-east-1
# Replica in us-west-2

# Object storage
# S3 cross-region replication
aws s3api put-bucket-replication \
  --bucket source-bucket \
  --replication-configuration file://replication.json

# Cache
# ElastiCache Global Datastore for Redis
aws elasticache create-global-replication-group \
  --global-replication-group-id-suffix my-global \
  --primary-replication-group-id primary-id

# CDN
# CloudFront
aws cloudfront create-distribution \
  --origin-domain-name mybucket.s3.amazonaws.com

# Traffic management
# Global Accelerator (API endpoint lives in us-west-2)
aws globalaccelerator create-accelerator --name my-accelerator --region us-west-2

Q1597: How do you implement zero trust?

Answer:

# Zero trust architecture
# 1. Identity verification
# MFA everywhere
# Conditional access policies

# 2. Network segmentation
# Micro-segmentation
# Private links
# Service mesh

# 3. Device trust
# Endpoint detection
# Mobile Device Management

# 4. Application security
# OAuth 2.0
# JWT validation

# 5. Data protection
# Encryption everywhere

# Implementation
# Kubernetes network policies
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
EOF

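# Service mesh mTLS
# Istio (there is no "strict" install profile; strict mTLS is set with a
# PeerAuthentication policy using mtls.mode: STRICT)
istioctl install --set profile=default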

# BeyondCorp
# Access proxy
# No VPN needed

Q1598: How do you implement chaos engineering?

Answer:

# Chaos engineering
# 1. Define steady state
# 2. Hypothesize
# 3. Run experiment
# 4. Observe
# 5. Fix

# Tools
# Chaos Monkey (Netflix)
# Litmus
# Chaos Mesh

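# Example: Kill random pod
# (namespace and app label are placeholders)
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: random-pod-kill
spec:
  action: pod-kill
  mode: one            # one matching pod, chosen at random
  selector:
    namespaces:
      - default
    labelSelectors:
      app: myapp

# Example: Network delay
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-latency
spec:
  action: delay
  mode: one
  selector:
    namespaces:
      - default
    labelSelectors:
      app: myapp
  duration: 60s
  delay:
    latency: 100ms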

# Runbook
# Document expected behavior
# Monitor during experiment
# Have rollback plan

Q1599: How do you implement GitOps?

Answer:

# GitOps
# 1. Store all configs in Git
# 2. An in-cluster agent (ArgoCD, Flux) pulls and applies changes
# 3. Automated drift detection

# ArgoCD
# Application definition
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/repo.git
    targetRevision: HEAD
    path: k8s/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

# Flux
# Install
flux install

# Create source
flux create source git myapp \
  --url=https://github.com/org/repo \
  --branch=main

# Create kustomization
flux create kustomization myapp \
  --source=myapp \
  --path=./k8s/production

Q1600: How do you implement cost governance?

Answer:

# Cost governance
# 1. Tagging strategy
# All resources must have tags
# - Team: team-name
# - Project: project-name
# - CostCenter: cost-center

# 2. Budgets
# Set budgets per team/project
aws budgets create-budget \
  --account-id 123456789012 \
  --budget file://budget.json

# 3. Rightsizing
# Use AWS Compute Optimizer
aws compute-optimizer get-ec2-instance-recommendations

# 4. Reserved capacity
# For steady workloads
# Purchase reserved instances

# 5. Use spot
# For fault-tolerant workloads

# 6. Delete unused resources
# Find unattached volumes
aws ec2 describe-volumes --filters Name=status,Values=available

# 7. Regular review
# Weekly cost review meetings
# Track spend trends

# 8. Showback/Chargeback
# Report costs by team

Q1601: How do you implement compliance automation?

Answer:

# Compliance automation
# Open Policy Agent (OPA)
# Gatekeeper
# Prevents non-compliant resources

# Policy example
package kubernetes.admission

deny[msg] {
  input.request.kind.kind == "Deployment"
  input.request.object.spec.replicas > 10
  msg = "Cannot have more than 10 replicas"
}

# Install Gatekeeper (pin a release tag instead of master in production)
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/deploy/gatekeeper.yaml

# InSpec
# Compliance as code
# controls/nginx.rb
control 'nginx-01' do
  impact 1.0
  title 'Nginx should be configured securely'

  describe service('nginx') do
    it { should be_running }
  end

  describe file('/etc/nginx/nginx.conf') do
    its('content') { should match /server_tokens off;/ }
  end
end

# Run
inspec exec compliance/

Q1602: How do you implement disaster recovery automation?

Answer:

# DR automation
# 1. Backup automation
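#!/bin/bash
# Automated backup
set -euo pipefail
BACKUP_DATE=$(date +%Y%m%d)

# Database backup (credentials from ~/.my.cnf; -p would prompt)
mysqldump mydb | gzip | aws s3 cp - "s3://bucket/backup-$BACKUP_DATE.sql.gz"

# File backup
tar -czf - /data | aws s3 cp - "s3://bucket/data-$BACKUP_DATE.tar.gz"

# Retention (the object key is the 4th column of `aws s3 ls`)
CUTOFF=$(date -d '30 days ago' +%Y%m%d)
aws s3 ls s3://bucket/ | awk '{print $4}' | while read -r key; do
    stamp=$(echo "$key" | grep -oE '[0-9]{8}' | head -1 || true)
    # Skip keys without a date so they are never deleted by accident
    if [[ -n "$stamp" && "$stamp" < "$CUTOFF" ]]; then
        aws s3 rm "s3://bucket/$key"
    fi
done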

# 2. DR playbook
# Documented runbooks
# Regular testing

# 3. Automated failover
# DNS failover
# Route 53 health checks + failover record
aws route53 change-resource-record-sets \
  --hosted-zone-id Z1234567890 \
  --change-batch file://failover.json

# Database failover
# Automatic replica promotion
# Connection string update
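
Retention logic is easy to get wrong, and a mistake deletes backups, so it helps to keep the selection step separate from the deletion step and test it on its own. The sketch below assumes object keys embed a YYYYMMDD date as in the script above; `expired_keys` and `keep_days` are hypothetical names.

```python
import re
from datetime import date, datetime, timedelta

DATE_IN_KEY = re.compile(r"(\d{8})")


def expired_keys(keys, today, keep_days=30):
    """Return keys whose embedded YYYYMMDD date is older than keep_days.

    Keys without a parseable date are never selected, so an unexpected
    object name cannot be deleted by accident.
    """
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for key in keys:
        match = DATE_IN_KEY.search(key)
        if not match:
            continue
        try:
            stamp = datetime.strptime(match.group(1), "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that are not a real calendar date
        if stamp < cutoff:
            expired.append(key)
    return expired
```

A caller would pass `date.today()` and the listed keys, review or log the result, and only then issue the delete calls.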

Q1603: How do you implement capacity management?

Answer:

# Capacity management
# 1. Monitor utilization
# CPU, Memory, Storage, Network

# 2. Trend analysis
# Weekly reviews
# Growth rate calculation

# 3. Forecasting
# Extrapolate from historical usage
# aws ce get-cost-forecast (forecasts spend, not utilization)

# 4. Planning
# Add capacity before hitting limits

# 5. Optimization
# Right-size instances
# Use savings plans

# Kubernetes
# Vertical Pod Autoscaler
kubectl apply -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"
EOF

# Horizontal Pod Autoscaler
kubectl autoscale deployment myapp \
  --cpu-percent=80 --min=2 --max=10
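
The trend analysis and forecasting steps can start very simply. The sketch below fits a straight line through daily usage samples and estimates how many days remain before a limit is reached. It is a rough linear model, not a substitute for a proper forecasting tool, and `days_until_full` is a name invented for this example.

```python
def days_until_full(samples, capacity):
    """Estimate days until usage reaches capacity from (day, used) samples.

    Fits a least-squares line through the samples. Returns None when usage
    is flat or shrinking, and 0.0 when the latest sample is already at
    or above capacity.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    if var_x == 0:
        raise ValueError("samples must span more than one day")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / var_x
    last_day, last_used = samples[-1]
    if last_used >= capacity:
        return 0.0
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    full_day = (capacity - intercept) / slope
    return max(full_day - last_day, 0.0)
```

Feeding it daily disk usage in GB and the volume size gives a number that can drive a "less than 30 days left" alert.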

Q1604: How do you implement reliability engineering?

Answer:

# Reliability engineering
# SRE principles
# 1. SLOs (Service Level Objectives)
# - Availability: 99.9%
# - Latency: p99 < 200ms

# 2. Error budgets
# 100% - SLO = error budget
# If budget exhausted, freeze features

# 3. Toil reduction
# Automate manual tasks

# 4. Post-mortems
# Blameless
# Focus on process improvement

# 5. Releases
# Canary deployments
# Feature flags

# 6. Circuit breakers
# See earlier

# 7. Bulkheads
# Isolate failures

# 8. Self-healing
# Restart failed pods
# Replace unhealthy nodes
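
The error budget arithmetic in point 2 is worth seeing with numbers: a 99.9% availability SLO over 30 days leaves about 43.2 minutes of allowed downtime. The helper names below are illustrative, and the sketch treats the budget purely as downtime minutes, which is one common simplification.

```python
def error_budget_minutes(slo_percent, window_days=30):
    """Allowed downtime in minutes for an availability SLO over the window."""
    if not 0 < slo_percent < 100:
        raise ValueError("slo_percent must be between 0 and 100 (exclusive)")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def budget_remaining(slo_percent, downtime_minutes, window_days=30):
    """Fraction of the error budget still unspent (negative when overspent)."""
    budget = error_budget_minutes(slo_percent, window_days)
    return 1 - downtime_minutes / budget
```

A negative result from `budget_remaining` is the signal the text describes: the budget is exhausted and feature releases should pause.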

Q1605: How do you implement SRE practices?

Answer:

# SRE practices
# Error budgets
# https://sre.google/sre-book/availability-table/

# Toil management
# Identify
# Quantify
# Automate
# Eliminate

# Observability
# Metrics
# Logs
# Traces

# Incident management
# On-call rotation
# Runbooks
# Post-mortems

# Change management
# Canary releases
# Gradual rollouts

# SRE tools
# Prometheus
# Grafana
# Jaeger
# Loki

# On-call
# PagerDuty
# OpsGenie

# Automation
# Ansible
# Terraform
# Kubernetes

Q1606: How do you optimize Linux for cloud?

Answer:

# Cloud-optimized Linux
# Ubuntu Pro for AWS
# AWS-optimized kernel
# FIPS compliance
# Livepatch

# Cloud-specific optimizations
# Use instance store for temp data
# Use EBS for persistent data

# Network optimization
# ENA (Elastic Network Adapter)
# Use enhanced networking

# Storage optimization
# Use NVMe for high I/O
# Use EBS gp3 for balance

# CloudWatch Agent
# Install
# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

# Configure
# /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/metrics.json
{
  "metrics": {
    "namespace": "CustomNamespace",
    "metrics_collected": {
      "cpu": {
        "measurement": ["cpu_usage_idle"]
      },
      "mem": {
        "measurement": ["mem_used_percent"]
      }
    }
  }
}

Q1607: How do you implement FinOps?

Answer:

# FinOps
# Cloud financial management

# 1. Visibility
# Tag all resources
# Use cost explorer

# 2. Optimization
# Right-sizing
# Reservations
# Spot instances

# 3. Accountability
# Showback to teams
# Budgets

# Tools
# AWS Cost Explorer
# GCP Cloud Billing
# Azure Cost Management

# FinOps workflow
# 1. Inform
# Show costs by team
# Dashboards

# 2. Optimize
# Right-size resources
# Use savings plans

# 3. Operate
# Monitor daily spend
# Alerts

# Automation
# List running instances with launch time (a starting point for finding
# idle resources; combine with CloudWatch CPU metrics to confirm idleness)
aws ec2 describe-instances --filters "Name=instance-state-name,Values=running" \
  --query 'Reservations[*].Instances[*].[InstanceId,Tags[?Key==`Name`].Value|[0],LaunchTime]' \
  --output table

Q1608: How do you implement platform engineering?

Answer:

# Platform engineering
# Internal Developer Platform (IDP)
# Self-service

# Components
# 1. CI/CD pipelines
# GitHub Actions
# GitLab CI

# 2. Service catalog
# Backstage
# Port

# 3. Infrastructure templates
# Terraform modules
# Helm charts

# 4. Observability
# Unified dashboards

# 5. Security
# Policy enforcement

# Implementation
# Platform team builds tools
# Developers consume

# Benefits
# Faster deployments
# Consistency
# Security
# Reduced cognitive load

# Backstage
# Create catalog
# Service templates
# Documentation

Q1609: How do you implement developer experience?

Answer:

# Developer experience
# 1. Local development
# Docker Compose
# localstack

# 2. Documentation
# OpenAPI specs
# Swagger UI

# 3. IDE integration
# LSP servers
# Debugging

# 4. Testing
# Fast feedback
# Unit tests
# Integration tests

# 5. Deployment
# Simple commands
# kubectl
# ArgoCD

# Example: Developer workflow
# 1. Clone repo
# 2. Make changes
# 3. Run tests locally
# 4. Push to branch
# 5. CI runs tests
# 6. Merge to main
# 7. CD deploys

# Self-service
# Create environment
# Deploy app
# View logs
# Scale application

Q1610: How do you implement cloud security?

Answer:

# Cloud security
# Shared responsibility model

# Identity
# IAM with least privilege
# MFA everywhere

# Network
# VPC with private subnets
# Security groups
# NACLs
# WAF

# Data
# Encryption at rest
# Encryption in transit
# Key management

# Compliance
# Regular audits
# Vulnerability scanning
# Penetration testing

# Tools
# AWS GuardDuty
# AWS Config
# AWS Security Hub

# Example: AWS security
# Enable CloudTrail (create-trail alone does not start logging)
aws cloudtrail create-trail --name my-trail \
  --s3-bucket-name mybucket
aws cloudtrail start-logging --name my-trail

# Enable GuardDuty
aws guardduty create-detector --enable

# Enable Security Hub
aws securityhub enable-security-hub

Q1611: How do you implement Kubernetes security?

Answer:

# Kubernetes security
# 1. RBAC
# Least privilege
kubectl create role pod-reader --verb=get,list --resource=pods
kubectl create rolebinding read-pods --role=pod-reader --user=dev

# 2. Network policies
# Default deny
kubectl apply -f network-policy.yaml

# 3. Pod security
# Pod security standards
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted

# 4. Secrets management
# Use Vault or AWS Secrets Manager
kubectl create secret generic mysecret \
  --from-literal=key=value

# 5. Image scanning
trivy image myimage:latest

# 6. Runtime security
# Falco
falco -r rules/myrules.yaml

# 7. API server security
# Disable anonymous auth
# Enable RBAC
# Use TLS

Q1612: How do you implement data protection?

Answer:

# Data protection
# 1. Classification
# Public, Internal, Confidential, Restricted

# 2. Encryption
# At rest
cryptsetup luksFormat /dev/sdb1

# In transit
# TLS 1.2+

# 3. Access control
# IAM policies
# Database permissions

# 4. Backup
# Regular backups
# Test restoration
# Offsite backup

# 5. Monitoring
# Audit logs
# Alerts on suspicious access

# 6. Data loss prevention
# Block sensitive data exfiltration

# Tools
# AWS Macie
# GCP DLP
# Microsoft Purview (formerly Azure Purview)

Q1613: How do you implement supply chain security?

Answer:

# Supply chain security
# 1. Dependency scanning
# Snyk
# Dependabot

# 2. Container scanning
# Trivy
# Clair

# 3. SBOM (Software Bill of Materials)
# Generate SBOM
syft myimage:latest

# Sign artifacts
# Cosign (key pair created with: cosign generate-key-pair)
cosign sign --key cosign.key myimage:latest

# Verify
cosign verify --key cosign.pub myimage:latest

# 4. SLSA compliance
# Build provenance
# GitHub Actions
# Tekton

# 5. Secure build pipeline
# No external dependencies at build time
# Use pinned versions
# Scan for secrets

Q1614: How do you implement incident management?

Answer:

# Incident management
# 1. Detection
# Monitoring alerts
# User reports

# 2. Response
# Acknowledge
# Assess severity
# Mitigate

# 3. Communication
# Status page
# Stakeholder updates

# 4. Resolution
# Fix root cause
# Verify recovery

# 5. Post-incident
# Blameless post-mortem
# Action items

# Tools
# PagerDuty
# OpsGenie
# VictorOps

# Runbook example
# Runbook: Database Connection Issues
# 1. Check database status
# systemctl status postgresql
# 2. Check connections
# psql -c "SELECT count(*) FROM pg_stat_activity"
# 3. Restart if needed
# systemctl restart postgresql

Q1615: How do you implement change management?

Answer:

# Change management
# 1. Request
# JIRA ticket
# RFC (Request for Change)

# 2. Review
# Technical review
# Security review

# 3. Approval
# Manager approval
# CAB (Change Advisory Board)

# 4. Implementation
# Schedule change window
# Implement change

# 5. Verification
# Test in staging
# Monitor in production

# 6. Documentation
# Update runbooks
# Document lessons learned

# 7. Emergency changes
# Expedited process
# Post-implementation review

# Tools
# ServiceNow
# Jira Service Management
# GitHub PRs

Linux Expert Interview Questions

Q1616: How do you design a highly available web application?

Answer:

# Architecture components
# 1. Load balancer (HAProxy/ALB)
# 2. Web servers (multiple)
# 3. Application servers (multiple)
# 4. Database (primary + replica)
# 5. Cache (Redis Sentinel/Cluster)
# 6. Message queue (Kafka cluster)
# 7. CDN for static content
# 8. Object storage (S3)

# Implementation
# Multi-AZ deployment
# Auto-scaling groups
# Health checks
# Graceful degradation

# DNS
# Route 53 with health checks

# Database
# PostgreSQL with streaming replication

# Caching
# Redis with Sentinel or Cluster

# Monitoring
# Comprehensive observability

# DR
# Multi-region deployment

Q1617: How do you troubleshoot a slow database?

Answer:

# Database troubleshooting
# 1. Check system resources
# CPU, Memory, I/O

# 2. Check database stats
# PostgreSQL
# pg_stat_activity
# pg_stat_statements

# MySQL
# SHOW PROCESSLIST;
# SHOW STATUS;

# 3. Check slow queries
# PostgreSQL
# pg_stat_statements
# EXPLAIN ANALYZE

# MySQL
# Slow query log (slow_query_log, long_query_time)
# EXPLAIN

# 4. Check indexes
# PostgreSQL
# \d table_name

# MySQL
# SHOW INDEX FROM table

# 5. Fixes
# Add indexes
# Optimize queries
# Tune configuration
# Scale horizontally
# Add read replicas

Q1618: How do you design a backup strategy?

Answer:

# Backup strategy
# 1. RPO/RTO definition
# Recovery Point Objective
# Recovery Time Objective

# 2. Backup types
# Full
# Incremental
# Differential

# 3. Frequency
# Full: Weekly
# Incremental: Daily
# Transaction logs: Every 15 minutes

# 4. Retention
# Daily: 30 days
# Weekly: 12 weeks
# Monthly: 12 months
# Yearly: 7 years

# 5. Testing
# Monthly restoration tests
# Document procedures

# 6. Offsite
# Cross-region replication
# Different cloud provider

# 7. Automation
# Cron jobs
# CI/CD pipelines

Q1619: How do you secure a Linux system?

Answer:

# Linux security
# 1. Updates
# Regular patching

# 2. Firewall
# iptables/firewalld

# 3. SELinux/AppArmor
# Enable and configure

# 4. Users
# Disable root login
# SSH keys only
# Strong passwords

# 5. Services
# Disable unused services

# 6. Network
# Harden kernel parameters
# Disable IP forwarding
# Rate limiting

# 7. Monitoring
# Audit logging
# IDS

# 8. Encryption
# Full disk encryption
# TLS everywhere

Q1620: How do you design a monitoring system?

Answer:

# Monitoring system design
# 1. Metrics
# Prometheus
# Node exporter
# Application metrics

# 2. Logs
# ELK Stack or Loki

# 3. Traces
# Jaeger or Zipkin

# 4. Alerting
# Prometheus AlertManager
# PagerDuty integration

# 5. Dashboards
# Grafana

# 6. SLOs
# Define error budgets

# 7. Runbooks
# Document responses

# 8. On-call
# Rotation schedule

Q1621: How do you optimize Linux performance?

Answer:

# Linux optimization
# 1. CPU
# Tune scheduler
# Process affinity
# Priority adjustment

# 2. Memory
# Swappiness
# Cache tuning
# Huge pages

# 3. I/O
# I/O scheduler
# Filesystem choice
# Mount options
# SSD optimization

# 4. Network
# Buffer sizes
# TCP tuning
# Offloading

# 5. Kernel
# Update regularly
# Tune parameters

# 6. Applications
# Profiling
# Optimization

# Tools
# perf
# sysbench
# fio
# iperf

Q1622: How do you design a disaster recovery plan?

Answer:

# DR planning
# 1. Risk assessment
# Identify critical systems
# RTO/RPO requirements

# 2. Strategy
# Backup & Restore
# Pilot Light
# Warm Standby
# Multi-region

# 3. Implementation
# Automated backups
# Replication
# Infrastructure as Code

# 4. Testing
# Regular DR tests
# Document results

# 5. Documentation
# Runbooks
# Contact list

# 6. Communication
# Stakeholder notification
# Status updates

Q1623: How do you implement zero-downtime deployments?

Answer:

# Zero-downtime deployment
# 1. Load balancer
# Health checks
# Graceful removal

# 2. Application
# Signal handling
# Graceful shutdown

# 3. Database
# Schema migrations
# Backward compatibility

# 4. Strategies
# Rolling update
# Blue-green
# Canary
# Feature flags

# 5. Rollback plan
# Quick rollback capability

# 6. Testing
# Load testing
# Chaos engineering

Q1624: How do you handle capacity planning?

Answer:

# Capacity planning
# 1. Current state
# Measure utilization

# 2. Trends
# Analyze growth

# 3. Forecasting
# Predict future needs

# 4. Planning
# Add capacity proactively

# 5. Optimization
# Right-size resources
# Use automation

# Metrics
# CPU
# Memory
# Disk
# Network
# Application-specific

# Tools
# Prometheus
# Grafana
# AWS Compute Optimizer
# Azure Advisor

Q1625: How do you implement compliance?

Answer:

# Compliance implementation
# 1. Framework
# SOC 2, PCI-DSS, HIPAA, GDPR

# 2. Controls
# Access control
# Encryption
# Monitoring
# Auditing

# 3. Automation
# Policy as Code
# OPA/Gatekeeper

# 4. Evidence
# Automated collection
# Documentation

# 5. Training
# Security awareness

# 6. Testing
# Vulnerability scans
# Penetration tests

# 7. Remediation
# Track findings
# Fix issues

Q1626: How do you design for scale?

Answer:

# Designing for scale
# 1. Horizontal scaling
# Stateless applications
# Load balancers
# Auto-scaling

# 2. Database scaling
# Read replicas
# Sharding
# Partitioning
# Caching

# 3. Caching
# Multi-layer
# Redis/Memcached

# 4. Asynchronous
# Message queues
# Event-driven

# 5. CDN
# Static content

# 6. Optimization
# Profiling
# Database tuning

# 7. Monitoring
# Early detection

Q1627: How do you implement observability?

Answer:

# Observability
# 1. Metrics
# Prometheus
# Custom metrics

# 2. Logs
# Structured logging
# ELK/Loki

# 3. Traces
# Distributed tracing

# 4. Correlation
# Trace IDs
# Request IDs

# 5. Alerting
# Based on SLOs

# 6. Dashboards
# Service overview
# Troubleshooting

# 7. Post-mortems
# Blameless analysis

# Implementation
# OpenTelemetry
# Many tools

Q1628: How do you secure containerized applications?

Answer:

# Container security
# 1. Images
# Minimal base
# No secrets in images
# Scan for vulnerabilities

# 2. Runtime
# Non-root user
# Read-only root
# Resource limits

# 3. Network
# Network policies
# Service mesh

# 4. Orchestrator
# RBAC
# Pod Security Standards (PodSecurityPolicy was removed in Kubernetes 1.25)

# 5. Secrets
# Use secrets manager
# Don't use env vars

# Tools
# Trivy
# Falco
# OPA

Q1629: How do you implement infrastructure as code?

Answer:

# Infrastructure as Code
# 1. Version control
# Git

# 2. Modules
# Reusable components

# 3. State management
# Remote state
# State locking

# 4. Testing
# Validate
# Plan

# 5. CI/CD
# Automated deployment

# 6. Drift detection
# Detect changes

# Tools
# Terraform
# Pulumi
# CloudFormation
# Ansible

Q1630: How do you manage secrets in CI/CD?

Answer:

# Secrets in CI/CD
# 1. Never commit secrets

# 2. Use secrets management
# HashiCorp Vault
# AWS Secrets Manager
# Azure Key Vault

# 3. Environment variables
# Inject at runtime

# 4. CI/CD integration
# GitHub Secrets
# GitLab CI variables

# 5. Rotation
# Auto-rotate secrets

# 6. Audit
# Log access

Q1631: How do you design a secure network?

Answer:

# Secure network design
# 1. Segmentation
# DMZ
# Internal
# Database

# 2. Firewall
# Whitelist approach
# Default deny

# 3. Encryption
# TLS everywhere
# VPN for access

# 4. Monitoring
# IDS/IPS
# NetFlow

# 5. DDoS protection
# CDN
# WAF
# Rate limiting

Q1632: How do you handle database failover?

Answer:

# Database failover
# 1. Automatic detection
# Health checks

# 2. Failover process
# Promote replica
# Update DNS

# 3. Application handling
# Connection retry
# Circuit breakers

# 4. Monitoring
# Alert on failover

# 5. Testing
# Regular drills
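
During a failover the application sees dropped or refused connections for a short time, so the "connection retry" step matters. Below is a minimal sketch of retry with exponential backoff and jitter. `call_with_retry` and its parameters are hypothetical, and `operation` stands in for whatever function opens the database connection.

```python
import random
import time


def call_with_retry(operation, attempts=5, base_delay=0.1, max_delay=5.0,
                    retry_on=(ConnectionError,), sleep=time.sleep):
    """Run operation(), retrying on connection errors with exponential backoff.

    The delay ceiling doubles after each failure, is capped at max_delay,
    and the actual wait is randomized so that many clients do not
    reconnect at the same instant.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts; let the caller see the failure
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter exists only so tests can run without waiting. Retries like this are best combined with a circuit breaker so a long outage does not turn into a retry storm.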

Q1633: How do you implement caching?

Answer:

# Caching strategy
# 1. CDN
# Static assets

# 2. Application cache
# Redis
# Memcached

# 3. Database cache
# Query cache
# Buffer pool

# 4. Browser cache
# Headers

# 5. Invalidation
# TTL
# Cache busting
# Patterns
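
TTL-based invalidation from point 5 can be shown with a small in-process cache. This is a sketch only: real deployments would normally use Redis or Memcached as listed above, and `TTLCache` with its `loader` callback is an invented interface.

```python
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key, loader):
        """Return the cached value, calling loader(key) on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key explicitly, for example after the underlying row changes."""
        self._entries.pop(key, None)
```

The TTL bounds how stale a value can get, while `invalidate` covers the cases where staleness is not acceptable at all.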

Q1634: How do you design for high availability?

Answer:

# High availability design
# 1. Redundancy
# Multiple AZs
# Multiple regions

# 2. Load balancing
# Health checks
# Failover

# 3. Data replication
# Synchronous
# Asynchronous

# 4. Monitoring
# Fast detection

# 5. Automation
# Self-healing

# 6. Testing
# Chaos engineering

Q1635: How do you secure Kubernetes?

Answer:

# Kubernetes security
# 1. RBAC
# Least privilege

# 2. Network policies
# Default deny

# 3. Pod security
# Standards

# 4. Secrets
# External

# 5. Images
# Scanning

# 6. Runtime
# Falco

# 7. Updates
# Regular

Q1636: How do you design API security?

Answer:

# API security
# 1. Authentication
# OAuth 2.0
# JWT

# 2. Authorization
# RBAC
# Scopes

# 3. Rate limiting
# Throttling

# 4. Input validation
# Sanitization

# 5. TLS
# Encryption

# 6. Monitoring
# Anomaly detection
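
Rate limiting in point 3 is often implemented as a token bucket. The sketch below shows the core of that algorithm for a single client in a single process. The class name and parameters are assumptions for this example, and a production limiter would usually keep its state in a shared store.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second with bursts of up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full so an initial burst is allowed
        self._updated = clock()

    def allow(self, cost=1):
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        now = self._clock()
        elapsed = now - self._updated
        self._updated = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

An API handler would call `allow()` per request and answer with HTTP 429 when it returns False.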

Q1637: How do you implement logging?

Answer:

# Logging implementation
# 1. Format
# JSON
# Structured

# 2. Levels
# DEBUG, INFO, WARN, ERROR

# 3. Correlation
# Trace IDs

# 4. Rotation
# Logrotate

# 5. Aggregation
# ELK/Loki

# 6. Retention
# Policy

Q1638: How do you design for security?

Answer:

# Security design
# 1. Defense in depth
# Multiple layers

# 2. Least privilege
# Minimize access

# 3. Zero trust
# Verify always

# 4. Encryption
# Everywhere

# 5. Monitoring
# Continuous

# 6. Automation
# Respond fast

Q1639: How do you implement incident response?

Answer:

# Incident response
# 1. Preparation
# Runbooks
# Tools

# 2. Detection
# Alerts

# 3. Containment
# Isolate

# 4. Eradication
# Fix

# 5. Recovery
# Restore

# 6. Lessons learned
# Post-mortem

Q1640: How do you optimize cloud costs?

Answer:

# Cost optimization
# 1. Right-sizing
# Match needs

# 2. Reservations
# Steady state

# 3. Spot
# Fault-tolerant

# 4. Automation
# Scale down

# 5. Cleanup
# Unused resources

# 6. Monitoring
# Alerts

Q1641: How do you implement change automation?

Answer:

# Change automation
# 1. GitOps
# All changes in Git

# 2. CI/CD
# Automated testing

# 3. Approval gates
# Manual steps

# 4. Rollback
# Automatic

# 5. Monitoring
# Quick detection

Q1642: How do you design for failure?

Answer:

# Design for failure
# 1. Redundancy
# Multiple copies

# 2. Graceful degradation
# Partial service

# 3. Circuit breakers
# Prevent cascade

# 4. Bulkheads
# Isolate

# 5. Recovery
# Fast

# 6. Testing
# Chaos
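
The circuit breaker in point 3 prevents cascading failure by refusing calls to a dependency that keeps failing, then letting a single trial call through after a cool-down. A minimal single-threaded sketch follows; `CircuitBreaker`, `CircuitOpenError`, and the thresholds are hypothetical names and values.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, operation):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open the caller can serve a fallback response, which is the graceful degradation described in point 2.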

Q1643: How do you implement access control?

Answer:

# Access control
# 1. Authentication
# MFA

# 2. Authorization
# RBAC

# 3. Least privilege
# Minimal access

# 4. Audit
# Log access

# 5. Review
# Regular

Q1644: How do you secure data?

Answer:

# Data security
# 1. Classification
# Sensitivity

# 2. Encryption
# At rest
# In transit

# 3. Access control
# Need to know

# 4. Backup
# Encrypted

# 5. Monitoring
# Audit

Q1645: How do you design APIs?

Answer:

# API design
# 1. REST
# Resources
# HTTP verbs

# 2. Versioning
# URL path

# 3. Error handling
# Consistent

# 4. Pagination
# Large sets

# 5. Rate limiting
# Throttle

# 6. Documentation
# OpenAPI

Q1646: How do you implement service mesh?

Answer:

# Service mesh
# 1. Traffic management
# Routing

# 2. Security
# mTLS

# 3. Observability
# Tracing

# 4. Resilience
# Retries

# Tools
# Istio
# Linkerd
# Consul Connect

Q1647: How do you optimize databases?

Answer:

# Database optimization
# 1. Indexing
# Proper indexes

# 2. Query optimization
# EXPLAIN

# 3. Caching
# Use cache

# 4. Connection pooling
# Pool

# 5. Scaling
# Read replicas
# Sharding

# 6. Configuration
# Tune parameters

Q1648: How do you implement secrets management?

Answer:

# Secrets management
# 1. Centralized
# Vault

# 2. Rotation
# Auto

# 3. Audit
# Log access

# 4. Encryption
# Encrypt

# 5. Access control
# Least privilege

Q1649: How do you design for disasters?

Answer:

# Disaster recovery
# 1. Backup
# Regular

# 2. Replication
# Cross-region

# 3. Automation
# Fast recovery

# 4. Testing
# Regular

# 5. Documentation
# Runbooks

Q1650: How do you implement observability?

Answer:

# Observability
# 1. Metrics
# Prometheus

# 2. Logs
# ELK

# 3. Traces
# Jaeger

# 4. Correlation
# Trace IDs

# 5. Alerting
# SLO-based
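
For the metrics point, a script can publish its own values in the Prometheus text exposition format. The sketch below assumes a hypothetical metric name and a hypothetical helper `emit_gauge`; where the output is written (for example, a directory read by node_exporter's textfile collector) depends on your setup.

```shell
#!/bin/sh
# Print one gauge in the Prometheus text exposition format.
# Usage: emit_gauge NAME HELP_TEXT VALUE
emit_gauge() {
    printf '# HELP %s %s\n' "$1" "$2"
    printf '# TYPE %s gauge\n' "$1"
    printf '%s %s\n' "$1" "$3"
}

# Hypothetical metric: timestamp of the last successful backup.
emit_gauge backup_last_success_timestamp_seconds \
    "Unix time of the last successful backup" "$(date +%s)"
```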

Linux Advanced Scenarios

Q1651: How do you handle kernel upgrades?

Answer:

# Kernel upgrade
# 1. Test in staging
# 2. Check compatibility
# 3. Backup
# 4. Schedule window
# 5. Apply
# 6. Monitor
# 7. Rollback plan

Q1652: How do you design multi-tenant systems?

Answer:

# Multi-tenancy
# 1. Isolation
# Namespaces
# RBAC

# 2. Quotas
# Resources

# 3. Billing
# Usage tracking

# 4. Data separation
# Logical/physical

# 5. Network
# Segmentation

Q1653: How do you implement edge computing?

Answer:

# Edge computing
# 1. Lightweight K8s
# K3s

# 2. Data processing
# Local first

# 3. Sync
# Periodic

# 4. Security
# Edge-specific

# 5. Management
# Centralized

Q1654: How do you optimize Linux for containers?

Answer:

# Container optimization
# 1. OS
# Minimal OS

# 2. Kernel
# Tuned for containers

# 3. Storage
# Overlay2

# 4. Network
# CNI

# 5. Runtime
# containerd

# 6. Security
# Hardened

Q1655: How do you implement GDPR compliance?

Answer:

# GDPR compliance
# 1. Data minimization
# Collect less

# 2. Consent
# Explicit

# 3. Right to erasure
# Delete capability

# 4. Portability
# Export data

# 5. Breach notification
# Process

# 6. DPO
# Appoint

Q1656: How do you implement zero-downtime patching?

Answer:

# Zero-downtime patching
# 1. Blue-green
# Two environments

# 2. Canary
# Gradual

# 3. Rolling
# One by one

# 4. Health checks
# Before switch

# 5. Rollback
# Quick
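
The rolling strategy with health checks can be sketched as a loop that stops at the first problem. `patch_host` and `check_health` are placeholders invented for this example; in practice they might wrap ssh plus a package manager and an HTTP probe.

```shell
#!/bin/sh
# Rolling update sketch: patch hosts one at a time and stop on the first
# failed patch or health check, so the remaining hosts keep serving traffic.
# patch_host and check_health are placeholders to replace with real commands.
patch_host()   { echo "patching $1"; }
check_health() { echo "health ok: $1"; }

# Usage: rolling_patch HOST...
rolling_patch() {
    for host in "$@"; do
        patch_host "$host"   || { echo "patch failed on $host, stopping" >&2; return 1; }
        check_health "$host" || { echo "$host unhealthy, stopping" >&2; return 1; }
    done
}

rolling_patch web1 web2 web3
```

Stopping early limits the blast radius: hosts that were not yet touched still run the previous version, which keeps rollback quick.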

Q1657: How do you design for IoT?

Answer:

# IoT architecture
# 1. Edge
# Local processing

# 2. Protocol
# MQTT

# 3. Security
# Device auth

# 4. Scale
# Millions

# 5. OTA updates
# Secure

Q1658: How do you implement RBAC?

Answer:

# RBAC implementation
# 1. Roles
# Define

# 2. Permissions
# Map

# 3. Assignment
# Users

# 4. Audit
# Regular review

# 5. Tools
# LDAP integration

Q1659: How do you optimize network performance?

Answer:

# Network optimization
# 1. Offloading
# Hardware

# 2. Buffer tuning
# TCP

# 3. Compression
# Accept-Encoding (gzip/brotli)

# 4. CDN
# Static

# 5. Keepalive
# HTTP

Q1660: How do you design for mobile?

Answer:

# Mobile optimization
# 1. API design
# Efficient

# 2. Compression
# gzip/brotli

# 3. Caching
# Aggressive

# 4. Offline
# PWA

# 5. Security
# Certificate pinning

Q1661: How do you implement chaos engineering?

Answer:

# Chaos engineering
# 1. Define steady state
# What works

# 2. Hypothesize
# What will fail

# 3. Experiment
# Inject failure

# 4. Learn
# Observe

# 5. Improve
# Fix

# Tools
# Chaos Mesh
# Litmus
# Gremlin

Q1662: How do you implement immutable infrastructure?

Answer:

# Immutable infrastructure
# 1. Images
# Pre-built

# 2. No changes
# Rebuild

# 3. Versioned
# All

# 4. Rollback
# Previous image

# 5. Tools
# Packer
# Container images

Q1663: How do you design for high performance?

Answer:

# High performance design
# 1. Profiling
# Find bottleneck

# 2. Optimization
# Targeted

# 3. Caching
# Multi-layer

# 4. Async
# Non-blocking

# 5. Scaling
# Horizontal

Q1664: How do you implement multi-cloud?

Answer:

# Multi-cloud strategy
# 1. Abstraction
# Terraform

# 2. Portability
# Containers

# 3. Vendor lock-in
# Avoid

# 4. Data
# Strategy

# 5. Operations
# Unified

Q1665: How do you implement cost allocation?

Answer:

# Cost allocation
# 1. Tagging
# All resources

# 2. Tracking
# By team/project

# 3. Reporting
# Regular

# 4. Budgets
# Alerts

# 5. Accountability
# Showback

Q1666: How do you design for compliance automation?

Answer:

# Compliance automation
# 1. Policy as code
# OPA

# 2. Scanning
# Automated

# 3. Evidence
# Auto-collect

# 4. Remediation
# Auto-fix

# 5. Audit
# Regular

Q1667: How do you implement API rate limiting?

Answer:

# API rate limiting
# 1. Algorithm
# Token bucket / leaky bucket

# 2. Per-user
# By key

# 3. Headers
# Rate limit

# 4. Response
# 429 Too Many Requests

# 5. Throttling
# Graceful
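
The token bucket algorithm can be simulated with awk to see how bursts are handled. This is a sketch only: the function name `token_bucket` and the input format (one request timestamp in seconds per line, ascending) are assumptions for the example, not a production limiter.

```shell
#!/bin/sh
# Token bucket sketch: reads request timestamps (seconds, ascending) on stdin
# and prints "<timestamp> 200" or "<timestamp> 429" per request.
# Usage: token_bucket CAPACITY REFILL_PER_SECOND
token_bucket() {
    awk -v cap="$1" -v rate="$2" '
        NR == 1 { tokens = cap; last = $1 }
        {
            tokens += ($1 - last) * rate      # refill for the elapsed time
            if (tokens > cap) tokens = cap    # never exceed bucket capacity
            last = $1
            if (tokens >= 1) { tokens -= 1; print $1, 200 }
            else             { print $1, 429 }
        }'
}

# Burst of 3 requests at t=0 against a bucket of 2 that refills 1 token/second.
printf '%s\n' 0 0 0 1 | token_bucket 2 1
```

With these inputs the first two requests are allowed, the third is rejected with 429, and the request one second later is allowed again after the refill.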

Q1668: How do you design for IoT security?

Answer:

# IoT security
# 1. Device identity
# Certificates

# 2. OTA updates
# Signed

# 3. Network
# Segmentation

# 4. Data
# Encryption

# 5. Monitoring
# Anomaly

Q1669: How do you implement infrastructure monitoring?

Answer:

# Infrastructure monitoring
# 1. Metrics
# Collect

# 2. Storage
# Time-series

# 3. Visualization
# Dashboards

# 4. Alerting
# Thresholds

# 5. Analysis
# Trends

Q1670: How do you implement database sharding?

Answer:

# Database sharding
# 1. Key strategy
# Choose shard key

# 2. Routing
# Application

# 3. Rebalancing
# Plan

# 4. Cross-shard
# Minimize

# 5. Monitoring
# Performance
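
Application-side routing on a shard key can be sketched with a hash and a modulo. The function name `shard_for` and the key format are illustrative, and the POSIX `cksum` CRC is used only because it is available everywhere.

```shell
#!/bin/sh
# Hash-based shard routing sketch: map a key to one of N shards.
# Usage: shard_for KEY SHARD_COUNT  -> prints a shard index in [0, SHARD_COUNT)
shard_for() {
    crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    echo $((crc % $2))
}

shard_for "user:1001" 4
```

Plain modulo hashing moves most keys when the shard count changes, which is one reason rebalancing needs a plan up front (consistent hashing or a lookup table are common alternatives).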

Q1671: How do you design for 5G?

Answer:

# 5G optimization
# 1. Edge computing
# Local processing

# 2. Network slicing
# Dedicated

# 3. Low latency
# Optimization

# 4. Massive IoT
# Scale

Q1672: How do you implement service discovery?

Answer:

# Service discovery
# 1. DNS
# Consul

# 2. Health checks
# Registration

# 3. Load balancing
# Client-side

# 4. Failover
# Automatic

Q1673: How do you optimize web performance?

Answer:

# Web performance
# 1. CDN
# Static assets

# 2. Compression
# gzip/brotli

# 3. Caching
# Headers

# 4. Minification
# CSS/JS

# 5. Images
# Optimization

Q1674: How do you implement backup verification?

Answer:

# Backup verification
# 1. Test restore
# Regular

# 2. Automation
# Script

# 3. Checksums
# Verify

# 4. Documentation
# Procedures
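
The checksum step can be scripted with `sha256sum` from GNU coreutils. The sketch below assumes a backup is a directory of files; the function names and the `SHA256SUMS` manifest name are choices made for this example.

```shell
#!/bin/sh
# Checksum verification sketch using sha256sum (GNU coreutils).

# Record checksums for every file in a backup directory.
# Usage: write_manifest BACKUP_DIR
write_manifest() (
    cd "$1" || exit 1
    find . -type f ! -name SHA256SUMS -exec sha256sum {} + > SHA256SUMS
)

# Re-hash the files and compare with the manifest; non-zero exit on mismatch.
# Usage: verify_backup BACKUP_DIR
verify_backup() (
    cd "$1" || exit 1
    sha256sum --check --quiet SHA256SUMS
)
```

A matching checksum only shows the files are unchanged since the manifest was written. It complements a test restore rather than replacing it.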

Q1675: How do you design for privacy?

Answer:

# Privacy design
# 1. Data minimization
# Collect less

# 2. Encryption
# Strong

# 3. Access control
# Strict

# 4. Audit
# Logging

# 5. Retention
# Policy

Q1676: How do you implement auto-remediation?

Answer:

# Auto-remediation
# 1. Detection
# Alerts

# 2. Classification
# Severity

# 3. Action
# Runbook

# 4. Automation
# Scripts

# 5. Verification
# Confirm fix
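
The detect, act, and verify steps can be wired together in one function. This is a sketch: `remediate` is a hypothetical name, and the check and fix commands are whatever functions or scripts you supply.

```shell
#!/bin/sh
# Auto-remediation sketch: detect, act, then verify.
# Usage: remediate CHECK_CMD FIX_CMD
# CHECK_CMD and FIX_CMD are names of commands or shell functions you provide.
remediate() {
    if "$1"; then
        echo "healthy: no action"
        return 0
    fi
    echo "check failed: running $2"
    "$2" || { echo "remediation command failed" >&2; return 2; }
    if "$1"; then
        echo "verified: fixed"
    else
        echo "still failing: escalate to a human" >&2
        return 1
    fi
}
```

A call might look like `remediate service_is_up restart_service`, where both names are functions you define. Re-running the check after the fix is what separates verified remediation from a blind restart.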

Q1677: How do you optimize storage?

Answer:

# Storage optimization
# 1. Tiering
# Hot/cold

# 2. Compression
# Deduplication

# 3. Lifecycle
# Policies

# 4. Monitoring
# Usage

# 5. Cleanup
# Regular

Q1678: How do you implement MFA?

Answer:

# MFA implementation
# 1. Factors
# Multiple

# 2. Methods
# TOTP/Push

# 3. Rollout
# Gradual

# 4. Backup
# Recovery codes

# 5. Enforcement
# Policy

Q1679: How do you design for resilience?

Answer:

# Resilience design
# 1. Redundancy
# Multiple

# 2. Fault tolerance
# Graceful

# 3. Recovery
# Fast

# 4. Testing
# Chaos

# 5. Monitoring
# Real-time

Q1680: How do you implement cost reporting?

Answer:

# Cost reporting
# 1. Tagging
# Comprehensive

# 2. Collection
# Automated

# 3. Analysis
# By team

# 4. Visualization
# Dashboards

# 5. Actions
# Optimization

Q1681: How do you design for IoT data?

Answer:

# IoT data management
# 1. Collection
# MQTT/HTTP

# 2. Processing
# Stream

# 3. Storage
# Time-series

# 4. Analysis
# Real-time

# 5. Retention
# Policy

Q1682: How do you implement service catalog?

Answer:

# Service catalog
# 1. Self-service
# Portal

# 2. Standardization
# Templates

# 3. Governance
# Approval

# 4. Documentation
# Auto-generated

Q1683: How do you optimize database queries?

Answer:

# Query optimization
# 1. EXPLAIN
# Analyze

# 2. Indexing
# Strategic

# 3. Rewriting
# Equivalent

# 4. Caching
# Result caching (MySQL 8.0 removed its query cache)

# 5. Profiling
# Slow queries

Q1684: How do you implement API gateway?

Answer:

# API gateway
# 1. Routing
# Path-based

# 2. Authentication
# JWT

# 3. Rate limiting
# Quotas

# 4. Caching
# Response

# 5. Monitoring
# Usage

Q1685: How do you design for compliance?

Answer:

# Compliance design
# 1. Controls
# Framework

# 2. Automation
# Policy

# 3. Evidence
# Collection

# 4. Monitoring
# Continuous

# 5. Audit
# Regular

Q1686: How do you implement incident automation?

Answer:

# Incident automation
# 1. Detection
# Automated

# 2. Triage
# Classification

# 3. Response
# Runbooks

# 4. Escalation
# Rules

# 5. Resolution
# Tracking

Q1687: How do you optimize Kubernetes?

Answer:

# Kubernetes optimization
# 1. Resources
# Requests/limits

# 2. Scheduling
# Affinity

# 3. Networking
# CNI

# 4. Storage
# Classes

# 5. Autoscaling
# HPA/VPA

Q1688: How do you implement data governance?

Answer:

# Data governance
# 1. Classification
# Sensitivity

# 2. Ownership
# Clear

# 3. Quality
# Rules

# 4. Lineage
# Tracking

# 5. Compliance
# Policy

Q1689: How do you design for ML infrastructure?

Answer:

# ML infrastructure
# 1. Data pipeline
# ETL

# 2. Training
# Distributed

# 3. Serving
# Model serving

# 4. Monitoring
# Drift

# 5. MLOps
# Automation

Q1690: How do you implement cloud governance?

Answer:

# Cloud governance
# 1. Policies
# Guardrails

# 2. Tagging
# Standards

# 3. Cost control
# Budgets

# 4. Security
# Baseline

# 5. Compliance
# Audit

Q1691: How do you design for edge security?

Answer:

# Edge security
# 1. Device auth
# Certificates

# 2. Data encryption
# TLS

# 3. Network
# Segmentation

# 4. Updates
# Signed

# 5. Monitoring
# Centralized

Q1692: How do you implement container orchestration?

Answer:

# Container orchestration
# 1. Scheduling
# Placement

# 2. Scaling
# Auto

# 3. Networking
# Service mesh

# 4. Storage
# CSI

# 5. Security
# Policies

Q1693: How do you optimize network latency?

Answer:

# Network latency optimization
# 1. CDN
# Geographic

# 2. Caching
# Multi-layer

# 3. Compression
# gzip/brotli

# 4. HTTP/2
# Multiplexing

# 5. DNS
# Anycast

Q1694: How do you implement data protection?

Answer:

# Data protection
# 1. Encryption
# At rest/transit

# 2. Access control
# RBAC

# 3. Backup
# Automated

# 4. Monitoring
# Audit

# 5. Incident
# Response

Q1695: How do you design for real-time processing?

Answer:

# Real-time processing
# 1. Stream processing
# Kafka/Spark

# 2. Low latency
# Optimization

# 3. Scalability
# Horizontal

# 4. Monitoring
# Metrics

# 5. Backpressure
# Handling

Q1696: How do you implement application security?

Answer:

# Application security
# 1. SDLC
# Secure

# 2. SAST/DAST
# Scanning

# 3. Dependencies
# Scanning

# 4. Runtime
# Protection

# 5. Training
# Developers

Q1697: How do you optimize Linux for databases?

Answer:

# Linux database optimization
# 1. Filesystem
# XFS/ext4

# 2. I/O scheduler
# mq-deadline/none (deadline/noop on older kernels)

# 3. Memory
# Huge pages

# 4. Network
# Buffer sizes

# 5. Disk
# SSD/NVMe
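
Before tuning the I/O scheduler it helps to see what is active. The kernel lists the available schedulers in `/sys/block/<dev>/queue/scheduler` and marks the active one with brackets. The helper name `active_scheduler` is an assumption for this read-only sketch.

```shell
#!/bin/sh
# Usage: active_scheduler "LINE"  -> prints the bracketed (active) entry
# Example line from sysfs: "[mq-deadline] kyber bfq none"
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Report the active scheduler for each block device (read-only).
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}; dev=${dev%%/*}
    echo "$dev: $(active_scheduler "$(cat "$f")")"
done
```

Writing a scheduler name into the same sysfs file as root changes it until the next reboot. Which scheduler suits a database depends on the device, so measure before and after.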

Q1698: How do you implement data retention?

Answer:

# Data retention
# 1. Policy
# Defined

# 2. Classification
# By type

# 3. Automation
# Scripts

# 4. Compliance
# Legal holds

# 5. Verification
# Regular

Q1699: How do you design for compliance reporting?

Answer:

# Compliance reporting
# 1. Evidence
# Automated

# 2. Framework
# Mapping

# 3. Controls
# Validation

# 4. Audit
# Support

# 5. Remediation
# Tracking

Q1700: How do you implement Kubernetes networking?

Answer:

# Kubernetes networking
# 1. CNI plugin
# Calico/Flannel

# 2. Network policies
# Segmentation

# 3. Services
# Types

# 4. Ingress
# Controller

# 5. DNS
# CoreDNS

Q1701: How do you optimize database connections?

Answer:

# Database connection optimization
# 1. Pooling
# Connection pool

# 2. Sizing
# Pool size

# 3. Timeouts
# Configure

# 4. Monitoring
# Active connections

# 5. Tuning
# Database config

Q1702: How do you implement backup automation?

Answer:

# Backup automation
# 1. Scheduling
# Cron

# 2. Retention
# Policy

# 3. Verification
# Test restore

# 4. Offsite
# Replication

# 5. Monitoring
# Alerts
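
The retention step is often a `find` invocation run from cron. The sketch below assumes backups are `*.tar.gz` files in one directory; the function name `prune_backups` and the `--delete` switch are inventions for this example, and the default is a dry run.

```shell
#!/bin/sh
# Retention sketch: list (or delete) backup archives older than N days.
# Note: find's "-mtime +N" matches files whose age in whole days exceeds N.
# Usage: prune_backups DIR DAYS [--delete]
prune_backups() {
    if [ "${3:-}" = "--delete" ]; then
        find "$1" -type f -name '*.tar.gz' -mtime +"$2" -exec rm -f {} +
    else
        find "$1" -type f -name '*.tar.gz' -mtime +"$2"   # dry run: print only
    fi
}
```

Running the dry run first and reviewing its output is a cheap safeguard before enabling deletion in a scheduled job.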

Q1703: How do you design for regulatory compliance?

Answer:

# Regulatory compliance
# 1. Assessment
# Gap analysis

# 2. Controls
# Implementation

# 3. Monitoring
# Continuous

# 4. Documentation
# Evidence

# 5. Audit
# Support

Q1704: How do you implement service level objectives?

Answer:

# SLO implementation
# 1. Define
# Metrics

# 2. Measurement
# Collection

# 3. Alerting
# Budget

# 4. Reporting
# Regular

# 5. Improvement
# Action
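
Budget-based alerting rests on simple arithmetic: the SLO defines how many failures are tolerated, and the alert tracks how much of that allowance is used. The sketch below assumes a request-based availability SLO below 100%; the function name `error_budget` and the output format are illustrative.

```shell
#!/bin/sh
# Error budget sketch for a request-based availability SLO (SLO must be < 100).
# Usage: error_budget SLO_PERCENT TOTAL_REQUESTS FAILED_REQUESTS
error_budget() {
    awk -v slo="$1" -v total="$2" -v failed="$3" 'BEGIN {
        allowed = (100 - slo) / 100 * total      # failures the SLO tolerates
        printf "allowed=%.0f used=%.1f%%\n", allowed, failed / allowed * 100
    }'
}

error_budget 99.9 1000000 400
```

For a 99.9% SLO over one million requests, 1000 failures are tolerated, so 400 failures consume 40% of the budget.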

Q1705: How do you optimize Linux storage?

Answer:

# Linux storage optimization
# 1. Filesystem
# Choice

# 2. Mount options
# Tuning

# 3. LVM
# Flexible

# 4. RAID
# Configuration

# 5. Monitoring
# I/O

Q1706: How do you implement network segmentation?

Answer:

# Network segmentation
# 1. VLANs
# Isolation

# 2. Firewalls
# Zones

# 3. Zero trust
# Micro-segmentation

# 4. Monitoring
# Traffic

# 5. Compliance
# Audit

Q1707: How do you design for ML model serving?

Answer:

# ML model serving
# 1. Framework
# TensorFlow Serving

# 2. Scaling
# Horizontal

# 3. A/B testing
# Canary

# 4. Monitoring
# Drift

# 5. Updates
# Rolling

Q1708: How do you implement vulnerability management?

Answer:

# Vulnerability management
# 1. Scanning
# Regular

# 2. Prioritization
# Severity

# 3. Remediation
# Process

# 4. Verification
# Rescan

# 5. Reporting
# Metrics

Q1709: How do you optimize web application security?

Answer:

# Web application security
# 1. WAF
# Deploy

# 2. Headers
# Security

# 3. Input validation
# Sanitization

# 4. SQL injection
# Prevention

# 5. XSS
# Protection

Q1710: How do you design for compliance automation?

Answer:

# Compliance automation
# 1. Policy as code
# OPA

# 2. Scanning
# Continuous

# 3. Remediation
# Auto

# 4. Evidence
# Collection

# 5. Reporting
# Automated

Q1711: How do you implement incident communication?

Answer:

# Incident communication
# 1. Stakeholders
# Identification

# 2. Status page
# Updates

# 3. Channels
# Multiple

# 4. Timing
# Regular

# 5. Post-incident
# Communication

Q1712: How do you optimize Kubernetes resources?

Answer:

# Kubernetes resource optimization
# 1. Requests
# Set appropriately

# 2. Limits
# Configure

# 3. HPA
# Auto-scale

# 4. VPA
# Recommendations

# 5. Monitoring
# Usage

Q1713: How do you implement data classification?

Answer:

# Data classification
# 1. Categories
# Public, Internal, Confidential

# 2. Labeling
# Automatic

# 3. Policies
# Based on class

# 4. Training
# Awareness

# 5. Auditing
# Regular

Q1714: How do you design for regulatory requirements?

Answer:

# Regulatory requirements
# 1. Framework
# Selection

# 2. Controls
# Implementation

# 3. Monitoring
# Continuous

# 4. Evidence
# Automated

# 5. Audit
# Support

Q1715: How do you implement cost allocation tags?

Answer:

# Cost allocation tags
# 1. Tagging policy
# Required tags

# 2. Enforcement
# AWS service control policies (SCPs)

# 3. Reporting
# By tag

# 4. Alerts
# Budget

# 5. Optimization
# Action
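
A required-tags policy can be checked in CI before resources are created. This sketch assumes tags arrive as a `key=value,key=value` string; the `REQUIRED_TAGS` list and the function name `missing_tags` are examples, not a cloud provider API.

```shell
#!/bin/sh
# Tag policy sketch: report which required tag keys a resource is missing.
REQUIRED_TAGS="owner project environment"

# Usage: missing_tags "key=value,key=value,..."  -> prints missing keys, one per line
missing_tags() {
    for key in $REQUIRED_TAGS; do
        case ",$1," in
            *",$key="*) ;;            # key present
            *) echo "$key" ;;         # key missing
        esac
    done
}

missing_tags "owner=alice,project=billing"
```

An empty result means the resource satisfies the policy. Any output can fail the pipeline before untagged spend appears in reports.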

Q1716: How do you optimize Linux for networking?

Answer:

# Linux network optimization
# 1. Buffer sizes
# Tuning

# 2. Offloading
# Enable

# 3. TCP
# Parameters

# 4. Queue
# Tuning

# 5. Monitoring
# Metrics
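
Tuning usually starts by recording the current values. The read-only sketch below prints a few network-related sysctl keys from `/proc/sys`; the key list is a starting point for review rather than a recommended configuration, and `sysctl_path` is a helper invented for the example.

```shell
#!/bin/sh
# Read-only sketch: show the current values of a few network tunables.
NET_KEYS="net.core.rmem_max net.core.wmem_max net.ipv4.tcp_rmem net.ipv4.tcp_wmem net.core.netdev_max_backlog net.core.somaxconn"

# Convert a sysctl key to its /proc/sys path,
# e.g. net.core.somaxconn -> /proc/sys/net/core/somaxconn
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

for key in $NET_KEYS; do
    p=$(sysctl_path "$key")
    if [ -r "$p" ]; then
        echo "$key = $(cat "$p")"
    fi
done
```

To persist a change, `key = value` lines typically go in a file under `/etc/sysctl.d/` and are loaded with `sysctl --system`. Appropriate values depend on bandwidth, latency, and memory, so test them under realistic load.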

Q1717: How do you implement service mesh security?

Answer:

# Service mesh security
# 1. mTLS
# Enable

# 2. Authorization
# Policies

# 3. Encryption
# Automatic

# 4. Audit
# Logging

# 5. Updates
# Regular

Q1718: How do you design for disaster recovery testing?

Answer:

# DR testing
# 1. Schedule
# Regular

# 2. Scope
# Defined

# 3. Documentation
# Runbooks

# 4. Validation
# Success

# 5. Improvements
# Action items

Q1719: How do you implement API versioning?

Answer:

# API versioning
# 1. Strategy
# URL path

# 2. Deprecation
# Policy

# 3. Documentation
# OpenAPI (Swagger)

# 4. Migration
# Guide

# 5. Support
# Timeline

Q1720: How do you optimize container images?

Answer:

# Container image optimization
# 1. Base image
# Minimal

# 2. Layers
# Reduce

# 3. Caching
# Build cache

# 4. Multi-stage
# Build

# 5. Scanning
# Security

Q1721: How do you implement compliance monitoring?

Answer:

# Compliance monitoring
# 1. Controls
# Continuous

# 2. Alerts
# Deviation

# 3. Reporting
# Regular

# 4. Remediation
# Tracking

# 5. Audit
# Support

Q1722: How do you design for data pipelines?

Answer:

# Data pipeline design
# 1. Source
# Connectors

# 2. Processing
# ETL/ELT

# 3. Quality
# Validation

# 4. Destination
# Storage

# 5. Monitoring
# Alerts

Q1723: How do you implement zero trust network?

Answer:

# Zero trust network
# 1. Verify
# Always

# 2. Least privilege
# Access

# 3. Micro-segmentation
# Network

# 4. Encryption
# All traffic

# 5. Monitoring
# Continuous

Q1724: How do you optimize Linux for high availability?

Answer:

# Linux HA optimization
# 1. Keepalived
# Configure

# 2. HAProxy
# Tune

# 3. Health checks
# Configure

# 4. Monitoring
# Comprehensive

# 5. Testing
# Regular

Q1725: How do you implement security automation?

Answer:

# Security automation
# 1. Scanning
# Automated

# 2. Remediation
# Auto-fix

# 3. Response
# Playbooks

# 4. Integration
# CI/CD

# 5. Monitoring
# Continuous

Q1726: How do you design for event-driven architecture?

Answer:

# Event-driven architecture
# 1. Event sourcing
# Design

# 2. Message broker
# Kafka

# 3. Consumers
# Scaling

# 4. Idempotency
# Handle

# 5. Monitoring
# Events

Q1727: How do you implement infrastructure testing?

Answer:

# Infrastructure testing
# 1. Validation
# Terraform

# 2. Integration
# Test Kitchen

# 3. Compliance
# InSpec

# 4. Security
# Scanning

# 5. Chaos
# Engineering

Q1728: How do you optimize for DevOps?

Answer:

# DevOps optimization
# 1. CI/CD
# Optimize

# 2. Automation
# Everything

# 3. Monitoring
# Feedback

# 4. Collaboration
# Teams

# 5. Culture
# Improvement

Q1729: How do you implement data encryption?

Answer:

# Data encryption
# 1. At rest
# LUKS

# 2. In transit
# TLS

# 3. Application
# Field-level

# 4. Keys
# Management

# 5. Rotation
# Policy

Q1730: How do you design for incident recovery?

Answer:

# Incident recovery
# 1. Detection
# Fast

# 2. Containment
# Quick

# 3. Eradication
# Complete

# 4. Recovery
# Fast

# 5. Post-incident
# Learning

Q1731: How do you implement container security scanning?

Answer:

# Container security scanning
# 1. Build time
# Scan images

# 2. Registry
# Scan stored

# 3. Runtime
# Scan running

# 4. Policies
# Define

# 5. Automation
# CI/CD

Q1732: How do you optimize Linux for virtualization?

Answer:

# Linux virtualization optimization
# 1. CPU
# Pinning

# 2. Memory
# Overcommit

# 3. Network
# Para-virtual

# 4. Storage
# VirtIO

# 5. Monitoring
# Per-VM

Q1733: How do you implement access certification?

Answer:

# Access certification
# 1. Review schedule
# Quarterly

# 2. Certification
# Campaign

# 3. Remediation
# Tasks

# 4. Exceptions
# Approval

# 5. Reporting
# Audit

Q1734: How do you design for data recovery?

Answer:

# Data recovery
# 1. Backups
# Multiple

# 2. Point in time
# Capability

# 3. Testing
# Regular

# 4. Documentation
# Procedures

# 5. Team
# Training

Q1735: How do you implement API authentication?

Answer:

# API authentication
# 1. OAuth 2.0
# Implement

# 2. JWT
# Tokens

# 3. API keys
# Management

# 4. Rotation
# Policy

# 5. Monitoring
# Usage

Q1736: How do you optimize database indexing?

Answer:

# Database indexing
# 1. Identify
# Slow queries

# 2. Analyze
# EXPLAIN

# 3. Create
# Appropriate

# 4. Composite
# Order

# 5. Maintenance
# Rebuild

Q1737: How do you implement incident triage?

Answer:

# Incident triage
# 1. Classification
# Severity

# 2. Impact
# Assessment

# 3. Prioritization
# Order

# 4. Assignment
# Owner

# 5. Escalation
# Path

Q1738: How do you design for cloud migration?

Answer:

# Cloud migration
# 1. Assessment
# Discovery

# 2. Planning
# Strategy

# 3. Migration
# Execute

# 4. Validation
# Testing

# 5. Optimization
# Post-migration

Q1739: How do you implement security policies?

Answer:

# Security policies
# 1. Framework
# Define