Linux Practical Interview Questions (1501-1750)
Linux System Security
Q1501: How do you implement SELinux policies?
Answer:
# Check SELinux status
getenforce
sestatus

# SELinux contexts
# View file contexts
ls -Z /var/www/html
ls -Zd /var/www/html

# View process contexts
ps auxZ | grep nginx

# Change context
chcon -t httpd_sys_content_t /var/www/html/file.html
semanage fcontext -a -t httpd_sys_content_t "/web(/.*)?"
restorecon -Rv /web

# Boolean values
getsebool -a
setsebool -P httpd_can_network_connect on

# Create custom policy module
# 1. Generate Type Enforcement file
# myapp.te
module myapp 1.0;
require {
    type httpd_t;
    type myapp_log_t;
    class file { read write };
}
allow httpd_t myapp_log_t:file { read write };

# 2. Compile and install
checkmodule -M -m -o myapp.mod myapp.te
semodule_package -o myapp.pp -m myapp.mod
semodule -i myapp.pp

Q1502: How do you configure AppArmor profiles?
Answer:
# Install AppArmor
apt install apparmor apparmor-utils

# View profiles
aa-status
ls /etc/apparmor.d/

# Create profile
aa-genprof /usr/bin/myapp

# Profile syntax
# /etc/apparmor.d/usr.bin.myapp
#include <tunables/global>
/usr/bin/myapp {
  #include <abstractions/base>
  #include <abstractions/bash>

  # Allow read /etc
  /etc/** r,

  # Allow write to log
  /var/log/myapp/* w,

  # Deny access
  deny /etc/shadow r,
  deny /var/log/secure w,

  # Network
  network inet stream,
}

# Enable/disable
aa-disable /usr/bin/myapp
aa-enforce /usr/bin/myapp
aa-complain /usr/bin/myapp

# Reload
apparmor_parser -r /etc/apparmor.d/usr.bin.myapp

Q1503: How do you implement Linux capabilities?
Answer:
# View capabilities
# File capabilities
getcap -r /usr/bin/

# Process capabilities
cat /proc/$$/status | grep Cap
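The Cap* fields in /proc/<pid>/status are hexadecimal bitmasks, one bit per capability. As a minimal sketch, the Python below decodes such a mask into names. The name table is deliberately partial (bit numbers taken from linux/capability.h), and decode_caps is an illustrative helper, not part of any library; capsh --decode=<mask> does the full decoding.

```python
# Partial table: bit number -> capability name (from linux/capability.h).
CAP_NAMES = {
    0: "CAP_CHOWN",
    10: "CAP_NET_BIND_SERVICE",
    12: "CAP_NET_ADMIN",
    13: "CAP_NET_RAW",
    21: "CAP_SYS_ADMIN",
}


def decode_caps(hex_mask):
    """Return the names of the bits set in a Cap* mask from /proc/<pid>/status."""
    mask = int(hex_mask, 16)
    names = []
    bit = 0
    while mask >> bit:
        if mask & (1 << bit):
            # Bits missing from the partial table are reported by number.
            names.append(CAP_NAMES.get(bit, f"cap_{bit}"))
        bit += 1
    return names


print(decode_caps("0000000000003000"))
```

A process that was started from a binary with only cap_net_raw+ep would be expected to show bit 13 in CapEff.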
# Set file capabilities
setcap 'cap_net_raw+ep' /usr/bin/ping
getcap /usr/bin/ping

# Remove capabilities
setcap -r /usr/bin/ping

# Run with specific capabilities
# Using run helper
# /etc/security/capability.conf
# none root
# cap_net_raw user1
# cap_net_admin user2

# Use libcap in code
# In C (link with -lcap)
#include <sys/capability.h>
cap_t caps = cap_get_proc();
cap_value_t cap_list[] = { CAP_NET_RAW };
/* cap_set_flag takes a count and an array of capabilities, then CAP_SET or CAP_CLEAR */
cap_set_flag(caps, CAP_EFFECTIVE, 1, cap_list, CAP_SET);
cap_set_proc(caps);
cap_free(caps);

Q1504: How do you secure Linux system services?
Answer:
# Disable unnecessary services
systemctl mask service_name
systemctl disable service_name

# View active services
systemctl list-units --type=service --state=running

# Secure SSH
# /etc/ssh/sshd_config
PermitRootLogin no
PasswordAuthentication no
PubkeyAuthentication yes
MaxAuthTries 3
ClientAliveInterval 300
X11Forwarding no
AllowUsers user1 user2
DenyUsers root

# Secure Cron
# /etc/cron.allow (only these users)
# /etc/cron.deny (deny these users)

# Secure at
# /etc/at.allow

# Secure system limits
# /etc/security/limits.conf
* hard locks 100
* soft nproc 512

Q1505: How do you implement user authentication security?
Answer:
# Configure PAM (pam_tally2 was removed in Linux-PAM 1.5; newer systems use pam_faillock)
auth required pam_tally2.so deny=3 unlock_time=600 onerr=fail

# Password policy
# /etc/pam.d/common-password
password required pam_pwhistory.so remember=5
password [default=1] pam_permit.so
password requisite pam_cracklib.so try_first_pass retry=3 minlen=12 dcredit=-1 ucredit=-1 lcredit=-1 ocredit=-1

# Set password expiry
# /etc/login.defs
PASS_MAX_DAYS 90
PASS_MIN_DAYS 1
PASS_WARN_AGE 14
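To see what PASS_MAX_DAYS and PASS_WARN_AGE mean for a given account, the dates can be worked out from the last password change. This is a small sketch of that arithmetic only; aging_dates is a hypothetical helper, and the authoritative values for a real account come from chage -l.

```python
from datetime import date, timedelta


def aging_dates(last_change, max_days=90, warn_days=14):
    """Return (warning_start, expiry) for a password changed on last_change."""
    expiry = last_change + timedelta(days=max_days)
    warning_start = expiry - timedelta(days=warn_days)
    return warning_start, expiry


warn, expiry = aging_dates(date(2024, 1, 1))
print("warnings start:", warn, "password expires:", expiry)
```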
# For user
passwd -x 90 -w 14 -n 1 username
chage -M 90 -W 14 username

# View aging
chage -l username

Linux Network Security
Q1506: How do you configure firewall rules?
Answer:
# Basic iptables rules
# Flush existing rules
iptables -F
iptables -X
iptables -t nat -F
iptables -t mangle -F

# Default policies
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT ACCEPT

# Allow loopback
iptables -A INPUT -i lo -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT

# Allow established connections
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# Allow SSH, dropping sources that open 4 or more new connections within 60 seconds
iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -m recent --set
iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -m recent --update --seconds 60 --hitcount 4 -j DROP
iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -j ACCEPT
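The SSH rules above rely on the recent match to count new connections per source address. The following is a simplified Python model of that idea (a per-source sliding window), not a description of the kernel module's exact behaviour; RecentLimiter and its boundary handling are assumptions made for illustration.

```python
from collections import defaultdict, deque


class RecentLimiter:
    """Simplified model of '-m recent --set' followed by
    '--update --seconds N --hitcount K -j DROP'."""

    def __init__(self, seconds=60, hitcount=4):
        self.seconds = seconds
        self.hitcount = hitcount
        self.hits = defaultdict(deque)  # source IP -> timestamps of new connections

    def allow(self, src_ip, now):
        """Record a new connection at time `now` (seconds); False means dropped."""
        window = self.hits[src_ip]
        window.append(now)  # every attempt is recorded, as with --set
        while window and now - window[0] >= self.seconds:
            window.popleft()  # forget attempts that fell out of the window
        return len(window) < self.hitcount


limiter = RecentLimiter()
print([limiter.allow("203.0.113.7", t) for t in (0, 1, 2, 3)])
```

In this model the fourth attempt inside one minute is dropped, and the source is accepted again once its earlier attempts age out.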
# Allow HTTP/HTTPS
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -j ACCEPT

# Save rules
iptables-save > /etc/iptables/rules.v4

Q1507: How do you implement network segmentation?
Answer:
# Create network namespaces
ip netns add dmz
ip netns add internal

# Configure VLANs
ip link add link eth0 name eth0.100 type vlan id 100
ip addr add 192.168.100.1/24 dev eth0.100
ip link set eth0.100 up

# Bridge isolation
ip link add name br-dmz type bridge
ip link set eth1 master br-dmz
ip link set eth2 master br-dmz

# iptables zone-based firewall
iptables -N DMZ-ZONE
iptables -N INTERNAL-ZONE
iptables -N EXTERNAL-ZONE

# DMZ rules
iptables -A DMZ-ZONE -p tcp --dport 80 -j ACCEPT
iptables -A DMZ-ZONE -p tcp --dport 443 -j ACCEPT
iptables -A DMZ-ZONE -j REJECT

# Internal rules
iptables -A INTERNAL-ZONE -j ACCEPT
# MASQUERADE is only valid in the nat table's POSTROUTING chain
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

Q1508: How do you configure IDS/IPS?
Answer:
# Install Snort
apt install snort

# Configure
# /etc/snort/snort.conf
ipvar HOME_NET 192.168.1.0/24
ipvar EXTERNAL_NET !$HOME_NET

# Custom rules
# /etc/snort/rules/local.rules
# Alert on ICMP
alert icmp any any -> $HOME_NET any (msg:"ICMP Ping"; sid:1000001; rev:1;)

# Alert on SSH attempts
alert tcp any any -> $HOME_NET 22 (msg:"SSH Connection Attempt"; \
    flow:to_server,established; content:"SSH"; nocase; sid:1000002; rev:1;)

# Alert on port scan
alert tcp any any -> $HOME_NET any (msg:"Port Scan"; \
    flow:to_server; detection_filter:track by_src,count 5,seconds 10; \
    sid:1000003; rev:1;)

# Run snort
snort -c /etc/snort/snort.conf -i eth0

# Suricata (modern alternative)
apt install suricata
suricata -c /etc/suricata/suricata.yaml -i eth0

Q1509: How do you implement DDoS protection?
Answer:
# Rate limiting with iptables
# Limit connections per IP
iptables -A INPUT -p tcp --dport 80 -m conntrack --ctstate NEW \
    -m recent --set
iptables -A INPUT -p tcp --dport 80 -m conntrack --ctstate NEW \
    -m recent --update --seconds 60 --hitcount 20 -j DROP

# Limit ICMP
iptables -A INPUT -p icmp --icmp-type echo-request \
    -m hashlimit --hashlimit-above 1/sec --hashlimit-burst 4 \
    --hashlimit-htable-size 100000 --hashlimit-mode srcip \
    --hashlimit-name icmp_limit -j DROP

# SYN flood protection
# /etc/sysctl.conf
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_syn_retries=2
net.ipv4.tcp_max_syn_backlog=4096

# Application layer
# Nginx rate limiting
limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
limit_req zone=general burst=20 nodelay;

Q1510: How do you configure VPN security?
Answer:
# WireGuard setup
# Generate keys
wg genkey | tee private.key | wg pubkey > public.key

# Server configuration
# /etc/wireguard/wg0.conf
[Interface]
PrivateKey = <server-private-key>
Address = 10.0.0.1/24
ListenPort = 51820
PostUp = iptables -A FORWARD -i wg0 -j ACCEPT
PostUp = iptables -A FORWARD -o wg0 -j ACCEPT
PostUp = iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
PostDown = iptables -D FORWARD -i wg0 -j ACCEPT
PostDown = iptables -D FORWARD -o wg0 -j ACCEPT
PostDown = iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE

[Peer]
PublicKey = <client-public-key>
AllowedIPs = 10.0.0.2/32

# Client configuration
[Interface]
PrivateKey = <client-private-key>
Address = 10.0.0.2/24

[Peer]
PublicKey = <server-public-key>
Endpoint = server.example.com:51820
AllowedIPs = 0.0.0.0/0
PersistentKeepalive = 25

# Bring the tunnel up
wg-quick up wg0

Linux Kernel Hardening
Q1511: How do you secure kernel parameters?
Answer:
# Network security
net.ipv4.conf.all.rp_filter=1
net.ipv4.conf.default.rp_filter=1
net.ipv4.icmp_echo_ignore_broadcasts=1
net.ipv4.icmp_ignore_bogus_error_responses=1
net.ipv4.conf.all.accept_redirects=0
net.ipv4.conf.default.accept_redirects=0
net.ipv4.conf.all.send_redirects=0
net.ipv4.conf.default.send_redirects=0
net.ipv4.conf.all.accept_source_route=0
net.ipv4.conf.default.accept_source_route=0
net.ipv4.tcp_timestamps=0

# Kernel security
kernel.dmesg_restrict=1
kernel.kptr_restrict=2
kernel.yama.ptrace_scope=2
kernel.sysrq=0

# Memory protection
vm.mmap_min_addr=65536
vm.swappiness=10
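A settings file like this is easy to audit mechanically. The sketch below parses key=value text and reports any expected hardening value that is absent or different; parse_sysctl and missing_or_wrong are hypothetical helper names, and the check looks only at file contents, not at the running kernel.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines, ignoring blank lines and # or ; comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;" or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_or_wrong(text, expected):
    """Return {key: found value or None} for each expected key that does not match."""
    found = parse_sysctl(text)
    return {k: found.get(k) for k, v in expected.items() if found.get(k) != v}


sample = """
# Kernel security
kernel.dmesg_restrict=1
kernel.kptr_restrict = 1
"""
expected = {"kernel.dmesg_restrict": "1", "kernel.kptr_restrict": "2", "kernel.sysrq": "0"}
print(missing_or_wrong(sample, expected))
```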
# Apply
sysctl -p
sysctl --system

Q1512: How do you implement grsecurity?
Answer:
# Install grsecurity
# Option 1: Use precompiled kernel
wget https://kernels.org/pub/linux/kernel/v4.x/linux-4.14.12-grsec.tar.xz

# Option 2: Use PaX/gradm
apt install paxctl gradm

# PaX flags (uppercase letters enable a feature, lowercase letters disable it)
paxctl -C /usr/bin/nginx    # Create the PT_PAX_FLAGS program header
paxctl -M /usr/bin/nginx    # Enable MPROTECT
paxctl -S /usr/bin/nginx    # Enable SEGMEXEC
paxctl -R /usr/bin/nginx    # Enable RANDMMAP

# gradm configuration
# /etc/gradm/admin
# admin:password:0:0

# Enable learning mode
gradm -L /etc/gradm/learning
# Run application in learning mode
gradm -L /etc/gradm/learning -E /usr/bin/nginx

# Compile rules
gradm -F -O /etc/gradm/default.policies

# Enable
gradm -e nginx

Q1513: How do you implement mandatory access control?
Answer:
# SELinux configuration
SELINUX=enforcing
SELINUXTYPE=targeted

# Create custom policy
# myapp.te
policy_module(myapp, 1.0)
gen_require(`
    type init_t;
')
type myapp_t;
type myapp_exec_t;
role system_r types myapp_t;
# The transition source must be a domain type, not a role (init_t is assumed here)
type_transition init_t myapp_exec_t:process myapp_t;

# Compile and install
make -f /usr/share/selinux/devel/Makefile myapp.pp
semodule -i myapp.pp

# AppArmor configuration
# Already covered in previous question

# SMACK (Simplified Mandatory Access Control)
# Enable in kernel
# CONFIG_SECURITY_SMACK=y

# Configure
# /etc/smack/accesses
# Format: subject object access
_ _ r
root myapp rw
myapp _ rw

Q1514: How do you implement container security?
Answer:
# Docker security
# Run without privileges
docker run --rm -it --cap-drop ALL --user 1000:1000 nginx

# Read-only root filesystem
docker run --rm -it --read-only nginx

# Resource limits
docker run --rm -it --memory=256m --cpus=0.5 nginx

# Network isolation
docker run --rm -it --network none nginx

# SELinux/AppArmor (docker-default is Docker's stock AppArmor profile)
docker run --rm -it --security-opt apparmor=docker-default nginx

# Seccomp profile
docker run --rm -it --security-opt seccomp:default nginx

# Rootless Docker
# Install
apt install docker-ce-rootless-extras

# Setup
dockerd-rootless-setuptool.sh install

# Verify
docker info

# Check capabilities
docker run --rm -it nginx capsh --print

Q1515: How do you secure boot process?
Answer:
# UEFI secure boot
# Check status
mokutil --sb-state

# Enroll keys
mokutil --import key.der

# GRUB password
# Generate hash
grub-mkpasswd-pbkdf2
# Add to /etc/grub.d/40_custom
set superusers="admin"
password_pbkdf2 admin grub.pbkdf2.sha512...hash...

# Rebuild GRUB
update-grub

# Disable USB boot
# /etc/modprobe.d/blacklist-usb.conf
install usb-storage /bin/true

# Boot kernel parameters
# /etc/default/grub
GRUB_CMDLINE_LINUX="secure boot=1"

# TPM measured boot
# Install
apt install tpm2-tools

# Measure boot
tpm2_pcrread

# Verify
tpm2_quote -c -k key.file -g sha256 -f quote.out -q "my quote"

Linux Advanced Networking
Q1516: How do you configure advanced routing?
Answer:
# Policy routing
# Add table
echo "200 wan2" >> /etc/iproute2/rt_tables

# Add route
ip route add default via 192.168.2.1 dev eth1 table wan2

# Add rule
ip rule add from 192.168.2.10 table wan2
ip rule add to 192.168.2.0/24 table wan2

# NAT with iptables
# SNAT
iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth0 -j SNAT --to-source 203.0.113.10

# DNAT
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 8080 -j DNAT --to-destination 192.168.1.10:80

# Masquerade
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

# Multi-path routing
ip route add default scope global nexthop via 192.168.1.1 dev eth0 weight 1 \
    nexthop via 192.168.2.1 dev eth1 weight 1

Q1517: How do you configure network bonding for high availability?
Answer:
# Load bonding module
modprobe bonding mode=1 miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BONDING_OPTS="mode=1 miimon=100 primary=eth0"
IPADDR=192.168.1.10
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
MASTER=bond0
SLAVE=yes
ONBOOT=yes

# Mode 4 (LACP)
# /etc/sysconfig/network-scripts/ifcfg-bond0
BONDING_OPTS="mode=4 miimon=100 lacp_rate=1"

# Monitor
cat /proc/net/bonding/bond0

# ethtool
ethtool -S bond0

Q1518: How do you configure IPv6 security?
Answer:
# Disable IPv6
net.ipv6.conf.all.disable_ipv6=1
net.ipv6.conf.default.disable_ipv6=1
net.ipv6.conf.lo.disable_ipv6=1

# Or via GRUB
# GRUB_CMDLINE_LINUX="ipv6.disable=1"

# IPv6 firewall rules
ip6tables -F
ip6tables -P INPUT DROP
ip6tables -P FORWARD DROP
ip6tables -P OUTPUT ACCEPT

# Allow established
ip6tables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# Allow ICMPv6
ip6tables -A INPUT -p ipv6-icmp -j ACCEPT

# Allow SSH
ip6tables -A INPUT -p tcp --dport 22 -j ACCEPT

# Block routing header
ip6tables -A INPUT -m rt --rt-type 0 -j DROP

# RA guard
# On switch or router
# Configure Router Advertisement filtering

Q1519: How do you implement Quality of Service?
Answer:
# Traffic control with tc
# Create qdisc
tc qdisc add dev eth0 root handle 1: htb default 10

# Create classes
tc class add dev eth0 parent 1: classid 1:10 htb rate 10Mbit ceil 10Mbit
tc class add dev eth0 parent 1: classid 1:20 htb rate 5Mbit ceil 5Mbit

# Filter traffic
tc filter add dev eth0 parent 1: protocol all prio 1 u32 match ip dst 192.168.1.10 flowid 1:20

# Example: Prioritize SSH (an alternative root qdisc; band 1:1 has the highest priority)
tc qdisc add dev eth0 root handle 1: prio
tc filter add dev eth0 parent 1: protocol ip prio 10 u32 match ip dport 22 0xffff flowid 1:1
tc filter add dev eth0 parent 1: protocol ip prio 20 u32 match ip sport 22 0xffff flowid 1:1

# View
tc qdisc show
tc class show
tc filter show

Q1520: How do you configure DNS security?
Answer:
# DNSSEC with BIND
# Enable in named.conf
# (dnssec-lookaside and dnssec-enable are obsolete as of BIND 9.16 and are omitted here)
dnssec-validation auto;

# Sign zone
dnssec-keygen -a RSASHA256 -b 2048 -n ZONE example.com
dnssec-signzone -S -o example.com db.example.com

# Configure resolver
# /etc/bind/named.conf.options
options {
    dnssec-validation yes;
};

# Test DNSSEC
dig +dnssec example.com
dig +cd secure.example.com

# Unbound configuration
# /etc/unbound/unbound.conf
server:
    val-log-level: 2
    harden-glue: yes
    harden-dnssec-stripped: yes
    use-caps-for-id: yes

# Query validation
drill -S example.com

Linux Performance Analysis
Q1521: How do you analyze CPU performance?
Answer:
# CPU info
lscpu
cat /proc/cpuinfo

# CPU usage over time
mpstat -P ALL 1
sar -u 1

# Per-CPU usage
mpstat -P ALL 1

# Process CPU usage
top
ps aux --sort=-%cpu
pidstat -p <pid> 1

# CPU steal (virtualization)
vmstat 1
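The st column that vmstat prints is derived from the tick counters on the aggregate cpu line of /proc/stat. As a rough sketch, the Python below computes the steal percentage between two samples of that line; cpu_times and steal_percent are illustrative names, and the sample lines are made up.

```python
def cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counters."""
    fields = stat_line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def steal_percent(before, after):
    """Percentage of ticks spent in 'steal' between two samples of the 'cpu' line."""
    a, b = cpu_times(before), cpu_times(after)
    deltas = [y - x for x, y in zip(a, b)]
    # Columns: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already counted in user and nice, so only the
    # first eight columns are summed.
    total = sum(deltas[:8])
    return 100.0 * deltas[7] / total if total else 0.0


first = "cpu 100 0 50 800 10 0 5 35 0 0"
second = "cpu 200 0 100 1600 20 0 10 70 0 0"
print(f"steal: {steal_percent(first, second):.1f}%")
```

A steal value that stays high over many samples suggests the hypervisor is giving the guest less CPU time than it asks for.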
# Scheduler
# View process priority
ps -eo pid,ni,pri,comm

# CPU affinity
taskset -c 0-3 program
taskset -p 0xF <pid>

# Check CPU frequency
cpupower frequency-info
cpupower frequency-set -g performance

Q1522: How do you analyze memory performance?
Answer:
# Memory info
free -h
cat /proc/meminfo
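When the figures from /proc/meminfo need to be consumed by a script rather than read by eye, they can be parsed directly. A minimal sketch follows; parse_meminfo and used_percent are hypothetical helpers, and the sample text is invented. It uses MemAvailable, which is the kernel's estimate of memory available to new workloads without swapping.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: number} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[name.strip()] = int(parts[0])
    return info


def used_percent(info):
    """Share of RAM that is not available to new workloads."""
    return 100.0 * (info["MemTotal"] - info["MemAvailable"]) / info["MemTotal"]


sample_meminfo = """MemTotal:       16000000 kB
MemFree:         2000000 kB
MemAvailable:   12000000 kB
HugePages_Total:       0
"""
print(f"{used_percent(parse_meminfo(sample_meminfo)):.1f}% in use")
```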
# Memory usage over time
vmstat 1
sar -B 1

# Per-process memory
ps aux --sort=-%mem
pmap -x <pid>
cat /proc/<pid>/status | grep -i vm

# Memory allocation issues
# Check for OOM
dmesg | grep -i "out of memory"
cat /var/log/syslog | grep -i oom

# Slab info
slabtop

# Huge pages
cat /proc/meminfo | grep -i huge

# Transparent huge pages
cat /sys/kernel/mm/transparent_hugepage/enabled

# Memory pressure
cat /proc/pressure/memory

Q1523: How do you analyze I/O performance?
Answer:
# I/O statistics
iostat -xz 1
sar -d 1

# Per-process I/O
iotop
pidstat -d 1

# Block device info
lsblk
blkid

# I/O scheduler (multi-queue kernels, 5.0 and later, call this scheduler mq-deadline)
cat /sys/block/sda/queue/scheduler
echo mq-deadline > /sys/block/sda/queue/scheduler

# Queue depth
cat /sys/block/sda/queue/nr_requests

# Check for I/O waits
vmstat 1

# File system performance
# Read-ahead
cat /sys/block/sda/queue/read_ahead_kb

# Trace I/O
blktrace -d /dev/sda -o trace
blkparse -i trace

Q1524: How do you analyze network performance?
Answer:
# Network statistics
netstat -s
ss -s

# Per-interface statistics
ip -s link
netstat -i

# Connection states
ss -tan state established
ss -tan state time-wait
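For a quick overview of all states at once, the output of an unfiltered ss -tan can be tallied by its first column. The sketch below assumes that layout (a State header followed by one row per socket); count_states is an illustrative helper and the sample rows are invented. Note that when a state filter is given, ss omits the State column, so this applies to the unfiltered form only.

```python
from collections import Counter


def count_states(ss_output):
    """Count TCP states from unfiltered `ss -tan` output (first column of each row)."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "State":
            continue  # skip blank lines and the header row
        counts[fields[0]] += 1
    return counts


sample_ss = """State      Recv-Q Send-Q Local Address:Port  Peer Address:Port
ESTAB      0      0      10.0.0.5:22         10.0.0.9:51000
TIME-WAIT  0      0      10.0.0.5:80         10.0.0.7:40000
ESTAB      0      0      10.0.0.5:443        10.0.0.8:40222
LISTEN     0      128    0.0.0.0:22          0.0.0.0:*
"""
for state, n in count_states(sample_ss).most_common():
    print(state, n)
```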
# Bandwidth monitoring
iftop
nethogs

# Packet capture
tcpdump -i eth0
tcpdump -i eth0 -w capture.pcap

# Network latency
ping -c 4 host
traceroute host
mtr host

# TCP analysis
# TCP retransmits
netstat -s | grep -i retrans

# Connection tracking
conntrack -L

# Socket statistics
ss -tulpn
lsof -i

Q1525: How do you use performance profiling tools?
Answer:
# perf
perf record -g -p <pid>
perf report
perf top

# Flame graph
# Install
git clone https://github.com/brendangregg/FlameGraph.git

# Generate
perf record -F 99 -g -p <pid>
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > flame.svg

# System-wide profiling
perf record -a -g -- sleep 10
perf report

# Valgrind
valgrind --tool=cachegrind ./program
cg_annotate cachegrind.out.*

# gprof
gcc -pg -g program.c -o program
./program
gprof program gmon.out > analysis.txt

# strace
strace -c -p <pid>
strace -T -tt -p <pid>

Linux Storage Advanced
Q1526: How do you configure RAID?
Answer:
# Create RAID5
mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[b-e]1

# Create RAID10
mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[b-e]1

# Create RAID1 with spare
mdadm --create /dev/md0 --level=1 --raid-devices=2 --spare-devices=1 /dev/sd[b-d]1

# Monitor
mdadm --detail /dev/md0
cat /proc/mdstat
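In /proc/mdstat each array reports a member map such as [UU_U], where an underscore marks a missing or failed member. The following sketch flags arrays with such a gap; degraded_arrays is a hypothetical helper and the sample text is an invented example of the usual layout. For unattended alerting, mdadm --monitor is the purpose-built tool.

```python
import re


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose member map (e.g. [UU_]) shows a gap."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)  # remember which array the next lines describe
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded


sample_mdstat = """Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 sde1[4] sdd1[2] sdc1[1] sdb1[0]
      3142656 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UU_U]

md1 : active raid1 sdg1[1] sdf1[0]
      1046528 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""
print(degraded_arrays(sample_mdstat))
```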
# Add to /etc/mdadm.conf
mdadm --examine --scan >> /etc/mdadm.conf

# Manage
mdadm /dev/md0 --add /dev/sdf1
mdadm /dev/md0 --remove /dev/sdb1
mdadm /dev/md0 --fail /dev/sdb1

# Stop/start
mdadm --stop /dev/md0
mdadm --assemble /dev/md0

Q1527: How do you configure LVM?
Answer:
# Create physical volume
pvcreate /dev/sdb1
pvdisplay
pvmove /dev/sdb1 /dev/sdc1

# Create volume group
vgcreate vg_data /dev/sdb1
vgextend vg_data /dev/sdc1
vgdisplay

# Create logical volume
lvcreate -L 10G -n lv_data vg_data
lvcreate -l 100%FREE -n lv_backup vg_data

# Create thin pool
lvcreate -L 100G --thinpool vg_data/thin_pool
lvcreate -V 10G --thin -n lv_thin vg_data/thin_pool

# Snapshot
lvcreate -s -L 5G -n lv_snap vg_data/lv_data

# Resize (-r resizes the filesystem along with the volume; XFS cannot be shrunk)
lvextend -r -L +10G /dev/vg_data/lv_data
lvreduce -r -L -5G /dev/vg_data/lv_data

# Remove
lvremove /dev/vg_data/lv_data
vgremove vg_data
pvremove /dev/sdb1

Q1528: How do you configure encrypted filesystems?
Answer:
# LUKS encryption
cryptsetup luksFormat /dev/sdb1
cryptsetup luksOpen /dev/sdb1 encrypted
mkfs.xfs /dev/mapper/encrypted

# Add key
cryptsetup luksAddKey /dev/sdb1

# Backup header
cryptsetup luksHeaderBackup /dev/sdb1 --header-backup-file header.img

# Auto unlock
# /etc/crypttab
encrypted /dev/sdb1 none luks

# /etc/fstab
/dev/mapper/encrypted /mnt/data xfs defaults 0 2

# eCryptfs
mount -t ecryptfs /backup /encrypted
# Or use fscrypt
mkfs.ext4 -O encrypt /dev/sda1
mount /dev/sda1 /mnt
fscrypt setup
fscrypt encrypt /mnt

Q1529: How do you configure NFS security?
Answer:
# Server configuration
# /etc/exports (root_squash keeps remote root from acting as root on the export)
/data 192.168.1.0/24(rw,sync,no_subtree_check,root_squash,sec=krb5p)
/secure 192.168.1.0/24(rw,sync,sec=sys)

# Kerberized NFS
# Server
# /etc/exports
/data gss/krb5p(rw,sync,no_subtree_check)

# Export
exportfs -av

# Client
mount -t nfs -o sec=krb5p server:/data /mnt

# Security options
# sec=sys - UID/GID mapping
# sec=krb5 - Authentication only
# sec=krb5i - Integrity
# sec=krb5p - Privacy

# Firewall
# Allow NFS
iptables -A INPUT -p tcp --dport 2049 -j ACCEPT
iptables -A INPUT -p udp --dport 2049 -j ACCEPT

# Test
nfsstat -c
showmount -e server

Q1530: How do you configure disk quotas?
Answer:
# Enable quota
# /etc/fstab
/dev/sda1 /home ext4 usrquota,grpquota 0 2

# Remount
mount -o remount /home

# Initialize quota
quotacheck -cug /home

# Enable quota
quotaon /home

# Set user quota
edquota -u username
# Edit soft/hard limits

# Set group quota
edquota -g groupname

# View quota
quota -u username
quota -g groupname
repquota -a

# Copy quota template
edquota -p template_user new_user

# Email reports
# /etc/cron.daily/quotas
quotacheck -avug
repquota -a | mail -s "Quota Report" admin@example.com

Linux Services Configuration
Q1531: How do you configure Apache advanced?
Answer:
# Virtual host with SSL
<VirtualHost *:443>
    ServerName example.com
    DocumentRoot /var/www/html

    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/server.crt
    SSLCertificateKeyFile /etc/ssl/private/server.key
    SSLCertificateChainFile /etc/ssl/certs/ca.crt

    <Directory /var/www/html>
        Options -Indexes +FollowSymLinks
        AllowOverride None
        Require all granted
    </Directory>

    # Security headers
    Header always set X-Frame-Options "SAMEORIGIN"
    Header always set X-Content-Type-Options "nosniff"
    Header always set X-XSS-Protection "1; mode=block"

    # Performance
    KeepAlive On
    MaxKeepAliveRequests 100
    KeepAliveTimeout 5

    # Compression
    AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css application/javascript
</VirtualHost>

# Load balancer
<Proxy balancer://mycluster>
    BalancerMember http://192.168.1.10:8080 route=node1
    BalancerMember http://192.168.1.11:8080 route=node2
    ProxySet lbmethod=byrequests
</Proxy>

ProxyPass / balancer://mycluster/
ProxyPassReverse / balancer://mycluster/

Q1532: How do you configure Nginx advanced?
Answer:
# Worker configuration
worker_processes auto;
worker_rlimit_nofile 65535;

events {
    worker_connections 65535;
    use epoll;
    multi_accept on;
}

http {
    # Logging
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;
    error_log /var/log/nginx/error.log warn;

    # Performance
    open_file_cache max=10000 inactive=30s;
    open_file_cache_valid 60s;
    open_file_cache_min_uses 2;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;

    # Upstream with health check
    upstream backend {
        least_conn;
        server 192.168.1.10:8080 max_fails=3 fail_timeout=30s;
        server 192.168.1.11:8080 max_fails=3 fail_timeout=30s;
        keepalive 32;
    }

    server {
        location / {
            proxy_pass http://backend;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }
}

Q1533: How do you configure PostgreSQL high availability?
Answer:
# Streaming replication
# Master: /etc/postgresql/14/main/postgresql.conf
wal_level = replica
max_wal_senders = 3
wal_keep_size = 1GB
hot_standby = on

# Master: /etc/postgresql/14/main/pg_hba.conf
host replication replicator 192.168.1.0/24 md5

# Create replication user
psql -c "CREATE USER replicator REPLICATION LOGIN PASSWORD 'secret';"

# Backup on replica
pg_basebackup -h master -D /var/lib/postgresql/14/main -U replicator -P -Xs

# Replica: /etc/postgresql/14/main/postgresql.conf
# (PostgreSQL 12 and later no longer read recovery.conf; standby settings live here)
hot_standby = on
primary_conninfo = 'host=master port=5432 user=replicator password=secret'
promote_trigger_file = '/tmp/promote'

# Replica: mark the data directory as a standby
touch /var/lib/postgresql/14/main/standby.signal

# pgBouncer for connection pooling
# /etc/pgbouncer/pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20

Q1534: How do you configure Redis Sentinel?
Answer:
# Sentinel configuration
port 26379
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
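The last argument of sentinel monitor (2 here) is the quorum: the number of Sentinels that must agree the master is unreachable before it is marked objectively down. Starting the failover additionally requires a majority of all Sentinels to authorize a leader. The sketch below models just that rule; sentinel_outcome is an illustrative function that treats "agreeing" Sentinels as both detecting the failure and voting, which is a simplification.

```python
def sentinel_outcome(total_sentinels, agreeing, quorum):
    """Describe what `agreeing` Sentinels out of `total_sentinels` can achieve."""
    majority = total_sentinels // 2 + 1
    if agreeing < quorum:
        return "no action: master not marked objectively down"
    if agreeing < majority:
        return "master marked down, but failover cannot be authorized"
    return "failover can proceed"


# Three Sentinels with quorum 2 tolerate the loss of one Sentinel.
print(sentinel_outcome(total_sentinels=3, agreeing=2, quorum=2))
# Five Sentinels with quorum 2 can detect the failure with two, but not fail over.
print(sentinel_outcome(total_sentinels=5, agreeing=2, quorum=2))
```

This is why Sentinel deployments typically run an odd number of instances, three at minimum, on separate hosts.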
# Start sentinel
redis-sentinel /etc/redis/sentinel.conf

# Client connection
# Python example
from redis.sentinel import Sentinel
sentinel = Sentinel([('localhost', 26379)], socket_timeout=0.1)
master = sentinel.master_for('mymaster', socket_timeout=0.1)
slave = sentinel.slave_for('mymaster', socket_timeout=0.1)

# Commands
redis-cli -p 26379 INFO SENTINEL
redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster

# Failover
# Sentinel automatically promotes replica to master
# Old master becomes replica when back online

Q1535: How do you configure MySQL Cluster?
Answer:
# MySQL NDB Cluster
# Install
apt install mysql-cluster-community-server

# Cluster configuration, read by the management node
# /etc/mysql/config.ini
# Management node
[ndb_mgmd]
NodeId=1
HostName=192.168.1.10
DataDir=/var/lib/mysql-cluster

# Data nodes
[ndbd]
NodeId=2
HostName=192.168.1.11
DataDir=/var/lib/mysql-cluster

[ndbd]
NodeId=3
HostName=192.168.1.12
DataDir=/var/lib/mysql-cluster

# SQL node
[mysqld]
NodeId=4

# Start management node
ndb_mgmd -f /etc/mysql/config.ini

# Start data nodes
ndbd --initial

# Start SQL node
mysqld --ndbcluster

# Check status
ndb_mgm -e show

Linux Automation Advanced
Q1536: How do you use Ansible Vault?
Answer:
# Create encrypted file
ansible-vault create secrets.yml

# Encrypt existing file
ansible-vault encrypt secrets.yml

# Edit encrypted file
ansible-vault edit secrets.yml

# View encrypted file
ansible-vault view secrets.yml

# Decrypt file
ansible-vault decrypt secrets.yml

# Change password
ansible-vault rekey secrets.yml

# Use in playbook
# playbook.yml
- hosts: all
  vars_files:
    - secrets.yml
  tasks:
    - name: Create user
      user:
        name: "{{ db_user }}"
        # The user module expects a password hash, not plain text
        password: "{{ db_password | password_hash('sha512') }}"

# Run with vault password
ansible-playbook site.yml --ask-vault-pass
# or
ansible-playbook site.yml --vault-password-file ~/.vault_pass.txt

Q1537: How do you use Ansible roles?
Answer:
# Create role structure
ansible-galaxy init nginx

# Role structure
# nginx/
# ├── defaults/
# │   └── main.yml
# ├── handlers/
# │   └── main.yml
# ├── meta/
# │   └── main.yml
# ├── tasks/
# │   └── main.yml
# ├── templates/
# │   └── nginx.conf.j2
# ├── tests/
# │   ├── inventory
# │   └── test.yml
# └── vars/
#     └── main.yml

# defaults/main.yml
nginx_port: 80
nginx_workers: 4

# tasks/main.yml
- name: Install nginx
  apt:
    name: nginx
    state: present

- name: Configure nginx
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
  notify: restart nginx

# handlers/main.yml
- name: restart nginx
  service:
    name: nginx
    state: restarted

# Use role
# playbook.yml
- hosts: webservers
  roles:
    - nginx

Q1538: How do you use Terraform modules?
Answer:
# Module structure
# modules/
# ├── ec2/
# │   ├── main.tf
# │   ├── variables.tf
# │   └── outputs.tf
# └── vpc/
#     ├── main.tf
#     ├── variables.tf
#     └── outputs.tf

# ec2/variables.tf
variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}

variable "ami_id" {
  description = "AMI ID"
  type        = string
}

# ec2/outputs.tf
output "instance_id" {
  value = aws_instance.this.id
}

# Main configuration
# main.tf
module "vpc" {
  source = "./modules/vpc"

  cidr_block = "10.0.0.0/16"
}

module "ec2" {
  source = "./modules/ec2"

  ami_id        = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.micro"

  vpc_id = module.vpc.vpc_id
}

Q1539: How do you use Chef cookbooks?
Answer:
# Cookbook structure
# ├── metadata.rb
# ├── recipes/
# │   └── default.rb
# ├── templates/
# │   └── config.erb
# └── attributes/
#     └── default.rb

# metadata.rb
name 'myapp'
version '1.0.0'
depends 'nginx'

# attributes/default.rb
default['myapp']['port'] = 8080
default['myapp']['workers'] = 4

# recipes/default.rb
package 'myapp'

template '/etc/myapp/config.yml' do
  source 'config.erb'
  mode '0644'
  variables(
    port: node['myapp']['port']
  )
end

service 'myapp' do
  action [:enable, :start]
end

# Use cookbook
# Run list
chef-client -r "recipe[myapp]"

Q1540: How do you use Puppet modules?
Answer:
# Module structure
# ├── manifests/
# │   ├── init.pp
# │   └── config.pp
# ├── templates/
# │   └── nginx.conf.erb
# └── files/
#     └── index.html

# manifests/init.pp
class nginx {
  package { 'nginx':
    ensure => installed,
  }

  service { 'nginx':
    ensure     => running,
    enable     => true,
    hasrestart => true,
  }
}

# manifests/config.pp
class nginx::config inherits nginx {
  file { '/etc/nginx/nginx.conf':
    ensure  => file,
    content => template('nginx/nginx.conf.erb'),
    require => Package['nginx'],
    notify  => Service['nginx'],
  }
}

# Use module
# site.pp
node 'webserver.example.com' {
  include nginx
  include nginx::config
}

Linux Cloud Native
Q1541: How do you configure Kubernetes networking?
Answer:
# CNI plugins
# Install Flannel
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

# Install Calico
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

# Network policies
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
EOF

# Allow specific traffic
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
spec:
  podSelector:
    matchLabels:
      app: frontend
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: backend
EOF

# Service mesh (Istio)
istioctl install --set profile=demo
kubectl label namespace default istio-injection=enabled

Q1542: How do you configure Kubernetes storage?
Answer:
# PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  hostPath:
    path: /mnt/data

---
# PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

---
# StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: regional-pd

---
# Use in Pod
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myapp
      image: nginx
      volumeMounts:
        - name: my-storage
          mountPath: /data
  volumes:
    - name: my-storage
      persistentVolumeClaim:
        claimName: my-pvc

Q1543: How do you configure Kubernetes security?
Answer:
# RBAC
kubectl create serviceaccount myapp
kubectl create role myapp-reader --verb=get,list --resource=pods
kubectl create rolebinding myapp-reader-binding --role=myapp-reader --serviceaccount=default:myapp

# Use service account in pod
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  serviceAccountName: myapp
  containers:
    - name: myapp
      image: nginx

# Pod Security Policy (removed in Kubernetes 1.25; newer clusters use Pod Security Admission)
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAsNonRoot
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny

# Network policies
# See previous question

# Secrets
kubectl create secret generic mysecret \
    --from-literal=username=admin \
    --from-literal=password=secret

Q1544: How do you configure Helm workflows?
Answer:
# Create charthelm create myapp
# Add dependencies# Chart.yamldependencies: - name: nginx version: "1.0.0" repository: "https://charts.bitnami.com/bitnami"
# Install with dependencieshelm dependency buildhelm dependency update
# Template functions# values.yamlreplicaCount: 3
# deployment.yamlapiVersion: apps/v1kind: Deploymentmetadata: name: {{ include "myapp.fullname" . }}spec: replicas: {{ .Values.replicaCount }}
# Common functions{{ .Values.image.repository }}:{{ .Values.image.tag }}{{ include "myapp.fullname" . }}{{ .Release.Name }}{{ .Release.Namespace }}
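# Hooks
# Helm hooks are declared as annotations on a manifest under templates/
apiVersion: v1
kind: Pod
metadata:
  name: backup
  annotations:
    "helm.sh/hook": pre-install
    "helm.sh/hook-weight": "10"
# spec: container that performs the backup goes here

Q1545: How do you implement GitOps with ArgoCD?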
Answer:
# Install ArgoCDkubectl create namespace argocdkubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Get passwordkubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
# Create applicationkubectl apply -f - <<EOFapiVersion: argoproj.io/v1alpha1kind: Applicationmetadata: name: myapp namespace: argocdspec: project: default source: repoURL: https://github.com/org/repo.git targetRevision: HEAD path: k8s/production destination: server: https://kubernetes.default.svc namespace: production syncPolicy: automated: prune: true selfHeal: trueEOF
# Syncargocd app sync myappargocd app get myapp
# Sync waves# Add annotations to resources# metadata:# annotations:# argocd.argoproj.io/sync-wave: "1"Linux Troubleshooting Advanced
Q1546: How do you debug kernel issues?
Answer:
# Kernel messagesdmesgdmesg | tail -100
# Kernel panic# Enable kdumpapt install kdump-toolskdump-config load
# Test kdumpecho c > /proc/sysrq-trigger
# Analyze crash (crash needs a vmlinux with debug symbols for the crashed kernel)
crash /path/to/vmlinux /var/crash/vmcore
# Kernel configzcat /proc/config.gz# orcat /boot/config-$(uname -r)
# System callsstrace -c programstrace -f program
# ftraceecho function > /sys/kernel/debug/tracing/current_tracercat /sys/kernel/debug/tracing/trace
# perfperf record -g programperf reportQ1547: How do you debug network issues?
Answer:
# Interface statusip link showip addr showethtool eth0
# Routingip routeip route get 8.8.8.8
# DNSdig example.comgetent hosts example.com
# Connectivityping -c 4 8.8.8.8traceroute 8.8.8.8
# Port statusnetstat -tulpnss -tulpn
# Capturetcpdump -i eth0 host 192.168.1.1tcpdump -i eth0 port 80
# Firewalliptables -L -n -viptables -t nat -L -n -v
# TCP issues# Retransmitsnetstat -s | grep -i retrans# Connection statesss -tan state time-wait
# ARPip neigh showarp -aQ1548: How do you debug storage issues?
Answer:
# Disk usagedf -hdf -i
# Find large filesfind / -type f -size +100M 2>/dev/null | head -20
# I/O statsiostat -xz 1sar -d 1
# Mount issuesmountcat /proc/mounts
# Filesystem checkfsck -n /dev/sda1
# LVM issueslvspvsvgslvdisplay
# NFS issuesshowmount -e servermount -v server:/share /mnt
# SMART statussmartctl -a /dev/sdasmartctl -H /dev/sda
# Lsof for deleted fileslsof +L1Q1549: How do you debug service issues?
Answer:
# Service statussystemctl status servicesystemctl list-failed
# Logsjournalctl -u service -n 50journalctl -u service --since "1 hour ago"journalctl -xe
# Processps auxf | grep servicelsof -p $(pgrep -f service)
# Configurationservice configtestnginx -t
# Dependenciessystemctl list-dependencies servicesystemctl is-active service
# Resourcescat /proc/$(pgrep -f service)/limits
# Networknetstat -tulpn | grep service
# Environmentcat /proc/$(pgrep -f service)/environ | tr '\0' '\n'
# Cgroupssystemd-cgls | grep serviceQ1550: How do you debug application issues?
Answer:
# Core dumps# Enableulimit -c unlimited
# /etc/security/limits.conf* soft core unlimited
# Generate coregcore <pid>
# Analyzegdb program core(gdb) bt(gdb) info threads
# Memory leaksvalgrind --leak-check=full program
# Performance profilingperf record -g programperf report
# Python debuggingpython -m pdb program.pypython -m cProfile program.py
# Java debuggingjstack <pid>jmap -heap <pid>jmap -dump:format=b,file=heap.bin <pid>
# Node.js debuggingnode --inspect program.jschrome://inspectLinux Advanced Topics
Q1551: How do you implement zero-downtime restarts?
Answer:
# Nginx graceful reloadnginx -s reload# orsystemctl reload nginx
# HAProxy# Reload without downtimesystemctl reload haproxy
# Application with SIGTERM handling# In application codeimport signalimport sys
def sigterm_handler(signum, frame): # Stop accepting new connections # Wait for existing connections to complete # Then exit print("Shutting down gracefully...") sys.exit(0)
signal.signal(signal.SIGTERM, sigterm_handler)
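The handler above only prints and exits. One way to act on its comments (stop accepting work, wait for in-flight requests) is sketched below. The `GracefulShutdown` class and its method names are hypothetical, not part of any library; a real server would call `request_started`/`request_finished` around each request and `wait_for_drain` before exiting.

```python
import signal
import threading


class GracefulShutdown:
    """Tracks a shutdown request and the number of in-flight requests."""

    def __init__(self):
        self.stop_requested = threading.Event()
        self._in_flight = 0
        self._cond = threading.Condition()

    def handle_sigterm(self, signum, frame):
        # Only flip a flag here; the main loop does the actual draining.
        self.stop_requested.set()

    def request_started(self):
        """Return False (refuse the request) once shutdown has begun."""
        with self._cond:
            if self.stop_requested.is_set():
                return False
            self._in_flight += 1
            return True

    def request_finished(self):
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def wait_for_drain(self, timeout):
        """Block until no requests are in flight; False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)


shutdown = GracefulShutdown()
# Signal handlers can only be installed from the main thread.
if threading.current_thread() is threading.main_thread():
    signal.signal(signal.SIGTERM, shutdown.handle_sigterm)
```

Keeping the handler to a single flag flip avoids doing blocking work inside a signal handler; the drain timeout should be shorter than whatever grace period the supervisor (systemd, Kubernetes) allows before SIGKILL.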
# Kubernetes rolling updatekubectl set image deployment/myapp myapp=myapp:v2kubectl rollout status deployment/myapp
# Rollback if neededkubectl rollout undo deployment/myappQ1552: How do you implement feature toggles?
Answer:
# Simple feature toggleclass FeatureToggle: def __init__(self): self.features = {}
def enable(self, feature): self.features[feature] = True
def disable(self, feature): self.features[feature] = False
def is_enabled(self, feature): return self.features.get(feature, False)
# Usagetoggle = FeatureToggle()toggle.enable('new_ui')
if toggle.is_enabled('new_ui'): show_new_ui()else: show_old_ui()
# Environment-basedimport osif os.getenv('FEATURE_NEW_UI') == '1': show_new_ui()
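Another common variant is a percentage rollout, where a feature is enabled for a stable subset of users. The sketch below is one possible approach, assuming users have a string or integer ID; `rollout_bucket` and `is_enabled_for` are illustrative names, not from any library.

```python
import hashlib


def rollout_bucket(feature, user_id):
    """Map (feature, user) to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled_for(feature, user_id, percentage):
    """Enable the feature for roughly `percentage` percent of users."""
    return rollout_bucket(feature, user_id) < percentage
```

Hashing the feature name together with the user ID keeps each user's answer stable across requests, and raising the percentage only ever adds users; nobody who already has the feature loses it.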
# Database-backeddef is_feature_enabled(feature_name): result = db.query("SELECT enabled FROM features WHERE name = ?", feature_name) return result.enabled if result else FalseQ1553: How do you implement circuit breaker pattern?
Answer:
import timefrom functools import wraps
class CircuitBreaker: def __init__(self, failure_threshold=5, timeout=60): self.failure_threshold = failure_threshold self.timeout = timeout self.failures = 0 self.last_failure_time = None self.state = "CLOSED" # CLOSED, OPEN, HALF_OPEN
def call(self, func, *args, **kwargs): if self.state == "OPEN": if time.time() - self.last_failure_time > self.timeout: self.state = "HALF_OPEN" else: raise Exception("Circuit breaker is OPEN")
try: result = func(*args, **kwargs) self._on_success() return result except Exception as e: self._on_failure() raise
def _on_success(self): self.failures = 0 self.state = "CLOSED"
def _on_failure(self): self.failures += 1 self.last_failure_time = time.time() if self.failures >= self.failure_threshold: self.state = "OPEN"
# Usagebreaker = CircuitBreaker()result = breaker.call(risky_api_call)Q1554: How do you implement rate limiting?
Answer:
import timefrom collections import defaultdict
class RateLimiter: def __init__(self, max_requests, time_window): self.max_requests = max_requests self.time_window = time_window self.requests = defaultdict(list)
def is_allowed(self, key): now = time.time() # Remove old requests self.requests[key] = [ req_time for req_time in self.requests[key] if now - req_time < self.time_window ]
if len(self.requests[key]) >= self.max_requests: return False
self.requests[key].append(now) return True
# Usage (Flask)limiter = RateLimiter(100, 60)
@app.route('/api')def api(): if not limiter.is_allowed(request.remote_addr): return "Too many requests", 429
# Process request return "OK"
# Redis-based (distributed)import redis
class RedisRateLimiter: def __init__(self, redis_client, max_requests, time_window): self.redis = redis_client self.max_requests = max_requests self.time_window = time_window
def is_allowed(self, key): current = self.redis.incr(key) if current == 1: self.redis.expire(key, self.time_window) return current <= self.max_requestsQ1555: How do you implement service discovery?
Answer:
# Consul# Installapt install consul
# Configuration# /etc/consul/config.json{ "datacenter": "dc1", "data_dir": "/var/consul", "ui_config": { "enabled": true }, "retry_join": ["provider=aws tag_key=consul tag_value=server"], "server": true, "bootstrap_expect": 3}
# Register service# /etc/consul/service.json{ "service": { "name": "web", "port": 80, "check": { "http": "http://localhost:80/health", "interval": "10s" } }}
# DNS interface# Query servicedig @127.0.0.1 -p 8600 web.service.consul
# HTTP APIcurl http://127.0.0.1:8500/v1/catalog/service/web
# Register in codeimport consulc = consul.Consul()
# Register servicec.agent.service.register( 'web', service_id='web-1', port=80, check=consul.Check.http('http://localhost:80/health', '10s'))Linux Expert Topics
Q1556: How do you implement observability?
Answer:
# Distributed tracing with Jaeger
# Client integration
# Python
from opentelemetry import trace
from opentelemetry.exporter.jaeger.thrift import JaegerExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

trace.set_tracer_provider(TracerProvider())
jaeger_exporter = JaegerExporter(
    agent_host_name="jaeger",
    agent_port=6831,
)
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(jaeger_exporter)
)
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("operation") as span: span.set_attribute("key", "value") # Do work
# Metrics with Prometheusfrom prometheus_client import Counter, generate_latest
requests_total = Counter('requests_total', 'Total requests')
@app.route('/')def hello(): requests_total.inc() return 'Hello'
# Export metrics@app.route('/metrics')def metrics(): return generate_latest()
# Logging structuredimport loggingimport json
logger = logging.getLogger(__name__)logger.info("Request processed", extra={ "user_id": user.id, "duration_ms": duration})Q1557: How do you implement chaos engineering?
Answer:
# Chaos Mesh# Installhelm repo add chaos-mesh https://charts.chaos-mesh.orghelm install chaos-mesh chaos-mesh/chaos-mesh -n chaos-mesh --create-namespace
# Pod failure experimentapiVersion: chaos-mesh.org/v1alpha1kind: PodChaosmetadata: name: pod-failurespec: action: pod-failure mode: one duration: 60s selector: namespaces: - default labelSelectors: app: myapp
# Network chaosapiVersion: chaos-mesh.org/v1alpha1kind: NetworkChaosmetadata: name: network-delayspec: action: delay mode: one duration: 60s selector: namespaces: - default delay: latency: 100ms
# Litmus# Installhelm repo add litmuschaos https://litmuschaos.github.io/litmus-helmhelm install litmuschaos litmuschaos/litmus
# Use with AWS# Simulate EC2 instance terminationaws ec2 terminate-instances --instance-ids i-1234567890abcdef0Q1558: How do you implement multi-tenancy?
Answer:
# Kubernetes namespaceskubectl create namespace tenant1kubectl create namespace tenant2
# Resource quotasapiVersion: v1kind: ResourceQuotametadata: name: tenant-quota namespace: tenant1spec: hard: requests.cpu: "4" requests.memory: 8Gi limits.cpu: "8" limits.memory: 16Gi pods: "20"
# Limit rangesapiVersion: v1kind: LimitRangemetadata: name: tenant-limits namespace: tenant1spec: limits: - max: cpu: "2" memory: "4Gi" min: cpu: "100m" memory: "128Mi" type: Container
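# RBAC
# "admin" is a built-in ClusterRole; binding it with a RoleBinding scopes it to the namespace
kubectl create rolebinding tenant1-admin \
  --clusterrole=admin \
  --user=user1 \
  --namespace=tenant1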
# Network policieskubectl apply -f - <<EOFapiVersion: networking.k8s.io/v1kind: NetworkPolicymetadata: name: deny-cross-namespace namespace: tenant1spec: podSelector: {} policyTypes: - Ingress ingress: - from: - namespaceSelector: matchLabels: tenant: tenant1EOFQ1559: How do you implement disaster recovery?
Answer:
# Backup Kubernetes# ETCD backupETCDCTL_API=3 etcdctl snapshot save backup.db \ --endpoints=https://127.0.0.1:2379 \ --cacert=/etc/kubernetes/pki/etcd/ca.crt \ --cert=/etc/kubernetes/pki/etcd/server.crt \ --key=/etc/kubernetes/pki/etcd/server.key
# Restore ETCD (works on local files, so endpoint and TLS flags do not apply)
ETCDCTL_API=3 etcdctl snapshot restore backup.db \
  --data-dir=/var/lib/etcd-restore
# Then point etcd at the restored data directory and restart it
# Velero (Kubernetes backup)# Installkubectl apply -f https://github.com/vmware-tanzu/velero/releases/download/v1.10.0/velero.yaml
# Backupvelero backup create backup-2024-01-01 --include-namespaces default
# Restorevelero restore create --from-backup backup-2024-01-01
# Schedule backups (cron syntax: daily at 02:00)
velero schedule create daily --schedule="0 2 * * *"
# Database backupmysqldump -u root -p mydb > backup.sqlpg_dump -U postgres mydb > backup.sql
# Object storageaws s3 sync /data s3://bucket/backup/Q1560: How do you implement security scanning?
Answer:
# Container scanning# Trivytrivy image myimage:latesttrivy image --severity HIGH,CRITICAL myimage:latesttrivy image --exit-code 1 --severity CRITICAL myimage:latest
# Clairdocker run -p 5432:5432 -d quay.io/coreos/clair:latestclair-scanner myimage
# Infrastructure scanning# Kube-benchkube-bench run --targets node
# Kube-hunter (run as a pod inside the cluster)
kubectl run --rm -it kube-hunter --image=aquasec/kube-hunter --restart=Never -- --pod
# SAST# Bandit (Python)bandit -r myapp/
# Semgrepsemgrep --config=auto mycode/
# DAST# OWASP ZAPzap-baseline.py -t https://myapp.example.com
# Secret scanning# TruffleHogtrufflehog filesystem myrepo/
# gitleaks (v8 syntax; older releases used --path)
gitleaks detect --source=mydir --verbose

Linux Best Practices
Q1561: How do you implement backup strategy?
Answer:
# 3-2-1 backup rule# 3 copies of data# 2 different storage types# 1 offsite copy
# Backup types# Full backuptar -czf full-backup-$(date +%Y%m%d).tar.gz /data
# Incremental backup# First full backuptar -czf backup-$(date +%Y%m%d).tar.gz -g /var/log/backup.snar /data
# Differential backup# After first full backuptar -czf differential-$(date +%Y%m%d).tar.gz -N "2024-01-01" /data
# Database backupmysqldump -u root -p --all-databases > all-databases-$(date +%Y%m%d).sqlpg_dumpall -U postgres > all-databases-$(date +%Y%m%d).sql
# Automated backup script#!/bin/bashBACKUP_DIR="/backup"DATE=$(date +%Y%m%d)
# Databasemysqldump -u root mydb | gzip > $BACKUP_DIR/mydb-$DATE.sql.gz
# Filestar -czf $BACKUP_DIR/files-$DATE.tar.gz /data
# Retentionfind $BACKUP_DIR -type f -mtime +30 -deleteQ1562: How do you implement monitoring strategy?
Answer:
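# Prometheus + Grafana
# Install (stable/prometheus-operator is deprecated; kube-prometheus-stack replaces it)
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack \
  --set grafana.service.type=LoadBalancer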
# Define metrics# node_exporter# - node_cpu_seconds_total# - node_memory_MemTotal_bytes# - node_filesystem_size_bytes
# Custom application metricsfrom prometheus_client import Counter, Gauge, Histogram
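requests_total = Counter('app_requests_total', 'Total requests')
# prometheus_client requires a help string as the second argument
processing_duration = Histogram('app_processing_duration_seconds', 'Request processing time in seconds')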
# Alerting rules# prometheus.rules- alert: HighCPU expr: 100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80 for: 5m labels: severity: warning
# AlertManager# alertmanager.yamlroute: group_by: ['alertname'] receiver: 'team'receivers:- name: 'team' email_configs: - to: 'team@example.com' slack_configs: - api_url: 'https://hooks.slack.com/...'Q1563: How do you implement logging strategy?
Answer:
# ELK Stack# Filebeatfilebeat.inputs:- type: log paths: - /var/log/*.log fields: type: syslog fields_under_root: true
output.logstash: hosts: ["logstash:5044"]
# Logstashinput { beats { port => 5044 } }filter { if [type] == "syslog" { grok { match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:hostname} %{DATA:program}: %{GREEDYDATA:message}" } } }}output { elasticsearch { hosts => ["elasticsearch:9200"] index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}" }}
# Kibana# Create index pattern# Create dashboards
# Structured logging# JSON formatimport loggingimport json
class JSONFormatter(logging.Formatter): def format(self, record): log_data = { 'timestamp': self.formatTime(record), 'level': record.levelname, 'message': record.getMessage(), 'module': record.module } return json.dumps(log_data)Q1564: How do you implement incident response?
Answer:
# Incident response plan# 1. Detection# Monitor alerts -> PagerDuty -> On-call engineer
# 2. Assessment# Check severity -> Determine impact
# 3. Communication# Create incident channel# Update status page
# 4. Mitigation# Stop bleeding# Restore service
# 5. Resolution# Fix root cause# Deploy fix
# Runbook example# Runbook: Database Connection Issues# 1. Check database status# systemctl status postgresql# 2. Check connections# psql -c "SELECT count(*) FROM pg_stat_activity"# 3. Check slow queries# psql -c "SELECT * FROM pg_stat_activity WHERE state != 'idle' LIMIT 10"# 4. Kill long-running queries# SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE query_start < NOW() - INTERVAL '5 minutes';# 5. If needed, restart database# systemctl restart postgresql
# Post-incident# 1. Document timeline# 2. Identify root cause# 3. Implement fix# 4. Review and improveQ1565: How do you implement capacity planning?
Answer:
# Metrics collection# CPUsar -u 1 60 > cpu_usage.csv# Memorysar -r 1 60 > memory_usage.csv# I/Osar -d 1 60 > io_usage.csv# Networksar -n DEV 1 60 > network_usage.csv
# Analysis# Growth rate# (current_value - past_value) / days_between
# Capacity planning formula# CPU: (peak_usage * growth_factor * buffer) / cores# Memory: (peak_usage * growth_factor * buffer) / available# Disk: (current_usage * (1 + growth_rate)^years)# Network: peak_bandwidth * redundancy_factor
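The growth-rate and disk formulas above can be turned into small helpers. This is a minimal sketch; the function names are illustrative, `days_until_full` is an extra convenience that assumes linear growth, and units are whatever you feed in (for example GiB).

```python
import math


def daily_growth(current_value, past_value, days_between):
    """Growth rate: (current_value - past_value) / days_between."""
    return (current_value - past_value) / days_between


def projected_disk_usage(current_usage, growth_rate, years):
    """Disk: current_usage * (1 + growth_rate)^years, growth_rate as a fraction per year."""
    return current_usage * (1 + growth_rate) ** years


def days_until_full(current_value, capacity, growth_per_day):
    """Linear estimate of the days left before capacity is reached."""
    if growth_per_day <= 0:
        return math.inf  # flat or shrinking usage never fills the disk
    return max(0.0, (capacity - current_value) / growth_per_day)
```

For example, a volume that grew from 400 to 700 GiB over 30 days grows 10 GiB per day, so a 1000 GiB volume has about 30 days left under the linear assumption.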
# Tools# Google SRE capacity planning# Horizontal Pod Autoscaler metricskubectl autoscale deployment myapp --cpu-percent=80 --min=2 --max=10
# Vertical Pod Autoscalerkubectl apply -f - <<EOFapiVersion: autoscaling.k8s.io/v1kind: VerticalPodAutoscalermetadata: name: myapp-vpaspec: targetRef: apiVersion: "apps/v1" kind: Deployment name: myapp updatePolicy: updateMode: "Auto"EOFLinux Expert Level
Q1566: How do you design highly available systems?
Answer:
# HA architecture# Load balancer -> Web servers -> Database (primary + replica)# \-> Cache (Redis Sentinel)# \-> Message queue (Kafka/RabbitMQ cluster)
# Keepalived + HAProxy
# /etc/keepalived/keepalived.conf
# Define the check before the instance that tracks it
vrrp_script check_haproxy {
    script "pkill -0 haproxy"   # exits 0 while an haproxy process exists
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100

    virtual_ipaddress {
        192.168.1.100
    }

    track_script {
        check_haproxy
    }
}
# HAProxy backendbackend web balance roundrobin option httpchk http-check expect status 200 server web1 192.168.1.10:80 check inter 2000 fall 3 rise 2 server web2 192.168.1.11:80 check inter 2000 fall 3 rise 2 backup
# Database HA# See PostgreSQL replication earlier
# DNS failover# Route 53 health checksQ1567: How do you design scalable systems?
Answer:
# Horizontal scaling# Add more instances behind load balancer# Auto-scaling based on metrics
# Vertical scaling# Increase instance size# Requires downtime
# Database scaling# Read replicas# Sharding# Partitioning
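As a rough illustration of sharding, the sketch below routes a key to one of several database shards by hashing it. The shard names are hypothetical placeholders, and this is plain modulo hashing, not consistent hashing.

```python
import hashlib

# Hypothetical shard identifiers; in practice these would be connection strings.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(key, shards=SHARDS):
    """Pick a shard deterministically from a hash of the key."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Modulo hashing is simple but remaps most keys when the number of shards changes; consistent hashing or a lookup table is usually preferred when shards are added or removed often.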
# Cache scaling# Redis cluster moderedis-cli --cluster create 127.0.0.1:7001 127.0.0.1:7002 127.0.0.1:7003 \ 127.0.0.1:7004 127.0.0.1:7005 127.0.0.1:7006 --cluster-replicas 1
# Message queue scaling# Kafka# Partition across brokers# Replicate for redundancy
# CDN# CloudFront, Cloudflare# Cache static assets at edge
# Stateless application design# Store sessions in Redis# Store files in S3# Database for persistent dataQ1568: How do you design secure systems?
Answer:
# Defense in depth# 1. Network security# - Firewalls# - Network segmentation# - VPN
# 2. Application security# - Input validation# - Output encoding# - Parameterized queries# - Security headers
# 3. Data security# - Encryption at rest# - Encryption in transit# - Key management# - Backup encryption
# 4. Identity and access# - RBAC# - MFA# - Least privilege# - Regular access review
# 5. Monitoring# - SIEM# - IDS/IPS# - Vulnerability scanning# - Penetration testing
# Compliance# - GDPR, HIPAA, PCI-DSS# - Audit logging# - Data retention policiesQ1569: How do you implement immutable infrastructure?
Answer:
# Packer# Build immutable imagespacker build template.json
# No SSH access in production# Use Systems Manager Session Manager
# Cloud-init for configuration#cloud-configpackage_update: truepackages: - nginx
# Container-based deployment# Never modify running containers# Rebuild and redeploy
# Infrastructure as Code# Terraformterraform apply -var-file=prod.tfvars
# GitOps# ArgoCDargocd app sync myapp
# Blue-green deployments# Deploy to new environment# Switch traffic# Keep old environment for rollbackQ1570: How do you implement cost optimization?
Answer:
# Right-sizing# Use smaller instances# Monitor utilization
# Reserved instances# For steady-state workloads
# Spot instances# For batch jobs# With checkpointing
# Autoscaling# Scale down during off-hours
# Storage optimization# Use appropriate storage classes# Delete unused data# Implement lifecycle policies
# Network optimization# Use private subnets# Use VPC endpoints# Use CDN for static content
# Cost monitoring# AWS Cost Explorer# Budget alerts
# Tools# cloud-custodian# Filter and take action on resourcescustodian run -s output.yml policy.ymlLinux DevOps Advanced
Q1571: How do you implement CI/CD pipelines?
Answer:
stages: - build - test - security - deploy
variables: DOCKER_DRIVER: overlay2
build: stage: build image: docker:latest services: - docker:dind script: - docker build -t $IMAGE:$CI_COMMIT_SHA . - docker push $IMAGE:$CI_COMMIT_SHA
test: stage: test image: $IMAGE script: - npm test - npm run lint coverage: '/Coverage: \d+\.\d+%/'
security: stage: security image: aquasec/trivy:latest script: - trivy image --exit-code 0 --severity HIGH,CRITICAL $IMAGE allow_failure: true
deploy-staging: stage: deploy script: - kubectl set image deployment/myapp myapp=$IMAGE - kubectl rollout status deployment/myapp environment: name: staging
deploy-production: stage: deploy script: - kubectl set image deployment/myapp myapp=$IMAGE - kubectl rollout status deployment/myapp environment: name: production when: manual only: - mainQ1572: How do you implement infrastructure testing?
Answer:
# Infrastructure as Code testing# terraform validateterraform validateterraform plan -out=tfplan
# Terratest (Go)package test
import ( "testing" "github.com/gruntwork-io/terratest/modules/terraform")
func TestTerraform(t *testing.T) { terraformOptions := &terraform.Options{ TerraformDir: "../examples/basic", }
defer terraform.Destroy(t, terraformOptions) terraform.InitAndApply(t, terraformOptions)}
# InSpec# controls/server.rbcontrol 'server-01' do impact 1.0 title 'Server should be configured properly'
describe package('nginx') do it { should be_installed } end
describe service('nginx') do it { should be_running } it { should be_enabled } endend
# Runinspec exec profile/Q1573: How do you implement secret management?
Answer:
# HashiCorp Vault# Installvault server -config=config.hcl
# Enable secrets engine (KV v2, which the secret/data/... path in the pod annotation assumes)
vault secrets enable -path=secret kv-v2
# Write secretvault kv put secret/myapp/db password=secretpassword
# Read secretvault kv get secret/myapp/db
# Use with Kubernetes# Install Vault Agent Injectorhelm install vault hashicorp/vault \ --set "injector.enabled=true"
# Annotate pod# metadata:# annotations:# vault.hashicorp.com/agent-inject: "true"# vault.hashicorp.com/role: "myapp"# vault.hashicorp.com/agent-inject-secret-db: "secret/data/myapp/db"
# Use in application# Read from /vault/secrets/db file
# AWS Secrets Manageraws secretsmanager create-secret \ --name myapp/db \ --secret-string '{"username":"admin","password":"secret"}'
# Kubernetes Secretskubectl create secret generic myapp-secrets \ --from-literal=username=admin \ --from-literal=password=secretQ1574: How do you implement service mesh?
Answer:
# Istio installationistioctl install --set profile=demo
# Deploy applicationkubectl apply -f myapp.yaml
# Enable mutual TLSkubectl apply -f - <<EOFapiVersion: security.istio.io/v1beta1kind: PeerAuthenticationmetadata: name: defaultspec: mtls: mode: STRICTEOF
# Traffic managementapiVersion: networking.istio.io/v1beta1kind: VirtualServicemetadata: name: myappspec: hosts: - myapp http: - route: - destination: host: myapp subset: v1 weight: 90 - destination: host: myapp subset: v2 weight: 10
# Observability# Enable tracingistioctl install --set values.telemetry.enabled=true
# View dashboardsistioctl dashboard kialiQ1575: How do you implement edge computing?
Answer:
# K3s (lightweight Kubernetes)curl -sfL https://get.k3s.io | K3S_KUBECONFIG_MODE="644" sh -
# KubeEdge# Cloud nodehelm install cloudcore kubeedge/cloudcore --namespace kubeedge
# Edge node# Install edgecorewget https://github.com/kubeedge/kubeedge/releases/download/v1.12.0/kubeedge_1.12.0_linux_amd64.tar.gztar -xzf kubeedge_1.12.0_linux_amd64.tar.gz
# Run edgecoreedgecore --config=/etc/kubeedge/config/edgecore.yaml
# Deploy to edgekubectl apply -f deployment.yaml
# Use case: IoT# Collect sensor data at edge# Process locally# Send aggregated data to cloudLinux Expert Scenarios
Q1576: How do you handle production incidents?
Answer:
# Incident response workflow# 1. Detection# - Alerts from monitoring# - User reports
# 2. Triage# - Assess severity (SEV1-4)# - Identify impact# - Determine if customer-facing
# 3. Communication# - Create incident channel# - Update status page# - Notify stakeholders
# 4. Mitigation# - Stop bleeding (rollbacks, traffic shift)# - Apply fix
# 5. Resolution# - Verify fix# - Confirm recovery
# 6. Post-mortem# - Document timeline# - Identify root cause (5 whys)# - Action items
# Example incident# Database down# 1. Check status# systemctl status postgresql# 2. Attempt restart# systemctl restart postgresql# 3. If failed, promote replica# pg_ctl promote -D /var/lib/postgresql/data# 4. Verify# psql -c "SELECT 1"# 5. DocumentQ1577: How do you perform root cause analysis?
Answer:
# 5 Whys Analysis# Problem: API response time increased# Why 1: Database queries slow# Why 2: Missing index# Why 3: New feature added without proper schema review# Why 4: Code review didn't catch it# Why 5: Process doesn't require schema review
# Corrective Action: Implement schema review in CI/CD
# Tools for RCA# Logsjournalctl -u service -n 100
# Metrics# Compare before/aftersar -q
# Traces# Jaeger, Zipkin
# Dumps# Core files, heap dumps
# Timeline# Create incident timeline# 14:00 - Alert triggered# 14:05 - On-call acknowledged# 14:10 - Root cause identified# 14:15 - Fix deployed# 14:20 - Service recoveredQ1578: How do you optimize cloud costs?
Answer:
# Cost optimization strategies
# 1. Right-sizing instances
# Use CloudWatch metrics
aws ec2 describe-instance-types --instance-types t3.micro
# 2. Reserved instances# For predictable workloads
# 3. Spot instances# For fault-tolerant workloads
# 4. Autoscaling# Scale in when not needed
# 5. Storage lifecycle# Move cold data to Glacieraws s3 lsaws s3api put-bucket-lifecycle-configuration --bucket mybucket \ --lifecycle-configuration file://lifecycle.json
# 6. Delete unused resources# Find unattached volumesaws ec2 describe-volumes --filters Name=status,Values=available
# 7. Use managed services# RDS, Lambda instead of EC2
# 8. Budget alertsaws budgets create-budget \ --account-id 123456789012 \ --budget file://budget.jsonQ1579: How do you implement compliance?
Answer:
# Compliance frameworks# SOC 2, PCI-DSS, HIPAA, GDPR
# Audit logging
# Enable auditd
systemctl enable --now auditd
# Rules# /etc/audit/audit.rules-w /etc/passwd -p wa -k passwd_changes-w /etc/shadow -p wa -k shadow_changes-w /etc/sudoers -p wa -k sudoers_changes
# Review logsaureport -fausearch -k passwd_changes
# Vulnerability scanning# OpenVAS, Nessus, Qualys
# Penetration testing# Annual third-party pen tests
# Data encryption# At rest# LUKS, TDE
# In transit# TLS 1.2+
# Access reviews# Quarterly user access review
# Documentation# Policies and procedures# Evidence collection# Compliance reportsQ1580: How do you design disaster recovery?
Answer:
# DR strategies# RTO (Recovery Time Objective)# RPO (Recovery Point Objective)
# Strategy comparison# Backup & Restore# - RTO: Hours# - RPO: Days
# Pilot Light# - RTO: Minutes to hours# - RPO: Hours
# Warm Standby# - RTO: Minutes# - RPO: Minutes
# Multi-Region Active-Active# - RTO: Near zero# - RPO: Near zero
# Implementation# 1. Backup data# mysqldump --all-databases | aws s3 cp - s3://bucket/backup.sql
# 2. Replicate data# PostgreSQL streaming replication to DR region
# 3. Infrastructure as Code# terraform import# terraform apply
# 4. Regular DR testing# Quarterly DR tests# Document results
# 5. Runbook# Document recovery proceduresLinux Expert Scenarios
Q1581: How do you handle zero-downtime deployment?
Answer:
# Blue-green deployment# Deploy to green environment# Test green# Switch traffic# Monitor# If issues, rollback to blue
# Rolling deployment
# Update a few pods at a time (kubectl rolling-update was removed in kubectl 1.18)
kubectl set image deployment/myapp myapp=myapp:v2
kubectl rollout status deployment/myapp
# Canary deployment# Route 10% to new version# Monitor metrics# Gradually increase# Rollback if issues
# KubernetesapiVersion: argoproj.io/v1alpha1kind: Rolloutmetadata: name: myappspec: replicas: 10 strategy: canary: maxSurge: "25%" maxUnavailable: 0 steps: - setWeight: 10 - pause: {duration: 10m} - setWeight: 30 - pause: {duration: 10m} - setWeight: 50 - pause: {duration: 10m} - setWeight: 100
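The canary logic (shift a little traffic, watch metrics, increase or roll back) can be sketched independently of any tool. In this illustration `error_rate_at` is a hypothetical callback that would shift traffic to the given weight and return the observed canary error rate; the 1% threshold is an assumed value, not a recommendation.

```python
def run_canary(steps, error_rate_at, max_error_rate=0.01):
    """Walk through traffic weights, aborting on a bad error rate.

    Returns (final_weight, promoted). On abort the weight is 0, meaning
    all traffic goes back to the stable version.
    """
    current_weight = 0
    for weight in steps:
        current_weight = weight
        if error_rate_at(weight) > max_error_rate:
            return 0, False  # roll back
    return current_weight, True


# Usage with the same weights as the Rollout steps above
healthy = run_canary([10, 30, 50, 100], lambda weight: 0.001)
```

Controllers such as Argo Rollouts implement this loop for you; the sketch only shows the decision being made at each step.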
# Feature flags# See earlier questionQ1582: How do you handle database migrations?
Answer:
# Zero-downtime migrations# 1. Add new column (nullable)ALTER TABLE users ADD COLUMN new_field VARCHAR(255);
# 2. Write to both columns# Application code change
# 3. Backfill dataUPDATE users SET new_field = old_field;
# 4. Make new column NOT NULLALTER TABLE users MODIFY COLUMN new_field VARCHAR(255) NOT NULL;
# 5. Remove old columnALTER TABLE users DROP COLUMN old_field;
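On a large table, the single UPDATE in step 3 can hold locks for a long time. A common alternative is to backfill in small batches. The sketch below uses SQLite from the standard library so it runs anywhere; it assumes the table has an integer `id` primary key, which the original schema does not show, and the batch size is arbitrary.

```python
import sqlite3


def backfill_in_batches(conn, batch_size=1000):
    """Copy old_field into new_field a batch at a time; returns rows updated."""
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE users SET new_field = old_field "
            "WHERE id IN (SELECT id FROM users "
            "             WHERE new_field IS NULL AND old_field IS NOT NULL "
            "             LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # commit per batch so each transaction stays short
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

Rows whose `old_field` is NULL are skipped on purpose; otherwise they would match on every pass and the loop would never finish. On MySQL or PostgreSQL the same idea applies, usually with a short sleep between batches to limit replication lag.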
# For PostgreSQL# Use pg_online# Create index concurrentlyCREATE INDEX CONCURRENTLY idx_users_email ON users(email);
# For MySQL# Use pt-online-schema-changept-online-schema-change D=t,s=users --alter "ADD COLUMN new_field VARCHAR(255)" \ --execute
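On a large table the backfill step is usually run in small batches rather than as one UPDATE. A minimal sketch of that idea, using an in-memory SQLite database as a stand-in so it runs anywhere. The `backfill_in_batches` helper and the batch size are assumptions for illustration; a production database would need its own driver and tuning.

```python
import sqlite3


def backfill_in_batches(conn, batch_size=1000):
    """Copy old_field into new_field in id-ordered batches.

    Each batch commits separately so locks are held only briefly.
    Returns the number of rows processed.
    """
    processed = 0
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id FROM users WHERE id > ? AND new_field IS NULL "
            "ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return processed
        ids = [row[0] for row in rows]
        conn.execute(
            "UPDATE users SET new_field = old_field "
            "WHERE id BETWEEN ? AND ? AND new_field IS NULL",
            (ids[0], ids[-1]),
        )
        conn.commit()
        processed += len(ids)
        last_id = ids[-1]  # always advance, so NULL sources cannot loop forever


def demo():
    """Build a 2500-row table, backfill it, and report mismatches."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, old_field TEXT, new_field TEXT)"
    )
    conn.executemany(
        "INSERT INTO users (old_field) VALUES (?)",
        [(f"v{i}",) for i in range(2500)],
    )
    conn.commit()
    processed = backfill_in_batches(conn, batch_size=1000)
    mismatched = conn.execute(
        "SELECT COUNT(*) FROM users WHERE new_field IS NOT old_field"
    ).fetchone()[0]
    return processed, mismatched


if __name__ == "__main__":
    print(demo())
```

Because the loop only touches rows where `new_field IS NULL`, it can be stopped and restarted safely while dual writes continue.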
# Rollback plan
# Keep old column
# Dual write
# Test thoroughly

Q1583: How do you handle capacity emergencies?
Answer:

# Emergency response
# 1. Immediate mitigation
# Scale up
kubectl scale deployment myapp --replicas=20

# Add capacity
# In AWS
aws autoscaling set-desired-capacity \
  --auto-scaling-group-name my-asg \
  --desired-capacity 10

# 2. Identify root cause
# Check metrics
# Check logs
# Common issues
# - Traffic spike
# - Slow query
# - Memory leak

# 3. Short-term fix
# Clear cache (FLUSHALL drops every key in every database; a cold cache can add backend load)
redis-cli FLUSHALL

# Kill expensive queries
# PostgreSQL (active queries only, excluding the current session)
SELECT pg_terminate_backend(pid) FROM pg_stat_activity
WHERE state = 'active'
  AND pid <> pg_backend_pid()
  AND query_start < NOW() - INTERVAL '5 minutes';

# 4. Long-term fix
# Optimize code
# Add capacity
# Implement caching

Q1584: How do you handle security incidents?
Answer:

# Security incident response
# 1. Detection
# - SIEM alerts
# - IDS alerts
# - User reports

# 2. Containment
# Isolate affected systems
# iptables -I INPUT -s attacker_ip -j DROP
# iptables -I OUTPUT -d attacker_ip -j DROP

# 3. Investigation
# Collect evidence
# tcpdump -i eth0 -w capture.pcap
# Forensics

# 4. Eradication
# Remove malware
# Patch vulnerability
# Reset compromised credentials

# 5. Recovery
# Restore from clean backup
# Verify system integrity

# 6. Lessons learned
# Document incident
# Update security controls

# Tools
# - chkrootkit
# - rkhunter
# - ClamAV
# - OSSEC

Q1585: How do you handle data corruption?
Answer:

# Data corruption response
# 1. Identify corruption
# Check logs
# Verify checksums
# md5sum

# 2. Stop writes
# Read-only mount
# mount -o remount,ro /data

# 3. Restore from backup
# Find last good backup
# Restore
# mysql -u root -p mydb < backup.sql

# 4. Point-in-time recovery
# PostgreSQL (pg_restore cannot replay to a timestamp)
# Find the target time or transaction ID
# Restore a base backup, then set in postgresql.conf:
#   restore_command = 'cp /path/to/wal_archive/%f %p'
#   recovery_target_time = '2024-01-01 12:00:00'
# Create recovery.signal in the data directory and start PostgreSQL

# 5. Verify integrity
# Check application data
# Run database checks
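Checksum verification is easier to repeat when the expected digests live in a manifest recorded while the data was known to be good. A minimal sketch under that assumption. It uses SHA-256 instead of the md5sum mentioned above, and the `find_corrupted` helper and manifest layout are illustrative, not a standard tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(manifest):
    """manifest maps file path -> expected SHA-256 hex digest.

    Returns a sorted list of paths that are missing or whose digest differs.
    """
    bad = []
    for path, expected in manifest.items():
        if not Path(path).is_file() or sha256_of(path) != expected:
            bad.append(str(path))
    return sorted(bad)
```

An empty result means every listed file still matches its recorded digest; it says nothing about files that were never in the manifest.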
# 6. Prevention
# Enable checksums
# Regular backups
# Monitoring

Q1586: How do you handle network outages?
Answer:

# Network outage response
# 1. Verify outage
# ping gateway
# ping 8.8.8.8

# 2. Check interfaces
# ip link
# ip addr

# 3. Check DNS
# cat /etc/resolv.conf
# nslookup example.com

# 4. Check routes
# ip route

# 5. Recovery steps
# Reset network (Debian/Ubuntu with ifupdown; otherwise restart NetworkManager or systemd-networkd)
systemctl restart networking

# Or
# ip link set eth0 down
# ip link set eth0 up

# For DNS issues
# resolvectl flush-caches
# (older systemd: systemd-resolve --flush-caches)

# For cloud
# AWS
aws ec2 describe-instance-status --instance-ids i-xxx

# 6. Contact provider
# If not resolvable internally

Q1587: How do you handle performance degradation?
Answer:

# Performance troubleshooting
# 1. Identify symptoms
# Check metrics
# top
# iostat 1

# 2. Locate bottleneck
# CPU bound?
top
ps aux --sort=-%cpu

# Memory bound?
free -h
vmstat 1

# I/O bound?
iostat -xz 1

# Network bound?
iftop
nethogs

# 3. Fix
# CPU: Scale, optimize code
# Memory: Add RAM, fix leaks
# I/O: Use faster storage
# Network: Optimize queries

# 4. Verify
# Monitor metrics
# Compare before/after

Q1588: How do you handle authentication failures?
Answer:

# Authentication troubleshooting
# 1. Check logs
journalctl -u sshd | tail -50
tail -f /var/log/auth.log

# 2. Verify user exists
getent passwd username
id username

# 3. Check SSH configuration
# /etc/ssh/sshd_config
# PasswordAuthentication yes
# PubkeyAuthentication yes
# AllowUsers username

# 4. Test authentication
# SSH with debug
ssh -vvv user@host

# 5. Reset password
passwd username

# 6. Check PAM
# /etc/pam.d/sshd

# 7. For LDAP
# Check connectivity
ldapsearch -x -D "cn=admin,dc=example,dc=com" -W

# Check sssd
sssd -i -d 10

Q1589: How do you handle storage full?
Answer:

# Storage full response
# 1. Find large files
du -sh /*
du -sh /var/*
du -sh /var/log/*

# 2. Find large directories
du -ah / | sort -rh | head -20

# 3. Clean logs
journalctl --vacuum-size=100M
find /var/log -type f -mtime +30 -delete

# 4. Clean tmp (can break running processes that hold files there; prefer removing only old files)
rm -rf /tmp/*
rm -rf /var/tmp/*

# 5. Clean package cache
apt clean
yum clean all

# 6. Docker cleanup (removes all unused images, not only dangling ones)
docker system prune -a

# 7. Find deleted files still open
lsof +L1
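When `du | sort` output is too noisy, a short script that lists only the largest individual files can be easier to act on. A minimal sketch using only the standard library; `largest_files` is an illustrative helper name and the default of 20 results is arbitrary.

```python
import heapq
import os


def largest_files(root, count=20):
    """Return up to `count` (size_bytes, path) tuples under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue  # do not count symlink targets twice
                sizes.append((os.path.getsize(path), path))
            except OSError:
                continue  # file vanished or is unreadable
    return heapq.nlargest(count, sizes)


if __name__ == "__main__":
    for size, path in largest_files("/var/log", 10):
        print(f"{size / 1_048_576:8.1f} MiB  {path}")
```

This reports apparent file size; space held by deleted-but-open files still needs `lsof +L1`.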
# 8. Extend storage
# Add volume
# Add to LVM

Q1590: How do you handle kernel panic?
Answer:

# Kernel panic response
# 1. Verify panic
# Check logs (after a reboot, the previous boot's kernel log: journalctl -k -b -1)
dmesg | tail -100

# 2. Configure kdump
apt install kdump-tools

# 3. Analyze crash
# Dumps are written under /var/crash/
crash /usr/lib/debug/boot/vmlinux-$(uname -r) /var/crash/202401011200/vmcore

# 4. Common causes
# - Hardware failure (RAM, disk)
# - Driver issues
# - OOM
# - Kernel bugs

# 5. Fixes
# Update kernel
# Disable problematic driver
# Add RAM
# Fix OOM settings

# 6. Prevention
# Monitor resources
# Keep kernel updated
# Use hardware from compatibility list

Q1591: How do you implement infrastructure monitoring?
Answer:

# Infrastructure monitoring
# Prometheus + Grafana
# Node exporter
node_exporter --collector.filesystem.mount-points-exclude="^/(sys|proc|run)"

# Custom metrics
# Python client
from prometheus_client import Counter

requests_total = Counter('app_requests_total', 'Total requests')

# Alert rules
- alert: HighCPU
  expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
  for: 5m

- alert: HighMemory
  expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 90

# Dashboards
# Import from Grafana.com

# Logs
# ELK Stack
# Loki + Grafana

Q1592: How do you implement application monitoring?
Answer:

# APM (Application Performance Monitoring)
# Jaeger
# Python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("operation") as span:
    span.set_attribute("key", "value")

# Prometheus metrics
from prometheus_client import Counter, Histogram, Gauge

request_count = Counter('http_requests_total', 'Total HTTP requests')
# prometheus_client requires a documentation string as the second argument
request_duration = Histogram('http_request_duration_seconds', 'HTTP request duration in seconds')
active_users = Gauge('active_users', 'Number of active users')

# Health checks
# Kubernetes liveness/readiness probes
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

Q1593: How do you implement log analysis?
Answer:

# Log analysis
# ELK Stack
# Elasticsearch
# Logstash
# Kibana

# Loki
# Grafana + Loki

# Structured logging
# JSON format
import json
import logging

class JSONFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            'timestamp': self.formatTime(record),
            'level': record.levelname,
            'message': record.getMessage(),
            'module': record.module
        })

# Log levels
# DEBUG - Detailed info
# INFO - Confirmation
# WARNING - Something unexpected
# ERROR - Serious problem
# CRITICAL - Very serious problem

# Analysis queries
# Find errors
grep -i error /var/log/app.log

# Count by hour (assumes lines start with "YYYY-MM-DD HH:MM:SS")
awk '{print substr($2, 1, 2)}' /var/log/app.log | sort | uniq -c
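The per-hour count is often more useful when restricted to error lines. A minimal sketch of that query in Python, assuming each log line starts with `YYYY-MM-DD HH:MM:SS LEVEL`; that layout and the `errors_per_hour` name are assumptions for illustration, so adjust the parsing to the real log format.

```python
from collections import Counter
from datetime import datetime


def errors_per_hour(lines):
    """Count ERROR and CRITICAL lines per hour.

    Lines that do not match the assumed layout are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.split(maxsplit=3)
        if len(parts) < 3 or parts[2] not in ("ERROR", "CRITICAL"):
            continue
        try:
            stamp = datetime.strptime(f"{parts[0]} {parts[1]}", "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # malformed timestamp
        counts[stamp.strftime("%Y-%m-%d %H:00")] += 1
    return dict(counts)


if __name__ == "__main__":
    with open("/var/log/app.log", encoding="utf-8", errors="replace") as fh:
        for hour, count in sorted(errors_per_hour(fh).items()):
            print(hour, count)
```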
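# Slow requests (assumes $request_time is the last field of the log format;
# in the default combined format, field 9 is the status code)
awk '$NF > 5 {print}' /var/log/nginx/access.log

Q1594: How do you implement alerting?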
Answer:

# Alerting
# Prometheus + AlertManager
# alertmanager.yaml
route:
  group_by: ['alertname']
  receiver: 'team'
  group_wait: 10s
  group_interval: 10s

receivers:
- name: 'team'
  email_configs:
  - to: 'team@example.com'
  slack_configs:
  - api_url: 'https://hooks.slack.com/...'
    channel: '#alerts'

# PagerDuty integration
- name: 'pagerduty'
  pagerduty_configs:
  - service_key: 'KEY'

# Best practices
# 1. Alert on symptoms, not causes
# 2. Set appropriate thresholds
# 3. Avoid alert fatigue
# 4. Have runbooks
# 5. Test alerts regularly

Q1595: How do you implement backup verification?
Answer:

# Backup verification
# 1. Test restoration
# Restore to test environment
mysql -u root -p test < backup.sql
psql -U postgres test < backup.sql

# 2. Automated verification
#!/bin/bash
BACKUP_FILE=$1

# Verify backup file exists
if [ ! -f "$BACKUP_FILE" ]; then
  echo "Backup file not found"
  exit 1
fi

# Verify file size (GNU stat; on BSD/macOS use: stat -f%z)
SIZE=$(stat -c%s "$BACKUP_FILE")
if [ "$SIZE" -lt 1000 ]; then
  echo "Backup file too small"
  exit 1
fi

# Verify file integrity
if [[ "$BACKUP_FILE" == *.gz ]]; then
  gzip -t "$BACKUP_FILE"
elif [[ "$BACKUP_FILE" == *.sql ]]; then
  head -1 "$BACKUP_FILE" | grep -q "MySQL"
fi
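The same existence, size, and integrity checks can be written portably in Python, which avoids the GNU/BSD `stat` difference. A minimal sketch; the `verify_backup` name and the 1000-byte minimum mirror the script above and are otherwise arbitrary. It checks that the archive is readable, not that the database inside can be restored.

```python
import gzip
import os
import zlib


def verify_backup(path, min_bytes=1000):
    """Return (ok, reason) after checking existence, size, and gzip integrity."""
    if not os.path.isfile(path):
        return False, "not found"
    if os.path.getsize(path) < min_bytes:
        return False, "too small"
    if path.endswith(".gz"):
        try:
            # Read the whole stream; truncation or bad data raises an error.
            with gzip.open(path, "rb") as fh:
                while fh.read(1 << 20):
                    pass
        except (OSError, EOFError, zlib.error):
            return False, "corrupt gzip"
    return True, "ok"
```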
# Verify database can be restored
# (Run in isolated environment)
# Report status

Linux Expert Advanced
Q1596: How do you design multi-region architecture?
Answer:

# Multi-region design
# DNS failover
# Route 53 health checks (--caller-reference is required and must be unique)
aws route53 create-health-check \
  --caller-reference "$(date +%s)" \
  --health-check-config '{"Type":"HTTPS","FullyQualifiedDomainName":"example.com","Port":443,"ResourcePath":"/health"}'

# Database replication
# PostgreSQL
# Primary in us-east-1
# Replica in us-west-2

# Object storage
# S3 cross-region replication
aws s3api put-bucket-replication \
  --bucket source-bucket \
  --replication-configuration file://replication.json

# Cache
# Redis Global
aws elasticache create-global-replication-group \
  --global-replication-group-id-suffix my-global \
  --primary-replication-group-id primary-id

# CDN
# CloudFront
aws cloudfront create-distribution \
  --origin-domain-name mybucket.s3.amazonaws.com

# Traffic management
# Global Accelerator (the API is served from us-west-2; the name is an example)
aws globalaccelerator create-accelerator \
  --name my-accelerator --region us-west-2

Q1597: How do you implement zero trust?
Answer:

# Zero trust architecture
# 1. Identity verification
# MFA everywhere
# Conditional access policies

# 2. Network segmentation
# Micro-segmentation
# Private links
# Service mesh

# 3. Device trust
# Endpoint detection
# Mobile Device Management

# 4. Application security
# OAuth 2.0
# JWT validation

# 5. Data protection
# Encryption everywhere

# Implementation
# Kubernetes network policies
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
EOF

# Service mesh mTLS
# Istio (there is no "strict" install profile; enforce mTLS with PeerAuthentication)
istioctl install --set profile=default
kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
EOF

# BeyondCorp
# Access proxy
# No VPN needed

Q1598: How do you implement chaos engineering?
Answer:

# Chaos engineering
# 1. Define steady state
# 2. Hypothesize
# 3. Run experiment
# 4. Observe
# 5. Fix

# Tools
# Chaos Monkey (Netflix)
# Litmus
# Chaos Mesh

# Example: Kill random pod
# (pod-kill deletes the pod; mode "one" picks one matching pod at random;
# the selector below is an example and must match your workload)
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: random-pod-kill
spec:
  action: pod-kill
  mode: one
  selector:
    namespaces:
      - default

# Example: Network delay
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: network-latency
spec:
  action: delay
  mode: one
  selector:
    namespaces:
      - default
  duration: 60s
  delay:
    latency: 100ms

# Runbook
# Document expected behavior
# Monitor during experiment
# Have rollback plan

Q1599: How do you implement GitOps?
Answer:

# GitOps
# 1. Store all configs in Git
# 2. Use CI/CD to apply changes
# 3. Automated drift detection

# ArgoCD
# Application definition
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
spec:
  project: default
  source:
    repoURL: https://github.com/org/repo.git
    targetRevision: HEAD
    path: k8s/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

# Flux
# Install
flux install

# Create source
flux create source git myapp \
  --url=https://github.com/org/repo \
  --branch=main

# Create kustomization
flux create kustomization myapp \
  --source=myapp \
  --path=./k8s/production

Q1600: How do you implement cost governance?
Answer:

# Cost governance
# 1. Tagging strategy
# All resources must have tags
# - Team: team-name
# - Project: project-name
# - CostCenter: cost-center

# 2. Budgets
# Set budgets per team/project
aws budgets create-budget \
  --account-id 123456789012 \
  --budget file://budget.json

# 3. Rightsizing
# Use AWS Compute Optimizer
aws compute-optimizer get-ec2-instance-recommendations

# 4. Reserved capacity
# For steady workloads
# Purchase reserved instances

# 5. Use spot
# For fault-tolerant workloads

# 6. Delete unused resources
# Find unattached volumes
aws ec2 describe-volumes --filters Name=status,Values=available

# 7. Regular review
# Weekly cost review meetings
# Track spend trends

# 8. Showback/Chargeback
# Report costs by team

Q1601: How do you implement compliance automation?
Answer:

# Compliance automation
# Open Policy Agent (OPA)
# Gatekeeper
# Prevents non-compliant resources

# Policy example
package kubernetes.admission

deny[msg] {
  input.request.kind.kind == "Deployment"
  input.request.object.spec.replicas > 10
  msg = "Cannot have more than 10 replicas"
}

# Install Gatekeeper (pin a release tag instead of master in production)
kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/deploy/gatekeeper.yaml

# InSpec
# Compliance as code
# controls/nginx.rb
control 'nginx-01' do
  impact 1.0
  title 'Nginx should be configured securely'

  describe service('nginx') do
    it { should be_running }
  end

  # The hardened config should contain "server_tokens off;"
  describe file('/etc/nginx/nginx.conf') do
    its('content') { should match /server_tokens off;/ }
  end
end

# Run
inspec exec compliance/

Q1602: How do you implement disaster recovery automation?
Answer:

# DR automation
# 1. Backup automation
#!/bin/bash
# Automated backup
BACKUP_DATE=$(date +%Y%m%d)

# Database backup (shell redirection cannot write to s3://; stream through aws s3 cp)
# For unattended runs, keep credentials in ~/.my.cnf instead of using -p
mysqldump -u root -p mydb | gzip | aws s3 cp - s3://bucket/backup-$BACKUP_DATE.sql.gz

# File backup
tar -czf - /data | aws s3 cp - s3://bucket/data-$BACKUP_DATE.tar.gz

# Retention (object names are in column 4 of "aws s3 ls"; an S3 lifecycle rule is a simpler alternative)
CUTOFF=$(date -d '30 days ago' +%Y%m%d)
aws s3 ls s3://bucket/ | awk '{print $4}' | while read -r key; do
  stamp=$(echo "$key" | grep -oE '[0-9]{8}' | head -1)
  if [[ -n "$stamp" && "$stamp" < "$CUTOFF" ]]; then
    aws s3 rm "s3://bucket/$key"
  fi
done
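Retention logic is worth testing in isolation before it is allowed to delete anything. A minimal sketch of the date comparison alone, assuming object names embed a `YYYYMMDD` stamp as in the backup script above. The `expired_keys` helper is illustrative and performs no S3 calls.

```python
import re
from datetime import date, timedelta

STAMP = re.compile(r"(\d{8})")


def expired_keys(keys, today, keep_days=30):
    """Return keys whose embedded YYYYMMDD stamp is older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for key in keys:
        match = STAMP.search(key)
        if not match:
            continue  # never delete objects that cannot be dated
        text = match.group(1)
        try:
            stamp = date(int(text[:4]), int(text[4:6]), int(text[6:]))
        except ValueError:
            continue  # eight digits that are not a real date
        if stamp < cutoff:
            expired.append(key)
    return expired
```

Passing `today` in explicitly keeps the function deterministic, so the boundary day can be checked in a unit test.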
# 2. DR playbook
# Documented runbooks
# Regular testing

# 3. Automated failover
# DNS failover
# Route 53 health checks + failover record
aws route53 change-resource-record-sets \
  --hosted-zone-id Z1234567890 \
  --change-batch file://failover.json

# Database failover
# Automatic replica promotion
# Connection string update

Q1603: How do you implement capacity management?
Answer:

# Capacity management
# 1. Monitor utilization
# CPU, Memory, Storage, Network

# 2. Trend analysis
# Weekly reviews
# Growth rate calculation

# 3. Forecasting
# Use ML
# aws ce get-cost-forecast   (forecasts spend, not resource utilization)
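Before reaching for ML, a straight-line fit over recent usage samples often gives a usable first estimate. A minimal sketch of a least-squares forecast for "days until full"; the `days_until_full` name and the sample layout are assumptions, and a linear trend will mislead when growth is seasonal or bursty.

```python
def days_until_full(samples, capacity):
    """Estimate days from the last sample until usage reaches capacity.

    samples: list of (day_number, used_units) observations.
    Returns None when usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one day")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    full_day = (capacity - intercept) / slope
    return max(0.0, full_day - samples[-1][0])


if __name__ == "__main__":
    usage_gb = [(0, 100), (1, 110), (2, 120), (3, 130)]
    print(days_until_full(usage_gb, capacity=200))
```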
# 4. Planning
# Add capacity before hitting limits

# 5. Optimization
# Right-size instances
# Use savings plans

# Kubernetes
# Vertical Pod Autoscaler
kubectl apply -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"
EOF

# Horizontal Pod Autoscaler
kubectl autoscale deployment myapp \
  --cpu-percent=80 --min=2 --max=10

Q1604: How do you implement reliability engineering?
Answer:

# Reliability engineering
# SRE principles
# 1. SLOs (Service Level Objectives)
# - Availability: 99.9%
# - Latency: p99 < 200ms

# 2. Error budgets
# 100% - SLO = error budget
# If budget exhausted, freeze features
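The error-budget arithmetic is simple enough to check by hand: a 99.9% availability SLO over a 30-day window leaves 0.1% of 43,200 minutes, about 43.2 minutes of allowed downtime. A minimal sketch of that calculation; the function names and the 30-day default window are assumptions for illustration.

```python
def error_budget_minutes(slo_percent, window_days=30):
    """Allowed downtime in minutes for an availability SLO over a window."""
    if not 0 < slo_percent < 100:
        raise ValueError("slo_percent must be between 0 and 100, exclusive")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def budget_remaining(slo_percent, downtime_minutes, window_days=30):
    """Fraction of the error budget still unspent (negative when overspent)."""
    budget = error_budget_minutes(slo_percent, window_days)
    return 1 - downtime_minutes / budget


if __name__ == "__main__":
    print(round(error_budget_minutes(99.9), 1))
    print(round(budget_remaining(99.9, 21.6), 2))
```

A negative result from `budget_remaining` is the signal for the feature freeze mentioned above.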
# 3. Toil reduction
# Automate manual tasks

# 4. Post-mortems
# Blameless
# Focus on process improvement

# 5. Releases
# Canary deployments
# Feature flags

# 6. Circuit breakers
# See earlier

# 7. Bulkheads
# Isolate failures

# 8. Self-healing
# Restart failed pods
# Replace unhealthy nodes

Q1605: How do you implement SRE practices?
Answer:

# SRE practices
# Error budgets
# https://sre.google/sre-book/availability-table/

# Toil management
# Identify
# Quantify
# Automate
# Eliminate

# Observability
# Metrics
# Logs
# Traces

# Incident management
# On-call rotation
# Runbooks
# Post-mortems

# Change management
# Canary releases
# Gradual rollouts

# SRE tools
# Prometheus
# Grafana
# Jaeger
# Loki

# On-call
# PagerDuty
# OpsGenie

# Automation
# Ansible
# Terraform
# Kubernetes

Q1606: How do you optimize Linux for cloud?
Answer:

# Cloud-optimized Linux
# Ubuntu Pro for AWS
# AWS-optimized kernel
# FIPS compliance
# Livepatch

# Cloud-specific optimizations
# Use instance store for temp data
# Use EBS for persistent data

# Network optimization
# ENA (Elastic Network Adapter)
# Use enhanced networking

# Storage optimization
# Use NVMe for high I/O
# Use EBS gp3 for balance

# CloudWatch Agent
# Install
# /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

# Configure
# /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/metrics.json
{
  "metrics": {
    "namespace": "CustomNamespace",
    "metrics_collected": {
      "cpu": {
        "measurement": ["cpu_usage_idle"]
      },
      "mem": {
        "measurement": ["mem_used_percent"]
      }
    }
  }
}

Q1607: How do you implement FinOps?
Answer:

# FinOps
# Cloud financial management

# 1. Visibility
# Tag all resources
# Use cost explorer

# 2. Optimization
# Right-sizing
# Reservations
# Spot instances

# 3. Accountability
# Showback to teams
# Budgets

# Tools
# AWS Cost Explorer
# GCP Cloud Billing
# Azure Cost Management

# FinOps workflow
# 1. Inform
# Show costs by team
# Dashboards

# 2. Optimize
# Right-size resources
# Use savings plans

# 3. Operate
# Monitor daily spend
# Alerts

# Automation
# List running instances with launch time, as a starting point for
# finding idle resources (combine with utilization metrics)
aws ec2 describe-instances --filters "Name=instance-state-name,Values=running" \
  --query 'Reservations[*].Instances[*].[InstanceId,Tags[?Key==`Name`].Value|[0],LaunchTime]' \
  --output table

Q1608: How do you implement platform engineering?
Answer:

# Platform engineering
# Internal Developer Platform (IDP)
# Self-service

# Components
# 1. CI/CD pipelines
# GitHub Actions
# GitLab CI

# 2. Service catalog
# Backstage
# Port

# 3. Infrastructure templates
# Terraform modules
# Helm charts

# 4. Observability
# Unified dashboards

# 5. Security
# Policy enforcement

# Implementation
# Platform team builds tools
# Developers consume

# Benefits
# Faster deployments
# Consistency
# Security
# Reduced cognitive load

# Backstage
# Create catalog
# Service templates
# Documentation

Q1609: How do you implement developer experience?
Answer:

# Developer experience
# 1. Local development
# Docker Compose
# localstack

# 2. Documentation
# OpenAPI specs
# Swagger UI

# 3. IDE integration
# LSP servers
# Debugging

# 4. Testing
# Fast feedback
# Unit tests
# Integration tests

# 5. Deployment
# Simple commands
# kubectl
# ArgoCD

# Example: Developer workflow
# 1. Clone repo
# 2. Make changes
# 3. Run tests locally
# 4. Push to branch
# 5. CI runs tests
# 6. Merge to main
# 7. CD deploys

# Self-service
# Create environment
# Deploy app
# View logs
# Scale application

Q1610: How do you implement cloud security?
Answer:

# Cloud security
# Shared responsibility model

# Identity
# IAM with least privilege
# MFA everywhere

# Network
# VPC with private subnets
# Security groups
# NACLs
# WAF

# Data
# Encryption at rest
# Encryption in transit
# Key management

# Compliance
# Regular audits
# Vulnerability scanning
# Penetration testing

# Tools
# AWS GuardDuty
# AWS Config
# AWS Security Hub

# Example: AWS security
# Enable CloudTrail (a trail created from the CLI does not log until started)
aws cloudtrail create-trail --name my-trail \
  --s3-bucket-name mybucket
aws cloudtrail start-logging --name my-trail

# Enable GuardDuty
aws guardduty create-detector --enable

# Enable Security Hub in the current account
aws securityhub enable-security-hub

# Delegate an organization-wide Security Hub administrator
aws securityhub enable-organization-admin-account \
  --admin-account-id 123456789012

Q1611: How do you implement Kubernetes security?
Answer:

# Kubernetes security
# 1. RBAC
# Least privilege (a rolebinding needs a name; read-pods is an example)
kubectl create role pod-reader --verb=get,list --resource=pods
kubectl create rolebinding read-pods --role=pod-reader --user=dev

# 2. Network policies
# Default deny
kubectl apply -f network-policy.yaml

# 3. Pod security
# Pod security standards
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted

# 4. Secrets management
# Use Vault or AWS Secrets Manager
kubectl create secret generic mysecret \
  --from-literal=key=value

# 5. Image scanning
trivy image myimage:latest

# 6. Runtime security
# Falco
falco -r rules/myrules.yaml

# 7. API server security
# Disable anonymous auth
# Enable RBAC
# Use TLS

Q1612: How do you implement data protection?
Answer:

# Data protection
# 1. Classification
# Public, Internal, Confidential, Restricted

# 2. Encryption
# At rest
cryptsetup luksFormat /dev/sdb1

# In transit
# TLS 1.2+

# 3. Access control
# IAM policies
# Database permissions

# 4. Backup
# Regular backups
# Test restoration
# Offsite backup

# 5. Monitoring
# Audit logs
# Alerts on suspicious access

# 6. Data loss prevention
# Block sensitive data exfiltration

# Tools
# AWS Macie
# GCP DLP
# Azure Purview

Q1613: How do you implement supply chain security?
Answer:

# Supply chain security
# 1. Dependency scanning
# Snyk
# Dependabot

# 2. Container scanning
# Trivy
# Clair

# 3. SBOM (Software Bill of Materials)
# Generate SBOM
syft myimage:latest

# Sign artifacts
# Cosign
cosign sign myimage:latest

# Verify
cosign verify myimage:latest

# 4. SLSA compliance
# Build provenance
# GitHub Actions
# Tekton

# 5. Secure build pipeline
# No external dependencies at build time
# Use pinned versions
# Scan for secrets

Q1614: How do you implement incident management?
Answer:

# Incident management
# 1. Detection
# Monitoring alerts
# User reports

# 2. Response
# Acknowledge
# Assess severity
# Mitigate

# 3. Communication
# Status page
# Stakeholder updates

# 4. Resolution
# Fix root cause
# Verify recovery

# 5. Post-incident
# Blameless post-mortem
# Action items

# Tools
# PagerDuty
# OpsGenie
# VictorOps

# Runbook example
# Runbook: Database Connection Issues
# 1. Check database status
# systemctl status postgresql
# 2. Check connections
# psql -c "SELECT count(*) FROM pg_stat_activity"
# 3. Restart if needed
# systemctl restart postgresql

Q1615: How do you implement change management?
Answer:

# Change management
# 1. Request
# JIRA ticket
# RFC (Request for Change)

# 2. Review
# Technical review
# Security review

# 3. Approval
# Manager approval
# CAB (Change Advisory Board)

# 4. Implementation
# Schedule change window
# Implement change

# 5. Verification
# Test in staging
# Monitor in production

# 6. Documentation
# Update runbooks
# Document lessons learned

# 7. Emergency changes
# Expedited process
# Post-implementation review

# Tools
# ServiceNow
# Jira Service Management
# GitHub PRs

Linux Expert Interview Questions
Q1616: How do you design a highly available web application?
Answer:

# Architecture components
# 1. Load balancer (HAProxy/ALB)
# 2. Web servers (multiple)
# 3. Application servers (multiple)
# 4. Database (primary + replica)
# 5. Cache (Redis Sentinel/Cluster)
# 6. Message queue (Kafka cluster)
# 7. CDN for static content
# 8. Object storage (S3)

# Implementation
# Multi-AZ deployment
# Auto-scaling groups
# Health checks
# Graceful degradation

# DNS
# Route 53 with health checks

# Database
# PostgreSQL with streaming replication

# Caching
# Redis with Sentinel or Cluster

# Monitoring
# Comprehensive observability

# DR
# Multi-region deployment

Q1617: How do you troubleshoot a slow database?
Answer:

# Database troubleshooting
# 1. Check system resources
# CPU, Memory, I/O

# 2. Check database stats
# PostgreSQL
# pg_stat_activity
# pg_stat_statements

# MySQL
# SHOW PROCESSLIST;
# SHOW STATUS;

# 3. Check slow queries
# PostgreSQL
# pg_stat_statements
# EXPLAIN ANALYZE

# MySQL
# Slow query log (slow_query_log = ON)
# SHOW PROCESSLIST
# EXPLAIN

# 4. Check indexes
# PostgreSQL
# \d table_name

# MySQL
# SHOW INDEX FROM table

# 5. Fixes
# Add indexes
# Optimize queries
# Tune configuration
# Scale horizontally
# Add read replicas

Q1618: How do you design a backup strategy?
Answer:

# Backup strategy
# 1. RPO/RTO definition
# Recovery Point Objective
# Recovery Time Objective

# 2. Backup types
# Full
# Incremental
# Differential

# 3. Frequency
# Full: Weekly
# Incremental: Daily
# Transaction logs: Every 15 minutes

# 4. Retention
# Daily: 30 days
# Weekly: 12 weeks
# Monthly: 12 months
# Yearly: 7 years

# 5. Testing
# Monthly restoration tests
# Document procedures

# 6. Offsite
# Cross-region replication
# Different cloud provider

# 7. Automation
# Cron jobs
# CI/CD pipelines

Q1619: How do you secure a Linux system?
Answer:

# Linux security
# 1. Updates
# Regular patching

# 2. Firewall
# iptables/firewalld

# 3. SELinux/AppArmor
# Enable and configure

# 4. Users
# Disable root login
# SSH keys only
# Strong passwords

# 5. Services
# Disable unused services

# 6. Network
# Harden kernel parameters
# Disable IP forwarding
# Rate limiting

# 7. Monitoring
# Audit logging
# IDS

# 8. Encryption
# Full disk encryption
# TLS everywhere

Q1620: How do you design a monitoring system?
Answer:

# Monitoring system design
# 1. Metrics
# Prometheus
# Node exporter
# Application metrics

# 2. Logs
# ELK Stack or Loki

# 3. Traces
# Jaeger or Zipkin

# 4. Alerting
# Prometheus AlertManager
# PagerDuty integration

# 5. Dashboards
# Grafana

# 6. SLOs
# Define error budgets

# 7. Runbooks
# Document responses

# 8. On-call
# Rotation schedule

Q1621: How do you optimize Linux performance?
Answer:

# Linux optimization
# 1. CPU
# Tune scheduler
# Process affinity
# Priority adjustment

# 2. Memory
# Swappiness
# Cache tuning
# Huge pages

# 3. I/O
# I/O scheduler
# Filesystem choice
# Mount options
# SSD optimization

# 4. Network
# Buffer sizes
# TCP tuning
# Offloading

# 5. Kernel
# Update regularly
# Tune parameters

# 6. Applications
# Profiling
# Optimization

# Tools
# perf
# sysbench
# fio
# iperf

Q1622: How do you design a disaster recovery plan?
Answer:

# DR planning
# 1. Risk assessment
# Identify critical systems
# RTO/RPO requirements

# 2. Strategy
# Backup & Restore
# Pilot Light
# Warm Standby
# Multi-region

# 3. Implementation
# Automated backups
# Replication
# Infrastructure as Code

# 4. Testing
# Regular DR tests
# Document results

# 5. Documentation
# Runbooks
# Contact list

# 6. Communication
# Stakeholder notification
# Status updates

Q1623: How do you implement zero-downtime deployments?
Answer:

# Zero-downtime deployment
# 1. Load balancer
# Health checks
# Graceful removal

# 2. Application
# Signal handling
# Graceful shutdown

# 3. Database
# Schema migrations
# Backward compatibility

# 4. Strategies
# Rolling update
# Blue-green
# Canary
# Feature flags

# 5. Rollback plan
# Quick rollback capability

# 6. Testing
# Load testing
# Chaos engineering

Q1624: How do you handle capacity planning?
Answer:

# Capacity planning
# 1. Current state
# Measure utilization

# 2. Trends
# Analyze growth

# 3. Forecasting
# Predict future needs

# 4. Planning
# Add capacity proactively

# 5. Optimization
# Right-size resources
# Use automation

# Metrics
# CPU
# Memory
# Disk
# Network
# Application-specific

# Tools
# Prometheus
# Grafana
# AWS Compute Optimizer
# Azure Advisor

Q1625: How do you implement compliance?
Answer:

# Compliance implementation
# 1. Framework
# SOC 2, PCI-DSS, HIPAA, GDPR

# 2. Controls
# Access control
# Encryption
# Monitoring
# Auditing

# 3. Automation
# Policy as Code
# OPA/Gatekeeper

# 4. Evidence
# Automated collection
# Documentation

# 5. Training
# Security awareness

# 6. Testing
# Vulnerability scans
# Penetration tests

# 7. Remediation
# Track findings
# Fix issues

Q1626: How do you design for scale?
Answer:

# Designing for scale
# 1. Horizontal scaling
# Stateless applications
# Load balancers
# Auto-scaling

# 2. Database scaling
# Read replicas
# Sharding
# Partitioning
# Caching

# 3. Caching
# Multi-layer
# Redis/Memcached

# 4. Asynchronous
# Message queues
# Event-driven

# 5. CDN
# Static content

# 6. Optimization
# Profiling
# Database tuning

# 7. Monitoring
# Early detection

Q1627: How do you implement observability?
Answer:

# Observability
# 1. Metrics
# Prometheus
# Custom metrics

# 2. Logs
# Structured logging
# ELK/Loki

# 3. Traces
# Distributed tracing

# 4. Correlation
# Trace IDs
# Request IDs

# 5. Alerting
# Based on SLOs

# 6. Dashboards
# Service overview
# Troubleshooting

# 7. Post-mortems
# Blameless analysis

# Implementation
# OpenTelemetry
# Vendor-neutral; exports to many backends

Q1628: How do you secure containerized applications?
Answer:

# Container security
# 1. Images
# Minimal base
# No secrets in images
# Scan for vulnerabilities

# 2. Runtime
# Non-root user
# Read-only root
# Resource limits

# 3. Network
# Network policies
# Service mesh

# 4. Orchestrator
# RBAC
# Pod Security Standards (PodSecurityPolicy was removed in Kubernetes 1.25)

# 5. Secrets
# Use secrets manager
# Avoid passing secrets as env vars

# Tools
# Trivy
# Falco
# OPA

Q1629: How do you implement infrastructure as code?
Answer:

# Infrastructure as Code
# 1. Version control
# Git

# 2. Modules
# Reusable components

# 3. State management
# Remote state
# State locking

# 4. Testing
# Validate
# Plan

# 5. CI/CD
# Automated deployment

# 6. Drift detection
# Detect changes

# Tools
# Terraform
# Pulumi
# CloudFormation
# Ansible

Q1630: How do you manage secrets in CI/CD?
Answer:

# Secrets in CI/CD
# 1. Never commit secrets

# 2. Use secrets management
# HashiCorp Vault
# AWS Secrets Manager
# Azure Key Vault

# 3. Environment variables
# Inject at runtime

# 4. CI/CD integration
# GitHub Secrets
# GitLab CI variables

# 5. Rotation
# Auto-rotate secrets

# 6. Audit
# Log access

Q1631: How do you design a secure network?
Answer:

# Secure network design
# 1. Segmentation
# DMZ
# Internal
# Database

# 2. Firewall
# Whitelist approach
# Default deny

# 3. Encryption
# TLS everywhere
# VPN for access

# 4. Monitoring
# IDS/IPS
# NetFlow

# 5. DDoS protection
# CDN
# WAF
# Rate limiting

Q1632: How do you handle database failover?
Answer:

# Database failover
# 1. Automatic detection
# Health checks

# 2. Failover process
# Promote replica
# Update DNS

# 3. Application handling
# Connection retry
# Circuit breakers
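Connection retry during a failover works best with exponential backoff and jitter, so that clients do not all reconnect to the new primary at the same instant. A minimal sketch of that pattern; `call_with_retry` and its default delays are illustrative assumptions, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import random
import time


def call_with_retry(operation, attempts=5, base_delay=0.5, max_delay=8.0,
                    sleep=time.sleep, retry_on=(ConnectionError,)):
    """Call operation(), retrying on connection errors.

    Waits a random time up to base_delay * 2**attempt (capped at max_delay)
    between tries, and re-raises the last error when attempts run out.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))  # full jitter
```

A circuit breaker would sit one level above this, refusing calls outright once retries keep failing.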
# 4. Monitoring
# Alert on failover

# 5. Testing
# Regular drills

Q1633: How do you implement caching?
Answer:

# Caching strategy
# 1. CDN
# Static assets

# 2. Application cache
# Redis
# Memcached

# 3. Database cache
# Query cache
# Buffer pool

# 4. Browser cache
# Headers
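The application-cache layer is usually a cache-aside read with time-based expiry. A minimal in-process sketch of that behavior; the `TTLCache` class is illustrative and stands in for Redis or Memcached, and the injectable `clock` is there only to make expiry testable.

```python
import time


class TTLCache:
    """In-process cache whose entries expire ttl seconds after being set."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._data = {}

    def set(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._data[key]  # expired: drop it and report a miss
            return default
        return value

    def get_or_load(self, key, loader):
        """Cache-aside read: return the cached value, or load and store it."""
        missing = object()
        value = self.get(key, missing)
        if value is missing:
            value = loader(key)
            self.set(key, value)
        return value
```

Shorter TTLs reduce staleness at the cost of more loads from the backing store.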
# 5. Invalidation
# TTL
# Cache busting
# Patterns

Q1634: How do you design for high availability?
Answer:

# High availability design
# 1. Redundancy
# Multiple AZs
# Multiple regions

# 2. Load balancing
# Health checks
# Failover

# 3. Data replication
# Synchronous
# Asynchronous

# 4. Monitoring
# Fast detection

# 5. Automation
# Self-healing

# 6. Testing
# Chaos engineering

Q1635: How do you secure Kubernetes?
Section titled “Q1635: How do you secure Kubernetes?”Answer:
# Kubernetes security# 1. RBAC# Least privilege
# 2. Network policies# Default deny
# 3. Pod security# Standards
# 4. Secrets# External
# 5. Images# Scanning
# 6. Runtime# Falco
# 7. Updates# RegularQ1636: How do you design API security?
Q1636: How do you design API security?
Answer:
# API security
# 1. Authentication: OAuth 2.0, JWT
# 2. Authorization: RBAC, scopes
# 3. Rate limiting: throttling
# 4. Input validation: sanitization
# 5. TLS: encryption
# 6. Monitoring: anomaly detection

Q1637: How do you implement logging?
Answer:
# Logging implementation
# 1. Format: JSON, structured
# 2. Levels: DEBUG, INFO, WARN, ERROR
# 3. Correlation: trace IDs
# 4. Rotation: logrotate
# 5. Aggregation: ELK/Loki
# 6. Retention: policy

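The format, level, and correlation points for logging can be combined in one small helper that emits a JSON line per event. This is a minimal sketch: `log_json` and its field names (`ts`, `level`, `trace_id`, `msg`) are hypothetical choices, and only backslashes and double quotes are escaped, so messages containing newlines or other control characters would need extra handling.

```bash
#!/usr/bin/env bash

# log_json LEVEL TRACE_ID MESSAGE
# Prints one structured log line as JSON on stdout.
log_json() {
  local level="$1" trace_id="$2" msg="$3"
  # Escape backslashes first, then double quotes, so the JSON stays valid.
  msg=${msg//\\/\\\\}
  msg=${msg//\"/\\\"}
  printf '{"ts":"%s","level":"%s","trace_id":"%s","msg":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$level" "$trace_id" "$msg"
}

# Hypothetical usage:
#   log_json ERROR abc123 'disk "sda" is full'
```

Carrying the same trace ID through every service that handles a request is what lets an aggregator such as ELK or Loki stitch the lines back together.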
Q1638: How do you design for security?
Answer:
# Security design
# 1. Defense in depth: multiple layers
# 2. Least privilege: minimize access
# 3. Zero trust: verify always
# 4. Encryption: everywhere
# 5. Monitoring: continuous
# 6. Automation: respond fast

Q1639: How do you implement incident response?
Answer:
# Incident response
# 1. Preparation: runbooks, tools
# 2. Detection: alerts
# 3. Containment: isolate
# 4. Eradication: fix
# 5. Recovery: restore
# 6. Lessons learned: post-mortem

Q1640: How do you optimize cloud costs?
Answer:
# Cost optimization
# 1. Right-sizing: match needs
# 2. Reservations: steady-state workloads
# 3. Spot: fault-tolerant workloads
# 4. Automation: scale down
# 5. Cleanup: unused resources
# 6. Monitoring: alerts

Q1641: How do you implement change automation?
Answer:
# Change automation
# 1. GitOps: all changes in Git
# 2. CI/CD: automated testing
# 3. Approval gates: manual steps
# 4. Rollback: automatic
# 5. Monitoring: quick detection

Q1642: How do you design for failure?
Answer:
# Design for failure
# 1. Redundancy: multiple copies
# 2. Graceful degradation: partial service
# 3. Circuit breakers: prevent cascade
# 4. Bulkheads: isolate
# 5. Recovery: fast
# 6. Testing: chaos

Q1643: How do you implement access control?
Answer:
# Access control
# 1. Authentication: MFA
# 2. Authorization: RBAC
# 3. Least privilege: minimal access
# 4. Audit: log access
# 5. Review: regular

Q1644: How do you secure data?
Answer:
# Data security
# 1. Classification: sensitivity
# 2. Encryption: at rest, in transit
# 3. Access control: need to know
# 4. Backup: encrypted
# 5. Monitoring: audit

Q1645: How do you design APIs?
Answer:
# API design
# 1. REST: resources, HTTP verbs
# 2. Versioning: URL path
# 3. Error handling: consistent
# 4. Pagination: large sets
# 5. Rate limiting: throttle
# 6. Documentation: OpenAPI

Q1646: How do you implement service mesh?
Answer:
# Service mesh
# 1. Traffic management: routing
# 2. Security: mTLS
# 3. Observability: tracing
# 4. Resilience: retries
# Tools: Istio, Linkerd, Consul Connect

Q1647: How do you optimize databases?
Answer:
# Database optimization
# 1. Indexing: proper indexes
# 2. Query optimization: EXPLAIN
# 3. Caching: cache frequent reads
# 4. Connection pooling: reuse pooled connections
# 5. Scaling: read replicas, sharding
# 6. Configuration: tune parameters

Q1648: How do you implement secrets management?
Answer:
# Secrets management
# 1. Centralized: Vault
# 2. Rotation: automatic
# 3. Audit: log access
# 4. Encryption: encrypt stored secrets
# 5. Access control: least privilege

Q1649: How do you design for disasters?
Answer:
# Disaster recovery
# 1. Backup: regular
# 2. Replication: cross-region
# 3. Automation: fast recovery
# 4. Testing: regular
# 5. Documentation: runbooks

Q1650: How do you implement observability?
Answer:
# Observability
# 1. Metrics: Prometheus
# 2. Logs: ELK
# 3. Traces: Jaeger
# 4. Correlation: trace IDs
# 5. Alerting: SLO-based

Linux Advanced Scenarios

Q1651: How do you handle kernel upgrades?
Answer:
# Kernel upgrade
# 1. Test in staging
# 2. Check compatibility
# 3. Backup
# 4. Schedule window
# 5. Apply
# 6. Monitor
# 7. Rollback plan

Q1652: How do you design multi-tenant systems?
Answer:
# Multi-tenancy
# 1. Isolation: namespaces, RBAC
# 2. Quotas: resources
# 3. Billing: usage tracking
# 4. Data separation: logical/physical
# 5. Network: segmentation

Q1653: How do you implement edge computing?
Answer:
# Edge computing
# 1. Lightweight K8s: K3s
# 2. Data processing: local first
# 3. Sync: periodic
# 4. Security: edge-specific
# 5. Management: centralized

Q1654: How do you optimize Linux for containers?
Answer:
# Container optimization
# 1. OS: minimal OS
# 2. Kernel: tuned for containers
# 3. Storage: overlay2
# 4. Network: CNI
# 5. Runtime: containerd
# 6. Security: hardened

Q1655: How do you design for GDPR?
Answer:
# GDPR compliance
# 1. Data minimization: collect less
# 2. Consent: explicit
# 3. Right to erasure: delete capability
# 4. Portability: export data
# 5. Breach notification: process
# 6. DPO (Data Protection Officer): appoint

Q1656: How do you implement zero-downtime patching?
Answer:
# Zero-downtime patching
# 1. Blue-green: two environments
# 2. Canary: gradual
# 3. Rolling: one by one
# 4. Health checks: before switch
# 5. Rollback: quick

Q1657: How do you design for IoT?
Answer:
# IoT architecture
# 1. Edge: local processing
# 2. Protocol: MQTT
# 3. Security: device auth
# 4. Scale: millions of devices
# 5. OTA updates: secure

Q1658: How do you implement RBAC?
Answer:
# RBAC implementation
# 1. Roles: define
# 2. Permissions: map to roles
# 3. Assignment: assign roles to users
# 4. Audit: regular review
# 5. Tools: LDAP integration

Q1659: How do you optimize network performance?
Answer:
# Network optimization
# 1. Offloading: hardware
# 2. Buffer tuning: TCP
# 3. Compression: Accept-Encoding
# 4. CDN: static
# 5. Keepalive: HTTP

Q1660: How do you design for mobile?
Answer:
# Mobile optimization
# 1. API design: efficient
# 2. Compression: gzip/Brotli
# 3. Caching: aggressive
# 4. Offline: PWA
# 5. Security: certificate pinning

Q1661: How do you implement chaos engineering?
Answer:
# Chaos engineering
# 1. Define steady state: what works
# 2. Hypothesize: what will fail
# 3. Experiment: inject failure
# 4. Learn: observe
# 5. Improve: fix
# Tools: Chaos Mesh, Litmus, Gremlin

Q1662: How do you implement immutable infrastructure?
Answer:
# Immutable infrastructure
# 1. Images: pre-built
# 2. No in-place changes: rebuild
# 3. Versioned: all images
# 4. Rollback: previous image
# 5. Tools: Packer, containers

Q1663: How do you design for high performance?
Answer:
# High performance design
# 1. Profiling: find bottleneck
# 2. Optimization: targeted
# 3. Caching: multi-layer
# 4. Async: non-blocking
# 5. Scaling: horizontal

Q1664: How do you implement multi-cloud?
Answer:
# Multi-cloud strategy
# 1. Abstraction: Terraform
# 2. Portability: containers
# 3. Vendor lock-in: avoid
# 4. Data: strategy
# 5. Operations: unified

Q1665: How do you implement cost allocation?
Answer:
# Cost allocation
# 1. Tagging: all resources
# 2. Tracking: by team/project
# 3. Reporting: regular
# 4. Budgets: alerts
# 5. Accountability: showback

Q1666: How do you design for compliance automation?
Answer:
# Compliance automation
# 1. Policy as code: OPA
# 2. Scanning: automated
# 3. Evidence: auto-collect
# 4. Remediation: auto-fix
# 5. Audit: regular

Q1667: How do you implement API rate limiting?
Answer:
# API rate limiting
# 1. Algorithm: token bucket or leaky bucket
# 2. Per-user: by key
# 3. Headers: rate-limit headers
# 4. Response: HTTP 429 (Too Many Requests)
# 5. Throttling: graceful

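The token bucket named for API rate limiting can be sketched in a few lines. This is an illustrative sketch, not a production limiter: the capacity and refill rate are hypothetical values, state is kept in shell variables for a single client, and the current time is passed in as an argument (in seconds) so the behaviour is easy to reason about.

```bash
#!/usr/bin/env bash

# Hypothetical parameters: a burst of 5 requests, refilled at 1 token/second.
BUCKET_CAPACITY=5
BUCKET_REFILL_PER_SEC=1
bucket_tokens=$BUCKET_CAPACITY
bucket_last=0

# allow_request NOW
# Refills the bucket for the seconds elapsed since the last call, then tries
# to take one token. Returns 0 if the request is allowed, or 1 if the caller
# should reject it (for an HTTP API, typically with status 429).
allow_request() {
  local now="$1" elapsed
  elapsed=$((now - bucket_last))
  bucket_tokens=$((bucket_tokens + elapsed * BUCKET_REFILL_PER_SEC))
  if [ "$bucket_tokens" -gt "$BUCKET_CAPACITY" ]; then
    bucket_tokens=$BUCKET_CAPACITY
  fi
  bucket_last=$now
  if [ "$bucket_tokens" -ge 1 ]; then
    bucket_tokens=$((bucket_tokens - 1))
    return 0
  fi
  return 1
}
```

A per-user limiter would keep one such bucket per API key, usually in a shared store rather than in process memory. Unlike a leaky bucket, which drains at a fixed rate, the token bucket permits short bursts up to the capacity.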
Q1668: How do you design for IoT security?
Answer:
# IoT security
# 1. Device identity: certificates
# 2. OTA updates: signed
# 3. Network: segmentation
# 4. Data: encryption
# 5. Monitoring: anomaly detection

Q1669: How do you implement infrastructure monitoring?
Answer:
# Infrastructure monitoring
# 1. Metrics: collect
# 2. Storage: time-series
# 3. Visualization: dashboards
# 4. Alerting: thresholds
# 5. Analysis: trends

Q1670: How do you implement database sharding?
Answer:
# Database sharding
# 1. Key strategy: choose shard key
# 2. Routing: application
# 3. Rebalancing: plan
# 4. Cross-shard: minimize
# 5. Monitoring: performance

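Application-side routing for database sharding comes down to mapping a shard key to a shard number. The sketch below assumes simple hash-modulo routing; `shard_for_key` is a hypothetical helper, and the CRC from `cksum` is used only as a convenient stable hash.

```bash
#!/usr/bin/env bash

# shard_for_key KEY SHARD_COUNT
# Prints the shard index (0 .. SHARD_COUNT-1) for KEY using a CRC of the key.
shard_for_key() {
  local key="$1" shards="$2" hash
  # cksum prints "<crc> <byte count>"; keep only the CRC.
  hash=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
  echo $((hash % shards))
}

# Hypothetical usage: shard_for_key "user-42" 4
```

Hash-modulo routing is easy to reason about, but changing the shard count remaps most keys, which is one reason rebalancing needs to be planned up front (consistent hashing or a lookup table are common alternatives).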
Q1671: How do you design for 5G?
Answer:
# 5G optimization
# 1. Edge computing: local processing
# 2. Network slicing: dedicated
# 3. Low latency: optimization
# 4. Massive IoT: scale

Q1672: How do you implement service discovery?
Answer:
# Service discovery
# 1. DNS: Consul
# 2. Health checks: registration
# 3. Load balancing: client-side
# 4. Failover: automatic

Q1673: How do you optimize web performance?
Answer:
# Web performance
# 1. CDN: static assets
# 2. Compression: gzip/Brotli
# 3. Caching: headers
# 4. Minification: CSS/JS
# 5. Images: optimization

Q1674: How do you implement backup verification?
Answer:
# Backup verification
# 1. Test restore: regular
# 2. Automation: script
# 3. Checksums: verify
# 4. Documentation: procedures

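The checksum step of backup verification can be scripted with `sha256sum`. This is a minimal sketch, assuming GNU coreutils and findutils; the function names and the manifest file name `SHA256SUMS` are illustrative choices, and a checksum match does not replace a periodic test restore.

```bash
#!/usr/bin/env bash

# make_manifest DIR: record a SHA-256 checksum for every file under DIR
# in DIR/SHA256SUMS (the manifest itself is excluded).
make_manifest() {
  (
    cd "$1" || exit 1
    find . -type f ! -name SHA256SUMS -print0 \
      | sort -z \
      | xargs -0 -r sha256sum > SHA256SUMS
  )
}

# verify_backup DIR: re-check every file against the manifest.
# Returns non-zero if any file is missing or has changed.
verify_backup() {
  (
    cd "$1" || exit 1
    sha256sum --quiet -c SHA256SUMS
  )
}
```

Running `make_manifest` when the backup is written and `verify_backup` after it has been copied offsite catches silent corruption in transit or at rest.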
Q1675: How do you design for privacy?
Answer:
# Privacy design
# 1. Data minimization: collect less
# 2. Encryption: strong
# 3. Access control: strict
# 4. Audit: logging
# 5. Retention: policy

Q1676: How do you implement auto-remediation?
Answer:
# Auto-remediation
# 1. Detection: alerts
# 2. Classification: severity
# 3. Action: runbook
# 4. Automation: scripts
# 5. Verification: confirm fix

Q1677: How do you optimize storage?
Answer:
# Storage optimization
# 1. Tiering: hot/cold
# 2. Compression: deduplication
# 3. Lifecycle: policies
# 4. Monitoring: usage
# 5. Cleanup: regular

Q1678: How do you implement MFA?
Answer:
# MFA implementation
# 1. Factors: multiple
# 2. Methods: TOTP/push
# 3. Rollout: gradual
# 4. Backup: recovery codes
# 5. Enforcement: policy

Q1679: How do you design for resilience?
Answer:
# Resilience design
# 1. Redundancy: multiple
# 2. Fault tolerance: graceful
# 3. Recovery: fast
# 4. Testing: chaos
# 5. Monitoring: real-time

Q1680: How do you implement cost reporting?
Answer:
# Cost reporting
# 1. Tagging: comprehensive
# 2. Collection: automated
# 3. Analysis: by team
# 4. Visualization: dashboards
# 5. Actions: optimization

Q1681: How do you design for IoT data?
Answer:
# IoT data management
# 1. Collection: MQTT/HTTP
# 2. Processing: stream
# 3. Storage: time-series
# 4. Analysis: real-time
# 5. Retention: policy

Q1682: How do you implement service catalog?
Answer:
# Service catalog
# 1. Self-service: portal
# 2. Standardization: templates
# 3. Governance: approval
# 4. Documentation: auto-generated

Q1683: How do you optimize database queries?
Answer:
# Query optimization
# 1. EXPLAIN: analyze the query plan
# 2. Indexing: strategic
# 3. Rewriting: equivalent, cheaper queries
# 4. Caching: query cache (removed in MySQL 8.0; cache results outside the database there)
# 5. Profiling: slow queries

Q1684: How do you implement API gateway?
Answer:
# API gateway
# 1. Routing: path-based
# 2. Authentication: JWT
# 3. Rate limiting: quotas
# 4. Caching: response
# 5. Monitoring: usage

Q1685: How do you design for compliance?
Answer:
# Compliance design
# 1. Controls: framework
# 2. Automation: policy
# 3. Evidence: collection
# 4. Monitoring: continuous
# 5. Audit: regular

Q1686: How do you implement incident automation?
Answer:
# Incident automation
# 1. Detection: automated
# 2. Triage: classification
# 3. Response: runbooks
# 4. Escalation: rules
# 5. Resolution: tracking

Q1687: How do you optimize Kubernetes?
Answer:
# Kubernetes optimization
# 1. Resources: requests/limits
# 2. Scheduling: affinity
# 3. Networking: CNI
# 4. Storage: storage classes
# 5. Autoscaling: HPA/VPA

Q1688: How do you implement data governance?
Answer:
# Data governance
# 1. Classification: sensitivity
# 2. Ownership: clear
# 3. Quality: rules
# 4. Lineage: tracking
# 5. Compliance: policy

Q1689: How do you design for ML infrastructure?
Answer:
# ML infrastructure
# 1. Data pipeline: ETL
# 2. Training: distributed
# 3. Serving: model serving
# 4. Monitoring: drift
# 5. MLOps: automation

Q1690: How do you implement cloud governance?
Answer:
# Cloud governance
# 1. Policies: guardrails
# 2. Tagging: standards
# 3. Cost control: budgets
# 4. Security: baseline
# 5. Compliance: audit

Q1691: How do you design for edge security?
Answer:
# Edge security
# 1. Device auth: certificates
# 2. Data encryption: TLS
# 3. Network: segmentation
# 4. Updates: signed
# 5. Monitoring: centralized

Q1692: How do you implement container orchestration?
Answer:
# Container orchestration
# 1. Scheduling: placement
# 2. Scaling: auto
# 3. Networking: service mesh
# 4. Storage: CSI
# 5. Security: policies

Q1693: How do you optimize network latency?
Answer:
# Network latency optimization
# 1. CDN: geographic
# 2. Caching: multi-layer
# 3. Compression: gzip/Brotli
# 4. HTTP/2: multiplexing
# 5. DNS: anycast

Q1694: How do you implement data protection?
Answer:
# Data protection
# 1. Encryption: at rest/in transit
# 2. Access control: RBAC
# 3. Backup: automated
# 4. Monitoring: audit
# 5. Incident: response

Q1695: How do you design for real-time processing?
Answer:
# Real-time processing
# 1. Stream processing: Kafka/Spark
# 2. Low latency: optimization
# 3. Scalability: horizontal
# 4. Monitoring: metrics
# 5. Backpressure: handling

Q1696: How do you implement application security?
Answer:
# Application security
# 1. SDLC: secure
# 2. SAST/DAST: scanning
# 3. Dependencies: scanning
# 4. Runtime: protection
# 5. Training: developers

Q1697: How do you optimize Linux for databases?
Answer:
# Linux database optimization
# 1. Filesystem: XFS/ext4
# 2. I/O scheduler: mq-deadline/none (deadline/noop on older single-queue kernels)
# 3. Memory: huge pages
# 4. Network: buffer sizes
# 5. Disk: SSD/NVMe

Q1698: How do you implement data retention?
Answer:
# Data retention
# 1. Policy: defined
# 2. Classification: by type
# 3. Automation: scripts
# 4. Compliance: legal holds
# 5. Verification: regular

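The automation and legal-hold points for data retention can be combined in one purge script. This is a sketch under stated assumptions: GNU `find` is available, retention is measured by file modification time, and anything under a directory named `legal-hold` (a hypothetical convention) must never be deleted.

```bash
#!/usr/bin/env bash

# purge_older_than DIR DAYS
# Deletes regular files under DIR whose modification time is more than DAYS
# days old, skipping anything below a "legal-hold" directory.
# Each deleted path is printed so the run can be logged and audited.
purge_older_than() {
  local dir="$1" days="$2"
  find "$dir" -type f -mtime +"$days" ! -path '*/legal-hold/*' -print -delete
}

# Hypothetical usage from a scheduled job:
#   purge_older_than /var/archive 365 >> /var/log/retention.log
```

Dropping `-delete` from the command gives a dry run that only lists what would be removed, which is a sensible first step before scheduling the job.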
Q1699: How do you design for compliance reporting?
Answer:
# Compliance reporting
# 1. Evidence: automated
# 2. Framework: mapping
# 3. Controls: validation
# 4. Audit: support
# 5. Remediation: tracking

Q1700: How do you implement Kubernetes networking?
Answer:
# Kubernetes networking
# 1. CNI plugin: Calico/Flannel
# 2. Network policies: segmentation
# 3. Services: types
# 4. Ingress: controller
# 5. DNS: CoreDNS

Q1701: How do you optimize database connections?
Answer:
# Database connection optimization
# 1. Pooling: connection pool
# 2. Sizing: pool size
# 3. Timeouts: configure
# 4. Monitoring: active connections
# 5. Tuning: database config

Q1702: How do you implement backup automation?
Answer:
# Backup automation
# 1. Scheduling: cron
# 2. Retention: policy
# 3. Verification: test restore
# 4. Offsite: replication
# 5. Monitoring: alerts

Q1703: How do you design for regulatory compliance?
Answer:
# Regulatory compliance
# 1. Assessment: gap analysis
# 2. Controls: implementation
# 3. Monitoring: continuous
# 4. Documentation: evidence
# 5. Audit: support

Q1704: How do you implement service level objectives?
Answer:
# SLO implementation
# 1. Define: metrics
# 2. Measurement: collection
# 3. Alerting: error budget
# 4. Reporting: regular
# 5. Improvement: action

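Budget-based alerting for an SLO rests on a simple calculation: how much of the allowed failure count has been used. The sketch below assumes a request-based availability SLO; `error_budget_remaining` is a hypothetical helper and the numbers in the usage line are made up for illustration.

```bash
#!/usr/bin/env bash

# error_budget_remaining SLO_PERCENT TOTAL_REQUESTS FAILED_REQUESTS
# Prints the percentage of the error budget that is still unspent.
# A negative result means the budget is exhausted and the SLO is breached.
error_budget_remaining() {
  LC_ALL=C awk -v slo="$1" -v total="$2" -v failed="$3" 'BEGIN {
    allowed = total * (100 - slo) / 100   # failures the SLO tolerates
    if (allowed == 0) { print "0.0"; exit }
    printf "%.1f\n", (allowed - failed) / allowed * 100
  }'
}

# Hypothetical usage: 99.9% SLO, 1,000,000 requests, 250 failures
#   error_budget_remaining 99.9 1000000 250
```

With a 99.9% target over 1,000,000 requests, 1,000 failures are tolerated, so 250 failures leave 75.0% of the budget. Alerting on how fast this number falls, rather than on individual errors, is the usual motivation for budget-based alerts.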
Q1705: How do you optimize Linux storage?
Answer:
# Linux storage optimization
# 1. Filesystem: choice
# 2. Mount options: tuning
# 3. LVM: flexible
# 4. RAID: configuration
# 5. Monitoring: I/O

Q1706: How do you implement network segmentation?
Answer:
# Network segmentation
# 1. VLANs: isolation
# 2. Firewalls: zones
# 3. Zero trust: micro-segmentation
# 4. Monitoring: traffic
# 5. Compliance: audit

Q1707: How do you design for ML model serving?
Answer:
# ML model serving
# 1. Framework: TensorFlow Serving
# 2. Scaling: horizontal
# 3. A/B testing: canary
# 4. Monitoring: drift
# 5. Updates: rolling

Q1708: How do you implement vulnerability management?
Answer:
# Vulnerability management
# 1. Scanning: regular
# 2. Prioritization: severity
# 3. Remediation: process
# 4. Verification: rescan
# 5. Reporting: metrics

Q1709: How do you optimize web application security?
Answer:
# Web application security
# 1. WAF: deploy
# 2. Headers: security
# 3. Input validation: sanitization
# 4. SQL injection: prevention
# 5. XSS: protection

Q1710: How do you design for compliance automation?
Answer:
# Compliance automation
# 1. Policy as code: OPA
# 2. Scanning: continuous
# 3. Remediation: auto
# 4. Evidence: collection
# 5. Reporting: automated

Q1711: How do you implement incident communication?
Answer:
# Incident communication
# 1. Stakeholders: identification
# 2. Status page: updates
# 3. Channels: multiple
# 4. Timing: regular
# 5. Post-incident: communication

Q1712: How do you optimize Kubernetes resources?
Answer:
# Kubernetes resource optimization
# 1. Requests: set appropriately
# 2. Limits: configure
# 3. HPA: auto-scale
# 4. VPA: recommendations
# 5. Monitoring: usage

Q1713: How do you implement data classification?
Answer:
# Data classification
# 1. Categories: Public, Internal, Confidential
# 2. Labeling: automatic
# 3. Policies: based on class
# 4. Training: awareness
# 5. Auditing: regular

Q1714: How do you design for regulatory requirements?
Answer:
# Regulatory requirements
# 1. Framework: selection
# 2. Controls: implementation
# 3. Monitoring: continuous
# 4. Evidence: automated
# 5. Audit: support

Q1715: How do you implement cost allocation tags?
Answer:
# Cost allocation tags
# 1. Tagging policy: required tags
# 2. Enforcement: SCP (AWS service control policies)
# 3. Reporting: by tag
# 4. Alerts: budget
# 5. Optimization: action

Q1716: How do you optimize Linux for networking?
Answer:
# Linux network optimization
# 1. Buffer sizes: tuning
# 2. Offloading: enable
# 3. TCP: parameters
# 4. Queue: tuning
# 5. Monitoring: metrics

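For the buffer, TCP, and queue items of Linux network tuning, the settings usually live in a sysctl drop-in file. The sketch below writes such a fragment to a path chosen by the caller. The sysctl keys are real kernel parameters, but every value is an illustrative starting point rather than a recommendation, and the function name and target file name are hypothetical.

```bash
#!/usr/bin/env bash

# write_net_tuning FILE
# Writes an example sysctl fragment to FILE. Values are placeholders that
# should be benchmarked against the actual workload before use.
write_net_tuning() {
  local target="$1"
  cat > "$target" <<'EOF'
# Socket buffer ceilings (bytes)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# TCP buffer auto-tuning: min, default, max (bytes)
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Queue depths: per-CPU input packet backlog and listen backlog
net.core.netdev_max_backlog = 5000
net.core.somaxconn = 4096
EOF
}

# Hypothetical usage (requires root):
#   write_net_tuning /etc/sysctl.d/90-net-tuning.conf && sysctl --system
```

Offload settings are per interface rather than sysctl keys; `ethtool -k <interface>` lists what a NIC currently has enabled.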
Q1717: How do you implement service mesh security?
Answer:
# Service mesh security
# 1. mTLS: enable
# 2. Authorization: policies
# 3. Encryption: automatic
# 4. Audit: logging
# 5. Updates: regular

Q1718: How do you design for disaster recovery testing?
Answer:
# DR testing
# 1. Schedule: regular
# 2. Scope: defined
# 3. Documentation: runbooks
# 4. Validation: success criteria
# 5. Improvements: action items

Q1719: How do you implement API versioning?
Answer:
# API versioning
# 1. Strategy: URL path
# 2. Deprecation: policy
# 3. Documentation: OpenAPI (Swagger)
# 4. Migration: guide
# 5. Support: timeline

Q1720: How do you optimize container images?
Answer:
# Container image optimization
# 1. Base image: minimal
# 2. Layers: reduce
# 3. Caching: build cache
# 4. Multi-stage: build
# 5. Scanning: security

Q1721: How do you implement compliance monitoring?
Answer:
# Compliance monitoring
# 1. Controls: continuous
# 2. Alerts: deviation
# 3. Reporting: regular
# 4. Remediation: tracking
# 5. Audit: support

Q1722: How do you design for data pipelines?
Answer:
# Data pipeline design
# 1. Source: connectors
# 2. Processing: ETL/ELT
# 3. Quality: validation
# 4. Destination: storage
# 5. Monitoring: alerts

Q1723: How do you implement zero trust network?
Answer:
# Zero trust network
# 1. Verify: always
# 2. Least privilege: access
# 3. Micro-segmentation: network
# 4. Encryption: all traffic
# 5. Monitoring: continuous

Q1724: How do you optimize Linux for high availability?
Answer:
# Linux HA optimization
# 1. Keepalived: configure
# 2. HAProxy: tune
# 3. Health checks: configure
# 4. Monitoring: comprehensive
# 5. Testing: regular

Q1725: How do you implement security automation?
Answer:
# Security automation
# 1. Scanning: automated
# 2. Remediation: auto-fix
# 3. Response: playbooks
# 4. Integration: CI/CD
# 5. Monitoring: continuous

Q1726: How do you design for event-driven architecture?
Answer:
# Event-driven architecture
# 1. Event sourcing: design
# 2. Message broker: Kafka
# 3. Consumers: scaling
# 4. Idempotency: handle
# 5. Monitoring: events

Q1727: How do you implement infrastructure testing?
Answer:
# Infrastructure testing
# 1. Validation: Terraform
# 2. Integration: Test Kitchen
# 3. Compliance: InSpec
# 4. Security: scanning
# 5. Chaos: engineering

Q1728: How do you optimize for DevOps?
Answer:
# DevOps optimization
# 1. CI/CD: optimize
# 2. Automation: everything
# 3. Monitoring: feedback
# 4. Collaboration: teams
# 5. Culture: improvement

Q1729: How do you implement data encryption?
Answer:
# Data encryption
# 1. At rest: LUKS
# 2. In transit: TLS
# 3. Application: field-level
# 4. Keys: management
# 5. Rotation: policy

Q1730: How do you design for incident recovery?
Answer:
# Incident recovery
# 1. Detection: fast
# 2. Containment: quick
# 3. Eradication: complete
# 4. Recovery: fast
# 5. Post-incident: learning

Q1731: How do you implement container security scanning?
Answer:
# Container security scanning
# 1. Build time: scan images
# 2. Registry: scan stored
# 3. Runtime: scan running
# 4. Policies: define
# 5. Automation: CI/CD

Q1732: How do you optimize Linux for virtualization?
Answer:
# Linux virtualization optimization
# 1. CPU: pinning
# 2. Memory: overcommit
# 3. Network: para-virtual
# 4. Storage: VirtIO
# 5. Monitoring: per-VM

Q1733: How do you implement access certification?
Answer:
# Access certification
# 1. Review schedule: quarterly
# 2. Certification: campaign
# 3. Remediation: tasks
# 4. Exceptions: approval
# 5. Reporting: audit

Q1734: How do you design for data recovery?
Answer:
# Data recovery
# 1. Backups: multiple
# 2. Point in time: capability
# 3. Testing: regular
# 4. Documentation: procedures
# 5. Team: training

Q1735: How do you implement API authentication?
Answer:
# API authentication
# 1. OAuth 2.0: implement
# 2. JWT: tokens
# 3. API keys: management
# 4. Rotation: policy
# 5. Monitoring: usage

Q1736: How do you optimize database indexing?
Answer:
# Database indexing
# 1. Identify: slow queries
# 2. Analyze: EXPLAIN
# 3. Create: appropriate indexes
# 4. Composite: column order
# 5. Maintenance: rebuild

Q1737: How do you implement incident triage?
Answer:
# Incident triage
# 1. Classification: severity
# 2. Impact: assessment
# 3. Prioritization: order
# 4. Assignment: owner
# 5. Escalation: path

Q1738: How do you design for cloud migration?
Answer:
# Cloud migration
# 1. Assessment: discovery
# 2. Planning: strategy
# 3. Migration: execute
# 4. Validation: testing
# 5. Optimization: post-migration

Q1739: How do you implement security policies?
Answer:
# Security policies
# 1. Framework: define