Auto Scaling & Elastic Load Balancing
Chapter 7: Auto Scaling & Load Balancing
Building Scalable and Highly Available Applications

7.1 Overview

Auto Scaling and Elastic Load Balancing work together: Auto Scaling adjusts the number of instances to match demand, and the load balancer distributes traffic across the healthy ones. Combined, they give your applications automatic scaling and high availability.
Auto Scaling & Load Balancing Architecture

Request path: Internet -> Route 53 -> Application Load Balancer (ALB) -> Auto Scaling Group 1 and Auto Scaling Group 2. Each Auto Scaling group runs EC2 instances spread across AZ-A, AZ-B, and AZ-C.

Components:

- Load Balancer: Distributes traffic
- Auto Scaling Group: Manages instance count
- Launch Template: Defines instance configuration
7.2 Elastic Load Balancing (ELB)

ELB Types Comparison

Load Balancer Types

| Load balancer | Layer | Features | Use cases |
|---|---|---|---|
| Application Load Balancer (ALB) | 7 (HTTP/HTTPS) | Content-based routing, host-based routing, path-based routing, WebSocket support, HTTP/2 support, TLS termination | Web applications, microservices, containerized applications |
| Network Load Balancer (NLB) | 4 (TCP/UDP) | Ultra-high performance, static IP address (one per AZ), TLS passthrough, UDP support, millions of requests/second | Real-time gaming, IoT, non-HTTP workloads |
| Gateway Load Balancer (GWLB) | 3 (IP) | Transparent network gateway, third-party security appliances, inline traffic inspection | Firewalls, IDS/IPS, deep packet inspection |
| Classic Load Balancer (CLB) | 4 & 7 | Legacy (previous generation) | Use ALB/NLB instead |
ALB Routing Architecture

ALB Request Routing

A client request reaches the listener (port 443), which evaluates its rules and forwards the request to the target group of the first rule that matches:

| Rule | Condition | Target group |
|---|---|---|
| Rule 1 | Host: api.example.com | Target Group 1 (API servers) |
| Rule 2 | Path: /images | Target Group 2 (image servers) |
| Default rule | No other rule matches | Target Group 3 (web servers) |
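The rule evaluation described in the ALB request routing diagram can be sketched as a small shell function. This is only an illustration of "first matching rule wins, otherwise the default rule"; the function name `route_request` is hypothetical, and treating `/images` as also covering everything under `/images/` is an assumption about how the path rule is configured.

```shell
# Hypothetical sketch of ALB rule evaluation (not an AWS API).
# route_request HOST PATH -> prints the target group that would receive the request.
route_request() {
  host=$1
  path=$2

  # Rule 1: host-based routing.
  case "$host" in
    api.example.com) echo "Target Group 1 (API servers)"; return 0 ;;
  esac

  # Rule 2: path-based routing (assumed to include sub-paths).
  case "$path" in
    /images | /images/*) echo "Target Group 2 (image servers)"; return 0 ;;
  esac

  # Default rule: used when nothing above matched.
  echo "Target Group 3 (web servers)"
}
```

For example, `route_request www.example.com /images/logo.png` selects the image servers, while the same path on `api.example.com` goes to the API servers because the host rule is checked first.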
Target Groups

Target Group Configuration

Target group settings:

- Protocol: HTTP/HTTPS/TCP
- Port: Application port
- Health check:
  - Path: /health
  - Interval: 30 seconds
  - Timeout: 5 seconds
  - Healthy threshold: 3
  - Unhealthy threshold: 2

Target types:

| Target type | Registered by | Use |
|---|---|---|
| Instance | EC2 instance ID | EC2 in ASG |
| IP address | Private IP address | Containers on ECS/EKS |
| Lambda | Function | Serverless |
Health Checks

Health Check Flow

The load balancer sends a health check request (GET /health) to every target every 30 seconds:

| Target | Response | State |
|---|---|---|
| Target 1 | 200 OK | HEALTHY |
| Target 2 | 200 OK | HEALTHY |
| Target 3 | 503 | UNHEALTHY |

Traffic routing:

- Only routes to HEALTHY targets
- Unhealthy targets removed from rotation
- Auto Scaling can replace unhealthy instances
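How the healthy and unhealthy thresholds interact can be sketched as a tiny state machine. This is a simplified model, not the load balancer's implementation: it uses the thresholds from this chapter (3 and 2), treats only a `200` response as a pass, and ignores timeouts. The function name `health_state` is hypothetical.

```shell
# Thresholds taken from the target group settings in this chapter.
HEALTHY_THRESHOLD=3
UNHEALTHY_THRESHOLD=2

# health_state INITIAL_STATE CODE... -> prints the state after the given responses.
# A target flips state only after enough consecutive passes or failures.
health_state() {
  state=$1
  shift
  ok=0
  bad=0
  for code in "$@"; do
    if [ "$code" = "200" ]; then
      ok=$((ok + 1))
      bad=0
      if [ "$ok" -ge "$HEALTHY_THRESHOLD" ]; then state=healthy; fi
    else
      bad=$((bad + 1))
      ok=0
      if [ "$bad" -ge "$UNHEALTHY_THRESHOLD" ]; then state=unhealthy; fi
    fi
  done
  echo "$state"
}
```

A single failed check between two successes leaves a healthy target in rotation (`health_state healthy 200 503 200`), whereas two failures in a row take it out.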
7.3 Auto Scaling Groups (ASG)

ASG Architecture

Auto Scaling Group Components

An Auto Scaling group combines three components:

- Launch Template: Instance configuration
- Scaling Policies: When to scale
- Health Checks: Instance health monitoring
Scaling Policies

Auto Scaling Policies

1. Simple Scaling

A CloudWatch alarm (for example, CPU > 80%) triggers one fixed adjustment (for example, add 2 instances). Cooldown period: wait before the next scaling action.

2. Step Scaling

| CPU Utilization | Instances to Add |
|---|---|
| 60-70% | +1 instance |
| 70-80% | +2 instances |
| 80-90% | +3 instances |
| > 90% | +4 instances |

3. Target Tracking

Target: CPU = 50%

| Actual CPU | Action |
|---|---|
| 70% | Scale out |
| 40% | Scale in |
| 50% | No action |

AWS automatically adjusts capacity.

4. Predictive Scaling

Uses ML to predict traffic patterns.

| Time | Predicted Traffic | Instances |
|---|---|---|
| 09:00 | High | 10 |
| 12:00 | Peak | 15 |
| 18:00 | Low | 5 |

Pre-provisions capacity before traffic spikes.
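The step scaling table above maps directly onto a lookup. The sketch below is illustrative only: the function name `step_adjustment` is hypothetical, it takes whole-number CPU percentages, and because the table's bands share their endpoints it assumes the lower bound of each band is inclusive and that exactly 90% still belongs to the 80-90% band.

```shell
# step_adjustment CPU_PERCENT -> prints how many instances to add.
# Band edges follow the table in this chapter; see the assumptions in the text.
step_adjustment() {
  cpu=$1
  if [ "$cpu" -gt 90 ]; then
    echo 4
  elif [ "$cpu" -ge 80 ]; then
    echo 3
  elif [ "$cpu" -ge 70 ]; then
    echo 2
  elif [ "$cpu" -ge 60 ]; then
    echo 1
  else
    echo 0   # below the first band: no scale-out
  fi
}
```

Simple scaling, by contrast, would return the same adjustment (add 2) no matter how far CPU is above the alarm threshold.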
Scaling Process Flow

Auto Scaling Process

Scale Out (Add Instances)

1. CloudWatch alarm triggers
2. ASG evaluates scaling policy
3. Launch new instance using launch template
4. Instance boots and passes health checks
5. Instance registered with the load balancer target group
6. Traffic routed to new instance

Scale In (Remove Instances)

1. CloudWatch alarm triggers (low utilization)
2. ASG selects instance to terminate
3. Instance enters the Terminating state
4. Connection draining (if enabled) lets in-flight requests finish
5. Instance removed from target group
6. Instance terminated
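Step 4 of the scale-out flow takes time, which is why the health check grace period matters. The arithmetic below is a rough rule of thumb rather than documented AWS behavior: it assumes a new target needs `healthy threshold x interval` seconds of passing checks, using the values from this chapter (3 checks, 30 seconds apart). All function names are hypothetical.

```shell
# Values from the target group settings in this chapter.
INTERVAL=30
HEALTHY_THRESHOLD=3
UNHEALTHY_THRESHOLD=2

# Rough number of seconds of passing checks before a new target is marked healthy.
seconds_to_healthy() {
  echo $((HEALTHY_THRESHOLD * INTERVAL))
}

# Rough number of seconds of failing checks before a target is marked unhealthy.
seconds_to_unhealthy() {
  echo $((UNHEALTHY_THRESHOLD * INTERVAL))
}

# fits_grace_period BOOT_SECONDS GRACE_SECONDS
# Succeeds when boot time plus time-to-healthy fits inside the grace period.
fits_grace_period() {
  [ $(($1 + $(seconds_to_healthy))) -le "$2" ]
}
```

With a 300-second grace period, an application that needs 120 seconds to start fits comfortably (120 + 90), while one that needs 250 seconds does not and risks being replaced before it is ready.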
Instance Protection

Instance Protection Options

Scale-In Protection

Protected instances are not terminated during scale-in. For example, if Instance 1 and Instance 3 are protected, only Instance 2 can be terminated.

Enable:

```shell
aws autoscaling set-instance-protection \
  --instance-ids i-12345 \
  --auto-scaling-group-name my-asg \
  --protected-from-scale-in
```

Standby State

Instance in Standby:

- Not serving traffic
- Not replaced by ASG (when desired capacity is decremented)
- Can be updated/troubleshooted

Enter Standby:

```shell
aws autoscaling enter-standby \
  --instance-ids i-12345 \
  --auto-scaling-group-name my-asg \
  --should-decrement-desired-capacity
```
7.4 Load Balancer Integration

ALB + ASG Architecture

Complete ALB + ASG Setup

Request path: Internet -> Route 53 (DNS) -> Application Load Balancer -> Auto Scaling groups.

Application Load Balancer:

- Listeners:
  - Port 80 (HTTP) -> Redirect
  - Port 443 (HTTPS)
- Target groups:
  - Web-TG (Port 8080)
  - API-TG (Port 3000)

| Auto Scaling group | Min | Max | Desired | Instances shown |
|---|---|---|---|---|
| Web | 2 | 10 | 3 | 3 Web |
| API | 2 | 20 | 5 | 5 API |
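The Min, Max, and Desired values bound every scaling decision: whatever a policy asks for, the group stays within its limits. A minimal sketch of that clamping, using a hypothetical helper named `clamp_capacity`:

```shell
# clamp_capacity REQUESTED MIN MAX -> prints the capacity the group can actually use.
clamp_capacity() {
  want=$1
  min=$2
  max=$3
  if [ "$want" -lt "$min" ]; then
    echo "$min"
  elif [ "$want" -gt "$max" ]; then
    echo "$max"
  else
    echo "$want"
  fi
}
```

For the Web group (Min 2, Max 10), a policy that asks for 12 instances ends up with 10, and a scale-in toward 1 stops at 2.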
7.5 Practical Configuration

Terraform: ALB + ASG

```hcl
# ============================================================
# Application Load Balancer
# ============================================================

resource "aws_lb" "main" {
  name               = "my-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = var.public_subnet_ids

  enable_deletion_protection = false
}

resource "aws_lb_target_group" "web" {
  name     = "web-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    enabled             = true
    healthy_threshold   = 3
    interval            = 30
    matcher             = "200"
    path                = "/health"
    port                = "traffic-port"
    protocol            = "HTTP"
    timeout             = 5
    unhealthy_threshold = 2
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  # Predefined ELB security policy names include the TLS version
  ssl_policy      = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}

# ============================================================
# Auto Scaling Group
# ============================================================

resource "aws_launch_template" "web" {
  name_prefix   = "web-"
  image_id      = var.ami_id
  instance_type = "t3.medium"
  key_name      = var.key_name

  iam_instance_profile {
    name = aws_iam_instance_profile.web.name
  }

  network_interfaces {
    associate_public_ip_address = false
    security_groups             = [aws_security_group.web.id]
  }

  # Note: httpd listens on port 80 by default. The instance must serve
  # /health on port 8080 to pass the target group health check above.
  user_data = base64encode(<<-EOF
    #!/bin/bash
    yum install -y httpd
    systemctl start httpd
    EOF
  )

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "WebServer"
    }
  }
}

resource "aws_autoscaling_group" "web" {
  name                = "web-asg"
  vpc_zone_identifier = var.private_subnet_ids

  min_size         = 2
  max_size         = 10
  desired_capacity = 3

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }

  target_group_arns = [aws_lb_target_group.web.arn]

  health_check_type         = "ELB"
  health_check_grace_period = 300

  tag {
    key                 = "Name"
    value               = "WebServer"
    propagate_at_launch = true
  }
}

# Simple scaling: add 2 instances each time the alarm below fires
resource "aws_autoscaling_policy" "scale_out" {
  name                   = "scale-out"
  scaling_adjustment     = 2
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 300
  autoscaling_group_name = aws_autoscaling_group.web.name
}

# Fires when average CPU stays above 80% for 2 consecutive 120-second periods
resource "aws_cloudwatch_metric_alarm" "high_cpu" {
  alarm_name          = "high-cpu"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120
  statistic           = "Average"
  threshold           = 80

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }

  alarm_actions = [aws_autoscaling_policy.scale_out.arn]
}
```
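To see when the `high_cpu` alarm in the Terraform configuration would fire, the evaluation can be sketched in shell. This is a simplified model under stated assumptions: each argument is one whole-number 120-second average, missing data is not modelled beyond reporting `INSUFFICIENT_DATA` when there are too few datapoints, and the function name `alarm_state` is hypothetical.

```shell
# alarm_state THRESHOLD PERIODS DATAPOINT...
# Prints ALARM when the most recent PERIODS datapoints are all strictly
# greater than THRESHOLD (GreaterThanThreshold), otherwise OK.
alarm_state() {
  threshold=$1
  periods=$2
  shift 2

  if [ "$#" -lt "$periods" ]; then
    echo INSUFFICIENT_DATA
    return 0
  fi

  # Drop older datapoints so only the most recent $periods remain.
  while [ "$#" -gt "$periods" ]; do
    shift
  done

  for value in "$@"; do
    if [ "$value" -le "$threshold" ]; then
      echo OK
      return 0
    fi
  done
  echo ALARM
}
```

With `threshold = 80` and `evaluation_periods = 2`, a single spike to 85% followed by 75% does not trigger the scale-out policy; two consecutive periods above 80% do.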
7.6 Best Practices

Auto Scaling & Load Balancing Best Practices

1. Multi-AZ Deployment
   - Deploy across minimum 2 AZs
   - Use all available AZs for maximum availability
   - Configure subnets in each AZ
2. Health Check Configuration
   - Use meaningful health check endpoints
   - Set appropriate thresholds
   - Configure grace period for instance startup
3. Scaling Policies
   - Use target tracking for simplicity
   - Set appropriate cooldown periods
   - Consider predictive scaling for known patterns
4. Instance Warm-up
   - Allow time for instance initialization
   - Use lifecycle hooks for custom initialization
   - Configure appropriate grace period
5. Monitoring
   - Monitor scaling activities
   - Set up CloudWatch alarms
   - Use ASG notifications
7.7 Why This Matters in DevOps/SRE

Auto Scaling and Load Balancing are the foundation of self-healing infrastructure. They’re central to achieving high availability, handling traffic spikes, and enabling zero-downtime deployments.
ASG/ELB in DevOps Workflow

Core SRE Use Cases:

1. Zero-Downtime Deployments
   - Rolling updates via ASG instance refresh
   - Blue/green deployments with weighted target groups
   - Canary releases using ALB routing rules
2. Self-Healing Infrastructure
   - ELB health checks detect failed instances
   - ASG replaces unhealthy instances automatically
   - No manual intervention required during failures
3. Cost-Efficient Scaling
   - Scale to zero during off-hours (dev/staging)
   - Mixed instance policies (Spot + On-Demand)
   - Predictive scaling for known traffic patterns
7.8 Linux Systems Perspective

ASG/ELB Monitoring from Arch Linux

```shell
# Install monitoring tools
sudo pacman -S aws-cli-v2 jq
```

```shell
#!/bin/bash
# ASG status dashboard script
# ~/bin/asg-dashboard.sh
set -euo pipefail

echo "=== Auto Scaling Groups Status ==="
echo ""

# Loop over every Auto Scaling group name in the current region
for asg in $(aws autoscaling describe-auto-scaling-groups \
    --query 'AutoScalingGroups[*].AutoScalingGroupName' \
    --output text); do

  echo "--- $asg ---"
  # Print capacity settings and per-instance health for this group
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$asg" \
    --query 'AutoScalingGroups[0].{
      Min:MinSize,
      Max:MaxSize,
      Desired:DesiredCapacity,
      Instances:Instances[*].{
        Id:InstanceId,
        Health:HealthStatus,
        Lifecycle:LifecycleState
      }
    }' --output yaml
  echo ""
done
```

```shell
# Monitor ALB health in real-time
watch -n 10 'aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/12345 \
  --query "TargetHealthDescriptions[*].[Target.Id,TargetHealth.State,TargetHealth.Description]" \
  --output table'
```

```shell
# Trigger instance refresh (rolling deployment)
aws autoscaling start-instance-refresh \
  --auto-scaling-group-name my-asg \
  --preferences '{
    "MinHealthyPercentage": 90,
    "InstanceWarmup": 300
  }' \
  --desired-configuration '{
    "LaunchTemplate": {
      "LaunchTemplateId": "lt-12345",
      "Version": "$Latest"
    }
  }'
```
7.9 Troubleshooting Guide

| Issue | Cause | Solution |
|---|---|---|
| Instances launching but immediately failing | Health check misconfigured | Increase grace period, verify health check path |
| ASG not scaling out | CloudWatch alarm not triggering | Check alarm thresholds and evaluation periods |
| Scaling oscillation | Cooldown too short | Increase cooldown period, use target tracking |
| 5xx errors during deployment | No connection draining | Enable deregistration delay on target group |
| Uneven traffic distribution | Cross-zone load balancing disabled (off by default on NLB; can be turned off per target group on ALB) | Enable cross-zone load balancing |
| New instances failing health check | App startup time too long | Increase health check grace period |
```shell
# Debug scaling issues
# Check recent scaling activities
aws autoscaling describe-scaling-activities \
  --auto-scaling-group-name my-asg \
  --max-items 5 \
  --query 'Activities[*].[StartTime,StatusCode,Description]' \
  --output table

# Check ALB target health
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:....:targetgroup/my-tg/123 \
  --query 'TargetHealthDescriptions[*].[Target.Id,TargetHealth.State,TargetHealth.Reason]' \
  --output table
```
7.10 Common Mistakes & Anti-Patterns

ASG/ELB Anti-Patterns

❌ Mistake 1: Single-AZ Deployment

- Problem: All instances in one AZ
- Impact: Total outage if AZ fails
- Fix: Spread across minimum 2-3 AZs

❌ Mistake 2: Missing Health Check Grace Period

- Problem: ASG terminates instances during boot
- Impact: Infinite launch/terminate loop
- Fix: Set grace period > application startup time

❌ Mistake 3: No Connection Draining

- Problem: In-flight requests dropped during scale-in
- Impact: User-facing errors, data loss
- Fix: Keep the deregistration delay enabled (300s default) and size it to your longest requests
7.11 Interview Questions

Conceptual Questions

- Q: Explain ALB vs NLB. When would you use each?
  - A: ALB operates at Layer 7 (HTTP/HTTPS), supports content-based routing, path routing, host routing. NLB operates at Layer 4 (TCP/UDP), offers ultra-low latency, static IP, and millions of RPS. Use ALB for web apps/APIs/microservices. Use NLB for real-time gaming, IoT, TCP services, or when you need a static IP.
- Q: What’s the difference between target tracking and step scaling?
  - A: Target tracking is simpler: you set a target metric (e.g., CPU 50%) and AWS scales automatically to maintain it. Step scaling gives granular control: you define different scaling amounts for different alarm thresholds. Target tracking is recommended for most cases; use step scaling when you need different responses at different severity levels.
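To make the target tracking answer concrete, the capacity it aims for can be approximated with proportional arithmetic. This is a back-of-the-envelope estimate, not the algorithm AWS runs: it assumes load spreads evenly across instances, works in whole-number percentages, and rounds up so the estimate errs toward more capacity. The function name `target_tracking_estimate` is hypothetical.

```shell
# target_tracking_estimate CURRENT_INSTANCES ACTUAL_CPU TARGET_CPU
# Prints ceil(CURRENT_INSTANCES * ACTUAL_CPU / TARGET_CPU) using integer math.
target_tracking_estimate() {
  current=$1
  actual=$2
  target=$3
  echo $(((current * actual + target - 1) / target))
}
```

With a 50% target, 4 instances running at 70% CPU suggest about 6 instances, and 10 instances idling at 40% suggest about 8. Step scaling would instead add or remove a fixed number of instances per alarm band.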
Scenario-Based Questions
- Q: How would you implement zero-downtime deployments with ASG?
  - A: Update the launch template with the new AMI/config, then trigger an ASG Instance Refresh with MinHealthyPercentage=90 and InstanceWarmup=300. ASG replaces instances in batches while aiming to keep at least 90% of capacity healthy. Alternatively, use blue/green with two ASGs and weighted target groups for more control.
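The batch behavior in that answer can be estimated with simple arithmetic. The sketch below is an approximation under stated assumptions, since the exact rounding AWS applies is not covered in this chapter: it rounds the number of instances that must stay healthy up, and it assumes at least one instance is replaced per batch so the refresh can make progress. The function name `refresh_batch_size` is hypothetical.

```shell
# refresh_batch_size DESIRED_CAPACITY MIN_HEALTHY_PERCENT
# Prints an estimate of how many instances can be replaced at the same time.
refresh_batch_size() {
  desired=$1
  pct=$2
  # Instances that must stay in service, rounded up.
  must_stay=$(((desired * pct + 99) / 100))
  batch=$((desired - must_stay))
  # Assumption: never less than one, or the refresh could not proceed.
  if [ "$batch" -lt 1 ]; then
    batch=1
  fi
  echo "$batch"
}
```

Under these assumptions a 20-instance group at 90% is refreshed two instances at a time, while a 3-instance group is refreshed one at a time, so small groups take proportionally longer and briefly run below the percentage.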
7.12 Exam Tips
- ALB vs NLB: ALB for HTTP/HTTPS (Layer 7), NLB for TCP/UDP (Layer 4)
- Target Groups: Can target instances, IPs, or Lambda functions
- Health Checks: An ASG uses EC2 status checks by default; set the health check type to ELB so it also replaces instances that fail load balancer health checks
- Scaling Policies: Target tracking is simplest, step scaling for granular control
- Cooldown: Prevents rapid scaling cycles
- Instance Protection: Prevents scale-in termination
- Connection Draining: Allows in-flight requests to complete
- Cross-Zone Load Balancing: Distributes traffic evenly across registered targets in all enabled AZs
Next Chapter
Chapter 8: AWS Lambda - Serverless Computing
Last Updated: March 2026