Auto Scaling and Elastic Load Balancing

Building Scalable and Highly Available Applications

Auto Scaling and Elastic Load Balancing work together: the load balancer distributes traffic across healthy instances, and the Auto Scaling group adds or removes instances as demand changes.

Auto Scaling & Load Balancing Architecture
+------------------------------------------------------------------+
|                                                                  |
|                              Internet                            |
|                                 |                                |
|                                 v                                |
|                         +---------------+                        |
|                         |   Route 53    |                        |
|                         +---------------+                        |
|                                 |                                |
|                                 v                                |
|               +-----------------------------------+              |
|               |     Application Load Balancer     |              |
|               |               (ALB)               |              |
|               +-----------------------------------+              |
|                |                                 |               |
|                v                                 v               |
|    +----------------------+          +----------------------+    |
|    |     Auto Scaling     |          |     Auto Scaling     |    |
|    |       Group 1        |          |       Group 2        |    |
|    |                      |          |                      |    |
|    | +----+ +----+ +----+ |          | +----+ +----+ +----+ |    |
|    | |EC2 | |EC2 | |EC2 | |          | |EC2 | |EC2 | |EC2 | |    |
|    | +----+ +----+ +----+ |          | +----+ +----+ +----+ |    |
|    |  AZ-A   AZ-B   AZ-C  |          |  AZ-A   AZ-B   AZ-C  |    |
|    +----------------------+          +----------------------+    |
|                                                                  |
|  Components:                                                     |
|  - Load Balancer: Distributes traffic                            |
|  - Auto Scaling Group: Manages instance count                    |
|  - Launch Template: Defines instance configuration               |
|                                                                  |
+------------------------------------------------------------------+

Load Balancer Types
+------------------------------------------------------------------+
| |
| Application Load Balancer (ALB) |
| +----------------------------------------------------------+ |
| | Layer: 7 (HTTP/HTTPS) | |
| | Features: | |
| | - Content-based routing | |
| | - Host-based routing | |
| | - Path-based routing | |
| | - WebSocket support | |
| | - HTTP/2 support | |
| | - TLS termination | |
| | Use Cases: | |
| | - Web applications | |
| | - Microservices | |
| | - Containerized applications | |
| +----------------------------------------------------------+ |
| |
| Network Load Balancer (NLB) |
| +----------------------------------------------------------+ |
| | Layer: 4 (TCP/UDP) | |
| | Features: | |
| | - Ultra-high performance | |
| | - Static IP address | |
| | - TLS passthrough | |
| | - UDP support | |
| | - Millions of requests/second | |
| | Use Cases: | |
| | - Real-time gaming | |
| | - IoT | |
| | - Non-HTTP workloads | |
| +----------------------------------------------------------+ |
| |
| Gateway Load Balancer (GWLB) |
| +----------------------------------------------------------+ |
| | Layer: 3 (IP) | |
| | Features: | |
| | - Transparent network gateway | |
| | - Third-party security appliances | |
| | - Inline traffic inspection | |
| | Use Cases: | |
| | - Firewalls | |
| | - IDS/IPS | |
| | - Deep packet inspection | |
| +----------------------------------------------------------+ |
| |
| Classic Load Balancer (CLB) - Legacy |
| +----------------------------------------------------------+ |
| | Layer: 4 & 7 | |
| | Status: Previous generation (use ALB/NLB instead) | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
ALB Request Routing
+------------------------------------------------------------------+
|                                                                  |
|                          Client Request                          |
|                                 |                                |
|                                 v                                |
|                         +---------------+                        |
|                         |   Listener    |                        |
|                         |  (Port 443)   |                        |
|                         +---------------+                        |
|                                 |                                |
|                                 v                                |
|                         +---------------+                        |
|                         |     Rules     |                        |
|                         |  Evaluation   |                        |
|                         +---------------+                        |
|                                 |                                |
|              +------------------+------------------+             |
|              |                  |                  |             |
|              v                  v                  v             |
|         +---------+        +---------+        +---------+        |
|         | Rule 1  |        | Rule 2  |        | Default |        |
|         |         |        |         |        | Rule    |        |
|         | Host:   |        | Path:   |        |         |        |
|         | api.    |        | /images |        |         |        |
|         | example.|        |         |        |         |        |
|         | com     |        |         |        |         |        |
|         +---------+        +---------+        +---------+        |
|              |                  |                  |             |
|              v                  v                  v             |
|         +---------+        +---------+        +---------+        |
|         | Target  |        | Target  |        | Target  |        |
|         | Group 1 |        | Group 2 |        | Group 3 |        |
|         |         |        |         |        |         |        |
|         | API     |        | Image   |        | Web     |        |
|         | Servers |        | Servers |        | Servers |        |
|         +---------+        +---------+        +---------+        |
|                                                                  |
+------------------------------------------------------------------+
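
The host-based and path-based rules in the routing diagram could be expressed in Terraform roughly as follows. This is a minimal sketch, not a complete configuration: it assumes the `aws_lb_listener.https` listener defined later in this chapter, and the `aws_lb_target_group.api` and `aws_lb_target_group.images` target groups are hypothetical names that are not defined anywhere in this chapter. The priorities are illustrative.

```hcl
# Sketch only: aws_lb_target_group.api and aws_lb_target_group.images are
# hypothetical target groups that would need to be defined separately.

# Rule 1: requests for api.example.com go to the API servers.
resource "aws_lb_listener_rule" "api_host" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 10 # lower numbers are evaluated first

  condition {
    host_header {
      values = ["api.example.com"]
    }
  }

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.api.arn
  }
}

# Rule 2: requests under /images go to the image servers.
resource "aws_lb_listener_rule" "images_path" {
  listener_arn = aws_lb_listener.https.arn
  priority     = 20

  condition {
    path_pattern {
      values = ["/images", "/images/*"]
    }
  }

  action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.images.arn
  }
}
```

Requests that match neither rule fall through to the listener's default action, which plays the role of the Default Rule box in the diagram.
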
Target Group Configuration
+------------------------------------------------------------------+
|                                                                  |
|  Target Group Settings:                                          |
|  +------------------------------------------------------------+  |
|  | Protocol: HTTP/HTTPS/TCP                                   |  |
|  | Port: Application port                                     |  |
|  | Health Check:                                              |  |
|  |   - Path: /health                                          |  |
|  |   - Interval: 30 seconds                                   |  |
|  |   - Timeout: 5 seconds                                     |  |
|  |   - Healthy threshold: 3                                   |  |
|  |   - Unhealthy threshold: 2                                 |  |
|  +------------------------------------------------------------+  |
|                                                                  |
|  Target Types:                                                   |
|  +------------------------------------------------------------+  |
|  | Instance          | IP Address         | Lambda            |  |
|  |   +----------+    |    +----------+    |   +----------+    |  |
|  |   | EC2      |    |    | Private  |    |   | Function |    |  |
|  |   | Instance |    |    | IP       |    |   |          |    |  |
|  |   | ID       |    |    | Address  |    |   |          |    |  |
|  |   +----------+    |    +----------+    |   +----------+    |  |
|  |                   |                    |                   |  |
|  | Use: EC2 in ASG   | Use: Containers    | Use: Serverless   |  |
|  |                   |      on ECS/EKS    |                   |  |
|  +------------------------------------------------------------+  |
|                                                                  |
+------------------------------------------------------------------+
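
As an illustration of the IP target type, a target group for containers might look like the sketch below. The resource name `api_ip` and the group name `api-ip-tg` are made up for this example. Port 3000 is borrowed from the API-TG entry in the setup diagram later in this chapter, and `var.vpc_id` is the same input variable the chapter's Terraform uses.

```hcl
# Sketch only: names are illustrative, not part of the chapter's configuration.
resource "aws_lb_target_group" "api_ip" {
  name        = "api-ip-tg"
  port        = 3000
  protocol    = "HTTP"
  vpc_id      = var.vpc_id
  target_type = "ip" # the default is "instance"; "lambda" is the third option

  # Same health check values as the settings box above.
  health_check {
    path                = "/health"
    interval            = 30
    timeout             = 5
    healthy_threshold   = 3
    unhealthy_threshold = 2
  }
}
```

With these values, a failing target is taken out of rotation after roughly 2 x 30 = 60 seconds, and a recovering target needs roughly 3 x 30 = 90 seconds of passing checks before it receives traffic again.
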
Health Check Flow
+------------------------------------------------------------------+
| |
| Load Balancer |
| +----------------------------------------------------------+ |
| | | |
| | Health Check Request (every 30s) | |
| | | | |
| | v | |
| | +----------+ +----------+ +----------+ | |
| | | Target 1 | | Target 2 | | Target 3 | | |
| | | | | | | | | |
| | | GET | | GET | | GET | | |
| | | /health | | /health | | /health | | |
| | | | | | | | | |
| | | 200 OK | | 200 OK | | 503 | | |
| | | HEALTHY | | HEALTHY | | UNHEALTHY| | |
| | +----------+ +----------+ +----------+ | |
| | | |
| | Traffic Routing: | |
| | - Only routes to HEALTHY targets | |
| | - Unhealthy targets removed from rotation | |
| | - Auto Scaling can replace unhealthy instances | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+

Auto Scaling Group Components
+------------------------------------------------------------------+
|                                                                  |
|                    +------------------------+                    |
|                    |   Auto Scaling Group   |                    |
|                    +------------------------+                    |
|                                 |                                |
|            +--------------------+--------------------+           |
|            |                    |                    |           |
|            v                    v                    v           |
|       +----------+         +----------+         +----------+     |
|       |  Launch  |         | Scaling  |         |  Health  |     |
|       | Template |         | Policies |         |  Checks  |     |
|       +----------+         +----------+         +----------+     |
|                                                                  |
|  Launch Template: Instance configuration                         |
|  Scaling Policies: When to scale                                 |
|  Health Checks: Instance health monitoring                       |
|                                                                  |
+------------------------------------------------------------------+
Auto Scaling Policies
+------------------------------------------------------------------+
| |
| 1. Simple Scaling |
| +----------------------------------------------------------+ |
| | | |
| | CloudWatch Alarm | |
| | | | |
| | v | |
| | +----------+ +----------+ | |
| | | CPU > 80%| --> | Add 2 | | |
| | | | | Instances| | |
| | +----------+ +----------+ | |
| | | |
| | Cooldown Period: Wait before next scaling action | |
| +----------------------------------------------------------+ |
| |
| 2. Step Scaling |
| +----------------------------------------------------------+ |
| | | |
| | CPU Utilization Instances to Add | |
| | +----------------+-------------------+ | |
| | | 60-70% | +1 instance | | |
| | | 70-80% | +2 instances | | |
| | | 80-90% | +3 instances | | |
| | | > 90% | +4 instances | | |
| | +----------------+-------------------+ | |
| +----------------------------------------------------------+ |
| |
| 3. Target Tracking |
| +----------------------------------------------------------+ |
| | | |
| | Target: CPU = 50% | |
| | | |
| | Actual CPU Action | |
| | +----------------+-------------------+ | |
| | | 70% | Scale out | | |
| | | 40% | Scale in | | |
| | | 50% | No action | | |
| | +----------------+-------------------+ | |
| | | |
| | AWS automatically adjusts capacity | |
| +----------------------------------------------------------+ |
| |
| 4. Predictive Scaling |
| +----------------------------------------------------------+ |
| | | |
| | Uses ML to predict traffic patterns | |
| | | |
| | Time Predicted Traffic Instances | |
| | +----------------+-------------------+----------------+ | |
| | | 09:00 | High | 10 | | |
| | | 12:00 | Peak | 15 | | |
| | | 18:00 | Low | 5 | | |
| | +----------------+-------------------+----------------+ | |
| | | |
| | Pre-provisions capacity before traffic spikes | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
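
The target tracking and step scaling tables above could be written in Terraform along these lines. This is a hedged sketch: it assumes the `aws_autoscaling_group.web` group defined later in this chapter, and the resource names, the 60-second alarm period, and the 300-second warm-up are illustrative choices. The two policies are alternatives for the same metric; in practice you would normally attach one or the other.

```hcl
# Option A: target tracking, keeping average CPU near 50%.
resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "cpu-target-50"
  autoscaling_group_name = aws_autoscaling_group.web.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 50.0
  }
}

# Option B: step scaling. Step bounds are offsets from the alarm
# threshold (60%), so 0-10 means 60-70% CPU, 10-20 means 70-80%, etc.
resource "aws_autoscaling_policy" "cpu_steps" {
  name                      = "cpu-step-scale-out"
  autoscaling_group_name    = aws_autoscaling_group.web.name
  policy_type               = "StepScaling"
  adjustment_type           = "ChangeInCapacity"
  estimated_instance_warmup = 300 # assumed warm-up time in seconds

  step_adjustment {
    metric_interval_lower_bound = 0
    metric_interval_upper_bound = 10
    scaling_adjustment          = 1
  }
  step_adjustment {
    metric_interval_lower_bound = 10
    metric_interval_upper_bound = 20
    scaling_adjustment          = 2
  }
  step_adjustment {
    metric_interval_lower_bound = 20
    metric_interval_upper_bound = 30
    scaling_adjustment          = 3
  }
  step_adjustment {
    metric_interval_lower_bound = 30 # no upper bound: above 90% CPU
    scaling_adjustment          = 4
  }
}

# Alarm that drives the step policy once average CPU reaches 60%.
resource "aws_cloudwatch_metric_alarm" "cpu_steps" {
  alarm_name          = "cpu-above-60"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 60
  statistic           = "Average"
  threshold           = 60

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }

  alarm_actions = [aws_autoscaling_policy.cpu_steps.arn]
}
```

Target tracking creates and manages its own CloudWatch alarms, which is why only the step policy needs an explicit alarm resource here.
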
Auto Scaling Process
+------------------------------------------------------------------+
| |
| Scale Out (Add Instances) |
| +----------------------------------------------------------+ |
| | | |
| | 1. CloudWatch Alarm triggers | |
| | | | |
| | v | |
| | 2. ASG evaluates scaling policy | |
| | | | |
| | v | |
| | 3. Launch new instance using Launch Template | |
| | | | |
| | v | |
| | 4. Instance boots and passes EC2 status checks | |
| | | | |
| | v | |
| | 5. Instance registered with Load Balancer target group | |
| | | | |
| | v | |
| | 6. Traffic routed once target health checks pass | |
| | | |
| +----------------------------------------------------------+ |
| |
| Scale In (Remove Instances) |
| +----------------------------------------------------------+ |
| | | |
| | 1. CloudWatch Alarm triggers (low utilization) | |
| | | | |
| | v | |
| | 2. ASG selects instance to terminate | |
| | | | |
| | v | |
| | 3. Instance enters Terminating state | |
| | | | |
| | v | |
| | 4. Instance starts deregistering from target group | |
| | | | |
| | v | |
| | 5. Connection draining (deregistration delay) completes | |
| | | | |
| | v | |
| | 6. Instance terminated | |
| | | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
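
On an ALB or NLB target group, connection draining is configured as the deregistration delay. The sketch below shows the attribute in isolation; the resource name `web_draining` and the 60-second value are illustrative assumptions, and in a real configuration the attribute would be set on the existing target group rather than on a new one.

```hcl
# Sketch only: a stand-alone target group to show the attribute in context.
resource "aws_lb_target_group" "web_draining" {
  name     = "web-draining-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  # Seconds the load balancer keeps a deregistering target available for
  # in-flight requests before removing it (300 if not set).
  deregistration_delay = 60
}
```

A shorter delay speeds up scale-in and deployments; a longer one suits long-running requests.
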
Instance Protection Options
+------------------------------------------------------------------+
|                                                                  |
|  Scale-In Protection                                             |
|  +------------------------------------------------------------+  |
|  |                                                            |  |
|  | Protected Instances (not terminated during scale-in)       |  |
|  |                                                            |  |
|  |    +----------+      +----------+      +----------+        |  |
|  |    | Instance |      | Instance |      | Instance |        |  |
|  |    |    1     |      |    2     |      |    3     |        |  |
|  |    |  [LOCK]  |      |  Can be  |      |  [LOCK]  |        |  |
|  |    |Protected |      |terminated|      |Protected |        |  |
|  |    +----------+      +----------+      +----------+        |  |
|  |                                                            |  |
|  | Enable:                                                    |  |
|  |   aws autoscaling set-instance-protection \                |  |
|  |     --instance-ids i-12345 \                               |  |
|  |     --auto-scaling-group-name my-asg \                     |  |
|  |     --protected-from-scale-in                              |  |
|  +------------------------------------------------------------+  |
|                                                                  |
|  Standby State                                                   |
|  +------------------------------------------------------------+  |
|  |                                                            |  |
|  | Instance in Standby:                                       |  |
|  | - Not serving traffic                                      |  |
|  | - Not replaced if desired capacity is decremented          |  |
|  | - Can be updated or troubleshot                            |  |
|  |                                                            |  |
|  | Enter Standby:                                             |  |
|  |   aws autoscaling enter-standby \                          |  |
|  |     --instance-ids i-12345 \                               |  |
|  |     --auto-scaling-group-name my-asg \                     |  |
|  |     --should-decrement-desired-capacity                    |  |
|  +------------------------------------------------------------+  |
|                                                                  |
+------------------------------------------------------------------+
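
Scale-in protection can also be switched on at the group level so that every newly launched instance starts out protected. The sketch below is illustrative: the group name `workers` and its sizes are assumptions, and it reuses the `aws_launch_template.web` template and `var.private_subnet_ids` variable from the Terraform later in this chapter.

```hcl
# Sketch only: a hypothetical group whose instances should not be
# terminated by scale-in events.
resource "aws_autoscaling_group" "workers" {
  name                = "workers-asg"
  vpc_zone_identifier = var.private_subnet_ids
  min_size            = 1
  max_size            = 4

  # Instances launched after this is set start with scale-in protection.
  protect_from_scale_in = true

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }
}
```

Instances that were already running before the setting was enabled are not affected, so they still need the `set-instance-protection` command shown above.
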

Complete ALB + ASG Setup
+------------------------------------------------------------------+
|                                                                  |
|                              Internet                            |
|                                 |                                |
|                                 v                                |
|                         +---------------+                        |
|                         |   Route 53    |                        |
|                         |     (DNS)     |                        |
|                         +---------------+                        |
|                                 |                                |
|                                 v                                |
|               +-----------------------------------+              |
|               |     Application Load Balancer     |              |
|               |                                   |              |
|               |  Listeners:                       |              |
|               |  - Port 80 (HTTP) -> Redirect     |              |
|               |  - Port 443 (HTTPS)               |              |
|               |                                   |              |
|               |  Target Groups:                   |              |
|               |  - Web-TG (Port 8080)             |              |
|               |  - API-TG (Port 3000)             |              |
|               +-----------------------------------+              |
|                    /                         \                   |
|                  /                             \                 |
|                v                                 v               |
| +----------------------+  +------------------------------------+ |
| |     Auto Scaling     |  |            Auto Scaling            | |
| |      Group: Web      |  |             Group: API             | |
| |                      |  |                                    | |
| |  Min: 2              |  |  Min: 2                            | |
| |  Max: 10             |  |  Max: 20                           | |
| |  Desired: 3          |  |  Desired: 5                        | |
| |                      |  |                                    | |
| | +----+ +----+ +----+ |  | +----+ +----+ +----+ +----+ +----+ | |
| | |Web | |Web | |Web | |  | |API | |API | |API | |API | |API | | |
| | +----+ +----+ +----+ |  | +----+ +----+ +----+ +----+ +----+ | |
| +----------------------+  +------------------------------------+ |
|                                                                  |
+------------------------------------------------------------------+

# ============================================================
# Application Load Balancer
# ============================================================
# Assumes aws_security_group.alb and the var.* inputs are defined elsewhere.
resource "aws_lb" "main" {
  name                       = "my-alb"
  internal                   = false
  load_balancer_type         = "application"
  security_groups            = [aws_security_group.alb.id]
  subnets                    = var.public_subnet_ids
  enable_deletion_protection = false # consider enabling for production
}

resource "aws_lb_target_group" "web" {
  name     = "web-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    enabled             = true
    healthy_threshold   = 3
    interval            = 30
    matcher             = "200"
    path                = "/health"
    port                = "traffic-port"
    protocol            = "HTTP"
    timeout             = 5
    unhealthy_threshold = 2
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.certificate_arn

  # Requests that match no listener rule are forwarded to the web servers.
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}
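# ============================================================
# Auto Scaling Group
# ============================================================
# Assumes aws_iam_instance_profile.web and aws_security_group.web are
# defined elsewhere.
resource "aws_launch_template" "web" {
  name_prefix   = "web-"
  image_id      = var.ami_id
  instance_type = "t3.medium"
  key_name      = var.key_name

  iam_instance_profile {
    name = aws_iam_instance_profile.web.name
  }

  network_interfaces {
    associate_public_ip_address = false
    security_groups             = [aws_security_group.web.id]
  }

  # Apache listens on port 80 by default; move it to 8080 and publish a
  # /health page so the instance matches the web-tg port and health check.
  user_data = base64encode(<<-EOF
    #!/bin/bash
    yum install -y httpd
    sed -i 's/^Listen 80$/Listen 8080/' /etc/httpd/conf/httpd.conf
    echo "ok" > /var/www/html/health
    systemctl enable --now httpd
    EOF
  )

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "WebServer"
    }
  }
}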
resource "aws_autoscaling_group" "web" {
  name                = "web-asg"
  vpc_zone_identifier = var.private_subnet_ids
  min_size            = 2
  max_size            = 10
  desired_capacity    = 3

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }

  # Register instances with the ALB target group and replace any instance
  # that fails the load balancer health check.
  target_group_arns         = [aws_lb_target_group.web.arn]
  health_check_type         = "ELB"
  health_check_grace_period = 300 # seconds allowed for boot before checks count

  tag {
    key                 = "Name"
    value               = "WebServer"
    propagate_at_launch = true
  }
}

# Simple scaling: add 2 instances, then wait out the cooldown.
resource "aws_autoscaling_policy" "scale_out" {
  name                   = "scale-out"
  scaling_adjustment     = 2
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 300
  autoscaling_group_name = aws_autoscaling_group.web.name
}

# Fires when average CPU stays above 80% for two 120-second periods.
resource "aws_cloudwatch_metric_alarm" "high_cpu" {
  alarm_name          = "high-cpu"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120
  statistic           = "Average"
  threshold           = 80

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }

  alarm_actions = [aws_autoscaling_policy.scale_out.arn]
}
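
The setup diagram lists a port 80 listener that redirects to HTTPS, and the configuration above defines a scale-out policy without a scale-in counterpart. One possible way to fill both gaps is sketched below. The resource names, the 30% threshold, and the choice to remove one instance at a time are assumptions made for illustration, not values taken from this chapter.

```hcl
# HTTP listener that redirects every request to HTTPS on port 443.
resource "aws_lb_listener" "http_redirect" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "redirect"

    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}

# Simple scaling: remove 1 instance, then wait out the cooldown.
resource "aws_autoscaling_policy" "scale_in" {
  name                   = "scale-in"
  scaling_adjustment     = -1
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 300
  autoscaling_group_name = aws_autoscaling_group.web.name
}

# Fires when average CPU stays below 30% (assumed threshold) for two
# 120-second periods.
resource "aws_cloudwatch_metric_alarm" "low_cpu" {
  alarm_name          = "low-cpu"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120
  statistic           = "Average"
  threshold           = 30

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }

  alarm_actions = [aws_autoscaling_policy.scale_in.arn]
}
```

Keeping the scale-in threshold well below the 80% scale-out threshold leaves a gap between the two alarms, which helps avoid the group oscillating between adding and removing instances.
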

Auto Scaling & Load Balancing Best Practices
+------------------------------------------------------------------+
| |
| 1. Multi-AZ Deployment |
| +----------------------------------------------------------+ |
| | - Deploy across minimum 2 AZs | |
| | - Use all available AZs for maximum availability | |
| | - Configure subnets in each AZ | |
| +----------------------------------------------------------+ |
| |
| 2. Health Check Configuration |
| +----------------------------------------------------------+ |
| | - Use meaningful health check endpoints | |
| | - Set appropriate thresholds | |
| | - Configure grace period for instance startup | |
| +----------------------------------------------------------+ |
| |
| 3. Scaling Policies |
| +----------------------------------------------------------+ |
| | - Use target tracking for simplicity | |
| | - Set appropriate cooldown periods | |
| | - Consider predictive scaling for known patterns | |
| +----------------------------------------------------------+ |
| |
| 4. Instance Warm-up |
| +----------------------------------------------------------+ |
| | - Allow time for instance initialization | |
| | - Use lifecycle hooks for custom initialization | |
| | - Configure appropriate grace period | |
| +----------------------------------------------------------+ |
| |
| 5. Monitoring |
| +----------------------------------------------------------+ |
| | - Monitor scaling activities | |
| | - Set up CloudWatch alarms | |
| | - Use ASG notifications | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
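
Two of the practices above, lifecycle hooks for custom initialization and ASG notifications, might be wired up as in this sketch. It assumes the `aws_autoscaling_group.web` group from the earlier configuration. The hook name, the 300-second timeout, and the `aws_sns_topic.scaling_events` topic are hypothetical and would need to be defined or adjusted for a real deployment.

```hcl
# Hold new instances in Pending:Wait until initialization finishes.
resource "aws_autoscaling_lifecycle_hook" "launch_init" {
  name                   = "wait-for-init"
  autoscaling_group_name = aws_autoscaling_group.web.name
  lifecycle_transition   = "autoscaling:EC2_INSTANCE_LAUNCHING"
  heartbeat_timeout      = 300       # assumed: seconds to wait for completion
  default_result         = "ABANDON" # terminate if nothing completes the hook
}

# Publish scaling activity to an SNS topic (hypothetical topic resource).
resource "aws_autoscaling_notification" "scaling_events" {
  group_names = [aws_autoscaling_group.web.name]
  topic_arn   = aws_sns_topic.scaling_events.arn

  notifications = [
    "autoscaling:EC2_INSTANCE_LAUNCH",
    "autoscaling:EC2_INSTANCE_TERMINATE",
    "autoscaling:EC2_INSTANCE_LAUNCH_ERROR",
    "autoscaling:EC2_INSTANCE_TERMINATE_ERROR",
  ]
}
```

With a launching hook in place, something on the instance (for example the end of its user data script) has to call `aws autoscaling complete-lifecycle-action`; otherwise the instance waits for the full timeout and the default result applies.
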

Exam Tip

  1. ALB vs NLB: ALB for HTTP/HTTPS (Layer 7), NLB for TCP/UDP/TLS (Layer 4)
  2. Target Groups: Can target instances, IP addresses, or Lambda functions (Lambda targets are ALB only)
  3. Health Checks: An ASG uses EC2 status checks by default; set the health check type to ELB so it also replaces instances that fail load balancer health checks
  4. Scaling Policies: Target tracking is simplest; step scaling gives granular control
  5. Cooldown: Prevents rapid scaling cycles after a simple scaling action
  6. Instance Protection: Prevents termination during scale-in
  7. Connection Draining: Allows in-flight requests to complete (called deregistration delay on ALB/NLB)
  8. Cross-Zone Load Balancing: Distributes traffic evenly across registered targets in all enabled AZs


Last Updated: February 2026