Containers on AWS (ECS, EKS, Fargate)

Chapter 9: AWS Elastic Container Services (ECS/EKS)

Running Containerized Applications at Scale

AWS offers two container orchestrators, ECS and EKS, plus Fargate, a serverless compute engine that both can use to run Docker containers at scale without managing servers.

AWS Container Services
+------------------------------------------------------------------+
| |
| +------------------------+ |
| | Container Services | |
| +------------------------+ |
| | |
| +---------------------+---------------------+ |
| | | | |
| v v v |
| +----------+ +----------+ +----------+ |
| | ECS | | EKS | | Fargate | |
| | | | | | | |
| | Amazon's | | Managed | | Serverless| |
| | Native | |Kubernetes| | Container | |
| | Container| | Service | | Compute | |
| | Service | | | | | |
| +----------+ +----------+ +----------+ |
| |
| ECS: Simple, AWS-native container orchestration |
| EKS: Kubernetes-compatible, portable workloads |
| Fargate: Serverless compute for both ECS and EKS |
| |
+------------------------------------------------------------------+

ECS Core Components
+------------------------------------------------------------------+
| |
| +------------------------+ |
| | ECS Cluster | |
| +------------------------+ |
| | |
| +---------------------+---------------------+ |
| | | | |
| v v v |
| +----------+ +----------+ +----------+ |
| | Task | | Service | | Container| |
| | Definition| | | | Instance | |
| +----------+ +----------+ +----------+ |
| |
| Task Definition: Blueprint for containers |
| Service: Manages running tasks (scaling, load balancing) |
| Container Instance: EC2 instance running ECS agent |
| |
+------------------------------------------------------------------+
ECS Launch Types Comparison
+------------------------------------------------------------------+
| |
| EC2 Launch Type |
| +----------------------------------------------------------+ |
| | | |
| | +------------------+ | |
| | | EC2 Instance | | |
| | | | | |
| | | +------------+ | +------------+ +------------+ | |
| | | | Container 1| | | Container 2| | Container 3| | |
| | | +------------+ | +------------+ +------------+ | |
| | | | | |
| | | ECS Agent | | |
| | +------------------+ | |
| | | |
| | You manage: | |
| | - EC2 instances | |
| | - Scaling | |
| | - Security patches | |
| +----------------------------------------------------------+ |
| |
| Fargate Launch Type |
| +----------------------------------------------------------+ |
| | | |
| | +------------------+ | |
| | | Fargate Task | | |
| | | | | |
| | | +------------+ | | |
| | | | Container | | <-- One or more containers per task | |
| | | +------------+ | | |
| | | | | |
| | +------------------+ | |
| | | |
| | AWS manages: | |
| | - Infrastructure | |
| | - Host capacity (service scaling is still yours) | |
| | - Host OS security patches | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
{
  "family": "web-app-task",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "web-app",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/web-app:latest",
      "essential": true,
      "portMappings": [
        {
          "containerPort": 8080,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {"name": "ENVIRONMENT", "value": "production"}
      ],
      "secrets": [
        {"name": "DB_PASSWORD", "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:db-password"}
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/web-app",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
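
As a rough sketch of how a task definition like the one above might be used, the helpers below register it and start a single Fargate task. The file name `task-definition.json`, the `run` dry-run wrapper, and the subnet and security group IDs are assumptions made for illustration. By default the commands are only printed; set `DRY_RUN=0` to execute them.

```shell
# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Register the task definition stored in a local JSON file (hypothetical file name).
register_task_def() {
  file="${1:-task-definition.json}"
  run aws ecs register-task-definition --cli-input-json "file://${file}"
}

# Start one task on Fargate in a private subnet (no public IP).
# Arguments: cluster, task family, subnet ID, security group ID (all placeholders).
run_fargate_task() {
  cluster="$1"; family="$2"; subnet="$3"; sg="$4"
  run aws ecs run-task \
    --cluster "$cluster" \
    --launch-type FARGATE \
    --task-definition "$family" \
    --network-configuration \
    "awsvpcConfiguration={subnets=[${subnet}],securityGroups=[${sg}],assignPublicIp=DISABLED}"
}
```

For example, `run_fargate_task production web-app-task subnet-0abc sg-0abc` prints the full `aws ecs run-task` command so it can be reviewed before running it for real.
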
ECS Service Types
+------------------------------------------------------------------+
| |
| 1. Replica Service |
| +----------------------------------------------------------+ |
| | | |
| | +------------+ +------------+ +------------+ | |
| | | Task 1 | | Task 2 | | Task 3 | | |
| | | (Replica) | | (Replica) | | (Replica) | | |
| | +------------+ +------------+ +------------+ | |
| | | |
| | Use Case: Web servers, APIs | |
| | Scaling: Based on CPU, memory, or ALB requests | |
| +----------------------------------------------------------+ |
| |
| 2. Daemon Service |
| +----------------------------------------------------------+ |
| | | |
| | EC2 Instance 1 EC2 Instance 2 EC2 Instance 3 | |
| | +------------+ +------------+ +------------+ | |
| | | Task | | Task | | Task | | |
| | | (Daemon) | | (Daemon) | | (Daemon) | | |
| | +------------+ +------------+ +------------+ | |
| | | |
| | Use Case: Logging agents, monitoring agents | |
| | One task per container instance | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
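
The two service types above correspond to the `--scheduling-strategy` option of `aws ecs create-service`. The sketch below only builds and prints the command so the difference is visible. The function name, the `production` cluster, and the service names are assumptions for illustration.

```shell
# Build (and only print) an `aws ecs create-service` command for either strategy.
# Arguments: strategy (REPLICA|DAEMON), service name, task definition, [desired count]
create_service_cmd() {
  strategy="$1"; name="$2"; task_def="$3"; count="${4:-3}"
  cmd="aws ecs create-service --cluster production --service-name $name --task-definition $task_def --scheduling-strategy $strategy"
  case "$strategy" in
    # Replica: the scheduler keeps the requested number of copies running.
    REPLICA) cmd="$cmd --desired-count $count" ;;
    # Daemon: one task per container instance, so no count; needs EC2 capacity.
    DAEMON)  cmd="$cmd --launch-type EC2" ;;
    *) echo "unknown strategy: $strategy" >&2; return 1 ;;
  esac
  echo "$cmd"
}

create_service_cmd REPLICA web-api web-app-task 3
create_service_cmd DAEMON log-agent log-agent-task
```
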

EKS Architecture
+------------------------------------------------------------------+
| |
| +------------------------+ |
| | EKS Cluster | |
| +------------------------+ |
| | |
| +---------------------+---------------------+ |
| | | | |
| v v v |
| +----------+ +----------+ +----------+ |
| | Control | | Worker | | Fargate | |
| | Plane | | Nodes | | Profile | |
| |(Managed) | | | | | |
| +----------+ +----------+ +----------+ |
| |
| Control Plane: Managed by AWS (API server, etcd) |
| Worker Nodes: EC2 instances running Kubernetes |
| Fargate Profile: Serverless Kubernetes pods |
| |
+------------------------------------------------------------------+
EKS Node Options
+------------------------------------------------------------------+
| |
| 1. Managed Node Groups |
| +----------------------------------------------------------+ |
| | | |
| | Features: | |
| | - Automated provisioning | |
| | - Automated updates | |
| | - Managed by AWS | |
| | - Can use Spot instances | |
| | | |
| | Node Group Configuration: | |
| | - Instance types | |
| | - AMI version | |
| | - Scaling config (min/max/desired) | |
| | - Labels and taints | |
| +----------------------------------------------------------+ |
| |
| 2. Self-Managed Nodes |
| +----------------------------------------------------------+ |
| | | |
| | Features: | |
| | - Full control over nodes | |
| | - Custom AMI | |
| | - Custom bootstrap scripts | |
| | - Manual updates | |
| +----------------------------------------------------------+ |
| |
| 3. Fargate Profiles |
| +----------------------------------------------------------+ |
| | | |
| | Features: | |
| | - Serverless pods | |
| | - No node management | |
| | - Pods selected by namespace (and optional labels) | |
| | - Higher cost but less overhead | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
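
To make the Fargate profile option more concrete, here is a minimal sketch that prints an `aws eks create-fargate-profile` command for one namespace. The function name, the `fp-` profile prefix, the role ARN, and the subnet IDs are hypothetical placeholders; the pod execution role and private subnets are assumed to exist already.

```shell
# Print the command that would create a Fargate profile for one namespace.
# Arguments: cluster name, namespace, pod execution role ARN, private subnet IDs (space separated)
fargate_profile_cmd() {
  cluster="$1"; namespace="$2"; role_arn="$3"; subnets="$4"
  echo "aws eks create-fargate-profile" \
    "--cluster-name $cluster" \
    "--fargate-profile-name fp-$namespace" \
    "--pod-execution-role-arn $role_arn" \
    "--subnets $subnets" \
    "--selectors namespace=$namespace"
}

fargate_profile_cmd my-cluster batch \
  arn:aws:iam::123456789012:role/eks-fargate-pods \
  "subnet-0aaa subnet-0bbb"
```

Pods created in the `batch` namespace would then be scheduled onto Fargate, while pods in other namespaces keep using the node groups.
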
EKS VPC Networking
+------------------------------------------------------------------+
| |
| VPC CNI Plugin Architecture |
| +----------------------------------------------------------+ |
| | | |
| | VPC | |
| | +----------------------------------------------------+ | |
| | | | | |
| | | Subnet (10.0.1.0/24) | | |
| | | +----------+ +----------+ +----------+ | | |
| | | | Pod IP | | Pod IP | | Pod IP | | | |
| | | |10.0.1.10 | |10.0.1.11 | |10.0.1.12 | | | |
| | | +----------+ +----------+ +----------+ | | |
| | | | | | | | |
| | | +------+------+------+------+ | | |
| | | | | | | |
| | | v v | | |
| | | +------------------------------------------+ | | |
| | | | Worker Node (EC2) | | | |
| | | | 10.0.1.100 | | | |
| | | | | | | |
| | | | +--------+ +--------+ +--------+ | | | |
| | | | | Pod 1 | | Pod 2 | | Pod 3 | | | | |
| | | | +--------+ +--------+ +--------+ | | | |
| | | +------------------------------------------+ | | |
| | +----------------------------------------------------+ | |
| | | |
| | Benefits: | |
| | - Native VPC networking for pods | |
| | - Security groups per pod | |
| | - No overlay network overhead | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
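
Because every pod takes a real VPC IP address, the number of pods a node can host is bounded by its network interfaces. A small sketch of the commonly used estimate for the CNI's default secondary-IP mode (without prefix delegation) follows. The function name is made up here, and the ENI figures for the two instance types are taken from the EC2 network interface limits as an assumption of this example.

```shell
# Estimated max pods per node in secondary-IP mode:
#   ENIs * (IPv4 addresses per ENI - 1) + 2
# One address per ENI is the ENI's own primary IP; the +2 accounts for
# host-network pods that do not consume a pod IP.
max_pods() {
  enis="$1"; ips_per_eni="$2"
  echo $(( enis * (ips_per_eni - 1) + 2 ))
}

max_pods 3 10   # m5.large: 3 ENIs x 10 IPv4 addresses -> 29
max_pods 3 6    # t3.medium: 3 ENIs x 6 IPv4 addresses -> 17
```

This is also why subnet sizing matters: a /24 such as 10.0.1.0/24 can be exhausted by a handful of busy nodes.
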

Elastic Container Registry
+------------------------------------------------------------------+
| |
| ECR Repository Structure |
| +----------------------------------------------------------+ |
| | | |
| | Repository: my-app | |
| | +----------------------------------------------------+ | |
| | | | | |
| | | Images: | | |
| | | my-app:latest (sha256:abc123) | | |
| | | my-app:v1.0 (sha256:def456) | | |
| | | my-app:v1.1 (sha256:ghi789) | | |
| | | my-app@sha256:abc123 | | |
| | | | | |
| | +----------------------------------------------------+ | |
| | | |
| | Features: | |
| | - Private repositories | |
| | - Public repositories (ECR Public) | |
| | - Image scanning (security) | |
| | - Cross-region replication | |
| | - Lifecycle policies | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
# Login to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin \
  123456789012.dkr.ecr.us-east-1.amazonaws.com

# Create repository
aws ecr create-repository \
  --repository-name my-app \
  --image-scanning-configuration scanOnPush=true

# Build and push image
docker build -t my-app .
docker tag my-app:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest

# Set lifecycle policy
aws ecr put-lifecycle-policy \
  --repository-name my-app \
  --lifecycle-policy-text file://lifecycle-policy.json
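
The `lifecycle-policy.json` file referenced by the last command is not shown. One possible version, written as a sketch, is below: the retention numbers (14 days, 10 images) and the `v` tag prefix are assumptions chosen for illustration, not recommendations from this chapter.

```shell
# Write an example ECR lifecycle policy to the file the command above expects.
cat > lifecycle-policy.json <<'EOF'
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Expire untagged images 14 days after push",
      "selection": {
        "tagStatus": "untagged",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 14
      },
      "action": { "type": "expire" }
    },
    {
      "rulePriority": 2,
      "description": "Keep only the 10 newest images tagged v*",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["v"],
        "countType": "imageCountMoreThan",
        "countNumber": 10
      },
      "action": { "type": "expire" }
    }
  ]
}
EOF
```

Rules are evaluated in `rulePriority` order, and `expire` is the only action type a rule can take.
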

Container Security Checklist
+------------------------------------------------------------------+
| |
| 1. Image Security |
| +----------------------------------------------------------+ |
| | [ ] Use minimal base images | |
| | [ ] Scan images for vulnerabilities | |
| | [ ] Use specific image tags (not :latest) | |
| | [ ] Sign images | |
| +----------------------------------------------------------+ |
| |
| 2. Runtime Security |
| +----------------------------------------------------------+ |
| | [ ] Run as non-root user | |
| | [ ] Read-only root filesystem | |
| | [ ] Drop unnecessary capabilities | |
| | [ ] Use security contexts | |
| +----------------------------------------------------------+ |
| |
| 3. Network Security |
| +----------------------------------------------------------+ |
| | [ ] Use security groups | |
| | [ ] Network policies (EKS) | |
| | [ ] Service mesh (optional) | |
| | [ ] Private subnets | |
| +----------------------------------------------------------+ |
| |
| 4. Secrets Management |
| +----------------------------------------------------------+ |
| | [ ] Use AWS Secrets Manager | |
| | [ ] Use Parameter Store | |
| | [ ] Don't embed secrets in images | |
| | [ ] Rotate secrets regularly | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
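
Several of the image and runtime checklist items map directly onto fields of an ECS container definition. The sketch below writes such a fragment to a file; the file name `container-hardening.json`, the UID/GID `1000:1000`, and the `v1.1` tag are illustrative assumptions, and the image must actually be built to run as that user.

```shell
# Write a container definition fragment that applies the runtime checklist.
cat > container-hardening.json <<'EOF'
{
  "name": "my-app",
  "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:v1.1",
  "essential": true,
  "user": "1000:1000",
  "readonlyRootFilesystem": true,
  "linuxParameters": {
    "capabilities": { "drop": ["ALL"] }
  }
}
EOF
```

Here `image` uses a specific tag instead of `:latest`, `user` runs the process as non-root, `readonlyRootFilesystem` blocks writes to the image filesystem, and `capabilities.drop` removes Linux capabilities the application does not need.
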
Container Resource Configuration
+------------------------------------------------------------------+
| |
| Task/Container Resources |
| +----------------------------------------------------------+ |
| | | |
| | CPU: | |
| | - ECS on Fargate: 0.25 - 16 vCPU (256 - 16384 units) | |
| | - Kubernetes: millicores (100m = 0.1 CPU) | |
| | | |
| | Memory: | |
| | - ECS on Fargate: 512 MB - 120 GB (depends on CPU) | |
| | - Kubernetes: bytes (Mi, Gi) | |
| | | |
| | Best Practices: | |
| | - Set requests (guaranteed) | |
| | - Set limits (maximum) | |
| | - Monitor actual usage | |
| | - Right-size based on metrics | |
| +----------------------------------------------------------+ |
| |
| Example Task Definition: |
| +----------------------------------------------------------+ |
| | { | |
| | "cpu": "512", // 0.5 vCPU | |
| | "memory": "1024", // 1 GB | |
| | "containerDefinitions": [{ | |
| | "cpu": 256, // Container CPU | |
| | "memory": 512, // Container memory | |
| | "memoryReservation": 256 // Soft limit | |
| | }] | |
| | } | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
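
On Fargate, task-level `cpu` and `memory` cannot be chosen independently: each CPU size allows only a specific memory range. The helper below is a sketch that checks a pair against the published Fargate combinations as understood at the time of writing; the function name is invented for this example and the table should be re-checked against current AWS documentation before relying on it.

```shell
# Return 0 if (cpu units, memory MiB) is a valid Fargate task size, 1 otherwise.
fargate_size_ok() {
  cpu="$1"; mem="$2"
  case "$cpu" in
    256)
      # 0.25 vCPU only allows three fixed sizes.
      case "$mem" in 512|1024|2048) return 0 ;; *) return 1 ;; esac ;;
    512)   min=1024;  max=4096;   step=1024 ;;
    1024)  min=2048;  max=8192;   step=1024 ;;
    2048)  min=4096;  max=16384;  step=1024 ;;
    4096)  min=8192;  max=30720;  step=1024 ;;
    8192)  min=16384; max=61440;  step=4096 ;;
    16384) min=32768; max=122880; step=8192 ;;
    *) return 1 ;;
  esac
  [ "$mem" -ge "$min" ] && [ "$mem" -le "$max" ] && [ $(( (mem - min) % step )) -eq 0 ]
}

fargate_size_ok 512 1024 && echo "512/1024 is valid"
fargate_size_ok 256 4096 || echo "256/4096 is not valid"
```
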

# ============================================================
# ECS Cluster
# ============================================================
resource "aws_ecs_cluster" "main" {
  name = "my-cluster"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

# ============================================================
# CloudWatch Log Group
# ============================================================
resource "aws_cloudwatch_log_group" "ecs" {
  name              = "/ecs/my-app"
  retention_in_days = 30
}

# ============================================================
# ECS Task Definition
# ============================================================
resource "aws_ecs_task_definition" "app" {
  family                   = "my-app"
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = 256
  memory                   = 512

  container_definitions = jsonencode([
    {
      name      = "app"
      image     = "${aws_ecr_repository.app.repository_url}:latest"
      essential = true

      portMappings = [
        {
          containerPort = 8080
          protocol      = "tcp"
        }
      ]

      environment = [
        {
          name  = "ENVIRONMENT"
          value = "production"
        }
      ]

      logConfiguration = {
        logDriver = "awslogs"
        options = {
          "awslogs-group"         = aws_cloudwatch_log_group.ecs.name
          "awslogs-region"        = var.region
          "awslogs-stream-prefix" = "ecs"
        }
      }
    }
  ])

  execution_role_arn = aws_iam_role.ecs_execution.arn
  task_role_arn      = aws_iam_role.ecs_task.arn
}

# ============================================================
# ECS Service
# ============================================================
resource "aws_ecs_service" "app" {
  name            = "my-app"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = 3
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids
    security_groups  = [aws_security_group.ecs.id]
    assign_public_ip = false
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.app.arn
    container_name   = "app"
    container_port   = 8080
  }

  # Auto scaling changes the running count; without this, every apply
  # would reset the service back to desired_count.
  lifecycle {
    ignore_changes = [desired_count]
  }

  depends_on = [aws_lb_listener.https]
}

# ============================================================
# Auto Scaling
# ============================================================
resource "aws_appautoscaling_target" "ecs" {
  max_capacity       = 10
  min_capacity       = 2
  resource_id        = "service/${aws_ecs_cluster.main.name}/${aws_ecs_service.app.name}"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_policy" "cpu" {
  name               = "cpu-scaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace

  target_tracking_scaling_policy_configuration {
    target_value = 70

    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
  }
}

Containers are the standard deployment unit in modern DevOps. Your choice between ECS, EKS, and Fargate directly impacts operational complexity, team skill requirements, and cost.

Container Orchestration Decision Matrix
+------------------------------------------------------------------+
| |
| Decision: ECS vs EKS vs Fargate |
| |
| ECS + Fargate (Simplest) |
| +----------------------------------------------------------+ |
| | - Small teams, no K8s experience | |
| | - Simple microservices, internal tools | |
| | - Minimal operational overhead | |
| +----------------------------------------------------------+ |
| |
| EKS + Managed Nodes (Portable) |
| +----------------------------------------------------------+ |
| | - Multi-cloud strategy or K8s expertise on team | |
| | - Complex service mesh, advanced networking | |
| | - Rich ecosystem (Helm, Argo, Istio) | |
| +----------------------------------------------------------+ |
| |
| ECS on EC2 (Maximum Control) |
| +----------------------------------------------------------+ |
| | - GPU workloads, custom AMIs | |
| | - Cost optimization with Spot/RIs | |
| | - High-density task packing | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+

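# Install container tools on Arch Linux
sudo pacman -S docker docker-compose kubectl helm jq
yay -S aws-cli-v2 copilot-cli

# Enable Docker and let the current user run it without sudo
# (log out and back in for the group change to take effect)
sudo systemctl enable --now docker
sudo usermod -aG docker "$USER"

# ECR login helper: ecr-login [region]
ecr-login() {
  local region="${1:-us-east-1}"
  local account
  account="$(aws sts get-caller-identity --query Account --output text)" || return 1
  aws ecr get-login-password --region "$region" | \
    docker login --username AWS --password-stdin \
    "${account}.dkr.ecr.${region}.amazonaws.com"
}

# Build, tag, push workflow: deploy-container <app> [tag]
deploy-container() {
  local app="$1" tag="${2:-latest}"
  local account repo
  account="$(aws sts get-caller-identity --query Account --output text)" || return 1
  repo="${account}.dkr.ecr.us-east-1.amazonaws.com/${app}"

  ecr-login us-east-1 || return 1
  docker build -t "$app:$tag" . || return 1
  docker tag "$app:$tag" "$repo:$tag" || return 1
  docker push "$repo:$tag" || return 1

  # Force new deployment. This only picks up the new image when the
  # task definition references this same (mutable) tag.
  aws ecs update-service \
    --cluster production \
    --service "$app" \
    --force-new-deployment || return 1
  echo "✅ Deployed $app:$tag"
}

# ECS service status: ecs-status [cluster]
ecs-status() {
  local cluster="${1:-production}"
  local services
  services="$(aws ecs list-services --cluster "$cluster" --query 'serviceArns[*]' --output text)"
  [ -n "$services" ] || { echo "No services found in $cluster"; return 0; }
  # $services is left unquoted on purpose so each ARN becomes its own argument.
  # Note: describe-services accepts at most 10 services per call.
  aws ecs describe-services \
    --cluster "$cluster" \
    --services $services \
    --query 'services[*].{Name:serviceName,Desired:desiredCount,Running:runningCount,Status:status}' \
    --output table
}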

| Issue | Cause | Solution |
|---|---|---|
| Task fails to start | Image pull error | Check ECR permissions, verify image URI |
| Task keeps restarting | Container exits immediately | Check logs: aws logs tail /ecs/my-app |
| Container can’t reach internet | Missing NAT Gateway | Ensure private subnet has NAT GW route |
| Health check failing | Wrong health check path/port | Verify ALB target group health check config |
| Out of memory (OOM kill) | Memory limit too low | Increase task/container memory allocation |
| EKS pods stuck Pending | Insufficient node resources | Scale node group or right-size pods |

# Debug ECS task failures
# Find a stopped task for the service
task_arn="$(aws ecs list-tasks \
  --cluster production \
  --service-name my-app \
  --desired-status STOPPED \
  --query 'taskArns[0]' \
  --output text)"

# Check stopped task reason
aws ecs describe-tasks \
  --cluster production \
  --tasks "$task_arn" \
  --query 'tasks[0].{StopCode:stopCode,StoppedReason:stoppedReason,Containers:containers[*].{Name:name,ExitCode:exitCode,Reason:reason}}'

  1. Q: When would you choose ECS over EKS?

    • A: Choose ECS when (1) the team lacks Kubernetes expertise, (2) workloads are simple, (3) tight AWS integration is preferred, or (4) low operational overhead matters most. Choose EKS when (1) multi-cloud portability is needed, (2) the team already has Kubernetes skills, (3) you need the Kubernetes ecosystem (Helm, service mesh), or (4) networking requirements are complex.
  2. Q: Explain the difference between task role and execution role in ECS.

    • A: The execution role is used by the ECS agent to pull images from ECR, write logs to CloudWatch, and fetch secrets referenced in the task definition. The task role is assumed by the application code running inside the container to access AWS services (like S3, DynamoDB). Separate roles enforce least privilege.
  3. Q: Design a CI/CD pipeline for containerized microservices on ECS.

    • A: GitHub push → CodeBuild builds the Docker image → pushes to ECR → registers a new ECS task definition revision with the new image tag → the service deploys it, either as an ECS rolling update or as a CodeDeploy blue/green deployment in which the ALB shifts traffic gradually → CloudWatch alarms monitor the error rate → automatic rollback if errors spike.
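
The task role versus execution role split from the second answer can be sketched with the AWS CLI. Both roles trust the same service principal, `ecs-tasks.amazonaws.com`, and differ only in the permissions attached. The role names and the trust policy file name below are hypothetical, and `create_ecs_roles` is only defined here, not called.

```shell
# Trust policy shared by both roles: ECS tasks may assume them.
cat > ecs-tasks-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

create_ecs_roles() {
  # Execution role: used by the agent to pull images and write logs.
  aws iam create-role \
    --role-name my-app-execution \
    --assume-role-policy-document file://ecs-tasks-trust.json || return 1
  aws iam attach-role-policy \
    --role-name my-app-execution \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy || return 1

  # Task role: assumed by the application; attach only the permissions
  # the code needs (for example read access to one S3 bucket).
  aws iam create-role \
    --role-name my-app-task \
    --assume-role-policy-document file://ecs-tasks-trust.json || return 1
}
```

Keeping the task role free of the execution policy means application code cannot pull arbitrary images or write to other log groups.
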

Exam Tip

  1. ECS vs EKS: ECS is AWS-native, EKS is Kubernetes-compatible
  2. Fargate: Serverless compute for both ECS and EKS
  3. Task Definition: Blueprint for containers (CPU, memory, ports)
  4. Service: Manages tasks, handles scaling and load balancing
  5. ECR: Container registry with image scanning
  6. VPC CNI: Pods get real IP addresses in VPC
  7. IAM Roles: Task role for application AWS API access; execution role for image pulls and logs
  8. Load Balancing: ALB for ECS services, Ingress for EKS


Last Updated: March 2026