Chapter 16: Amazon S3 - Simple Storage Service

Object Storage for the Cloud

16.1 Overview

Amazon S3 (Simple Storage Service) is an object storage service offering industry-leading scalability, data availability, security, and performance.

S3 Overview

                   +-------------+
                   |  Amazon S3  |
                   +-------------+
                          |
        +-----------------+-----------------+
        |                 |                 |
        v                 v                 v
  +------------+    +------------+    +------------+
  | Buckets    |    | Objects    |    | Storage    |
  |            |    |            |    | Classes    |
  | - Global   |    | - Files    |    | - Standard |
  | - Unique   |    | - Keys     |    | - IA       |
  | - Names    |    | - Metadata |    | - Glacier  |
  +------------+    +------------+    +------------+

- Buckets: Containers for objects (globally unique names)
- Objects: Files + metadata (up to 5TB each)
- Storage Classes: Different tiers for different use cases

16.2 S3 Architecture
Buckets and Objects

S3 Bucket Structure

Bucket: my-bucket

Objects:
- images/photo1.jpg
- images/photo2.jpg
- documents/report.pdf
- logs/2024/01/app.log
- backup/data.tar.gz

Object Components:
- Key: images/photo1.jpg
- Value: [Binary data]
- Version ID: 123456789
- Metadata:
  - Content-Type: image/jpeg
  - Last-Modified: 2024-01-15T10:00:00Z
  - x-amz-meta-custom: value
- Tags:
  - Environment: Production
  - Owner: Team-A

Key Concepts:
- Key: Unique identifier within bucket
- Prefix: Folder-like structure (images/, logs/)
- Delimiter: Usually / (forward slash)

S3 URL Formats
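The virtual-hosted and path-style formats covered in this section differ only in where the bucket name goes. A minimal Python sketch of that difference follows; the function name `s3_object_urls` is hypothetical, and it assumes the standard `amazonaws.com` regional endpoints rather than any special partition or dual-stack endpoint.

```python
from urllib.parse import quote


def s3_object_urls(bucket: str, region: str, key: str) -> dict[str, str]:
    """Build both URL styles for one object (illustrative helper, not an AWS API)."""
    # Percent-encode the key but keep "/" so prefixes stay readable.
    encoded_key = quote(key, safe="/")
    return {
        # Bucket name is part of the host name.
        "virtual_hosted": f"https://{bucket}.s3.{region}.amazonaws.com/{encoded_key}",
        # Bucket name is the first path segment.
        "path_style": f"https://s3.{region}.amazonaws.com/{bucket}/{encoded_key}",
    }


urls = s3_object_urls("my-bucket", "us-east-1", "images/photo.jpg")
print(urls["virtual_hosted"])
print(urls["path_style"])
```

Keeping the key encoding in one place makes it harder to produce two URLs that disagree about the same object.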
Virtual-Hosted Style (Recommended)
- Format: https://bucket-name.s3.region.amazonaws.com/key
- Example: https://my-bucket.s3.us-east-1.amazonaws.com/images/photo.jpg

Path Style (Legacy)
- Format: https://s3.region.amazonaws.com/bucket-name/key
- Example: https://s3.us-east-1.amazonaws.com/my-bucket/images/photo.jpg

S3 Access Point
- Format: https://access-point-name-account-id.s3-accesspoint.region.amazonaws.com
- Example: https://my-ap-123456789012.s3-accesspoint.us-east-1.amazonaws.com

16.3 Storage Classes
S3 Storage Classes

1. S3 Standard (General Purpose)
   - Use Case: Frequently accessed data
   - Availability: 99.99%
   - Durability: 99.999999999% (11 9s)
   - Min Storage: None
   - Retrieval: Immediate
   - AZs: 3+

2. S3 Intelligent-Tiering
   - Use Case: Unknown access patterns
   - Tiers:
     - Frequent Access (auto)
     - Infrequent Access (30 days no access)
     - Archive Instant Access (90 days no access)
     - Archive Access (optional, 90+ days no access)
     - Deep Archive Access (optional, 180+ days no access)
   - Monitoring fee: Small monthly fee per object
   - Retrieval: Immediate (objects in the optional archive tiers must be restored first)

3. S3 Standard-IA (Infrequent Access)
   - Use Case: Less frequent access
   - Availability: 99.9%
   - Min Storage: 30 days
   - Min Billable Object Size: 128KB
   - Retrieval: Per-GB fee
   - Retrieval Time: Immediate

4. S3 One Zone-IA
   - Use Case: Infrequently accessed, non-critical
   - Availability: 99.5%
   - AZs: 1 (less resilient)
   - Min Storage: 30 days
   - Min Billable Object Size: 128KB
   - Retrieval: Immediate

5. S3 Glacier Instant Retrieval
   - Use Case: Archive, immediate access
   - Min Storage: 90 days
   - Min Billable Object Size: 128KB
   - Retrieval: Milliseconds
   - Cost: Lower storage, higher retrieval

6. S3 Glacier Flexible Retrieval (Formerly Glacier)
   - Use Case: Long-term archive
   - Min Storage: 90 days
   - Retrieval Options:
     - Expedited: 1-5 minutes
     - Standard: 3-5 hours
     - Bulk: 5-12 hours

7. S3 Glacier Deep Archive
   - Use Case: Long-term retention (7+ years)
   - Min Storage: 180 days
   - Retrieval Options:
     - Standard: within 12 hours
     - Bulk: within 48 hours
   - Cost: Lowest storage cost

Storage Class Comparison

Class           | Access      | Min Storage | Retrieval
----------------|-------------|-------------|---------------------
Standard        | Frequent    | None        | Immediate
Intelligent     | Variable    | None        | Immediate
Standard-IA     | Infrequent  | 30 days     | Immediate
One Zone-IA     | Infrequent  | 30 days     | Immediate
Glacier Inst.   | Archive     | 90 days     | Milliseconds
Glacier Flex.   | Archive     | 90 days     | Minutes to 12 hours
Deep Archive    | Long-term   | 180 days    | 12-48 hours

16.4 S3 Features
Versioning
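The behaviour of a versioning-enabled bucket (every upload adds a version, and a plain delete only adds a delete marker) can be modelled in a few lines of Python. This is a toy in-memory sketch, not the S3 API: the class `VersionedBucket` and its version IDs are hypothetical and exist only to mirror the `document.txt` walkthrough in this section.

```python
from typing import Optional


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not the S3 API)."""

    def __init__(self) -> None:
        # key -> ordered list of (version_id, content); content None = delete marker
        self._versions: dict[str, list[tuple[str, Optional[str]]]] = {}
        self._counter = 0

    def _new_id(self) -> str:
        # Made-up IDs (111111, 222222, ...) chosen to match the example.
        self._counter += 1
        return str(self._counter * 111111)

    def put(self, key: str, content: str) -> str:
        version_id = self._new_id()
        self._versions.setdefault(key, []).append((version_id, content))
        return version_id

    def delete(self, key: str) -> str:
        # A delete without a version ID removes nothing; it stacks a delete marker.
        version_id = self._new_id()
        self._versions.setdefault(key, []).append((version_id, None))
        return version_id

    def get(self, key: str, version_id: Optional[str] = None) -> str:
        versions = self._versions.get(key, [])
        if version_id is None:
            # The latest entry decides what a plain GET sees.
            if not versions or versions[-1][1] is None:
                raise KeyError(f"404 Not Found: {key}")
            return versions[-1][1]
        for vid, content in versions:
            if vid == version_id and content is not None:
                return content
        raise KeyError(f"no readable version {version_id} for {key}")


bucket = VersionedBucket()
v1 = bucket.put("document.txt", "Version 1")
v2 = bucket.put("document.txt", "Version 2")
bucket.delete("document.txt")
print(bucket.get("document.txt", version_id=v1))
```

After the delete, a plain `get` raises, while both earlier versions stay readable by ID, which is the property that makes versioning useful for recovery.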
S3 Versioning

Versioning States

1. Not Enabled (default)
   - No versioning
2. Enabled
   - All versions preserved
   - Delete marker for deletes
3. Suspended
   - New objects get null version ID
   - Existing versions preserved

Versioning Example

Object: document.txt

Upload v1:
- Key: document.txt
- Version ID: 111111
- Content: "Version 1"

Upload v2:
- Key: document.txt
- Version ID: 222222
- Content: "Version 2"

Delete:
- Key: document.txt
- Version ID: 333333 (Delete Marker)

Result:
- GET document.txt -> 404 (delete marker)
- GET document.txt?versionId=111111 -> "Version 1"
- GET document.txt?versionId=222222 -> "Version 2"

Lifecycle Policies
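A lifecycle rule is essentially a prefix filter plus a list of age thresholds. The Python sketch below shows how such a rule maps an object's age to a storage class; `LifecycleRule` and `evaluate` are hypothetical names, objects are assumed to start in `STANDARD`, and the real service applies lifecycle actions asynchronously, so actual timing is approximate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LifecycleRule:
    """Simplified stand-in for one lifecycle rule (illustrative only)."""
    prefix: str
    transitions: tuple[tuple[int, str], ...] = ()  # (age in days, storage class)
    expiration_days: Optional[int] = None


def evaluate(rule: LifecycleRule, key: str, age_days: int) -> Optional[str]:
    """Return the state the rule implies, or None if the rule does not apply."""
    if not key.startswith(rule.prefix):
        return None
    if rule.expiration_days is not None and age_days >= rule.expiration_days:
        return "EXPIRED"
    storage_class = "STANDARD"  # assumption: objects are uploaded to Standard
    for days, target in sorted(rule.transitions):
        if age_days >= days:
            storage_class = target
    return storage_class


logs_rule = LifecycleRule(
    prefix="logs/",
    transitions=((30, "STANDARD_IA"), (90, "GLACIER")),
    expiration_days=365,
)
for age in (10, 45, 120, 400):
    print(age, evaluate(logs_rule, "logs/2024/01/app.log", age))
```

Checking expiration before transitions reflects that an expired object is deleted regardless of which class it would otherwise have reached.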
S3 Lifecycle Policies

Lifecycle Rules

Rule 1: Transition to IA
- Filter: prefix "logs/"
- Transition:
  - After 30 days -> Standard-IA
  - After 90 days -> Glacier Instant Retrieval
  - After 365 days -> Glacier Deep Archive

Rule 2: Expiration
- Filter: prefix "temp/"
- Expiration:
  - After 7 days -> Delete

Rule 3: Versioned Objects
- Filter: prefix "archive/"
- Noncurrent Version Transitions:
  - After 30 days -> Standard-IA
  - After 90 days -> Glacier
- Noncurrent Version Expiration:
  - After 365 days -> Delete
- Delete Marker Expiration:
  - Remove expired object delete markers (markers left with no noncurrent versions behind them)

Encryption
S3 Encryption Options

Server-Side Encryption (SSE)

SSE-S3 (Amazon Managed Keys)
- AES-256 encryption
- AWS manages keys
- Free
- Header: x-amz-server-side-encryption: AES256

SSE-KMS (KMS Managed Keys)
- KMS key (formerly called Customer Master Key, CMK)
- Audit trail via CloudTrail
- Key rotation support
- Cost: KMS API calls
- Header: x-amz-server-side-encryption: aws:kms

SSE-C (Customer Provided Keys)
- You provide the encryption key with each request
- AWS performs the encryption
- HTTPS required
- AWS does not store the key

Client-Side Encryption
- Encrypt data before uploading
- Use AWS Encryption SDK
- Full control over encryption process

Bucket Policies and ACLs
S3 Access Control

Bucket Policy (Resource-based Policy)

  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-bucket/*"
      },
      {
        "Sid": "AllowCloudFrontAccess",
        "Effect": "Allow",
        "Principal": {
          "Service": "cloudfront.amazonaws.com"
        },
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-bucket/*",
        "Condition": {
          "StringEquals": {
            "AWS:SourceArn": "arn:aws:cloudfront::..."
          }
        }
      }
    ]
  }

ACL (Access Control List) - Legacy

Grantees:
- Owner (Full Control)
- AuthenticatedUsers (AWS accounts)
- AllUsers (Public)
- LogDelivery (S3 logs)

Permissions:
- READ
- WRITE
- READ_ACP
- WRITE_ACP
- FULL_CONTROL

Note: Use bucket policies instead of ACLs

16.5 S3 Advanced Features
S3 Select and Glacier Select

Purpose: Query objects using SQL

Traditional Approach:
1. Download entire object (GBs)
2. Parse locally
3. Extract needed data

S3 Select Approach:
1. Send SQL query to S3
2. S3 filters and returns only matching data
3. Reduced data transfer and cost

Example Query:

  SELECT s.name, s.age FROM s3object s
  WHERE s.age > 25
  LIMIT 10

Supported Formats:
- CSV, JSON, Parquet
- GZIP and BZIP2 compression (CSV and JSON objects)

S3 Event Notifications
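A notification configuration combines a list of event types (wildcards allowed) with optional key prefix and suffix filters. The Python sketch below shows one way to reason about which uploads would trigger a given configuration; `event_matches` is a hypothetical helper that approximates the matching rules and is not part of any AWS SDK.

```python
from fnmatch import fnmatchcase


def event_matches(
    event_name: str,
    key: str,
    events: list[str],
    prefix: str = "",
    suffix: str = "",
) -> bool:
    """Approximate whether an S3 event would pass a notification filter."""
    # "s3:ObjectCreated:*" should cover Put, Post, Copy, and so on.
    name_ok = any(fnmatchcase(event_name, pattern) for pattern in events)
    # Empty prefix/suffix match every key, like an absent filter rule.
    return name_ok and key.startswith(prefix) and key.endswith(suffix)


config = {"events": ["s3:ObjectCreated:*"], "prefix": "images/", "suffix": ".jpg"}
print(event_matches("s3:ObjectCreated:Put", "images/cat.jpg", **config))
print(event_matches("s3:ObjectCreated:Put", "images/cat.png", **config))
```

A check like this can be handy in unit tests for a Lambda handler, to confirm that test keys resemble what the real filter would let through.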
Event Types:
- s3:ObjectCreated:*
- s3:ObjectCreated:Put
- s3:ObjectCreated:Post
- s3:ObjectCreated:Copy
- s3:ObjectCreated:CompleteMultipartUpload
- s3:ObjectRemoved:*
- s3:ObjectRemoved:Delete
- s3:ObjectRestore:*
- s3:ObjectRestore:Completed
- s3:ReducedRedundancyLostObject

Destinations:
- S3 Event -> Lambda Function
- S3 Event -> SNS Topic
- S3 Event -> SQS Queue
- S3 Event -> EventBridge

Configuration Example:

  {
    "LambdaFunctionConfigurations": [{
      "Id": "ImageProcessing",
      "LambdaFunctionArn": "arn:aws:lambda:...",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {
          "FilterRules": [{
            "Name": "prefix",
            "Value": "images/"
          }, {
            "Name": "suffix",
            "Value": ".jpg"
          }]
        }
      }
    }]
  }

S3 Access Points
Purpose: Simplify access management for shared datasets

Architecture:

Bucket: shared-data-bucket

Access Point: finance-ap
- Policy: Allow finance-team
- Prefix: finance/
- VPC: vpc-finance

Access Point: analytics-ap
- Policy: Allow analytics-team
- Prefix: analytics/
- VPC: vpc-analytics

Features:
- Unique DNS name per access point
- Dedicated access policy
- VPC endpoint support
- Block public access inheritance

S3 Object Lock
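The difference between governance mode, compliance mode, and a legal hold comes down to who, if anyone, can delete a locked object version. The Python sketch below encodes that decision as a small function. It is a simplified model under stated assumptions: `can_delete_version` is a hypothetical name, and the `bypass_governance` flag stands in for a caller that holds the special bypass permission.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def can_delete_version(
    mode: Optional[str],               # "GOVERNANCE", "COMPLIANCE", or None
    retain_until: Optional[datetime],
    legal_hold: bool,
    now: datetime,
    bypass_governance: bool = False,   # assumption: caller has bypass permission
) -> bool:
    """Simplified model of whether a locked object version may be deleted."""
    if legal_hold:
        # A legal hold blocks deletion until it is explicitly removed.
        return False
    if mode is None or retain_until is None or now >= retain_until:
        # No retention configured, or the retention period has passed.
        return True
    if mode == "GOVERNANCE":
        # Only callers with special permissions can override governance mode.
        return bypass_governance
    # COMPLIANCE: nobody can delete before the retention date.
    return False


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
locked_until = now + timedelta(days=365)
print(can_delete_version("GOVERNANCE", locked_until, False, now, bypass_governance=True))
print(can_delete_version("COMPLIANCE", locked_until, False, now, bypass_governance=True))
```

Checking the legal hold first mirrors the fact that a hold is independent of any retention period.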
S3 Object Lock (WORM)

Purpose: Write Once, Read Many (WORM) compliance

Retention Modes:

Governance Mode:
- Most users cannot overwrite/delete
- Special permissions can override
- Use case: Prevent accidental deletion

Compliance Mode:
- No one can overwrite/delete
- Including root user
- Use case: Regulatory compliance

Retention Period:
- Fixed duration (days or years)
- Can be extended
- Cannot be shortened (compliance mode)

Legal Hold:
- Indefinite retention
- Until explicitly removed
- Requires s3:PutObjectLegalHold permission

16.6 Practical Configuration
S3 with Terraform

# ============================================================
# S3 Bucket
# ============================================================
resource "aws_s3_bucket" "main" {
  bucket = "my-unique-bucket-name"

  tags = {
    Name = "main-bucket"
  }
}

# ============================================================
# Versioning
# ============================================================
resource "aws_s3_bucket_versioning" "main" {
  bucket = aws_s3_bucket.main.id

  versioning_configuration {
    status = "Enabled"
  }
}

# ============================================================
# Server-Side Encryption
# ============================================================
resource "aws_s3_bucket_server_side_encryption_configuration" "main" {
  bucket = aws_s3_bucket.main.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.s3.arn
    }
  }
}

# ============================================================
# Block Public Access
# ============================================================
resource "aws_s3_bucket_public_access_block" "main" {
  bucket = aws_s3_bucket.main.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# ============================================================
# Bucket Policy
# ============================================================
resource "aws_s3_bucket_policy" "main" {
  bucket = aws_s3_bucket.main.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "AllowSSLRequestsOnly"
        Effect    = "Deny"
        Principal = "*"
        Action    = "s3:*"
        Resource = [
          aws_s3_bucket.main.arn,
          "${aws_s3_bucket.main.arn}/*"
        ]
        Condition = {
          Bool = {
            "aws:SecureTransport" = "false"
          }
        }
      },
      {
        Sid    = "AllowCloudFrontAccess"
        Effect = "Allow"
        Principal = {
          Service = "cloudfront.amazonaws.com"
        }
        Action   = "s3:GetObject"
        Resource = "${aws_s3_bucket.main.arn}/*"
        Condition = {
          StringEquals = {
            "AWS:SourceArn" = aws_cloudfront_distribution.main.arn
          }
        }
      }
    ]
  })
}

# ============================================================
# Lifecycle Configuration
# ============================================================
resource "aws_s3_bucket_lifecycle_configuration" "main" {
  bucket = aws_s3_bucket.main.id

  rule {
    id     = "transition-to-ia"
    status = "Enabled"

    filter {
      prefix = "logs/"
    }

    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }

    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    expiration {
      days = 365
    }
  }

  rule {
    id     = "versioned-objects"
    status = "Enabled"

    filter {
      prefix = "archive/"
    }

    noncurrent_version_transition {
      noncurrent_days = 30
      storage_class   = "STANDARD_IA"
    }

    noncurrent_version_transition {
      noncurrent_days = 90
      storage_class   = "GLACIER"
    }

    noncurrent_version_expiration {
      noncurrent_days = 365
    }
  }
}
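
# ============================================================
# S3 Access Point
# ============================================================
resource "aws_s3_access_point" "finance" {
  bucket = aws_s3_bucket.main.id
  name   = "finance-access-point"

  public_access_block_configuration {
    block_public_acls       = true
    block_public_policy     = true
    ignore_public_acls      = true
    restrict_public_buckets = true
  }

  # The policy is managed by aws_s3control_access_point_policy below.
  lifecycle {
    ignore_changes = [policy]
  }
}

# A resource cannot reference its own ARN, so the policy that names the
# access point ARN lives in a separate resource.
resource "aws_s3control_access_point_policy" "finance" {
  access_point_arn = aws_s3_access_point.finance.arn

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid    = "AllowFinanceTeam"
        Effect = "Allow"
        Principal = {
          AWS = "arn:aws:iam::123456789012:role/FinanceRole"
        }
        Action   = "s3:GetObject"
        Resource = "${aws_s3_access_point.finance.arn}/object/finance/*"
      }
    ]
  })
}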

# ============================================================
# S3 Event Notification
# ============================================================
# Note: the Lambda function also needs an aws_lambda_permission that lets
# s3.amazonaws.com invoke it, and the SQS queue policy must allow S3 to
# send messages; neither is shown here.
resource "aws_s3_bucket_notification" "main" {
  bucket = aws_s3_bucket.main.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.image_processor.arn
    events              = ["s3:ObjectCreated:*"]
    filter_prefix       = "images/"
    filter_suffix       = ".jpg"
  }

  queue {
    id            = "sqs-notification"
    queue_arn     = aws_sqs_queue.s3_events.arn
    events        = ["s3:ObjectCreated:*"]
    filter_prefix = "uploads/"
  }
}

# ============================================================
# S3 Object Lock Configuration
# ============================================================
# Object Lock requires versioning to be enabled on the bucket.
resource "aws_s3_bucket_object_lock_configuration" "main" {
  bucket = aws_s3_bucket.main.id

  object_lock_enabled = "Enabled"

  rule {
    default_retention {
      mode = "COMPLIANCE"
      days = 365
    }
  }
}

# ============================================================
# S3 Replication Configuration
# ============================================================
resource "aws_s3_bucket_replication_configuration" "main" {
  # Versioning must be enabled on the source bucket before replication.
  depends_on = [aws_s3_bucket_versioning.main]

  bucket = aws_s3_bucket.main.id
  role   = aws_iam_role.replication.arn

  rule {
    id     = "replication-rule"
    status = "Enabled"

    filter {}

    delete_marker_replication {
      status = "Enabled"
    }

    # Each rule carries its own destination block.
    destination {
      bucket        = aws_s3_bucket.replica.arn
      storage_class = "STANDARD"
    }
  }
}

# Enable versioning for replication
resource "aws_s3_bucket_versioning" "replica" {
  bucket = aws_s3_bucket.replica.id

  versioning_configuration {
    status = "Enabled"
  }
}

16.7 Exam Tips
- Bucket Names: Globally unique, 3-63 characters, lowercase
- Objects: Up to 5TB, key + value + metadata + version ID
- Storage Classes: Standard, Intelligent-Tiering, IA, One Zone-IA, Glacier
- Lifecycle Policies: Automate transitions and expiration
- Encryption: SSE-S3, SSE-KMS, SSE-C, Client-side
- Versioning: Preserves all versions, delete markers
- Object Lock: WORM, governance or compliance mode
- Access Points: Simplify access management for shared data
- S3 Select: Query objects with SQL, reduce data transfer
- Replication: CRR (cross-region), SRR (same-region)
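
The bucket naming tip can be turned into a quick pre-flight check. The Python sketch below covers only the core rules (3-63 characters, lowercase letters, digits, dots and hyphens, starting and ending with a letter or digit, no consecutive dots, not shaped like an IP address); `is_valid_bucket_name` is a hypothetical helper, it deliberately skips the reserved prefixes and suffixes, and it cannot tell whether a name is already taken globally.

```python
import re

# First and last characters must be a lowercase letter or digit; 3-63 total.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Names formatted like an IPv4 address are not allowed.
_IPV4_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Check the core S3 bucket naming rules (a subset, not the full rule set)."""
    if not _NAME_RE.fullmatch(name):
        return False
    if ".." in name:
        return False
    if _IPV4_RE.fullmatch(name):
        return False
    return True


for candidate in ("my-unique-bucket-name", "My-Bucket", "ab", "192.168.1.1"):
    print(candidate, is_valid_bucket_name(candidate))
```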

Next Chapter

Chapter 17: Amazon EBS - Elastic Block Store
Last Updated: February 2026