AWS Health & Trusted Advisor
Chapter 40: AWS Health Dashboard & Personal Health Dashboard
Service Health Monitoring

40.1 Overview

AWS Health Dashboard provides personalized views of AWS service health, alerts, and remediation guidance for your AWS resources.
AWS Health Dashboard Overview

The AWS Health Dashboard groups its information into four views:

- Service Health: public status, by Region
- Personal Health: account-specific alerts
- Event Log: history, details, search
- Affected Resources: your AWS resources

Key Features

| Feature | Description |
|---|---|
| Service Health | Public AWS service status |
| Personal Health | Account-specific alerts |
| Event Log | Historical events |
| Affected Resources | Your impacted resources |
40.2 Service Health Dashboard
Public Service Health

The AWS Service Health Dashboard is the public status page: https://status.aws.amazon.com

Features:

- Real-time service status
- Regional availability
- Historical service events
- RSS feeds for updates
- Service availability history

Status indicators:

- Green: Service is operating normally
- Yellow: Performance issues
- Red: Service disruption

40.3 Personal Health Dashboard
Account-Specific Events

Personal Health Dashboard event categories:

- Scheduled changes: planned maintenance, service upgrades, retirement announcements
- Account issues: billing issues, security notifications, abuse reports
- Service issues: AWS service disruptions, performance degradation, resource-specific issues

Event Types
AWS Health event types:

- issue: active service issues that affect your AWS resources and require attention
- accountNotification: account-specific notifications (billing, security, abuse) that may require action
- scheduledChange: planned maintenance and service changes, with advance notice provided

40.4 Event Details
Event Structure

Event information:

| Field | Example value |
|---|---|
| Event ARN | arn:aws:health:us-east-1::event/EC2/... |
| Service | Amazon EC2 |
| Region | us-east-1 |
| EventTypeCode | AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED |
| EventTypeCategory | scheduledChange |
| StartTime | 2024-02-01T00:00:00Z |
| EndTime | 2024-02-15T00:00:00Z |
| StatusCode | open |

Affected entities:

| Field | Example value |
|---|---|
| Entity ARN | arn:aws:ec2:us-east-1:123:instance/i-xxx |
| Entity Value | i-1234567890abcdef0 |
| Entity Type | AWS::EC2::Instance |
| Status | IMPAIRED |

40.5 Integration with Other Services
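Whatever the downstream service, an integration usually starts by pulling a handful of fields out of each event. The sketch below normalizes one event record into a small dataclass. It assumes the record is a dict with the lowerCamelCase keys used in Health API responses and ISO 8601 timestamp strings; `HealthEvent` and `parse_health_event` are illustrative names, not part of any AWS SDK.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class HealthEvent:
    arn: str
    service: str
    region: str
    type_code: str
    category: str
    status: str
    start: datetime
    end: Optional[datetime]


def _parse_time(value: str) -> datetime:
    # Rewrite a trailing "Z" so older fromisoformat implementations accept it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_health_event(record: dict) -> HealthEvent:
    """Build a HealthEvent from a dict shaped like a Health API event."""
    end = record.get("endTime")
    return HealthEvent(
        arn=record["arn"],
        service=record["service"],
        region=record["region"],
        type_code=record["eventTypeCode"],
        category=record["eventTypeCategory"],
        status=record["statusCode"],
        start=_parse_time(record["startTime"]),
        end=_parse_time(end) if end else None,
    )


# Sample values taken from the event structure in section 40.4.
sample = {
    "arn": "arn:aws:health:us-east-1::event/EC2/...",
    "service": "EC2",
    "region": "us-east-1",
    "eventTypeCode": "AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED",
    "eventTypeCategory": "scheduledChange",
    "startTime": "2024-02-01T00:00:00Z",
    "endTime": "2024-02-15T00:00:00Z",
    "statusCode": "open",
}
event = parse_health_event(sample)
print(event.category, (event.end - event.start).days)
```

Keeping the parsing in one place means later steps (alerting, ticketing, remediation) can rely on typed fields instead of raw dict lookups.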
CloudWatch Events Integration

AWS Health publishes events to CloudWatch Events (now Amazon EventBridge), where a rule with an event pattern selects the events to act on.

Event pattern:

```json
{
  "source": ["aws.health"],
  "detail-type": ["AWS Health Event"],
  "detail": {
    "eventTypeCategory": ["issue", "scheduledChange"]
  }
}
```

Target actions:

- SNS: Send notification
- Lambda: Automated remediation
- Systems Manager: Run automation
- Slack/Teams: Webhook integration

Automated Response Architecture
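The fan-out from one rule to several targets can be modeled locally as a dispatch table keyed on event category. This is a toy model for reasoning about routing, not how EventBridge is configured: the handler functions and the `ROUTES` table are hypothetical stand-ins for an SNS topic, a Lambda function, and a Systems Manager automation.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], str]


def notify(detail: dict) -> str:
    # Stand-in for publishing to an SNS topic.
    return f"notify:{detail['eventTypeCode']}"


def remediate(detail: dict) -> str:
    # Stand-in for invoking a remediation Lambda function.
    return f"remediate:{detail['eventTypeCode']}"


def run_automation(detail: dict) -> str:
    # Stand-in for starting a Systems Manager automation.
    return f"automation:{detail['eventTypeCode']}"


# Which handlers run for each eventTypeCategory (an illustrative policy).
ROUTES: Dict[str, List[Handler]] = {
    "issue": [notify, remediate],
    "scheduledChange": [notify, run_automation],
    "accountNotification": [notify],
}


def dispatch(event: dict) -> List[str]:
    """Run every handler registered for the event's category."""
    detail = event.get("detail", {})
    handlers = ROUTES.get(detail.get("eventTypeCategory"), [])
    return [handler(detail) for handler in handlers]


example = {
    "source": "aws.health",
    "detail-type": "AWS Health Event",
    "detail": {
        "eventTypeCategory": "scheduledChange",
        "eventTypeCode": "AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED",
    },
}
print(dispatch(example))
```

Unknown categories fall through to an empty handler list, so a new event category is ignored rather than raising an error.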
Automated health response flow: AWS Health emits an event, CloudWatch Events receives it, and a rule forwards it to one or more targets:

- SNS topic: email, SMS, HTTP
- Lambda function: auto fix, notify
- Systems Manager: automation, patch

40.6 CLI Commands
```bash
# Describe events
aws health describe-events \
  --filter "eventStatusCodes=open,upcoming"

# Describe event types
aws health describe-event-types

# Describe affected entities
aws health describe-affected-entities \
  --filter "eventArns=arn:aws:health:us-east-1::event/EC2/..."

# Describe event details
aws health describe-event-details \
  --event-arns "arn:aws:health:us-east-1::event/EC2/..."

# Get event aggregation
aws health describe-event-aggregates \
  --aggregate-field eventTypeCategory
```
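The same queries can be made from Python. The sketch below takes the client as a parameter so that it runs against a stub here; with boto3 installed, the client would typically be created with `boto3.client("health", region_name="us-east-1")`, which is an assumption about your setup. `describe_events` and `nextToken` are the boto3 Health client's method and pagination token; `iter_events` and `FakeHealthClient` are hypothetical names introduced for illustration.

```python
from typing import Iterator, List, Optional


def iter_events(client, status_codes: List[str]) -> Iterator[dict]:
    """Yield every event matching the status codes, following nextToken."""
    kwargs = {"filter": {"eventStatusCodes": status_codes}}
    while True:
        page = client.describe_events(**kwargs)
        yield from page.get("events", [])
        token = page.get("nextToken")
        if not token:
            return
        kwargs["nextToken"] = token


class FakeHealthClient:
    """Hypothetical stand-in that serves two pages of canned events."""

    def __init__(self):
        self.calls = []

    def describe_events(self, filter, nextToken: Optional[str] = None):
        self.calls.append((filter, nextToken))
        if nextToken is None:
            return {"events": [{"arn": "event-1"}], "nextToken": "page-2"}
        return {"events": [{"arn": "event-2"}]}


client = FakeHealthClient()
arns = [e["arn"] for e in iter_events(client, ["open", "upcoming"])]
print(arns)
```

Passing the client in keeps the pagination logic testable without AWS credentials or a qualifying support plan.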
Note: The AWS Health API requires a Business, Enterprise On-Ramp, or Enterprise Support plan. Calls from accounts on other plans fail with a SubscriptionRequiredException.

40.7 Advanced Health Monitoring
AWS Health API Deep Dive

AWS Health API capabilities, by operation:

| Operation | What it provides |
|---|---|
| describe-events | Lists health events affecting your account; filters by status, region, and service; supports pagination |
| describe-event-details | Detailed information about events, including description and timeline; affected AWS services |
| describe-affected-entities | Resources affected by an event; entity ARNs and status; resource-specific impact details |
| describe-event-aggregates | Events aggregated by category; summary statistics; trend analysis |

Advanced Event Patterns
A pattern can also filter on service, Region, and status. In the event that AWS Health delivers to CloudWatch Events, the Region is carried in the eventRegion field.

```json
{
  "source": ["aws.health"],
  "detail-type": ["AWS Health Event"],
  "detail": {
    "eventTypeCategory": ["issue", "scheduledChange", "accountNotification"],
    "service": ["EC2", "RDS", "S3"],
    "eventRegion": ["us-east-1", "us-west-2"],
    "statusCode": ["open", "upcoming"]
  }
}
```

Lambda-Based Automated Response
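Before attaching a Lambda target, it can help to check locally which events a pattern would select. The function below is a simplified model of event-pattern matching that covers only lists of exact values and nested objects (no prefix, numeric, or anything-but operators); `matches` is an illustrative helper, not an AWS API.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Return True if every field named in the pattern matches the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        else:
            # A list of allowed values: any one of them may match.
            values = actual if isinstance(actual, list) else [actual]
            if not any(value in expected for value in values):
                return False
    return True


pattern = {
    "source": ["aws.health"],
    "detail-type": ["AWS Health Event"],
    "detail": {
        "eventTypeCategory": ["issue", "scheduledChange"],
        "service": ["EC2", "RDS"],
    },
}

retirement = {
    "source": "aws.health",
    "detail-type": "AWS Health Event",
    "detail": {
        "service": "EC2",
        "eventTypeCategory": "scheduledChange",
        "eventTypeCode": "AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED",
    },
}
billing = {
    **retirement,
    "detail": {"service": "BILLING", "eventTypeCategory": "accountNotification"},
}

print(matches(pattern, retirement), matches(pattern, billing))
```

Fields present in the event but absent from the pattern are ignored, which is why the extra `eventTypeCode` in the first event does not prevent a match.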
```python
import json

import boto3

# The AWS Health API's active endpoint is in us-east-1.
health = boto3.client('health', region_name='us-east-1')
ec2 = boto3.client('ec2')
sns = boto3.client('sns')

# Placeholder topic ARN; replace with your own.
ALERT_TOPIC_ARN = 'arn:aws:sns:us-east-1:123456789012:alerts'


def lambda_handler(event, context):
    """
    Automated response to AWS Health events
    """
    # A rule that targets Lambda directly passes the event itself. If the
    # rule targets an SQS queue instead, each event arrives in a record body.
    if 'Records' in event:
        health_events = [
            json.loads(record['body']) if 'body' in record else record
            for record in event['Records']
        ]
    else:
        health_events = [event]

    for health_event in health_events:
        detail = health_event.get('detail', {})
        event_arn = detail.get('eventArn')
        event_type = detail.get('eventTypeCode', '')

        # Handle EC2 instance retirement
        if 'EC2_INSTANCE_RETIREMENT' in event_type:
            # Get affected instances
            entities = health.describe_affected_entities(
                filter={'eventArns': [event_arn]}
            )

            for entity in entities['entities']:
                instance_id = entity['entityValue']

                # Get instance details
                instance = ec2.describe_instances(
                    InstanceIds=[instance_id]
                )

                # Create replacement instance
                # ... implementation details ...

                # Send notification
                sns.publish(
                    TopicArn=ALERT_TOPIC_ARN,
                    Message=f'EC2 instance {instance_id} scheduled for retirement',
                    Subject='AWS Health Alert: Instance Retirement'
                )

        # Handle RDS maintenance
        elif 'RDS_MAINTENANCE' in event_type:
            # Handle RDS maintenance scheduling
            pass

    return {'statusCode': 200, 'body': 'Processed successfully'}
```

Multi-Region Health Monitoring
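A centralized aggregator for several Regions can be sketched as a function that merges event lists and summarizes them per Region. The input shape (dicts with `arn`, `region`, and `eventTypeCategory` keys) is assumed from the Health API's event fields, and `aggregate_events` is a hypothetical helper rather than an AWS feature.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List


def aggregate_events(events: Iterable[dict]) -> Dict[str, dict]:
    """Summarize events per Region: total count and count per category."""
    seen = set()
    per_region: Dict[str, Counter] = defaultdict(Counter)
    for event in events:
        # The same event can arrive from more than one collector; count it once.
        if event["arn"] in seen:
            continue
        seen.add(event["arn"])
        per_region[event["region"]][event["eventTypeCategory"]] += 1
    return {
        region: {"total": sum(counts.values()), "by_category": dict(counts)}
        for region, counts in sorted(per_region.items())
    }


collected: List[dict] = [
    {"arn": "a1", "region": "us-east-1", "eventTypeCategory": "issue"},
    {"arn": "a2", "region": "us-east-1", "eventTypeCategory": "scheduledChange"},
    {"arn": "a3", "region": "eu-west-1", "eventTypeCategory": "scheduledChange"},
    {"arn": "a1", "region": "us-east-1", "eventTypeCategory": "issue"},  # duplicate
]
summary = aggregate_events(collected)
for region, stats in summary.items():
    print(region, stats)
```

Deduplicating on the event ARN matters when several collectors forward to one place, since the same event may be delivered more than once.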
Multi-region health architecture:

1. AWS Health emits events for each Region in use (us-east-1, us-west-2, eu-west-1, and ap-south-1 in this example).
2. A centralized event aggregator collects the events from every Region.
3. The aggregator feeds a dashboard, an alerting system, and remediation automation.

Health Dashboard Integration with ServiceNow
```python
import os

import requests


def create_servicenow_incident(health_event):
    """
    Create ServiceNow incident from AWS Health event
    """
    servicenow_url = "https://your-instance.service-now.com/api/now/table/incident"

    # Map AWS Health event to ServiceNow incident
    is_issue = health_event['eventTypeCategory'] == 'issue'
    incident_data = {
        "short_description": f"AWS Health: {health_event['eventTypeCode']}",
        "description": health_event.get('eventDescription', ''),
        "urgency": "1" if is_issue else "2",
        "impact": "2" if is_issue else "3",
        "category": "Infrastructure",
        "subcategory": "AWS",
        "cmdb_ci": "AWS Infrastructure",
        "work_notes": f"AWS Event ARN: {health_event['arn']}\n"
                      f"Service: {health_event['service']}\n"
                      f"Region: {health_event['region']}"
    }

    # Read credentials from the environment rather than hardcoding them.
    # SERVICENOW_USER and SERVICENOW_PASSWORD are illustrative variable names.
    response = requests.post(
        servicenow_url,
        auth=(os.environ['SERVICENOW_USER'], os.environ['SERVICENOW_PASSWORD']),
        headers={"Content-Type": "application/json"},
        json=incident_data,
        timeout=10,
    )
    # Surface HTTP errors instead of returning an error payload silently.
    response.raise_for_status()

    return response.json()
```

40.8 Best Practices
AWS Health Best Practices

1. Set up automated notifications
   - Use CloudWatch Events for automated alerts
   - Route to SNS, Slack, or PagerDuty
2. Monitor scheduled changes
   - Review upcoming maintenance
   - Plan for instance retirements
3. Implement automated remediation
   - Use Lambda for automatic response
   - Replace retired instances automatically
4. Review affected resources
   - Identify impacted resources
   - Take proactive action
5. Maintain support plan
   - Business, Enterprise On-Ramp, or Enterprise for Health API access
   - Enhanced support response

40.9 Why This Matters in DevOps/SRE
AWS Health is critical for proactive incident management. SREs use it to stay informed about AWS-side issues and scheduled changes that could impact their infrastructure.

SRE proactive monitoring with AWS Health:

1. Early warning system
   - Get notified before AWS issues impact your services
   - Monitor scheduled changes (maintenance, retirements)
   - Plan capacity changes proactively
2. Incident preparation
   - Review affected resources before maintenance
   - Test failover procedures when issues arise
   - Maintain runbooks for AWS events
3. Communication
   - Alert stakeholders about potential impacts
   - Update status pages proactively
   - Coordinate with AWS support when needed

40.10 Linux Systems Perspective
AWS Health Automation from Arch Linux

```bash
# Install AWS CLI and jq
sudo pacman -S aws-cli-v2 jq
```

```bash
#!/bin/bash
# ~/bin/aws-health-check.sh
# Check current AWS health events
set -euo pipefail

# The AWS Health API's active endpoint is in us-east-1.
HEALTH_REGION="us-east-1"

echo "=== Open Health Events ==="
aws health describe-events \
  --region "$HEALTH_REGION" \
  --filter 'eventStatusCodes=open' \
  --output table

echo ""
echo "=== Affected Resources ==="
# describe-affected-entities requires event ARNs, so look them up first.
event_arns=$(aws health describe-events \
  --region "$HEALTH_REGION" \
  --filter 'eventStatusCodes=open' \
  --query 'events[].arn' \
  --output text)

for arn in $event_arns; do
  aws health describe-affected-entities \
    --region "$HEALTH_REGION" \
    --filter "eventArns=$arn" \
    --output table
done
```

```bash
# Set up CloudWatch Event for Health
aws events put-rule \
  --name "aws-health-notifications" \
  --event-pattern '{"source": ["aws.health"]}'

aws events put-targets \
  --rule aws-health-notifications \
  --targets '[{"Id": "1", "Arn": "arn:aws:sns:us-east-1:123456789012:alerts"}]'
```

40.11 Common Mistakes & Anti-Patterns
AWS Health anti-patterns:

❌ Mistake 1: Not Monitoring Scheduled Changes

- Problem: Missing instance retirement notifications
- Impact: Unexpected instance terminations
- Fix: Set up alerts for scheduledChange events

❌ Mistake 2: Not Using Health API

- Problem: Relying only on public status page
- Impact: Missing account-specific issues
- Fix: Use the account-specific (Personal Health) view, and the Health API where your support plan allows it

❌ Mistake 3: Not Automating Response

- Problem: Manual response to health events
- Impact: Delayed remediation, extended outages
- Fix: Use Lambda for automated remediation

❌ Mistake 4: Ignoring Health Events During Low Traffic

- Problem: Not using maintenance windows for changes
- Impact: User impact during peak times
- Fix: Schedule changes during low-traffic periods

40.12 Interview Questions
Conceptual Questions

- Q: What’s the difference between Service Health Dashboard and Personal Health Dashboard?
  - A: Service Health Dashboard shows public AWS service status (all regions, general issues). Personal Health Dashboard shows account-specific events, including scheduled changes and your affected resources. The dashboard itself is available to every account; programmatic access through the Health API requires a Business, Enterprise On-Ramp, or Enterprise Support plan.
- Q: How does AWS Health integrate with CloudWatch Events?
  - A: AWS Health events are delivered to CloudWatch Events (now Amazon EventBridge). You can create rules that match specific events (e.g., event type codes containing EC2_INSTANCE_RETIREMENT) and route them to targets such as SNS, Lambda, or Systems Manager Automation. This enables automated notification and remediation.
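A rule for one family of events can be generated rather than written by hand. The sketch below builds an event pattern that matches on an `eventTypeCode` prefix using the `prefix` content filter; the function name `build_health_pattern` and the chosen prefix are illustrative assumptions.

```python
import json
from typing import List, Optional


def build_health_pattern(code_prefix: str, categories: Optional[List[str]] = None) -> str:
    """Return a JSON event pattern for AWS Health events with the given code prefix."""
    detail = {"eventTypeCode": [{"prefix": code_prefix}]}
    if categories:
        # Optionally narrow the match to specific event type categories.
        detail["eventTypeCategory"] = categories
    pattern = {
        "source": ["aws.health"],
        "detail-type": ["AWS Health Event"],
        "detail": detail,
    }
    return json.dumps(pattern, indent=2)


pattern_json = build_health_pattern("AWS_EC2_INSTANCE_RETIREMENT", ["scheduledChange"])
print(pattern_json)
```

The resulting string can be supplied as the `--event-pattern` argument of `aws events put-rule`.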
Scenario-Based Questions
- Q: You receive an EC2 instance retirement notice. What’s your response?
  - A: Use the Health event to identify the affected instance. Plan the migration: create an AMI, launch a replacement, migrate data and services, and test. Schedule the work in a maintenance window and test failover before the retirement date. If the instance belongs to an Auto Scaling group, let the group replace it automatically.
40.13 Exam Tips
Key exam points:

1. Personal Health Dashboard shows account-specific events
2. Service Health Dashboard shows public AWS status
3. Health API requires Business, Enterprise On-Ramp, or Enterprise Support
4. CloudWatch Events (now Amazon EventBridge) can trigger on Health events
5. Event types: issue, accountNotification, scheduledChange
6. Affected entities show your impacted resources
7. Scheduled changes include instance retirements
8. Use Lambda for automated remediation
9. SNS can send notifications for Health events
10. Health events are scoped to a Region or marked global; the Health API itself is called through a global endpoint

40.14 Summary
Chapter 40 summary:

AWS Health Dashboard

- Service Health: Public AWS status
- Personal Health: Account-specific alerts
- Event Log: Historical events

Event Categories

- issue: Active service problems
- accountNotification: Account-specific notices
- scheduledChange: Planned maintenance

Integration

- CloudWatch Events for automation
- SNS for notifications
- Lambda for remediation