Amazon OpenSearch Service

Chapter 39: Amazon OpenSearch Service - Log Analytics

Amazon OpenSearch Service is a managed service that makes it easy to deploy, operate, and scale OpenSearch clusters in the AWS Cloud.

Amazon OpenSearch Service Overview
+------------------------------------------------------------------+
| |
| +------------------------+ |
| | Amazon OpenSearch | |
| | Service | |
| +------------------------+ |
| | |
| +---------------------+---------------------+ |
| | | | | |
| v v v v |
| +----------+ +----------+ +----------+ +----------+ |
| | Search | | Log | | Analytics| | Dashboards| |
| | | | Analytics| | | | | |
| | - Full | | - Cloud | | - Real | | - Visual | |
| | Text | | Watch | | Time | | ize | |
| | - Index | | - Logs | | - Query | | - Kibana | |
| +----------+ +----------+ +----------+ +----------+ |
| |
+------------------------------------------------------------------+
| Feature | Description |
|---------|-------------|
| Search Engine | Full-text search and indexing |
| Log Analytics | Centralized log management |
| Real-time Analytics | Live data analysis |
| Dashboards | OpenSearch Dashboards (successor to Kibana) |

OpenSearch Cluster Architecture
+------------------------------------------------------------------+
| |
| +------------------------+ |
| | OpenSearch Cluster | |
| +------------------------+ |
| | |
| +---------------------+---------------------+ |
| | | | |
| v v v |
| +----------+ +----------+ +----------+ |
| | Master | | Data | | Coordinating |
| | Nodes | | Nodes | | Nodes | |
| | | | | | | |
| | - Cluster| | - Store | | - Query | |
| | Manage | | Data | | Routing| |
| | - State | | - Index | | - Load | |
| | | | | | Balance| |
| +----------+ +----------+ +----------+ |
| |
+------------------------------------------------------------------+
OpenSearch Node Types
+------------------------------------------------------------------+
| |
| Master Nodes |
| +------------------------------------------------------------+ |
| | | |
| | - Cluster state management | |
| | - Index creation/deletion | |
| | - Node coordination | |
| | - Recommended: 3 for HA | |
| | | |
| +------------------------------------------------------------+ |
| |
| Data Nodes |
| +------------------------------------------------------------+ |
| | | |
| | - Store and index data | |
| | - Execute queries | |
| | - Handle CRUD operations | |
| | - Scale horizontally | |
| | | |
| +------------------------------------------------------------+ |
| |
| Coordinating Nodes |
| +------------------------------------------------------------+ |
| | | |
| | - Route queries to data nodes | |
| | - Aggregate results | |
| | - Handle incoming requests | |
| | | |
| +------------------------------------------------------------+ |
| |
| UltraWarm Nodes |
| +------------------------------------------------------------+ |
| | | |
| | - Cost-effective warm storage | |
| | - S3-backed storage | |
| | - For infrequently accessed data | |
| | | |
| +------------------------------------------------------------+ |
| |
+------------------------------------------------------------------+

OpenSearch Index Structure
+------------------------------------------------------------------+
| |
| Index |
| +------------------------------------------------------------+ |
| | | |
| | logs-2024-01-15 | |
| | +--------------------------------------------------------+ | |
| | | Shard 1 | | |
| | | +----------------------------------------------------+ | | |
| | | | Document 1: { "timestamp": ..., "message": ... } | | | |
| | | | Document 2: { "timestamp": ..., "message": ... } | | | |
| | | +----------------------------------------------------+ | | |
| | +--------------------------------------------------------+ | |
| | +--------------------------------------------------------+ | |
| | | Shard 2 | | |
| | | +----------------------------------------------------+ | | |
| | | | Document 3: { "timestamp": ..., "message": ... } | | | |
| | | +----------------------------------------------------+ | | |
| | +--------------------------------------------------------+ | |
| | | |
| +------------------------------------------------------------+ |
| |
| Shards |
| +------------------------------------------------------------+ |
| | | |
| | - Primary shards: Handle write operations | |
| | - Replica shards: Read scaling and redundancy | |
| | - Number of shards affects performance | |
| | | |
| +------------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
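
The diagram above uses a date-suffixed index name (`logs-2024-01-15`). A minimal sketch of how an ingestion client might derive that name from an event timestamp follows; the function name `daily_index_name` and the choice to normalise to UTC are assumptions made for illustration, not something the service requires.

```python
from datetime import datetime, timezone


def daily_index_name(prefix: str, ts: datetime) -> str:
    """Return the daily index name for a timestamp, e.g. logs-2024-01-15."""
    # Normalise to UTC so events close to midnight land in a predictable index.
    day = ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return f"{prefix}-{day}"


if __name__ == "__main__":
    event_time = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
    print(daily_index_name("logs", event_time))
```

One index per day keeps each index small enough to rotate or delete as a unit, which is what the ISM policies in the next diagram operate on.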
Index State Management
+------------------------------------------------------------------+
| |
| ISM Policy |
| +------------------------------------------------------------+ |
| | | |
| | States: | |
| | +------------------------------------------------------+ | |
| | | Hot --> Warm --> Cold --> Delete | | |
| | +------------------------------------------------------+ | |
| | | |
| | Example Policy: | |
| | +------------------------------------------------------+ | |
| | | { | | |
| | | "policy": { | | |
| | | "description": "Log retention policy", | | |
| | | "default_state": "hot", | | |
| | | "states": [ | | |
| | | { | | |
| | | "name": "hot", | | |
| | | "transitions": [{ | | |
| | | "state_name": "warm", | | |
| | | "conditions": { | | |
| | | "min_index_age": "7d" | | |
| | | } | | |
| | | }] | | |
| | | }, | | |
| | | { | | |
| | | "name": "warm", | | |
| | | "actions": [{ "replica_count": { "number_of_replicas": 1 } }], | | |
| | | "transitions": [{ | | |
| | | "state_name": "delete", | | |
| | | "conditions": { "min_index_age": "30d" } | | |
| | | }] | | |
| | | }, | | |
| | | { "name": "delete", "actions": [{ "delete": {} }] } | | |
| | | ] | | |
| | | } | | |
| | | } | | |
| | +------------------------------------------------------+ | |
| | | |
| +------------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
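
As a rough sketch, the hot, warm, delete policy in the diagram can be generated and sanity-checked in Python before it is sent to the ISM API (typically a `PUT` to `_plugins/_ism/policies/<policy_id>`). The helper names `build_retention_policy` and `undefined_transition_targets` are hypothetical, and the check only covers one common mistake: a transition that points at a state the policy never defines.

```python
import json


def build_retention_policy(warm_after: str = "7d", delete_after: str = "30d") -> dict:
    """Build an ISM policy body with three states: hot -> warm -> delete."""
    return {
        "policy": {
            "description": "Log retention policy",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [],
                    "transitions": [
                        {"state_name": "warm", "conditions": {"min_index_age": warm_after}}
                    ],
                },
                {
                    "name": "warm",
                    "actions": [{"replica_count": {"number_of_replicas": 1}}],
                    "transitions": [
                        {"state_name": "delete", "conditions": {"min_index_age": delete_after}}
                    ],
                },
                {"name": "delete", "actions": [{"delete": {}}], "transitions": []},
            ],
        }
    }


def undefined_transition_targets(policy: dict) -> list:
    """Return transition targets that do not match any defined state name."""
    states = policy["policy"]["states"]
    names = {state["name"] for state in states}
    return [
        transition["state_name"]
        for state in states
        for transition in state.get("transitions", [])
        if transition["state_name"] not in names
    ]


if __name__ == "__main__":
    policy = build_retention_policy()
    assert undefined_transition_targets(policy) == []
    print(json.dumps(policy, indent=2))
```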

OpenSearch Data Ingestion
+------------------------------------------------------------------+
| |
| Direct Ingestion |
| +------------------------------------------------------------+ |
| | | |
| | - OpenSearch API (REST) | |
| | - OpenSearch SDK | |
| | - Bulk API for high-volume ingestion | |
| | | |
| +------------------------------------------------------------+ |
| |
| AWS Service Integration |
| +------------------------------------------------------------+ |
| | | |
| | CloudWatch Logs: | |
| | +------------------------------------------------------+ | |
| | | - Subscription filter to Lambda --> OpenSearch | | |
| | | - Subscription filter to Firehose --> OpenSearch | | |
| | +------------------------------------------------------+ | |
| | | |
| | Kinesis Data Firehose: | |
| | +------------------------------------------------------+ | |
| | | - Direct delivery to OpenSearch | | |
| | | - Data transformation via Lambda | | |
| | | - Buffer and batch configuration | | |
| | +------------------------------------------------------+ | |
| | | |
| | Amazon S3: | |
| | +------------------------------------------------------+ | |
| | | - S3 event notification --> Lambda --> OpenSearch | | |
| | | - Batch loading from S3 | | |
| | +------------------------------------------------------+ | |
| | | |
| +------------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
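
To make the "subscription filter to Lambda" path and the Bulk API more concrete, here is a hedged sketch of the two steps a forwarding function would perform: decoding the base64, gzip-compressed payload that CloudWatch Logs delivers under `awslogs.data`, and turning the log events into a newline-delimited `_bulk` body. The output field names (`log_group`, `log_stream`) and the function names are illustrative assumptions, and the sketch stops short of sending the request.

```python
import base64
import gzip
import json


def decode_cloudwatch_event(event: dict) -> dict:
    """Decode the base64 + gzip payload that CloudWatch Logs passes to Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def to_bulk_ndjson(payload: dict, index: str) -> str:
    """Turn decoded log events into a _bulk request body (NDJSON)."""
    lines = []
    for log_event in payload["logEvents"]:
        # Action line: index the document, reusing the event id to avoid duplicates.
        lines.append(json.dumps({"index": {"_index": index, "_id": log_event["id"]}}))
        # Source line: the document itself. The timestamp is epoch milliseconds,
        # so an index template would need to map @timestamp as a date.
        lines.append(json.dumps({
            "@timestamp": log_event["timestamp"],
            "message": log_event["message"],
            "log_group": payload["logGroup"],    # hypothetical field name
            "log_stream": payload["logStream"],  # hypothetical field name
        }))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = {
        "logGroup": "/app/web",
        "logStream": "i-123",
        "logEvents": [{"id": "1", "timestamp": 1705311000000, "message": "ERROR timeout"}],
    }
    data = base64.b64encode(gzip.compress(json.dumps(sample).encode("utf-8"))).decode("ascii")
    print(to_bulk_ndjson(decode_cloudwatch_event({"awslogs": {"data": data}}), "logs-2024-01-15"), end="")
```

Batching many events into one `_bulk` call is what makes this path viable at volume; one HTTP request per log line would overwhelm the coordinating layer long before the data nodes.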
Kinesis Firehose to OpenSearch
+------------------------------------------------------------------+
| |
| Data Flow |
| +------------------------------------------------------------+ |
| | | |
| | +----------+ +----------+ +----------+ | |
| | | Data | --> | Kinesis | --> | Lambda | | |
| | | Source | | Firehose | | Transform| | |
| | +----------+ +----------+ +----------+ | |
| | | | |
| | v | |
| | +----------+ | |
| | | OpenSearch| | |
| | | Service | | |
| | +----------+ | |
| | | | |
| | v | |
| | +----------+ | |
| | | S3 Backup | | |
| | +----------+ | |
| | | |
| +------------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
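
The Lambda transform step in the flow above follows the Firehose transformation contract: each incoming record carries a `recordId` and base64-encoded `data`, and the function returns the same `recordId` with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The sketch below assumes the records are JSON objects and normalises a `level` field to upper case; that field and the default of `INFO` are assumptions chosen for illustration.

```python
import base64
import json


def lambda_handler(event, context):
    """Firehose transformation: normalise the log level of each JSON record."""
    output = []
    for record in event["records"]:
        try:
            doc = json.loads(base64.b64decode(record["data"]))
            if not isinstance(doc, dict):
                raise ValueError("record is not a JSON object")
            # Assumed enrichment: upper-case the level, defaulting to INFO.
            doc["level"] = str(doc.get("level", "INFO")).upper()
            data = base64.b64encode(json.dumps(doc).encode("utf-8")).decode("ascii")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
        except ValueError:
            # Unparseable input is returned unchanged and marked as failed,
            # so Firehose can deliver it to the S3 backup location instead.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}


if __name__ == "__main__":
    good = base64.b64encode(json.dumps({"level": "error", "message": "x"}).encode()).decode()
    print(lambda_handler({"records": [{"recordId": "1", "data": good}]}, None))
```

Returning every `recordId` exactly once matters: Firehose treats a missing record as a processing failure for the whole batch.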

OpenSearch Dashboards
+------------------------------------------------------------------+
| |
| Visualization Types |
| +------------------------------------------------------------+ |
| | | |
| | - Line, area, and bar charts | |
| | - Pie and donut charts | |
| | - Heat maps | |
| | - Gauge and metric visualizations | |
| | - Maps (geospatial data) | |
| | - Tables and data grids | |
| | - Markdown text widgets | |
| | | |
| +------------------------------------------------------------+ |
| |
| Discover (Log Search) |
| +------------------------------------------------------------+ |
| | | |
| | - Query logs using Lucene or DQL syntax | |
| | - Filter by field values | |
| | - Expand document details | |
| | - Save searches | |
| | | |
| +------------------------------------------------------------+ |
| |
| Alerting |
| +------------------------------------------------------------+ |
| | | |
| | - Create monitors for queries | |
| | - Set trigger conditions | |
| | - Configure actions (SNS, Slack, etc.) | |
| | - Track alert history | |
| | | |
| +------------------------------------------------------------+ |
| |
+------------------------------------------------------------------+

OpenSearch Security
+------------------------------------------------------------------+
| |
| Authentication |
| +------------------------------------------------------------+ |
| | | |
| | - Fine-grained access control (FGAC) | |
| | - IAM authentication | |
| | - SAML authentication | |
| | - Basic authentication (username/password) | |
| | | |
| +------------------------------------------------------------+ |
| |
| Encryption |
| +------------------------------------------------------------+ |
| | | |
| | - Encryption at rest (AWS KMS) | |
| | - Encryption in transit (TLS) | |
| | - Node-to-node encryption | |
| | | |
| +------------------------------------------------------------+ |
| |
| Access Control |
| +------------------------------------------------------------+ |
| | | |
| | - Role-based access control (RBAC) | |
| | - Index-level permissions | |
| | - Document-level security | |
| | - Field-level security | |
| | | |
| +------------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
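
For the IAM authentication option, access is usually expressed as a resource-based policy attached to the domain. The sketch below builds such a policy document in Python; the role name `log-reader` is hypothetical, the account ID and domain name are the placeholders used elsewhere in this chapter, and the read-only action list is an assumption about what a typical log reader needs.

```python
import json


def build_domain_access_policy(principal_arn: str, domain_arn: str, read_only: bool = True) -> dict:
    """Build a resource-based access policy for an OpenSearch Service domain."""
    # es:ESHttpGet / es:ESHttpHead cover GET and HEAD requests only; clients that
    # search with POST would also need es:ESHttpPost.
    actions = ["es:ESHttpGet", "es:ESHttpHead"] if read_only else ["es:ESHttp*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": actions,
                # The trailing /* applies the statement to indices and APIs under the domain.
                "Resource": f"{domain_arn}/*",
            }
        ],
    }


if __name__ == "__main__":
    policy = build_domain_access_policy(
        "arn:aws:iam::123456789012:role/log-reader",           # hypothetical role
        "arn:aws:es:us-east-1:123456789012:domain/my-domain",  # placeholder domain
    )
    print(json.dumps(policy, indent=2))
```

A policy like this controls who can reach the domain at all; fine-grained access control then decides which indices, documents, and fields that principal can see.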

# Create OpenSearch domain
aws opensearch create-domain \
  --domain-name my-domain \
  --engine-version OpenSearch_2.3 \
  --cluster-config InstanceType=r6g.large.search,InstanceCount=3,DedicatedMasterEnabled=true,DedicatedMasterType=r6g.large.search,DedicatedMasterCount=3 \
  --ebs-options EBSEnabled=true,VolumeType=gp3,VolumeSize=100 \
  --node-to-node-encryption-options Enabled=true \
  --encryption-at-rest-options Enabled=true \
  --domain-endpoint-options EnforceHTTPS=true

# Describe domain
aws opensearch describe-domain \
  --domain-name my-domain

# List domains
aws opensearch list-domain-names

# Get domain status
aws opensearch describe-domain-health \
  --domain-name my-domain

# Update domain configuration
aws opensearch update-domain-config \
  --domain-name my-domain \
  --cluster-config InstanceCount=5

# Delete domain
aws opensearch delete-domain \
  --domain-name my-domain

# Authorize VPC endpoint
aws opensearch authorize-vpc-endpoint-access \
  --domain-name my-domain \
  --account 123456789012

# Create an OpenSearch Serverless collection
# (Serverless commands live under `aws opensearchserverless`; an encryption
# policy matching the collection name must exist before the collection is created)
aws opensearchserverless create-collection \
  --name my-collection \
  --type TIMESERIES

# Create a data access policy (OpenSearch Serverless)
aws opensearchserverless create-access-policy \
  --name my-access-policy \
  --type data \
  --policy '[{"Rules":[{"ResourceType":"collection","Resource":["collection/*"],"Permission":["aoss:DescribeCollectionItems"]}],"Principal":["arn:aws:iam::..."]}]'

OpenSearch Best Practices
+------------------------------------------------------------------+
| |
| 1. Use appropriate instance types |
| +------------------------------------------------------------+ |
| | - Storage-optimized for large data | |
| | - Compute-optimized for search-heavy workloads | |
| +------------------------------------------------------------+ |
| |
| 2. Configure proper shard count |
| +------------------------------------------------------------+ |
| | - Aim for 10-50 GB per shard | |
| | - Avoid oversharding | |
| +------------------------------------------------------------+ |
| |
| 3. Use UltraWarm for cold data |
| +------------------------------------------------------------+ |
| | - Reduce costs for infrequently accessed data | |
| | - Configure ISM policies | |
| +------------------------------------------------------------+ |
| |
| 4. Enable security features |
| +------------------------------------------------------------+ |
| | - Fine-grained access control | |
| | - Encryption at rest and in transit | |
| +------------------------------------------------------------+ |
| |
| 5. Monitor cluster health |
| +------------------------------------------------------------+ |
| | - Set up CloudWatch alarms | |
| | - Monitor JVM memory pressure | |
| +------------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
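
The 10-50 GB guideline can be turned into a quick estimate of the primary shard count for an index. The sketch below is a back-of-the-envelope calculation under stated assumptions: a 30 GB target shard size (the middle of the range above) and a 10% allowance for indexing overhead. Both defaults are assumptions to tune per workload, and the function name is hypothetical.

```python
import math


def estimate_primary_shards(source_gb: float, target_shard_gb: float = 30.0,
                            overhead: float = 0.10) -> int:
    """Estimate primary shards so that each shard stays near the target size."""
    if source_gb < 0 or target_shard_gb <= 0:
        raise ValueError("source_gb must be >= 0 and target_shard_gb must be > 0")
    # Size on disk is roughly the source size plus indexing overhead.
    index_gb = source_gb * (1 + overhead)
    # Every index needs at least one primary shard.
    return max(1, math.ceil(index_gb / target_shard_gb))


if __name__ == "__main__":
    for daily_gb in (5, 25, 200):
        print(daily_gb, "GB/day ->", estimate_primary_shards(daily_gb), "primary shard(s)")
```

For small daily volumes the estimate bottoms out at one shard, which is a hint that weekly or monthly indices may suit the workload better than daily ones.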

OpenSearch is essential for log analysis, search, and observability. SREs use it to analyze application logs, trace errors, and create operational dashboards.

OpenSearch in DevOps/SRE
+------------------------------------------------------------------+
| |
| SRE Observability & Troubleshooting: |
| |
| 1. Centralized Logging |
| +----------------------------------------------------------+ |
| | - Aggregate logs from all services in one place | |
| | - Search and analyze across multiple sources | |
| | - Create alerts on error patterns | |
| +----------------------------------------------------------+ |
| |
| 2. Operational Dashboards |
| +----------------------------------------------------------+ |
| | - Real-time monitoring of application health | |
| | - Error rate trends and anomaly detection | |
| | - Customer-impact visualization | |
| +----------------------------------------------------------+ |
| |
| 3. Incident Response |
| +----------------------------------------------------------+ |
| | - Fast search across large log volumes | |
| | - Trace errors across microservices | |
| | - Create runbook searches for common issues | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+

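# Install the OpenSearch Python client on Arch Linux
# (Arch marks the system Python as externally managed, so install into a
# virtual environment; ~/.venvs/opensearch is only an example path)
sudo pacman -S python python-pip
python -m venv ~/.venvs/opensearch
~/.venvs/opensearch/bin/pip install opensearch-py

# Query OpenSearch from the CLI (save the lines below as ~/bin/opensearch-query.sh)

#!/bin/bash
# ~/bin/opensearch-query.sh
set -euo pipefail

# The endpoint needs the https:// scheme; curl would otherwise default to plain HTTP.
DOMAIN_ENDPOINT="https://search-my-domain.us-east-1.es.amazonaws.com"
INDEX="application-logs-*"

# Return up to 10 ERROR-level documents from the last hour.
# Assumes the domain accepts unsigned requests from this host; domains that
# require IAM or basic authentication need credentials added to the curl call.
curl -s -X GET "$DOMAIN_ENDPOINT/$INDEX/_search" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "bool": {
        "must": [
          {"match": {"level": "ERROR"}},
          {"range": {"@timestamp": {"gte": "now-1h"}}}
        ]
      }
    },
    "size": 10
  }' | jq '.hits.hits[]._source'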
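
The same search can be parameterised rather than hard-coded. The sketch below builds the query body in Python so the level, time window, and result size can vary per runbook; the function name `build_error_query` and its defaults are assumptions. The resulting dictionary could be serialised for `curl -d`, or passed as the `body` argument to the `search` method of an `opensearch-py` client.

```python
import json


def build_error_query(level: str = "ERROR", window: str = "1h", size: int = 10) -> dict:
    """Build a search body: documents at `level` within the last `window`."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"level": level}},
                    # Date math: "now-1h" means one hour before the query runs.
                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                ]
            }
        },
        "size": size,
    }


if __name__ == "__main__":
    print(json.dumps(build_error_query(window="15m", size=5), indent=2))
```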

OpenSearch Anti-Patterns
+------------------------------------------------------------------+
| |
| ❌ Mistake 1: Not Using Index State Management (ISM) |
| +----------------------------------------------------------+ |
| | Problem: Indices grow unbounded, high costs | |
| | Impact: Performance degradation, cost overruns | |
| | Fix: Configure ISM to rotate and delete old indices | |
| +----------------------------------------------------------+ |
| |
| ❌ Mistake 2: Oversharding |
| +----------------------------------------------------------+ |
| | Problem: Too many small shards | |
| | Impact: Memory overhead, slower searches | |
| | Fix: Aim for 10-50GB per shard, use index templates | |
| +----------------------------------------------------------+ |
| |
| ❌ Mistake 3: Not Using UltraWarm for Cold Data |
| +----------------------------------------------------------+ |
| | Problem: All data on hot storage is expensive | |
| | Impact: High costs for rarely accessed logs | |
| | Fix: Use UltraWarm for data older than 30 days | |
| +----------------------------------------------------------+ |
| |
| ❌ Mistake 4: Not Enabling Encryption |
| +----------------------------------------------------------+ |
| | Problem: Data at rest/transit not encrypted | |
| | Impact: Security vulnerabilities, compliance failures | |
| | Fix: Enable encryption at rest and in transit | |
| +----------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
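
Oversharding is easy to spot once index sizes and primary shard counts are side by side. The sketch below flags indices whose average primary shard falls under the 10 GB floor mentioned above. It assumes rows shaped like the JSON output of `_cat/indices` with sizes reported in whole gigabytes (keys `index`, `pri`, `pri.store.size`); that input shape and the function name are assumptions, so adapt the parsing to the data actually available.

```python
def find_small_shard_indices(rows, min_shard_gb: float = 10.0):
    """Return (index, average primary shard size in GB) for undersized indices."""
    flagged = []
    for row in rows:
        primaries = int(row["pri"])
        if primaries < 1:
            continue  # defensive: skip malformed rows
        avg_gb = float(row["pri.store.size"]) / primaries
        if avg_gb < min_shard_gb:
            flagged.append((row["index"], round(avg_gb, 2)))
    return flagged


if __name__ == "__main__":
    sample_rows = [
        {"index": "logs-2024-01-15", "pri": "5", "pri.store.size": "10"},
        {"index": "logs-2024-01-16", "pri": "2", "pri.store.size": "60"},
    ]
    for name, avg in find_small_shard_indices(sample_rows):
        print(f"{name}: about {avg} GB per primary shard")
```

Indices that show up here are candidates for a lower shard count in the index template or for a coarser rotation period.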

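  1. Q: Explain the difference between hot, UltraWarm, and cold storage in OpenSearch.

    • A: Hot storage keeps indices on instance store or EBS volumes attached to data nodes, which gives the fastest reads and writes for frequently accessed data. UltraWarm is the warm tier: indices are read-only, stored in S3, and served by dedicated warm nodes that cache data locally, which lowers the cost per GB for less frequently accessed data. Cold storage keeps indices in S3 with no compute attached; an index must be moved back to UltraWarm before it can be queried. ISM policies can move indices between tiers automatically.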
  2. Q: How does OpenSearch scaling work?

    • A: OpenSearch scales horizontally by adding data nodes. Each index is split into shards (primary + replicas). For scaling: add more data nodes, adjust shard count, use UltraWarm for cold data. Master nodes handle cluster management separately from data nodes.
  3. Q: Design a log aggregation solution using OpenSearch.
    • A: Stream CloudWatch Logs to Kinesis Data Firehose through subscription filters and deliver them to OpenSearch. Create one index per service per day and use ISM for rotation. Move logs older than 30 days to UltraWarm. Configure fine-grained access control for security. Build OpenSearch Dashboards visualizations for error rates and latency percentiles.

Exam Tip

Key Exam Points
+------------------------------------------------------------------+
| |
| 1. Amazon OpenSearch Service is the successor to Amazon Elasticsearch Service |
| |
| 2. Use Kinesis Firehose for log ingestion |
| |
| 3. UltraWarm provides cost-effective warm storage |
| |
| 4. ISM automates index lifecycle management |
| |
| 5. Fine-grained access control for security |
| |
| 6. OpenSearch Dashboards for visualization |
| |
| 7. Master nodes manage cluster state |
| |
| 8. Data nodes store and index data |
| |
| 9. Shards affect performance and scalability |
| |
| 10. Cross-cluster search for multi-region queries |
| |
+------------------------------------------------------------------+

Chapter 39 Summary
+------------------------------------------------------------------+
| |
| OpenSearch Core Concepts |
| +------------------------------------------------------------+ |
| | - Clusters: Managed OpenSearch deployment | |
| | - Indices: Data containers | |
| | - Shards: Data partitions | |
| | - Documents: Individual records | |
| +------------------------------------------------------------+ |
| |
| Key Features |
| +------------------------------------------------------------+ |
| | - Full-text search | |
| | - Log analytics | |
| | - Real-time dashboards | |
| | - Alerting | |
| +------------------------------------------------------------+ |
| |
| Integration |
| +------------------------------------------------------------+ |
| | - Kinesis Data Firehose | |
| | - CloudWatch Logs | |
| | - Lambda for transformation | |
| +------------------------------------------------------------+ |
| |
+------------------------------------------------------------------+
