
Design_twitter


Twitter Core Features
=====================
1. Tweet
- Post tweet (280 chars)
- View timeline
- Media support (images, videos)
2. Follow System
- Follow/unfollow users
- View followers/following
3. Timeline
- Home timeline
- User timeline
4. Social
- Likes, retweets, replies
- Mentions
- Hashtags
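
The tweet features above (the 280-character limit, mentions, hashtags) can be sketched as a small validation and parsing step. This is a minimal sketch, not Twitter's actual parsing rules: the function names and the simplified regular expressions are assumptions made for illustration.

```python
import re

MAX_TWEET_LENGTH = 280  # limit taken from the feature list above

# Simplified patterns (assumption): production tokenization rules are more involved.
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")


def validate_tweet(content: str) -> None:
    """Raise ValueError if the tweet is empty or exceeds the length limit."""
    if not content.strip():
        raise ValueError("tweet is empty")
    if len(content) > MAX_TWEET_LENGTH:
        raise ValueError(f"tweet exceeds {MAX_TWEET_LENGTH} characters")


def extract_entities(content: str) -> dict:
    """Return the mentions and hashtags found in a tweet, lowercased."""
    return {
        "mentions": [m.lower() for m in MENTION_RE.findall(content)],
        "hashtags": [h.lower() for h in HASHTAG_RE.findall(content)],
    }
```

The extracted mentions would typically drive notifications, and the hashtags would feed search and trending.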
Requirement    Target
Availability   99.99%
Latency        < 200ms for timeline
Scalability    500M+ users
Consistency    Eventual for timeline
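
To make the 99.99% availability target concrete, it can be converted into a downtime budget. The helper below is a small illustrative calculation; the function name is hypothetical.

```python
def allowed_downtime_minutes(availability_pct: float, days: float = 365.0) -> float:
    """Minutes of downtime permitted over `days` at the given availability."""
    return (1.0 - availability_pct / 100.0) * days * 24 * 60


yearly = allowed_downtime_minutes(99.99)
monthly = allowed_downtime_minutes(99.99, days=30)
print(f"{yearly:.1f} min/year, {monthly:.1f} min/month")
# prints: 52.6 min/year, 4.3 min/month
```

A budget of under an hour per year leaves little room for planned maintenance windows, which is one reason the design leans on replication and redundancy.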

Twitter Architecture
====================
                     Internet
                         |
                         v
+--------------------------------------------------+
|               CDN (Edge Locations)               |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
|                  Load Balancers                  |
+--------------------------------------------------+
                         |
     +-------------------+-------------------+
     |                   |                   |
     v                   v                   v
+---------+         +---------+         +---------+
| Web API |         | Mobile  |         |Internal |
| Servers |         |   API   |         | Services|
+---------+         +---------+         +---------+
     |                   |
     +-------------------+
               |
               v
+--------------------------------------------------+
|            Service Mesh / API Gateway            |
+--------------------------------------------------+
                         |
+----------+ +----------+ +----------+ +----------+
|  Tweet   | |   User   | | Timeline | |  Social  |
| Service  | | Service  | | Service  | | Service  |
+----------+ +----------+ +----------+ +----------+
                         |
                         v
+--------------------------------------------------+
|              Message Queue (Kafka)               |
+--------------------------------------------------+
                         |
             +-----------+-----------+
             |           |           |
             v           v           v
       +----------+ +----------+ +----------+
       |   User   | |  Tweet   | |  Social  |
       |    DB    | |    DB    | |    DB    |
       +----------+ +----------+ +----------+

User Entity
==========
{
  "userId": "uuid",
  "username": "john",
  "displayName": "John Doe",
  "email": "john@example.com",
  "createdAt": "2024-01-01T00:00:00Z",
  "followersCount": 1000,
  "followingCount": 500
}
Tweet Entity
============
{
  "tweetId": "uuid",
  "userId": "uuid",
  "content": "Hello world!",
  "mediaUrls": ["url1", "url2"],
  "replyToTweetId": "uuid (optional)",
  "retweetOfTweetId": "uuid (optional)",
  "likesCount": 100,
  "retweetsCount": 50,
  "repliesCount": 10,
  "createdAt": "2024-01-01T12:00:00Z"
}
Follow Entity
=============
{
  "followerId": "uuid",
  "followingId": "uuid",
  "createdAt": "2024-01-01T12:00:00Z"
}
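
The three entities above map naturally onto typed records. The sketch below uses Python dataclasses; the snake_case field names, the default values, and the helper functions are illustrative choices, not part of the original schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional
import uuid


def _now() -> datetime:
    return datetime.now(timezone.utc)


def _new_id() -> str:
    return str(uuid.uuid4())


@dataclass
class User:
    username: str
    display_name: str
    email: str
    user_id: str = field(default_factory=_new_id)
    created_at: datetime = field(default_factory=_now)
    followers_count: int = 0
    following_count: int = 0


@dataclass
class Tweet:
    user_id: str
    content: str
    media_urls: List[str] = field(default_factory=list)
    reply_to_tweet_id: Optional[str] = None    # set only for replies
    retweet_of_tweet_id: Optional[str] = None  # set only for retweets
    likes_count: int = 0
    retweets_count: int = 0
    replies_count: int = 0
    tweet_id: str = field(default_factory=_new_id)
    created_at: datetime = field(default_factory=_now)


@dataclass(frozen=True)
class Follow:
    follower_id: str   # the user who follows
    following_id: str  # the user being followed
    created_at: datetime = field(default_factory=_now)
```

Modeling the optional reply and retweet references as None, rather than as placeholder strings, keeps an original tweet distinguishable from a reply or retweet.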

Data Type   Database          Reason
Users       MySQL (Sharded)   ACID for relationships
Tweets      Cassandra         High write throughput
Timelines   Redis             Fast read
Follows     Redis + MySQL     Fast lookups
Media       S3 + CloudFront   Blob storage

Timeline Generation (Fan-out)
============================
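When a user posts a tweet (fan-out on write):
1. Write the tweet to the tweet DB (Cassandra)
2. Get the user's followers (Redis)
3. Fan out the tweet to each follower's timeline cache
Timeline read:
1. User requests their home timeline
2. Read from the timeline cache (Redis)
3. On a cache miss, rebuild the timeline by merging recent tweets from followed users in the tweet DB
Pros: Fast reads, because timelines are precomputed
Cons: Slow, expensive writes for users with many followers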
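
The write and read paths above can be sketched with in-memory structures standing in for Cassandra and Redis. This is a simplified illustration under stated assumptions: the function names and the per-user timeline cap are hypothetical, and the cache-miss rebuild path is left out.

```python
from collections import defaultdict, deque

TIMELINE_LIMIT = 800  # hypothetical cap on cached entries per user

tweets = {}                   # tweet_id -> (author_id, content); stands in for Cassandra
followers = defaultdict(set)  # author_id -> follower ids; stands in for Redis
# user_id -> newest-first tweet ids; stands in for the Redis timeline cache
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))


def follow(follower_id, author_id):
    followers[author_id].add(follower_id)


def post_tweet(tweet_id, author_id, content):
    tweets[tweet_id] = (author_id, content)          # 1. write to the tweet store
    for follower_id in followers[author_id]:         # 2. look up the followers
        timelines[follower_id].appendleft(tweet_id)  # 3. push into each timeline cache


def read_timeline(user_id, count=20):
    # Reads are a single cache lookup; no per-author queries are needed.
    ids = list(timelines[user_id])[:count]
    return [tweets[i] for i in ids]
```

The loop in post_tweet shows where the write cost comes from: one cache write per follower, so an account with millions of followers triggers millions of writes for a single tweet.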
Hybrid Approach:
================
Regular users (< 1M followers):
- Fan out to followers' timeline caches on write (push model)
Popular users (> 1M followers):
- Don't fan out
- Merge their tweets into followers' timelines on read (pull model)
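
One way to picture the hybrid read path is as a merge of two already-sorted sources: the precomputed timeline (push) and the recent tweets of followed popular accounts (pull). The sketch below assumes each source is a newest-first list of (timestamp, tweet_id) pairs; the function and parameter names are hypothetical.

```python
import heapq
from itertools import islice

POPULAR_THRESHOLD = 1_000_000  # follower cutoff from the text above


def is_popular(follower_count):
    return follower_count > POPULAR_THRESHOLD


def hybrid_timeline(cached, followed_popular, recent_tweets_by_author, count=20):
    """Merge the precomputed (push) timeline with tweets pulled at read time.

    cached: list of (timestamp, tweet_id), newest first, built by fan-out on write
    followed_popular: ids of popular accounts the reader follows
    recent_tweets_by_author: author_id -> list of (timestamp, tweet_id), newest first
    """
    pulled = [recent_tweets_by_author.get(a, []) for a in followed_popular]
    # All inputs are sorted newest first, so a k-way merge preserves that order.
    merged = heapq.merge(cached, *pulled, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, count)]
```

Because the merge is lazy and stops after count items, the read cost grows with the number of popular accounts a user follows rather than with those accounts' follower counts.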

Twitter Scale
=============
- 200M+ daily active users
- ~5000 tweets/second average
- ~100K tweets/second peak
- Timeline requests: millions/second
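
A quick back-of-the-envelope calculation shows what these figures imply. The average tweet size and average follower count below are assumptions chosen only for illustration; they do not come from the figures above.

```python
AVG_TWEETS_PER_SEC = 5_000        # from the figures above
PEAK_TWEETS_PER_SEC = 100_000     # from the figures above
DAILY_ACTIVE_USERS = 200_000_000  # from the figures above

# Assumptions for illustration only (not from the source):
AVG_TWEET_BYTES = 300  # text plus metadata
AVG_FOLLOWERS = 200    # average fan-out per tweet

SECONDS_PER_DAY = 86_400

tweets_per_day = AVG_TWEETS_PER_SEC * SECONDS_PER_DAY         # 432,000,000
storage_gb_per_day = tweets_per_day * AVG_TWEET_BYTES / 1e9   # about 129.6 GB
fanout_writes_per_sec = AVG_TWEETS_PER_SEC * AVG_FOLLOWERS    # 1,000,000
peak_to_avg = PEAK_TWEETS_PER_SEC / AVG_TWEETS_PER_SEC        # 20.0
tweets_per_user_per_day = tweets_per_day / DAILY_ACTIVE_USERS  # 2.16

print(f"{tweets_per_day:,} tweets/day, {storage_gb_per_day:.1f} GB/day of tweet data")
print(f"{fanout_writes_per_sec:,} timeline cache writes/second from fan-out")
print(f"peak is {peak_to_avg:.0f}x the average write rate")
```

Under these assumptions the fan-out traffic (timeline cache writes) is two orders of magnitude larger than the raw tweet write rate, and the system must absorb a 20x burst over the average.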
Solutions:
==========
1. Read Replicas
+--------------------------------+
| Primary DB -> Read Replicas    |
| Distribute read load           |
+--------------------------------+
2. Caching (Redis)
+--------------------------------+
| Timeline cache                 |
| User cache                     |
| Tweet cache                    |
+--------------------------------+
3. Sharding
+--------------------------------+
| User-based sharding            |
| Tweet ID-based sharding        |
+--------------------------------+
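
As an illustration of the sharding item above, the sketch below routes a user ID or tweet ID to a shard with a stable hash. The shard count and function name are hypothetical, and hash-modulo routing is only one possible scheme.

```python
import hashlib

NUM_SHARDS = 16  # hypothetical shard count


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID or tweet ID to a shard using a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a cryptographic digest is used to get the same answer everywhere.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


print(shard_for("user-123"), shard_for("tweet-456"))
```

Plain modulo routing remaps most keys when the shard count changes, which is why systems that expect to add shards often use consistent hashing or a lookup directory instead.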

Search Architecture
==================
User Query -> API Gateway -> Search Service
                                   |
              +--------------------+--------------------+
              |                    |                    |
              v                    v                    v
         +----------+         +----------+         +----------+
         |Elastic   |         |  Redis   |         | Search   |
         |search    |         |  Cache   |         | Ranking  |
         |Cluster   |         |          |         |          |
         +----------+         +----------+         +----------+
Features:
- Full-text search
- Filters (hashtags, users)
- Trending topics
- Ranking algorithm
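
The first three features can be illustrated with a toy in-memory inverted index. This is not how Elasticsearch is implemented or queried; the class and method names are hypothetical, and ranking is omitted.

```python
import re
from collections import Counter, defaultdict

TOKEN_RE = re.compile(r"#?\w+")  # words, optionally prefixed with '#'


class TinySearchIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # token -> ids of tweets containing it
        self.hashtag_counts = Counter()   # hashtag -> number of occurrences

    def add(self, tweet_id, content):
        for token in TOKEN_RE.findall(content.lower()):
            self.postings[token].add(tweet_id)
            if token.startswith("#"):
                self.hashtag_counts[token] += 1
                self.postings[token[1:]].add(tweet_id)  # also searchable without '#'

    def search(self, query):
        """Return ids of tweets that contain every token in the query."""
        tokens = TOKEN_RE.findall(query.lower())
        if not tokens:
            return set()
        result = None
        for token in tokens:
            ids = self.postings.get(token, set())
            result = set(ids) if result is None else result & ids
        return result

    def trending(self, n=3):
        return self.hashtag_counts.most_common(n)


idx = TinySearchIndex()
idx.add("t1", "Learning #SystemDesign today")
idx.add("t2", "System design interview prep #systemdesign #python")
idx.add("t3", "Python tips")
print(idx.search("python"), idx.trending(1))
```

A real trending feature would count hashtags over a sliding time window rather than over all time, so that old topics decay.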

Complete Twitter Architecture
============================
 +---------------------------------------------------------------+
 |                           Internet                            |
 +---------------------------------------------------------------+
                                 |
                                 v
 +---------------------------------------------------------------+
 |                       CDN (CloudFront)                        |
 |               (Static assets: JS, CSS, Images)                |
 +---------------------------------------------------------------+
                                 |
                                 v
 +---------------------------------------------------------------+
 |                     Load Balancers (ALB)                      |
 +---------------------------------------------------------------+
                                 |
                   +-------------+-------------+
                   |                           |
                   v                           v
           +---------------+           +---------------+
           |    Web App    |           |  Mobile App   |
           |    Servers    |           |    Gateway    |
           +---------------+           +---------------+
                   |                           |
                   +-------------+-------------+
                                 |
                                 v
 +---------------------------------------------------------------+
 |                      API Gateway (Kong)                       |
 |                (Auth, Rate limiting, Routing)                 |
 +---------------------------------------------------------------+
                                 |
     +-------------+-------------+-------------+-------------+
     |             |             |             |             |
     v             v             v             v             v
+---------+   +---------+   +---------+   +---------+   +---------+
|  Tweet  |   |  User   |   |Timeline |   | Search  |   |  Notif  |
| Service |   | Service |   | Service |   | Service |   | Service |
+---------+   +---------+   +---------+   +---------+   +---------+
     |             |             |             |
     +-------------+-------------+-------------+
                                 |
                                 v
 +---------------------------------------------------------------+
 |                 Apache Kafka (Message Queue)                  |
 +---------------------------------------------------------------+
                                 |
            +-------------+-------------+-------------+
            |             |             |             |
            v             v             v             v
       +---------+   +---------+   +---------+   +---------+
       |Cassandra|   |  MySQL  |   |  Redis  |   |Elastic  |
       | (Tweets)|   | (Users) |   | (Cache) |   |search   |
       +---------+   +---------+   +---------+   +---------+
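
The gateway layer in the diagram lists rate limiting among its duties. The source does not say which algorithm is used, so the token bucket below is an assumed, commonly used choice, shown only to illustrate the idea; the class name and parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock             # injectable so the limiter can be tested
        self.last = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill in proportion to the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller would typically respond with HTTP 429


# One bucket per client: 5 requests/second with bursts of up to 10.
bucket = TokenBucket(rate_per_sec=5, capacity=10)
print(bucket.allow())
```

In a multi-node gateway the bucket state would need to live in a shared store so that all nodes enforce the same limit per client.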

Decision                   Rationale
Cassandra for tweets       High write throughput
Redis for timeline         Fast reads
Fan-out on write           Fast timeline reads
Hybrid for popular users   Avoids overwhelming the write path with huge fan-outs
Eventual consistency       Acceptable for timeline
S3 + CDN                   Scalable media storage

Key Twitter design concepts:

  1. Fan-out strategy - Push on write vs pull on read
  2. Database per feature - Different data, different DBs
  3. Caching everywhere - Redis for timelines
  4. Eventual consistency - Acceptable for social timelines
  5. Media offload - S3 + CDN
  6. Search integration - Elasticsearch
