Designing Twitter at Scale: From 100K to 1B Users
Building a social media platform like Twitter is a masterclass in distributed systems design. What starts as a simple CRUD application quickly evolves into one of the most challenging engineering problems in tech: delivering personalized, real-time content to billions of users across the globe.
In this post, we’ll walk through the architectural journey from a startup with 100,000 users to a global platform serving 1 billion people. Along the way, we’ll examine the critical tradeoffs that shape how we build, scale, and maintain such a system.
The Core Problem
At its heart, Twitter solves a seemingly simple problem: allowing users to broadcast short messages to their followers. But this simplicity is deceptive. Let’s break down what we’re really building:
Core Features:
- Post tweets (280 characters, with media)
- Follow other users
- View a personalized timeline of tweets from followed users
- Search tweets and users
- Like, retweet, and reply to tweets
- Real-time notifications
Non-Functional Requirements:
- High availability (99.99%+)
- Low latency for reads (< 200 ms at p99)
- Eventual consistency is acceptable for most operations
- Strong consistency for critical operations (following, blocking)
Phase 1: The Monolith (100K Users)
When you’re starting out, premature optimization is the enemy. Your goal is to validate product-market fit, not to over-engineer for scale you don’t have.
Architecture
┌─────────────────┐
│ │
│ Load Balancer │
│ (nginx) │
│ │
└────────┬────────┘
│
┌────────────────────┼────────────────────┐
│ │ │
┌────▼─────┐ ┌───▼──────┐ ┌────▼─────┐
│ │ │ │ │ │
│ App │ │ App │ │ App │
│ Server │ │ Server │ │ Server │
│ (Rails) │ │ (Rails) │ │ (Rails) │
│ │ │ │ │ │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
└────────────────────┼───────────────────┘
│
┌────────▼────────┐
│ │
│ PostgreSQL │
│ (Primary) │
│ │
└────────┬────────┘
│
┌────────▼────────┐
│ │
│ PostgreSQL │
│ (Replica) │
│ │
└─────────────────┘
Database Schema (Simplified)
users
- id (PK)
- username (unique)
- email (unique)
- created_at
tweets
- id (PK)
- user_id (FK)
- content
- created_at
- (index on user_id, created_at)
follows
- follower_id (FK)
- followee_id (FK)
- created_at
- (PK: follower_id, followee_id)
- (index on followee_id; the PK already covers lookups by follower_id)
likes
- user_id (FK)
- tweet_id (FK)
- created_at
- (PK: user_id, tweet_id)
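As a concrete reference, here is a minimal PostgreSQL sketch of that schema; the exact types and constraints are assumptions, but the keys and indexes mirror the list above:

CREATE TABLE users (
    id          BIGSERIAL PRIMARY KEY,
    username    TEXT NOT NULL UNIQUE,
    email       TEXT NOT NULL UNIQUE,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE tweets (
    id          BIGSERIAL PRIMARY KEY,
    user_id     BIGINT NOT NULL REFERENCES users(id),
    content     VARCHAR(280) NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Serves "tweets by user X, newest first"
CREATE INDEX idx_tweets_user_created ON tweets (user_id, created_at DESC);

CREATE TABLE follows (
    follower_id  BIGINT NOT NULL REFERENCES users(id),
    followee_id  BIGINT NOT NULL REFERENCES users(id),
    created_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (follower_id, followee_id)
);
-- The primary key already serves "who does X follow?";
-- this index serves "who follows X?"
CREATE INDEX idx_follows_followee ON follows (followee_id);

CREATE TABLE likes (
    user_id     BIGINT NOT NULL REFERENCES users(id),
    tweet_id    BIGINT NOT NULL REFERENCES tweets(id),
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (user_id, tweet_id)
);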
The Timeline Query (Fan-out on Read)
The most critical query is fetching a user’s home timeline. At this stage, we compute it on-demand:
SELECT t.*
FROM tweets t
JOIN follows f ON t.user_id = f.followee_id
WHERE f.follower_id = ?
ORDER BY t.created_at DESC
LIMIT 20;
Tradeoffs:
- ✅ Simple to implement and reason about
- ✅ No data staleness—always fresh
- ✅ Easy to change ranking algorithms
- ❌ Expensive query for users who follow many people
- ❌ Gets slower as tweet volume grows
- ❌ Every timeline fetch hits the database
At 100K users with modest activity, this works fine. PostgreSQL with proper indexing can handle thousands of these queries per second.
Phase 2: Caching & Read Optimization (1M Users)
As we grow to 1 million users, the same query pattern starts showing cracks. Database CPU spikes during peak hours, and p99 latencies creep up.
Architecture Evolution
┌─────────────────┐
│ │
│ Load Balancer │
│ │
└────────┬────────┘
│
┌────────────────────┼────────────────────┐
│ │ │
┌────▼─────┐ ┌───▼──────┐ ┌────▼─────┐
│ │ │ │ │ │
│ App │◄────────┤ Redis │───────► App │
│ Server │ │ Cluster │ │ Server │
│ │ │ (Cache) │ │ │
└────┬─────┘ └──────────┘ └────┬─────┘
│ │
└────────────────┬───────────────────────┘
│
┌────────────────┼────────────────┐
│ │ │
┌────▼─────┐ ┌────▼─────┐ ┌────▼─────┐
│PostgreSQL│ │PostgreSQL│ │PostgreSQL│
│ Primary │───►│ Replica │ │ Replica │
│ │ │ │ │ │
└──────────┘ └──────────┘ └──────────┘
Caching Strategy
We introduce Redis for multiple caching layers:
- User Object Cache: Cache user profiles (username, bio, avatar URL)
- Tweet Cache: Cache individual tweets by ID
- Timeline Cache: Cache computed timelines for active users
def get_home_timeline(user_id, limit=20):
    # Try the cache first
    cache_key = f"timeline:{user_id}"
    cached = redis.get(cache_key)
    if cached:
        return deserialize(cached)

    # Cache miss - compute from the database
    timeline = db.query("""
        SELECT t.*
        FROM tweets t
        JOIN follows f ON t.user_id = f.followee_id
        WHERE f.follower_id = %s
        ORDER BY t.created_at DESC
        LIMIT %s
    """, user_id, limit)

    # Cache for 60 seconds
    redis.setex(cache_key, 60, serialize(timeline))
    return timeline
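The user and tweet object caches follow the same cache-aside pattern; a minimal sketch for user profiles, where db.get_user and the 1-hour TTL are assumptions:

def get_user(user_id):
    # Cache-aside: check Redis, fall back to the database
    cache_key = f"user:{user_id}"
    cached = redis.get(cache_key)
    if cached:
        return deserialize(cached)

    user = db.get_user(user_id)
    if user:
        # Profiles change rarely, so a longer TTL is acceptable
        redis.setex(cache_key, 3600, serialize(user))
    return user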
Tradeoffs:
- ✅ Dramatically reduces database load
- ✅ Sub-10ms cache hits for hot timelines
- ❌ Introduces cache invalidation complexity
- ❌ Stale data for up to 60 seconds
- ❌ Memory costs for Redis cluster
The Thundering Herd Problem
When a celebrity tweets, millions of timelines become invalid simultaneously. If we invalidate all caches, we create a “thundering herd” that slams the database.
Solution: Probabilistic early expiration and stale-while-revalidate pattern:
def get_home_timeline_swr(user_id):
    cache_key = f"timeline:{user_id}"
    cached = redis.get(cache_key)
    if cached:
        data = deserialize(cached)
        # 10% chance to refresh in the background if the entry is >30s old
        if data.age > 30 and random.random() < 0.1:
            background_refresh(user_id)
        return data.timeline
    return compute_and_cache_timeline(user_id)
Phase 3: The Fan-out Problem (10M Users)
At 10 million users, we hit a fundamental scaling wall. Consider a celebrity with 10 million followers posting a tweet. With our current “fan-out on read” approach:
- Each of those 10M followers will eventually query for their timeline
- Each query scans the follows table and joins with tweets
- Even with caching, the initial computation is expensive
- Celebrity tweets cause huge database spikes
It’s time for a paradigm shift: fan-out on write.
Architecture Evolution
┌──────────────────┐
│ │
│ Load Balancer │
│ │
└────────┬─────────┘
│
┌──────────────────────┼──────────────────────┐
│ │ │
┌────▼─────┐ ┌────▼─────┐ ┌────▼─────┐
│ │ │ │ │ │
│ API │ │ API │ │ API │
│ Servers │ │ Servers │ │ Servers │
│ │ │ │ │ │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
└──────────┬──────────┴──────────┬──────────┘
│ │
┌────▼─────┐ ┌───▼──────┐
│ │ │ │
│PostgreSQL│ │ Redis │
│ Cluster │ │ Cluster │
│ │ │ │
└────┬─────┘ └──────────┘
│
│
┌────▼─────────────────────────┐
│ │
│ Message Queue (Kafka) │
│ │
└────┬─────────────────────────┘
│
┌───────────────┼───────────────┐
│ │ │
┌────▼─────┐ ┌────▼─────┐ ┌────▼─────┐
│ │ │ │ │ │
│ Fanout │ │ Fanout │ │ Fanout │
│ Workers │ │ Workers │ │ Workers │
│ │ │ │ │ │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
└──────────────┼──────────────┘
│
┌────▼─────┐
│ │
│ Redis │
│(Timeline │
│ Cache) │
│ │
└──────────┘
Fan-out on Write
When a user posts a tweet, we immediately push it to all their followers’ timelines:
def publish_tweet(user_id, content):
    # 1. Store the tweet
    tweet_id = db.insert_tweet(user_id, content)

    # 2. Publish to the message queue for async fan-out
    kafka.publish('new_tweets', {
        'tweet_id': tweet_id,
        'user_id': user_id,
        'timestamp': now()
    })
    return tweet_id

# Worker process
def fanout_worker():
    for message in kafka.consume('new_tweets'):
        tweet = db.get_tweet(message['tweet_id'])
        followers = db.get_followers(message['user_id'])

        # Push to each follower's timeline (in Redis)
        for follower_id in followers:
            redis.lpush(f"timeline:{follower_id}", tweet.id)
            redis.ltrim(f"timeline:{follower_id}", 0, 799)  # Keep the latest 800 tweets
Now reading a timeline is trivial:
def get_home_timeline(user_id, limit=20):
    # Get tweet IDs from the pre-computed timeline
    tweet_ids = redis.lrange(f"timeline:{user_id}", 0, limit - 1)

    # Batch-fetch the tweet data
    tweets = redis.mget([f"tweet:{tid}" for tid in tweet_ids])
    return tweets
Tradeoffs:
- ✅ Extremely fast reads (1-2ms)
- ✅ Database load shifts from reads to writes
- ✅ Scales reads independently
- ❌ Expensive writes for users with many followers
- ❌ Significant memory for timeline storage
- ❌ Harder to change ranking algorithms
- ❌ Eventual consistency—new followers see delayed tweets
The Celebrity Problem
But we’ve created a new problem: when @celebrity with 50M followers tweets, we need to write to 50M timelines. This is slow and expensive.
Solution: Hybrid approach
def is_celebrity(user_id):
    follower_count = get_follower_count(user_id)
    return follower_count > 1_000_000

def publish_tweet(user_id, content):
    tweet_id = db.insert_tweet(user_id, content)

    if is_celebrity(user_id):
        # Don't fan out for celebrities;
        # mark as a celebrity tweet for special handling
        cache.set(f"celebrity_tweet:{tweet_id}", True)
    else:
        # Normal fan-out
        kafka.publish('new_tweets', {
            'tweet_id': tweet_id,
            'user_id': user_id
        })
    return tweet_id
def get_home_timeline(user_id, limit=20):
    # Get the pre-computed timeline of regular-user tweets
    timeline = redis.lrange(f"timeline:{user_id}", 0, 100)

    # Merge in recent tweets from celebrities we follow
    celebrity_follows = get_celebrity_follows(user_id)
    for celeb_id in celebrity_follows:
        recent_tweets = db.query("""
            SELECT id FROM tweets
            WHERE user_id = %s
            AND created_at > NOW() - INTERVAL '1 day'
            ORDER BY created_at DESC
        """, celeb_id)
        timeline.extend(recent_tweets)

    # Sort by time and return the top N (assumes time-sortable tweet IDs,
    # e.g. snowflake-style IDs)
    return sort_by_time(timeline)[:limit]
We’re back to fan-out on read for celebrities, but only for a small percentage of users.
Phase 4: Global Scale (100M Users)
At 100 million users, we need to think about geographical distribution. Users in Tokyo shouldn’t hit servers in Virginia for every request.
Multi-Region Architecture
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ │ │ │ │ │
│ US-EAST │◄────────────►│ EU-WEST │◄────────────►│ AP-TOKYO │
│ Region │ │ Region │ │ Region │
│ │ │ │ │ │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
│ │ │
┌──────────▼──────────┐ ┌─────────▼─────────┐ ┌──────────▼──────────┐
│ │ │ │ │ │
│ API Servers │ │ API Servers │ │ API Servers │
│ Load Balancers │ │ Load Balancers │ │ Load Balancers │
│ Redis Clusters │ │ Redis Clusters │ │ Redis Clusters │
│ │ │ │ │ │
└──────────┬──────────┘ └─────────┬─────────┘ └──────────┬──────────┘
│ │ │
│ │ │
└───────────────────────────┼────────────────────────────┘
│
│
┌────────────▼────────────┐
│ │
│ Global Database Layer │
│ (PostgreSQL + Vitess) │
│ Sharded by user_id │
│ │
└─────────────────────────┘
Data Partitioning Strategy
We shard data by user_id, hashing each user into a fixed set of virtual shards that map onto physical databases (so resharding moves shard assignments rather than rehashing individual keys):
def get_shard_for_user(user_id):
    # 1024 virtual shards mapped to physical databases
    shard_id = hash(user_id) % 1024
    return shard_mapping[shard_id]

# User data lives on its home shard
users_shard_123 = {
    'users': ...,
    'tweets': ...,    # All tweets by these users
    'follows': ...,   # All follows where the follower is in this shard
}
Cross-shard challenges:
When user A (shard 1) follows user B (shard 2), we need the relationship accessible from both shards:
def follow_user(follower_id, followee_id):
    follower_shard = get_shard_for_user(follower_id)
    followee_shard = get_shard_for_user(followee_id)

    # Write to both shards (in practice this needs a cross-shard
    # transaction or an idempotent two-step write)
    with transaction():
        follower_shard.insert('follows', {
            'follower_id': follower_id,
            'followee_id': followee_id,
            'type': 'outgoing'
        })
        followee_shard.insert('follows', {
            'follower_id': follower_id,
            'followee_id': followee_id,
            'type': 'incoming'
        })
This duplication enables local reads at the cost of write complexity and storage.
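To see why the duplication pays off, consider fetching a user's followers: it becomes a single-shard query against the 'incoming' rows written above. A sketch, using the same shard helpers:

def get_followers(user_id):
    # Single-shard read: the followee's shard holds an 'incoming' row
    # for every follower, so no cross-shard fan-out is needed
    shard = get_shard_for_user(user_id)
    rows = shard.query("""
        SELECT follower_id FROM follows
        WHERE followee_id = %s AND type = 'incoming'
    """, user_id)
    return [row.follower_id for row in rows]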
CDN for Media
User-generated media (images, videos) now flows through a CDN:
User Upload → API Server → S3 (us-east-1) → CloudFront CDN → Global Edge Locations
Images are resized and optimized at upload time:
def upload_image(image_data):
    # Generate multiple sizes
    sizes = {
        'thumb': resize(image_data, 150, 150),
        'small': resize(image_data, 400, 400),
        'large': resize(image_data, 1200, 1200),
    }

    # Upload each variant to S3 under a unique key
    image_id = generate_uuid()
    for size_name, image_bytes in sizes.items():
        s3.put_object(
            Bucket='twitter-images',
            Key=f'{image_id}/{size_name}.jpg',
            Body=image_bytes,
            ContentType='image/jpeg'
        )

    # Return the CDN base URL; clients append the size, e.g. /small.jpg
    return f"https://cdn.twitter.com/images/{image_id}"
Phase 5: The Billion-User Platform
At 1 billion users, every component must be built for extreme scale. Let’s examine the final architecture.
Complete System Architecture
┌─────────────────────────────┐
│ │
│ Global DNS / CDN │
│ (Route53 + CloudFront) │
│ │
└──────────────┬──────────────┘
│
┌────────────────────────┼────────────────────────┐
│ │ │
┌────────▼────────┐ ┌───────▼────────┐ ┌────────▼────────┐
│ │ │ │ │ │
│ US-EAST Region │ │ EU-WEST Region│ │ AP-SOUTHEAST │
│ │ │ │ │ Region │
└────────┬────────┘ └───────┬────────┘ └────────┬────────┘
│ │ │
┌──────────▼──────────┐ ┌────────▼────────┐ ┌─────────▼─────────┐
│ │ │ │ │ │
│ API Gateway │ │ API Gateway │ │ API Gateway │
│ (Rate Limiting) │ │ (Rate Limiting)│ │ (Rate Limiting) │
│ │ │ │ │ │
└──────────┬──────────┘ └────────┬────────┘ └─────────┬─────────┘
│ │ │
┌─────────────┼───────────┐ │ ┌──────────┼──────────┐
│ │ │ │ │ │ │
┌─────▼────┐ ┌────▼─────┐ ┌───▼────┐ │ ┌─────▼────┐ ┌──▼────┐ │
│ │ │ │ │ │ │ │ │ │ │ │
│Timeline │ │Tweet │ │Search │ │ │Timeline │ │Tweet │ │
│Service │ │Service │ │Service │ │ │Service │ │Service│ │
│ │ │ │ │ │ │ │ │ │ │ │
└─────┬────┘ └────┬─────┘ └───┬────┘ │ └─────┬────┘ └──┬────┘ │
│ │ │ │ │ │ │
│ │ │ │ │ │ │
│ ┌────▼───────────▼───┐ │ │ │ │
│ │ │ │ │ │ │
└──────►│ Redis Cluster │◄─────┼────────────┘ │ │
│ (Timeline Cache) │ │ │ │
│ │ │ │ │
└────────────────────┘ │ │ │
│ │ │
┌────────────────────┐ │ ┌──────────────▼──────────▼─┐
│ │ │ │ │
│ Kafka Cluster │◄─────┼──────►│ Kafka Cluster │
│ (Event Streaming) │ │ │ (Event Streaming) │
│ │ │ │ │
└─────────┬──────────┘ │ └────────────┬──────────────┘
│ │ │
┌─────────────┼─────────────┐ │ │
│ │ │ │ │
┌─────▼─────┐ ┌─────▼─────┐ ┌────▼───▼──┐ ┌─────▼─────┐
│ │ │ │ │ │ │ │
│ Fanout │ │ Search │ │ Tweet │ │ Analytics │
│ Workers │ │ Indexer │ │ Workers │ │ Workers │
│ │ │ │ │ │ │ │
└─────┬─────┘ └─────┬─────┘ └─────┬─────┘ └───────────┘
│ │ │
│ │ │
│ ┌─────▼─────┐ │
│ │ │ │
│ │Elasticsearch │
│ │ Cluster │ │
│ │ │ │
│ └───────────┘ │
│ │
└───────────────┬───────────┘
│
┌───────────▼───────────┐
│ │
│ PostgreSQL Cluster │
│ (Vitess - Sharded) │
│ 1024 shards │
│ │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ │
│ Object Storage │
│ (S3 / Media Files) │
│ │
└───────────────────────┘
Microservices Breakdown
Timeline Service: Assembles user timelines from multiple sources
Tweet Service: Handles tweet creation, deletion, and retrieval
Search Service: Full-text search across all tweets
Graph Service: Manages follow relationships
Notification Service: Real-time alerts and push notifications
Analytics Service: Aggregates metrics and trends
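The API gateways in the architecture diagram above enforce per-user rate limits. A minimal fixed-window sketch backed by Redis; the 900-requests-per-15-minutes figure and key scheme are illustrative:

import time

def allow_request(user_id, limit=900, window_seconds=900):
    # Fixed-window counter: one Redis key per user per window
    window = int(time.time()) // window_seconds
    key = f"ratelimit:{user_id}:{window}"
    count = redis.incr(key)
    if count == 1:
        # First request in this window: expire the key with the window
        redis.expire(key, window_seconds)
    return count <= limit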
Timeline Service Deep Dive
The timeline service is the most complex component. At 1B users, it must handle:
- 500K+ timeline requests per second at peak
- Mix of read patterns (active users check every minute, passive users check daily)
- Personalization and ranking
class TimelineService:
    def get_home_timeline(self, user_id, limit=20):
        # 1. Check if the user's timeline is in the hot cache (L1)
        cached = self.hot_cache.get(f"tl:{user_id}")
        if cached:
            return self._hydrate_tweets(cached[:limit])

        # 2. Check the warm cache (L2 - Redis)
        warm = self.redis.lrange(f"timeline:{user_id}", 0, 199)
        if warm:
            self.hot_cache.set(f"tl:{user_id}", warm, ttl=60)
            return self._hydrate_tweets(warm[:limit])

        # 3. Reconstruct from sources (cold path)
        return self._build_timeline_cold(user_id, limit)

    def _build_timeline_cold(self, user_id, limit):
        # Fetch from multiple sources in parallel
        futures = []

        # Source 1: Regular users (fan-out on write)
        futures.append(
            self.executor.submit(
                self._fetch_fanned_timeline, user_id
            )
        )

        # Source 2: Celebrities (fan-out on read)
        celebrity_follows = self._get_celebrity_follows(user_id)
        for celeb_id in celebrity_follows:
            futures.append(
                self.executor.submit(
                    self._fetch_celebrity_tweets, celeb_id
                )
            )

        # Source 3: Promoted tweets (ads)
        futures.append(
            self.executor.submit(
                self._fetch_promoted_tweets, user_id
            )
        )

        # Merge and rank
        all_tweets = []
        for future in futures:
            all_tweets.extend(future.result())
        ranked = self._rank_tweets(user_id, all_tweets)

        # Cache the result
        self.redis.delete(f"timeline:{user_id}")
        self.redis.rpush(f"timeline:{user_id}", *ranked[:200])
        self.redis.expire(f"timeline:{user_id}", 300)

        return self._hydrate_tweets(ranked[:limit])

    def _rank_tweets(self, user_id, tweets):
        # ML-based ranking model
        user_features = self.feature_store.get_user(user_id)
        scores = []
        for tweet in tweets:
            tweet_features = self._extract_features(tweet)
            score = self.ranking_model.predict(
                user_features,
                tweet_features
            )
            scores.append((score, tweet))

        # Sort by score, descending
        scores.sort(reverse=True, key=lambda x: x[0])
        return [tweet for score, tweet in scores]
Search at Scale
Twitter search needs to handle:
- 500K+ searches per second
- Full-text search across 500B+ tweets
- Real-time indexing (new tweets searchable in under 10 seconds)
- Faceted search (by user, date, engagement, media type)
We use Elasticsearch with custom optimizations:
Tweet Published
│
├──► Kafka Topic: "new_tweets"
│
▼
Search Indexer Workers (100s of instances)
│
├──► Batch tweets (100 tweets or 1 second)
│
▼
Elasticsearch Cluster
│
├──► Index: tweets_2026_02_10 (daily rolling indices)
│
└──► Sharded by tweet_id hash (1024 shards)
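A sketch of one indexer worker, assuming the elasticsearch-py client (Elasticsearch, helpers.bulk) alongside the Kafka and database helpers used earlier; the batch size and flush interval mirror the diagram:

import time
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://search-cluster:9200")  # endpoint is an assumption

def search_indexer_worker(batch_size=100, flush_interval=1.0):
    batch, last_flush = [], time.time()
    for message in kafka.consume('new_tweets'):
        tweet = db.get_tweet(message['tweet_id'])
        batch.append({
            "_index": f"tweets_{tweet.created_at:%Y_%m_%d}",  # daily rolling index
            "_id": tweet.id,
            "_source": {
                "tweet_id": str(tweet.id),
                "user_id": str(tweet.user_id),
                "content": tweet.content,
                "created_at": tweet.created_at.isoformat(),
            },
        })
        # Flush every 100 tweets or every second, whichever comes first
        if len(batch) >= batch_size or time.time() - last_flush >= flush_interval:
            helpers.bulk(es, batch)
            batch, last_flush = [], time.time()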
Index structure:
{
  "mappings": {
    "properties": {
      "tweet_id": {"type": "keyword"},
      "user_id": {"type": "keyword"},
      "content": {
        "type": "text",
        "analyzer": "tweet_analyzer",
        "fields": {
          "exact": {"type": "keyword"}
        }
      },
      "hashtags": {"type": "keyword"},
      "mentions": {"type": "keyword"},
      "created_at": {"type": "date"},
      "engagement_score": {"type": "float"},
      "has_media": {"type": "boolean"},
      "lang": {"type": "keyword"}
    }
  }
}
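Against that mapping, a faceted query (say, a phrase from one account in the last day, English only, with media, ranked by engagement) might look like the following; the es.search call assumes a recent elasticsearch-py client, and the field values are illustrative:

results = es.search(
    index="tweets_*",
    query={
        "bool": {
            "must": [{"match": {"content": "rocket launch"}}],
            "filter": [
                {"term": {"user_id": "12345"}},
                {"term": {"lang": "en"}},
                {"term": {"has_media": True}},
                {"range": {"created_at": {"gte": "now-1d"}}},
            ],
        }
    },
    sort=[{"engagement_score": "desc"}],
    size=20,
)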
Custom analyzer for tweet-specific tokenization:
{
  "analysis": {
    "analyzer": {
      "tweet_analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "lowercase",
          "hashtag_filter",
          "mention_filter",
          "emoji_filter"
        ]
      }
    }
  }
}
Real-Time Notifications
Notifications present a unique challenge: delivering messages to millions of users within seconds, whether or not they currently have the app open, without holding an open WebSocket connection to every single user.
┌─────────────┐
│ Tweet │
│ Service │
└──────┬──────┘
│
┌──────▼──────┐
│ Kafka │
│ notifications│
└──────┬──────┘
│
┌────────────┼────────────┐
│ │ │
┌────▼────┐ ┌────▼────┐ ┌───▼─────┐
│Notif │ │Notif │ │Notif │
│Worker │ │Worker │ │Worker │
│ │ │ │ │ │
└────┬────┘ └────┬────┘ └────┬────┘
│ │ │
└────────────┼────────────┘
│
┌────────────▼────────────┐
│ │
│ Notification Store │
│ (Redis Sorted Sets) │
│ │
└────────────┬────────────┘
│
┌────────────┼────────────┐
│ │ │
┌────▼────┐ ┌────▼────┐ ┌───▼─────┐
│ Push │ │WebSocket│ │ SMS │
│Notif │ │Gateway │ │Gateway │
│(APNs/ │ │ │ │ │
│ FCM) │ │ │ │ │
└─────────┘ └─────────┘ └─────────┘
For online users, we maintain WebSocket connections to regional gateway servers:
class WebSocketGateway:
    def __init__(self):
        self.connections = {}  # user_id -> websocket

    def is_online(self, user_id):
        return user_id in self.connections

    async def handle_connection(self, websocket, user_id):
        self.connections[user_id] = websocket

        # Each connection gets its own subscription to the user's channel
        pubsub = redis.pubsub()
        pubsub.subscribe(f"notif:{user_id}")
        try:
            async for message in pubsub.listen():
                if message['type'] == 'message':
                    await websocket.send(message['data'])
        finally:
            pubsub.unsubscribe(f"notif:{user_id}")
            del self.connections[user_id]
For offline users, we batch notifications and send via push services:
def process_notification(notification):
    user_id = notification['user_id']

    # Check whether the user has an open WebSocket connection
    if websocket_gateway.is_online(user_id):
        # Deliver via WebSocket
        redis.publish(f"notif:{user_id}", notification)
    else:
        # Store for later delivery
        redis.zadd(
            f"notif_queue:{user_id}",
            {notification['id']: notification['timestamp']}
        )
        # Send a push notification
        user_devices = get_user_devices(user_id)
        for device in user_devices:
            if device.type == 'ios':
                apns.send(device.token, notification)
            elif device.type == 'android':
                fcm.send(device.token, notification)
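The read side of the notification store is then a sorted-set range query. A minimal sketch, where notification_store.get_many is a hypothetical payload lookup and the page size is illustrative:

def get_recent_notifications(user_id, limit=50):
    # Newest first: members are notification IDs scored by timestamp
    notif_ids = redis.zrevrange(f"notif_queue:{user_id}", 0, limit - 1)
    return notification_store.get_many(notif_ids)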
Critical Tradeoffs Summary
Throughout this journey, we’ve made countless tradeoffs. Here are the most impactful:
1. Fan-out on Write vs. Fan-out on Read
| Approach | Pros | Cons | Best For |
|---|---|---|---|
| Fan-out on Read | Simple, flexible, always fresh | Slow for users following many people | Small scale, <100K users |
| Fan-out on Write | Fast reads, predictable latency | Expensive for celebrities, hard to change ranking | Medium scale, normal users |
| Hybrid | Best of both worlds | Complex, requires two code paths | Large scale, 1B+ users |
Our solution: Hybrid with celebrity threshold at 1M followers.
2. Consistency vs. Availability
Twitter prioritizes availability over consistency (AP in CAP theorem). It’s acceptable if:
- A new follower doesn’t see historical tweets immediately
- Like counts are slightly off
- Timelines show tweets in slightly wrong order
But we require strong consistency for:
- Follow/unfollow operations (can’t follow someone twice)
- Block operations (must take effect immediately)
- Account deletion (must be permanent)
3. Normalization vs. Denormalization
We heavily denormalize for read performance:
-- Instead of joining 3 tables on every timeline read:
SELECT u.username, u.avatar, t.content, t.created_at
FROM tweets t
JOIN users u ON t.user_id = u.id
JOIN likes l ON t.id = l.tweet_id
WHERE ...

-- ...we cache fully denormalized tweet documents:
{
  "tweet_id": "123",
  "content": "Hello world",
  "author": {
    "username": "alice",
    "avatar": "https://..."
  },
  "stats": {
    "likes": 42,
    "retweets": 7
  }
}
Cost: When a user changes their avatar, we must update millions of cached tweets. We solve this with eventual consistency and lazy updates.
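One way to keep that tractable is to repair denormalized copies lazily from a profile-change event stream instead of rewriting them all synchronously. A sketch, where the profile_updates topic, db.get_recent_tweet_ids, and render_tweet are assumed helpers:

def profile_update_worker():
    # Consume profile changes and repair denormalized copies lazily
    for event in kafka.consume('profile_updates'):
        user = db.get_user(event['user_id'])

        # Refresh the user object cache immediately
        redis.setex(f"user:{user.id}", 3600, serialize(user))

        # Re-render only the author's recent, hot tweets; older cached
        # copies simply age out via their TTLs (eventual consistency)
        for tweet_id in db.get_recent_tweet_ids(user.id, days=7):
            key = f"tweet:{tweet_id}"
            if redis.exists(key):
                redis.set(key, serialize(render_tweet(tweet_id, author=user)))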
4. Latency vs. Freshness
| Timeline Update Strategy | P99 Latency | Freshness | Complexity |
|---|---|---|---|
| Real-time (no cache) | 500ms | Instant | Low |
| Cache with TTL | 50ms | 30-60s delay | Medium |
| Pre-computed + cache | 5ms | Delayed fan-out | High |
Our solution: Pre-computed timelines with 5-10 second fan-out delay. Users see tweets within seconds of posting, with sub-10ms read latency.
5. Storage Costs vs. Query Performance
At 1B users posting 500M tweets/day, storage is expensive:
- 500M tweets/day × 280 chars × 365 days = ~50 TB/year (text only)
- With metadata, indices, replicas: ~500 TB/year
- With media (images, videos): ~50 PB/year
Optimizations:
- Store old tweets (>1 year) in cold storage (S3 Glacier)
- Keep only 90 days in hot database
- Compress media aggressively
- Delete tweets with zero engagement after 5 years
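A sketch of the archival job behind the first two bullets; boto3's put_object is real, but db.fetch_tweets_older_than, db.delete_tweets, the bucket name, and the key layout are assumptions:

import json
import boto3

s3 = boto3.client('s3')

def archive_old_tweets(batch_size=10_000):
    # Move tweets older than 90 days out of the hot database into S3;
    # a bucket lifecycle rule transitions them to Glacier after a year
    while True:
        tweets = db.fetch_tweets_older_than(days=90, limit=batch_size)
        if not tweets:
            break
        for tweet in tweets:
            s3.put_object(
                Bucket='twitter-tweet-archive',
                Key=f'{tweet.created_at:%Y/%m/%d}/{tweet.id}.json',
                Body=json.dumps(tweet.to_dict()),
            )
        db.delete_tweets([t.id for t in tweets])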
Performance Numbers
Here’s what we achieve at 1B users:
| Metric | Target | Achieved |
|---|---|---|
| Timeline load time (p99) | <200ms | 150ms |
| Tweet publish latency (p99) | <500ms | 300ms |
| Search latency (p99) | <500ms | 400ms |
| Availability | 99.99% | 99.97% |
| Write throughput | 10K tweets/sec | 12K tweets/sec |
| Read throughput | 500K requests/sec | 600K requests/sec |
Lessons Learned
- Start simple, scale progressively: Don’t build for 1B users on day one. Each scale tier has different optimal solutions.
- Denormalize aggressively for reads: In social media, reads outnumber writes 1000:1. Optimize for the common case.
- Embrace eventual consistency: Perfect consistency is expensive. Users tolerate slight staleness in timelines.
- Cache everything, but carefully: Caching is powerful but cache invalidation is hard. Use TTLs and probabilistic expiration.
- Hybrid approaches win: Pure fan-out on write OR read doesn’t scale. The best solution is often a hybrid.
- Measure everything: You can’t optimize what you don’t measure. Instrument every service, every query, every cache hit.
- Failure is normal: At scale, something is always broken. Design for partial failure and graceful degradation.
Conclusion
Scaling Twitter from 100K to 1B users is a journey through the full spectrum of distributed systems challenges. What starts as a simple Rails app evolves into a globally distributed, microservices-based platform handling terabytes of data and millions of requests per second.
The key insight is that there’s no single “right” architecture. The optimal design depends entirely on your current scale, growth trajectory, and engineering resources. A startup with 100K users has radically different needs than a mature platform with 100M users.
The architectures we’ve explored represent common patterns, but every company’s implementation will differ based on their specific constraints and requirements. The principles, however, remain constant: start simple, measure relentlessly, and evolve your architecture to match your scale.
Building a Twitter-scale system is one of the hardest problems in software engineering. But by understanding the tradeoffs, learning from others’ experiences, and staying focused on your users’ needs, it’s absolutely achievable.
Want to dive deeper? Check out papers on Cassandra, Kafka, and Vitess. Also explore High Scalability for real-world architecture case studies.