Chapter 50: Designing Spotify

Music Streaming with Real-Time Personalization

50.1 Spotify Overview

Spotify is the world’s most popular music streaming service, with over 500 million monthly active users and a catalog of 100+ million tracks.

Spotify by the Numbers:

- 500M+ monthly active users
- 200M+ paid subscribers
- 100M+ tracks
- 4B+ playlists
- 100K+ new tracks added daily
- 2B+ hours streamed monthly

Requirements Analysis

| Requirement | Target / Scale | Technical Challenge |
|---|---|---|
| Streaming | Sub-200ms latency | Global CDN |
| Music catalog | 100M+ tracks | Metadata management |
| Recommendations | Real-time personalization | ML at scale |
| Availability | 99.99% | Global infrastructure |
| Uploads | 100K/day | Ingestion pipeline |
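
To make these targets concrete, here is a rough back-of-envelope calculation in Python. It uses the 2B+ monthly streaming hours and the 99.99% availability target from this chapter. The 160 kbps average bitrate and the 30-day month are assumptions made only for illustration, and the function names are hypothetical.

```python
# Back-of-envelope sizing from the chapter's headline numbers.
# ASSUMPTIONS (not from the source): every stream averages 160 kbps,
# and a month has 30 days.

HOURS_STREAMED_PER_MONTH = 2_000_000_000  # "2B+ hours streamed monthly"
AVG_BITRATE_KBPS = 160                    # assumed average quality tier
AVAILABILITY_TARGET = 0.9999              # "99.99%"

SECONDS_PER_MONTH = 30 * 24 * 3600
MINUTES_PER_YEAR = 365 * 24 * 60


def monthly_egress_petabytes(hours: float, bitrate_kbps: float) -> float:
    """Total audio bytes delivered per month, in decimal petabytes."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return hours * 3600 * bytes_per_second / 1e15


def average_concurrent_streams(hours: float) -> float:
    """Average number of streams playing at any given instant."""
    return hours * 3600 / SECONDS_PER_MONTH


def downtime_budget_minutes_per_year(availability: float) -> float:
    """Minutes per year the service may be down and still meet the target."""
    return (1 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    print(monthly_egress_petabytes(HOURS_STREAMED_PER_MONTH, AVG_BITRATE_KBPS))
    print(average_concurrent_streams(HOURS_STREAMED_PER_MONTH))
    print(downtime_budget_minutes_per_year(AVAILABILITY_TARGET))
```

Under these assumptions the numbers come out to roughly 144 PB of audio delivered per month, about 2.8 million streams playing concurrently on average, and a downtime budget of about 52.6 minutes per year. Peak load would be well above the average, which is one reason a global CDN appears as the technical challenge for streaming.
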
50.2 High-Level Architecture

Spotify Architecture (requests flow from top to bottom):

1. Client Apps (iOS, Android, Desktop)
2. API Gateway (edge, authentication)
3. Backend Services
   - Playback Service
   - Metadata Service
   - Playlist Service
   - Search Service
   - User Profile
   - Library Service
   - Social Service
   - Upload Service
   - Recommendation Services (the secret sauce)
4. Data Layer
   - Cassandra (metadata)
   - PostgreSQL (users, payments)
   - Redis (sessions)
   - Google Cloud Storage (audio)
   - Kafka (events)
   - Google BigQuery

50.3 Music Storage & Delivery

Audio File Storage

Spotify's Audio Pipeline:

1. Upload phase
   Labels/Artists ──▶ Upload to object storage ──▶ Trigger processing
2. Processing pipeline (several hours)
   1. Convert to Spotify format (OGG Vorbis)
   2. Generate multiple quality levels
   3. Create audio fingerprints
   4. Analyze audio (BPM, key, energy)
   5. Store in blob storage
3. Storage & CDN
   - Stored in Google Cloud Storage
   - Distributed via CDN (Google Cloud CDN)
   - Multiple quality levels:
     - 24 kbps (mobile, low bandwidth)
     - 96 kbps (mobile, standard)
     - 160 kbps (desktop, high)
     - 320 kbps (premium, highest)

Streaming Protocol

Instead of HTTP streaming, Spotify uses a custom protocol.

Why a custom protocol?

- Lower latency than HTTP
- Better buffering control
- Optimized for frequent seeking
- Efficient for short playback sessions
- Piracy-resistant (content is encrypted)

Flow:

1. Client requests an audio chunk
2. Server streams encrypted audio
3. Client decrypts and plays
4. Client buffers the next chunks ahead
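
The four-step flow above can be sketched as a client-side loop. This is a minimal illustration, not Spotify's actual protocol: `FakeAudioServer` is a hypothetical in-memory server, the chunk size and prefetch depth are arbitrary, and `toy_cipher` is an XOR stand-in for real content encryption that offers no security.

```python
from collections import deque
from typing import Deque, Iterator

CHUNK_SIZE = 4      # bytes per chunk; tiny so the example is easy to follow
PREFETCH_DEPTH = 2  # how many chunks to keep buffered ahead of playback
KEY = 0x5A          # toy key for the stand-in cipher


def toy_cipher(data: bytes) -> bytes:
    """XOR stand-in for real encryption. NOT secure; applying it twice decrypts."""
    return bytes(b ^ KEY for b in data)


class FakeAudioServer:
    """Hypothetical server holding one encrypted track in memory."""

    def __init__(self, track: bytes) -> None:
        self._encrypted = toy_cipher(track)

    def chunk_count(self) -> int:
        return -(-len(self._encrypted) // CHUNK_SIZE)  # ceiling division

    def get_chunk(self, index: int) -> bytes:
        start = index * CHUNK_SIZE
        return self._encrypted[start:start + CHUNK_SIZE]


def play(server: FakeAudioServer, start_chunk: int = 0) -> Iterator[bytes]:
    """Yield decrypted chunks in order while keeping a small buffer filled."""
    buffer: Deque[bytes] = deque()
    next_to_fetch = start_chunk
    total = server.chunk_count()
    while True:
        # Steps 1, 2 and 4: request encrypted chunks until the buffer is full.
        while len(buffer) < PREFETCH_DEPTH and next_to_fetch < total:
            buffer.append(server.get_chunk(next_to_fetch))
            next_to_fetch += 1
        if not buffer:
            return  # nothing left to fetch or play
        # Step 3: decrypt the oldest buffered chunk and hand it to the player.
        yield toy_cipher(buffer.popleft())


if __name__ == "__main__":
    server = FakeAudioServer(b"0123456789")
    for chunk in play(server):
        print(chunk)
```

Starting from a later chunk, as in `play(server, start_chunk=1)`, models a seek: the client simply begins requesting from a different offset, and the buffer refills from there.
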

Advantages:

- ~200ms startup time
- Seamless track transitions
- Efficient seeking

50.4 Event-Driven Architecture

Spotify processes billions of events daily using Kafka.

Spotify Event Infrastructure

Event types:

- Playback events (song played, paused, skipped)
- Search queries
- Playlist modifications
- Social interactions
- Library changes
- Errors and diagnostics
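
The event types above could be represented as small, typed records before they are published. The sketch below models a playback event as a Python dataclass and serializes it into a key/value pair of bytes, which is the shape a Kafka producer sends. The field names, the JSON encoding, and the choice of user ID as the message key are assumptions for illustration, not Spotify's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class PlaybackAction(Enum):
    PLAYED = "played"
    PAUSED = "paused"
    SKIPPED = "skipped"


@dataclass(frozen=True)
class PlaybackEvent:
    """Hypothetical schema for one playback event."""

    user_id: str
    track_id: str
    action: PlaybackAction
    position_ms: int   # playback position when the action happened
    timestamp_ms: int  # client clock, milliseconds since the Unix epoch

    def key(self) -> bytes:
        """Message key (assumed): the user ID, so one user's events stay together."""
        return self.user_id.encode("utf-8")

    def value(self) -> bytes:
        """Message value: the event as UTF-8 encoded JSON."""
        payload = asdict(self)
        payload["action"] = self.action.value  # store the enum as a plain string
        return json.dumps(payload, sort_keys=True).encode("utf-8")


if __name__ == "__main__":
    event = PlaybackEvent("user-1", "track-9", PlaybackAction.SKIPPED, 12_000, 1_700_000_000_000)
    print(event.key(), event.value())
```

Keying by user means that, with Kafka's default key-based partitioning, all of one user's events land in the same partition, so downstream consumers see them in the order they were produced.
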

Event pipeline:

Apps ──▶ Kafka ──▶ Consumers, which feed:

- Spark Streaming (real-time)
- Data Warehouse (batch)
- Recommendation Models

Kafka at Scale

Spotify's Kafka Cluster

Scale:

- 100+ Kafka brokers
- Trillions of messages per day
- Petabytes of data
- Millions of events per second at peak

Topics:

- user-identity-events
- playback-events
- track-played-events
- search-events
- recommendation-events

50.5 Recommendation System

Spotify’s recommendation system is one of its best-known features, most visibly in Discover Weekly.

Spotify Recommendation Pipeline:

1. Data collection (real-time)
   User actions:
   - What they listen to (complete vs. skip)
   - What they add to playlists
   - What they search for
   - What they like/heart
   - Time of day they listen
   - Social connections
2. Batch processing (offline)
   - Collaborative filtering
   - Audio analysis (the "audio" model)
   - Embeddings for all tracks
   - User clustering
   Using: Apache Spark, Python, TensorFlow
3. Real-time processing
   - Update recommendations in real time
   - "Because you played X" suggestions
   - "Made For You" personalized playlists
   Using: Kafka Streams, Redis

Recommendation Models

Spotify's Recommendation Algorithms:

1. COLLABORATIVE FILTERING
   "Users with similar taste liked these tracks"
   Matrix factorization on user-track interactions

2. AUDIO ANALYSIS
   "This track sounds similar to tracks you like"
   - BPM, danceability, energy
   - Key, tempo
   - Instrumentalness
   - Audio embeddings

3. NATURAL LANGUAGE PROCESSING
   "Tracks described with similar words"
   Text scraped from music blogs and reviews

4. CONVOLUTIONAL NEURAL NETWORKS
   Direct audio analysis
   Raw audio → CNN → Embeddings
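
The first of these models, collaborative filtering by matrix factorization, can be sketched in a few lines. The example below learns a small vector for each user and each track with stochastic gradient descent so that their dot product approximates the observed interaction strength. It is a toy, pure-Python illustration: the interaction data, the hyperparameters, and the encoding of a full play as 1.0 and a skip as 0.0 are all assumptions, and a production system would run a distributed implementation instead.

```python
import random
from typing import List, Sequence, Tuple

Interaction = Tuple[int, int, float]  # (user index, track index, strength)
Factors = List[List[float]]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def mean_squared_error(data: Sequence[Interaction], users: Factors, tracks: Factors) -> float:
    """Average squared gap between observed and predicted strength."""
    return sum((r - dot(users[u], tracks[t])) ** 2 for u, t, r in data) / len(data)


def factorize(
    data: Sequence[Interaction],
    n_users: int,
    n_tracks: int,
    n_factors: int = 2,
    epochs: int = 200,
    learning_rate: float = 0.05,
    regularization: float = 0.01,
    seed: int = 0,
) -> Tuple[Factors, Factors]:
    """Learn user and track vectors whose dot product approximates the data."""
    rng = random.Random(seed)
    users = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    tracks = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_tracks)]
    for _ in range(epochs):
        for u, t, r in data:
            err = r - dot(users[u], tracks[t])
            for f in range(n_factors):
                p, q = users[u][f], tracks[t][f]
                # Gradient step on squared error with L2 regularization.
                users[u][f] += learning_rate * (err * q - regularization * p)
                tracks[t][f] += learning_rate * (err * p - regularization * q)
    return users, tracks


def rank_unseen(user: int, users: Factors, tracks: Factors, data: Sequence[Interaction]) -> List[int]:
    """Tracks the user has not interacted with, best predicted score first."""
    seen = {t for u, t, _ in data if u == user}
    candidates = [t for t in range(len(tracks)) if t not in seen]
    return sorted(candidates, key=lambda t: dot(users[user], tracks[t]), reverse=True)


# Toy data (assumed): 1.0 = played to the end, 0.0 = skipped.
INTERACTIONS: List[Interaction] = [
    (0, 0, 1.0), (0, 1, 1.0), (0, 2, 0.0),
    (1, 0, 1.0), (1, 3, 0.0),
    (2, 2, 1.0), (2, 3, 1.0), (2, 0, 0.0),
    (3, 3, 1.0), (3, 1, 0.0),
]

if __name__ == "__main__":
    users, tracks = factorize(INTERACTIONS, n_users=4, n_tracks=4)
    print("training error:", mean_squared_error(INTERACTIONS, users, tracks))
    print("ranking for user 1:", rank_unseen(1, users, tracks, INTERACTIONS))
```

The learned track vectors are one form of the embeddings mentioned in the pipeline: tracks that are played by the same users end up with similar vectors, which is what lets the model score tracks a user has never heard.
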

Discover Weekly:

- 30 songs, updated every Monday
- A mix of:
  - Songs from similar users
  - Songs with similar audio
  - New releases from followed artists

50.6 Microservices at Spotify

Spotify's Microservices

~1,000 microservices in production!

Each team owns:

- Its own service (end-to-end)
- Its own data
- Its own deployment
- Its own on-call rotation

Key services:

- metadata-service (track, artist info)
- playback-service (streaming control)
- recommendation-service
- playlist-service
- search-service
- user-service
- social-service
- billing-service

Backend for Frontend (BFF)

BFF Pattern at Spotify (requests flow from top to bottom):

1. Mobile App
2. Mobile BFF (dedicated to mobile)
   Aggregates:
   - User profile
   - Playlist data
   - Recommendations
   - Recently played
   Returns: a single optimized response
3. Core Microservices
   - user-service
   - playlist-service
   - recommendation-service
   - ...
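
The aggregation step is the core of the pattern. The following sketch, using Python's `asyncio`, shows a hypothetical mobile BFF endpoint that calls four backend services concurrently and returns one trimmed response. The `fetch_*` functions are stand-ins that simulate network calls; their names, payloads, and latencies are assumptions.

```python
import asyncio
from typing import Any, Dict, List


async def fetch_profile(user_id: str) -> Dict[str, Any]:
    await asyncio.sleep(0.01)  # simulated call to user-service
    return {"user_id": user_id, "name": "Example User", "internal_flags": ["a", "b"]}


async def fetch_playlists(user_id: str) -> List[Dict[str, Any]]:
    await asyncio.sleep(0.01)  # simulated call to playlist-service
    return [{"id": "pl-1", "name": "Morning Mix", "track_ids": ["t1", "t2"]}]


async def fetch_recommendations(user_id: str) -> List[str]:
    await asyncio.sleep(0.01)  # simulated call to recommendation-service
    return ["t7", "t8", "t9"]


async def fetch_recently_played(user_id: str) -> List[str]:
    await asyncio.sleep(0.01)  # simulated call to a play-history service
    return ["t2", "t1"]


async def mobile_home(user_id: str) -> Dict[str, Any]:
    """Aggregate four backend calls into one mobile-sized response."""
    # gather() runs the calls concurrently, so total latency is close to
    # the slowest single call rather than the sum of all four.
    profile, playlists, recommendations, recent = await asyncio.gather(
        fetch_profile(user_id),
        fetch_playlists(user_id),
        fetch_recommendations(user_id),
        fetch_recently_played(user_id),
    )
    return {
        # Return only the fields the mobile home screen needs.
        "profile": {"user_id": profile["user_id"], "name": profile["name"]},
        "playlists": [{"id": p["id"], "name": p["name"]} for p in playlists],
        "recommendations": recommendations,
        "recently_played": recent,
    }


if __name__ == "__main__":
    print(asyncio.run(mobile_home("user-1")))
```

The client makes one request instead of four, and fields it does not need (such as `internal_flags` or full track lists in this sketch) never leave the data center.
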

Benefits:

- Mobile-optimized responses
- Reduced round trips
- Independent scaling

50.7 Key Learnings from Spotify

Spotify Engineering Principles:

1. EVENT-DRIVEN
   - Kafka as the backbone for events
   - Decoupled services
   - Real-time + batch processing
   - Complete audit trail

2. MICROSERVICES
   - ~1,000 independent services
   - Autonomous teams
   - Own data, own deployment

3. CHAOS ENGINEERING
   - Inspired by Netflix
   - Regular chaos experiments
   - Build confidence in resilience

4. RECOMMENDATIONS FIRST
   - ML-driven experience
   - Multiple algorithms combined
   - Real-time personalization

5. DEVELOPER EXPERIENCE
   - Internal tooling
   - Self-service platforms
   - Fast deploys

Summary

- Music streaming - Custom protocol, low latency
- Event-driven - Kafka for billions of events
- Microservices - ~1,000 services
- Recommendations - Multi-model ML pipeline
- Metadata - Cassandra for catalog
- CDN - Google Cloud CDN for audio delivery
Congratulations!
You’ve completed the System Design Guide!
This guide covered:
- Fundamentals: Scalability, load balancing, caching
- Database Design: SQL vs NoSQL, CAP theorem, replication, sharding
- Architecture Patterns: Monolith, microservices, event-driven, CQRS, serverless
- API Design: REST, GraphQL, authentication, message queues
- Reliability: Circuit breakers, rate limiting, retries, timeouts
- Observability: Logging, monitoring, alerting, distributed tracing
- Security: TLS, OAuth/JWT, secrets management, DDoS protection
- Real-world Case Studies: Twitter, Netflix, Uber, Amazon, Spotify
You’re now equipped to design large-scale distributed systems!
Keep learning and building! 🚀