Music Streaming with Real-Time Personalization

Spotify is the world’s most popular music streaming service, with more than 500 million monthly active users and a catalog of over 100 million tracks.

Spotify by the Numbers
====================
┌─────────────────────────────────────────────────────────────┐
│ 500M+ monthly active users │
│ 200M+ subscribers (paid) │
│ 100M+ tracks │
│ 4B+ playlists │
│ 100K+ new tracks added daily │
│ 2B+ hours streamed monthly │
└─────────────────────────────────────────────────────────────┘
Requirement      Scale                       Technical Challenge
───────────      ─────                       ───────────────────
Streaming        Sub-200ms latency           Global CDN
Music catalog    100M+ tracks                Metadata management
Recommendations  Real-time personalization   ML at scale
Availability     99.99%                      Global infrastructure
Uploads          100K tracks/day             Ingestion pipeline
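
To get a feel for what these numbers imply, here is a rough back-of-envelope sketch in Python. The streaming-hours figure comes from the box above; the 160 kbps average bitrate and the 30-day month are assumptions made only for this estimate, not published Spotify figures.

```python
# Back-of-envelope estimate based on "2B+ hours streamed monthly".
HOURS_STREAMED_PER_MONTH = 2_000_000_000
HOURS_IN_MONTH = 30 * 24            # assumption: 30-day month
ASSUMED_AVG_BITRATE_KBPS = 160      # assumption: not a figure from this guide


def average_concurrent_streams(hours_per_month: float) -> float:
    """Average number of streams playing at any given instant."""
    return hours_per_month / HOURS_IN_MONTH


def average_egress_gbps(concurrent_streams: float, bitrate_kbps: float) -> float:
    """Average outbound audio bandwidth in gigabits per second."""
    return concurrent_streams * bitrate_kbps / 1_000_000  # kbps -> Gbps


streams = average_concurrent_streams(HOURS_STREAMED_PER_MONTH)
egress = average_egress_gbps(streams, ASSUMED_AVG_BITRATE_KBPS)
print(f"~{streams:,.0f} concurrent streams on average")
print(f"~{egress:,.0f} Gbps average audio egress")
```

These are averages; peak traffic would be noticeably higher, which is one reason audio delivery is pushed out to a CDN.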

Spotify Architecture
=================
┌─────────────────────────────────────────────────────────────┐
│ Client Apps │
│ (iOS, Android, Desktop) │
└────────────────────────────┬────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ API Gateway │
│ (Edge, Authentication) │
└────────────────────────────┬────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Backend Services │
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Playback│ │Metadata │ │Playlist │ │ Search │ │
│ │ Service │ │ Service │ │ Service │ │ Service │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ User │ │Library │ │ Social │ │Upload │ │
│ │ Profile │ │ Service │ │ Service │ │ Service │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Recommendation Services (Secret Sauce!) │ │
│ └─────────────────────────────────────────────────────┘ │
└────────────────────────────┬────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Data Layer │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │Cassandra │ │ PostgreSQL│ │ Redis │ │
│ │(Metadata)│ │ (User/Pay)│ │(Sessions)│ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ GCS │ │ Kafka │ │ Google │ │
│ │(Audio) │ │ (Events) │ │ BigQuery │ │
│ └──────────┘ └──────────┘ └──────────┘ │
└─────────────────────────────────────────────────────────────┘

Spotify's Audio Pipeline
=====================
┌─────────────────────────────────────────────────────────────┐
│ Upload Phase │
│ ────────────────────────────────────────────────────────│
│ │
│ Labels/Artists ──▶ Upload to object storage ──▶ Trigger processing │
│ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Processing Pipeline (Several hours) │
│ ────────────────────────────────────────────────────────│
│ │
│ 1. Convert to Spotify format (OGG Vorbis) │
│ 2. Generate multiple quality levels │
│ 3. Create audio fingerprints │
│ 4. Analyze audio (BPM, key, energy) │
│ 5. Store in blob storage │
│ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Storage & CDN │
│ ────────────────────────────────────────────────────────│
│ │
│ Stored in Google Cloud Storage │
│ Distributed via CDN (Google Cloud CDN) │
│ │
│ Multiple quality levels: │
│ • 24kbps (mobile, low bandwidth) │
│ • 96kbps (mobile, standard) │
│ • 160kbps (desktop, high) │
│ • 320kbps (premium, highest) │
│ │
└─────────────────────────────────────────────────────────────┘
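
As an illustration of how a client might choose between these quality levels, here is a minimal sketch. The bitrate list comes from the box above; the function name `pick_bitrate`, the 1.5x headroom factor, and the rule that only 320kbps is premium-gated are assumptions for the example, not Spotify's actual selection logic.

```python
QUALITY_LEVELS_KBPS = [24, 96, 160, 320]   # from the list above
PREMIUM_ONLY_KBPS = {320}                  # assumption for this sketch


def pick_bitrate(measured_kbps: float, is_premium: bool, headroom: float = 1.5) -> int:
    """Return the highest bitrate the connection can sustain with some headroom.

    Falls back to the lowest allowed level when bandwidth is very poor.
    """
    allowed = [
        level for level in QUALITY_LEVELS_KBPS
        if is_premium or level not in PREMIUM_ONLY_KBPS
    ]
    sustainable = [level for level in allowed if level * headroom <= measured_kbps]
    return max(sustainable) if sustainable else min(allowed)


print(pick_bitrate(measured_kbps=1000, is_premium=True))
print(pick_bitrate(measured_kbps=200, is_premium=True))
```

The headroom factor leaves spare bandwidth so that a brief network dip does not immediately drain the playback buffer.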
Spotify Streaming Protocol
======================
Instead of plain HTTP streaming, Spotify's clients have historically used a custom protocol:
┌─────────────────────────────────────────────────────────────┐
│ Why Custom Protocol? │
│ ─────────────────────────────────────────────────────────│
│ │
│ • Lower latency than HTTP │
│ • Better buffering control │
│ • Optimized for frequent seeking │
│ • Efficient for short playback sessions │
│ • Harder to pirate (encrypted content) │
│ │
└─────────────────────────────────────────────────────────────┘
Flow:
─────────────────────────────────────────────────────────
1. Client requests audio chunk
2. Server streams encrypted audio
3. Client decrypts and plays
4. Buffer next chunks ahead
Advantages:
• ~200ms startup time
• Seamless track transitions
• Efficient seeking
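
The four-step flow above can be sketched as a small playback loop. This is a toy model, not the real protocol: `fetch_chunk` is a hypothetical callback standing in for the network request, and `toy_decrypt` uses XOR purely as a placeholder for real content encryption.

```python
from collections import deque
from typing import Callable, Iterator, Optional


def toy_decrypt(chunk: bytes, key: int) -> bytes:
    """XOR placeholder for decryption. NOT real cryptography."""
    return bytes(b ^ key for b in chunk)


def play_stream(
    fetch_chunk: Callable[[int], Optional[bytes]],
    key: int,
    buffer_ahead: int = 3,
) -> Iterator[bytes]:
    """Yield decrypted chunks while keeping a few chunks buffered ahead."""
    buffer = deque()
    next_index = 0
    exhausted = False
    while True:
        # Step 4: top the buffer up before playing the next chunk.
        while not exhausted and len(buffer) < buffer_ahead:
            chunk = fetch_chunk(next_index)   # Steps 1-2: request, receive encrypted audio
            if chunk is None:
                exhausted = True
            else:
                buffer.append(chunk)
                next_index += 1
        if not buffer:
            return
        yield toy_decrypt(buffer.popleft(), key)   # Step 3: decrypt and play


KEY = 0x5A
plain_chunks = [b"intro", b"verse", b"chorus", b"outro"]
encrypted_chunks = [toy_decrypt(c, KEY) for c in plain_chunks]  # XOR is its own inverse


def fetch(index: int) -> Optional[bytes]:
    return encrypted_chunks[index] if index < len(encrypted_chunks) else None


played = list(play_stream(fetch, KEY, buffer_ahead=2))
print(played)
```

Because chunks are addressed by index, seeking amounts to resetting `next_index` and clearing the buffer, which fits the "efficient seeking" point above.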

Spotify processes billions of events daily using Kafka.

Spotify Event Infrastructure
==========================
┌─────────────────────────────────────────────────────────────┐
│ Event Types │
│ ────────────────────────────────────────────────────────│
│ │
│ • Playback events (song played, paused, skipped) │
│ • Search queries │
│ • Playlist modifications │
│ • Social interactions │
│ • Library changes │
│ • Errors and diagnostics │
│ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Event Pipeline │
│ ────────────────────────────────────────────────────────│
│ │
│ Apps ──▶ Kafka ──▶ Consumers │
│ │ │
│ ├──▶ Spark Streaming (real-time) │
│ │ │
│ ├──▶ Data Warehouse (batch) │
│ │ │
│ └──▶ Recommendation Models │
│ │
└─────────────────────────────────────────────────────────────┘
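
The key property of this pipeline is that each consumer reads the same event log independently. The sketch below models that with an in-memory append-only log and per-group offsets. `InMemoryTopic` is a hypothetical stand-in written for this example; it is not the Kafka client API.

```python
from collections import defaultdict


class InMemoryTopic:
    """Append-only event log where each consumer group tracks its own offset."""

    def __init__(self):
        self._log = []
        self._offsets = defaultdict(int)   # consumer group -> next index to read

    def publish(self, event: dict) -> None:
        self._log.append(event)

    def poll(self, group: str, max_events: int = 100) -> list:
        """Return the next unread events for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._log[start:start + max_events]
        self._offsets[group] = start + len(batch)
        return batch


topic = InMemoryTopic()
for action in ("played", "skipped", "played"):
    topic.publish({"type": "playback", "user_id": "u1", "action": action})

# A real-time consumer and a batch consumer both see every event.
realtime_batch = topic.poll("realtime")
skips = sum(1 for event in realtime_batch if event["action"] == "skipped")
warehouse_batch = topic.poll("warehouse")
print(skips, len(warehouse_batch))
```

Since reading does not remove events from the log, adding a new consumer (for example, a new recommendation model) does not affect the existing ones.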
Spotify's Kafka Cluster
======================
┌─────────────────────────────────────────────────────────────┐
│ Scale: │
│ • 100+ Kafka brokers │
│ • Trillions of messages per day │
│ • Petabytes of data │
│ • Millions of events per second at peak │
│ │
│ Topics: │
│ • user-identity-events │
│ • playback-events │
│ • track-played-events │
│ • search-events │
│ • recommendation-events │
│ │
└─────────────────────────────────────────────────────────────┘
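
At this scale a topic is split into partitions, and events are typically routed by key so that all events for one user land in the same partition and keep their order. Here is a minimal sketch of that idea. The MD5-based hash and the partition count of 12 are assumptions for illustration; Kafka's default partitioner uses a different hash function.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition deterministically (illustrative hash only)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


NUM_PARTITIONS = 12   # hypothetical partition count
events = [
    ("user-1", "play"),
    ("user-2", "play"),
    ("user-1", "skip"),
    ("user-1", "play"),
]

partitions = {}
for user_id, action in events:
    index = partition_for(user_id, NUM_PARTITIONS)
    partitions.setdefault(index, []).append((user_id, action))

user_1_partition = partition_for("user-1", NUM_PARTITIONS)
print(partitions[user_1_partition])
```

Per-user ordering matters for topics like playback-events, where "played then skipped" means something different from "skipped then played".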

Spotify’s recommendation system is one of its best-known features, especially the Discover Weekly playlist.

Spotify Recommendation Pipeline
============================
┌─────────────────────────────────────────────────────────────┐
│ Data Collection (Real-time) │
│ ────────────────────────────────────────────────────────│
│ │
│ User Actions: │
│ • What they listen to (complete vs skip) │
│ • What they add to playlists │
│ • What they search for │
│ • What they like/heart │
│ • Time of day they listen │
│ • Social connections │
│ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Batch Processing (Offline) │
│ ────────────────────────────────────────────────────────│
│ │
│ • Collaborative filtering │
│ • Audio analysis (the "audio" model) │
│ • Embeddings for all tracks │
│ • User clustering │
│ │
│ Using: Apache Spark, Python, TensorFlow │
│ │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Real-time Processing │
│ ────────────────────────────────────────────────────────│
│ │
│ • Update recommendations in real-time │
│ • "Because you played X" suggestions │
│ • "Made For You" personalized playlists │
│ │
│ Using: Kafka Streams, Redis │
│ │
└─────────────────────────────────────────────────────────────┘
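
One simple way to picture the real-time stage is a co-play counter that is updated on every playback event and queried for "Because you played X". The sketch below keeps its state in plain dictionaries as a stand-in for a stream processor plus a key-value store. `CoPlayIndex` and its session window are hypothetical names and parameters, not Spotify's implementation.

```python
from collections import Counter, defaultdict


class CoPlayIndex:
    """Counts how often two tracks are played close together by the same user."""

    def __init__(self, session_window: int = 5):
        # session_window must be >= 1: number of recent tracks kept per user.
        self._window = session_window
        self._recent = defaultdict(list)        # user_id -> recent track ids
        self._co_plays = defaultdict(Counter)   # track_id -> co-played track counts

    def on_track_played(self, user_id: str, track_id: str) -> None:
        """Update counts incrementally for one playback event."""
        recent = self._recent[user_id]
        for other in recent:
            if other != track_id:
                self._co_plays[track_id][other] += 1
                self._co_plays[other][track_id] += 1
        recent.append(track_id)
        del recent[:-self._window]   # keep only the most recent tracks

    def because_you_played(self, track_id: str, k: int = 3) -> list:
        """Return up to k tracks most often co-played with track_id."""
        counts = self._co_plays.get(track_id, Counter())
        return [track for track, _ in counts.most_common(k)]


index = CoPlayIndex()
for user_id, track_id in [("u1", "A"), ("u1", "B"), ("u2", "A"),
                          ("u2", "B"), ("u3", "A"), ("u3", "C")]:
    index.on_track_played(user_id, track_id)

suggestions = index.because_you_played("A", k=2)
print(suggestions)
```

Each event touches only a handful of counters, which is what makes this kind of update cheap enough to run per event rather than in a nightly batch.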
Spotify's Recommendation Algorithms
=================================
1. COLLABORATIVE FILTERING
───────────────────────
"Users with similar taste liked these tracks"
Matrix factorization on user-track interactions
2. AUDIO ANALYSIS
────────────────
"This track sounds similar to tracks you like"
• BPM, danceability, energy
• Key, tempo
• Instrumentalness
• Audio embeddings
3. NATURAL LANGUAGE PROCESSING
──────────────────────────
"Tracks described with similar words"
Scraped from music blogs, reviews
4. CONVOLUTIONAL NEURAL NETWORKS
──────────────────────────────
Direct audio analysis
Raw audio → CNN → Embeddings
─────────────────────────────────────────────────────────
Discover Weekly:
────────────────
• 30 songs updated every Monday
• Mix of:
- Songs from similar users
- Songs with similar audio
- New releases from followed artists
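
To make the collaborative filtering idea concrete, here is a toy matrix factorization trained with stochastic gradient descent in pure Python. It is a teaching sketch under stated assumptions: the interaction data is invented (1.0 for a completed listen, 0.0 for a skip), and the rank, learning rate, and regularization values are arbitrary. It is not Spotify's production model.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def factorize(interactions, num_users, num_tracks,
              rank=2, epochs=500, lr=0.05, reg=0.02, seed=0):
    """Learn user and track vectors whose dot product approximates each interaction."""
    rng = random.Random(seed)
    users = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(num_users)]
    tracks = [[rng.uniform(-0.5, 0.5) for _ in range(rank)] for _ in range(num_tracks)]
    for _ in range(epochs):
        for u, t, value in interactions:
            err = value - dot(users[u], tracks[t])
            for f in range(rank):
                uf, tf = users[u][f], tracks[t][f]
                users[u][f] += lr * (err * tf - reg * uf)
                tracks[t][f] += lr * (err * uf - reg * tf)
    return users, tracks


def mse(interactions, users, tracks):
    """Mean squared error over the observed interactions."""
    return sum((v - dot(users[u], tracks[t])) ** 2
               for u, t, v in interactions) / len(interactions)


# (user, track, value): invented data with two taste groups.
interactions = [
    (0, 0, 1.0), (0, 1, 1.0), (0, 2, 0.0),
    (1, 0, 1.0), (1, 1, 1.0), (1, 3, 0.0),
    (2, 2, 1.0), (2, 3, 1.0), (2, 0, 0.0), (2, 1, 0.0),
]

untrained = factorize(interactions, num_users=3, num_tracks=4, epochs=0)
users, tracks = factorize(interactions, num_users=3, num_tracks=4)
print("error before training:", mse(interactions, *untrained))
print("error after training: ", mse(interactions, users, tracks))
print("score for unseen pair (user 0, track 3):", dot(users[0], tracks[3]))
```

The learned track vectors double as embeddings: tracks liked by the same users end up close together, which is how "songs from similar users" can be found for pairs that were never observed.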

Spotify's Microservices
=====================
┌─────────────────────────────────────────────────────────────┐
│ ~1,000 microservices in production! │
│ │
│ Each team owns: │
│ • Own service (end-to-end) │
│ • Own data │
│ • Own deployment │
│ • On-call rotation │
└─────────────────────────────────────────────────────────────┘
Key Services:
─────────────
• metadata-service (track, artist info)
• playback-service (streaming control)
• recommendation-service
• playlist-service
• search-service
• user-service
• social-service
• billing-service
BFF Pattern at Spotify
====================
┌─────────────────────────────────────────────────────────────┐
│ Mobile App │
└────────────────────────────┬────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Mobile BFF │
│ (Dedicated for mobile) │
│ ──────────────────────────────────────────────────────────│
│ │
│ Aggregates: │
│ • User profile │
│ • Playlist data │
│ • Recommendations │
│ • Recently played │
│ │
│ Returns: Single optimized response │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ Core Microservices │
│ │
│ • user-service │
│ • playlist-service │
│ • recommendation-service │
│ • ... │
└─────────────────────────────────────────────────────────────┘
Benefits:
─────────
• Mobile-optimized responses
• Reduced round trips
• Independent scaling

Spotify Engineering Principles
============================
1. EVENT-DRIVEN
───────────────
• Kafka as the backbone for events
• Decoupled services
• Real-time + batch processing
• Complete audit trail
2. MICROSERVICES
───────────────
• ~1,000 independent services
• Autonomous teams
• Own data, own deployment
3. CHAOS ENGINEERING
────────────────────
• Inspired by Netflix
• Regular chaos experiments
• Build confidence in resilience
4. RECOMMENDATIONS FIRST
─────────────────────
• ML-driven experience
• Multiple algorithms combined
• Real-time personalization
5. DEVELOPER EXPERIENCE
────────────────────
• Internal tooling
• Self-service platforms
• Fast deploys

  1. Music streaming - Custom protocol, low latency
  2. Event-driven - Kafka for billions of events
  3. Microservices - ~1,000 services
  4. Recommendations - Multi-model ML pipeline
  5. Metadata - Cassandra for catalog
  6. CDN - Google Cloud CDN for audio delivery

You’ve completed the System Design Guide!

This guide covered:

  • Fundamentals: Scalability, load balancing, caching
  • Database Design: SQL vs NoSQL, CAP theorem, replication, sharding
  • Architecture Patterns: Monolith, microservices, event-driven, CQRS, serverless
  • API Design: REST, GraphQL, authentication, message queues
  • Reliability: Circuit breakers, rate limiting, retries, timeouts
  • Observability: Logging, monitoring, alerting, distributed tracing
  • Security: TLS, OAuth/JWT, secrets management, DDoS protection
  • Real-world Case Studies: Twitter, Netflix, Uber, Amazon, Spotify

You’re now equipped to design large-scale distributed systems!

Keep learning and building! 🚀