
Load Balancing

Distributing Traffic Across Multiple Servers


Load Balancing is the process of distributing network traffic across multiple servers to ensure no single server bears too much load, improving reliability and performance.

Without Load Balancing          With Load Balancing
======================          ===================
        Client                        Client
          |                             |
          |                             |
          v                             v
      +--------+                    +-------+
      | Server |                    |  LB   |
      +--------+                    +---+---+
                                        |
                              +---------+---------+
                              |         |         |
                              v         v         v
                           +----+    +----+    +----+
                           | S1 |    | S2 |    | S3 |
                           +----+    +----+    +----+

Problems:                       Solutions:
- Overloaded server             - Even distribution
- Single point of failure       - Fault tolerance
                                - Scalability

Physical devices for load balancing:

Hardware Load Balancer
=====================
+--------------------------------------------------+
|                    F5 BIG-IP                     |
|                                                  |
|   +----------+   +----------+   +----------+     |
|   |  VIP 1   |   |  VIP 2   |   |  VIP 3   |     |
|   +----------+   +----------+   +----------+     |
|        |              |              |           |
|        +--------------+--------------+           |
|                       |                          |
+--------------------------------------------------+
| Pros: High performance, Dedicated hardware       |
| Cons: Expensive, Less flexible                   |
+--------------------------------------------------+

Software-based solutions:

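+-----------------------+----------------------+--------------------+
| Type                  | Examples             | Use Case           |
+-----------------------+----------------------+--------------------+
| Layer 4 (Transport)   | HAProxy, AWS NLB     | TCP/UDP traffic    |
| Layer 7 (Application) | NGINX, AWS ALB       | HTTP/HTTPS         |
| DNS-based             | Route 53, Cloudflare | Geographic routing |
+-----------------------+----------------------+--------------------+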

Managed services from cloud providers:

Cloud Load Balancer Types (AWS Example)
======================================
+------------------+ +------------------+
| Application | | Network |
| Load Balancer | | Load Balancer |
| (Layer 7) | | (Layer 4) |
+------------------+ +------------------+
| - Path-based | | - TCP/UDP |
| - Host-based | | - Static IP |
| - HTTP headers | | - High through- |
+------------------+ | put |
+------------------+
+------------------+ +------------------+
| Gateway | | Classic |
| Load Balancer | | Load Balancer |
+------------------+ +------------------+
| - Third-party | | - Legacy |
| firewalls | | - Simple cases |
+------------------+ +------------------+

Round Robin
==========
Requests:  R1  ->  R2  ->  R3  ->  R4  ->  R5  ->  R6
           |       |       |       |       |       |
           v       v       v       v       v       v
         +----+  +----+  +----+  +----+  +----+  +----+
         | S1 |  | S2 |  | S3 |  | S1 |  | S2 |  | S3 |
         +----+  +----+  +----+  +----+  +----+  +----+
Code:
```python
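import itertools

request_counter = itertools.count()  # running count of requests handled so far

def round_robin(servers, request):
    index = next(request_counter) % len(servers)  # wrap around the pool
    return servers[index]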
```
Best for: Servers with equal capacity
Least Connections
=================
Current State:
+----+ connections=5 +----+ connections=2 +----+ connections=8
| S1 | | S2 | | S3 |
+----+ +----+ +----+
Next Request -> Goes to S2 (least connections)
Code:
```python
def least_connections(servers, request):
    # Pick the server currently handling the fewest active connections.
    return min(servers, key=lambda s: s.active_connections)
```
Best for: Long-lived connections (WebSocket, databases)
IP Hash
=======
Client IP Hash -> Server Assignment
(Hash values are illustrative; remainder 0 -> Server 1, 1 -> Server 2, 2 -> Server 3)
Client A (IP: 192.168.1.10) -> Hash(192.168.1.10) % 3 = 1 -> Server 2
Client B (IP: 192.168.1.11) -> Hash(192.168.1.11) % 3 = 2 -> Server 3
Client C (IP: 192.168.1.12) -> Hash(192.168.1.12) % 3 = 0 -> Server 1
Best for: Session affinity without sticky sessions
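A minimal sketch of IP hashing, assuming a stable hash from `hashlib` (Python's built-in `hash()` is randomized per process for strings, so it would not give repeatable assignments). The function name `ip_hash` and the server labels are hypothetical.

```python
import hashlib

def ip_hash(servers, client_ip):
    # Hash the client IP with a stable digest so the same IP always
    # produces the same number, across restarts and processes.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Reduce the first 8 bytes of the digest to an index into the pool.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

servers = ["S1", "S2", "S3"]
first = ip_hash(servers, "192.168.1.10")
second = ip_hash(servers, "192.168.1.10")  # same client, same server
```

Note that with plain modulo hashing, adding or removing a server changes `len(servers)` and remaps most clients; consistent hashing is the usual way to limit that.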
Weighted Load Balancing
========================
Server Configuration:
+----+-------+
| S1 | weight=5 | (50% of traffic)
+----+-------+
| S2 | weight=3 | (30% of traffic)
+----+-------+
| S3 | weight=2 | (20% of traffic)
+----+-------+
Traffic Distribution (10 requests):
+----+----+----+----+----+----+----+----+----+----+
| R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R9 | R10|
+----+----+----+----+----+----+----+----+----+----+
| S1 | S1 | S1 | S1 | S1 | S2 | S2 | S2 | S3 | S3 |
+----+----+----+----+----+----+----+----+----+----+
Best for: Servers with different capacities
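The distribution above can be sketched by repeating each server in a schedule as many times as its weight and cycling through it. This is a simple illustration, not how any particular balancer implements it; `weighted_round_robin` is a hypothetical name.

```python
import itertools

def weighted_round_robin(weights):
    # Build one cycle of the schedule: each server appears `weight` times.
    schedule = [name for name, weight in weights.items() for _ in range(weight)]
    # Repeat that cycle forever.
    return itertools.cycle(schedule)

picker = weighted_round_robin({"S1": 5, "S2": 3, "S3": 2})
first_ten = [next(picker) for _ in range(10)]
# first_ten holds S1 five times, then S2 three times, then S3 twice
```

This sends requests to each server in bursts, as in the diagram. Production balancers often interleave the picks so that a heavily weighted server is not hit many times in a row.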
Least Response Time
===================
Server Health:
+----+-------------+-----------------+
| S1 | Avg: 50ms | Active: 10 |
+----+-------------+-----------------+
| S2 | Avg: 100ms | Active: 5 |
+----+-------------+-----------------+
| S3 | Avg: 30ms | Active: 8 |
+----+-------------+-----------------+
Next Request -> Goes to S3 (fastest response)
Formula: Score = Active Connections x Avg Response Time (lowest score wins)
Scores: S1 = 10 x 50 = 500, S2 = 5 x 100 = 500, S3 = 8 x 30 = 240
Best for: Real-time applications
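A minimal sketch of the selection step, assuming the score is active connections multiplied by average response time and the lowest score wins. That scoring rule is one common choice, and real balancers may weight the two inputs differently. `ServerStats` and `least_response_time` are hypothetical names; the numbers come from the Server Health table.

```python
from dataclasses import dataclass

@dataclass
class ServerStats:
    name: str
    avg_response_ms: float
    active_connections: int

def least_response_time(servers):
    # Lower score means the server is both fast and lightly loaded.
    return min(servers, key=lambda s: s.active_connections * s.avg_response_ms)

pool = [
    ServerStats("S1", avg_response_ms=50, active_connections=10),   # score 500
    ServerStats("S2", avg_response_ms=100, active_connections=5),   # score 500
    ServerStats("S3", avg_response_ms=30, active_connections=8),    # score 240
]
chosen = least_response_time(pool)  # S3 has the lowest score
```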

4.4.1 Layer 4 (Transport Layer) Load Balancing

Layer 4 Load Balancing
======================
+----------+ +---------------------------+ +----------+
| Client | --> | Load Balancer | --> | Server |
+----------+ | (TCP/UDP level) | +----------+
+---------------------------+
How it works:
- Routes based on IP address and port
- Does NOT inspect packet content
- Fast, low latency
- No session persistence at application level
Example:
Client:192.168.1.10:54321 -> LB -> Server:10.0.0.1:80

4.4.2 Layer 7 (Application Layer) Load Balancing

Layer 7 Load Balancing
======================
+----------+ +---------------------------+ +----------+
| Client | --> | Load Balancer | --> | Server |
+----------+ | (HTTP/HTTPS level) | +----------+
+---------------------------+
How it works:
- Inspects HTTP headers, URL, cookies
- Can make routing decisions based on content
- Can terminate SSL/TLS
- More intelligent routing
Example routing:
/api/users/* -> User Service
/api/orders/* -> Order Service
/api/payments/* -> Payment Service
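The example routing can be sketched as a prefix lookup on the request path. The backend names and the `default` fallback are assumptions for illustration; a real Layer 7 balancer would also look at the host, headers, and cookies.

```python
# (path prefix, backend) pairs, checked in order
ROUTES = [
    ("/api/users/", "user-service"),
    ("/api/orders/", "order-service"),
    ("/api/payments/", "payment-service"),
]

def route(path, default="default-service"):
    # Return the backend for the first prefix that matches the path.
    for prefix, backend in ROUTES:
        if path.startswith(prefix):
            return backend
    # No rule matched: fall back to a catch-all backend.
    return default

users_backend = route("/api/users/42")
fallback_backend = route("/index.html")
```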

Health Check Mechanism
=====================
+--------+ +------------------------+
| | Health | |
| Load |-------->| Server Pool |
| Balancer| +------------------------+
| |<--------| /health endpoint |
+--------+ Check +------------------------+
| | |
v v v
+----+ +----+ +----+
| OK | | OK | |FAIL|
+----+ +----+ +----+
Health Check Types:
+------------------+------------------------+
| Type | Description |
+------------------+------------------------+
| TCP | Port open/closed |
| HTTP/HTTPS | Specific endpoint |
| HTTPS | Certificate valid |
| Custom | Script-based |
+------------------+------------------------+
Configuration Parameters
=======================
+-------------------------+--------------------------+
| Parameter | Typical Value |
+-------------------------+--------------------------+
| Interval | 10-30 seconds |
| Timeout | 5-10 seconds |
| Unhealthy threshold | 2-3 failures |
| Healthy threshold | 2-3 successes |
+-------------------------+--------------------------+
Timing Diagram:
==============
Time: 0 10s 20s 30s 40s 50s 60s
|-----|-----|-----|-----|-----|-----|
Health OK OK OK OK OK OK OK
Check: | | | | | | |
When server fails:
Time: 0 10s 20s 30s 40s
|-----|-----|-----|-----|
Health OK OK FAIL FAIL (Remove from pool)
Check: | | | |
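The threshold behavior in the timing diagram can be sketched as a small state tracker: a server changes state only after enough consecutive checks contradict its current state. `HealthTracker` is a hypothetical name, and the default thresholds of 2 are taken from the low end of the table above.

```python
class HealthTracker:
    def __init__(self, unhealthy_threshold=2, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed):
        """Record one health check result and return the current state."""
        if check_passed == self.healthy:
            # Result agrees with the current state: reset the streak.
            self._streak = 0
            return self.healthy
        self._streak += 1
        # Failures count toward removal, successes toward re-admission.
        threshold = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self._streak >= threshold:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy

tracker = HealthTracker()
states = [tracker.record(result) for result in [True, True, False, False]]
# The server stays in the pool after the first failure and is removed after the second.
```

Requiring several consecutive failures avoids removing a server because of a single slow or dropped check.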

SSL Termination
==============
Client Load Balancer Backend Servers
+-----+ +-----------+ +----------------+
|HTTPS| --------> | Decrypt | HTTP/TCP | |
| req | | SSL/TLS | ---------> | Application |
+-----+ +-----------+ +----------------+
Benefits:
- Less CPU load on backend servers
- Centralized certificate management
- Easier to implement security policies
Sticky Sessions
==============
Client A          Load Balancer          Servers
    |                   |                 +----+
 GET /                  |       ->        | S1 |
    |                   |                 +----+
    |               Remember:               |
POST /login         Cookie: S1            +----+
(login to S1)       Session=A             | S2 |
    |                   |                 +----+
    |                   |                   |
    v                   v                   v
GET /dashboard    --------->    Routed back to S1
Use cases:
- Shopping carts
- Game servers
- Stateful applications
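A minimal sketch of cookie-based stickiness, assuming the balancer stores the chosen server's name in a cookie. The cookie name `lb_server` and the class `StickyBalancer` are hypothetical; new clients are assigned round robin.

```python
import itertools

class StickyBalancer:
    COOKIE = "lb_server"  # hypothetical cookie name

    def __init__(self, servers):
        self.servers = list(servers)
        self._rotation = itertools.cycle(self.servers)

    def route(self, cookies):
        """Return (server, cookies_to_set) for a request's cookie dict."""
        pinned = cookies.get(self.COOKIE)
        if pinned in self.servers:
            # Returning client: keep it on the server it was pinned to.
            return pinned, {}
        # New client (or its server is gone): pick one and pin it.
        server = next(self._rotation)
        return server, {self.COOKIE: server}

lb = StickyBalancer(["S1", "S2"])
server, set_cookies = lb.route({})          # first request, no cookie yet
repeat_server, _ = lb.route(set_cookies)    # later request carries the cookie
```

If the pinned server is removed from the pool, the client is reassigned and any session state held only on the old server is lost, which is the main cost of this approach.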

Active-Passive HA
=================
+----------------+
| Active LB |
+----------------+
|
+-----v-----+
| Heartbeat | (Monitor active)
+-----v-----+
|
+----------------+
| Passive LB |
+----------------+
When Active fails:
1. Heartbeat detects failure
2. Virtual IP moves to Passive
3. Passive becomes Active
4. Failover completes quickly (sub-second to a few seconds, depending on the heartbeat interval and timeout)
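The failover steps can be sketched as a heartbeat timeout check on the passive node. This is an illustration only: `FailoverMonitor`, the 3-second timeout, and the `vip_owner` field are assumptions, and timestamps are passed in explicitly so the logic is easy to follow.

```python
class FailoverMonitor:
    def __init__(self, timeout_s=3.0):
        self.timeout_s = timeout_s   # assumed heartbeat timeout
        self.last_heartbeat = None   # time of the last heartbeat from the active LB
        self.vip_owner = "active"    # which LB currently holds the virtual IP

    def heartbeat(self, now):
        # Called whenever a heartbeat from the active LB arrives.
        self.last_heartbeat = now

    def tick(self, now):
        # Called periodically on the passive LB; takes over the virtual IP
        # if the active LB has been silent for longer than the timeout.
        silent_too_long = (
            self.last_heartbeat is not None
            and now - self.last_heartbeat > self.timeout_s
        )
        if self.vip_owner == "active" and silent_too_long:
            self.vip_owner = "passive"
        return self.vip_owner

monitor = FailoverMonitor(timeout_s=3.0)
monitor.heartbeat(now=0.0)
before = monitor.tick(now=1.0)   # heartbeat is recent, nothing changes
after = monitor.tick(now=5.0)    # 5s of silence exceeds the 3s timeout
```

The timeout bounds how fast failover can be: a shorter timeout fails over sooner but is more likely to trigger on a brief network hiccup.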
Active-Active HA
================
+----------------+ +----------------+
| Active LB 1 |<--->| Active LB 2 |
+----------------+ +----------------+
| |
v v
+-----------+ +-----------+
| Server | | Server |
| Pool A | | Pool B |
+-----------+ +-----------+
Benefits:
- Both LBs handle traffic
- No wasted resources
- Better utilization

Global Load Balancing
=====================
 User              DNS                     Servers
+------+     +----------------+     +------------------+
|  US  | --> |    Route 53    | --> | US-East Server   |
+------+     |                |     +------------------+
+------+     |   (Latency     |     +------------------+
|  EU  | --> |    routing)    | --> | EU-West Server   |
+------+     |                |     +------------------+
+------+     |  (Geo routing) |     +------------------+
| Asia | --> |                | --> | Asia Server      |
+------+     +----------------+     +------------------+
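Latency-based routing can be sketched as choosing the healthy region with the lowest measured latency for a given user. The region names, latency figures, and the function `pick_region` are made up for illustration; a DNS service would return the chosen region's address in its response.

```python
def pick_region(latencies_ms, healthy_regions):
    # Only consider regions that are currently passing health checks.
    candidates = {
        region: ms for region, ms in latencies_ms.items() if region in healthy_regions
    }
    if not candidates:
        raise RuntimeError("no healthy region available")
    # Choose the region with the lowest latency for this user.
    return min(candidates, key=candidates.get)

# Hypothetical latencies measured from a user in the US
us_user_latencies = {"us-east": 20, "eu-west": 95, "asia": 180}

normal = pick_region(us_user_latencies, {"us-east", "eu-west", "asia"})
during_outage = pick_region(us_user_latencies, {"eu-west", "asia"})
```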

Kubernetes Load Balancing
=========================
+--------------------------------------------------+
| Kubernetes Cluster |
+--------------------------------------------------+
| |
| +----------------+ |
| | Ingress | (Layer 7 Load Balancer) |
| +----------------+ |
| | |
| +----+---+---+----+ |
| | | | | | |
| v v v v v |
| +--+ +--+ +--+ +--+ |
| |P1| |P2| |P3| |P4| (Pods) |
| +--+ +--+ +--+ +--+ |
| |
+--------------------------------------------------+
Ingress Controller:
- NGINX Ingress Controller
- AWS ALB Ingress
- Traefik

Key load balancing concepts:

  1. Choose the right algorithm - Round robin, least connections, weighted, IP hash
  2. Layer matters - L4 for speed, L7 for intelligent routing
  3. Health checks are critical - Detect and remove failed servers
  4. Plan for HA - Active-passive or active-active
  5. Consider SSL termination - Offload SSL from backend
  6. Monitor performance - Track metrics and adjust

Next: Chapter 5: Caching Strategies