Chapter 4: Load Balancing

Distributing Traffic Across Multiple Servers

4.1 What is Load Balancing?

Load balancing is the process of distributing network traffic across multiple servers so that no single server is overloaded, which improves reliability and performance.

    Without Load Balancing     With Load Balancing
    ======================     ===================

          Client                       Client
            |                            |
            v                            v
        +--------+                   +--------+
        | Server |                   |   LB   |
        +--------+                   +---+----+
                                         |
                               +---------+---------+
                               |         |         |
                               v         v         v
                             +----+    +----+    +----+
                             | S1 |    | S2 |    | S3 |
                             +----+    +----+    +----+

    Problems:                  Solutions:
    - Overloaded server        - Even distribution
    - Single point of          - Fault tolerance
      failure                  - Scalability

4.2 Types of Load Balancers

4.2.1 Hardware Load Balancers

Physical devices for load balancing:

    Hardware Load Balancer
    ======================

    +--------------------------------------------------+
    |                    F5 BIG-IP                     |
    |                                                  |
    |   +----------+   +----------+   +----------+     |
    |   | VIP 1    |   | VIP 2    |   | VIP 3    |     |
    |   +----------+   +----------+   +----------+     |
    |        |              |              |           |
    |        +--------------+--------------+           |
    |                       |                          |
    +--------------------------------------------------+
    |  Pros: High performance, dedicated hardware      |
    |  Cons: Expensive, less flexible                  |
    +--------------------------------------------------+

4.2.2 Software Load Balancers

Software-based solutions:

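    +-----------------------+----------------------+--------------------+
    | Type                  | Examples             | Use Case           |
    +-----------------------+----------------------+--------------------+
    | Layer 4 (Transport)   | HAProxy, AWS NLB     | TCP/UDP traffic    |
    | Layer 7 (Application) | NGINX, AWS ALB       | HTTP/HTTPS         |
    | DNS-based             | Route 53, Cloudflare | Geographic routing |
    +-----------------------+----------------------+--------------------+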

4.2.3 Cloud Load Balancers

Managed services from cloud providers:

    Cloud Load Balancer Types (AWS Example)
    =======================================

    +------------------+     +------------------+
    | Application      |     | Network          |
    | Load Balancer    |     | Load Balancer    |
    | (Layer 7)        |     | (Layer 4)        |
    +------------------+     +------------------+
    | - Path-based     |     | - TCP/UDP        |
    | - Host-based     |     | - Static IP      |
    | - HTTP headers   |     | - High           |
    |                  |     |   throughput     |
    +------------------+     +------------------+

    +------------------+     +------------------+
    | Gateway          |     | Classic          |
    | Load Balancer    |     | Load Balancer    |
    +------------------+     +------------------+
    | - Third-party    |     | - Legacy         |
    |   firewalls      |     | - Simple cases   |
    +------------------+     +------------------+

4.3 Load Balancing Algorithms

4.3.1 Round Robin

    Round Robin
    ===========

    Requests: R1     R2     R3     R4     R5     R6
               |      |      |      |      |      |
               v      v      v      v      v      v
            +----+ +----+ +----+ +----+ +----+ +----+
            | S1 | | S2 | | S3 | | S1 | | S2 | | S3 |
            +----+ +----+ +----+ +----+ +----+ +----+

    Code:
    ```python
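    from itertools import count
    request_count = count()  # requests dispatched so far

    def round_robin(servers, request):
        index = next(request_count) % len(servers)  # wrap around the list
        return servers[index]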
    ```

    Best for: Servers with equal capacity

4.3.2 Least Connections

    Least Connections
    =================

    Current State:
    +----+ connections=5  +----+ connections=2  +----+ connections=8
    | S1 |                | S2 |                | S3 |
    +----+                +----+                +----+

    Next Request -> Goes to S2 (least connections)

    Code:
    ```python
    def least_connections(servers, request):
        # Choose the server currently handling the fewest active connections.
        return min(servers, key=lambda s: s.active_connections)
    ```

    Best for: Long-lived connections (WebSocket, databases)

4.3.3 IP Hash

    IP Hash
    =======

    Client IP Hash -> Server Assignment

    Client A (IP: 192.168.1.10) -> Hash(192.168.1.10) % 3 = 1 -> Server 2
    Client B (IP: 192.168.1.11) -> Hash(192.168.1.11) % 3 = 2 -> Server 3
    Client C (IP: 192.168.1.12) -> Hash(192.168.1.12) % 3 = 0 -> Server 1

    (The remainder is a zero-based index into the pool: 0 = Server 1, 1 = Server 2, 2 = Server 3.)

    Best for: Session affinity without cookie-based sticky sessions
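
The mapping above can be sketched in a few lines of Python. This is a minimal illustration, assuming the pool is a plain list and that SHA-256 is an acceptable hash. Both are choices made for this example, and the function name `ip_hash` is hypothetical.

```python
import hashlib

def ip_hash(servers, client_ip):
    """Map a client IP to a server using a stable hash of the address."""
    # hashlib returns the same digest in every process. Python's built-in
    # hash() is randomized per process for strings, so it is avoided here.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

servers = ["S1", "S2", "S3"]
for ip in ("192.168.1.10", "192.168.1.11", "192.168.1.12"):
    print(ip, "->", ip_hash(servers, ip))
```

The same IP always lands on the same server while the pool is unchanged. With plain modulo hashing, adding or removing a server changes `len(servers)` and remaps most clients; consistent hashing is the usual way to limit that.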

4.3.4 Weighted Algorithms

    Weighted Load Balancing
    =======================

    Server Configuration:
    +----+----------+
    | S1 | weight=5 |  (50% of traffic)
    +----+----------+
    | S2 | weight=3 |  (30% of traffic)
    +----+----------+
    | S3 | weight=2 |  (20% of traffic)
    +----+----------+

    Traffic Distribution (10 requests):
    +----+----+----+----+----+----+----+----+----+----+
    | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R9 | R10|
    +----+----+----+----+----+----+----+----+----+----+
    | S1 | S1 | S1 | S1 | S1 | S2 | S2 | S2 | S3 | S3 |
    +----+----+----+----+----+----+----+----+----+----+

    Best for: Servers with different capacities
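
One simple way to get the distribution shown above is to repeat each server in a schedule according to its weight and cycle through it. The sketch below assumes integer weights held in a dict; `weighted_round_robin` is a hypothetical name used only for this example.

```python
from itertools import cycle

def weighted_round_robin(weights):
    """Return an endless iterator that yields each server `weight` times per cycle."""
    # Expand {"S1": 5, "S2": 3, "S3": 2} into S1 x5, S2 x3, S3 x2.
    schedule = [server for server, weight in weights.items() for _ in range(weight)]
    return cycle(schedule)

picker = weighted_round_robin({"S1": 5, "S2": 3, "S3": 2})
first_ten = [next(picker) for _ in range(10)]
print(first_ten)
# ['S1', 'S1', 'S1', 'S1', 'S1', 'S2', 'S2', 'S2', 'S3', 'S3']
```

This naive schedule sends bursts of consecutive requests to the same server. Some balancers interleave the picks to smooth that out while keeping the same overall ratio.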

4.3.5 Least Response Time

    Least Response Time
    ===================

    Server Health:
    +----+-------------+----------------+
    | S1 | Avg: 50ms   | Active: 10     |
    +----+-------------+----------------+
    | S2 | Avg: 100ms  | Active: 5      |
    +----+-------------+----------------+
    | S3 | Avg: 30ms   | Active: 8      |
    +----+-------------+----------------+

    Next Request -> Goes to S3 (fastest response, lowest score)

    Formula: Score = Active Connections x Avg Response Time (lowest score wins)
    Scores:  S1 = 10 x 50 = 500, S2 = 5 x 100 = 500, S3 = 8 x 30 = 240

    Best for: Real-time applications
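
A small Python sketch of this selection, assuming the score is active connections multiplied by average response time and that the lowest score wins. That scoring rule is one common choice and is an assumption of this example; `ServerStats` and `least_response_time` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class ServerStats:
    name: str
    avg_response_ms: float
    active_connections: int

def least_response_time(servers):
    """Pick the server with the lowest (active connections x avg response time) score."""
    return min(servers, key=lambda s: s.active_connections * s.avg_response_ms)

pool = [
    ServerStats("S1", 50, 10),   # score 500
    ServerStats("S2", 100, 5),   # score 500
    ServerStats("S3", 30, 8),    # score 240
]
print(least_response_time(pool).name)  # S3
```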

4.4 Load Balancing at Different Layers

4.4.1 Layer 4 (Transport Layer) Load Balancing

    Layer 4 Load Balancing
    ======================

    +----------+     +--------------------------+     +----------+
    |  Client  | --> |      Load Balancer       | --> |  Server  |
    +----------+     |     (TCP/UDP level)      |     +----------+
                     +--------------------------+

    How it works:
    - Routes based on IP address and port
    - Does NOT inspect the application payload (for example, HTTP headers or URLs)
    - Fast, low latency
    - Cannot offer application-level session persistence such as cookies

    Example:
    Client:192.168.1.10:54321 -> LB -> Server:10.0.0.1:80

4.4.2 Layer 7 (Application Layer) Load Balancing

    Layer 7 Load Balancing
    ======================

    +----------+     +--------------------------+     +----------+
    |  Client  | --> |      Load Balancer       | --> |  Server  |
    +----------+     |    (HTTP/HTTPS level)    |     +----------+
                     +--------------------------+

    How it works:
    - Inspects HTTP headers, URL, cookies
    - Can make routing decisions based on content
    - Can terminate SSL/TLS
    - More intelligent routing

    Example routing:
    /api/users/*    -> User Service
    /api/orders/*   -> Order Service
    /api/payments/* -> Payment Service
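
The routing table above amounts to a prefix match on the request path. A minimal sketch, assuming first-match-wins semantics; the backend names and the `default-backend` fallback are hypothetical.

```python
# (path prefix, backend name) pairs, checked in order
ROUTES = [
    ("/api/users/", "user-service"),
    ("/api/orders/", "order-service"),
    ("/api/payments/", "payment-service"),
]

def route(path, default="default-backend"):
    """Return the backend whose prefix matches the request path."""
    for prefix, backend in ROUTES:
        if path.startswith(prefix):
            return backend
    return default  # nothing matched

print(route("/api/orders/42"))  # order-service
print(route("/healthz"))        # default-backend
```

A Layer 4 balancer cannot make this decision because it never sees the path.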

4.5 Health Checks

How Health Checks Work

    Health Check Mechanism
    ======================

    +----------+  health check   +------------------------+
    |   Load   | --------------> |      Server Pool       |
    | Balancer | <-------------- |    /health endpoint    |
    +----------+     result      +------------------------+
                                     |        |        |
                                     v        v        v
                                  +----+   +----+   +----+
                                  | OK |   | OK |   |FAIL|
                                  +----+   +----+   +----+

    Health Check Types:
    +------------------+------------------------+
    | Type             | Description            |
    +------------------+------------------------+
    | TCP              | Port open/closed       |
    | HTTP/HTTPS       | Specific endpoint      |
    | HTTPS            | Certificate valid      |
    | Custom           | Script-based           |
    +------------------+------------------------+
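
As an illustration of the simplest type in the table, a TCP check only asks whether the port accepts a connection. This is a sketch using Python's standard `socket` module; the function name `tcp_health_check` and the default timeout are assumptions made for this example.

```python
import socket

def tcp_health_check(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts and refusals) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A call such as `tcp_health_check("10.0.0.1", 80)` says nothing about whether the application behind the port is working, which is why HTTP checks against a dedicated endpoint are often preferred.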

Health Check Configuration

    Configuration Parameters
    ========================

    +-------------------------+--------------------------+
    | Parameter               | Typical Value            |
    +-------------------------+--------------------------+
    | Interval                | 10-30 seconds            |
    | Timeout                 | 5-10 seconds             |
    | Unhealthy threshold     | 2-3 failures             |
    | Healthy threshold       | 2-3 successes            |
    +-------------------------+--------------------------+

    Timing Diagram:
    ===============

    Time:   0    10s   20s   30s   40s   50s   60s
            |-----|-----|-----|-----|-----|-----|
    Check:  OK    OK    OK    OK    OK    OK    OK

    When server fails (unhealthy threshold = 2):
    Time:   0    10s   20s   30s   40s
            |-----|-----|-----|-----|
    Check:  OK    OK   FAIL  FAIL  (removed from pool)
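
The threshold logic in the timing diagram can be sketched as a small state tracker. This is an illustrative model, assuming thresholds count consecutive results and that a server starts out healthy; `HealthTracker` is a hypothetical name.

```python
class HealthTracker:
    """Track consecutive check results for one server and flip its state at the thresholds."""

    def __init__(self, unhealthy_threshold=2, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True  # assumption: servers start in the pool
        self.streak = 0      # consecutive results that disagree with the current state

    def record(self, check_passed):
        """Record one check result and return whether the server is in the pool."""
        if check_passed == self.healthy:
            self.streak = 0  # result agrees with the current state
            return self.healthy
        self.streak += 1
        threshold = self.healthy_threshold if check_passed else self.unhealthy_threshold
        if self.streak >= threshold:
            self.healthy = check_passed  # enough consecutive results: flip state
            self.streak = 0
        return self.healthy

tracker = HealthTracker()
results = [True, True, False, False, True, True]
print([tracker.record(r) for r in results])
# [True, True, True, False, False, True]
```

Requiring several consecutive results in each direction keeps a single slow response from removing a server, and keeps a flapping server from rejoining the pool too early.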

4.6 Load Balancer Features

SSL/TLS Termination

    SSL Termination
    ===============

    Client               Load Balancer              Backend Servers
    +-------+            +-------------+            +-------------+
    | HTTPS | ---------> | Decrypt     | ---------> | Application |
    | req   |            | SSL/TLS     |  HTTP/TCP  |             |
    +-------+            +-------------+            +-------------+

    Benefits:
    - Less CPU load on backend servers
    - Centralized certificate management
    - Easier to implement security policies

Session Persistence (Sticky Sessions)

    Sticky Sessions
    ===============

    Client A            Load Balancer                 Servers
       |                      |
    GET /  ------------------>|  ------------------>  +----+
       |                      |                       | S1 |
       |                      |  Remember:            +----+
    POST /login  ------------>|  Cookie: S1
    (login to S1)             |  Session=A            +----+
       |                      |                       | S2 |
       |                      |                       +----+
    GET /dashboard  --------->|  Route to S1 (same server)

    Use cases:
    - Shopping carts
    - Game servers
    - Stateful applications
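
A minimal sketch of cookie-based stickiness, assuming the balancer round-robins new clients and pins returning ones through a cookie. The class name `StickyBalancer` and the cookie name `lb_server` are hypothetical.

```python
import itertools

class StickyBalancer:
    """Round-robin for new clients; cookie-pinned routing for returning ones."""

    COOKIE = "lb_server"  # hypothetical cookie name

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = itertools.cycle(self.servers)

    def route(self, cookies):
        """Return (server, cookies_to_set) for a request carrying `cookies`."""
        pinned = cookies.get(self.COOKIE)
        if pinned in self.servers:
            return pinned, {}  # returning client: keep it on the same server
        server = next(self._next)  # new client, or the pinned server is gone
        return server, {self.COOKIE: server}

lb = StickyBalancer(["S1", "S2"])
server, set_cookies = lb.route({})   # first request, no cookie yet
again, _ = lb.route(set_cookies)     # follow-up request presents the cookie
print(server, again)                 # S1 S1
```

If the pinned server leaves the pool, the client is reassigned and any session state held only on that server is lost, which is the main drawback of sticky sessions.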

4.7 High Availability Setup

Active-Passive HA Setup

    Active-Passive HA
    =================

    +----------------+
    |   Active LB    |
    +----------------+
          |
    +-----v-----+
    | Heartbeat |  (Monitor active)
    +-----v-----+
          |
    +----------------+
    |   Passive LB   |
    +----------------+

    When Active fails:
    1. Heartbeat detects failure
    2. Virtual IP moves to Passive
    3. Passive becomes Active
    4. Traffic resumes; failover can complete in under a second when the heartbeat interval is short
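
The failover steps above can be modelled from the passive node's point of view. This is a simplified sketch with explicit timestamps instead of real timers; `PassiveNode` and `heartbeat_timeout` are hypothetical names, and a real deployment would typically rely on an existing protocol such as VRRP rather than hand-written logic.

```python
class PassiveNode:
    """Passive load balancer that takes over when heartbeats from the active node stop."""

    def __init__(self, heartbeat_timeout=1.0):
        self.heartbeat_timeout = heartbeat_timeout  # seconds of silence tolerated
        self.last_heartbeat = 0.0
        self.active = False

    def on_heartbeat(self, now):
        """Record a heartbeat received from the active node at time `now`."""
        self.last_heartbeat = now

    def tick(self, now):
        """Promote this node if the active node has been silent for too long."""
        if not self.active and now - self.last_heartbeat > self.heartbeat_timeout:
            self.active = True  # a real system would claim the virtual IP here
        return self.active

node = PassiveNode(heartbeat_timeout=1.0)
node.on_heartbeat(now=10.0)
print(node.tick(now=10.5))  # False: last heartbeat was 0.5 s ago
print(node.tick(now=11.5))  # True: silent for 1.5 s, so the node takes over
```

The timeout sets the trade-off: a shorter value gives faster failover but makes a false takeover more likely when heartbeats are merely delayed.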

Active-Active HA Setup

    Active-Active HA
    ================

    +----------------+     +----------------+
    |  Active LB 1   |<--->|  Active LB 2   |
    +----------------+     +----------------+
            |                      |
            v                      v
      +-----------+          +-----------+
      |  Server   |          |  Server   |
      |  Pool A   |          |  Pool B   |
      +-----------+          +-----------+

    Benefits:
    - Both LBs handle traffic
    - No wasted resources
    - Better utilization

4.8 Global Load Balancing

DNS-Based Global Load Balancing

    Global Load Balancing
    =====================

    Users                DNS                    Servers
    +------+      +----------------+      +------------------+
    | US   | ---> | Route 53       | ---> | US-East Server   |
    +------+      |                |      +------------------+
    | EU   | ---> | - Latency      | ---> | EU-West Server   |
    +------+      |   routing      |      +------------------+
    | Asia | ---> | - Geo routing  | ---> | Asia Server      |
    +------+      +----------------+      +------------------+
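
Latency-based routing boils down to answering each DNS query with the closest healthy region. The sketch below is a toy model of that decision and not how any particular DNS service implements it; the region names, latency figures, and `pick_region` function are made up for illustration.

```python
def pick_region(latencies_ms, healthy_regions):
    """Return the healthy region with the lowest measured latency for a user."""
    candidates = {r: ms for r, ms in latencies_ms.items() if r in healthy_regions}
    if not candidates:
        raise ValueError("no healthy region available")
    return min(candidates, key=candidates.get)

# Made-up latency measurements for a user in Europe
eu_user = {"us-east": 95.0, "eu-west": 18.0, "asia": 210.0}

print(pick_region(eu_user, {"us-east", "eu-west", "asia"}))  # eu-west
print(pick_region(eu_user, {"us-east", "asia"}))             # us-east (eu-west is down)
```

Because DNS answers are cached, clients keep using a region until the record's TTL expires, so failover at this layer is slower than at a regional load balancer.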

4.9 Load Balancing in Kubernetes

    Kubernetes Load Balancing
    =========================

    +--------------------------------------------------+
    |                Kubernetes Cluster                |
    +--------------------------------------------------+
    |                                                  |
    |  +----------------+                              |
    |  |    Ingress     |  (Layer 7 Load Balancer)     |
    |  +--------+-------+                              |
    |           |                                      |
    |   +----+--+--+----+                              |
    |   |    |     |    |                              |
    |   v    v     v    v                              |
    |  +--+ +--+  +--+ +--+                            |
    |  |P1| |P2|  |P3| |P4|  (Pods)                    |
    |  +--+ +--+  +--+ +--+                            |
    |                                                  |
    +--------------------------------------------------+

    Ingress Controllers:
    - NGINX Ingress Controller
    - AWS Load Balancer Controller (formerly AWS ALB Ingress Controller)
    - Traefik

Summary

Key load balancing concepts:

1. Choose the right algorithm: round robin, least connections, weighted, or IP hash
2. Layer matters: L4 for speed, L7 for intelligent routing
3. Health checks are critical: detect and remove failed servers
4. Plan for HA: active-passive or active-active
5. Consider SSL termination: offload SSL/TLS from the backend servers
6. Monitor performance: track metrics and adjust

Next: Chapter 5: Caching Strategies