Chapter 6: CAP Theorem & Distributed Systems

Understanding Trade-offs in Distributed Databases

The CAP Theorem states that, in the presence of a network partition, a distributed system can provide only two of the following three guarantees simultaneously:

  • Consistency (C): Every read receives the most recent write
  • Availability (A): Every request gets a response
  • Partition Tolerance (P): System continues despite network failures
CAP Theorem Visualization
=========================

            C
           / \
          /   \
         /     \
        /       \
       A---------P

You can only pick TWO:
1. CA - Consistent + Available (assumes no partitions)
2. CP - Consistent + Partition Tolerant
3. AP - Available + Partition Tolerant
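The trade-off can be sketched with a toy two-node replicated register (hypothetical code, not any real database): in CP mode the cluster refuses operations during a partition rather than diverge; in AP mode it keeps serving requests but a replica may return a stale value.

```python
# Toy model of the CP-vs-AP choice. All class and method names here are
# illustrative inventions, not a real database API.

class Replica:
    def __init__(self):
        self.value = None

class TinyCluster:
    """Two replicas; `partitioned` severs replication between them."""
    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP"
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False

    def write(self, value):
        self.a.value = value
        if self.partitioned:
            if self.mode == "CP":
                # CP: roll back and refuse the write rather than diverge.
                self.a.value = self.b.value
                raise RuntimeError("unavailable during partition")
            # AP: accept locally; replicas now disagree until healing.
        else:
            self.b.value = value    # replicate normally

    def read_from_b(self):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable during partition")
        return self.b.value

cp = TinyCluster("CP")
cp.write(1)
cp.partitioned = True
try:
    cp.write(2)
except RuntimeError:
    pass                            # CP sacrificed availability

ap = TinyCluster("AP")
ap.write(1)
ap.partitioned = True
ap.write(2)                         # accepted on node A...
print(ap.read_from_b())             # ...but node B still returns stale 1
```

The CP cluster stays consistent by rejecting the write; the AP cluster stays available by accepting it and tolerating the stale read on the other side.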

Strong Consistency
==================

Client A        Node 1         Node 2         Node 3
    |              |              |              |
    | Write X=5    |              |              |
    |------------->|              |              |
    |              |  Replicate   |              |
    |              |------------->|              |
    |              |              |------------->|
    |              |              |              |
    | Read X=5     |              |              |
    |------------->|              |              |
    |<-------------|              |              |
    |              |              |              |
    |         (All nodes have X=5)               |

Result: Always returns the latest written value
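The diagram above can be sketched as synchronous replication (a simplified illustration, not a production design): a write returns only after every node has applied it, so a read from any node sees the most recent write.

```python
# Minimal sketch of strong consistency via synchronous replication.
# Class names are illustrative inventions for this chapter.

class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}

class SyncReplicatedStore:
    def __init__(self, nodes):
        self.nodes = nodes

    def write(self, key, value):
        # Apply to ALL replicas before returning: the cost of strong
        # consistency is that write latency is bounded by the slowest node.
        for node in self.nodes:
            node.data[key] = value

    def read(self, key, node_index=0):
        # Any replica is safe to read from, since writes are synchronous.
        return self.nodes[node_index].data[key]

store = SyncReplicatedStore([Node("n1"), Node("n2"), Node("n3")])
store.write("X", 5)
print([store.read("X", i) for i in range(3)])   # [5, 5, 5]
```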
Availability
============

Client A        Node 1         Node 2         Node 3
    |              |              |           (down)
    |  Read X?     |              |
    |------------->|              |
    |<-------------|              |
    |     X=5      |              |

Result: Returns data even if some nodes are down
Partition Tolerance
===================

Network partition occurs:

   Node 1          Node 2          Node 3
      |               |               |
      |   +-----------+-----------+   |
      |   |   Network Partition   |   |
      |   +-----------+-----------+   |
      |               |               |
      v               v               v
   Active          Active          Active

Partition tolerance means:
- The system continues operating during the partition
- The partition is detected and handled appropriately
- Replicas re-sync (heal) once the partition resolves

6.3.1 CA (Consistency + Availability)

CA Systems
==========
Not partition tolerant

Use Cases:
- Single-node databases
- Traditional RDBMS in controlled environments
- Systems where network partitions are rare

Examples:
- Traditional MySQL
- PostgreSQL (single node)
- Oracle Database

Limitation:
+----------------------------------+
| If a network partition occurs:   |
|                                  |
| - Must stop responding           |
| - Or reject requests             |
| - Cannot maintain both C and A   |
+----------------------------------+

6.3.2 CP (Consistency + Partition Tolerance)

CP Systems
==========
Sacrifices availability during partitions

Behavior:
+----------------------------------+
| Network Partition Detected       |
|               |                  |
|               v                  |
| - Stop accepting writes          |
| - Continue serving reads         |
| - Or reject all requests         |
|               |                  |
|               v                  |
| Partition Heals                  |
|               |                  |
|               v                  |
| - Re-sync data                   |
| - Resume normal operation        |
+----------------------------------+

Examples:
- MongoDB (with appropriate config)
- HBase
- ZooKeeper
- etcd
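One common way CP systems implement this behavior is majority quorums, as in consensus-based stores like ZooKeeper and etcd. The sketch below (hypothetical code, not either system's real API) shows the key property: the minority side of a partition must reject writes rather than diverge.

```python
# Hypothetical CP-style quorum write: succeed only if a majority of
# nodes is reachable; otherwise refuse the request.

def quorum_write(reachable_nodes, total_nodes, key, value, store):
    majority = total_nodes // 2 + 1
    if len(reachable_nodes) < majority:
        raise RuntimeError(
            f"need {majority}/{total_nodes} nodes, have {len(reachable_nodes)}"
        )
    for node in reachable_nodes:
        store[node][key] = value

store = {"n1": {}, "n2": {}, "n3": {}}

# Healthy cluster: 3/3 reachable, write succeeds everywhere.
quorum_write(["n1", "n2", "n3"], 3, "X", 5, store)

# Partition isolates n3: the majority side (n1, n2) still accepts writes...
quorum_write(["n1", "n2"], 3, "X", 6, store)

# ...but the minority side (n3 alone) must refuse, sacrificing availability.
try:
    quorum_write(["n3"], 3, "X", 7, store)
except RuntimeError as e:
    print("rejected:", e)
```

Because any two majorities of the same cluster overlap in at least one node, no two partition sides can both accept conflicting writes.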

6.3.3 AP (Availability + Partition Tolerance)

AP Systems
==========
Sacrifices consistency during partitions

Behavior:
+----------------------------------+
| Network Partition Detected       |
|               |                  |
|               v                  |
| - Continue accepting reads       |
| - Continue accepting writes      |
| - Serve stale data if needed     |
|               |                  |
|               v                  |
| Partition Heals                  |
|               |                  |
|               v                  |
| - Conflict resolution            |
| - Eventual consistency           |
+----------------------------------+

Examples:
- Cassandra
- DynamoDB
- CouchDB
- Riak
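When the partition heals, an AP system must reconcile divergent replicas. One simple strategy, used (among others) by Cassandra-style systems, is last-write-wins based on timestamps. The sketch below is a hypothetical illustration of that idea, not any system's real merge code.

```python
# Last-write-wins conflict resolution: each value is stored as a
# (value, timestamp) pair; on heal, the newest write for each key wins.

def heal(replica_a, replica_b):
    """Merge two divergent replicas, keeping the newest write per key."""
    merged = {}
    for key in replica_a.keys() | replica_b.keys():
        candidates = [r[key] for r in (replica_a, replica_b) if key in r]
        merged[key] = max(candidates, key=lambda v: v[1])  # compare timestamps
    return merged

# During the partition, each side accepted a different write to "X".
side_a = {"X": (5, 100), "likes": (10, 90)}
side_b = {"X": (7, 120)}                       # newer write for "X"

merged = heal(side_a, side_b)
print(merged["X"])       # (7, 120): the newer write wins
print(merged["likes"])   # (10, 90): unconflicted keys survive unchanged
```

Last-write-wins is simple but can silently discard concurrent updates; systems that cannot tolerate that use vector clocks or application-level merges instead.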

Database CAP Classification
===========================
+---------------+------+----------------------------------+
| Database      | Type | CAP Behavior                     |
+---------------+------+----------------------------------+
| MongoDB       | CP   | Consistent by default            |
| Cassandra     | AP   | Available by default             |
| DynamoDB      | AP   | Tunable consistency              |
| Redis         | CP   | Strong consistency               |
| PostgreSQL    | CA   | Single node, not partition       |
|               |      | tolerant                         |
| MySQL         | CA   | Single node                      |
| etcd          | CP   | Strong consistency               |
| CockroachDB   | CP   | Globally consistent              |
+---------------+------+----------------------------------+
Tunable Consistency in DynamoDB
===============================
Consistency Levels:
+------------------+------------------+-------------------+
| Strong           | Eventual         | Bounded           |
| Consistency      | Consistency      | Staleness         |
+------------------+------------------+-------------------+
| Read-after-      | Reads may        | Reads within K    |
| write            | return stale     | versions of       |
| guaranteed       | data             | writes            |
|                  |                  |                   |
| 1 RTT            | 0 RTT            | 0-1 RTT           |
+------------------+------------------+-------------------+

Consistency trade-off:
- Strong:   slower, but accurate
- Eventual: faster, may be stale
- Bounded:  a balance between the two
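The arithmetic behind tunable consistency in Dynamo-style systems is the quorum overlap rule: with N replicas, reads of R replicas and writes of W replicas are guaranteed to intersect, and hence behave strongly consistent, whenever R + W > N. A minimal sketch of the rule:

```python
# Quorum overlap rule for Dynamo-style tunable consistency.
# With N replicas, a read quorum of R always intersects a write
# quorum of W if and only if R + W > N.

def is_strongly_consistent(n, r, w):
    """True if every read quorum overlaps every write quorum."""
    return r + w > n

N = 3
print(is_strongly_consistent(N, r=1, w=3))  # True:  write-all, read-one
print(is_strongly_consistent(N, r=2, w=2))  # True:  balanced quorums
print(is_strongly_consistent(N, r=1, w=1))  # False: eventual consistency
```

DynamoDB surfaces this choice per request (e.g. the `ConsistentRead` flag on reads), while Cassandra exposes it as per-query consistency levels such as ONE, QUORUM, and ALL.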

An extension to CAP that also considers latency:

PACELC Model
============
If there is a Partition (P):
- Does the system choose (A)vailability or (C)onsistency?
(E)lse, during normal operation:
- Does the system choose low (L)atency or (C)onsistency?
+---------------+--------+---------------------------------+
| System        | PACELC | Description                     |
+---------------+--------+---------------------------------+
| DynamoDB      | PA/EL  | Available, low latency          |
| Cassandra     | PA/EL  | Available, low latency          |
| MongoDB       | PC/EC  | Consistent, higher latency      |
| Bigtable      | PC/EC  | Consistent                      |
| PNUTS         | PC/EL  | Consistent during partitions,   |
|               |        | low latency otherwise           |
+---------------+--------+---------------------------------+

Partition Handling Strategies
=============================
1. Detect the partition
   +------------------+
   | - Heartbeats     |  (PING/PONG between nodes)
   | - Timeouts       |
   | - Consensus      |
   +------------------+
2. Respond to the partition
   +------------------+
   | - Choose C or A  |
   | - Isolate failed |
   |   portion        |
   | - Queue writes   |
   +------------------+
3. Heal from the partition
   +------------------+
   | - Re-sync data   |
   | - Resolve        |
   |   conflicts      |
   | - Resume normal  |
   |   operation      |
   +------------------+
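Step 1, detection, typically combines heartbeats with timeouts: a node is suspected unreachable when its last heartbeat is older than some threshold. The sketch below is a deterministic toy (timestamps passed in explicitly; real systems read a clock, and the 3-second timeout is an assumed tuning value).

```python
# Toy heartbeat-based partition detection: a node is suspected
# partitioned when no heartbeat has arrived within the timeout.

HEARTBEAT_TIMEOUT = 3.0  # seconds; an assumed tuning value

def suspected_partitioned(last_heartbeat, now, timeout=HEARTBEAT_TIMEOUT):
    return (now - last_heartbeat) > timeout

last_seen = {"n1": 10.0, "n2": 9.5, "n3": 5.0}   # last PONG time per node
now = 10.5

unreachable = [n for n, t in last_seen.items() if suspected_partitioned(t, now)]
print(unreachable)   # ['n3']: silent for 5.5s, past the 3s timeout
```

Note that timeouts alone cannot distinguish a crashed node from a slow network, which is why consensus protocols layer quorum agreement on top of this kind of failure suspicion.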

Consistency Spectrum
====================
Strong > Sequential > Causal > Eventual > Weak
+------------------+----------------------------------+
| Level            | Description                      |
+------------------+----------------------------------+
| Strong           | All reads see the latest write   |
| Sequential       | All processes see writes in the  |
|                  | same order                       |
| Causal           | If A causes B, B sees A's effect |
| Eventual         | All replicas eventually converge |
| Read-your-writes | Writer sees its own writes       |
| Session          | Read-your-writes within a session|
| Weak             | May lose updates                 |
+------------------+----------------------------------+

Decision Framework
==================
Questions to ask:
+----------------------------------------------------------+
| 1. Can the system be unavailable during a partition?     |
|    - Banking: NO -> CP                                   |
|    - Social media: YES -> AP                             |
|                                                          |
| 2. How recent must data be?                              |
|    - Financial: recent -> CP                             |
|    - Likes/views: stale OK -> AP                         |
|                                                          |
| 3. What happens if data is lost?                         |
|    - Critical: CP                                        |
|    - Acceptable: AP                                      |
|                                                          |
| 4. How long can the system be down?                      |
|    - Minutes: CP                                         |
|    - Hours: AP                                           |
+----------------------------------------------------------+
Hybrid Approach
===============
Different data, different requirements:
+------------------+----------+
| Data Type        | Strategy |
+------------------+----------+
| User accounts    | CP       |
| Payment records  | CP       |
| User preferences | AP       |
| Activity logs    | AP       |
| Session data     | AP       |
+------------------+----------+

Key CAP theorem concepts:

  1. Partitions happen - Network failures are inevitable
  2. Pick your trade-off - CP or AP (CA is not realistic)
  3. Consistency vs latency - PACELC model helps
  4. Not all data needs same consistency - Hybrid approach
  5. Modern databases are tunable - Choose based on use case
  6. Eventual consistency works - In practice, converges quickly

Next: Chapter 7: SQL vs NoSQL Databases