Chapter 23: Retry Patterns & Dead Letter Queues
Handling Transient Failures

23.1 Why Retry?
Network and service failures are often transient: they occur once, and the same request succeeds when it is sent again.

    Transient Failure Example
    =========================

    Request 1: Timeout  (network hiccup)
    Request 2: Success! (network recovered)

    Without retry: User sees error
    With retry:    User gets success

23.2 Retry Strategies
Simple Retry

    Simple Retry
    ============

    Try once, fail -> Try again immediately

    Problem: Doesn't help if the service is temporarily down

Retry with Backoff

    Exponential Backoff
    ===================

    Attempt 1: Immediate
    Attempt 2: Wait 100ms
    Attempt 3: Wait 200ms
    Attempt 4: Wait 400ms
    Attempt 5: Wait 800ms
    ...

    max_delay: Cap at some limit (e.g., 30s)

Jitter
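
Jitter is applied on top of the capped exponential schedule, so it may help to see both side by side. The following is a minimal sketch, not a definitive implementation: the function names `backoff_delay` and `full_jitter_delay` are hypothetical, `attempt` is assumed to count retries from 0, and the defaults (`base=0.1` seconds, `cap=30.0` seconds) mirror the 100ms and 30s figures used in this section. The jittered variant follows the formula `random(0, min(cap, base * 2^attempt))`.

```python
import random


def backoff_delay(attempt, base=0.1, cap=30.0):
    """Deterministic delay in seconds before retry number `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def full_jitter_delay(attempt, base=0.1, cap=30.0):
    """Random delay drawn uniformly between 0 and the capped exponential delay."""
    return random.uniform(0.0, backoff_delay(attempt, base, cap))


if __name__ == "__main__":
    # Print the fixed schedule next to one random sample per retry.
    for attempt in range(5):
        fixed = backoff_delay(attempt)
        jittered = full_jitter_delay(attempt)
        print(f"retry {attempt}: fixed={fixed:.1f}s jittered={jittered:.3f}s")
```

Because every client draws its own random delay, clients that failed at the same moment are unlikely to retry at the same moment.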

    Jitter (Randomization)
    ======================

    Without jitter: All clients retry at same time -> Thundering herd

    With jitter:
      Attempt 1: Random(0-100ms)
      Attempt 2: Random(0-200ms)
      ...

    Formula: random(0, min(cap, base * 2^attempt))
             (here base = 100ms, and attempt starts at 0 for the first retry)

23.3 Handling Different Errors

| Error Type | Retry? |
|---|---|
| Timeout | Yes |
| 503 Service Unavailable | Yes |
| 429 Rate Limited | Yes (after waiting; honor the Retry-After header if present) |
| 400 Bad Request | No (fix request) |
| 401 Unauthorized | No (re-auth) |
| 500 Internal Error | Yes |
| 404 Not Found | Usually no |
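
The table above can be turned into a small decision helper. This is a sketch under stated assumptions: `should_retry` and `retry_delay_for_429` are hypothetical names, status codes not listed in the table are treated as non-retryable, and only a numeric `Retry-After` value (seconds) is handled, not the HTTP-date form.

```python
# Status codes the table marks as worth retrying.
RETRYABLE_STATUS = {429, 500, 503}


def should_retry(status=None, timed_out=False):
    """Return True if the outcome is marked retryable in the table."""
    if timed_out:
        return True
    # Assumption: anything not listed (400, 401, 404, ...) is not retried.
    return status in RETRYABLE_STATUS


def retry_delay_for_429(retry_after_header, fallback=1.0):
    """Seconds to wait after a 429, using a numeric Retry-After if given."""
    try:
        return max(0.0, float(retry_after_header))
    except (TypeError, ValueError):
        # Header missing or not a plain number of seconds.
        return fallback
```

A caller would check `should_retry` first and, for a 429, sleep for `retry_delay_for_429(...)` before the next attempt instead of using the normal backoff delay.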

23.4 Idempotency

    Idempotency
    ===========

    Same request can be sent multiple times
    Same result occurs

    GET /users/123    -> Idempotent
    POST /users       -> NOT idempotent (creates new)
    PUT /users/123    -> Idempotent (replace)
    DELETE /users/123 -> Idempotent

    For non-idempotent: Use idempotency keys

23.5 Dead Letter Queue

    Dead Letter Queue (DLQ)
    =======================

    Message fails all retries -> Send to DLQ

    +------------+
    | Main Queue |
    +------------+
          |
          | (max retries exceeded)
          v
    +------------+
    |    DLQ     |
    +------------+

    Later: Manual review or reprocess

23.6 Implementation Example
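
The dead letter queue flow from Section 23.5 can be sketched with two in-memory queues. This is an illustration only, assuming a single-process consumer: `process_queue`, `handler`, and the shape of the DLQ record are hypothetical, and the delay between attempts is left out to keep the example short. A real message broker would normally provide its own DLQ mechanism.

```python
from collections import deque


def process_queue(main_queue, dead_letter_queue, handler, max_retries=3):
    """Drain main_queue; messages that fail max_retries times go to the DLQ."""
    while main_queue:
        message = main_queue.popleft()
        last_error = None
        for _ in range(max_retries):
            try:
                handler(message)
                break  # processed successfully, move to the next message
            except Exception as exc:
                last_error = exc
        else:
            # The loop finished without a break: every attempt failed.
            dead_letter_queue.append(
                {"message": message, "error": repr(last_error), "attempts": max_retries}
            )


def handler(message):
    """Hypothetical handler that always rejects one malformed message."""
    if message == "bad":
        raise ValueError("cannot parse message")


main = deque(["ok-1", "bad", "ok-2"])
dlq = deque()
process_queue(main, dlq, handler)
```

After the run, `main` is empty and `dlq` holds one record for `"bad"`, which can later be reviewed manually or pushed back onto the main queue.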
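
    import random
    import time


    def retry_with_backoff(func, max_retries=3, base_delay=0.1):
        """Call func() up to max_retries times with jittered exponential backoff."""
        for attempt in range(max_retries):
            try:
                return func()
            except Exception:
                # Out of attempts: re-raise the last error to the caller.
                if attempt == max_retries - 1:
                    raise

                # Exponential delay (0.1s, 0.2s, ...), capped at 30 seconds.
                delay = min(30, base_delay * (2 ** attempt))
                # Jitter: scale the delay by a random factor between 0.5 and 1.5.
                delay *= random.uniform(0.5, 1.5)
                time.sleep(delay)

Summary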

- Transient failures - Network hiccups happen
- Backoff - Wait longer between retries
- Jitter - Randomize to prevent thundering herd
- DLQ - Handle messages that fail permanently
- Idempotency - Handle duplicate requests
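
The last point, handling duplicate requests, relies on the idempotency keys mentioned in Section 23.4. A minimal sketch of the idea follows, assuming an in-memory store: the `UserService` class and its `create_user` method are hypothetical and stand in for a `POST /users` handler. The client generates one key per logical request and reuses it on every retry.

```python
import uuid


class UserService:
    """Hypothetical in-memory service that deduplicates create requests."""

    def __init__(self):
        self.users = {}
        self._results_by_key = {}

    def create_user(self, name, idempotency_key):
        # A key we have already seen means this is a retry: return the
        # stored result instead of creating a second user.
        if idempotency_key in self._results_by_key:
            return self._results_by_key[idempotency_key]
        user_id = len(self.users) + 1
        self.users[user_id] = name
        self._results_by_key[idempotency_key] = user_id
        return user_id


service = UserService()
key = str(uuid.uuid4())  # generated once by the client, reused on retries
first_id = service.create_user("ada", key)
retry_id = service.create_user("ada", key)  # e.g. sent again after a timeout
```

Both calls return the same user ID and only one user is stored, so the create operation becomes safe to retry.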