Chapter 23: Retry Patterns & Dead Letter Queues

Network and service failures are often transient: they happen once, and the same request succeeds when it is tried again.

Transient Failure Example
=========================
Request 1: Timeout (network hiccup)
Request 2: Success! (network recovered)
Without retry: User sees error
With retry: User gets success

Simple Retry
============
Try once, fail -> Try again immediately
Problem: Doesn't help if the service is temporarily down, since the immediate retry hits the same outage

Exponential Backoff
===================
Attempt 1: Immediate
Attempt 2: Wait 100ms
Attempt 3: Wait 200ms
Attempt 4: Wait 400ms
Attempt 5: Wait 800ms
...
max_delay: Cap at some limit (e.g., 30s)
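
The schedule above can be sketched as a small function. This is a minimal illustration, not a definitive implementation: the name `backoff_delay_ms` and its parameters `base_ms` and `cap_ms` are assumptions made for this example, and `retry=0` stands for the first wait (the one before Attempt 2).

```python
def backoff_delay_ms(retry, base_ms=100, cap_ms=30_000):
    """Wait in milliseconds before a retry; retry=0 is the first wait.

    Hypothetical helper: doubles the wait each time and never exceeds cap_ms.
    """
    return min(cap_ms, base_ms * (2 ** retry))


# The first four waits match the schedule in the text: 100, 200, 400, 800 ms.
schedule = [backoff_delay_ms(n) for n in range(4)]
```

Because of the cap, later waits stop growing: with these defaults the tenth wait would be 51200 ms uncapped, so it is clamped to 30000 ms.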

Jitter (Randomization)
======================
Without jitter:
All clients retry at same time -> Thundering herd
With jitter:
Attempt 1: Random(0-100ms)
Attempt 2: Random(0-200ms)
...
Formula: random(0, min(cap, base * 2^attempt))
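
The formula above translates almost directly into code. The sketch below assumes a zero-based `attempt` counter, so that the first wait is drawn from 0 to 100 ms as in the list; the function name `full_jitter_delay` and the optional `rng` parameter are illustrative choices, not part of any library.

```python
import random


def full_jitter_delay(attempt, base=0.1, cap=30.0, rng=random):
    """Pick a random wait in seconds: random(0, min(cap, base * 2^attempt)).

    attempt is zero-based (an assumption of this sketch); rng can be any
    object with a uniform(a, b) method, which makes the function testable.
    """
    upper = min(cap, base * (2 ** attempt))
    return rng.uniform(0, upper)
```

Drawing from the whole range, rather than adding a small offset to a fixed delay, spreads clients that failed at the same moment across the full window.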

Error Type                 Retry?
-------------------------  ----------------
Timeout                    Yes
503 Service Unavailable    Yes
429 Rate Limited           Yes (after wait)
400 Bad Request            No (fix request)
401 Unauthorized           No (re-auth)
500 Internal Error         Yes
404 Not Found              Usually no
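
One way to encode the table is a small lookup that the retry loop consults before sleeping. This is a sketch under the table's own classification; `RETRYABLE_STATUS` and `should_retry` are hypothetical names, and a real client would also need to honor the wait that a 429 response asks for.

```python
# Status codes the table marks as worth retrying.
RETRYABLE_STATUS = {429, 500, 503}


def should_retry(status=None, timed_out=False):
    """Return True if the table above says the failure is worth retrying."""
    if timed_out:
        return True
    # 400, 401, 404 and any unlisted status are treated as non-retryable.
    return status in RETRYABLE_STATUS
```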

Idempotency
===========
An operation is idempotent if sending the same request multiple times
has the same effect as sending it once
GET /users/123 -> Idempotent
POST /users -> NOT idempotent (creates new)
PUT /users/123 -> Idempotent (replace)
DELETE /users/123 -> Idempotent
For non-idempotent operations: Use idempotency keys so the server can recognize a retried request and return the original result
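
To illustrate idempotency keys, here is a minimal in-memory sketch. `UserService`, `create_user`, and the dictionary used as a key store are all hypothetical; a production service would persist the keys and expire them after some time.

```python
import uuid


class UserService:
    """Toy stand-in for the server behind POST /users."""

    def __init__(self):
        self.users = []
        self._seen = {}  # idempotency key -> response already returned

    def create_user(self, name, idempotency_key):
        # A repeated key means this is a retry: return the stored response
        # instead of creating a second user.
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]
        user = {"id": len(self.users) + 1, "name": name}
        self.users.append(user)
        self._seen[idempotency_key] = user
        return user


service = UserService()
key = str(uuid.uuid4())                   # client generates one key per logical request
first = service.create_user("ada", key)
second = service.create_user("ada", key)  # retry after a timeout, same key
```

The client reuses the same key for every retry of one logical request, so the retried POST becomes safe to send more than once.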

Dead Letter Queue (DLQ)
=======================
Message fails all retries -> Send to DLQ

+------------+
| Main Queue |
+------------+
      |  (max retries exceeded)
      v
+------------+
|    DLQ     |
+------------+

Later: Manual review or reprocess
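
The flow in the diagram can be sketched with two in-memory queues. Everything here is illustrative: `process_queue`, `handler`, and the `deque`-based queues are assumptions for the example, not the API of any message broker, and the waits between attempts are omitted to keep it short.

```python
from collections import deque


def process_queue(main_queue, handler, max_attempts=3):
    """Drain main_queue; messages that fail every attempt go to a DLQ."""
    dlq = deque()
    while main_queue:
        message = main_queue.popleft()
        last_error = None
        for _ in range(max_attempts):
            try:
                handler(message)
                break  # success: stop retrying this message
            except Exception as exc:
                last_error = exc
        else:
            # The loop never hit `break`, so every attempt failed.
            dlq.append({
                "message": message,
                "error": repr(last_error),
                "attempts": max_attempts,
            })
    return dlq


def handler(message):
    # Hypothetical consumer that can never process the message "bad".
    if message == "bad":
        raise ValueError("cannot parse")


dlq = process_queue(deque(["ok", "bad", "ok2"]), handler)
```

Storing the error and attempt count alongside the message gives whoever reviews the DLQ later enough context to decide between fixing and reprocessing.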

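import random
import time

def retry_with_backoff(func, max_retries=3, base_delay=0.1, max_delay=30):
    """Call func(), retrying on failure with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception:
            # Out of attempts: re-raise the last error with its original traceback.
            if attempt == max_retries - 1:
                raise
            # Exponential backoff: base_delay, then 2x, 4x, ...
            delay = base_delay * (2 ** attempt)
            # Jitter: scale by a random factor in [0.5, 1.5] so clients spread out.
            delay *= random.uniform(0.5, 1.5)
            # Apply the cap last so jitter cannot push the wait past max_delay.
            time.sleep(min(max_delay, delay))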

  1. Transient failures - Network hiccups happen
  2. Backoff - Wait longer between retries
  3. Jitter - Randomize to prevent thundering herd
  4. DLQ - Handle messages that fail permanently
  5. Idempotency - Handle duplicate requests