API Design & Microservices Patterns
Microservices Architecture Patterns
Microservices questions dominate senior backend interviews. You need to explain patterns, draw architecture diagrams, and reason about trade-offs. This lesson covers the patterns interviewers ask about most.
Monolith to Microservices: The Strangler Fig Pattern
Never rewrite a monolith from scratch. The Strangler Fig pattern lets you incrementally extract services while the monolith keeps running.
┌─────────────┐
│ API │
│ Gateway │
└──────┬──────┘
│
┌────────────┼────────────┐
│ │ │
v v v
┌──────────┐ ┌──────────┐ ┌──────────┐
│ New │ │ New │ │ Monolith │
│ Auth │ │ Payment │ │ (still │
│ Service │ │ Service │ │ handles │
└──────────┘ └──────────┘ │ orders, │
│ users, │
│ etc.) │
└──────────┘
How it works:
- Put an API Gateway in front of the monolith
- Identify a bounded context to extract (start with the least coupled)
- Build the new service alongside the monolith
- Route traffic to the new service via the gateway (see the routing sketch after this list)
- Remove the old code from the monolith once the new service is stable
- Repeat for the next bounded context
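A minimal sketch of the routing step, in Python. The path prefixes and service hostnames are hypothetical; as each bounded context is extracted, its prefix moves from the monolith to its new service:

# Minimal strangler-fig routing sketch at the gateway layer.
# Path prefixes and upstream hosts are hypothetical examples.
EXTRACTED_ROUTES = {
    "/auth": "http://auth-service:8080",         # already extracted
    "/payments": "http://payment-service:8080",  # already extracted
}
MONOLITH = "http://monolith:8080"                # everything else stays here

def resolve_upstream(path: str) -> str:
    """Return the upstream that should handle this request path."""
    for prefix, upstream in EXTRACTED_ROUTES.items():
        if path.startswith(prefix):
            return upstream
    return MONOLITH

assert resolve_upstream("/auth/login") == "http://auth-service:8080"
assert resolve_upstream("/orders/42") == "http://monolith:8080"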
Service Boundaries: Bounded Contexts from DDD
Each microservice should own a single bounded context — a cohesive domain with its own data, rules, and language.
┌──────────────────────────────────────────────────────────┐
│ E-Commerce Platform │
├──────────────┬──────────────┬──────────────┬─────────────┤
│ Order │ Payment │ Inventory │ Shipping │
│ Context │ Context │ Context │ Context │
│ │ │ │ │
│ - Order │ - Payment │ - Product │ - Shipment │
│ - OrderItem │ - Refund │ - Stock │ - Tracking │
│ - Cart │ - Invoice │ - Warehouse │ - Carrier │
│ │ │ │ │
│ Own DB │ Own DB │ Own DB │ Own DB │
└──────────────┴──────────────┴──────────────┴─────────────┘
Key principle: Database per service. Each service owns its data. No shared databases. Communication happens through APIs or events, never through direct database access.
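A small sketch of what this looks like in practice, assuming a hypothetical Inventory API: the Order service asks the Inventory service for product data over HTTP instead of reading the inventory tables directly.

import requests

# Hypothetical sketch: the Order service needs product data owned by the
# Inventory context. It calls Inventory's public API rather than querying
# the inventory database directly.
INVENTORY_API = "http://inventory-service:8080"   # assumed hostname

def get_product(product_id: str) -> dict:
    # The contract is the HTTP API, not the underlying schema, so the
    # Inventory team can change its tables without breaking callers.
    response = requests.get(f"{INVENTORY_API}/products/{product_id}", timeout=2)
    response.raise_for_status()
    return response.json()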
Inter-Service Communication
Synchronous vs Asynchronous
| Aspect | Synchronous (REST/gRPC) | Asynchronous (Message Queue) |
|---|---|---|
| Coupling | Temporal coupling (caller waits) | Decoupled (fire and forget) |
| Latency | Higher for chains (A -> B -> C) | Lower perceived latency |
| Availability | Cascading failures if downstream is down | Resilient, messages wait in queue |
| Debugging | Easier (request/response trace) | Harder (trace across queues) |
| Data consistency | Immediate (within request) | Eventual consistency |
| Best for | Reads, queries needing immediate response | Writes, events, long-running tasks |
Asynchronous Messaging
┌─────────┐ ┌─────────────────┐ ┌───────────┐
│ Order │────>│ Message Broker │────>│ Payment │
│ Service │ │ (Kafka/SQS/ │ │ Service │
└─────────┘ │ RabbitMQ) │ └───────────┘
└────────┬────────┘
│
├────────────>┌───────────┐
│ │ Inventory │
│ │ Service │
│ └───────────┘
│
└────────────>┌───────────┐
│ Notification│
│ Service │
└───────────┘
| Broker | Strengths | Best For |
|---|---|---|
| Apache Kafka | High throughput, persistent log, replay | Event sourcing, streaming, audit logs |
| RabbitMQ | Flexible routing, priority queues, low latency | Task queues, RPC patterns |
| AWS SQS | Fully managed, auto-scaling, dead letter queues | Serverless architectures, AWS-native |
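A minimal publish sketch using the kafka-python client; the topic name and event payload are illustrative. The Order service emits the event and returns immediately, and each consumer processes it on its own schedule:

import json
from kafka import KafkaProducer  # kafka-python client

# Sketch: the Order service publishes an OrderCreated event and moves on.
# Downstream consumers (Payment, Inventory, Notification) subscribe to the
# topic independently. Topic name and payload shape are assumptions.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

producer.send("order-events", {
    "eventType": "OrderCreated",
    "orderId": "ord-123",
    "amount": 59.99,
})
producer.flush()  # block until the broker acknowledges the batch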
Saga Pattern: Distributed Transactions
In microservices, you cannot use a single database transaction across services. The Saga pattern coordinates a sequence of local transactions, with compensating actions if any step fails.
E-Commerce Order Flow Example
Happy Path:
Order Service Payment Service Inventory Service Shipping Service
│ │ │ │
1. Create Order ──────────> │ │
│ 2. Charge Card │ │
│ ────────────────> │ │
│ │ 3. Reserve Stock │
│ │ ─────────────────> │
│ │ │ 4. Create Shipment
│ │ │ ────────────────>
│ │ │ │
◄────────────────────────────── Success ──────────────────────┘
Failure at Step 3 (out of stock):
Order Service Payment Service Inventory Service
│ │ │
1. Create Order ──────────> │
│ 2. Charge Card │
│ ────────────────> │
│ │ 3. Reserve Stock ──> FAILS!
│ │ │
│ 4. COMPENSATE: │
│ Refund Card │
│ <────────────── │
5. COMPENSATE: │ │
Cancel Order │ │
│ │ │
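A minimal orchestration-style sketch of the flow above: each step is paired with a compensating action, and a failure unwinds the completed steps in reverse. The step functions are stand-ins for real service calls:

# Minimal saga sketch: each step has a compensating action. If a step fails,
# already-completed steps are undone in reverse order. The step functions
# below are stand-ins for real service calls.
def charge_card(order): ...
def refund_card(order): ...
def reserve_stock(order): ...
def release_stock(order): ...
def create_shipment(order): ...

SAGA_STEPS = [
    (charge_card, refund_card),
    (reserve_stock, release_stock),
    (create_shipment, None),   # nothing runs after the last step, so no undo needed
]

def run_order_saga(order) -> bool:
    completed = []
    for action, compensation in SAGA_STEPS:
        try:
            action(order)
            completed.append(compensation)
        except Exception:
            # Undo in reverse order, then report failure so the order is cancelled
            for undo in reversed(completed):
                if undo:
                    undo(order)
            return False
    return True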
Orchestration vs Choreography
Orchestration: A central coordinator (saga orchestrator) tells each service what to do.
┌──────────────────┐
│ Saga │
│ Orchestrator │
└───────┬──────────┘
│
┌─────────────┼─────────────┐
│ │ │
v v v
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Payment │ │ Inventory│ │ Shipping │
│ Service │ │ Service │ │ Service │
└──────────┘ └──────────┘ └──────────┘
Choreography: Each service reacts to events and publishes its own events. No central coordinator.
Order Created ──> Payment Service ──> Payment Completed ──> Inventory Service
│
Stock Reserved
│
v
Shipping Service
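A choreography sketch from the Inventory service's point of view, with a stand-in `publish` function in place of a real broker client: the service reacts to one event and emits another, without knowing who consumes it.

# Choreography sketch: each service only knows which events it consumes and
# which it emits. `publish` stands in for a real message-broker client.
def publish(topic: str, event: dict) -> None:
    ...  # e.g. a Kafka or SQS producer call

def reserve_items(order_id: str) -> None:
    ...  # stand-in for the inventory write (local transaction)

# Inside the Inventory service: react to PaymentCompleted, emit StockReserved.
def on_payment_completed(event: dict) -> None:
    reserve_items(event["orderId"])
    publish("inventory-events", {
        "eventType": "StockReserved",
        "orderId": event["orderId"],
    })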
| Aspect | Orchestration | Choreography |
|---|---|---|
| Coordination | Central orchestrator | Distributed (each service listens) |
| Coupling | Services coupled to orchestrator | Services coupled to events |
| Visibility | Easy to see full flow in one place | Flow distributed across services |
| Complexity | Orchestrator can become complex | Harder to track full saga |
| Failure handling | Orchestrator manages compensations | Each service handles its own |
| Best for | Complex multi-step workflows | Simple flows, high autonomy |
| Risk | Single point of failure | Circular dependencies, event storms |
Interview answer: "For a payment flow with 4+ steps and strict ordering, I'd use orchestration — the centralized view makes it easier to handle compensating transactions and debug failures. For simpler event-driven flows like sending notifications after an order, choreography works well because it keeps services autonomous."
Circuit Breaker Pattern
When a downstream service is failing, the Circuit Breaker stops calling it to prevent cascading failures. It has three states:
┌──────────┐
│ CLOSED │ (normal operation)
│ │
│ Requests │
│ pass │
│ through │
└────┬─────┘
│
Failure threshold reached
│
v
┌──────────┐
│ OPEN │ (all requests fail fast)
│ │
│ Returns │
│ fallback │
│ or error │
└────┬─────┘
│
Timeout expires
│
v
┌──────────┐
│HALF-OPEN │ (test with limited requests)
│ │
│ Allows │
│ few test │
│ requests │
└────┬─────┘
│
┌──────────┴──────────┐
│ │
Tests pass Tests fail
│ │
v v
┌──────────┐ ┌──────────┐
│ CLOSED │ │ OPEN │
└──────────┘ └──────────┘
import time
from enum import Enum


class CircuitState(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitOpenError(Exception):
    """Raised when the circuit is open and the call is blocked."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, timeout: float = 30.0, half_open_max: int = 3):
        self.failure_threshold = failure_threshold
        self.timeout = timeout                  # seconds to stay open before testing
        self.half_open_max = half_open_max      # test calls allowed while half-open
        self.state = CircuitState.CLOSED
        self.failure_count = 0
        self.last_failure_time = 0.0
        self.half_open_calls = 0

    def call(self, func, *args, **kwargs):
        if self.state == CircuitState.OPEN:
            if time.time() - self.last_failure_time > self.timeout:
                self.state = CircuitState.HALF_OPEN
                self.half_open_calls = 0
            else:
                raise CircuitOpenError("Circuit is open — request blocked")

        if self.state == CircuitState.HALF_OPEN:
            if self.half_open_calls >= self.half_open_max:
                raise CircuitOpenError("Half-open limit reached")
            self.half_open_calls += 1

        try:
            result = func(*args, **kwargs)
            self._on_success()
            return result
        except Exception:
            self._on_failure()
            raise

    def _on_success(self):
        if self.state == CircuitState.HALF_OPEN:
            self.state = CircuitState.CLOSED
        self.failure_count = 0   # reset the consecutive-failure count on any success

    def _on_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.time()
        if self.failure_count >= self.failure_threshold:
            self.state = CircuitState.OPEN
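A usage sketch: wrapping a downstream payment call (the endpoint is hypothetical) so that callers fail fast with a fallback while the circuit is open.

import requests

# Usage sketch: wrap a flaky downstream call behind the breaker.
payment_breaker = CircuitBreaker(failure_threshold=5, timeout=30.0)

def charge(order_id: str, amount: float) -> dict:
    def do_request():
        resp = requests.post(
            "http://payment-service:8080/charges",   # hypothetical endpoint
            json={"orderId": order_id, "amount": amount},
            timeout=2,
        )
        resp.raise_for_status()
        return resp.json()

    try:
        return payment_breaker.call(do_request)
    except CircuitOpenError:
        # Fail fast with a fallback instead of hammering a failing service
        return {"status": "pending", "reason": "payment service unavailable"}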
API Gateway
The API Gateway is the single entry point for all client requests. It handles cross-cutting concerns so individual services do not have to.
┌──────────────────────┐
Mobile App ──────────>│ │──> User Service
Web App ─────────────>│ API Gateway │──> Order Service
Third-party ─────────>│ │──> Payment Service
│ - Routing │──> Inventory Service
│ - Authentication │
│ - Rate Limiting │
│ - Request Aggregation│
│ - SSL Termination │
│ - Load Balancing │
└──────────────────────┘
Popular choices: Kong (plugin-based), Envoy (high-performance proxy), AWS API Gateway (serverless), NGINX (lightweight).
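As one example of a cross-cutting concern, here is a minimal token-bucket rate limiter sketch of the kind a gateway applies per client; the rate and burst values are illustrative, not recommendations.

import time

# Minimal token-bucket sketch for per-client rate limiting at the gateway.
class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

buckets: dict[str, TokenBucket] = {}

def is_allowed(client_id: str) -> bool:
    bucket = buckets.setdefault(client_id, TokenBucket(rate=10, capacity=20))
    return bucket.allow()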
Service Mesh
For complex microservice deployments, a service mesh adds observability, security, and traffic management without changing application code. It uses a sidecar proxy pattern.
┌─────────────────────┐ ┌─────────────────────┐
│ Pod A │ │ Pod B │
│ ┌───────────────┐ │ │ ┌───────────────┐ │
│ │ Order Service │ │ │ │ Payment Service│ │
│ └───────┬───────┘ │ │ └───────┬───────┘ │
│ │ │ │ │ │
│ ┌───────▼───────┐ │ │ ┌───────▼───────┐ │
│ │ Envoy Proxy │◄─┼─mTLS┼─►│ Envoy Proxy │ │
│ │ (sidecar) │ │ │ │ (sidecar) │ │
│ └───────────────┘ │ │ └───────────────┘ │
└─────────────────────┘ └─────────────────────┘
│ │
└───────────┬───────────────┘
│
┌───────▼────────┐
│ Control Plane │
│ (Istio/Linkerd)│
│ - mTLS certs │
│ - Traffic rules│
│ - Observability│
└────────────────┘
What a service mesh provides:
- mTLS: Automatic encryption between all services — zero-trust networking
- Traffic management: Canary deployments, A/B testing, fault injection
- Observability: Distributed tracing, metrics, access logs — without code changes
- Retries and timeouts: Configurable retry policies at the proxy level
CQRS (Command Query Responsibility Segregation)
Separate the read and write models of your application. The write side optimizes for consistency, while the read side optimizes for query performance.
┌──────────────────────────────────────┐
│ API Layer │
└────────────┬─────────────────────────┘
│
┌────────────┴────────────┐
│ │
Commands (writes) Queries (reads)
│ │
v v
┌──────────────┐ ┌──────────────┐
│ Write Model │ │ Read Model │
│ (normalized, │ Sync │ (denormalized│
│ consistent) │ ──────> │ fast reads) │
│ │ events │ │
│ PostgreSQL │ │ Elasticsearch│
│ │ │ or Redis │
└──────────────┘ └──────────────┘
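A minimal sketch of the split, with in-memory dictionaries standing in for the write database and the read store; in a real system the projection would be driven asynchronously by events.

# Minimal CQRS sketch: commands go through the write model, and a projection
# keeps a denormalized read model up to date for queries.
write_db: dict[str, dict] = {}    # stand-in for the normalized write store
read_view: dict[str, dict] = {}   # stand-in for Elasticsearch/Redis

def handle_place_order(order_id: str, customer: str, items: list[dict]) -> None:
    """Command side: validate and persist, then update the projection."""
    order = {"id": order_id, "customer": customer, "items": items, "status": "placed"}
    write_db[order_id] = order
    project_order_placed(order)   # in practice this happens asynchronously, via events

def project_order_placed(order: dict) -> None:
    """Read side: store a denormalized view optimized for the query."""
    read_view[order["id"]] = {
        "id": order["id"],
        "customer": order["customer"],
        "itemCount": len(order["items"]),
        "status": order["status"],
    }

def get_order_summary(order_id: str) -> dict | None:
    """Query side never touches the write model."""
    return read_view.get(order_id)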
When to use CQRS:
- Read-heavy workloads (e.g., a 100:1 read/write ratio)
- Complex queries that would slow down the write database
- Different scaling needs for reads vs writes
- Need for different data representations (e.g., search index)
When NOT to use CQRS:
- Simple CRUD applications
- Low traffic where a single model suffices
- Teams unfamiliar with eventual consistency
Event Sourcing
Instead of storing the current state, store a sequence of events that led to that state. You can rebuild any state by replaying events.
Traditional (state-based):
Account: { id: "acc-1", balance: 150 }
Event Sourcing (event-based):
Event 1: AccountCreated { id: "acc-1", owner: "Alice" }
Event 2: MoneyDeposited { amount: 200 }
Event 3: MoneyWithdrawn { amount: 50 }
─────────────────────────────────────────
Current state: balance = 0 + 200 - 50 = 150
// Event sourcing with an event store
interface DomainEvent {
  eventId: string;
  aggregateId: string;
  eventType: string;
  payload: Record<string, unknown>;
  timestamp: string;
  version: number;
}

interface Account {
  id: string;
  owner: string;
  balance: number;
}

// Rebuild account state from events
function rebuildAccount(events: DomainEvent[]): Account {
  return events.reduce<Account>((account, event) => {
    switch (event.eventType) {
      case 'AccountCreated':
        return { id: event.aggregateId, balance: 0, owner: event.payload.owner as string };
      case 'MoneyDeposited':
        return { ...account, balance: account.balance + (event.payload.amount as number) };
      case 'MoneyWithdrawn':
        return { ...account, balance: account.balance - (event.payload.amount as number) };
      default:
        return account;
    }
  }, { id: '', owner: '', balance: 0 });
}
Benefits of Event Sourcing:
- Complete audit trail — every change is recorded
- Time travel — rebuild state at any point in time
- Event replay — fix bugs by replaying events through corrected logic
- Natural fit with CQRS — events update the read model
Challenges:
- Eventual consistency between event store and read model
- Event schema evolution (versioning events)
- Storage growth — need snapshotting for aggregates with many events
Interview tip: CQRS and Event Sourcing are often mentioned together but are independent patterns. You can use CQRS without Event Sourcing (sync read models from the write DB) and Event Sourcing without CQRS (single model rebuilt from events).
This completes the API Design & Microservices module. Test your knowledge with the module quiz, then apply these patterns in the Payment API Design lab.