Checkpointing & Persistence
Production Checkpointers
Choosing the Right Checkpointer (January 2026)
| Checkpointer | Use Case | Throughput | Setup |
|---|---|---|---|
| MemorySaver | Development only | Fastest | None |
| SqliteSaver | Single instance | Medium | File path |
| PostgresSaver | Distributed, multi-instance | High | Connection string |
| RedisSaver | High-throughput, ephemeral | Highest | Redis URL |
SqliteSaver: Single Instance Production
```python
import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver

# File-based persistence (check_same_thread=False lets the saver be
# used across threads)
conn = sqlite3.connect("./checkpoints.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)

# Or in-memory for testing (persists only for the process lifetime)
checkpointer = SqliteSaver(sqlite3.connect(":memory:", check_same_thread=False))

app = graph.compile(checkpointer=checkpointer)
```
Best For:
- Single-server deployments
- Embedded applications
- Simple production setups
- When PostgreSQL is overkill
Limitations:
- Single process only
- No horizontal scaling
- File-based (backup needed)
PostgresSaver: Distributed Production
```python
from langgraph.checkpoint.postgres import PostgresSaver
from psycopg_pool import ConnectionPool

DB_URI = "postgresql://user:pass@localhost:5432/langgraph"

# Sync connection (from_conn_string is a context manager in recent versions)
with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # create checkpoint tables on first run
    app = graph.compile(checkpointer=checkpointer)

# For high throughput, pass an explicit connection pool
# (langgraph's Postgres savers are built on psycopg, not asyncpg)
pool = ConnectionPool(DB_URI, min_size=5, max_size=20)
checkpointer = PostgresSaver(pool)
checkpointer.setup()
app = graph.compile(checkpointer=checkpointer)
```
Best For:
- Multi-instance deployments
- Kubernetes/container environments
- High availability requirements
- Long-term checkpoint storage
Production Setup:
```sql
-- Create a dedicated database
CREATE DATABASE langgraph_checkpoints;
-- Recommended: connection pooling (PgBouncer)
-- Recommended: regular VACUUM for large checkpoint tables
```
RedisSaver: High-Throughput Ephemeral
```python
from langgraph.checkpoint.redis import RedisSaver
import redis

# Basic setup
checkpointer = RedisSaver.from_conn_string("redis://localhost:6379")

# With a connection pool
pool = redis.ConnectionPool.from_url(
    "redis://localhost:6379",
    max_connections=50,
)
checkpointer = RedisSaver(redis.Redis(connection_pool=pool))

# With TTL so old checkpoints auto-expire (the exact TTL parameter shape
# varies across langgraph-checkpoint-redis releases; check your version)
checkpointer = RedisSaver.from_conn_string(
    "redis://localhost:6379",
    ttl=86400,  # 24 hours
)
```
Best For:
- Highest throughput requirements
- Session-based workflows (TTL expiry)
- When checkpoints don't need long-term storage
- Real-time applications
Async Checkpointers: Non-Blocking Writes
```python
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver
from psycopg_pool import AsyncConnectionPool

DB_URI = "postgresql://user:pass@localhost:5432/langgraph"

async def setup_async_checkpointer():
    # The async saver is psycopg-based; AsyncConnectionPool replaces asyncpg
    pool = AsyncConnectionPool(DB_URI, min_size=10, max_size=50, open=False)
    await pool.open()
    checkpointer = AsyncPostgresSaver(pool)
    await checkpointer.setup()  # create checkpoint tables on first run
    # Use with an async graph
    return graph.compile(checkpointer=checkpointer)

# Non-blocking checkpoint writes
async def run_workflow():
    app = await setup_async_checkpointer()
    config = {"configurable": {"thread_id": "async-user-1"}}
    # Checkpoints are written on the event loop without blocking node execution
    result = await app.ainvoke({"query": "research"}, config)
```
Why Async Matters:
- Checkpoint writes don't block main execution
- Essential for high-concurrency workloads
- Often a several-fold throughput improvement under concurrent production load
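The effect is easy to see in a simulation: when I/O-bound checkpoint writes run on the event loop, ten workflows' writes overlap instead of queueing behind one another. This sketch uses plain `asyncio`, not the LangGraph API:

```python
import asyncio
import time

async def checkpoint_write():
    # Simulate a 50 ms I/O-bound write to the checkpoint store
    await asyncio.sleep(0.05)

async def workflow(n: int):
    # Each workflow runs a node, then persists a checkpoint
    await asyncio.sleep(0.01)  # node execution
    await checkpoint_write()

async def main() -> float:
    start = time.monotonic()
    # Ten concurrent workflows: their writes overlap on the event loop
    await asyncio.gather(*(workflow(i) for i in range(10)))
    return time.monotonic() - start

elapsed = asyncio.run(main())
# Serial, blocking writes would take roughly 10 * 60 ms; here the total
# stays close to the duration of a single workflow
```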
Connection Pool Best Practices
```python
from psycopg_pool import AsyncConnectionPool
import redis

# PostgreSQL pool sizing
# Formula: connections = (cores * 2) + spindle_count
# For SSD-backed databases: cores * 2 is usually sufficient
pool = AsyncConnectionPool(
    connection_string,
    min_size=10,   # connections kept warm
    max_size=50,   # ceiling during peak load
    max_idle=300,  # close idle connections after 5 minutes
    open=False,    # open explicitly with `await pool.open()`
)

# Redis pool sizing
pool = redis.ConnectionPool.from_url(
    redis_url,
    max_connections=100,  # Redis connections are lightweight, so go higher
    socket_timeout=5,
    socket_connect_timeout=5,
)
```
Interview Questions
Q: When would you use PostgresSaver vs RedisSaver?
"PostgresSaver for durable, long-term checkpoint storage—multi-instance deployments, compliance requirements, when you need checkpoint history. RedisSaver for highest throughput with ephemeral checkpoints—real-time apps, session-based workflows where checkpoints can expire."
Q: Why use async checkpointers?
"Async checkpointers write to storage without blocking the main execution thread. In high-concurrency production, this means 3-5x better throughput. Sync checkpointers wait for each write to complete before continuing, creating a bottleneck."
Key Takeaways
- ✅ SqliteSaver for single-instance production
- ✅ PostgresSaver for distributed, multi-instance deployments
- ✅ RedisSaver for highest throughput with ephemeral checkpoints
- ✅ Use async checkpointers in production for non-blocking writes
- ✅ Size connection pools appropriately (10-50 connections is typical)