Checkpointing & Persistence

Production Checkpointers


Choosing the Right Checkpointer (January 2026)

| Checkpointer | Use Case | Throughput | Setup |
|---|---|---|---|
| MemorySaver | Development only | Fastest | None |
| SqliteSaver | Single instance | Medium | File path |
| PostgresSaver | Distributed, multi-instance | High | Connection string |
| RedisSaver | High-throughput, ephemeral | Highest | Redis URL |

SqliteSaver: Single Instance Production

from langgraph.checkpoint.sqlite import SqliteSaver
import sqlite3

# File-based persistence (check_same_thread=False lets the saver share
# the connection across threads)
conn = sqlite3.connect("checkpoints.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)

# Or in-memory for testing (state lives only for the life of the process)
conn = sqlite3.connect(":memory:", check_same_thread=False)
checkpointer = SqliteSaver(conn)

# Note: in recent versions SqliteSaver.from_conn_string(...) returns a
# context manager, so use it as:
#   with SqliteSaver.from_conn_string("checkpoints.db") as checkpointer: ...

app = graph.compile(checkpointer=checkpointer)

Best For:

  • Single-server deployments
  • Embedded applications
  • Simple production setups
  • When PostgreSQL is overkill

Limitations:

  • Single process only
  • No horizontal scaling
  • File-based (backup needed)
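To make the pattern concrete, here is a minimal stand-in for what a SQLite-backed checkpointer does under the hood: state serialized per `thread_id` into a file-backed table. The class and method names are illustrative, not the real `SqliteSaver` API.

```python
import json
import sqlite3

# Illustrative sketch only -- a thread-scoped checkpoint table showing
# why file-backed SQLite persists across restarts but stays single-process.
class TinyCheckpointStore:
    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(thread_id TEXT PRIMARY KEY, state TEXT)"
        )

    def put(self, thread_id, state):
        # Upsert the latest state for this thread
        self.conn.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self.conn.commit()

    def get(self, thread_id):
        row = self.conn.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

store = TinyCheckpointStore(":memory:")  # use a file path in production
store.put("user-1", {"step": 3})
print(store.get("user-1"))  # {'step': 3}
```

Because everything funnels through one SQLite file, a second app instance on another host cannot share this state, which is exactly the horizontal-scaling limitation above.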

PostgresSaver: Distributed Production

from langgraph.checkpoint.postgres import PostgresSaver
from psycopg_pool import ConnectionPool

DB_URI = "postgresql://user:pass@localhost:5432/langgraph"

# Sync connection -- PostgresSaver is built on psycopg (not asyncpg),
# so use psycopg_pool for pooling
pool = ConnectionPool(DB_URI, min_size=5, max_size=20)
checkpointer = PostgresSaver(pool)
checkpointer.setup()  # creates the checkpoint tables; run once per database

app = graph.compile(checkpointer=checkpointer)

# For non-blocking writes, use AsyncPostgresSaver (see the async section below)

Best For:

  • Multi-instance deployments
  • Kubernetes/container environments
  • High availability requirements
  • Long-term checkpoint storage

Production Setup:

-- Create a dedicated database
CREATE DATABASE langgraph_checkpoints;

-- Recommended: connection pooling (e.g., PgBouncer)
-- Recommended: regular VACUUM / autovacuum tuning for large checkpoint tables
-- Recommended: periodically prune old threads (table names depend on the saver version)

RedisSaver: High-Throughput Ephemeral

from langgraph.checkpoint.redis import RedisSaver
import redis

# Basic setup -- from_conn_string returns a context manager in recent
# versions of langgraph-checkpoint-redis
with RedisSaver.from_conn_string("redis://localhost:6379") as checkpointer:
    checkpointer.setup()  # create the Redis indices the saver needs
    app = graph.compile(checkpointer=checkpointer)

# With an explicit connection pool (argument names vary slightly across
# package versions; check the docs for your release)
pool = redis.ConnectionPool.from_url(
    "redis://localhost:6379",
    max_connections=50
)
client = redis.Redis(connection_pool=pool)
checkpointer = RedisSaver(redis_client=client)

# TTL (auto-expiring checkpoints) is configured through the saver's ttl
# options; the exact parameter shape differs across versions, so check
# the langgraph-checkpoint-redis docs for your release.

Best For:

  • Highest throughput requirements
  • Session-based workflows (TTL expiry)
  • When checkpoints don't need long-term storage
  • Real-time applications
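The TTL behavior is what makes Redis a good fit for session-scoped state. As a plain-Python sketch of the semantics (Redis does this natively via key expiry; `TTLStore` and its methods are illustrative, not the RedisSaver API):

```python
import time

# Illustrative sketch of TTL expiry: entries keyed by thread_id that
# silently disappear after ttl_seconds. `clock` is injectable so the
# behavior is deterministic in tests.
class TTLStore:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._data = {}  # thread_id -> (expires_at, state)

    def put(self, thread_id, state):
        self._data[thread_id] = (self.clock() + self.ttl, state)

    def get(self, thread_id):
        entry = self._data.get(thread_id)
        if entry is None:
            return None
        expires_at, state = entry
        if self.clock() >= expires_at:
            del self._data[thread_id]  # lazy eviction, like an expired Redis key
            return None
        return state

# Demo with a fake clock so expiry is deterministic
now = [0.0]
store = TTLStore(ttl_seconds=86400, clock=lambda: now[0])
store.put("session-1", {"step": 2})
assert store.get("session-1") == {"step": 2}
now[0] = 86401.0  # jump past the 24-hour TTL
assert store.get("session-1") is None
```

With real Redis, the server evicts expired keys for you, so abandoned sessions clean themselves up without any pruning job.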

Async Checkpointers: Non-Blocking Writes

from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver
from psycopg_pool import AsyncConnectionPool

DB_URI = "postgresql://user:pass@localhost:5432/langgraph"

async def setup_async_checkpointer():
    # AsyncPostgresSaver is built on psycopg's async pool (not asyncpg)
    pool = AsyncConnectionPool(DB_URI, min_size=10, max_size=50, open=False)
    await pool.open()

    checkpointer = AsyncPostgresSaver(pool)
    await checkpointer.setup()  # create tables; run once per database

    # Use with an async graph
    app = graph.compile(checkpointer=checkpointer)
    return app

# Non-blocking checkpoint writes
async def run_workflow():
    app = await setup_async_checkpointer()
    config = {"configurable": {"thread_id": "async-user-1"}}

    # Checkpoints written asynchronously
    result = await app.ainvoke({"query": "research"}, config)

Why Async Matters:

  • Checkpoint writes don't block main execution
  • Essential for high-concurrency workloads
  • Often a several-fold throughput improvement in checkpoint-heavy workloads
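A toy model shows why overlapping the writes matters. Each simulated "workflow" below spends 50 ms on a checkpoint write; run concurrently, ten of them finish in roughly the time of one write instead of ten. (The sleep is a stand-in for real I/O; the numbers are illustrative.)

```python
import asyncio
import time

# Each task stands in for one workflow run whose checkpoint write
# takes ~50 ms of I/O time.
async def run_one(thread_id):
    await asyncio.sleep(0.05)  # simulated async checkpoint write
    return thread_id

async def main():
    start = time.monotonic()
    # All ten writes overlap instead of queueing behind each other
    await asyncio.gather(*(run_one(f"user-{i}") for i in range(10)))
    return time.monotonic() - start

elapsed = asyncio.run(main())
print(f"{elapsed:.2f}s")  # well under the ~0.5 s a sequential run would take
```

A sync checkpointer would serialize those same writes on one thread, which is the bottleneck the bullets above describe.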

Connection Pool Best Practices

# PostgreSQL pool sizing
# Rule of thumb: connections = (cores * 2) + spindle_count
# For SSDs, spindle_count is ~0, so cores * 2 is usually sufficient

# (asyncpg shown here; psycopg_pool's ConnectionPool takes the same
# min_size/max_size arguments)
pool = await asyncpg.create_pool(
    connection_string,
    min_size=10,      # connections kept warm at all times
    max_size=50,      # ceiling during peak load
    max_inactive_connection_lifetime=300  # close idle connections after 5 min
)

# Redis pool sizing
pool = redis.ConnectionPool.from_url(
    redis_url,
    max_connections=100,  # Higher for Redis (lightweight)
    socket_timeout=5,
    socket_connect_timeout=5
)
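The sizing rule of thumb above is easy to sanity-check in code. This is the PostgreSQL wiki heuristic, not a hard rule; the function name is ours, and real pools should be benchmarked against your workload.

```python
# Sketch of the pool-sizing heuristic: connections = (cores * 2) + spindles.
# For SSD-backed databases, spindles is effectively 0.
def pg_pool_size(cores, spindles=0):
    return cores * 2 + spindles

print(pg_pool_size(8))     # 16 -- typical SSD box
print(pg_pool_size(8, 4))  # 20 -- spinning disks add headroom
```

Note this bounds useful *database-side* concurrency; if many app instances share one database, divide the budget among their pools (or front them with PgBouncer).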

Interview Questions

Q: When would you use PostgresSaver vs RedisSaver?

"PostgresSaver for durable, long-term checkpoint storage—multi-instance deployments, compliance requirements, when you need checkpoint history. RedisSaver for highest throughput with ephemeral checkpoints—real-time apps, session-based workflows where checkpoints can expire."

Q: Why use async checkpointers?

"Async checkpointers write to storage without blocking the main execution thread. In high-concurrency production this can mean several-fold better throughput. Sync checkpointers wait for each write to complete before continuing, creating a bottleneck."


Key Takeaways

  • SqliteSaver for single-instance production ✅
  • PostgresSaver for distributed, multi-instance deployments ✅
  • RedisSaver for highest throughput with ephemeral checkpoints ✅
  • Use async checkpointers in production for non-blocking writes ✅
  • Size connection pools appropriately (10-50 connections is typical) ✅

