LLM Application Architecture
# Prompt Management
Prompts are the "code" of AI systems. Managing them properly is as important as managing source code. This lesson covers production-grade prompt management.
## Why Prompt Management Matters
| Problem | Impact |
|---|---|
| Prompts hardcoded in code | Hard to update without deployment |
| No version history | Can't rollback bad changes |
| No A/B testing | Can't optimize performance |
| Inconsistent formats | Different behaviors across features |
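To make the first row concrete, a minimal sketch of externalized prompts: template text lives in files, so editing a prompt doesn't require a code deployment. The `prompts/` directory layout and `.txt` naming convention here are illustrative assumptions.

```python
import pathlib

def load_prompt(name: str, prompt_dir: str = "prompts") -> str:
    # Prompt text lives on disk, so it can change without a redeploy
    return pathlib.Path(prompt_dir, f"{name}.txt").read_text()
```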
## Prompt Template System
```python
import re

from jinja2 import Template


class PromptTemplate:
    def __init__(self, template_str: str, version: str):
        self.template_str = template_str
        self.template = Template(template_str)
        self.version = version
        self.variables = self._extract_variables()

    def render(self, **kwargs) -> str:
        # Validate that all required variables are provided
        missing = set(self.variables) - set(kwargs.keys())
        if missing:
            raise ValueError(f"Missing variables: {missing}")
        return self.template.render(**kwargs)

    def _extract_variables(self) -> set:
        # Extract {{ variable }} patterns from the raw template string
        # (a compiled Jinja2 Template does not retain its source)
        return set(re.findall(r'\{\{\s*(\w+)\s*\}\}', self.template_str))


# Usage
qa_prompt = PromptTemplate(
    template_str="""
You are a helpful assistant. Answer the question based on the context.

Context:
{{ context }}

Question: {{ question }}

Answer:""",
    version="1.2.0",
)

rendered = qa_prompt.render(
    context="Python was created by Guido van Rossum.",
    question="Who created Python?",
)
```
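The fail-fast check in `render` can be exercised in isolation. A dependency-free sketch of the same variable extraction and comparison:

```python
import re

template_str = "Context:\n{{ context }}\n\nQuestion: {{ question }}\nAnswer:"
required = set(re.findall(r'\{\{\s*(\w+)\s*\}\}', template_str))

provided = {"question": "Who created Python?"}
missing = required - set(provided)
assert missing == {"context"}  # render() would raise ValueError with this set
```

Failing before the LLM call is the point: a silently empty `{{ context }}` produces a plausible-looking but unanchored answer, which is much harder to catch downstream.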
## Prompt Registry
Centralized storage for all prompts:
```python
from datetime import datetime, timezone


class PromptRegistry:
    def __init__(self, storage_backend):
        self.storage = storage_backend
        self.cache = {}

    async def get(self, name: str, version: str = "latest") -> PromptTemplate:
        cache_key = f"{name}:{version}"
        if cache_key in self.cache:
            return self.cache[cache_key]
        prompt_data = await self.storage.fetch(name, version)
        template = PromptTemplate(
            template_str=prompt_data["template"],
            version=prompt_data["version"],
        )
        self.cache[cache_key] = template
        return template

    async def save(self, name: str, template: str, metadata: dict):
        version = self._generate_version()
        await self.storage.save(name, {
            "template": template,
            "version": version,
            "metadata": metadata,
            "created_at": datetime.now(timezone.utc),
        })

    def _generate_version(self) -> str:
        # Timestamp-based versions; swap in semantic versioning if you
        # want human-meaningful increments
        return datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")


# Initialize with a database-backed storage implementation
registry = PromptRegistry(PostgresStorage())
```
## Version Control for Prompts
```yaml
# prompts/qa_assistant.yaml
name: qa_assistant
versions:
  - version: "1.0.0"
    date: "2024-01-15"
    template: |
      Answer the question: {{ question }}
    notes: "Initial version"
  - version: "1.1.0"
    date: "2024-02-01"
    template: |
      You are a helpful assistant.
      Answer the question: {{ question }}
    notes: "Added system context"
  - version: "1.2.0"
    date: "2024-03-10"
    template: |
      You are a helpful assistant. Answer based on the context provided.

      Context: {{ context }}

      Question: {{ question }}
    notes: "Added RAG context support"
active_version: "1.2.0"
rollback_version: "1.1.0"
```
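Given the parsed form of such a file (a dict, however it was loaded), resolving the active version, with rollback as a fallback, is a small lookup. A sketch; the field names follow the YAML layout above, and `resolve_template` is an illustrative helper, not a fixed API:

```python
def resolve_template(data: dict, use_rollback: bool = False) -> dict:
    # Pick the active (or rollback) version entry from a parsed prompt file
    target = data["rollback_version"] if use_rollback else data["active_version"]
    for entry in data["versions"]:
        if entry["version"] == target:
            return entry
    raise KeyError(f"Version {target} not found for {data['name']}")
```

Keeping `rollback_version` precomputed in the file means reverting a bad prompt is a one-line config change rather than a hunt through history under pressure.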
## A/B Testing Prompts
```python
import hashlib


class PromptExperiment:
    def __init__(self, name: str, variants: dict):
        self.name = name
        self.variants = variants  # e.g. {"control": 50, "variant_a": 50}

    def select_variant(self, user_id: str) -> str:
        # Deterministic selection based on user_id. Python's built-in
        # hash() is salted per process, so use hashlib for a value that
        # is stable across restarts and machines.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        hash_val = int(digest, 16) % 100
        cumulative = 0
        for variant, percentage in self.variants.items():
            cumulative += percentage
            if hash_val < cumulative:
                return variant
        return next(iter(self.variants))


# Usage
experiment = PromptExperiment(
    name="system_prompt_test",
    variants={
        "control": 50,   # Current prompt
        "concise": 25,   # Shorter prompt
        "detailed": 25,  # More detailed prompt
    },
)

variant = experiment.select_variant(user_id="user_123")
prompt = await registry.get(f"qa_assistant_{variant}")
```
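Because assignment is a pure function of experiment name and user id, the split can be verified offline: over many ids, observed buckets should approximate the configured percentages, and the same id must always land in the same bucket. A standalone check using a stable hash (hashlib rather than Python's per-process-salted built-in `hash`):

```python
import hashlib
from collections import Counter

def bucket(name: str, user_id: str, variants: dict) -> str:
    # Stable 0-99 value from experiment name + user id
    hash_val = int(hashlib.sha256(f"{name}:{user_id}".encode()).hexdigest(), 16) % 100
    cumulative = 0
    for variant, pct in variants.items():
        cumulative += pct
        if hash_val < cumulative:
            return variant
    return next(iter(variants))

variants = {"control": 50, "concise": 25, "detailed": 25}
counts = Counter(bucket("system_prompt_test", f"user_{i}", variants) for i in range(10_000))
# Same user id always gets the same variant
assert bucket("system_prompt_test", "u1", variants) == bucket("system_prompt_test", "u1", variants)
```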
## Prompt Composition
Build complex prompts from reusable components:
```python
class PromptComposer:
    def __init__(self, registry: PromptRegistry):
        self.registry = registry

    async def compose(self, components: list, variables: dict) -> str:
        """Compose a prompt from multiple named components."""
        parts = []
        for component in components:
            template = await self.registry.get(component)
            # Only pass the variables this component declares
            relevant_vars = {
                k: v for k, v in variables.items()
                if k in template.variables
            }
            parts.append(template.render(**relevant_vars))
        return "\n\n".join(parts)


# Usage
composer = PromptComposer(registry)
full_prompt = await composer.compose(
    components=["system_context", "user_history", "current_query"],
    variables={
        "role": "assistant",
        "history": previous_messages,
        "query": user_input,
    },
)
```
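The variable filtering inside `compose` is the load-bearing detail: each component receives only the keys it declares, so unrelated variables can't leak into a sub-template. The same filter in a dependency-free form:

```python
import re

def relevant_vars(template_str: str, variables: dict) -> dict:
    # Keep only the variables this template actually declares
    declared = set(re.findall(r'\{\{\s*(\w+)\s*\}\}', template_str))
    return {k: v for k, v in variables.items() if k in declared}

variables = {"role": "assistant", "history": "(messages)", "query": "Who created Python?"}
assert relevant_vars("You are a {{ role }}.", variables) == {"role": "assistant"}
assert relevant_vars("Query: {{ query }}", variables) == {"query": "Who created Python?"}
```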
## Best Practices
| Practice | Benefit |
|---|---|
| Store prompts externally | Update without code deployment |
| Version all changes | Audit trail and rollback capability |
| Test before deploying | Catch regressions early |
| Monitor performance metrics | Track quality over time |
| Document prompt intent | Help team understand design decisions |
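"Test before deploying" can be as lightweight as a unit test that renders each prompt with fixture inputs and asserts on the result. A sketch using plain `str.format` so it runs without Jinja2; the template and assertions are illustrative:

```python
QA_TEMPLATE = "Context:\n{context}\n\nQuestion: {question}\nAnswer:"

def test_qa_prompt_includes_inputs():
    rendered = QA_TEMPLATE.format(
        context="Python was created by Guido van Rossum.",
        question="Who created Python?",
    )
    # Regression guards: inputs are present, prompt ends at the answer cue
    assert "Guido van Rossum" in rendered
    assert rendered.strip().endswith("Answer:")

test_qa_prompt_includes_inputs()
```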
Next, we'll explore caching strategies to reduce costs and latency.