Prompt Management

Prompts are the "code" of AI systems. Managing them properly is as important as managing source code. This lesson covers production-grade prompt management.

Why Prompt Management Matters

| Problem | Impact |
| --- | --- |
| Prompts hardcoded in source files | Hard to update without a deployment |
| No version history | Can't roll back bad changes |
| No A/B testing | Can't compare variants to optimize quality |
| Inconsistent formats | Different behaviors across features |

Prompt Template System

import re

from jinja2 import Template

class PromptTemplate:
    def __init__(self, template_str: str, version: str):
        self.template_str = template_str
        self.template = Template(template_str)
        self.version = version
        self.variables = self._extract_variables()

    def render(self, **kwargs) -> str:
        # Validate that all required variables were provided
        missing = self.variables - set(kwargs)
        if missing:
            raise ValueError(f"Missing variables: {missing}")
        return self.template.render(**kwargs)

    def _extract_variables(self) -> set:
        # Extract {{ variable }} names from the raw template string
        # (compiled Jinja2 templates don't retain their source text)
        return set(re.findall(r'\{\{\s*(\w+)\s*\}\}', self.template_str))

# Usage
qa_prompt = PromptTemplate(
    template_str="""
You are a helpful assistant. Answer the question based on the context.

Context:
{{ context }}

Question: {{ question }}

Answer:""",
    version="1.2.0"
)

rendered = qa_prompt.render(
    context="Python was created by Guido van Rossum.",
    question="Who created Python?"
)
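
Because render() validates inputs up front, a call with a missing variable fails fast instead of silently producing a malformed prompt:

# Raises ValueError: Missing variables: {'context'}
qa_prompt.render(question="Who created Python?")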

Prompt Registry

Centralized storage for all prompts:

from datetime import datetime, timezone

class PromptRegistry:
    def __init__(self, storage_backend):
        self.storage = storage_backend
        self.cache = {}

    async def get(self, name: str, version: str = "latest") -> PromptTemplate:
        cache_key = f"{name}:{version}"
        if cache_key in self.cache:
            # Note: caching "latest" means new saves aren't picked up until
            # the cache is cleared; production registries add a TTL
            return self.cache[cache_key]

        prompt_data = await self.storage.fetch(name, version)
        template = PromptTemplate(
            template_str=prompt_data["template"],
            version=prompt_data["version"]
        )
        self.cache[cache_key] = template
        return template

    async def save(self, name: str, template: str, metadata: dict):
        version = self._generate_version()
        await self.storage.save(name, {
            "template": template,
            "version": version,
            "metadata": metadata,
            "created_at": datetime.now(timezone.utc)
        })

    def _generate_version(self) -> str:
        # Timestamp-based versions are monotonic; semantic versions
        # work too if you bump them explicitly on each save
        return datetime.now(timezone.utc).strftime("%Y%m%d.%H%M%S")

# Initialize with a database backend exposing fetch() and save()
registry = PromptRegistry(PostgresStorage())
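
The registry only assumes a backend with async fetch and save methods; PostgresStorage above is a stand-in. A minimal in-memory backend satisfying the same interface (a sketch; the class name is hypothetical, useful for tests and local development) could look like:

class InMemoryStorage:
    def __init__(self):
        self.prompts = {}  # {name: {version: prompt_data}}

    async def fetch(self, name: str, version: str) -> dict:
        versions = self.prompts[name]
        if version == "latest":
            # Resolve "latest" by creation time rather than
            # string-sorting version numbers
            return max(versions.values(), key=lambda d: d["created_at"])
        return versions[version]

    async def save(self, name: str, data: dict):
        self.prompts.setdefault(name, {})[data["version"]] = data

registry = PromptRegistry(InMemoryStorage())  # e.g., in a test suite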

Version Control for Prompts

# prompts/qa_assistant.yaml
name: qa_assistant
versions:
  - version: "1.0.0"
    date: "2024-01-15"
    template: |
      Answer the question: {{ question }}
    notes: "Initial version"

  - version: "1.1.0"
    date: "2024-02-01"
    template: |
      You are a helpful assistant.
      Answer the question: {{ question }}
    notes: "Added system context"

  - version: "1.2.0"
    date: "2024-03-10"
    template: |
      You are a helpful assistant. Answer based on the context provided.
      Context: {{ context }}
      Question: {{ question }}
    notes: "Added RAG context support"

active_version: "1.2.0"
rollback_version: "1.1.0"
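
A small loader can resolve active_version (or rollback_version) from a file like this into a PromptTemplate. A sketch, assuming PyYAML and the layout above:

import yaml

def load_prompt(path: str, use_rollback: bool = False) -> PromptTemplate:
    # Load the versioned prompt file and pick the active (or rollback) version
    with open(path) as f:
        spec = yaml.safe_load(f)
    wanted = spec["rollback_version"] if use_rollback else spec["active_version"]
    for entry in spec["versions"]:
        if entry["version"] == wanted:
            return PromptTemplate(template_str=entry["template"], version=wanted)
    raise KeyError(f"Version {wanted} not found in {path}")

qa_prompt = load_prompt("prompts/qa_assistant.yaml")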

A/B Testing Prompts

import hashlib

class PromptExperiment:
    def __init__(self, name: str, variants: dict):
        self.name = name
        self.variants = variants  # {"control": 50, "variant_a": 50}

    def select_variant(self, user_id: str) -> str:
        # Deterministic bucketing: a stable hash keeps each user on the same
        # variant across requests (built-in hash() is salted per process,
        # so it would reshuffle users on every restart)
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        hash_val = int(digest, 16) % 100

        cumulative = 0
        for variant, percentage in self.variants.items():
            cumulative += percentage
            if hash_val < cumulative:
                return variant
        # Fallback if percentages sum to less than 100
        return list(self.variants.keys())[0]

# Usage
experiment = PromptExperiment(
    name="system_prompt_test",
    variants={
        "control": 50,      # Current prompt
        "concise": 25,      # Shorter prompt
        "detailed": 25      # More detailed prompt
    }
)

variant = experiment.select_variant(user_id="user_123")
prompt = await registry.get(f"qa_assistant_{variant}")
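
Bucketing users is only half the experiment: to pick a winner, record which variant served each request along with a quality signal. A minimal tracker (a sketch; the score can be whatever metric you monitor, such as thumbs-up rate or an eval score):

from collections import defaultdict

class ExperimentResults:
    def __init__(self):
        self.scores = defaultdict(list)  # variant -> list of outcome scores

    def record(self, variant: str, score: float):
        self.scores[variant].append(score)

    def summary(self) -> dict:
        # Mean outcome per variant; real systems add significance testing
        return {v: sum(s) / len(s) for v, s in self.scores.items() if s}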

Prompt Composition

Build complex prompts from reusable components:

class PromptComposer:
    def __init__(self, registry: PromptRegistry):
        self.registry = registry

    async def compose(self, components: list, variables: dict) -> str:
        """Compose prompt from multiple components."""
        parts = []
        for component in components:
            template = await self.registry.get(component)
            # Only include variables relevant to this component
            relevant_vars = {
                k: v for k, v in variables.items()
                if k in template.variables
            }
            parts.append(template.render(**relevant_vars))
        return "\n\n".join(parts)

# Usage
composer = PromptComposer(registry)
full_prompt = await composer.compose(
    components=["system_context", "user_history", "current_query"],
    variables={
        "role": "assistant",
        "history": previous_messages,
        "query": user_input
    }
)

Best Practices

| Practice | Benefit |
| --- | --- |
| Store prompts externally | Update without a code deployment |
| Version all changes | Audit trail and rollback capability |
| Test before deploying | Catch regressions early |
| Monitor performance metrics | Track quality over time |
| Document prompt intent | Help the team understand design decisions |
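
As an example of "test before deploying", a unit test can assert that a prompt still renders the sections downstream code depends on. A sketch using pytest conventions and the PromptTemplate class defined above:

def test_qa_prompt_renders_expected_sections():
    prompt = PromptTemplate(
        template_str="Context: {{ context }}\nQuestion: {{ question }}",
        version="test"
    )
    rendered = prompt.render(context="ctx", question="q?")
    assert "Context: ctx" in rendered
    assert "Question: q?" in rendered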

Next, we'll explore caching strategies to reduce costs and latency.
