Choosing the Right Guardrails Stack

With many guardrails tools available—NeMo Guardrails, Guardrails AI, LlamaGuard, ShieldGemma, Presidio—how do you choose the right combination? This lesson provides a decision framework based on your requirements.

Tool Comparison Matrix

Tool	Type	Latency	Accuracy	Customization	Self-Hosted
NeMo Guardrails	Flow control + LLM	200-500ms	High	Very high (Colang)	Yes
Guardrails AI	Schema validation	10-50ms	Variable	High (Pydantic)	Yes
LlamaGuard 3 8B	Safety classifier	100-300ms	High	Medium	Yes
ShieldGemma 27B	Safety classifier	300-800ms	Highest	Low	Yes
Presidio	PII detection	20-50ms	High	High	Yes
OpenAI Moderation	Content filter	50-100ms	Good	None	API only

Decision Framework

By Use Case

┌─────────────────────────────────────────────────────────────────────┐
│                     Guardrails Selection Guide                       │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Need structured output validation?                                  │
│  ├── Yes → Guardrails AI (Pydantic schemas)                         │
│  └── No ↓                                                           │
│                                                                      │
│  Need conversation flow control?                                     │
│  ├── Yes → NeMo Guardrails (Colang rules)                           │
│  └── No ↓                                                           │
│                                                                      │
│  Need PII protection?                                                │
│  ├── Yes → Presidio + your choice of safety classifier              │
│  └── No ↓                                                           │
│                                                                      │
│  Need content safety classification?                                 │
│  ├── Highest accuracy → ShieldGemma 27B                             │
│  ├── Fast + accurate → LlamaGuard 3 8B                              │
│  ├── Ultra-fast → LlamaGuard 3 1B or toxic-bert                     │
│  └── Simple API → OpenAI Moderation                                 │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

By Industry Requirements

Industry	Primary Concerns	Recommended Stack
Healthcare	PII, medical accuracy	Presidio + LlamaGuard + NeMo (fact-checking)
Finance	PII, compliance, fraud	Presidio + Guardrails AI (schema) + LlamaGuard
Consumer Apps	Toxicity, speed	toxic-bert → LlamaGuard (escalation)
Enterprise Internal	Data leakage, compliance	Presidio + NeMo Guardrails
Education	Age-appropriate content	ShieldGemma + NeMo (topic control)

Building Your Stack

Example 1: High-Security Enterprise

from typing import List
from dataclasses import dataclass

@dataclass
class EnterpriseStack:
    """High-security guardrails stack for enterprise."""

    layers = [
        # Layer 1: Fast input validation
        ("blocklist", BlocklistFilter()),

        # Layer 2: PII protection (required for enterprise)
        ("presidio", PresidioFilter(
            entities=["PERSON", "EMAIL", "PHONE", "CREDIT_CARD", "SSN"],
            action="mask"
        )),

        # Layer 3: Safety classification
        ("llamaguard", LlamaGuard8B(
            threshold=0.7,
            categories=["violence", "hate", "self_harm"]
        )),

        # Layer 4: Dialog control
        ("nemo", NeMoGuardrails(
            config_path="./config",
            enable_fact_checking=True
        )),

        # Layer 5: Output validation
        ("guardrails_ai", GuardrailsAI(
            schema=ResponseSchema,
            on_fail="reask"
        )),
    ]

Example 2: Consumer Chat Application

@dataclass
class ConsumerStack:
    """Fast, user-friendly guardrails for consumer apps."""

    layers = [
        # Layer 1: Ultra-fast toxicity
        ("toxic_bert", ToxicBertClassifier(
            threshold=0.8,
            escalate_threshold=0.5
        )),

        # Layer 2: Escalation only for uncertain cases
        ("llamaguard_escalation", LlamaGuard1B(
            only_on_escalation=True
        )),

        # Layer 3: Simple output check
        ("output_toxic", ToxicBertClassifier(
            check_output=True
        )),
    ]

    # Total latency target: < 100ms for 90% of requests

Example 3: RAG Application

@dataclass
class RAGStack:
    """Guardrails for Retrieval-Augmented Generation."""

    input_layers = [
        ("blocklist", BlocklistFilter()),
        ("injection", InjectionClassifier()),
    ]

    retrieval_layers = [
        # Check retrieved chunks
        ("chunk_relevance", RelevanceFilter(min_score=0.7)),
        ("chunk_toxicity", ToxicityFilter()),
    ]

    output_layers = [
        ("hallucination", HallucinationDetector(
            compare_to_sources=True
        )),
        ("citation", CitationEnforcer()),
        ("pii", PresidioFilter(action="block")),
    ]

Cost Considerations

Approach	Compute Cost	API Cost	Notes
Self-hosted LlamaGuard	GPU required	None	Best for high volume
OpenAI Moderation API	None	$0.0001/req	Simple, no GPU
ShieldGemma on Cloud	~$0.01/req	None	High accuracy
Hybrid (fast local + API)	Low GPU	Low	Best of both

# Cost-optimized hybrid approach
async def cost_optimized_check(user_input: str):
    # Free local check first
    local_result = await toxic_bert.check(user_input)

    if local_result.confidence > 0.9:
        return local_result  # High confidence = no API call

    # Only escalate uncertain cases to paid API
    return await openai_moderation.check(user_input)

Stack Validation Checklist

Before deploying your guardrails stack:

Coverage: Does the stack address all threat categories?
Latency: Total latency within budget (< 500ms typical)?
Fallbacks: What happens when each component fails?
Monitoring: Can you observe each layer's performance?
Updates: How will you update blocklists and models?
Testing: Do you have adversarial test cases?

Key Takeaway: The best guardrails stack combines complementary tools—fast local filters for obvious cases, accurate classifiers for nuanced decisions, and schema validation for structured outputs.

Next module: Deep dive into input filtering at scale with Presidio and injection detection. :::