Advanced NeMo Guardrails

NeMo Guardrails Architecture


NVIDIA NeMo Guardrails is an open-source framework for adding programmable guardrails to LLM applications. Released in 2023, it provides a declarative approach to controlling LLM behavior using Colang, a domain-specific language for dialog flows.

Core Components

┌─────────────────────────────────────────────────────┐
│                 NeMo Guardrails                      │
├─────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │
│  │  Input      │  │  Dialog     │  │  Output     │  │
│  │  Rails      │  │  Rails      │  │  Rails      │  │
│  └─────────────┘  └─────────────┘  └─────────────┘  │
├─────────────────────────────────────────────────────┤
│  ┌─────────────────────────────────────────────────┐│
│  │            Colang Runtime Engine                 ││
│  └─────────────────────────────────────────────────┘│
├─────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  │
│  │  LLM        │  │  Actions    │  │  Knowledge  │  │
│  │  Provider   │  │  Library    │  │  Base       │  │
│  └─────────────┘  └─────────────┘  └─────────────┘  │
└─────────────────────────────────────────────────────┘

Installation and Setup

pip install nemoguardrails
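
The package also ships a CLI, which is handy for smoke-testing a config directory interactively:

# Chat against a config directory from the terminal (Ctrl+C to exit)
nemoguardrails chat --config=./config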

Project Structure

my_guardrails_app/
├── config/
│   ├── config.yml          # Main configuration
│   ├── rails/
│   │   ├── input.co        # Input rails (Colang)
│   │   ├── output.co       # Output rails
│   │   └── dialog.co       # Dialog flows
│   ├── prompts/
│   │   └── prompts.yml     # Custom prompts
│   └── actions/
│       └── custom.py       # Python actions
└── main.py

Basic Configuration

# config.yml
models:
  - type: main
    engine: openai
    model: gpt-4o

rails:
  input:
    flows:
      - self check input                 # Built-in input validation
      - jailbreak detection heuristics   # Jailbreak detection

  output:
    flows:
      - self check output                # Built-in output validation
      - self check hallucination         # Hallucination detection

  dialog:
    user_messages:
      embeddings_only: true   # Use embeddings for matching
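
For quick experiments, a config can also be loaded inline rather than from a directory, using RailsConfig.from_content:

from nemoguardrails import RailsConfig

# Same YAML as above, passed as a string instead of a config/ directory
config = RailsConfig.from_content(yaml_content="""
models:
  - type: main
    engine: openai
    model: gpt-4o
""")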

Basic Implementation

from nemoguardrails import LLMRails, RailsConfig

# Load configuration
config = RailsConfig.from_path("./config")

# Create rails instance
rails = LLMRails(config)

# Generate response with guardrails
async def chat(user_message: str) -> str:
    response = await rails.generate_async(
        messages=[{
            "role": "user",
            "content": user_message
        }]
    )
    return response["content"]

# Synchronous usage
def chat_sync(user_message: str) -> str:
    response = rails.generate(
        messages=[{
            "role": "user",
            "content": user_message
        }]
    )
    return response["content"]
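
A minimal sketch of driving these helpers from a script:

import asyncio

if __name__ == "__main__":
    # Synchronous call
    print(chat_sync("What can you help me with?"))

    # Async call via an event loop
    print(asyncio.run(chat("Tell me about NeMo Guardrails")))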

Rail Types

Input Rails

Process and validate user input before it reaches the LLM:

# input.co
define flow self check input
  $allowed = execute check_input_safety(user_message=$user_message)

  if not $allowed
    bot refuse to respond
    stop

Output Rails

Validate and modify LLM responses:

# output.co
define flow self check output
  $safe = execute check_output_safety(bot_message=$bot_message)

  if not $safe
    bot apologize and provide safe response
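
The check_output_safety action referenced above is a custom action you register yourself (see Action Registration below). A minimal sketch, with a simple keyword screen standing in for a real moderation model:

from nemoguardrails.actions import action

# Illustrative blocklist only; replace with a proper moderation model
BLOCKED_TERMS = {"social security number", "credit card number"}

@action(name="check_output_safety")
async def check_output_safety(bot_message: str) -> bool:
    """Return True when the bot message is safe to send."""
    lowered = bot_message.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)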

Dialog Rails

Control conversation flow:

# dialog.co
define user ask about pricing
  "How much does it cost?"
  "What's the price?"
  "Pricing information"

define flow handle pricing question
  user ask about pricing
  bot provide pricing info
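
If the exact wording matters, define the bot message too; otherwise the LLM generates it from the canonical form:

define bot provide pricing info
  "You can find our current plans on the pricing page."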

Built-in Safety Rails

NeMo Guardrails ships with pre-built rails for common safety patterns:

# config.yml
rails:
  input:
    flows:
      - self check input
      - jailbreak detection heuristics
      - mask sensitive data on input    # PII masking

  output:
    flows:
      - self check output
      - self check facts                # Fact verification
      - self check hallucination        # Hallucination detection
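
The self check flows rely on prompt templates defined in prompts.yml (see the project structure above). A minimal sketch using the standard self_check_input and self_check_output tasks:

# prompts.yml
prompts:
  - task: self_check_input
    content: |
      Your task is to check if the user message below complies with company policy.

      User message: "{{ user_input }}"

      Question: Should the user message be blocked (Yes or No)?
      Answer:

  - task: self_check_output
    content: |
      Your task is to check if the bot message below complies with company policy.

      Bot message: "{{ bot_response }}"

      Question: Should the bot message be blocked (Yes or No)?
      Answer: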

Action Registration

from nemoguardrails.actions import action

@action(name="check_input_safety")
async def check_input_safety(user_message: str) -> bool:
    """Custom input safety check."""
    # Integrate with your safety classifier;
    # FastToxicityClassifier is a placeholder for your own model
    classifier = FastToxicityClassifier()
    result = classifier.classify(user_message)
    return result["toxic"] < 0.5  # allow only low-toxicity input

@action(name="get_user_context")
async def get_user_context(user_id: str) -> dict:
    """Retrieve user context for personalization."""
    # Fetch from your database in a real implementation
    return {"tier": "premium", "region": "us"}

# Register with the rails instance
rails.register_action(check_input_safety)
rails.register_action(get_user_context)
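
Registered actions can then be invoked from Colang flows. A hypothetical flow using get_user_context, assuming $user_id has been placed in the conversation context:

define flow personalize greeting
  user express greeting
  $context = execute get_user_context(user_id=$user_id)

  if $context["tier"] == "premium"
    bot greet premium user
  else
    bot greet user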

Streaming Support

from nemoguardrails import LLMRails

rails = LLMRails(config)

async def stream_response(user_message: str):
    """Stream response tokens with guardrails."""
    # stream_async yields plain text chunks (strings);
    # streaming must be enabled in config.yml (see below)
    async for chunk in rails.stream_async(
        messages=[{"role": "user", "content": user_message}]
    ):
        yield chunk

# Usage (inside an async function)
async for token in stream_response("Tell me about Python"):
    print(token, end="", flush=True)

Performance Configuration

# config.yml
models:
  - type: main
    engine: openai
    model: gpt-4o

  # Faster model dedicated to the input self check task
  - type: self_check_input
    engine: openai
    model: gpt-4o-mini

# Cache embeddings used for dialog rails matching
# (filesystem, in-memory, and redis stores are supported)
core:
  embedding_search_provider:
    cache:
      enabled: true

# Token streaming (pairs with stream_async above)
streaming: true

# Apply output rails to streamed chunks
rails:
  output:
    streaming:
      enabled: true
      chunk_size: 20    # tokens per chunk checked by output rails

Architecture Insight: every self check rail adds an extra LLM call per message. By default those calls go to the main model; pointing the check tasks at a smaller, faster model (as in the configuration above) minimizes the latency overhead.

Next: Mastering Colang 2.0 for custom dialog flows.

Take Quiz