Introduction to AI Red Teaming
Setting Scope & Rules of Engagement
Before launching any red team exercise, you must define clear boundaries. Testing without authorization is illegal and unethical. This lesson covers the essential documentation every red team engagement requires.
The Rules of Engagement Document
Every professional red team exercise requires a signed Rules of Engagement (RoE):
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List, Optional


class TestingScope(Enum):
    IN_SCOPE = "in_scope"
    OUT_OF_SCOPE = "out_of_scope"
    REQUIRES_APPROVAL = "requires_approval"


@dataclass
class RulesOfEngagement:
    """
    Formal agreement defining red team boundaries.
    Must be signed before any testing begins.
    """
    project_name: str
    client_name: str
    start_date: datetime
    end_date: datetime
    authorized_testers: List[str]
    emergency_contacts: List[str]

    # Scope definitions
    in_scope_systems: List[str] = field(default_factory=list)
    out_of_scope_systems: List[str] = field(default_factory=list)
    allowed_techniques: List[str] = field(default_factory=list)
    prohibited_techniques: List[str] = field(default_factory=list)

    # Legal
    authorization_signature: Optional[str] = None
    legal_review_date: Optional[datetime] = None

    def is_authorized(self) -> bool:
        """Verify engagement is properly authorized."""
        now = datetime.now()
        return (
            self.authorization_signature is not None
            and self.legal_review_date is not None
            and self.start_date <= now <= self.end_date
        )

    def check_scope(self, system: str) -> TestingScope:
        """Check if a system is in scope."""
        # Out-of-scope takes precedence if a system appears in both lists.
        if system in self.out_of_scope_systems:
            return TestingScope.OUT_OF_SCOPE
        if system in self.in_scope_systems:
            return TestingScope.IN_SCOPE
        # Anything not explicitly listed needs sign-off before testing.
        return TestingScope.REQUIRES_APPROVAL
Essential Scope Elements
| Element | Description | Example |
|---|---|---|
| Target Systems | Which LLM applications to test | "customer-support-bot-v2" |
| Environment | Production, staging, or test | "staging-only" |
| Time Window | When testing is permitted | "Weekdays 9AM-5PM EST" |
| Data Handling | How findings and any exposed data are handled | "Report within 24 hours" |
| Excluded Actions | What you must NOT do | "No DoS, no data exfiltration" |
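A time window such as the "Weekdays 9AM-5PM EST" example can be enforced in code instead of relying on testers to remember it. The following is a minimal sketch, not part of the lesson's `RulesOfEngagement` class. The names `EST`, `WINDOW_START`, `WINDOW_END`, and `in_testing_window` are hypothetical, and it assumes the window is a fixed UTC-5 offset with no daylight saving adjustment and an exclusive end time.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: "EST" is treated as a fixed UTC-5 offset (no daylight saving).
EST = timezone(timedelta(hours=-5), name="EST")
WINDOW_START = time(9, 0)   # 9AM, inclusive
WINDOW_END = time(17, 0)    # 5PM, exclusive (assumed)


def in_testing_window(moment: datetime) -> bool:
    """Return True if `moment` is a weekday between 09:00 and 17:00 EST."""
    if moment.tzinfo is None:
        # Naive datetimes are ambiguous, so refuse to guess the timezone.
        raise ValueError("moment must be timezone-aware")
    local = moment.astimezone(EST)
    is_weekday = local.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and WINDOW_START <= local.time() < WINDOW_END
```

A test harness could call `in_testing_window(datetime.now(timezone.utc))` before each request and stop as soon as it returns `False`.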
Pre-Engagement Checklist
@dataclass
class PreEngagementChecklist:
    """
    Verify all requirements before starting.
    """
    items: dict = field(default_factory=lambda: {
        "written_authorization": False,
        "scope_documented": False,
        "emergency_contacts_confirmed": False,
        "legal_review_complete": False,
        "insurance_verified": False,
        "communication_channels_established": False,
        "backup_restoration_plan": False,
    })

    def is_ready(self) -> bool:
        return all(self.items.values())

    def get_blockers(self) -> List[str]:
        return [item for item, complete in self.items.items() if not complete]


# Before any test
checklist = PreEngagementChecklist()
checklist.items["written_authorization"] = True
checklist.items["scope_documented"] = True
# ... complete all items

if not checklist.is_ready():
    blockers = checklist.get_blockers()
    print(f"Cannot proceed. Missing: {blockers}")
Legal Considerations
Warning: Unauthorized testing of AI systems can violate computer fraud laws in many jurisdictions. Always obtain explicit written permission before testing.
Key legal points:
- Written authorization from the system owner is mandatory
- Third-party systems (APIs, models) may have separate terms of service (ToS) that restrict testing
- Data protection laws apply to any exposed PII
- Document everything for legal protection
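One way to act on "document everything" is an append-only log with one record per test action. The sketch below assumes a JSON Lines file. The function name `log_test_action` and the record fields are hypothetical choices for illustration, not a required format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_test_action(log_path: Path, tester: str, target: str,
                    technique: str, outcome: str) -> dict:
    """Append one JSON line describing a test action and return the record."""
    record = {
        # UTC timestamps avoid ambiguity when testers work across timezones.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tester": tester,
        "target": target,
        "technique": technique,
        "outcome": outcome,
    }
    # Mode "a" only ever appends, so earlier entries are never rewritten.
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record
```

Each line is a standalone JSON object, so the log can be read line by line when compiling the final report.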
Sample Scope Statement
# Example: Well-defined scope for LLM red team
roe = RulesOfEngagement(
    project_name="Q1 2025 Chatbot Security Assessment",
    client_name="Acme Corp",
    start_date=datetime(2025, 1, 15),
    end_date=datetime(2025, 1, 31),
    authorized_testers=["alice@security.com", "bob@security.com"],
    emergency_contacts=["security@acme.com", "+1-555-0123"],
    in_scope_systems=[
        "chatbot.acme.com",
        "api.acme.com/v2/chat",
        "internal-assistant.acme.local",
    ],
    out_of_scope_systems=[
        "production-database",
        "payment-processing",
        "third-party-apis",
    ],
    allowed_techniques=[
        "prompt_injection",
        "jailbreak_attempts",
        "context_manipulation",
        "output_analysis",
    ],
    prohibited_techniques=[
        "denial_of_service",
        "data_exfiltration",
        "lateral_movement",
        "social_engineering_employees",
    ],
    authorization_signature="Jane Smith, CISO",
    legal_review_date=datetime(2025, 1, 10),
)

# True only while the current time falls between start_date and end_date.
if roe.is_authorized():
    print("Engagement authorized. Proceed with testing.")
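The sample scope lists `allowed_techniques` and `prohibited_techniques`, but the class as shown has no method that consults them. A minimal sketch of such a check, mirroring the precedence used by `check_scope`, might look like the following. The function `check_technique` and its string return values are hypothetical and not part of the lesson's class.

```python
from typing import List


def check_technique(technique: str, allowed: List[str],
                    prohibited: List[str]) -> str:
    """Classify a technique against the allowed and prohibited lists."""
    # Prohibited wins, even if a technique was mistakenly listed in both.
    if technique in prohibited:
        return "prohibited"
    if technique in allowed:
        return "allowed"
    # Unlisted techniques are not assumed safe; they need explicit sign-off.
    return "requires_approval"
```

With the sample lists, `check_technique("denial_of_service", ...)` would be classified as prohibited, while an unlisted technique would be flagged for approval, not silently permitted.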
Next, we'll explore how red teams and blue teams work together effectively.