Lesson 21 of 23

Interview Case Studies

Case Study: AI Customer Support

3 min read

Let's walk through a complete system design interview for an AI customer support agent. This demonstrates how to apply the RADIO framework (Requirements, Architecture, Data, Infrastructure, Operations) to a real problem.

The Interview Question

"Design an AI-powered customer support system for an e-commerce company that handles 100,000 support tickets per day. The system should resolve simple issues automatically while escalating complex ones to human agents."

Step 1: Requirements (R)

Clarifying questions to ask:

  • What types of tickets? (orders, returns, account issues, product questions)
  • What's the target automation rate? (assume 70%)
  • What languages? (English + Spanish initially)
  • SLA requirements? (first response < 30 seconds)
  • Budget constraints? (assume $50k/month for AI)

Functional Requirements:

  • Classify incoming tickets by type and urgency
  • Auto-resolve common issues (order status, return policy)
  • Escalate complex issues with context summary
  • Support multi-turn conversations
  • Learn from human agent resolutions

Non-Functional Requirements:

  • 99.9% uptime
  • < 2 second response time
  • Handle 100k tickets/day (~1.2 tickets/second average, 5x peak)
  • Cost under $0.50 per resolved ticket
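
Before sketching the architecture, it helps to turn these requirements into rough numbers. The sketch below is a quick back-of-envelope pass; the automation rate, peak factor, and budget are the assumed values from the clarifying questions.

TICKETS_PER_DAY = 100_000
AUTOMATION_RATE = 0.70       # assumed target from the clarifying questions
PEAK_FACTOR = 5              # assumed peak-to-average traffic ratio
MONTHLY_AI_BUDGET = 50_000   # dollars, from the clarifying questions

avg_tps = TICKETS_PER_DAY / 86_400        # ~1.2 tickets/second on average
peak_tps = avg_tps * PEAK_FACTOR          # ~5.8 tickets/second at peak

auto_resolved = int(TICKETS_PER_DAY * AUTOMATION_RATE)   # 70,000 tickets/day
escalated = TICKETS_PER_DAY - auto_resolved              # 30,000 tickets/day

daily_ai_budget = MONTHLY_AI_BUDGET / 30                 # ~$1,667/day of AI spend

That ~$1,667/day ceiling is the number the cost estimate in Step 4 has to fit under.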

Step 2: Architecture (A)

┌─────────────────────────────────────────────────────────────────────┐
│                        Customer Channels                             │
│              (Chat Widget, Email, Mobile App)                        │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│                      API Gateway + Rate Limiting                     │
└─────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│                      Ticket Router Service                           │
│  - Intent Classification (fine-tuned classifier)                     │
│  - Priority Assignment                                               │
│  - Language Detection                                                │
└─────────────────────────────────────────────────────────────────────┘
                    │                              │
          (Automated)│                    (Complex)│
                    ▼                              ▼
┌─────────────────────────┐       ┌────────────────────────────────────┐
│    Auto-Resolution      │       │         Human Escalation           │
│        Agent            │       │                                    │
│                         │       │  - Context Summary Generator       │
│  - RAG for policies     │       │  - Suggested Responses             │
│  - Order lookup tool    │       │  - Human Agent Queue               │
│  - Return initiation    │       │                                    │
└─────────────────────────┘       └────────────────────────────────────┘
                    │                              │
                    └──────────────┬───────────────┘
┌─────────────────────────────────────────────────────────────────────┐
│                      Response Delivery                               │
│                    + Feedback Collection                             │
└─────────────────────────────────────────────────────────────────────┘
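
The router's core decision can be sketched in a few lines of Python. This is a minimal sketch, not a prescribed implementation; detect_language, classify_intent, and the confidence threshold are illustrative assumptions.

AUTO_RESOLVABLE_INTENTS = {"order_status", "return_policy", "shipping_info"}
CONFIDENCE_THRESHOLD = 0.85  # assumed cutoff below which we escalate

def route_ticket(ticket: dict) -> str:
    """Decide whether a ticket goes to the auto-resolution agent or a human."""
    language = detect_language(ticket["text"])             # hypothetical helper
    intent, confidence = classify_intent(ticket["text"])   # hypothetical fine-tuned classifier

    if language not in ("en", "es"):
        return "human_escalation"   # outside the supported languages
    if intent in AUTO_RESOLVABLE_INTENTS and confidence >= CONFIDENCE_THRESHOLD:
        return "auto_resolution"
    return "human_escalation"       # low confidence or complex intent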

Step 3: Data (D)

Knowledge Base (RAG):

knowledge_sources = {
    "policies": {
        "source": "internal_docs",
        "update_frequency": "daily",
        "chunks": 5000,
        "examples": ["return policy", "shipping times", "warranty"]
    },
    "product_catalog": {
        "source": "product_db",
        "update_frequency": "real-time",
        "chunks": 50000,
        "examples": ["product specs", "compatibility", "availability"]
    },
    "past_resolutions": {
        "source": "ticket_history",
        "update_frequency": "weekly",
        "chunks": 100000,
        "examples": ["similar ticket resolutions", "agent responses"]
    }
}
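
Each source feeds an ingestion job on its own update schedule. A possible shape for that job, assuming a generic embed() call, a split_into_chunks helper, and a vector_db client rather than any specific provider:

def ingest_source(name: str, config: dict, documents: list[str]) -> None:
    """Chunk, embed, and index one knowledge source."""
    for doc_id, text in enumerate(documents):
        for chunk_id, chunk in enumerate(split_into_chunks(text, max_tokens=512)):  # hypothetical splitter
            vector = embed(chunk)                  # hypothetical embedding call
            vector_db.upsert(                      # hypothetical vector store client
                id=f"{name}-{doc_id}-{chunk_id}",
                vector=vector,
                metadata={"source": name, "text": chunk},
            )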

Tools for Agent:

tools = [
    {
        "name": "lookup_order",
        "description": "Get order status, tracking, items",
        "requires": ["order_id or email"]
    },
    {
        "name": "initiate_return",
        "description": "Start return process for eligible items",
        "requires": ["order_id", "item_id", "reason"],
        "side_effects": True
    },
    {
        "name": "search_knowledge_base",
        "description": "Search policies and product info",
        "requires": ["query"]
    },
    {
        "name": "escalate_to_human",
        "description": "Transfer to human with context",
        "requires": ["reason", "urgency"]
    }
]
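
Tool execution should gate side-effecting calls behind an extra check before anything irreversible happens. A sketch, assuming the agent emits calls shaped like {"name": ..., "args": {...}} and that policy_allows and TOOL_REGISTRY exist:

def execute_tool(call: dict, ticket: dict) -> dict:
    """Run a tool the agent requested, gating side-effecting actions."""
    spec = next(t for t in tools if t["name"] == call["name"])

    # Side-effecting tools (e.g. initiate_return) get a policy check first.
    if spec.get("side_effects") and not policy_allows(call, ticket):   # hypothetical policy check
        call = {"name": "escalate_to_human",
                "args": {"reason": "blocked side-effecting action", "urgency": "normal"}}

    return TOOL_REGISTRY[call["name"]](**call["args"])   # hypothetical name -> function map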

Step 4: Infrastructure (I)

Scaling Strategy:

scaling_config = {
    "ticket_router": {
        "type": "kubernetes_deployment",
        "min_replicas": 3,
        "max_replicas": 20,
        "scale_metric": "requests_per_second",
        "target": 100  # requests per pod
    },
    "auto_resolution_agent": {
        "type": "kubernetes_deployment",
        "min_replicas": 5,
        "max_replicas": 50,
        "scale_metric": "queue_depth",
        "target": 10  # tickets per pod
    },
    "vector_database": {
        "type": "managed_pinecone",
        "pods": 3,
        "replicas": 2
    },
    "llm_api": {
        "primary": "gpt-4",
        "fallback": "gpt-3.5-turbo",
        "rate_limit": 10000  # requests per minute
    }
}
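
The primary/fallback model split could be wired up as shown below; call_model and RateLimitError are placeholders for whatever client library you use.

def call_llm(messages: list[dict]) -> str:
    """Try the primary model, fall back to the cheaper one on rate limits or timeouts."""
    try:
        return call_model("gpt-4", messages, timeout=10)         # hypothetical client call
    except (RateLimitError, TimeoutError):                       # placeholder exception types
        return call_model("gpt-3.5-turbo", messages, timeout=10)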

Cost Estimation:

cost_breakdown = {
    "llm_costs": {
        "auto_resolve": "70k tickets × $0.10 = $7,000/day",
        "summarization": "30k tickets × $0.05 = $1,500/day",
        "daily_total": "$8,500",
        "monthly_total": "$255,000"  # Over budget!
    },
    "optimization_needed": {
        "caching": "Cache common responses → 30% reduction",
        "smaller_model": "Use GPT-3.5 for classification → 50% reduction",
        "optimized_budget": "$45,000/month"
    }
}
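
The caching optimization can start as an exact-match lookup keyed on the normalized question; the sketch below assumes a redis_client and that cached answers have already been approved.

import hashlib

def _cache_key(question: str) -> str:
    normalized = " ".join(question.lower().split())
    return "resp:" + hashlib.sha256(normalized.encode()).hexdigest()

def cached_response(question: str) -> str | None:
    """Return a previously approved answer for a common question, if one exists."""
    return redis_client.get(_cache_key(question))   # hypothetical Redis client; None on miss

def store_response(question: str, answer: str, ttl_seconds: int = 86_400) -> None:
    redis_client.set(_cache_key(question), answer, ex=ttl_seconds)

Exact matching only catches literally repeated questions; a semantic cache keyed on embeddings would also catch paraphrases, at the cost of an extra lookup.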

Step 5: Operations (O)

Key Metrics:

metrics_to_track = {
    "automation_rate": {
        "target": 0.70,
        "alert_threshold": 0.60
    },
    "customer_satisfaction": {
        "target": 4.5,  # out of 5
        "alert_threshold": 4.0
    },
    "resolution_time": {
        "automated_target": "30 seconds",
        "escalated_target": "4 hours"
    },
    "escalation_accuracy": {
        "target": 0.95,  # % of escalations that needed human
        "false_escalation_cost": "$5 per ticket"
    }
}
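
A monitoring job can compare live values against these thresholds; get_metric_value and send_alert stand in for whatever observability stack you use.

def check_metrics() -> None:
    """Alert whenever a tracked metric drops below its threshold."""
    for name, spec in metrics_to_track.items():
        threshold = spec.get("alert_threshold")
        if threshold is None:
            continue   # resolution_time and escalation_accuracy define targets but no alert threshold here
        value = get_metric_value(name)     # hypothetical metrics query
        if value < threshold:
            send_alert(f"{name} dropped to {value:.2f} (threshold {threshold})")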

Safety Guardrails:

guardrails = {
    "prohibited_actions": [
        "Refund over $500 without approval",
        "Access payment information",
        "Make promises about delivery dates"
    ],
    "required_escalation": [
        "Customer mentions legal action",
        "Sentiment is very negative (score < 0.2)",
        "Issue involves safety concerns"
    ],
    "human_review_queue": [
        "First resolution for new issue types",
        "Random 5% sample for quality"
    ]
}
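
In code, the required-escalation rules run before any automated reply or action goes out. A sketch, with mentions_legal_action and sentiment_score as illustrative helpers:

def must_escalate(ticket: dict, proposed_action: dict) -> bool:
    """Return True when guardrails require a human instead of an automated action."""
    if mentions_legal_action(ticket["text"]):      # hypothetical keyword/classifier check
        return True
    if sentiment_score(ticket["text"]) < 0.2:      # very negative sentiment
        return True
    if proposed_action.get("type") == "refund" and proposed_action.get("amount", 0) > 500:
        return True                                # refunds over $500 need approval
    return False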

Trade-offs Discussed

Decision          Option A   Option B    Choice   Reason
Model             GPT-4      GPT-3.5     Hybrid   GPT-3.5 for routing, GPT-4 for resolution
RAG vs Fine-tune  RAG        Fine-tune   RAG      Policies change frequently
Sync vs Async     Sync       Queue       Sync     User expects immediate response
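
The hybrid model decision boils down to a per-task selection rule; the task names here are illustrative.

def pick_model(task: str) -> str:
    """Cheap model for high-volume routing tasks, stronger model for customer-facing resolution."""
    if task in ("intent_classification", "language_detection", "priority_assignment"):
        return "gpt-3.5-turbo"
    return "gpt-4"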

Interview Tip

This case study demonstrates:

  1. Structured approach - RADIO framework keeps you organized
  2. Trade-off analysis - Show you understand constraints
  3. Cost awareness - Initial estimate was over budget, we optimized
  4. Safety first - Guardrails for automated actions

Next, let's explore a code review agent case study.
