Complete Design Problems & Interview Mastery
This final module brings everything together. We walk through three complete system design problems — YouTube, Uber, and Google Docs — applying the RESHADED framework, estimation skills, and architectural patterns from the previous modules. We then cover security patterns and AI-era questions, and close with interview communication mastery.
Design Problem 1: Design YouTube
A video streaming platform is among the most frequently asked system design questions. It tests your understanding of media processing, CDN delivery, and recommendation at scale.
Requirements
| Type | Requirement |
|---|---|
| Functional | Upload videos, stream videos, search videos, show recommendations |
| Non-functional | Low latency playback (<200ms start), high availability (99.99%), global reach |
| Scale | 2B monthly active users, 500 hours of video uploaded per minute, 1B hours watched daily |
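The scale row above is worth turning into concrete storage numbers before drawing the architecture. A quick back-of-envelope sketch, assuming roughly 1 GB per hour of uploaded video and a 5x blow-up after transcoding to multiple renditions (both figures are assumptions, not from the requirements):

```python
# Back-of-envelope storage estimate from the scale numbers above.
# Assumptions: ~1 GB per raw uploaded hour, ~5x total size after
# transcoding to multiple resolutions and codecs.
RAW_GB_PER_HOUR = 1
TRANSCODE_MULTIPLIER = 5

hours_per_day = 500 * 60 * 24                     # 500 hours uploaded per minute
raw_tb_per_day = hours_per_day * RAW_GB_PER_HOUR / 1000
total_tb_per_day = raw_tb_per_day * TRANSCODE_MULTIPLIER

print(f"Uploads: {hours_per_day:,} hours/day")            # 720,000 hours/day
print(f"Raw storage: ~{raw_tb_per_day:,.0f} TB/day")      # ~720 TB/day
print(f"With renditions: ~{total_tb_per_day:,.0f} TB/day")# ~3,600 TB/day
```

Roughly 3.6 PB of new storage per day is the kind of number that justifies tiered object storage and aggressive CDN caching in the design that follows.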
High-Level Architecture
Upload Flow:
Client → API Gateway → Upload Service → Object Storage (S3) → Transcoding Pipeline (Kafka → Workers) → CDN Origin → Edge Servers
Playback Flow:
Client → CDN Edge → Origin (on cache miss) → Object Storage
The client first fetches a manifest file (HLS/DASH), then streams adaptive-bitrate segments.
Recommendation:
User Activity → Kafka → Feature Pipeline → ML Model → Recommendation Service
Key Design Decisions
Video Transcoding: Convert uploaded videos into multiple resolutions (240p, 480p, 720p, 1080p, 4K) and codecs (H.264, VP9, AV1). Use adaptive bitrate streaming — the client switches quality based on network conditions. HLS (Apple) and DASH (open standard) are the two dominant protocols.
CDN Strategy: Cache popular videos at edge locations. Use a tiered architecture: edge → regional → origin. The 80/20 rule applies — roughly 20% of videos account for 80% of views. Pre-warm popular content at edges.
View Counting: Eventual consistency is acceptable. Use Kafka to buffer view events, aggregate in batch (every few minutes), and write to a counter service. For real-time approximate counts, use a probabilistic counter (HyperLogLog for unique viewers).
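The buffer-and-flush idea above can be sketched in a few lines. This is an in-memory stand-in: the dict plays the role of the Kafka buffer, the set stands in for a HyperLogLog (a real HLL keeps a fixed ~1.5 KB of state and returns an estimate), and all names are illustrative:

```python
from collections import defaultdict

class ViewCounter:
    """Buffers view events and flushes aggregated counts periodically.
    In-memory stand-in for a Kafka buffer + counter service."""

    def __init__(self):
        self._buffer = defaultdict(int)   # video_id -> pending view count
        self._unique = defaultdict(set)   # stand-in for a HyperLogLog per video
        self.totals = defaultdict(int)    # durable counts, updated on flush

    def record(self, video_id, user_id):
        self._buffer[video_id] += 1
        self._unique[video_id].add(user_id)

    def flush(self):
        """Batch-aggregate pending events into the durable counters."""
        for vid, n in self._buffer.items():
            self.totals[vid] += n
        self._buffer.clear()

    def unique_viewers(self, video_id):
        return len(self._unique[video_id])  # an HLL would return an estimate

counter = ViewCounter()
for user in ("u1", "u2", "u1"):
    counter.record("video42", user)
counter.flush()
print(counter.totals["video42"], counter.unique_viewers("video42"))  # 3 2
```

Between flushes, reads of `totals` lag reality by a few minutes — exactly the eventual consistency the design accepts.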
Recommendation Feed: Collaborative filtering (users who watched X also watched Y) combined with content-based features (video metadata, categories). Pre-compute recommendations offline, serve from a cache layer.
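The co-watch signal at the heart of collaborative filtering ("users who watched X also watched Y") is just pair counting. A minimal sketch with made-up data; production systems compute this offline at far larger scale:

```python
from collections import defaultdict
from itertools import combinations

def co_watch_scores(watch_history):
    """Count how often each pair of videos is watched by the same user."""
    scores = defaultdict(lambda: defaultdict(int))
    for videos in watch_history.values():
        for a, b in combinations(sorted(set(videos)), 2):
            scores[a][b] += 1
            scores[b][a] += 1
    return scores

def recommend(scores, video_id, k=3):
    """Top-k videos most frequently co-watched with video_id."""
    ranked = sorted(scores[video_id].items(), key=lambda kv: -kv[1])
    return [v for v, _ in ranked[:k]]

history = {
    "alice": ["cats", "dogs", "birds"],
    "bob":   ["cats", "dogs"],
    "carol": ["cats", "birds"],
}
scores = co_watch_scores(history)
print(recommend(scores, "cats"))  # "dogs" and "birds" are both co-watched
```

In practice these scores would be blended with content-based features and pre-computed offline, with only the cache lookup happening at serve time.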
Design Problem 2: Design Uber
Ride-sharing tests geospatial systems, real-time matching, and dynamic pricing — patterns rarely covered in other courses.
Requirements
| Type | Requirement |
|---|---|
| Functional | Request rides, match with drivers, track rides in real-time, calculate fares, surge pricing |
| Non-functional | Match within 30 seconds, location updates every 4 seconds, 99.9% availability |
| Scale | 100M monthly riders, 5M active drivers, 20M rides per day |
Key Design Decisions
Geospatial Indexing: Divide the map into cells using geohash (encode latitude/longitude into a string prefix). Nearby locations share a geohash prefix. To find drivers within 2 km, query the current cell and its 8 neighbors.
# Geohash precision vs area
# Precision 4: ~39 km × 20 km (city level)
# Precision 5: ~5 km × 5 km (neighborhood)
# Precision 6: ~1.2 km × 0.6 km (street level)
# Precision 7: ~153 m × 153 m (block level)
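The prefix property above comes from how geohash interleaves longitude and latitude bisection bits. A minimal pure-Python encoder (production code would use a geohash library; the coordinates below are the classic example point in Denmark):

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Interleave longitude/latitude bisection bits, 5 bits per character."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, ch, even = [], 0, 0, True
    while len(chars) < precision:
        if even:                         # even bits bisect longitude
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                ch = (ch << 1) | 1
                lon_lo = mid
            else:
                ch <<= 1
                lon_hi = mid
        else:                            # odd bits bisect latitude
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                ch = (ch << 1) | 1
                lat_lo = mid
            else:
                ch <<= 1
                lat_hi = mid
        even, bits = not even, bits + 1
        if bits == 5:                    # emit one base32 character
            chars.append(BASE32[ch])
            ch, bits = 0, 0
    return "".join(chars)

print(geohash_encode(57.64911, 10.40744, 11))  # u4pruydqqvj
```

Nearby points share a prefix, so "drivers in this cell" becomes a prefix match. Note that a point near a cell edge can have close neighbors with a different prefix, which is exactly why the matching query also checks the 8 neighboring cells.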
Matching Algorithm: When a rider requests, find all available drivers in nearby geohash cells. Rank by: (1) straight-line distance (Haversine formula), (2) estimated time of arrival (ETA), (3) driver rating. Send the request to the top-ranked driver. If declined or timed out (15 seconds), try the next driver.
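The first ranking signal, straight-line distance via the Haversine formula, is short enough to write out. A sketch of the distance function plus a simplified ranker (the driver data is illustrative, and ETA, which would come from a routing service, is omitted):

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def rank_drivers(rider, drivers):
    """Rank candidates by distance, breaking ties by higher rating.
    A real ranker would weigh ETA from a routing service too."""
    return sorted(
        drivers,
        key=lambda d: (haversine_km(rider[0], rider[1], d["lat"], d["lon"]),
                       -d["rating"]),
    )

drivers = [
    {"id": "d1", "lat": 40.010, "lon": -74.0, "rating": 4.9},
    {"id": "d2", "lat": 40.001, "lon": -74.0, "rating": 4.7},
]
print([d["id"] for d in rank_drivers((40.0, -74.0), drivers)])  # ['d2', 'd1']
```

The top-ranked driver gets the request first; on decline or a 15-second timeout, the dispatcher moves down this list.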
Surge Pricing: Calculate supply-demand ratio per geohash zone. When demand exceeds supply by a threshold, apply a multiplier (1.2x to 3.0x, capped). Update surge zones every 2 minutes. Communicate the multiplier to riders before they confirm.
Fare Calculation: fare = (base_fare + distance_km × rate_per_km + duration_min × rate_per_min) × surge_multiplier, with a minimum fare floor applied. Deduct driver commission (typically 20-25%).
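Put together as code, with the surge multiplier applied to the full metered amount (a common simplification; real pricing engines vary). All rate values here are made-up examples:

```python
def calculate_fare(distance_km, duration_min, surge=1.0,
                   base=2.50, per_km=1.20, per_min=0.30,
                   min_fare=5.00, commission=0.25):
    """Illustrative fare math: metered amount, surge, floor, commission."""
    metered = base + distance_km * per_km + duration_min * per_min
    fare = max(metered * surge, min_fare)         # minimum fare floor
    driver_payout = fare * (1 - commission)        # platform keeps 25% here
    return round(fare, 2), round(driver_payout, 2)

print(calculate_fare(10, 20, surge=1.5))  # (30.75, 23.06)
print(calculate_fare(1, 2))               # floor kicks in: (5.0, 3.75)
```

Keeping the fare function pure (no I/O, all inputs explicit) makes it easy to unit-test and to replay for billing disputes.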
Real-Time Tracking: Drivers send location updates every 4 seconds via WebSocket. Store in Redis (geohash → driver set). Riders subscribe to their driver's location channel for live map updates.
Design Problem 3: Design Google Docs
Real-time collaborative editing is increasingly popular in interviews. It tests your knowledge of CRDTs vs OT, consistency, and presence systems.
Requirements
| Type | Requirement |
|---|---|
| Functional | Create/edit documents, real-time collaboration, cursor presence, version history, offline editing |
| Non-functional | <100ms sync latency, no data loss on conflicts, support 100+ concurrent editors |
| Scale | 800M monthly users, 3B documents |
Key Design Decisions
OT vs CRDTs:
| Aspect | OT (Operational Transformation) | CRDTs (Conflict-free Replicated Data Types) |
|---|---|---|
| Used by | Google Docs | Figma (CRDT-inspired multiplayer, described in 2019) |
| Server requirement | Centralized transformation server | Decentralized (peer-to-peer possible) |
| Offline support | Limited (needs server for transforms) | Native (merge on reconnect) |
| Complexity | Transformation functions grow with operation types | Data structure design is complex upfront |
| Consistency | Strong (via server) | Eventual (guaranteed convergence) |
For an interview, recommend CRDTs if the question emphasizes offline editing or peer-to-peer sync. Recommend OT if strong consistency through a central server is acceptable.
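If asked to demonstrate the CRDT convergence guarantee, the grow-only counter (G-Counter) is the simplest example: each replica increments only its own slot, and merge takes per-slot maxima, so replicas converge regardless of merge order. A sketch (document-editing CRDTs like sequence CRDTs are far more involved, but rest on the same merge idea):

```python
class GCounter:
    """Grow-only counter CRDT. Each replica increments its own slot;
    merge takes the element-wise max, so all replicas converge."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica_id -> count

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def merge(self, other):
        """Commutative, associative, idempotent -- safe in any order."""
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self):
        return sum(self.counts.values())

a, b = GCounter("a"), GCounter("b")
a.increment(3)          # offline edits on replica a
b.increment(2)          # concurrent edits on replica b
a.merge(b); b.merge(a)  # exchange state on reconnect
print(a.value(), b.value())  # 5 5 -- converged
```

Merging twice changes nothing (idempotence), which is what makes CRDTs robust to message duplication and arbitrary reconnect order.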
Presence System: Each user's cursor position and selection range are broadcast to other editors via WebSocket. Use a lightweight channel per document. Throttle updates to 10-15 per second to reduce bandwidth. Show user avatars/colors for each active cursor.
Version History: Store document snapshots periodically (every N edits or every M seconds). Use a linked list of versions — each version stores either a full snapshot or a delta from the previous version. Allow users to browse and restore any version.
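The snapshot-plus-delta chain can be sketched with the standard library's `difflib` standing in for a real delta format. The snapshot interval and class names are illustrative:

```python
import difflib

SNAPSHOT_EVERY = 3  # full snapshot every N versions (assumed policy)

class VersionStore:
    """Stores a full snapshot every SNAPSHOT_EVERY versions and a
    line-based delta otherwise; restore replays from the last snapshot."""

    def __init__(self):
        self.versions = []  # entries: ("snapshot", text) or ("delta", diff)

    def save(self, text):
        if len(self.versions) % SNAPSHOT_EVERY == 0:
            self.versions.append(("snapshot", text))
        else:
            prev = self.restore(len(self.versions) - 1)
            diff = list(difflib.ndiff(prev.splitlines(), text.splitlines()))
            self.versions.append(("delta", diff))

    def restore(self, version):
        # Walk back to the nearest snapshot, then replay deltas forward.
        base = version
        while self.versions[base][0] != "snapshot":
            base -= 1
        text = self.versions[base][1]
        for i in range(base + 1, version + 1):
            text = "\n".join(difflib.restore(self.versions[i][1], 2))
        return text

store = VersionStore()
store.save("hello")
store.save("hello\nworld")
store.save("hello\nworld!")
print(store.restore(1))  # hello\nworld
```

The snapshot interval trades storage for restore latency: more frequent snapshots cost space but cap how many deltas any restore must replay.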
Security in System Design
Security comes up in nearly every design round. Here are patterns you should mention proactively:
| Layer | Pattern | When to Use |
|---|---|---|
| Authentication | OAuth2/OIDC for user login, API keys for service-to-service | Any system with users |
| Authorization | RBAC for simple roles, ABAC for fine-grained policies | Multi-tenant systems |
| Transport | TLS 1.3 for external, mTLS for service-to-service | Always |
| Data at rest | AES-256 encryption, envelope encryption with KMS | Systems storing PII, financial data |
| API protection | Rate limiting, input validation, WAF | Any public-facing API |
| Secrets | HashiCorp Vault or AWS Secrets Manager, never in code | All systems |
Interview tip: Even if the interviewer does not ask about security, briefly mention "I'd add rate limiting at the API gateway and mTLS between services." This demonstrates production awareness.
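Rate limiting is worth being able to sketch on demand, since it appears in the table above and in nearly every gateway discussion. A minimal token-bucket limiter (the rate and capacity values are illustrative; production gateways implement this in shared storage like Redis so limits hold across instances):

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: up to `capacity` tokens, refilled at
    `rate` tokens per second; each allowed request consumes one token."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=3)    # burst of 3, 5 req/s sustained
results = [bucket.allow() for _ in range(5)]
print(results)  # first 3 allowed (the burst), then throttled
```

The capacity sets the tolerated burst, while the rate sets the sustained throughput -- a useful distinction to state explicitly in the interview.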
AI-Era System Design Questions
Modern interviews increasingly include questions that combine traditional distributed systems with AI/ML components:
- "Design a chatbot platform that serves 10K concurrent conversations with LLM backends"
- "Design an image generation service that handles 1M requests/day"
- "Add AI-powered search to an e-commerce platform"
Key considerations for AI system design:
- Latency: LLM inference is 500ms-5s per request. Use streaming responses, prompt caching, and model cascading (fast model for simple queries, large model for complex ones).
- Cost: LLM API calls are expensive. Cache common responses, batch requests where possible, and set per-user token budgets.
- Safety: Input guardrails (prompt injection detection), output guardrails (content filtering), and human-in-the-loop for high-stakes decisions.
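The caching and cascading ideas above compose naturally. A sketch with hypothetical stub models (the routing heuristic, model functions, and cache size are all illustrative; real routers use a trained classifier or the small model's own confidence):

```python
from functools import lru_cache

# Hypothetical stand-ins for real model endpoints.
def small_model(prompt):
    return f"small:{prompt}"

def large_model(prompt):
    return f"large:{prompt}"

def is_simple(prompt):
    # Toy heuristic: short prompts go to the cheap model.
    return len(prompt.split()) < 10

@lru_cache(maxsize=10_000)   # prompt cache: repeated queries skip inference
def answer(prompt):
    model = small_model if is_simple(prompt) else large_model
    return model(prompt)

print(answer("what time is it"))                          # cheap path
print(answer("summarize this long support ticket " * 3))  # expensive path
```

Even a crude cascade like this cuts both latency and cost when a large share of traffic is short, repetitive queries, and the cache decorator shows where a shared cache (e.g. Redis keyed on a normalized prompt) would slot in.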
Interview Communication Strategy
The 45-Minute Pacing Guide
| Time | Phase | What to Do |
|---|---|---|
| 0-5 min | Requirements | Ask clarifying questions. Confirm functional and non-functional requirements. Establish scale. |
| 5-10 min | Estimation | Do back-of-envelope math. Establish QPS, storage, bandwidth. |
| 10-25 min | High-Level Design | Draw the architecture. Define components, data flow, APIs. |
| 25-40 min | Deep Dive | Dive into 1-2 critical components the interviewer cares about. |
| 40-45 min | Wrap-Up | Discuss trade-offs, failure modes, and potential improvements. |
Communication Techniques
- Think aloud: Narrate your reasoning. "I'm choosing Kafka here because we need ordered, durable event processing at 100K messages/sec."
- Check in: After each phase, ask "Does this direction make sense? Should I go deeper on any component?"
- Draw as you talk: Use boxes for services, cylinders for databases, arrows for data flow. Label everything.
- Acknowledge uncertainty: "I'm not 100% sure about the exact Kafka partition limit, but I believe it's in the thousands range. The key point is we'd partition by user_id for ordering."
- Offer alternatives: "We could use Redis Pub/Sub instead of Kafka here. The trade-off is durability — Redis is faster but doesn't persist messages by default."
Common Mistakes and Recovery
| Mistake | Impact | Recovery |
|---|---|---|
| Not asking enough requirements questions | You design the wrong system | "Before I continue, let me clarify a few requirements..." |
| Spending too long on one component | No time for breadth | "Let me sketch the full picture first, then we can deep-dive" |
| Not discussing failure modes | Appears junior | "Let me walk through what happens when [component] goes down" |
| Ignoring the interviewer's hints | Missing what they want to evaluate | Listen carefully to follow-up questions — they guide you to what matters |
What's Next?
Congratulations on completing System Design Interview Mastery! Here are recommended next courses to continue your interview prep:
- AI System Design Interviews — Apply system design thinking to AI-specific architectures: RAG systems, LLM serving, multi-agent platforms, and ML pipelines. Perfect if you're targeting AI/ML engineering roles.
- Backend Engineer Interviews — Deep dive into backend-specific problems with hands-on labs: URL shorteners, rate limiters, API design, and distributed systems patterns.
- Cloud/Solutions Architect Interviews — Multi-cloud architecture, Well-Architected Framework, enterprise strategy, and cost optimization at scale.
- AI Agent Engineer Interviews — Premium hands-on course where you build five production agentic systems from scratch. Covers tool-calling, RAG agents, multi-agent orchestration, and safety guardrails.
The system design interview is a conversation, not an exam. Your goal is to demonstrate that you can navigate ambiguity, make reasoned decisions, and communicate clearly under pressure. With the frameworks and patterns from this course, you are ready.