Design a URL Shortener Service
Instructions
Objective
Design and implement a URL shortener service that handles 100 million new URLs per month with a 10:1 read-to-write ratio and 5-year data retention. This lab tests your ability to combine API design, encoding algorithms, database modeling, caching, analytics, and scaling reasoning into a cohesive system.
Capacity Estimation
Before writing any code, complete the capacity estimation section in the starter code. Use these parameters:
- New URLs per month: 100 million
- Read-to-write ratio: 10:1
- Average URL record size: 500 bytes (short code + long URL + metadata)
- Retention period: 5 years
- Cache hit ratio target: 80% (Pareto distribution — 20% of URLs get 80% of traffic)
You should calculate:
- Write QPS (average and peak at 3x)
- Read QPS (average and peak at 3x)
- Total storage required over 5 years
- Daily cache memory requirement
- Daily bandwidth for reads
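As a sanity check before you fill in the starter code, the arithmetic for these estimates can be sketched as below. This is a back-of-envelope pass only, assuming 30-day months; your submission should show your own working.

```typescript
// Back-of-envelope capacity estimation (assumes 30-day months).
const URLS_PER_MONTH = 100_000_000;
const READ_WRITE_RATIO = 10;
const RECORD_BYTES = 500;
const RETENTION_YEARS = 5;
const PEAK_FACTOR = 3;
const HOT_FRACTION = 0.2; // Pareto: cache the hottest 20% of daily URLs

const SECONDS_PER_MONTH = 30 * 24 * 3600; // 2,592,000

const writeQps = URLS_PER_MONTH / SECONDS_PER_MONTH;           // ~38.6
const peakWriteQps = writeQps * PEAK_FACTOR;                   // ~116
const readQps = writeQps * READ_WRITE_RATIO;                   // ~386
const peakReadQps = readQps * PEAK_FACTOR;                     // ~1,157

const totalRecords = URLS_PER_MONTH * 12 * RETENTION_YEARS;    // 6 billion
const storageBytes = totalRecords * RECORD_BYTES;              // ~3 TB

const readsPerDay = (URLS_PER_MONTH * READ_WRITE_RATIO) / 30;  // ~33.3 million
const cacheBytes = readsPerDay * HOT_FRACTION * RECORD_BYTES;  // ~3.3 GB
const bandwidthPerDayBytes = readsPerDay * RECORD_BYTES;       // ~16.7 GB

console.log({ writeQps, peakWriteQps, readQps, peakReadQps, storageBytes, cacheBytes, bandwidthPerDayBytes });
```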
Requirements
1. API Endpoints
Design a RESTful API with these endpoints:
POST /api/shorten — Create a short URL from a long URL
- Request body:
  { "long_url": "https://...", "custom_alias?": "my-link", "expires_at?": "2027-01-01" }
- Response:
  { "short_url": "https://short.url/Ab3xK9z", "short_code": "Ab3xK9z", "expires_at": "2027-01-01" }
GET /:shortCode — Redirect to the original URL (HTTP 301 or 302)
GET /api/stats/:shortCode — Get analytics for a short URL
- Response:
  { "total_clicks": 1523, "clicks_by_day": [...], "top_referrers": [...], "top_countries": [...] }
DELETE /api/urls/:shortCode — Delete a short URL (owner only)
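One way to pin down the request/response contracts above is to declare them as TypeScript interfaces before writing any handlers. The shapes below are a sketch derived from the JSON bodies shown (the `?` suffix in the JSON keys is read as "optional field"); names like `StatsResponse` are our own, not mandated by the spec.

```typescript
// Request/response shapes for the four endpoints (field names follow the
// JSON bodies in the spec; interface names are our own choice).
interface ShortenRequest {
  long_url: string;
  custom_alias?: string; // "custom_alias?" in the spec marks this optional
  expires_at?: string;   // ISO 8601 date
}

interface ShortenResponse {
  short_url: string;
  short_code: string;    // 7-character Base62 code
  expires_at?: string;
}

interface StatsResponse {
  total_clicks: number;
  clicks_by_day: { day: string; clicks: number }[];
  top_referrers: string[];
  top_countries: string[];
}
```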
2. Base62 Encoding Logic
Implement a Base62Encoder class that:
- Converts a numeric ID to a Base62 string (characters: a-z, A-Z, 0-9)
- Converts a Base62 string back to a numeric ID
- Generates codes of exactly 7 characters (pad with leading a if needed)
- Handles the full range: 0 to 62^7 - 1 (about 3.5 trillion)
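A minimal encoder satisfying these four bullets might look like the sketch below. It assumes the alphabet is ordered exactly as listed (a-z, then A-Z, then 0-9), which makes `a` encode zero, so padding with leading `a` characters preserves the value.

```typescript
// Alphabet order a-z, A-Z, 0-9, so 'a' = 0 and '9' = 61.
const ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

class Base62Encoder {
  encode(id: number): string {
    if (id < 0 || id > 62 ** 7 - 1) throw new RangeError("id out of range");
    let n = id;
    let out = "";
    do {
      out = ALPHABET[n % 62] + out; // peel off the least-significant digit
      n = Math.floor(n / 62);
    } while (n > 0);
    return out.padStart(7, "a");    // 'a' encodes 0, so padding is value-preserving
  }

  decode(code: string): number {
    let n = 0;
    for (const ch of code) {
      const v = ALPHABET.indexOf(ch);
      if (v < 0) throw new Error(`invalid Base62 character: ${ch}`);
      n = n * 62 + v;               // standard positional-base accumulation
    }
    return n;
  }
}
```

Note that 62^7 - 1 (about 3.5 trillion) is well under `Number.MAX_SAFE_INTEGER`, so plain numbers suffice; no BigInt is needed.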
3. Database Schema
Design a schema that supports:
- Storing URL mappings with metadata (creator, creation date, expiration)
- Efficient lookup by short code (primary access pattern)
- Analytics event storage (click timestamp, IP, user agent, referrer, country)
- Rate limiting state per API key
Provide schemas for both PostgreSQL (relational) and DynamoDB (NoSQL) so the grader can evaluate your understanding of both paradigms.
4. Redis Caching Layer
Implement a CacheService class that:
- Uses cache-aside pattern for URL lookups
- Sets TTL based on URL popularity (hot URLs get longer TTL)
- Handles cache invalidation on URL deletion
- Includes cache warming for newly created URLs that are expected to be popular
5. Analytics Tracking
Implement a ClickTracker that:
- Records each redirect as an analytics event (non-blocking, fire-and-forget)
- Extracts geo-location from IP (stub the geo lookup)
- Parses user agent for device/browser info
- Batches events for efficient database writes
- Provides aggregated stats (clicks by day, top referrers, top countries)
6. Rate Limiting
Implement a token bucket rate limiter that:
- Limits URL creation to 10 per minute per API key
- Limits redirects to 100 per minute per IP (prevent abuse)
- Uses Redis for distributed state
- Returns a proper HTTP 429 response with a Retry-After header
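The token-bucket logic itself can be sketched as below. A `Map` stands in for Redis here, and the `now` parameter makes the refill math testable; the distributed version would move this check-and-decrement into a Redis Lua script, as the hints suggest.

```typescript
// Token-bucket sketch. A Map stands in for Redis; production would run the
// check-and-decrement atomically via a Lua script.
interface Bucket {
  tokens: number;
  lastRefill: number; // ms timestamp of the last refill calculation
}

class TokenBucketLimiter {
  private buckets = new Map<string, Bucket>();
  // e.g. 10 per minute per API key: capacity 10, refillPerMs = 10 / 60_000.
  constructor(private capacity: number, private refillPerMs: number) {}

  // Returns true if the request may proceed, false if it should get HTTP 429.
  allow(key: string, now = Date.now()): boolean {
    const b = this.buckets.get(key) ?? { tokens: this.capacity, lastRefill: now };
    // Refill proportionally to elapsed time, capped at bucket capacity.
    b.tokens = Math.min(this.capacity, b.tokens + (now - b.lastRefill) * this.refillPerMs);
    b.lastRefill = now;
    if (b.tokens < 1) {
      this.buckets.set(key, b);
      return false;
    }
    b.tokens -= 1;
    this.buckets.set(key, b);
    return true;
  }

  // Milliseconds until one token is available; use for the Retry-After header.
  retryAfterMs(key: string): number {
    const b = this.buckets.get(key);
    return b && b.tokens < 1 ? Math.ceil((1 - b.tokens) / this.refillPerMs) : 0;
  }
}
```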
7. Horizontal Scaling Plan
In the ScalingPlan section of your submission, describe (as code comments or a design document string):
- How you would shard the URLs database (which key, which strategy, why)
- How you would handle cache consistency across multiple API server instances
- How you would scale the analytics pipeline to handle 10x traffic
- What your monitoring and alerting strategy would be (key metrics)
- How you would handle the "thundering herd" problem for viral URLs
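For the sharding bullet, one common choice worth evaluating in your plan is hashing on the short code itself, since the dominant access pattern (GET /:shortCode) then resolves to exactly one shard. The function below is an illustrative sketch with a toy hash, not a prescribed answer.

```typescript
// Shard-routing sketch: the short code is the shard key, so the hot read
// path always hits exactly one shard. The hash here is a toy; production
// would use a stable hash plus consistent hashing to ease resharding.
function shardFor(shortCode: string, shardCount: number): number {
  let h = 0;
  for (const ch of shortCode) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // simple 32-bit polynomial hash
  }
  return h % shardCount;
}
```

Your write-up should still justify the trade-off: hash sharding spreads load evenly but gives up range scans, which mainly affects analytics queries rather than redirects.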
Hints
- For Base62 encoding, think of it like converting a number to base-62 (similar to converting decimal to hexadecimal, but with 62 digits instead of 16)
- For the database schema, consider that reads vastly outnumber writes — optimize the read path first
- For analytics, use a message queue (Kafka) to decouple click recording from the redirect path — the redirect must be fast
- For rate limiting, use a Lua script in Redis to make the check-and-decrement atomic
- The scaling plan should address the specific bottleneck at each tier: stateless API servers scale trivially, but the database and cache layers require careful sharding
What to Submit
Your submission should be a single complete TypeScript file containing all 8 sections: the capacity estimation plus the 7 numbered requirements above.