Build an Estimation & Capacity Planner

In system design interviews, back-of-envelope estimation is the skill that separates confident candidates from struggling ones. In this lab, you'll build a complete capacity planning toolkit that performs the exact calculations you'd do on a whiteboard — but as reusable, testable code.

Architecture Overview

main.py (CLI entry point)
  ├── estimator/qps_calculator.py      — DAU → QPS conversion
  ├── estimator/storage_calculator.py   — Storage projections
  ├── estimator/bandwidth_calculator.py — Bandwidth estimation
  ├── estimator/infrastructure_planner.py — Server/cache/shard sizing
  ├── estimator/latency_analyzer.py     — Latency budget breakdown
  ├── estimator/cost_estimator.py       — Cloud cost projection
  └── estimator/__init__.py             — Module exports

Step-by-Step Instructions

FILE 1: `estimator/qps_calculator.py`

Build a QPS (Queries Per Second) calculator that converts user activity into request rates.

Take DAU (Daily Active Users) and actions per user per day as input
Calculate average QPS: DAU × actions per user per day / 86,400 (seconds per day)
Apply a configurable peak multiplier (default 3x) for peak QPS
Split into read QPS and write QPS using a read-to-write ratio
Return a typed result with all computed values
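
The steps above could be sketched roughly as follows. This is a minimal illustration rather than a reference solution: the names `QPSResult` and `calculate_qps` are hypothetical, the default ratio of 10:1 is a placeholder, and it assumes that "actions" counts reads and writes together, so an R:1 ratio means writes are 1/(R+1) of the average traffic.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class QPSResult:
    average_qps: float
    peak_qps: float
    read_qps: float
    write_qps: float


def calculate_qps(dau: int, actions_per_user: float,
                  peak_multiplier: float = 3.0,
                  read_write_ratio: float = 10.0) -> QPSResult:
    """Convert daily activity into average, peak, read, and write QPS."""
    if read_write_ratio < 0:
        raise ValueError("read_write_ratio must be non-negative")
    average = dau * actions_per_user / SECONDS_PER_DAY
    peak = average * peak_multiplier
    # Assumption: an R:1 ratio means R reads per write, so writes
    # make up 1 / (R + 1) of the average request rate.
    write = average / (read_write_ratio + 1)
    read = average - write
    return QPSResult(average, peak, read, write)


result = calculate_qps(dau=86_400, actions_per_user=10, read_write_ratio=9)
print(result)
```

With 86,400 users performing 10 actions each, the average works out to exactly 10 QPS, which makes the read/write split easy to check by hand.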

FILE 2: `estimator/storage_calculator.py`

Estimate storage requirements over time.

Take per-object size in bytes, daily write count, and retention period in days
Calculate daily storage: writes_per_day × object_size
Project storage growth over the retention period
Apply an overhead multiplier to account for metadata (default 1.2x)
Return daily, monthly, and total retention storage figures
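
One possible shape for this calculator is sketched below. It assumes a constant write rate, so storage grows linearly over the retention period, and it treats a month as 30 days; both are simplifications. The names `StorageResult` and `calculate_storage` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageResult:
    daily_bytes: float
    monthly_bytes: float
    retention_bytes: float


def calculate_storage(object_size_bytes: int, writes_per_day: int,
                      retention_days: int,
                      overhead_factor: float = 1.2,
                      days_per_month: int = 30) -> StorageResult:
    """Project storage assuming a constant daily write volume."""
    # Raw payload per day, inflated by the metadata overhead multiplier.
    daily = writes_per_day * object_size_bytes * overhead_factor
    return StorageResult(
        daily_bytes=daily,
        monthly_bytes=daily * days_per_month,
        retention_bytes=daily * retention_days,
    )


result = calculate_storage(object_size_bytes=1_000,
                           writes_per_day=1_000_000,
                           retention_days=365)
print(f"{result.retention_bytes / 1e12:.2f} TB over the retention period")
```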

FILE 3: `estimator/bandwidth_calculator.py`

Calculate ingress and egress bandwidth.

Compute ingress: write_QPS × write_payload_size
Compute egress: read_QPS × read_payload_size
Convert to human-readable units (KB/s, MB/s, GB/s)
Return both raw bytes/sec and formatted strings
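
A sketch of the bandwidth arithmetic and the unit conversion is shown here. It assumes binary units (1 KB = 1,024 bytes), which matches the hint that 1_073_741_824 bytes formats as "1.00 GB". The names `format_rate`, `BandwidthResult`, and `calculate_bandwidth` are hypothetical.

```python
from dataclasses import dataclass

UNITS = ("B/s", "KB/s", "MB/s", "GB/s")


def format_rate(bytes_per_sec: float) -> str:
    """Format a byte rate using binary units, capped at GB/s."""
    value = float(bytes_per_sec)
    index = 0
    while value >= 1024 and index < len(UNITS) - 1:
        value /= 1024
        index += 1
    return f"{value:.2f} {UNITS[index]}"


@dataclass(frozen=True)
class BandwidthResult:
    ingress_bytes_per_sec: float
    egress_bytes_per_sec: float
    ingress_human: str
    egress_human: str


def calculate_bandwidth(write_qps: float, write_payload_bytes: int,
                        read_qps: float,
                        read_payload_bytes: int) -> BandwidthResult:
    """Ingress is driven by writes, egress by reads."""
    ingress = write_qps * write_payload_bytes
    egress = read_qps * read_payload_bytes
    return BandwidthResult(ingress, egress,
                           format_rate(ingress), format_rate(egress))


result = calculate_bandwidth(write_qps=100, write_payload_bytes=1024,
                             read_qps=1000, read_payload_bytes=2048)
print(result.ingress_human, result.egress_human)
```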

FILE 4: `estimator/infrastructure_planner.py`

Size the infrastructure based on load.

Calculate server count: peak_QPS / queries_per_server (round up)
Calculate cache size: apply the 80/20 rule and size the cache to hold 20% of the data read per day
Calculate DB shard count: using total storage and max shard size (default 500 GB)
Use consistent hashing with configurable virtual nodes for shard distribution
Return server count, cache size in GB, and shard count
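
The sizing arithmetic might look like the sketch below. It makes two assumptions that the instructions leave open: daily read volume is estimated as read QPS × 86,400 × average response size, and 1 GB is taken as 1,024³ bytes. The names `InfrastructurePlan` and `plan_infrastructure` are hypothetical, and the consistent-hashing ring is left out here to keep the example focused on the three counts.

```python
import math
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
GB = 1024 ** 3


@dataclass(frozen=True)
class InfrastructurePlan:
    server_count: int
    cache_size_gb: float
    shard_count: int


def plan_infrastructure(peak_qps: float, queries_per_server: float,
                        read_qps: float, avg_response_bytes: int,
                        total_storage_bytes: float,
                        max_shard_bytes: float = 500 * GB,
                        cache_fraction: float = 0.2) -> InfrastructurePlan:
    """Size servers, cache, and shards from load and storage figures."""
    # Round up: a fractional server still has to be provisioned.
    servers = math.ceil(peak_qps / queries_per_server)
    # Assumption: daily read volume = read QPS x seconds/day x response size.
    daily_read_bytes = read_qps * SECONDS_PER_DAY * avg_response_bytes
    cache_gb = daily_read_bytes * cache_fraction / GB
    # At least one shard, even for tiny datasets.
    shards = max(1, math.ceil(total_storage_bytes / max_shard_bytes))
    return InfrastructurePlan(servers, cache_gb, shards)


plan = plan_infrastructure(peak_qps=25_001, queries_per_server=1_000,
                           read_qps=1_000, avg_response_bytes=1_024,
                           total_storage_bytes=1_200 * GB)
print(plan)
```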

FILE 5: `estimator/latency_analyzer.py`

Break down a request's latency budget.

Define latency components: network hop, load balancer, app server processing, cache lookup, DB query, serialization
Calculate P50 and P99 latency for each component
Identify the dominant component (highest latency contributor)
Check if total latency fits within a target SLA
Return a breakdown table and pass/fail for the SLA
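
A minimal sketch of the budget check follows. It assumes each component's P50 and P99 are supplied as fixed estimates in milliseconds and that the SLA is checked against the summed P99. Percentiles do not strictly add, so the summed P99 is a pessimistic upper bound rather than the true end-to-end P99. The class and function names, and the sample numbers, are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyComponent:
    name: str
    p50_ms: float
    p99_ms: float


@dataclass(frozen=True)
class LatencyReport:
    components: tuple[LatencyComponent, ...]
    total_p50_ms: float
    total_p99_ms: float
    dominant: str
    meets_sla: bool


def analyze_latency(components: list[LatencyComponent],
                    sla_ms: float) -> LatencyReport:
    """Sum per-component latencies and compare the total to an SLA."""
    total_p50 = sum(c.p50_ms for c in components)
    # Summing P99s overestimates the real tail; treat it as a worst case.
    total_p99 = sum(c.p99_ms for c in components)
    dominant = max(components, key=lambda c: c.p99_ms)
    return LatencyReport(tuple(components), total_p50, total_p99,
                         dominant.name, total_p99 <= sla_ms)


# Placeholder estimates, not measurements.
budget = [
    LatencyComponent("network hop", 1, 5),
    LatencyComponent("load balancer", 1, 2),
    LatencyComponent("app server processing", 10, 30),
    LatencyComponent("cache lookup", 1, 3),
    LatencyComponent("DB query", 20, 100),
    LatencyComponent("serialization", 2, 5),
]
report = analyze_latency(budget, sla_ms=200)
for c in report.components:
    print(f"{c.name:<24}{c.p50_ms:>8.1f}{c.p99_ms:>8.1f}")
print("dominant:", report.dominant, "| SLA met:", report.meets_sla)
```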

FILE 6: `estimator/cost_estimator.py`

Estimate monthly cloud infrastructure costs.

Compute costs: number of servers × cost per server per hour × 730 hours/month
Storage costs: total TB × cost per TB per month
Bandwidth (egress) costs: egress GB × cost per GB (tiered pricing)
Cache costs: cache nodes × cost per node per month
Return itemized and total monthly costs
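
The cost items above could be combined as in this sketch. The egress tiers are made-up placeholder values chosen only to demonstrate tiered pricing; they are not any provider's actual price list. The names `CostBreakdown`, `tiered_egress_cost`, and `estimate_monthly_cost` are hypothetical.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730

# Hypothetical tiers: (tier size in GB, price per GB). None means unbounded.
EGRESS_TIERS = ((10_000, 0.09), (40_000, 0.085), (None, 0.07))


def tiered_egress_cost(egress_gb: float, tiers=EGRESS_TIERS) -> float:
    """Charge each slice of egress at the rate of the tier it falls in."""
    remaining = egress_gb
    cost = 0.0
    for size_gb, price_per_gb in tiers:
        if remaining <= 0:
            break
        billed = remaining if size_gb is None else min(remaining, size_gb)
        cost += billed * price_per_gb
        remaining -= billed
    return cost


@dataclass(frozen=True)
class CostBreakdown:
    compute: float
    storage: float
    bandwidth: float
    cache: float

    @property
    def total(self) -> float:
        return self.compute + self.storage + self.bandwidth + self.cache


def estimate_monthly_cost(servers: int, cost_per_server_hour: float,
                          storage_tb: float, cost_per_tb_month: float,
                          egress_gb: float, cache_nodes: int,
                          cost_per_cache_node_month: float) -> CostBreakdown:
    """Itemize monthly cost for compute, storage, egress, and cache."""
    return CostBreakdown(
        compute=servers * cost_per_server_hour * HOURS_PER_MONTH,
        storage=storage_tb * cost_per_tb_month,
        bandwidth=tiered_egress_cost(egress_gb),
        cache=cache_nodes * cost_per_cache_node_month,
    )


costs = estimate_monthly_cost(servers=10, cost_per_server_hour=0.20,
                              storage_tb=50, cost_per_tb_month=23.0,
                              egress_gb=15_000, cache_nodes=3,
                              cost_per_cache_node_month=100.0)
print(f"total: ${costs.total:,.2f}/month")
```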

FILE 7: `estimator/__init__.py`

Export all calculator classes for clean imports.

FILE 8: `main.py`

Build a CLI that runs a full estimation scenario.

Define a scenario (e.g., "Twitter-like feed: 300M DAU, 2 tweets/user/day, 100:1 read-write ratio")
Run all estimators in sequence: QPS → Storage → Bandwidth → Infrastructure → Latency → Cost
Print a formatted report with all results
The report should be clear enough to present in an interview
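
As a worked example of the first stage of that pipeline, the sketch below runs the QPS arithmetic for the Twitter-like scenario and prints it as a small aligned report. It reads "2 tweets/user/day" as the write volume and derives reads as 100 × writes; that interpretation is an assumption, as is the hypothetical helper `format_report`. The 3x peak multiplier is the default from FILE 1.

```python
SECONDS_PER_DAY = 86_400


def format_report(title: str, rows: list[tuple[str, float]]) -> str:
    """Render label/value pairs as an aligned plain-text report."""
    width = max(len(label) for label, _ in rows)
    lines = [title, "=" * len(title)]
    lines += [f"{label:<{width}}  {value:>14,.0f}" for label, value in rows]
    return "\n".join(lines)


dau = 300_000_000
tweets_per_user_per_day = 2   # assumed to be writes
reads_per_write = 100         # 100:1 read-write ratio
peak_multiplier = 3

write_qps = dau * tweets_per_user_per_day / SECONDS_PER_DAY
read_qps = write_qps * reads_per_write
peak_qps = (write_qps + read_qps) * peak_multiplier

report = format_report("Twitter-like feed", [
    ("Write QPS (avg)", write_qps),
    ("Read QPS (avg)", read_qps),
    ("Peak QPS (total)", peak_qps),
])
print(report)
```

Under these assumptions the averages come to roughly 6,900 writes and 694,000 reads per second, with a combined peak near 2.1 million QPS. The remaining estimators would consume these figures in the order listed above.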

Hints

Use Python dataclasses or TypedDict for structured return types
For human-readable formatting: 1_073_741_824 bytes → "1.00 GB"
For consistent hashing: use hashlib.md5 to hash keys to a ring
Remember: 1 day = 86,400 seconds, 1 month ≈ 730 hours, 1 year ≈ 365 days
Cloud cost reference (approximate): compute ~$0.05/hr per vCPU, storage ~$0.023/GB/month (about $23/TB/month), egress ~$0.09/GB
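
The consistent-hashing hint could be realized along these lines. This is one simple way to build a ring with virtual nodes using `hashlib.md5` and `bisect`; the class name `HashRing`, the `"shard#i"` virtual-node label format, and the default of 100 virtual nodes are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring that maps keys to shards via virtual nodes."""

    def __init__(self, shards: list[str], virtual_nodes: int = 100) -> None:
        # Each shard is placed on the ring at `virtual_nodes` positions.
        self._ring = sorted(
            (self._hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(virtual_nodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        # MD5 is used only to spread keys evenly, not for security.
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def shard_for(self, key: str) -> str:
        """Return the shard owning the first ring point after the key."""
        if not self._ring:
            raise ValueError("ring has no shards")
        index = bisect.bisect_right(self._points, self._hash(key))
        # Wrap around to the start of the ring past the last point.
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["shard-0", "shard-1", "shard-2"], virtual_nodes=50)
print(ring.shard_for("user:42"))
```

The property worth checking in tests is that adding a shard only moves keys onto the new shard; keys never shuffle between the existing ones.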

What to Submit

The editor has 8 file sections with TODO comments. Replace each TODO with your implementation.

Before submitting, make sure each file section in the editor is complete:

FILE 1 — QPS calculator with DAU conversion, peak multiplier, and read/write split
FILE 2 — Storage calculator with per-object sizing, retention, and growth projection
FILE 3 — Bandwidth calculator computing ingress/egress with human-readable formatting
FILE 4 — Infrastructure planner with server count, cache sizing, and DB shard count
FILE 5 — Latency analyzer with component breakdown, P50/P99, and SLA check
FILE 6 — Cost estimator with compute, storage, bandwidth, and cache cost items
FILE 7 — Module exports for all calculators
FILE 8 — CLI main.py running a complete scenario with formatted report
