Back to Course|System Design Interview Mastery: Scalable Architecture from Estimation to Production
Lab

Build a Ride-Sharing Backend

35 min
Advanced
Unlimited free attempts

Instructions

Ride-sharing systems like Uber and Lyft are among the most frequently asked system design interview questions. They test your understanding of geospatial indexing, real-time matching, dynamic pricing, and state machines. In this lab, you'll build the core backend components that power a ride-sharing service.

Architecture Overview

rideshare/main.py (simulation entry point)
  ├── rideshare/geo/geohash.py           — Geohash encoding/decoding
  ├── rideshare/geo/spatial_index.py      — Spatial index for nearby queries
  ├── rideshare/matching/matcher.py       — Ride matching algorithm
  ├── rideshare/matching/eta_calculator.py — ETA calculation (Haversine)
  ├── rideshare/pricing/surge_calculator.py — Surge pricing
  ├── rideshare/pricing/fare_calculator.py  — Fare computation
  └── rideshare/tracking/ride_tracker.py   — Ride lifecycle state machine

Step-by-Step Instructions

FILE 1: rideshare/geo/geohash.py

Implement geohash encoding and decoding for geospatial indexing.

  • Encode latitude/longitude to a geohash string of configurable precision (default 6)
  • Use bit interleaving: alternate between longitude and latitude bits
  • Decode a geohash string back to approximate latitude/longitude (center of the cell)
  • Calculate the 8 neighboring geohash cells for a given geohash
  • Use the Base32 alphabet: 0123456789bcdefghjkmnpqrstuvwxyz

FILE 2: rideshare/geo/spatial_index.py

Build a spatial index that uses geohashes for fast nearby queries.

  • Insert drivers with their current position (lat, lng)
  • Remove drivers when they go offline or get matched
  • Query nearby drivers within a given radius using geohash prefix + neighbor expansion
  • Calculate actual distances using Haversine to filter false positives from geohash approximation
  • Return results sorted by distance from the query point

FILE 3: rideshare/matching/matcher.py

Implement the ride matching algorithm that finds the optimal driver.

  • Score drivers based on a weighted combination: distance (40%), ETA (30%), rating (30%)
  • Filter out drivers beyond maximum pickup distance
  • Filter out drivers whose vehicle type doesn't match the ride request
  • Return the best match or None if no drivers are available
  • Support different vehicle types: economy, comfort, premium

FILE 4: rideshare/matching/eta_calculator.py

Calculate estimated time of arrival for drivers.

  • Use the Haversine formula to compute great-circle distance between two points
  • Apply a road-factor multiplier (default 1.4) since roads aren't straight lines
  • Apply a traffic multiplier based on time of day (rush hour = higher multiplier)
  • Return ETA in minutes with both distance and time components

FILE 5: rideshare/pricing/surge_calculator.py

Implement dynamic surge pricing based on supply and demand.

  • Calculate demand/supply ratio for a geographic zone (geohash prefix)
  • Map ratio to surge multiplier: ratio < 1.0 → 1.0x, ratio 1.0-2.0 → linear 1.0x-2.0x, ratio > 2.0 → capped
  • Apply configurable minimum (1.0) and maximum (3.0) surge caps
  • Track request counts and available drivers per zone
  • Provide a method to decay old demand data over time

FILE 6: rideshare/pricing/fare_calculator.py

Calculate ride fares with all pricing components.

  • Base fare + (distance_km × per_km_rate) + (duration_min × per_min_rate)
  • Apply surge multiplier from the surge calculator
  • Apply minimum fare floor (no ride cheaper than the minimum)
  • Support different rate cards per vehicle type (economy, comfort, premium)
  • Return a breakdown showing base, distance, time, surge, and total

FILE 7: rideshare/tracking/ride_tracker.py

Implement the ride lifecycle as a state machine.

  • States: REQUESTED → MATCHED → DRIVER_EN_ROUTE → ARRIVED → IN_PROGRESS → COMPLETED or CANCELLED
  • Validate state transitions (e.g., cannot go from REQUESTED directly to IN_PROGRESS)
  • Track timestamps for each state transition
  • Support cancellation from REQUESTED, MATCHED, DRIVER_EN_ROUTE, or ARRIVED states
  • Calculate ride duration and wait time from timestamps

FILE 8: rideshare/main.py

Wire everything together with a simulation.

  • Seed sample drivers at various locations
  • Create ride requests from sample riders
  • Run the matching algorithm to pair riders with drivers
  • Calculate surge pricing and fares
  • Execute the full ride lifecycle from request to completion
  • Print a summary of each ride with timing, fare breakdown, and driver details

Hints

  • The geohash Base32 alphabet is 0123456789bcdefghjkmnpqrstuvwxyz (32 characters, no a, i, l, o)
  • Haversine formula: a = sin²(Δlat/2) + cos(lat1) × cos(lat2) × sin²(Δlng/2), c = 2 × atan2(√a, √(1-a)), d = R × c where R = 6371 km
  • For geohash neighbors, each geohash cell has exactly 8 neighbors (N, NE, E, SE, S, SW, W, NW)
  • Average driving speed in a city: ~30 km/h (used for ETA from distance)
  • Surge pricing is per-zone: use geohash prefix (e.g., precision 4) as the zone identifier

Grading Rubric

Geohash encoding/decoding with correct bit interleaving: encode() alternates longitude and latitude bits, groups every 5 bits into a Base32 character from the standard geohash alphabet (0123456789bcdefghjkmnpqrstuvwxyz), and produces correct hashes for known coordinates (e.g., San Francisco 37.7749/-122.4194 → starts with '9q8y'). decode() reverses the process to return the center of the geohash cell. neighbors() correctly computes all 8 adjacent cells by offsetting lat/lng by the cell size.15 points
Spatial index with insert/remove/nearby query using geohash grid: insert() computes geohash for the driver's position, stores in a dict keyed by geohash prefix, and handles re-insertion when position changes. remove() cleans up both the grid and drivers dict. nearby() searches the target cell and all 8 neighbors, calculates actual Haversine distance for each candidate, filters by radius and optionally by vehicle_type, and returns results sorted by distance ascending.15 points
Matching algorithm finding optimal driver by distance + rating + ETA: find_match() queries the spatial index for nearby available drivers of the requested vehicle type, scores each using weighted combination (distance 40%, ETA 30%, rating 30%) with proper normalization (distance relative to max_pickup_km, ETA relative to 30 min cap, rating relative to 5.0), and returns the highest-scoring driver. find_top_matches() returns the top N candidates sorted by score descending.15 points
ETA calculator with Haversine and traffic adjustment: haversine() correctly implements the Haversine formula converting degrees to radians and returning distance in km. calculate() multiplies straight-line distance by road_factor (1.4), converts to time using avg_speed_kmh (30), and applies traffic_multiplier based on hour_of_day. get_traffic_multiplier() returns 1.5 for rush hours (7-9, 17-19), 1.0 for midday (10-16), and 0.8 for night (20-6).10 points
Surge pricing with demand/supply ratio and configurable caps: record_request() increments demand count for the zone. update_supply() sets the driver count. get_surge() calculates the demand/supply ratio, maps it to a multiplier (ratio < 1 → 1.0, linear between 1-2, capped at max_surge for higher ratios), and handles edge cases (zero supply with demand → max surge, zero activity → no surge). decay_demand() reduces all request counts by a decay factor to prevent stale data.10 points
Fare calculator combining base + distance + time × surge: calculate() looks up the correct RateCard for the vehicle type, computes distance_charge (km × per_km_rate) and time_charge (min × per_min_rate), adds base_fare for the subtotal, applies surge_multiplier to get surge_amount, applies minimum_fare floor, and returns a complete FareBreakdown. Different vehicle types (economy, comfort, premium) produce different totals from their distinct rate cards.10 points
Ride tracker with complete state machine and lifecycle management: create_ride() initializes in REQUESTED state with timestamp. transition() validates against VALID_TRANSITIONS (rejects invalid transitions like REQUESTED → IN_PROGRESS with ValueError), records StateTransition with timestamp and metadata, updates current state. cancel_ride() handles cancellation from any cancellable state. get_wait_time() calculates REQUESTED→IN_PROGRESS duration. get_ride_duration() calculates IN_PROGRESS→COMPLETED duration. All timestamps are recorded for each state change.15 points
Main simulation demonstrating full ride flow: run_simulation() initializes all six components (SpatialIndex, ETACalculator, RideMatcher, SurgeCalculator, FareCalculator, RideTracker), seeds sample drivers, processes ride requests through the complete lifecycle (REQUESTED → MATCHED → DRIVER_EN_ROUTE → ARRIVED → IN_PROGRESS → COMPLETED), calculates surge-adjusted fares, and prints a summary showing driver assignment, ETA, fare breakdown, and ride timing for each request.10 points

Checklist

0/8

Your Solution

Unlimited free attempts
FREE WEEKLY NEWSLETTER

Stay on the Nerd Track

One email per week — courses, deep dives, tools, and AI experiments.

No spam. Unsubscribe anytime.