Business Cases & Product Sense
Root Cause Analysis
"A metric dropped 20% last week. How would you investigate?" This question tests your structured problem-solving ability under ambiguity.
The Investigation Framework
Follow this systematic approach:
1. Clarify and scope the problem
2. Validate the data
3. Segment the decline
4. Generate hypotheses
5. Test hypotheses with data
6. Identify root cause
7. Recommend actions
Step 1: Clarify and Scope
Before diving in, ask clarifying questions:
| Question | Why It Matters |
|---|---|
| Which metric exactly? | "Revenue" could mean many things |
| What's the time period? | Week-over-week vs year-over-year |
| How much of a drop? | A 5% drop and a 50% drop call for different urgency |
| Any known events? | Product launch, holiday, outage |
Example: "Before I investigate, I want to confirm: we're looking at daily active users, which dropped 20% this week compared to last week. Is this the first time we've seen this pattern, or is it seasonal?"
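To make the seasonality question concrete, here is a minimal sketch that compares this year's week-over-week change with the change over the same pair of weeks last year. All DAU figures are hypothetical and chosen only for illustration; `relative_change` is an illustrative helper, not part of any library.

```python
def relative_change(current: float, baseline: float) -> float:
    """Return the relative change from baseline to current (-0.2 means a 20% drop)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline


# Hypothetical average DAU figures, made up for illustration.
dau_this_week = 80_000
dau_last_week = 100_000
dau_same_week_last_year = 70_000
dau_prior_week_last_year = 88_000

# Week-over-week change now, and over the same pair of weeks a year ago.
wow_now = relative_change(dau_this_week, dau_last_week)
wow_last_year = relative_change(dau_same_week_last_year, dau_prior_week_last_year)

print(f"WoW change this year: {wow_now:.1%}")
print(f"WoW change same week last year: {wow_last_year:.1%}")
```

If the two changes are of similar size, as in these made-up numbers, the drop may be seasonal rather than a new problem, which changes how urgently the rest of the investigation needs to run.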
Step 2: Validate the Data
Never assume the data is correct:
| Check | What to Look For |
|---|---|
| Logging changes | Did tracking code change? |
| Definition changes | Did the metric calculation change? |
| Data pipeline issues | ETL failures, duplicates, delays |
| Outliers | Single large customer or bot activity |
Interview insight: "My first step would be to check if this is a real change or a data quality issue. I'd look at whether our logging changed or if there were ETL failures."
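Two of the pipeline checks in the table, duplicates and delays or gaps, can be sketched in a few lines. The event rows, field layout, and helper names below are assumptions made for illustration; a real check would run against the actual event log.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical event log rows: (event_id, event_date). Made up for illustration.
events = [
    ("e1", date(2024, 3, 4)),
    ("e2", date(2024, 3, 4)),
    ("e2", date(2024, 3, 4)),  # duplicate row, e.g. from a retried ETL job
    ("e3", date(2024, 3, 5)),
    ("e4", date(2024, 3, 7)),  # nothing was logged for 2024-03-06
]


def find_duplicate_ids(rows):
    """Return event IDs that appear more than once."""
    counts = Counter(event_id for event_id, _ in rows)
    return sorted(event_id for event_id, n in counts.items() if n > 1)


def find_missing_days(rows):
    """Return dates between the first and last event that have no rows at all."""
    days_seen = {day for _, day in rows}
    start, end = min(days_seen), max(days_seen)
    all_days = (start + timedelta(days=i) for i in range((end - start).days + 1))
    return [day for day in all_days if day not in days_seen]


print("Duplicate event IDs:", find_duplicate_ids(events))
print("Days with no events:", find_missing_days(events))
```

A duplicate inflates the metric and a missing day deflates it, so either finding would suggest a data quality issue worth ruling out before treating the drop as real.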
Step 3: Segment the Decline
Decompose the metric to find where the drop is concentrated:
Common segmentation dimensions:
- Platform (iOS, Android, Web)
- Geography (country, region)
- User type (new vs returning)
- Traffic source (organic, paid, referral)
- Device (mobile, desktop)
- Time (weekday vs weekend, hour of day)
MECE Principle: Mutually Exclusive, Collectively Exhaustive - within each dimension, segments shouldn't overlap and together should cover every user.
Example decomposition:
Total DAU: -20%
├── Mobile: -25%
│   ├── iOS: -5%
│   └── Android: -40% ← Problem here!
└── Desktop: -10%
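A segment's own percentage change does not show how much of the overall drop it explains, because segments differ in size. The sketch below computes both numbers per segment. The DAU counts are hypothetical, chosen so that they reproduce the percentages in the example decomposition; `segment_breakdown` is an illustrative helper.

```python
# Hypothetical DAU counts per leaf segment, chosen to match the example percentages
# (iOS -5%, Android -40%, Desktop -10%, total -20%).
previous = {"iOS": 30_000, "Android": 40_000, "Desktop": 35_000}
current = {"iOS": 28_500, "Android": 24_000, "Desktop": 31_500}


def segment_breakdown(previous, current):
    """For each segment, return its relative change and its share of the total change."""
    total_delta = sum(current.values()) - sum(previous.values())
    if total_delta == 0:
        raise ValueError("total did not change, so there is nothing to attribute")
    breakdown = {}
    for segment, before in previous.items():
        delta = current[segment] - before
        breakdown[segment] = {
            "relative_change": delta / before,
            "share_of_total_change": delta / total_delta,
        }
    return breakdown


breakdown = segment_breakdown(previous, current)
for segment, stats in breakdown.items():
    print(
        f"{segment:<8} change {stats['relative_change']:+.1%}, "
        f"share of total drop {stats['share_of_total_change']:.1%}"
    )
```

With these made-up counts, Android accounts for roughly three quarters of the lost users, which is a stronger signal than its -40% figure alone.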
Step 4: Generate Hypotheses
Based on segmentation, brainstorm possible causes:
| Segment Finding | Possible Causes |
|---|---|
| Android-specific drop | App update bug, Play Store issue, competitor launch |
| New users only | Marketing campaign ended, onboarding change |
| Specific geography | Local event, payment processor issue, regulatory change |
| Specific time | Server outage, traffic spike, API failure |
Structure your hypotheses:
- Internal technical (bugs, performance)
- Internal product (feature changes, UI updates)
- External market (competitor, economy)
- External technical (platform changes, API updates)
Step 5: Test with Data
Prioritize hypotheses by:
- Likelihood (based on pattern)
- Data availability (can we check this?)
- Actionability (can we fix it?)
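One possible way to apply these three criteria is a simple score per hypothesis. The 1-to-5 scale, the multiplication rule, and the scores themselves are assumptions made for illustration, not a standard method.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    likelihood: int         # 1 (unlikely) to 5 (fits the observed pattern well)
    data_availability: int  # 1 (hard to check) to 5 (one query away)
    actionability: int      # 1 (outside our control) to 5 (we can fix it today)

    @property
    def priority(self) -> int:
        # Multiplying means a very low score on any one criterion pulls the total down.
        return self.likelihood * self.data_availability * self.actionability


# Hypothetical scores for an Android-specific drop, made up for illustration.
hypotheses = [
    Hypothesis("Competitor launch", likelihood=2, data_availability=2, actionability=1),
    Hypothesis("App update bug", likelihood=5, data_availability=5, actionability=5),
    Hypothesis("Play Store issue", likelihood=3, data_availability=3, actionability=2),
]

# Test the highest-priority hypotheses first.
ranked = sorted(hypotheses, key=lambda h: h.priority, reverse=True)
for h in ranked:
    print(f"{h.priority:>3}  {h.name}")
```

The exact numbers matter less than the ordering: the point is to spend the first hour on the hypothesis that is both plausible and cheap to check.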
Example testing:
Hypothesis: An Android app update introduced a bug
Test: Break the drop down by app version
Finding: 95% of the drop comes from users on v4.2, released Monday
Conclusion: The v4.2 release is the likely root cause
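A version-level test like this one can be sketched as follows. The user counts are hypothetical and picked so that v4.2 accounts for 95% of lost users; the dictionaries and helper names are illustrative assumptions about how the data might be aggregated.

```python
# Hypothetical Android users active in the week before the drop, grouped by the
# app version they were last seen on, and how many of them came back this week.
users_before = {"v4.1": 8_000, "v4.2": 32_000}
users_returned = {"v4.1": 7_200, "v4.2": 16_800}


def lost_share_by_version(before, returned):
    """Return each version's share of all users who did not come back."""
    lost = {version: n - returned.get(version, 0) for version, n in before.items()}
    total_lost = sum(lost.values())
    if total_lost == 0:
        raise ValueError("no users were lost, so there is nothing to attribute")
    return {version: n / total_lost for version, n in lost.items()}


def return_rate_by_version(before, returned):
    """Return the fraction of each version's users who came back."""
    return {version: returned.get(version, 0) / n for version, n in before.items() if n > 0}


shares = lost_share_by_version(users_before, users_returned)
rates = return_rate_by_version(users_before, users_returned)
for version in users_before:
    print(f"{version}: {shares[version]:.0%} of lost users, return rate {rates[version]:.1%}")
```

Looking at the return rate alongside the share guards against a misleading result: a version with most of the user base will hold most of the lost users even when nothing is wrong with it, so the gap in return rates is the stronger evidence.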
Interview Example
Question: "Homepage conversion rate dropped 15% yesterday. Walk me through your investigation."
Strong answer:
"I'd approach this systematically:
Clarify: Is this a 15% relative drop (e.g. 5% → 4.25%) or a 15-percentage-point absolute drop? Is the change outside normal day-to-day variation, i.e. what's the confidence interval?
Validate: Check if tracking code changed, look for data pipeline issues.
Segment: Break down by:
- Device: Did one platform drop more?
- Traffic source: Organic vs paid vs email
- User type: New vs returning
- Geography: Regional issues?
- Time of day: Sudden drop or gradual?
Hypotheses based on pattern:
- If sudden drop at specific time → likely technical (deploy, outage)
- If gradual all day → likely product/external
- If one traffic source → likely attribution or marketing issue
Test: I'd query each hypothesis:
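    -- Hourly sessions and conversion rate by platform, from two days ago onward
    SELECT
        DATE_TRUNC('hour', timestamp) AS hour,
        platform,
        COUNT(*) AS sessions,
        -- Multiply by 1.0 to avoid integer division (assumes converted is a 0/1 integer)
        SUM(converted) * 1.0 / COUNT(*) AS conversion_rate
    FROM homepage_visits
    WHERE date >= CURRENT_DATE - 2
    GROUP BY 1, 2
    ORDER BY 1, 2;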
Root cause + recommendation: Based on the findings, identify the cause and recommend a revert, a fix, or further investigation."
The best answers show structured thinking and data-driven hypothesis testing, not jumping to conclusions.
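The sudden-versus-gradual distinction from the interview answer can be checked on hourly conversion rates. The sketch below is a rough heuristic, not a statistical test: it asks whether a single hour-over-hour step explains most of the day's decline. The hourly figures and the 0.5 threshold are assumptions made for illustration.

```python
# Hypothetical hourly (sessions, conversions) pairs, made up for illustration.
sudden_day = [
    (1000, 50), (1000, 51), (1000, 49), (1000, 50),  # about 5.0% before the drop
    (1000, 40), (1000, 41), (1000, 39), (1000, 40),  # about 4.0% after the drop
]
gradual_day = [(1000, 50 - i) for i in range(8)]  # slides from 5.0% to 4.3%


def conversion_rates(rows):
    """Return the conversion rate per hour (assumes every hour has sessions > 0)."""
    return [conversions / sessions for sessions, conversions in rows]


def largest_step_drop(rates):
    """Return (index, size) of the largest hour-over-hour decrease in the rate."""
    steps = [(i, rates[i - 1] - rates[i]) for i in range(1, len(rates))]
    return max(steps, key=lambda step: step[1])


def looks_sudden(rates, share_threshold=0.5):
    """Heuristic: sudden if one step covers at least share_threshold of the total decline."""
    total_decline = rates[0] - rates[-1]
    if total_decline <= 0:
        return False
    _, step = largest_step_drop(rates)
    return step / total_decline >= share_threshold


for label, day in [("sudden_day", sudden_day), ("gradual_day", gradual_day)]:
    rates = conversion_rates(day)
    index, size = largest_step_drop(rates)
    print(f"{label}: largest drop at hour index {index}, sudden={looks_sudden(rates)}")
```

When the heuristic flags a sudden drop, the hour index gives a timestamp to compare against deploy logs and incident reports, which is usually the fastest way to confirm or rule out a technical cause.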