Building AI into Your Product Strategy

Writing AI Product Requirements


Traditional PRDs don't work for AI features. AI is probabilistic, not deterministic, so the requirements have to be written differently.

Why AI PRDs Are Different

| Traditional PRD | AI PRD |
| --- | --- |
| "Show user's name" | "Predict user preference with 85%+ accuracy" |
| "Button click → action" | "Input → model → probabilistic output" |
| "Same input = same output" | "Same input can = different outputs" |
| "Test once, ship" | "Monitor continuously, retrain" |

The AI PRD Template

Section 1: Problem Statement

What you're solving:

We're building [AI capability] to solve [user problem]
because [evidence of problem] and current solutions
[fail because X].

Example:

We're building automated content moderation to solve the problem of scaling our review team. Currently, moderators review 5,000 posts/day manually, creating a 6-hour backlog. 80% of flagged content is obvious violations that don't need human judgment.

Section 2: Success Metrics

Define measurable thresholds:

| Metric | Target | Rationale |
| --- | --- | --- |
| Accuracy | ≥90% | Industry benchmark for content moderation |
| Precision | ≥95% | False positives damage user trust |
| Recall | ≥85% | Acceptable for some violations to slip through to later review |
| Latency | <500ms | Real-time user experience |
| Cost | <$0.01/prediction | Stay within budget |

Critical: Define what "correct" means

For classification tasks:

  • What are all possible categories?
  • How do you handle edge cases?
  • What's the "gold standard" for comparison?
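
Once you have a gold-standard set, the precision and recall targets above become straightforward to compute. A minimal sketch (the function name and the use of "reject" as the positive class are illustrative assumptions, not part of the template):

```python
def precision_recall(predictions, gold, positive="reject"):
    """Compare model decisions against gold-standard labels for one class."""
    tp = sum(1 for p, g in zip(predictions, gold) if p == positive and g == positive)
    fp = sum(1 for p, g in zip(predictions, gold) if p == positive and g != positive)
    fn = sum(1 for p, g in zip(predictions, gold) if p != positive and g == positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # of all rejects, how many were right
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # of all true rejects, how many we caught
    return precision, recall
```

Running this against each category in your taxonomy tells you whether the ≥95% precision and ≥85% recall gates are met before you ever touch live traffic.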

Section 3: Input/Output Specification

Input:

What: User-generated text posts
Format: UTF-8 strings, 1-5000 characters
Languages: English, Spanish
Volume: 50,000 posts/day
Source: Posts API endpoint

Output:

What: Moderation decision
Format:
{
  "decision": "approve" | "flag" | "reject",
  "confidence": 0.0-1.0,
  "reason_code": "spam" | "hate" | "violence" | ...,
  "requires_human_review": boolean
}
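
Specifying the output contract this precisely means you can enforce it in code. A minimal validation sketch, assuming the schema above (the reason-code list is truncated here just as it is in the spec, so extend it per your policy):

```python
VALID_DECISIONS = {"approve", "flag", "reject"}
VALID_REASONS = {"spam", "hate", "violence"}  # extend per your moderation policy

def validate_output(payload: dict) -> bool:
    """Check a moderation response against the PRD's output contract."""
    return (
        payload.get("decision") in VALID_DECISIONS
        and isinstance(payload.get("confidence"), float)
        and 0.0 <= payload["confidence"] <= 1.0
        and payload.get("reason_code") in VALID_REASONS
        and isinstance(payload.get("requires_human_review"), bool)
    )
```

Rejecting malformed responses at this boundary keeps downstream systems from acting on outputs the PRD never defined.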

Section 4: Edge Cases & Error Handling

Define how to handle:

| Scenario | Expected Behavior |
| --- | --- |
| Model confidence <70% | Route to human review |
| Input too long | Truncate and process first 5,000 characters |
| Unsupported language | Default to human review |
| Model timeout | Retry once, then queue for async processing |
| Model returns an error | Log, alert, route to human review |
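
The fallback rules above can be wrapped around the model call itself. A simplified sketch: `model_call` is a hypothetical callable that returns a decision dict or raises on timeout/error, and the async queueing, logging, and alerting steps from the table are omitted for brevity:

```python
def route(post_text: str, model_call, max_len: int = 5000, conf_floor: float = 0.70):
    """Apply the PRD's edge-case rules around a model call (simplified sketch)."""
    text = post_text[:max_len]  # input too long: truncate to first 5,000 chars
    try:
        result = model_call(text)
    except Exception:
        try:
            result = model_call(text)  # timeout/error: retry once
        except Exception:
            # second failure: fall back to human review (log + alert omitted here)
            return {"decision": "flag", "requires_human_review": True}
    if result["confidence"] < conf_floor:
        result["requires_human_review"] = True  # low confidence: human review
    return result
```

The point is that every row in the edge-case table maps to an explicit branch; if a scenario has no branch, it isn't specified.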

Section 5: Data Requirements

Training data:

  • Minimum volume needed
  • Data sources
  • Labeling requirements
  • Refresh frequency

Example:

Need 50,000 labeled examples (approved/flagged/rejected) from the past 12 months. Labels must match current policy. Quarterly retraining with new policy updates.

Section 6: Human-in-the-Loop Design

Where do humans stay involved?

| Touchpoint | Trigger | Action |
| --- | --- | --- |
| Low confidence | <70% confidence | Human reviews decision |
| Appeals | User disputes a decision | Human re-reviews |
| Audit | Random 5% sample | Quality check |
| Drift detection | Weekly accuracy drop >2% | Alert ML team |
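
The drift-detection trigger is simple enough to state as code. A sketch, assuming you log one accuracy number per week (the function name is illustrative):

```python
def drift_alert(weekly_accuracy: list[float], max_drop: float = 0.02) -> bool:
    """Fire when accuracy falls by more than max_drop week-over-week."""
    if len(weekly_accuracy) < 2:
        return False  # not enough history to compare
    return (weekly_accuracy[-2] - weekly_accuracy[-1]) > max_drop
```

Writing the trigger down this concretely forces the PRD to answer questions like "drop relative to last week, or to launch baseline?" before launch, not during an incident.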

Section 7: Rollout Plan

AI features need gradual rollout:

| Phase | Traffic | Duration | Gate to Next |
| --- | --- | --- | --- |
| Shadow | 0% (logging only) | 2 weeks | Accuracy ≥85% |
| Canary | 5% | 1 week | No major issues |
| Beta | 25% | 2 weeks | Accuracy ≥88% |
| GA | 100% | Ongoing | Continuous monitoring |

Section 8: Monitoring & Alerts

What to track post-launch:

| Metric | Alert Threshold | Escalation |
| --- | --- | --- |
| Accuracy | <85% | Page on-call |
| Latency (p99) | >1000ms | Page on-call |
| Error rate | >1% | Slack alert |
| User appeals | >5% increase | Weekly review |
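
Thresholds like these belong in configuration, not in people's heads. A minimal sketch of an alert check against a metrics snapshot (the metric keys and alert labels are illustrative assumptions, not from the template):

```python
THRESHOLDS = {"accuracy_min": 0.85, "latency_p99_ms_max": 1000, "error_rate_max": 0.01}

def check_alerts(metrics: dict) -> list[str]:
    """Return the alerts that should fire for one metrics snapshot."""
    alerts = []
    if metrics["accuracy"] < THRESHOLDS["accuracy_min"]:
        alerts.append("page:accuracy")        # page on-call
    if metrics["latency_p99_ms"] > THRESHOLDS["latency_p99_ms_max"]:
        alerts.append("page:latency")         # page on-call
    if metrics["error_rate"] > THRESHOLDS["error_rate_max"]:
        alerts.append("slack:error_rate")     # Slack alert
    return alerts
```

In practice this logic lives in your monitoring stack, but expressing it in the PRD leaves no ambiguity about what "post-launch monitoring" means.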

Common PRD Mistakes

1. Treating AI like traditional software

  • Bad: "The model will correctly classify all spam"
  • Good: "The model will classify spam with ≥92% precision"

2. Ignoring edge cases

  • Bad: No mention of what happens when confidence is low
  • Good: Explicit fallback to human review below 70% confidence

3. No baseline comparison

  • Bad: "We want high accuracy"
  • Good: "We want 90% accuracy vs. current 75% rule-based system"

4. Missing monitoring requirements

  • Bad: PRD ends at launch
  • Good: PRD includes ongoing metrics, alerts, and retraining triggers

Template Checklist

Before finalizing your AI PRD:

  • Problem is clearly defined with evidence
  • Success metrics have specific numbers
  • Input/output formats are specified
  • Edge cases are documented
  • Human-in-the-loop touchpoints defined
  • Rollout plan has gates between phases
  • Monitoring and alerting requirements included
  • Baseline comparison established

Key Takeaway

AI PRDs must acknowledge uncertainty. Define success with ranges and thresholds, not absolutes. Plan for errors, monitoring, and continuous improvement.


Next: Should you build AI in-house or buy from vendors? Let's explore the decision framework.
