Agentic Prompt Architecture

Self-Correction and Error Recovery

5 min read

Production AI agents must handle failures gracefully. Real system prompts reveal sophisticated self-correction mechanisms that separate robust agents from brittle ones.

Error Classification

Production agents categorize errors for appropriate handling:

Error Classification System:
1. RECOVERABLE - Can fix automatically
   - Syntax errors in generated code
   - Missing imports
   - Type mismatches
   - Rate limits (retry with backoff)

2. CORRECTABLE - Needs adjustment
   - Wrong file modified
   - Incorrect approach taken
   - Partial solution needs extension

3. BLOCKING - Requires escalation
   - Missing permissions
   - Unknown API or library
   - Conflicting requirements
   - Ambiguous instructions

4. CRITICAL - Stop and alert
   - Data loss risk
   - Security vulnerability introduced
   - Breaking production code
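
One way to make this taxonomy actionable is to encode it in the agent's tooling. The sketch below uses toy keyword heuristics and illustrative class names; a real agent would classify from richer signals such as exit codes and structured tool output.

```python
from enum import Enum, auto

class ErrorClass(Enum):
    RECOVERABLE = auto()   # fix automatically (syntax errors, rate limits, ...)
    CORRECTABLE = auto()   # approach needs adjustment
    BLOCKING = auto()      # escalate to the user
    CRITICAL = auto()      # stop and alert

def classify_error(message: str) -> ErrorClass:
    """Toy keyword-based classifier for illustration only."""
    text = message.lower()
    if any(k in text for k in ("data loss", "security", "production")):
        return ErrorClass.CRITICAL
    if any(k in text for k in ("permission denied", "unknown api", "ambiguous")):
        return ErrorClass.BLOCKING
    if any(k in text for k in ("syntaxerror", "importerror", "rate limit", "type mismatch")):
        return ErrorClass.RECOVERABLE
    return ErrorClass.CORRECTABLE  # default: the approach needs adjusting

print(classify_error("SyntaxError: unexpected indent"))   # ErrorClass.RECOVERABLE
print(classify_error("Permission denied: /etc/shadow"))   # ErrorClass.BLOCKING
```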

Self-Validation Patterns

Pre-Execution Validation

Check before acting:

Pre-Execution Checklist:
Before modifying any file:
[ ] Have I read the file first?
[ ] Do I understand the existing code?
[ ] Is my change targeted and minimal?
[ ] Am I confident this won't break existing functionality?
[ ] Are there tests I should run after?

If any check fails, STOP and reconsider.
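
Expressed as code, the checklist becomes a gate that returns the failed checks before any write happens. This is a minimal sketch with illustrative field names, not any vendor's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class PendingEdit:
    path: str
    file_was_read: bool        # have I read the file first?
    change_is_minimal: bool    # is my change targeted and minimal?
    may_break_existing: bool   # could this break existing functionality?

def pre_execution_check(edit: PendingEdit) -> list[str]:
    """Return the list of failed checks; an empty list means proceed."""
    failures = []
    if not edit.file_was_read:
        failures.append("file not read before modification")
    if not edit.change_is_minimal:
        failures.append("change is broader than necessary")
    if edit.may_break_existing:
        failures.append("change may break existing functionality")
    return failures

edit = PendingEdit("src/auth.py", file_was_read=True,
                   change_is_minimal=True, may_break_existing=False)
if failures := pre_execution_check(edit):
    print("STOP and reconsider:", failures)
else:
    print("Proceed with edit")
```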

Post-Execution Validation

Verify after changes:

Post-Execution Verification:
After each modification:
1. Check for syntax errors
2. Verify imports resolve
3. Run type checker if available
4. Execute related tests
5. Confirm expected behavior

If verification fails:
- Analyze the error
- Determine root cause
- Apply fix or rollback
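
A sketch of what that verification loop could look like for a Python project, assuming mypy and pytest are available on the path (the tool choices are illustrative):

```python
import subprocess
import sys

def verify_change(path: str) -> bool:
    """Run cheap post-edit checks in order; return False on the first failure."""
    steps = [
        [sys.executable, "-m", "py_compile", path],  # syntax errors
        [sys.executable, "-m", "mypy", path],        # type checker, if installed
        [sys.executable, "-m", "pytest", "-q"],      # related tests
    ]
    for cmd in steps:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"Verification failed at: {' '.join(cmd)}")
            print(result.stdout, result.stderr)
            return False  # caller decides: apply a fix or roll back
    return True
```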

Claude Code's Safety Protocols

From Claude Code's system prompt:

Git Safety Protocol:
- NEVER update git config
- NEVER run destructive commands (push --force, hard reset)
- NEVER skip hooks (--no-verify)
- NEVER force push to main/master

Commit Amend Rules:
Only use --amend when ALL conditions are met:
1. User explicitly requested amend, OR commit succeeded
   but pre-commit hook auto-modified files
2. HEAD commit was created by you in this conversation
3. Commit has NOT been pushed to remote

If commit FAILED or was REJECTED by hook:
- NEVER amend
- Fix the issue
- Create a NEW commit
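
The amend rules translate almost directly into a boolean gate. The field names below are illustrative; the point is that every condition is checked, and a failed or hook-rejected commit short-circuits to a new commit instead:

```python
from dataclasses import dataclass

@dataclass
class CommitContext:
    commit_succeeded: bool           # False if the commit failed or a hook rejected it
    user_requested_amend: bool
    hook_modified_files: bool        # pre-commit hook auto-modified files
    head_created_this_session: bool  # HEAD was created by the agent in this conversation
    pushed_to_remote: bool

def may_amend(ctx: CommitContext) -> bool:
    """All conditions must hold; otherwise fix the issue and create a NEW commit."""
    if not ctx.commit_succeeded:
        return False  # never amend a failed or rejected commit
    condition_1 = ctx.user_requested_amend or ctx.hook_modified_files
    condition_2 = ctx.head_created_this_session
    condition_3 = not ctx.pushed_to_remote
    return condition_1 and condition_2 and condition_3
```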

Claude Code's Rollback Logic

Rollback Decision Tree:
If change causes errors:
├── Can error be fixed quickly?
│   ├── Yes → Apply fix
│   └── No → Consider rollback
├── Is the change critical path?
│   ├── Yes → Prioritize fix
│   └── No → Rollback, address later
└── Does user need working state?
    ├── Yes → Rollback immediately
    └── No → Continue debugging
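
One way to linearize that tree in code, checking the user's need for a working state first since it forces the most urgent action (the ordering is an interpretation, not taken from the prompt itself):

```python
def rollback_decision(quick_fix_possible: bool,
                      on_critical_path: bool,
                      user_needs_working_state: bool) -> str:
    """Walk the rollback decision tree and return the chosen action."""
    if user_needs_working_state:
        return "rollback immediately"
    if quick_fix_possible:
        return "apply fix"
    if on_critical_path:
        return "prioritize fix"
    return "rollback, address later"

print(rollback_decision(quick_fix_possible=False,
                        on_critical_path=False,
                        user_needs_working_state=True))  # rollback immediately
```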

Cursor's Error Recovery

Cursor agents handle failures systematically:

Cursor Error Recovery:
On task failure:
1. Capture full error output
2. Analyze error type:
   - Compilation error → Fix code
   - Runtime error → Debug logic
   - Test failure → Update test or code
   - Environment error → Check setup

3. Apply appropriate fix strategy:
   - Targeted edit for simple errors
   - Broader refactor for design issues
   - User consultation for ambiguous cases

4. Verify fix resolves original error
5. Check for regression
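
Step 2 of that flow is essentially a dispatch table from error category to fix strategy. A minimal sketch (the category names are illustrative):

```python
def fix_strategy(error_type: str) -> str:
    """Map an error category to the fix strategy named in step 3."""
    strategies = {
        "compilation": "targeted edit to fix code",
        "runtime": "debug logic",
        "test_failure": "update test or code",
        "environment": "check setup",
    }
    return strategies.get(error_type, "consult user")  # ambiguous cases go to the user

print(fix_strategy("test_failure"))  # update test or code
print(fix_strategy("unknown"))       # consult user
```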

Background Agent Recovery

Cursor's background agents follow a dedicated recovery protocol:

Background Agent Failure Protocol:
If background agent fails:
1. Save all progress to checkpoint
2. Log detailed error context
3. Notify user with:
   - What was attempted
   - Where it failed
   - What was completed
   - Suggested next steps
4. Preserve partial work
5. Allow user to resume or retry
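
A sketch of checkpointing plus user notification along those lines; the checkpoint schema and file naming are assumptions for illustration:

```python
import json
import time

def handle_background_failure(task_id: str, error: Exception,
                              completed_steps: list[str],
                              remaining_steps: list[str]) -> dict:
    """Persist a checkpoint, then build the user-facing notification."""
    checkpoint = {
        "task_id": task_id,
        "timestamp": time.time(),
        "completed": completed_steps,   # preserve partial work
        "remaining": remaining_steps,
        "error": repr(error),           # detailed error context for the log
    }
    with open(f"checkpoint_{task_id}.json", "w") as f:
        json.dump(checkpoint, f, indent=2)
    return {
        "attempted": remaining_steps[0] if remaining_steps else None,  # where it failed
        "failed_with": str(error),
        "completed": completed_steps,                                  # what was done
        "suggested_next_steps": ["resume from checkpoint", "retry the failed step"],
    }
```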

Devin's Escalation Protocol

From Devin 2.0's error handling:

Escalation Levels:
LEVEL 1: Self-Recovery
- Retry transient failures (max 3)
- Auto-fix common errors
- Continue without user

LEVEL 2: Inform User
- Log the issue
- Provide workaround
- Continue with degraded functionality

LEVEL 3: Seek Guidance
- Present problem clearly
- Offer alternative approaches
- Wait for user decision

LEVEL 4: Full Stop
- Halt all operations
- Preserve state
- Require explicit user restart
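
The four levels map naturally onto an ordered enum plus a small routing function. The signals below are illustrative, except the retry cap of three, which comes from the protocol above:

```python
from enum import IntEnum

class Escalation(IntEnum):
    SELF_RECOVERY = 1   # retry transient failures (max 3), auto-fix, continue
    INFORM_USER = 2     # log, provide a workaround, continue degraded
    SEEK_GUIDANCE = 3   # present the problem, offer alternatives, wait
    FULL_STOP = 4       # halt, preserve state, require explicit restart

def escalate(is_transient: bool, retries: int,
             has_workaround: bool, destructive_risk: bool) -> Escalation:
    """Choose an escalation level from coarse signals about the failure."""
    if destructive_risk:
        return Escalation.FULL_STOP
    if is_transient and retries < 3:
        return Escalation.SELF_RECOVERY
    if has_workaround:
        return Escalation.INFORM_USER
    return Escalation.SEEK_GUIDANCE
```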

Multi-Agent Error Handling

Devin's protocol for handling failures in coordinated sub-agents:

Agent Coordination Failure:
If dispatched agent fails:
1. Capture agent's final state
2. Analyze failure point
3. Decision:
   - Retry with same agent
   - Dispatch different agent
   - Absorb task to main agent
   - Escalate to user

Communication protocol:
{
  "agent_id": "test_writer_01",
  "status": "failed",
  "error": "...",
  "partial_work": {...},
  "retry_recommended": true
}
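
Mirroring that message as a typed structure makes the coordinator's decision in step 3 straightforward to implement. This is a sketch; the retry limits are assumptions:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class AgentFailureReport:
    agent_id: str
    status: str
    error: str
    partial_work: dict[str, Any] = field(default_factory=dict)
    retry_recommended: bool = False

def handle_agent_failure(report: AgentFailureReport, retries_so_far: int) -> str:
    """Decide how to proceed after a dispatched agent fails."""
    if report.retry_recommended and retries_so_far == 0:
        return "retry with same agent"
    if report.partial_work:
        return "absorb task into main agent"   # continue from the partial work
    if retries_so_far < 2:
        return "dispatch different agent"
    return "escalate to user"
```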

Windsurf's Memory-Aided Recovery

Windsurf uses memory for smarter recovery:

Memory-Aided Recovery:
On error, check memory for:
- Similar past errors and solutions
- User's preferred error handling
- Project-specific workarounds
- Known problematic patterns

Memory query:
<memory_search>
  type: error_resolution
  error_pattern: {{current_error}}
  project: {{current_project}}
</memory_search>

Apply learned solution if confidence > 80%
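
The lookup itself could be as simple as filtering stored resolutions by project and error pattern, then applying the best match only above the confidence bar. A sketch over an in-memory store (the record schema is an assumption):

```python
def recover_with_memory(memory_store: list[dict], current_error: str, project: str):
    """Return a remembered solution for a similar error, or None to fall back."""
    candidates = [
        m for m in memory_store
        if m["type"] == "error_resolution"
        and m["project"] == project
        and m["error_pattern"] in current_error
    ]
    best = max(candidates, key=lambda m: m["confidence"], default=None)
    if best and best["confidence"] > 0.8:   # the 80% threshold from above
        return best["solution"]
    return None                             # no confident match: recover normally
```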

Self-Critique Mechanisms

Production agents critique their own work:

Self-Critique Protocol:
After generating solution:
1. Review from user's perspective
   - Does this solve the actual problem?
   - Is it the simplest solution?

2. Review from maintainer's perspective
   - Is the code readable?
   - Will future developers understand it?

3. Review from security perspective
   - Any vulnerabilities introduced?
   - Input validation adequate?

4. Review from performance perspective
   - Any obvious inefficiencies?
   - Will it scale?

Score each dimension 1-5
If any score < 3, revise before presenting
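
In code, the four reviews become a scoring pass over named dimensions, with any score below 3 triggering a revision. The scoring function itself (typically another model call) is left abstract here:

```python
from typing import Callable

DIMENSIONS = ["solves_problem", "readability", "security", "performance"]

def critique(solution: str, score_fn: Callable[[str, str], int]) -> dict[str, int]:
    """Score the solution 1-5 on each review dimension."""
    return {dim: score_fn(solution, dim) for dim in DIMENSIONS}

def needs_revision(scores: dict[str, int]) -> bool:
    """Revise before presenting if any dimension scores below 3."""
    return any(score < 3 for score in scores.values())
```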

Iterative Self-Improvement

Self-Improvement Loop:
Generate → Critique → Improve

Iteration 1:
- Generate initial solution
- Critique: "Missing error handling"
- Improve: Add try-catch blocks

Iteration 2:
- Review improved solution
- Critique: "Error messages not helpful"
- Improve: Add descriptive messages

Iteration 3:
- Final review
- Critique: "Acceptable quality"
- Present to user

Max iterations: 3
Quality threshold: All dimensions ≥ 3
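
Tying the critique scores into the loop gives a bounded generate-critique-improve cycle. The sketch below assumes `generate`, `critique`, and `revise` are supplied by the agent (for example, as model calls):

```python
def improve_until_acceptable(task, generate, critique, revise, max_iterations: int = 3):
    """Generate → Critique → Improve until all dimensions score >= 3 or iterations run out."""
    solution = generate(task)
    for _ in range(max_iterations):
        scores = critique(solution)               # dict of dimension -> 1..5
        if all(score >= 3 for score in scores.values()):
            return solution                       # quality threshold met
        solution = revise(solution, scores)       # address the weakest dimensions
    return solution                               # best effort after max iterations
```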

Graceful Degradation

When a full solution isn't possible:

Graceful Degradation Strategy:
If cannot complete full task:
1. Complete what's possible
2. Document what's missing
3. Provide partial solution
4. Explain blockers clearly
5. Suggest manual steps

Example output:
"I completed 3 of 4 requested features:
✓ User authentication
✓ Password reset
✓ Email verification
✗ OAuth integration (blocked by missing API keys)

For OAuth, you'll need to:
1. Add API keys to .env
2. Run: npm run setup-oauth
3. I can complete once keys are available"
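
A report like that can be assembled mechanically from the task state, which keeps partial-completion messages consistent. A small sketch with illustrative inputs:

```python
def degradation_report(features: dict[str, bool], blockers: dict[str, str],
                       manual_steps: list[str]) -> str:
    """Build a partial-completion summary: done/blocked features plus manual steps."""
    done = sum(features.values())
    lines = [f"I completed {done} of {len(features)} requested features:"]
    for name, completed in features.items():
        mark = "✓" if completed else "✗"
        reason = f" (blocked by {blockers[name]})" if name in blockers else ""
        lines.append(f"{mark} {name}{reason}")
    if manual_steps:
        lines.append("")
        lines.extend(f"{i}. {step}" for i, step in enumerate(manual_steps, 1))
    return "\n".join(lines)
```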

Production Error Patterns

Common patterns from real system prompts:

| Pattern | When to Use | Example |
| --- | --- | --- |
| Retry with backoff | Transient failures | API rate limits |
| Rollback | Breaking changes | Failed refactor |
| Partial completion | Blocked tasks | Missing credentials |
| User escalation | Ambiguity | Unclear requirements |
| Checkpoint | Long tasks | Multi-file changes |

Key Insight: Production agents don't just try to succeed—they plan for failure. Error classification, validation checkpoints, escalation protocols, and graceful degradation create robust systems that maintain user trust even when things go wrong.

In the quiz, we'll test your understanding of agentic prompt architecture patterns.

Take Quiz