Code-review prompts (security / perf / readability)
The three-category review prompt
A free-form "review this code" prompt produces a free-form blob of feedback. You can't act on it — bug fixes are mixed with style nits, security issues are buried under naming opinions, and the model picks whatever order feels conversational. By the time you've parsed the response, you might as well have read the diff yourself.
The fix is to partition the review by category. Three categories cover most reviews you'll do: SECURITY, PERFORMANCE, READABILITY. Each finding goes in exactly one bucket. Each finding has a fixed format. The reviewer reading it knows where to look first.
Here is the full prompt, with the code under review inlined:
Three-category review prompt
Review this code in 3 categories. For each category, list 0–3 issues, each as: SEVERITY (high/med/low) — issue (line N) — fix in one line.
Categories: SECURITY, PERFORMANCE, READABILITY.
```python
import sqlite3

def get_user(conn, name):
    cur = conn.cursor()
    q = "SELECT id, email FROM users WHERE name = '" + name + "'"
    cur.execute(q)
    return cur.fetchone()
```

End with a 1-line verdict: APPROVE / REQUEST_CHANGES / BLOCK.
Claude Sonnet 4.5 returned:
## SECURITY
1. HIGH — SQL injection vulnerability (line 5) — Use parameterized query:
`cur.execute("SELECT id, email FROM users WHERE name = ?", (name,))`
## PERFORMANCE
1. LOW — Cursor not closed (line 4) — Use context manager or call
`cur.close()` after fetchone()
2. LOW — No index consideration (line 5) — Ensure `name` column is indexed
for query performance
## READABILITY
1. LOW — Variable name too short (line 5) — Rename `q` to `query` for clarity
2. MED — No input validation or error handling (line 3) — Add type check for
`name` parameter and handle exceptions
---
Verdict: BLOCK — Critical SQL injection vulnerability must be fixed before merge.
Captured from Claude Sonnet 4.5 (claude-sonnet-4-5) on 2026-04-27. Re-runs may differ slightly.
Notice the layout. The SQL injection issue lives under SECURITY, not buried in the middle of style nits. The verdict at the bottom is one word. A reviewer can scan this in fifteen seconds: high-severity security finding → BLOCK → done. They don't have to debate priorities; the structure pre-debates them.
The "0–3 issues" range is doing real work. Without it, the model finds something to say in every category, even when there's nothing wrong. A LOW readability issue in code that's actually fine just creates noise. The cap of 3 forces the model to pick the most important findings, and the floor of 0 lets categories be empty when they should be.
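For completeness, here is one way the function might look after acting on the findings. This is a sketch, not the model's output: it applies the parameterized query (HIGH security), closes the cursor (LOW performance), and validates the input (MED readability):

```python
import sqlite3

def get_user(conn, name):
    """Fetch (id, email) for a user by exact name match."""
    if not isinstance(name, str):  # MED finding: validate the input type
        raise TypeError("name must be a string")
    cur = conn.cursor()
    try:
        # HIGH finding: parameterized query prevents SQL injection
        cur.execute("SELECT id, email FROM users WHERE name = ?", (name,))
        return cur.fetchone()
    finally:
        cur.close()  # LOW finding: release the cursor explicitly
```

Note that the injection payload from the original code, e.g. `name = "x' OR '1'='1"`, is now treated as a literal string rather than SQL.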
Four patterns to take away:
| Pattern | What it does |
|---|---|
| Fixed categories | Reviewers always know where to look first |
| Severity tag (high/med/low) | Sorts the work without further reading |
| Location tag (line N) | Lets you click through to the location |
| One-line verdict at the end | Forces a decision; no waffling |
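A side benefit of the fixed finding format is that it's machine-parseable. Here's a sketch of a parser that assumes the exact `SEVERITY — issue (line N) — fix` layout the prompt requests:

```python
import re

# Matches lines like: "1. HIGH — SQL injection vulnerability (line 5) — Use ..."
FINDING = re.compile(r"(?P<sev>HIGH|MED|LOW)\s+—\s+(?P<issue>.+?)\s+\(line\s+(?P<line>\d+)\)")

def parse_findings(review: str):
    """Extract (severity, issue, line) tuples, sorted HIGH -> MED -> LOW."""
    order = {"HIGH": 0, "MED": 1, "LOW": 2}
    found = [(m["sev"], m["issue"], int(m["line"])) for m in FINDING.finditer(review)]
    return sorted(found, key=lambda f: order[f[0]])
```

This is what "sorts the work without further reading" looks like in practice: the severity tag doubles as a sort key.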
The verdict line is the most important constraint. Without it, you get a list of issues with no overall recommendation, and you have to integrate the findings yourself. With it, the model commits — APPROVE means "I'd merge this," REQUEST_CHANGES means "fix the high/med issues first," BLOCK means "high-severity finding, do not merge."
You will sometimes disagree with the verdict. That's fine. The point isn't to outsource the decision — it's to make the model's reasoning auditable so you can override it consciously. A model that says "APPROVE despite the SQL injection" gives you something to push back on. A model that hand-waves gives you nothing.
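Because the verdict is one of three fixed tokens, it's also easy to gate automation on it. A hypothetical CI hook (the function names are illustrative, not from any library):

```python
import re

def extract_verdict(review: str) -> str:
    """Pull the final APPROVE / REQUEST_CHANGES / BLOCK token from a review."""
    m = re.search(r"Verdict:\s*(APPROVE|REQUEST_CHANGES|BLOCK)", review)
    if m is None:
        raise ValueError("review is missing a verdict line")
    return m.group(1)

def ci_exit_code(review: str) -> int:
    """Map the verdict to a shell exit status: only APPROVE passes."""
    return 0 if extract_verdict(review) == "APPROVE" else 1
```

Overriding the model stays easy: the override is a one-line allowlist in your pipeline, applied after you've read the findings, not instead of reading them.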
Next up: how to use severity tags well.