Debugging prompts that actually localize
The wrong way to debug with an LLM is to paste the broken function and say "this is wrong, fix it." The model rewrites the entire thing, you lose the existing structure, and you end up reviewing a brand new implementation instead of a one-line fix.
The right pattern is what working engineers actually do: localize first, fix second. A localization prompt asks the model to find the bug — it does not ask the model to rewrite the function.
This module walks through the full localize-then-fix loop.
Here's a localization prompt against a buggy second_largest:
The following Python function is supposed to return the second-largest distinct number, but it sometimes returns the largest. Identify the bug, give the exact line, and the one-line fix. Do not rewrite the whole function.
```python
def second_largest(nums):
    if len(nums) < 2:
        return None
    largest = float("-inf")
    second = float("-inf")
    for n in nums:
        if n > largest:
            second = largest
            largest = n
        elif n > second:
            second = n
    return second
```

Test that fails: `second_largest([5, 5])` returns `5` (the largest), expected `None`.
Claude Sonnet 4.5 returned:
## Bug Identification
Line with bug: Line 10 (the `elif n > second:` condition)
Problem: When all numbers are the same (like `[5, 5]`), the function doesn't
check if `n` is distinct from `largest` before updating `second`. This allows
duplicate values to become the "second largest."
One-line fix:
elif n > second and n != largest:
This ensures that `second` only gets updated if the number is both greater than
the current second AND different from the largest, guaranteeing we get the
second-largest distinct number.
Captured from Claude Sonnet 4.5 (claude-sonnet-4-5) on 2026-04-27. Re-runs may differ slightly.
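It's worth verifying the suggested change before accepting it. A sketch that applies the model's one-line fix and re-runs the failing test (note: with only the `elif` change, `second` stays at the `-inf` sentinel when every element equals the maximum, so the sketch adds a final guard converting the sentinel to `None`; that guard is my addition, not part of the captured model output):

```python
def second_largest(nums):
    """Return the second-largest distinct number, or None if there isn't one."""
    if len(nums) < 2:
        return None
    largest = float("-inf")
    second = float("-inf")
    for n in nums:
        if n > largest:
            second = largest
            largest = n
        # Model's one-line fix: "and n != largest" keeps duplicates of the
        # maximum from being counted as the second-largest value.
        elif n > second and n != largest:
            second = n
    # Extra guard (not in the model's answer): if no distinct second value
    # was ever found, second is still the -inf sentinel; report None instead.
    return None if second == float("-inf") else second

assert second_largest([5, 5]) is None   # the failing test now passes
assert second_largest([3, 1, 2]) == 2
assert second_largest([7]) is None
```

Running the asserts catches exactly this kind of gap: a fix can be correct about the cause yet still leave the reported input failing.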
Three things make this prompt work:
- The expected behaviour is stated in one sentence — "supposed to return the second-largest distinct number." The model now knows what correctness means.
- A failing test case is included — `second_largest([5, 5])` returns `5`, expected `None`. This anchors the bug to a reproducible input.
- The output shape is constrained — "exact line, one-line fix, do not rewrite." The model can't escape into a refactor.
The "do not rewrite" instruction is the single most important part of this prompt. Without it, the model defaults to the helpful behaviour of returning a clean rewrite — which makes it harder to review, harder to land as a small PR, and harder to learn from.
When you can't include a failing test (the bug is intermittent, or you don't have a repro), describe the symptom precisely instead. "Returns 0 when input is empty" beats "doesn't work for empty inputs." Concrete inputs and outputs give the model a target. Vague prose gives it room to invent.
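The same checklist can be packaged as a small helper that assembles a localization prompt from whichever evidence you have: a failing test when one exists, otherwise a precise symptom. A hypothetical sketch (the function name and template wording are illustrative, not from this module):

```python
def build_localization_prompt(code, expected_behaviour,
                              failing_test=None, symptom=None):
    """Assemble a bug-localization prompt from three parts: the expected
    behaviour in one sentence, concrete evidence (failing test or precise
    symptom), and a constrained output shape ("do not rewrite")."""
    lines = [
        f"The following function is supposed to {expected_behaviour}.",
        "Identify the bug, give the exact line, and the one-line fix.",
        "Do not rewrite the whole function.",
        "",
        code,
        "",
    ]
    if failing_test:
        lines.append(f"Test that fails: {failing_test}")
    elif symptom:
        # No repro available: a concrete symptom still gives the model a
        # target, e.g. "returns 0 when the input list is empty".
        lines.append(f"Observed symptom: {symptom}")
    return "\n".join(lines)

prompt = build_localization_prompt(
    code="def total(xs): ...",
    expected_behaviour="sum the positive numbers in the list",
    symptom="returns 0 when the input list is empty",
)
```

Whether you write prompts by hand or generate them, keeping these three parts explicit makes the results far more repeatable.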
Next up: making the model explain a real traceback in plain language.