Structured output across vendors
Structured-output failure modes by vendor
Each vendor has a recognisable signature when structured output goes wrong. Knowing the signature helps you triage faster — instead of asking "what is wrong with my prompt?", you ask "is this the Claude failure, the GPT failure, or the Gemini failure?", and that question answers itself once you have seen each one twice.
Claude — wrapped JSON
The most common Claude failure shape on prompt-only JSON tasks is wrapping the valid JSON in a ```json ... ``` markdown fence. You saw this in lesson 1 of this module. The JSON inside is correct. The fence is not.
How to recognise it: your JSON.parse(output) throws "Unexpected token `" with the position pointing to the very first character.
How to fix it: either use Anthropic's tool-use mode (which returns typed objects, not text) or pre-process the output:
```typescript
function stripJsonFence(s: string): string {
  // Drop a leading ```json fence and a trailing ``` fence, if present.
  return s.replace(/^```json\s*/, "").replace(/\s*```$/, "").trim();
}
```
Run that before the parser. Done.
GPT-4o-mini — over-volunteering
When you ask GPT-4o-mini for a clean JSON object via prompt only, it usually delivers. When the prompt is for a more open task and the model volunteers structure, the structure is often surrounded by prose: "Here is the JSON object you requested:" before the object, "Let me know if you need any other fields!" after.
How to recognise it: your parser throws because the output looks like `Here is the JSON object you requested:\n{...}`, and the leading prose breaks the parse.
How to fix it: use OpenAI's `response_format: { type: "json_object" }` mode. With that flag set, GPT-4o-mini suppresses prose entirely. If you cannot use that mode (older API client, structured output not supported), the same pre-processing trick works: strip everything before the first `{` and everything after the last `}`.
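When JSON mode is unavailable, the strip-everything-outside-the-braces fallback is only a few lines. A minimal sketch; `extractJsonObject` is an illustrative helper name, not part of any vendor API:

```typescript
// Keep only the substring from the first "{" to the last "}".
// Handles the "prose before and after" shape; it does not attempt
// to repair invalid JSON inside the braces.
function extractJsonObject(s: string): string {
  const start = s.indexOf("{");
  const end = s.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("no JSON object found in model output");
  }
  return s.slice(start, end + 1);
}
```

Run it before the parser, exactly like the fence-stripper above: `JSON.parse(extractJsonObject(output))`.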
Gemini 2.5 Flash — truncation
Flash's signature is mid-string truncation, like lesson 1's `"products": ["a house blend", "single` example. The output starts well, the schema is honoured for the first few fields, then the response stops abruptly. Sometimes mid-string. Sometimes mid-array.
How to recognise it: the parse fails not at the start but somewhere inside, and the error message points to a position near the end of the response. The length of the response is suspiciously round (close to a power of two — 512, 1024, 2048 tokens).
How to fix it: raise max_output_tokens first, retry. If that does not fix it, escalate to Gemini Pro for the same task. If you cannot escalate, use the smallest possible JSON shape — flatten arrays, drop optional fields, shorten field names. Flash on long-output tasks is the wrong default.
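That raise-then-escalate ladder fits in a small retry loop. A sketch, using a synchronous `generate(model, maxOutputTokens)` stand-in for your actual client call (real clients are async, and the model names and option shape here are illustrative, not a real SDK signature):

```typescript
type Generate = (model: string, maxOutputTokens: number) => string;

// Try Flash with the default limit, then Flash with a raised limit,
// then escalate to Pro. A parse failure is treated as likely truncation.
function robustJson(generate: Generate): unknown {
  const attempts: Array<[string, number]> = [
    ["gemini-2.5-flash", 1024],
    ["gemini-2.5-flash", 4096], // first: raise max_output_tokens
    ["gemini-2.5-pro", 4096],   // then: escalate the model
  ];
  let lastError: unknown;
  for (const [model, maxOutputTokens] of attempts) {
    const text = generate(model, maxOutputTokens);
    try {
      return JSON.parse(text);
    } catch (e) {
      lastError = e; // truncated output: fall through to the next rung
    }
  }
  throw lastError;
}
```

The ladder caps your worst case at three calls; most requests succeed on the first rung and pay nothing extra.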
A general triage flow
When a structured-output failure happens in production, ask in this order:
- Does the response start with ```json? → Claude failure, strip the fence.
- Does the response start with prose ("Here is...", "Sure, ...")? → GPT failure, switch to JSON mode or strip the prose.
- Does the response stop mid-string or mid-array? → Gemini truncation, raise token limit or escalate.
- Is the JSON valid but the field types are wrong? → schema-design failure, not vendor failure. The model is doing what you told it to.
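The four questions above collapse into a small classifier. A sketch; the string heuristics mirror the checklist and are illustrative, not exhaustive:

```typescript
type FailureMode =
  | "claude-fence"      // strip the fence, reparse
  | "gpt-prose"         // switch to JSON mode, or strip the prose
  | "gemini-truncation" // raise the token limit, or escalate
  | "valid-json";       // parses: wrong field types are a schema problem

function triage(output: string): FailureMode {
  try {
    JSON.parse(output);
    return "valid-json";
  } catch {
    // fall through to the shape checks
  }
  const trimmed = output.trimStart();
  if (trimmed.startsWith("```")) return "claude-fence";
  if (!trimmed.startsWith("{") && !trimmed.startsWith("[")) return "gpt-prose";
  return "gemini-truncation"; // starts as JSON but fails mid-body
}
```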
The fourth case is the one most teams miss. If you ask for `founded_year: number` and the model returns `"founded_year": "2019"` (a string), that is your schema being too soft. Use OpenAI strict mode, Anthropic tool use, or a Zod-style runtime validator on your end.
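The runtime-validator option can be as small as a typed parse function. A hand-rolled, dependency-free sketch of the Zod-style check (the `Company` shape follows the `founded_year` example above; in a real codebase you would likely reach for Zod itself):

```typescript
interface Company {
  name: string;
  founded_year: number;
}

// Parse, then verify the field types the schema promised.
// A "2019"-as-string answer fails loudly here instead of deep in your code.
function parseCompany(raw: string): Company {
  const data = JSON.parse(raw) as Record<string, unknown>;
  if (typeof data.name !== "string") {
    throw new TypeError("name must be a string");
  }
  if (typeof data.founded_year !== "number") {
    throw new TypeError(`founded_year must be a number, got ${typeof data.founded_year}`);
  }
  return { name: data.name, founded_year: data.founded_year };
}
```

The pay-off is that everything downstream can trust the `Company` type instead of re-checking fields.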
Next module: reasoning modes — when to ask the model to think, and when thinking just costs you latency.