Structured output across vendors
JSON-only across vendors — three response shapes
The cleanest test of a vendor's structured-output discipline is a strict-JSON instruction. The user explicitly says "ONLY a JSON object (no prose, no markdown fence)" and provides the schema. There is one acceptable answer shape. There are many ways to get it wrong. Watch what each model does.
The prompt
Extract the structured data from this paragraph and return ONLY a JSON object (no prose, no markdown fence). Schema: { "company": string, "founded_year": number, "headquarters_city": string, "products": string[] }
Paragraph: "Bayt Coffee, founded in 2019 in Cairo, sells three main products: a house blend, single-origin Ethiopian, and a cold-brew concentrate."
Claude Sonnet 4.5
```json
{ "company": "Bayt Coffee", "founded_year": 2019, "headquarters_city": "Cairo", "products": ["house blend", "single-origin Ethiopian", "cold-brew concentrate"] }
```
GPT-4o-mini
{ "company": "Bayt Coffee", "founded_year": 2019, "headquarters_city": "Cairo", "products": [ "house blend", "single-origin Ethiopian", "cold-brew concentrate" ] }
Gemini 2.5 Flash
{"company": "Bayt Coffee", "founded_year": 2019, "headquarters_city": "Cairo", "products": ["a house blend", "single
Side-by-side scorecard
Strict-JSON test: who returns parseable bytes?
Claude Sonnet 4.5
- Fence violates 'no markdown' rule
- Needs pre-parse cleanup
GPT-4o-mini
- Only model that obeys the no-fence rule
- Direct parse downstream
Gemini 2.5 Flash
- Cut off mid-string at 'single
- Drifted on product wording ('a house blend')
What this tells us
Two failure modes and one clean pass, all visible in the captures.
Claude wrapped the JSON in a ```json ... ``` markdown fence even though the prompt said "no markdown fence". The content inside the fence is valid JSON, schema-correct, with the right field types. But if your downstream code naively does JSON.parse(output), it will throw because of the leading ```. This is Claude's most common structured-output failure mode: the JSON itself is correct, but it is wrapped.
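When the fence is the only problem, a small pre-parse cleanup recovers Claude-style output before handing it to `JSON.parse`. A minimal sketch; `parseFencedJson` is a hypothetical helper written for this article, not an Anthropic API:

```javascript
// Strip an optional leading ```json (or bare ```) fence and a trailing ```
// before parsing. Leaves unfenced output untouched, so it is safe to run
// on every vendor's response.
function parseFencedJson(raw) {
  const cleaned = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "") // opening fence, with or without a language tag
    .replace(/\s*```$/, "");          // closing fence
  return JSON.parse(cleaned);
}
```

Because the cleanup is a no-op on already-bare JSON, the same call works unchanged for the GPT-4o-mini capture above.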
GPT-4o-mini returned bare JSON: no fence, no prose, a clean parse. On this prompt it is the only model of the three whose output can go straight to JSON.parse. Note that GPT-4o-mini chose multi-line "pretty" formatting; that is fine for a parser. The product names match the source wording exactly.
Gemini 2.5 Flash truncated mid-string: "products": ["a house blend", "single — no closing quote, no closing bracket, no closing brace. Whatever max-output budget Gemini Flash has, this prompt hit it. Even worse: the partial text added a noise word ("a house blend" instead of "house blend") that was not in any other model's output. So Gemini drifted on word choice AND truncated.
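Truncation is harder to patch than a fence, but it is easy to detect before acting on partial data: attempt the parse, and on failure check for an unterminated string or unbalanced brackets. A sketch under those assumptions; `classifyJson` is illustrative only:

```javascript
// Attempt JSON.parse; on failure, scan the raw text to guess whether it was
// cut off mid-output (open string or unclosed braces/brackets) as opposed to
// being malformed for some other reason.
function classifyJson(raw) {
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch {
    let depth = 0, inString = false, escaped = false;
    for (const ch of raw) {
      if (escaped) { escaped = false; continue; }
      if (ch === "\\") { escaped = true; continue; }
      if (ch === '"') { inString = !inString; continue; }
      if (inString) continue;
      if (ch === "{" || ch === "[") depth++;
      if (ch === "}" || ch === "]") depth--;
    }
    return { ok: false, truncated: inString || depth > 0 };
  }
}
```

On the Gemini capture above this reports a truncated failure, which is the signal to retry or escalate rather than attempt a repair.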
For production, the patches are predictable. For Claude: strip a leading ```json and trailing ``` before parsing, or use Anthropic's tool-use mode, which returns a typed JSON object instead of free text. For GPT-4o-mini on this kind of task: pass response_format: { type: "json_object" }, which makes syntactically valid JSON an API-level guarantee rather than a prompt request. For Gemini: do not use Flash for strict JSON over more than a few hundred characters of input — escalate to Gemini Pro or use the structured-output API mode that Google ships separately.
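Whichever vendor produced the bytes, a schema check on the parsed object catches both wording drift and silently missing fields before the data reaches downstream code. A hand-rolled sketch for the article's schema; `matchesSchema` is hypothetical, and production code would more typically reach for a validator library such as Zod or AJV:

```javascript
// Verify the parsed object matches the prompt's schema:
// { company: string, founded_year: number, headquarters_city: string, products: string[] }
function matchesSchema(obj) {
  return (
    typeof obj === "object" && obj !== null &&
    typeof obj.company === "string" &&
    typeof obj.founded_year === "number" &&
    typeof obj.headquarters_city === "string" &&
    Array.isArray(obj.products) &&
    obj.products.every((p) => typeof p === "string")
  );
}
```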
Captured 2026-04-27 from Claude Sonnet 4.5, GPT-4o-mini, and Gemini 2.5 Flash. Re-runs may differ slightly.
Next: function calling — same idea, different API surface, similar parity issues.