The translation guide — porting a prompt between vendors
You wrote a prompt that works on one model. The output is exactly what you need. Your team is now asking you to support a second model — for cost, for redundancy, for compliance, for any of the reasons real engineering teams rebalance providers. The naive answer is "send the same prompt to the new model". The empirical answer, after the lessons in Modules 1 and 2, is that the same prompt does not produce the same output.
This lesson is the patch list. When you port a prompt from Claude to GPT or to Gemini, here are the moves that consistently work.
When porting from Claude to GPT-4o-mini
GPT-4o-mini's most consistent quirk is over-volunteering. It adds subject lines, signatures, "I hope you are doing well" preambles, and trailing commentary that you did not ask for. The patch is to be explicit about omission, not just inclusion.
In your prompt, replace soft instructions like "Output only the email body" with hard ones: "Output only the email body. Do not include a subject line. Do not include a sign-off. Do not include any meta-commentary." On a tone task, add: "Do not add salutations like 'I hope you are doing well'."
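As a sketch, the hardened version can be assembled mechanically before the prompt is sent to GPT-4o-mini. The task text and the helper name here are illustrative assumptions, not a tested incantation:

```python
# Sketch: harden a ported prompt with explicit omission rules for
# GPT-4o-mini. BASE_TASK and harden_prompt are hypothetical examples.

BASE_TASK = "Rewrite the customer's message as a polite follow-up email."

OMISSION_RULES = [
    "Output only the email body.",
    "Do not include a subject line.",
    "Do not include a sign-off.",
    "Do not include any meta-commentary.",
    "Do not add salutations like 'I hope you are doing well'.",
]

def harden_prompt(task: str, rules: list[str]) -> str:
    # Append each omission rule as its own sentence so none is buried
    # inside a softer instruction.
    return task + "\n\n" + "\n".join(rules)
```

Keeping the rules in a list also makes the negative constraints auditable: when a new unwanted artifact shows up in captures, you add one line rather than rewording the whole prompt.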
GPT-4o-mini also tends to wrap structured output in markdown fences even when told not to. You will see this in Module 3. Mitigation: when you parse the output, strip leading/trailing fences before JSON-decoding rather than trusting the model.
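A minimal sketch of that mitigation, assuming the model's reply arrives as a plain string (the function names are hypothetical):

```python
import json
import re

def strip_markdown_fences(text: str) -> str:
    # Remove a leading/trailing ``` fence pair (with an optional
    # language tag) if present; otherwise return the text unchanged.
    match = re.match(r"^```[a-zA-Z]*\s*\n(.*?)\n?```\s*$",
                     text.strip(), re.DOTALL)
    return match.group(1) if match else text.strip()

def parse_model_json(raw: str):
    # Strip fences first, then decode. Do not trust the model to have
    # obeyed a "no markdown" instruction.
    return json.loads(strip_markdown_fences(raw))
```

The same helper covers the clean case: unfenced JSON passes through untouched, so you can apply it unconditionally to every response.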
When porting from Claude to Gemini 2.5 Flash
Gemini Flash's most visible quirk in the live captures is truncation under multi-constraint prompts. If your Claude prompt has four hard rules, Gemini will sometimes follow one and stop. The patch is to reduce the constraint count, not to repeat them.
Move from "Follow rules 1, 2, 3, 4" to a single tight instruction. Replace "Use exactly 3 lines, each starting with a different vowel in alphabetical order, mention coffee once, no line over 60 characters" with "Three lines. Each line starts with a different vowel. Mention coffee in line 2. Keep lines under 60 chars." Same content, restructured so every constraint carries equal weight instead of nesting inside one long clause.
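The restructuring can be made mechanical: keep each rule as its own short sentence and join them flat. A hypothetical sketch, using the example rules above:

```python
# Hypothetical restructuring for Gemini Flash: short, equally weighted
# sentences instead of one nested, comma-chained clause.

FLAT_RULES = [
    "Three lines.",
    "Each line starts with a different vowel.",
    "Mention coffee in line 2.",
    "Keep lines under 60 chars.",
]

def flatten_rules(rules: list[str]) -> str:
    # One sentence per rule, space-joined: every constraint sits at the
    # top level of the instruction, none subordinate to another.
    return " ".join(rules)
```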
Also: on Flash, raise max_output_tokens to at least 1024. The truncation pattern in this course's captures looks consistent with a hidden output limit being hit. A higher cap will not fix all failures but it eliminates one class of them.
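A minimal sketch of enforcing that floor before each request, assuming your client accepts a plain generation-config mapping (the parameter name `max_output_tokens` matches the Gemini API; the helper itself is hypothetical):

```python
def flash_generation_config(requested_cap: int = 1024) -> dict:
    # Floor the cap at 1024 so a low default elsewhere in the codebase
    # cannot reintroduce the truncation class of failures.
    return {"max_output_tokens": max(requested_cap, 1024)}
```

Centralizing the floor in one helper means every call site inherits the fix, rather than relying on each caller to remember the raised cap.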
When porting from GPT to Claude
The reverse problem. Claude often produces more output than GPT for the same prompt because Claude's defaults are more thorough. If your downstream code expects the GPT shape of output (terse, tightly bounded), tell Claude explicitly: "Maximum N words" or "Maximum N bullet points". Claude follows count constraints reliably, so give it one.
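It is cheap to verify the bound on your side as well. A sketch, where the limit of 120 words is an arbitrary example value:

```python
WORD_LIMIT = 120  # arbitrary example limit

# Hypothetical suffix appended to the ported prompt for Claude.
COUNT_CONSTRAINT = f"Maximum {WORD_LIMIT} words."

def within_word_limit(output: str, limit: int = WORD_LIMIT) -> bool:
    # Check the count constraint downstream instead of assuming the
    # model honored it.
    return len(output.split()) <= limit
```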
Claude also sometimes wraps strict-JSON outputs in a markdown fence (you will see this in Module 3 lesson 1). Same mitigation as GPT: strip fences before parsing rather than trusting the model.
A general rule
Porting is not a search-and-replace. Porting means re-reading your prompt with the destination model's failure modes in mind, then tightening the language until the constraints land. Three captures side-by-side, like the ones you have read in this module, are the fastest way to see which constraints fall off.
Next module: structured output across vendors — JSON, function calling, and the schemas each API actually accepts.