# Tone & instruction-following across models

## Why the same prompt produces three outputs
Hagar finished the Foundation course six months ago. Her marketing side project at Bayt Coffee turned into a real role, and she is now the AI engineer at the small Cairo startup. Last Monday her CTO walked into standup with a single question: "Why are we paying for Claude when GPT-4o-mini is one-tenth the price? Should we switch?"
That question is what this course answers.
You already know how to write a prompt. The Foundation course taught you the 5-slot skeleton, format locking, system prompts, few-shot, the I-don't-know trigger. Those rules still hold. What changes here is that the same prompt — same words, same constraints, same examples — produces a different answer depending on which model receives it.
Not slightly different. Visibly different. Different lengths, different formats, different willingness to push back, different rule-following discipline, different tone defaults.
## What you will see across this course
Every comparison lesson in this course was captured live on 2026-04-27 from three frontier APIs:
| Model | Vendor | Role in the comparison |
|---|---|---|
| Claude Sonnet 4.5 | Anthropic | The careful, format-disciplined one |
| GPT-4o-mini | OpenAI | The chatty, friendly default |
| Gemini 2.5 Flash | Google | The fast, sometimes-truncating one |
These are not screenshots from a blog post. They are not paraphrased. The exact bytes the API returned are pasted into each lesson. When a model fails an instruction, you will see the failure verbatim. When two models give answers that look almost identical, you will see how the small differences matter.
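Under the hood, a capture like this is just one prompt string handed to three SDK clients. Here is a minimal sketch, assuming the `anthropic`, `openai`, and `google-genai` Python SDKs; the model IDs are assumptions based on current aliases, so pin the dated IDs your vendor publishes before relying on the numbers:

```python
# A sketch of how the captured outputs were gathered: the identical
# prompt goes to all three APIs; the raw text comes back untouched.
import anthropic
from openai import OpenAI
from google import genai

PROMPT = "Summarize the attached ticket in exactly three bullet points."

def ask_claude(prompt: str) -> str:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from env
    resp = client.messages.create(
        model="claude-sonnet-4-5",  # assumed alias; use a dated ID in production
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

def ask_gpt(prompt: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from env
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def ask_gemini(prompt: str) -> str:
    client = genai.Client()  # reads GEMINI_API_KEY from env
    resp = client.models.generate_content(
        model="gemini-2.5-flash", contents=prompt
    )
    return resp.text

for name, fn in [("claude", ask_claude), ("gpt", ask_gpt), ("gemini", ask_gemini)]:
    print(f"--- {name} ---")
    print(fn(PROMPT))  # printed verbatim: no strip(), no reformatting
```

The discipline that matters is in the last line: nothing downstream touches the string, so the failures you will see in these lessons are the model's, not the pipeline's.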
## What you will learn to decide
- Which model produces the cleanest output for this task — without any prompt rewriting.
- Which model needs the prompt rewritten to match its dialect, and what the rewrite looks like.
- Which model fails on this task in a way that no prompt fix can repair, and what to fall back to.
- What it costs you in tokens, latency, and dollars to get to a usable answer on each (the arithmetic is sketched below).
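The dollars part of that last point is plain arithmetic over token counts. A minimal sketch, with placeholder per-million-token prices; these are assumptions for illustration, not quoted rates, so substitute the numbers from each vendor's pricing page:

```python
# Cost of one request = input_tokens * input_price + output_tokens * output_price.
PRICE_PER_MTOK = {  # (input_usd, output_usd) per million tokens -- hypothetical
    "claude-sonnet-4.5": (3.00, 15.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gemini-2.5-flash": (0.30, 2.50),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    inp, out = PRICE_PER_MTOK[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# Example traffic profile: 1,200 tokens in, 400 tokens out, 10,000 requests/month.
for model in PRICE_PER_MTOK:
    monthly = 10_000 * request_cost(model, 1_200, 400)
    print(f"{model}: ${monthly:,.2f}/month")
```

Run against your real traffic profile, this one loop produces the per-model cost column of the comparison report Hagar's CTO is actually asking for.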
Hagar's CTO question — Claude vs GPT, should we switch — is not answered with "yes" or "no". It is answered with: "for these tasks GPT works fine, for these tasks Claude is worth paying for, for these tasks Gemini is faster, and here is the comparison report."
That report is your capstone.
Next: a strict-rule prompt sent to all three models. The output is uglier than you think.