Capstone — three production prompts
What good and bad submissions look like
You will grade yourself. The next person to read your prompt — a teammate, a future you, the engineer you hand it off to — will not. To make sure your self-grade survives that handoff, here is what each kind of submission looks like.
A passing submission
A capstone prompt that scores 8 or higher has the same shape every time:
- The system prompt fits in one screen. You can read it without scrolling.
- Each of the five slots has a clear line. You can point at the role, the capabilities, the constraints, the format, and the example — they are not blended together.
- The constraints include at least one banned-word rule and at least one refusal-scope rule.
- There is a single worked example showing the joint shape of all the rules above.
- The captured input is real (a real email, a real diff, a real ingredient list — not "imagine an email").
- The captured output is real (run through the live API, not "the model would probably say").
- The self-grade has 8, 9, or 10 ones in the rubric column, with a one-line note next to any zero explaining why it is not blocking.
If a teammate could pick up this prompt and ship a fix on Monday morning without asking you a single question, it passes.
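That shape is concrete enough to assemble mechanically. Here is a minimal sketch in Python under an invented "expense triage" scenario — everything in it (the role, the rules, the JSON keys) is illustrative, not part of the capstone spec. One slot per named variable, so a reviewer can point at each of the five:

```python
# A sketch of the five-slot shape under an invented "expense triage"
# scenario. Every name and rule below is illustrative, not part of
# the capstone spec.
ROLE = "You are an expense-triage assistant for a small finance team."

CAPABILITIES = (
    "You classify incoming expense emails and extract the vendor, "
    "amount, and date."
)

CONSTRAINTS = (
    "Hard rules:\n"
    '- Never use the words "probably" or "should be fine".\n'  # banned-word rule
    "- Refuse anything that is not an expense email. Reply exactly:\n"
    '  "Out of scope: I only handle expense emails."'  # refusal-scope rule
)

OUTPUT_FORMAT = 'Reply as one JSON object: {"vendor": str, "amount": str, "date": str}.'

EXAMPLE = (
    "Example:\n"
    "Input: Hi, attached is the $42.10 Staples receipt from May 3.\n"
    'Output: {"vendor": "Staples", "amount": "$42.10", "date": "2024-05-03"}'
)

# One slot per variable: a reviewer can point at each of the five.
SYSTEM_PROMPT = "\n\n".join([ROLE, CAPABILITIES, CONSTRAINTS, OUTPUT_FORMAT, EXAMPLE])
```

Joining the slots with blank lines keeps the whole thing on one screen and keeps the slots from blending together.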
A failing submission
Failing submissions break in the same predictable ways. Look at your three drafts and check whether any of these traps apply:
| Failure mode | What it looks like | Fix |
|---|---|---|
| Toy scenario | "Make me a haiku writer." Nobody asks for haiku at 3pm Tuesday. | Pick a real, repeating task. |
| Wall-of-text constraints | Three paragraphs of "be friendly, be warm, be helpful, also..." | Convert to bulleted hard rules. |
| No real input | "Here is what someone might send..." | Paste a real message you actually got. |
| Missing refusal scope | The assistant cheerfully discusses anything. | Add a "refuse anything outside X" line. |
| Forged outputs | "The model would probably reply..." | Run it through the API (see the capture sketch after this table) and paste the actual reply. |
| Self-grade inflation | 10/10 with no critical eye. | Read the prompt as if a stranger wrote it; grade harder. |
| Over 400 words | Prompts that try to anticipate every edge case. | Cut. The model handles edges if the rules are clear. |
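The forged-outputs fix is the cheapest one to script. Here is a minimal capture sketch, assuming the OpenAI Python SDK (`pip install openai`) and an `OPENAI_API_KEY` in your environment; the filename and model name are placeholders, and any provider's SDK works the same way:

```python
# Capture a real output: run a real input through the live API and
# paste what comes back verbatim. Assumes the OpenAI Python SDK and
# an OPENAI_API_KEY environment variable; filename and model name
# are placeholders.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

# A real message you actually received, saved to disk -- not an imagined one.
real_input = Path("captured_email.txt").read_text()

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whatever model you ship with
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},  # the five-slot string from the sketch above
        {"role": "user", "content": real_input},
    ],
)

# This, verbatim, is your captured output.
print(response.choices[0].message.content)
```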
Passing submission vs failing submission
| Passing (≥ 8/10) | Failing (< 8/10) |
|---|---|
| Teammate can pick it up cold on Monday | Toy scenario nobody asked for |
| Refusal scope holds against pushback | No refusal scope, no I-don't-know trigger |
| Output matches the spec, not just the rules | Spec and actual output do not match |
How to actually self-grade without lying to yourself
Three tactics:
- Wait 24 hours before grading. A prompt feels great the moment you write it. Sleep on it; come back tomorrow; grade then.
- Read it like a teammate inheriting it. Ask "if I were debugging this at 3am, would I know what each line was for?" If not, the prompt is not ready.
- Grade against the captures. If the prompt scored 9/10 on the rubric but the actual model output you captured does not match the spec, the prompt failed criterion 9, so drop a point (a tally sketch follows this list).
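If you want that tally to be mechanical, here is a sketch. The ten criterion names are invented stand-ins for your rubric's real wording; the ≥ 8 pass bar and the one-line-note-per-zero rule come from the capstone spec:

```python
# A mechanical tally for the self-grade. The ten criterion names are
# invented stand-ins for the rubric's real wording; the >= 8 pass bar
# and the one-line-note-per-zero rule come from the capstone spec.
rubric = {
    "fits on one screen": 1,
    "five slots pointable": 1,
    "banned-word rule present": 1,
    "refusal-scope rule present": 1,
    "single worked example": 1,
    "real captured input": 1,
    "real captured output": 1,
    "under 400 words": 1,
    "output matches spec": 0,  # the capture drifted from the format slot
    "teammate could ship a fix cold": 1,
}
notes = {
    "output matches spec": "JSON keys in the capture do not match the format slot",
}

score = sum(rubric.values())
print(f"{score}/10 -- {'pass' if score >= 8 else 'fail'}")
for criterion, hit in rubric.items():
    if hit == 0:
        print(f"  0 on {criterion!r}: {notes.get(criterion, 'MISSING one-line note')}")
```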
Honesty here is what makes the rubric useful. Every prompt engineer who ships in production has shipped a prompt they thought was 10/10 that turned out to be a 6. The difference is they caught it before users did.
You are done
When all three prompts pass at 8 or above, the capstone is complete. Hagar finished hers in one weekend. You can finish yours in less if your scenarios are clear. Save the three prompts somewhere you will find them in six months — they are the foundation you will iterate on for the rest of your prompt-engineering career.
Welcome out the other side. Module 1 was a worried Hagar staring at a blank ChatGPT tab. The capstone is you, with three production prompts in your back pocket.