GPT-5.5 Instant: ChatGPT's New Default Model in 2026

May 6, 2026

TL;DR

OpenAI released GPT-5.5 Instant on May 5, 2026 — the new default model for ChatGPT, replacing GPT-5.3 Instant. The headline change is accuracy: 52.5% fewer hallucinated claims than GPT-5.3 Instant on high-stakes prompts in medicine, law, and finance, and 37.3% fewer inaccurate claims on conversations users had flagged for factual errors. AIME 2025 rises to 81.2 (up from 65.4) and MMMU-Pro to 76 (up from 69.2). The model rolls out today to all ChatGPT users — Free, Plus, Pro, Go, Business, and Enterprise — with personalization from past chats, files, and Gmail launching first on Plus and Pro web, then expanding. In the API, GPT-5.5 Instant is reachable as the chat-latest alias at $5/$30 per million input/output tokens with a 400K-token context window.


What You'll Learn

  • What GPT-5.5 Instant is and how it differs from GPT-5.3 Instant
  • The benchmark and hallucination numbers OpenAI cites — and what they mean in practice
  • What's actually new in ChatGPT today: Memory Sources and Gmail-aware personalization
  • Where GPT-5.5 Instant sits in the wider GPT-5.5 family (Instant vs Thinking vs Pro)
  • Pricing, context windows, and the chat-latest API alias
  • Style and tone changes: fewer emojis, tighter answers, same warmth

Release Details

GPT-5.5 Instant launched on May 5, 2026, the day before this post. It is rolling out as the new default model in ChatGPT for all logged-in users on every paid and free plan, and it replaces GPT-5.3 Instant — which had only been the default since March 3, 2026 — in both the consumer product and the API.[1]

In the API, GPT-5.5 Instant is reached through the chat-latest alias rather than a frozen, dated snapshot. That alias automatically routes to whatever Instant model currently powers ChatGPT, so any future Instant update will replace this one without a developer-side ID change. For teams that need a pinned version, OpenAI has stated that the previous Instant model will remain addressable for paying API customers for three months after the swap.[2]
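A minimal sketch of how a team might manage that alias-versus-snapshot trade-off in client code. The chat-latest alias comes from this post; the dated snapshot ID below is a hypothetical placeholder, not a confirmed OpenAI model name:

```python
# Sketch: choose between the moving alias and a pinned snapshot.
# "chat-latest" is the alias named in this post; the dated snapshot
# ID below is hypothetical, used only to illustrate pinning.

ALIAS = "chat-latest"                   # follows whatever Instant model powers ChatGPT
PINNED = "gpt-5.5-instant-2026-05-05"   # hypothetical dated snapshot ID

def model_for(reproducible: bool) -> str:
    """Use the pinned snapshot when outputs must be reproducible;
    otherwise track the alias and accept silent model swaps."""
    return PINNED if reproducible else ALIAS
```

Evaluation harnesses and regression tests would call `model_for(True)`; everyday traffic can ride the alias and pick up each default swap automatically.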

This is the same overall release cadence OpenAI has used through the GPT-5 series: a fast, latency-optimized "Instant" model for everyday ChatGPT traffic, separate from the deeper-reasoning Thinking and Pro tiers. The full GPT-5.5 reasoning model and the GPT-5.5 Pro variant launched in ChatGPT on April 23, 2026, with API access for both arriving the next day; OpenAI cited "different safeguards" needed for API deployment at scale as the reason for the staggered rollout.[3]


Hallucination Reductions: The Headline Number

OpenAI's central pitch for GPT-5.5 Instant is fewer factual errors on the kinds of prompts where errors matter most:

| Evaluation | Improvement vs GPT-5.3 Instant |
|---|---|
| Hallucinated claims on high-stakes prompts (medicine, law, finance) | 52.5% fewer |
| Inaccurate claims on conversations users flagged for factual errors | 37.3% fewer |

The first number — 52.5% — is on prompts OpenAI specifically classifies as "high-stakes" in domains where a confident wrong answer can do real damage. The second number — 37.3% — comes from a harder set: real conversations that ChatGPT users had already marked as containing factual errors. Improving on a hand-picked corpus of known failures is a tougher test than improving on a fresh evaluation set.[4]

These are OpenAI's internal evaluations. Independent third-party hallucination benchmarks have not yet reported numbers for GPT-5.5 Instant, and cross-model hallucination comparisons are notoriously sensitive to methodology — the same model can score in the low single digits on grounded summarization (Vectara HHEM) and in the double digits on harder fabrication benchmarks (BridgeBench, CometAPI). Treat OpenAI's numbers as relative improvements over GPT-5.3 Instant rather than absolute hallucination rates.
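To make that distinction concrete: a relative reduction only tells you the new rate as a fraction of the old one, so the absolute improvement depends entirely on the baseline. A small sketch, using an illustrative baseline rate that is an assumption, not an OpenAI-reported figure:

```python
def reduced_rate(baseline_rate: float, relative_reduction: float) -> float:
    """Apply a relative reduction (e.g. 0.525 for '52.5% fewer')
    to an assumed absolute baseline hallucination rate."""
    return baseline_rate * (1.0 - relative_reduction)

# Illustrative only: if GPT-5.3 Instant hallucinated on 8% of
# high-stakes prompts, a 52.5% relative drop would imply:
print(round(reduced_rate(0.08, 0.525), 4))  # 0.038
```

The same 52.5% headline would mean a 0.95-point absolute drop on a 2% baseline and a 9.5-point drop on an 18% baseline, which is why the absolute baseline matters.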

For ChatGPT users, the practical effect is that the default model fielding medical, legal, and financial questions has been retrained to be more conservative when it doesn't actually know the answer.


Benchmark Gains: AIME 2025 and MMMU-Pro

OpenAI also published two academic benchmark scores for GPT-5.5 Instant against its predecessor:

| Benchmark | GPT-5.5 Instant | GPT-5.3 Instant | Delta |
|---|---|---|---|
| AIME 2025 (math reasoning) | 81.2 | 65.4 | +15.8 |
| MMMU-Pro (multimodal reasoning) | 76 | 69.2 | +6.8 |

AIME 2025 in the LLM benchmarking sense is the 30-problem set from the 2025 American Invitational Mathematics Examination (AIME I and AIME II), used as a math-reasoning evaluation for frontier models. A 15.8-point swing here on a low-latency Instant model, without invoking the Thinking or Pro tiers, is a notable jump.

MMMU-Pro is the more robust version of the Massive Multi-discipline Multimodal Understanding benchmark — it filters out questions answerable from text alone, expands answer choices from four to ten, and adds a vision-only setting where the question is embedded in an image. The 6.8-point gain is smaller in absolute terms but lands on a benchmark explicitly designed to resist guessing and text-only shortcuts.[5]
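One way to see why the ten-choice format resists guessing: the random-guess baseline falls in proportion to the number of choices, so a lucky-guess strategy loses most of its value. A trivial check:

```python
def random_guess_accuracy(num_choices: int) -> float:
    """Expected accuracy (in percent) of uniform random guessing
    on a multiple-choice benchmark."""
    return 100.0 / num_choices

print(random_guess_accuracy(4))   # 25.0 — MMMU's original four-choice format
print(random_guess_accuracy(10))  # 10.0 — MMMU-Pro's expanded format
```

Against a 10% guessing floor, GPT-5.5 Instant's 76 is a much larger margin over chance than the same score would be on a four-choice test.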

Both improvements come without a latency penalty. OpenAI explicitly says GPT-5.5 Instant maintains the response time of GPT-5.3 Instant while moving the accuracy needle.


What's Actually New in the ChatGPT App: Memory Sources

Alongside the model swap, OpenAI is rolling out a transparency feature called Memory Sources. When ChatGPT personalizes an answer using context it remembers about you, you can now tap the "Sources" icon below the response and see what it pulled from.

For all consumer plans, Memory Sources can show:

  • Saved memories from settings
  • Past chats it referenced
  • Custom instructions that influenced the answer

For Plus and Pro users on the web, it can also show:

  • Files in your library that the model referenced
  • Emails it pulled from a connected Gmail account

Each source is editable. You can delete chats you don't want cited, change saved memories in settings, or use temporary chats that don't read or write to memory at all.[6]

OpenAI is upfront that Memory Sources isn't a complete audit trail: it "may not show every factor that shaped a response," and it only appears inside your own account view — shared chats don't display sources to the recipient. This is a step toward explaining why ChatGPT said what it said, not a full provenance system.

The personalization feature itself — the model proactively pulling from past chats, files, and Gmail — is launching first for Plus and Pro on the web, with Free, Go, Business, and Enterprise plans expected to follow in the coming weeks. The base GPT-5.5 Instant model swap is happening immediately for everyone.[7]


The GPT-5.5 Family

GPT-5.5 Instant doesn't ship alone. OpenAI's current lineup looks like this:

| Variant | Released | Best for | Highest published numbers |
|---|---|---|---|
| GPT-5.5 Instant | May 5, 2026 | Default ChatGPT use; low latency | AIME 2025: 81.2; MMMU-Pro: 76 |
| GPT-5.5 Thinking | April 23, 2026 | Harder problems, multi-step reasoning | |
| GPT-5.5 Pro | April 23, 2026 (API: April 24, 2026) | Highest-accuracy work, Pro/Business/Enterprise | BrowseComp: 90.1%; FrontierMath Tier 4: 39.6% |

Instant is the model most ChatGPT users will actually hit. Thinking is the deliberate-reasoning variant for complex, multi-step problems. Pro is the highest-accuracy tier — gated to paid plans — and is where OpenAI has put its strongest published numbers on agentic browsing (BrowseComp at 90.1%) and frontier math (FrontierMath Tier 4 at 39.6%).[3]

Behind the scenes, when a user has Instant selected in ChatGPT, the system can automatically escalate from GPT-5.5 Instant to GPT-5.5 Thinking on harder requests. In OpenAI's framing, GPT-5.5 Instant is "a single auto-switching system" — a unified entry point that calls in deeper reasoning when the task warrants it. GPT-5.5 Pro is gated to Pro, Business, and Enterprise users per OpenAI's announcement and is chosen from the model picker, not the everyday Instant default.


Pricing and API Access

GPT-5.5 Instant is available to developers through the chat-latest API alias. Published rates for that alias:

| Tier | Input | Output |
|---|---|---|
| Standard | $5.00 / 1M tokens | $30.00 / 1M tokens |
| Batch / Flex (50% off) | $2.50 / 1M tokens | $15.00 / 1M tokens |
| Priority (2.5×) | $12.50 / 1M tokens | $75.00 / 1M tokens |

The chat-latest context window is 400,000 tokens — large enough for substantial documents, but smaller than the 1,050,000-token ceiling on the full GPT-5.5 reasoning model. For prompts above 272,000 input tokens, OpenAI applies a 2× multiplier to input and a 1.5× multiplier to output for the rest of the session under standard, batch, and flex tiers.[8]
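Based on the standard-tier rates above, here is a hedged cost sketch. How OpenAI splits the billing at the 272K boundary is an assumption here: the post says the multipliers apply "for the rest of the session", which is ambiguous, so this sketch simply applies them to the whole request once input crosses the threshold:

```python
# Standard-tier rates from this post, in dollars per 1M tokens.
INPUT_RATE = 5.00
OUTPUT_RATE = 30.00
LONG_CONTEXT_THRESHOLD = 272_000

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate standard-tier cost for one request.

    Assumption: once input exceeds the 272K threshold, the 2x input
    and 1.5x output multipliers apply to the entire request (the
    exact billing split is not spelled out in the announcement)."""
    in_mult, out_mult = (2.0, 1.5) if input_tokens > LONG_CONTEXT_THRESHOLD else (1.0, 1.0)
    cost = (input_tokens / 1e6) * INPUT_RATE * in_mult
    cost += (output_tokens / 1e6) * OUTPUT_RATE * out_mult
    return round(cost, 4)

# A 100K-in / 2K-out request stays below the threshold:
print(request_cost(100_000, 2_000))   # 0.56
# A 300K-in / 2K-out request triggers both multipliers:
print(request_cost(300_000, 2_000))   # 3.09
```

The jump from $0.56 to $3.09 for 3× the input illustrates why long-context requests deserve their own budget line.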

chat-latest is a moving target. The alias auto-resolves to whichever Instant model currently powers ChatGPT, so anything you build against it will silently update the next time OpenAI swaps the default. Teams that need reproducibility should pin to a dated snapshot rather than the alias.

Knowledge cutoff for the GPT-5.5 family is December 2025.[8]


Style and Tone Changes

OpenAI describes GPT-5.5 Instant as "smarter, clearer, and more personalized" — but the more concrete change is what it stops doing. Coverage of the launch noted that the model:

  • Uses fewer "gratuitous emojis"
  • Trims verbosity and over-formatting
  • Gives "tighter and more to-the-point" responses
  • Keeps warmth and personality, but with less ornamentation

For users who appreciated GPT-5.3 Instant's "anti-cringe" direction earlier this year — cutting unnecessary proclamations and overly enthusiastic phrasing — GPT-5.5 Instant continues that direction rather than reversing it. The model is meant to feel concise without becoming clinical.[9]


How It Compares

Within ChatGPT's default tier, GPT-5.5 Instant is a clean upgrade: same latency, materially fewer hallucinations on the prompts where errors matter most, meaningful jumps on AIME 2025 and MMMU-Pro. There is no obvious user-facing reason to prefer GPT-5.3 Instant — and OpenAI is removing that choice anyway by making the swap the new default.

Against the wider field, GPT-5.5 Instant occupies a specific niche. It is OpenAI's bet that low-latency, default-model accuracy is where the next round of competition matters — not just headline benchmark scores from heavyweight reasoning models like Claude Opus 4.7 or GPT-5.4. For users running thousands of quick prompts a day, a 52.5% drop in high-stakes hallucinations is felt more directly than a six-point SWE-bench Pro gain on a model they rarely call.

Where Instant is still not the right tool: deep multi-step reasoning, autonomous agentic loops, and tasks where the cost of being wrong dwarfs the cost of waiting. Those go to GPT-5.5 Thinking or GPT-5.5 Pro — or, depending on the benchmark, to Claude Opus 4.7 or Gemini 3.1 Pro.


The Bottom Line

GPT-5.5 Instant is a quiet but consequential update: not a frontier-pushing reasoning model, but the model billions of ChatGPT prompts a day will actually use. The hallucination drop OpenAI reports — 52.5% on high-stakes medical, legal, and financial questions — is the kind of change that doesn't show up on a leaderboard but does show up in whether you can trust a default-model answer to a question that matters. Pair that with Memory Sources as a transparency layer and it's clear OpenAI's bet here is on default-tier reliability, not benchmark fireworks.

For day-to-day ChatGPT users, the right move is to notice the change, watch how the new model behaves on your usual prompts, and use Memory Sources when you want to know why ChatGPT thinks it knows you.


References

Footnotes

  1. OpenAI releases GPT-5.5 Instant, a new default model for ChatGPT — TechCrunch

  2. OpenAI updates ChatGPT Instant with GPT 5.5 — Axios

  3. GPT-5.5 Complete Guide: Thinking, Pro & 1M Context — Digital Applied

  4. GPT-5.5 Instant: smarter, clearer, and more personalized — OpenAI

  5. OpenAI launches GPT-5.5 Instant as new ChatGPT default — Testing Catalog

  6. OpenAI Releases GPT-5.5 Instant for ChatGPT With Improved Accuracy and Memory Controls — iClarified

  7. ChatGPT update rolls out GPT-5.5 Instant with fewer hallucinations and more personalized answers — The Decoder

  8. GPT-5.5 Model — OpenAI API docs

  9. OpenAI releases GPT-5.5 Instant update to make ChatGPT smarter with fewer emoji — 9to5Mac

Frequently Asked Questions

What is GPT-5.5 Instant?

GPT-5.5 Instant is OpenAI's new default model for ChatGPT, released on May 5, 2026. It replaces GPT-5.3 Instant for all ChatGPT users — Free, Plus, Pro, Go, Business, and Enterprise — and is reached in the API via the chat-latest alias.
