Is MAI-Thinking-1 better than Claude?

Microsoft says MAI-Thinking-1 is toe-to-toe with Claude Opus 4.6 on SWE-Bench Pro and was preferred over Claude Sonnet 4.6 in a blind human evaluation of 1,276 tasks. 2 These are Microsoft's own results, so independent benchmarking is still needed.

Can I use the MAI models today?

Yes, partially. MAI-Code-1-Flash is rolling out to GitHub Copilot individual users in VS Code. MAI-Thinking-1 is in private preview on Microsoft Foundry. The models are also distributed via OpenRouter, Fireworks, and Baseten, and developers can tune the weights themselves. 1 2 3

Did Microsoft distill these from OpenAI or other labs?

No. Microsoft repeatedly states it does not distill from other labs and trains its reasoning models from scratch on clean, licensed data. 1 2

What is Frontier Tuning?

It is Microsoft's method for adapting a MAI model to an organization's own workflows using reinforcement learning in real-world environments, producing a custom model the customer owns. 1

ai-ml

Microsoft MAI Models Explained: 7 In-House AI Models

June 22, 2026

#Microsoft MAI #MAI-Thinking-1 #MAI-Code-1-Flash #Microsoft AI #Mustafa Suleyman #GitHub Copilot #Build 2026 #frontier models #superintelligence

Microsoft MAI Models Explained: 7 In-House AI Models

At Build 2026, Microsoft unveiled seven first-party MAI models spanning reasoning, coding, image, transcription, and voice. All were trained from scratch with no distillation from other labs — a deliberate push toward what Microsoft calls long-term self-sufficiency, after years of leaning on OpenAI.¹

TL;DR

On June 2, 2026, Microsoft AI chief Mustafa Suleyman announced a family of seven new MAI (Microsoft AI) models built entirely in-house.¹ The headliners are MAI-Thinking-1, a 35-billion-active-parameter reasoning model that Microsoft says goes toe-to-toe with Claude Opus 4.6 on SWE-Bench Pro, and MAI-Code-1-Flash, a 5-billion-active-parameter coding model now rolling out inside GitHub Copilot.²³ Microsoft trained these models from the ground up on clean, licensed data — no distillation from rival labs — and co-designed them with its own Maia 200 silicon. The strategic message is clear: Microsoft wants to stop being dependent on a single partner for frontier intelligence.

What You'll Learn

What the seven MAI models are and which task each one handles
How MAI-Thinking-1 and MAI-Code-1-Flash perform on Microsoft's benchmarks
Why "trained from scratch, no distillation" is the strategic core of the launch
What Frontier Tuning is and why enterprises should care
Where you can actually use the models today

Why Microsoft built its own models

Microsoft spent years as one of OpenAI's largest financial backers, and Copilot has run heavily on OpenAI's models. Microsoft began shipping in-house models in 2025 with MAI-1-preview and MAI-Voice-1; this Build 2026 release is a major expansion of that lineup into reasoning and agentic coding, advancing a pivot the company frames as "long-term self-sufficiency."¹⁴ CNBC characterized the move as Microsoft working to lessen its reliance on OpenAI and lower costs for developers.⁵

Suleyman framed the launch around scale. He noted that the compute used to train frontier models has grown by a factor of one trillion, and said Microsoft expects another thousand-fold increase over the next three years.¹ The stated goal is what Microsoft calls a "hill-climbing machine" — an organization and pipeline that keeps improving cycle after cycle as it adds compute, better data, and sharper evaluation.

The seven MAI models at a glance

The MAI family is a multimodal lineup, with each model specialized for a different kind of real-world work.¹

Model	Job	Microsoft's headline claim
MAI-Thinking-1	Reasoning	Toe-to-toe with Claude Opus 4.6 on SWE-Bench Pro²
MAI-Code-1-Flash	Agentic coding	Outperforms Claude Haiku 4.5 across coding benchmarks tested³
MAI-Image-2.5	Text-to-image + editing	Launched No. 2 for image editing on Arena¹
MAI-Image-2.5-Flash	Efficient image	Ultra-efficient variant of MAI-Image-2.5¹
MAI-Transcribe-1.5	Transcription	SOTA accuracy, 5× faster, across 43 languages¹
MAI-Voice-2	Speech generation	Natural-sounding speech across 15 languages¹
MAI-Voice-2-Flash	Efficient voice	Lower-cost variant, coming soon¹

A consistent thread runs through all seven: Microsoft says it does not distill from other labs and does not rely on opaque data. Its datasets are described as clean, traceable, and enterprise-grade.¹

MAI-Thinking-1: the flagship reasoner

MAI-Thinking-1 is the family's reasoning model and its most ambitious claim. It is a sparse Mixture-of-Experts model with 35 billion active parameters out of roughly 1 trillion total — a small inference footprint relative to much larger frontier systems.² Despite the modest active size, Microsoft says it is "toe-to-toe with Claude Opus 4.6 on SWE-Bench Pro," the diverse, real-world coding benchmark.²

On math, Microsoft reports MAI-Thinking-1 scoring 97.0% on AIME 2025 and 94.5% on AIME 2026.² The company also ran a blind, side-by-side human evaluation with rating partner Surge, spanning 1,276 single- and multi-turn tasks, in which raters preferred MAI-Thinking-1 over Claude Sonnet 4.6.²

For enterprise use, the model supports a 256,000-token context window — enough for roughly a 600-page document — along with function calling, layered developer instructions, and compatibility with the widely used Chat Completions API.² It is available in private preview on Microsoft Foundry, with a public preview on MAI Playground promised soon.²

The framing matters as much as the numbers. Microsoft argues that "capabilities should be learned, not inherited" — its case for training from scratch rather than distilling a teacher model, which it says produces a more steerable, adaptable system.²

MAI-Code-1-Flash: small, fast, and inside Copilot

MAI-Code-1-Flash is the coding workhorse, designed for everyday developer assistance rather than benchmark glory. With 5 billion active parameters, it is comparable to Claude Haiku 4.5 but cheaper to run, according to Microsoft.¹³

Microsoft trained it directly with the GitHub Copilot harnesses used in production, so the model learns to interact with the surrounding tools the way a real coding agent would. On Microsoft's own evaluations, it outperforms Claude Haiku 4.5 across every core coding benchmark tested, including a 16-point lead on SWE-Bench Pro (51.2% versus 35.2%).³ It is also leaner: Microsoft reports it solving harder problems with up to 60% fewer tokens on SWE-Bench Verified.³

To probe genuine reasoning rather than memorization, Microsoft built a 186-question, 34-category benchmark of adversarial traps — inverted classics, impossible tasks, underdetermined scenarios — on which MAI-Code-1-Flash reached 85.8% adjusted accuracy.³ The model is now rolling out to GitHub Copilot individual users in VS Code, appearing in both the model picker and the Auto picker.³ If you write code with Copilot, this is the MAI model you are most likely to touch first. (For the bigger picture on AI coding assistants, see our guide to AI assistance for coding.)

Frontier Tuning: the enterprise pitch

The most strategically interesting piece is not a model at all — it is a method called Microsoft Frontier Tuning. Using reinforcement learning in real-world environments (what Microsoft calls RLEs, or "training gyms" for AI), customers can adapt a MAI model to the specific traces of how work actually gets done inside their organization.¹

Microsoft's example: a MAI model tuned for Excel matched GPT-5.4 while being up to 10× more efficient, and a model tuned to a market-leading organization's standards achieved the highest win rate of any model tested at roughly 10× lower cost.¹ The pitch to enterprises is ownership — you build your own model, trained on your data, in your environment, and it stays yours.

Custom silicon and the infrastructure story

Microsoft is co-designing the MAI models with its own Maia 200 accelerator, and says it is already seeing a 1.4× efficiency boost from that co-design.¹ The lab also notes that its next-generation GB200 cluster is now operational.¹ Owning more of the stack — from silicon through the training pipeline to post-training — is central to the self-sufficiency thesis. It echoes a broader hyperscaler trend toward custom AI chips that we covered in the Meta-Broadcom MTIA deal.

The Mayo Clinic collaboration

Alongside the models, Microsoft announced a partnership with the Mayo Clinic to co-create a frontier AI model for healthcare, combining Mayo's clinical expertise and de-identified clinical data with Microsoft's foundational AI.¹ The model will be deployed first within Mayo's own environment and, once validated, made available to other organizations through Microsoft Foundry. Notably, the finished model will be owned by Mayo Clinic — a structure Microsoft ties to patient trust and responsible stewardship of health data.¹

What it means

The MAI launch does not crown Microsoft the new frontier leader; the strongest claims are Microsoft's own, benchmarked on Microsoft's own harnesses. But it is a credible, broad, first-party portfolio shipped on real products — Copilot, VS Code, Foundry — rather than a research preview. For a company that built its AI strategy on a single partnership, demonstrating it can train competitive reasoning and coding models from scratch is the headline, regardless of which benchmark wins. As Microsoft scales the compute it has promised, the open question is how quickly "competitive with" turns into "ahead of." The broader competitive backdrop — including Anthropic surpassing OpenAI in annualized revenue — shows just how fast the frontier is moving.

Bottom line

Microsoft's seven MAI models mark its clearest move yet toward AI self-sufficiency. MAI-Thinking-1 and MAI-Code-1-Flash show the company can build competitive reasoning and coding models from scratch, while Frontier Tuning and the Maia 200 co-design reveal a strategy aimed at owning the whole stack. Whether the models lead the frontier or merely reach it, Microsoft has proven it no longer has to rely on anyone else to get there.

Microsoft AI, "Building a hill-climbing machine: Launching seven new MAI models," June 2, 2026 (updated June 8, 2026). https://microsoft.ai/news/building-a-hillclimbing-machine-launching-seven-new-mai-models/ ↩ ↩² ↩³ ↩⁴ ↩⁵ ↩⁶ ↩⁷ ↩⁸ ↩⁹ ↩¹⁰ ↩¹¹ ↩¹² ↩¹³ ↩¹⁴ ↩¹⁵ ↩¹⁶ ↩¹⁷ ↩¹⁸ ↩¹⁹ ↩²⁰ ↩²¹ ↩²²
Microsoft AI, "Introducing MAI-Thinking-1," June 2, 2026. https://microsoft.ai/news/introducing-mai-thinking-1/ ↩ ↩² ↩³ ↩⁴ ↩⁵ ↩⁶ ↩⁷ ↩⁸ ↩⁹ ↩¹⁰ ↩¹¹ ↩¹²
Microsoft AI, "Introducing MAI-Code-1-Flash," June 2, 2026. https://microsoft.ai/news/introducingmai-code-1-flash/ ↩ ↩² ↩³ ↩⁴ ↩⁵ ↩⁶ ↩⁷ ↩⁸
MarkTechPost, "Microsoft AI Lab Unveils MAI-Voice-1 and MAI-1-Preview: New In-House Models," August 29, 2025. https://www.marktechpost.com/2025/08/29/microsoft-ai-lab-unveils-mai-voice-1-and-mai-1-preview-new-in-house-models-for-voice-ai/ ↩
CNBC, "Microsoft unveils new AI models to lessen reliance on OpenAI and lower costs for developers," June 2, 2026. https://www.cnbc.com/2026/06/02/microsoft-unveils-new-ai-models-lessen-reliance-on-openai-lower-costs.html ↩

Frequently Asked Questions

MAI (Microsoft AI) models are a family of seven first-party models Microsoft announced at Build 2026 on June 2, covering reasoning, coding, image generation, transcription, and voice. They were built in-house and trained from scratch without distillation from other labs. 1