Google Catches First AI-Built Zero-Day in the Wild (2026)

May 14, 2026

TL;DR

On May 11, 2026, Google's Threat Intelligence Group (GTIG) published a new AI Threat Tracker report disclosing the first case it has observed of a criminal threat actor using a zero-day exploit it believes was developed with the help of an AI model.1 The vulnerability was a two-factor authentication (2FA) bypass implemented in a Python script targeting a popular open-source, web-based system administration tool. The exploit was caught before a planned mass-exploitation campaign and patched after coordinated disclosure to the vendor. GTIG flagged it as AI-generated based on educational docstrings, a hallucinated CVSS score, a textbook-clean Pythonic structure, and a fabricated _C ANSI color class — fingerprints of large language model output.1 Google declined to name the affected tool, the CVE, or the threat actor. According to John Hultquist, GTIG's chief analyst, "the AI vulnerability race" is no longer imminent — it has already begun.2


What You'll Learn

  • What GTIG actually disclosed on May 11, 2026, and what it deliberately did not
  • How investigators concluded the exploit was AI-built without recovering prompts or model logs
  • Where this finding sits relative to Anthropic's November 2025 cyber-espionage disclosure and the Claude Mythos preview
  • What new malware families (PROMPTSPY, CANFAIL, LONGSTREAM) tell us about AI-in-the-loop attacks
  • What defenders should change this quarter

What Google Disclosed — and What It Didn't

The new report, "Adversaries Leverage AI for Vulnerability Exploitation, Augmented Operations, and Initial Access," is the latest installment of GTIG's running AI Threat Tracker.1 It follows the February 2026 tracker, which characterized AI-enabled offensive activity as still nascent. The May edition argues the picture has shifted from experimentation to operational reality.

The headline finding: GTIG observed a criminal threat actor preparing a mass-exploitation campaign built around a previously unknown 2FA-bypass zero-day in a "popular open-source, web-based system administration tool." Google's analysts concluded with high confidence that the exploit code itself was developed with help from a large language model.

What Google did not publish:

| Detail | Disclosed? |
| --- | --- |
| Name of the affected open-source tool | No |
| CVE identifier | No |
| Name or alias of the threat actor | No |
| Which LLM was used | No (Gemini ruled out, others not named) |
| Whether the actor used a hosted API or a self-hosted model | No |

The vulnerability has been responsibly disclosed and patched. Google says its "proactive counter discovery" interrupted the campaign before mass exploitation could begin.1


How GTIG Concluded the Exploit Was AI-Built

Without access to the attacker's prompts or model logs, GTIG had to make its determination from the artifact itself — the Python exploit script. Four forensic markers stood out:

1. Educational docstrings. The script was littered with explanatory comments of the type LLMs produce for teaching code, not for shipping it. Real malware authors strip comments; this one read like a tutorial.

2. A hallucinated CVSS score. The script referenced a CVSS severity rating for the vulnerability — but the bug was a zero-day, so no official CVSS score existed. The AI essentially invented a risk rating that was never assigned by any authority.1

3. Textbook Pythonic structure. Clean type hints, idiomatic standard-library usage, detailed --help menus, and structured argument parsing. Operationally useful, but unusually well-formatted relative to typical exploit code.

4. A fabricated helper class. Google specifically called out a _C ANSI color class — the kind of self-contained utility LLMs love to generate as scaffolding, regardless of whether it's actually needed.1
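
Put together, the four markers describe a script that looks roughly like the hypothetical sketch below. Google did not publish the actual exploit, so everything here is invented to reproduce only the stylistic tells GTIG described, with the vulnerability logic stubbed out entirely:

```python
#!/usr/bin/env python3
"""Authentication checker utility.

Connects to a target host and checks whether the second
authentication factor is enforced. Intended for educational
and authorized testing purposes only.
"""
# Marker 1: tutorial-style docstrings that shipped exploit code rarely carries.
import argparse

# Marker 2: a CVSS rating no authority ever assigned to this zero-day.
CVSS_SCORE = "9.8 (Critical)"


class _C:
    """ANSI color codes for terminal output."""
    # Marker 4: self-contained helper scaffolding, generated whether needed or not.
    RED = "\033[91m"
    GREEN = "\033[92m"
    RESET = "\033[0m"


def check_target(host: str, port: int = 443) -> bool:
    """Return True if the target appears vulnerable."""
    ...  # The actual bypass logic is deliberately omitted from this sketch.


def main() -> None:
    # Marker 3: clean type hints and polished argparse scaffolding,
    # unusually tidy relative to typical exploit code.
    parser = argparse.ArgumentParser(description=f"2FA check (severity: {CVSS_SCORE})")
    parser.add_argument("host", help="Target hostname or IP address")
    parser.add_argument("--port", type=int, default=443, help="Target port")
    args = parser.parse_args()
    print(f"{_C.GREEN}[*] Checking {args.host}:{args.port}{_C.RESET}")


if __name__ == "__main__":
    main()
```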

Forensic attribution to AI is a new investigative discipline, and Google is explicit that these are behavioral fingerprints, not cryptographic proof. The team's working assumption: the actor used an LLM to support discovery, drafting, and weaponization — not autonomously, but as a productive collaborator.


Where This Sits Relative to Earlier "Firsts"

The "first" claim is doing some work here. Three earlier moments look similar but are not the same.

| Disclosure | Reported by | Date | What was new |
| --- | --- | --- | --- |
| Big Sleep finds previously unknown bug in SQLite | Google DeepMind / Project Zero | Oct–Nov 2024 | First time an AI agent found an exploitable, real-world memory-safety bug — but it was a defensive tool run by Google researchers, and the bug was patched before it shipped3 |
| XBOW tops HackerOne US leaderboard | XBOW | Q2 2025 | First AI-driven vulnerability discovery to lead a major bug-bounty leaderboard — 1,060 reports, 132 confirmed and fixed — but inside the bounty boundary, authorized testing4 |
| GTG-1002 autonomous cyber-espionage with Claude Code | Anthropic | Nov 13, 2025 | First reported large-scale attack where the AI agent itself executed 80–90% of tactical operations — about orchestration, not exploit-writing5 |
| AI-developed zero-day used in the wild | Google GTIG | May 11, 2026 | First observed case of an AI being used to find AND weaponize a previously unknown vulnerability against systems that did not consent to be tested1 |

The Anthropic disclosure was about orchestration — Claude Code being conscripted as an autonomous penetration testing agent. The Google disclosure is about artifacts — an AI helping write the exploit code for a flaw nobody knew existed. These are different attack surfaces, and both are now confirmed in the wild.

On the defensive side, Anthropic's restricted-release Claude Mythos Preview, disclosed in April 2026 and gated through Project Glasswing, has already identified thousands of zero-days for partner organizations to patch. The asymmetry GTIG describes is real, but it is being contested from both sides of the line at once.

As John Hultquist, GTIG's chief analyst, framed it: "for every zero-day we can trace back to AI, there are probably many more out there."6


The Wider Catalog: AI-in-the-Loop Malware

The May report does more than describe one zero-day. It catalogs a maturing ecosystem of AI-assisted malware GTIG has tracked since its February 2026 baseline.1

PROMPTSPY. Android backdoor first disclosed by ESET researchers and now tracked by GTIG. Instead of hardcoded screen taps, the malware sends Gemini a text prompt plus an XML dump of the current UI and receives back JSON tap coordinates. That lets it operate across Android versions and device layouts where rigid coordinate-based malware would break. It also stores prior prompts and responses for multi-step interactions.7
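
The round trip looks roughly like the sketch below. Public reporting describes the pattern (text prompt plus UI dump in, JSON coordinates out) but not the exact schema, so every field name and value here is a guess:

```python
import json

# Hypothetical request payload: an instruction plus a dump of the current screen.
request = {
    "prompt": "Find the 'Send' button in this UI dump and return tap "
              "coordinates as JSON.",
    "ui_dump": "<hierarchy><node text='Send' "
               "bounds='[480,1140][600,1220]'/></hierarchy>",
}

# Hypothetical reply: coordinates the model derived from reading the dump at
# runtime, which is why the same binary keeps working across device layouts
# and Android versions where hardcoded coordinates would break.
reply = json.loads('{"action": "tap", "x": 540, "y": 1180}')
if reply["action"] == "tap":
    x, y = reply["x"], reply["y"]
```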

PROMPTFLUX — malware that rewrites itself. Introduced in GTIG's prior November 5, 2025 report, PROMPTFLUX is an experimental VBScript malware family first identified in early June 2025 that calls the Gemini API to rewrite its own source code for obfuscation and evasion. One variant prompts the LLM, framed as an "expert VB Script obfuscator," to regenerate the malware's entire source roughly once per hour to defeat static signatures. GTIG assesses PROMPTFLUX as still experimental — the version Google analyzed could not yet compromise a victim — but it shows the direction of travel: malware that re-skins itself between detections without human involvement.89

PROMPTSTEAL — an LLM as a live command generator. First disclosed in November 2025 and revisited in the May 2026 report. Used by the Russian government-backed actor APT28 (also tracked as FROZENLAKE) against targets in Ukraine. PROMPTSTEAL masquerades as an image-generation program, but its real purpose is to query the Qwen2.5-Coder-32B-Instruct model via the Hugging Face API and execute the commands the LLM returns — meaning the operational logic of the malware lives in the prompt, not in the binary. That matters for detection: traditional malware analysis looks at what a binary does; PROMPTSTEAL's behavior is determined by what an external model says it should do at runtime (a detection sketch keyed to this dependency closes this section).19

CANFAIL. Russia-nexus malware targeting Ukrainian organizations. JavaScript dropper that uses AI-generated decoy code — coherent but inert code blocks — to obfuscate the malicious functionality. Developer comments in the source explicitly call out filler sections, indicating the operators asked the model to pad the script with inactive code.1

LONGSTREAM. Also Russia-linked. Uses the same AI-obfuscation technique: inactive blocks for plausible-looking administrative tasks unrelated to the actual payload.1

State-actor activity. GTIG also documents:

  • APT45 (North Korea-linked) sending thousands of repetitive prompts to Gemini to recursively analyze CVEs and validate proof-of-concept exploits — an "AI grinder" pattern that would be impractical without model assistance.110
  • UNC2814 (China-linked) using expert-persona jailbreaking to push Gemini toward researching pre-authentication remote code execution flaws in TP-Link router firmware and Odette File Transfer Protocol implementations.1
  • TeamPCP (UNC6780) compromising GitHub repositories in March 2026 and embedding a credential stealer GTIG tracks as SANDCLOCK into affected build environments.1

The throughline: AI is no longer a sandbox curiosity in offensive operations. It is sitting inside the attack lifecycle — discovery, drafting, obfuscation, runtime — at scale.
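
One practical consequence: PROMPTFLUX and PROMPTSTEAL both depend on hosted model APIs, so an unexpected binary or script embedding an LLM inference hostname is a cheap triage signal. A naive first pass might look like the sketch below. The endpoint list is illustrative, and legitimate software trips it constantly, so hits are leads for an analyst, not verdicts:

```python
import sys
from pathlib import Path

# Illustrative hostnames of hosted-model APIs. Legitimate software embeds
# these too; a hit is a lead for a human analyst, never a verdict.
LLM_ENDPOINTS = [
    b"generativelanguage.googleapis.com",  # Gemini API (PROMPTFLUX pattern)
    b"api-inference.huggingface.co",       # Hugging Face API (PROMPTSTEAL pattern)
    b"api.openai.com",
    b"api.anthropic.com",
]


def scan(path: Path) -> list[str]:
    """Return the LLM API hostnames embedded as raw strings in a file."""
    data = path.read_bytes()
    return [e.decode() for e in LLM_ENDPOINTS if e in data]


if __name__ == "__main__":
    for name in sys.argv[1:]:
        if hits := scan(Path(name)):
            print(f"{name}: references {', '.join(hits)}")
```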


What John Hultquist Said

John Hultquist, Chief Analyst at Google Threat Intelligence Group, framed the finding bluntly across press interviews:

"There's a misconception that the AI vulnerability race is imminent. The reality is that it's already begun."2

"The game's already begun and we expect the capability trajectory is pretty sharp. We do expect that this will be a much bigger problem, that there will be more devastating zero-day attacks done over this, especially as capabilities grow."2

The implication for defenders is uncomfortable. For every AI-assisted exploit GTIG can trace, there are likely more that don't leave such obvious fingerprints — and as model output gets less identifiably "model-shaped," even the textbook-Python tell could disappear within a release cycle or two.


What This Means for Security Teams This Quarter

If GTIG is right that this is a turning point and not a one-off, three implications follow:

1. Patch latency is now an AI race. When a model-assisted attacker can iterate on exploit code in hours, the time between vendor patch release and weaponization compresses. Software bills of materials and automated patching for open-source admin tools — especially anything web-facing with 2FA — deserve attention this quarter.

2. 2FA is necessary but not sufficient. The exploited tool had 2FA enabled; the exploit was the bypass. Defenders should assume any single authentication control, 2FA included, can be defeated by a determined attacker, and layer in behavioral signals (impossible travel, device posture, session anomalies; a minimal impossible-travel check is sketched after this list) rather than treating 2FA as the last line of defense.

3. Threat intelligence should now ingest AI-output fingerprints. GTIG's heuristics — over-explanatory docstrings, fabricated CVSS scores, scaffolded utility classes — are the kind of signal that should land in code-review tooling and EDR pipelines; a toy scanner follows this list. They're imperfect and they will be evaded, but they're the first generation of a new forensic category, and better than nothing.
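
A toy version of point 3, with regexes keyed to the markers GTIG published. Real tooling would need scoring, allowlists, and tuning; honest tutorial code triggers every one of these heuristics, which is exactly why they are leads rather than verdicts:

```python
import re
import sys
from pathlib import Path

# Toy heuristics keyed to GTIG's published markers. Teaching code and
# tutorials trigger these too; treat matches as triage signals only.
HEURISTICS = {
    "long educational docstring": re.compile(r'"""[^"]{200,}"""'),
    "hardcoded CVSS rating": re.compile(r"CVSS\D{0,16}\d+\.\d"),
    "ANSI color scaffolding class": re.compile(r"class _C\b"),
}


def flag(source: str) -> list[str]:
    """Return the names of the heuristics this source code triggers."""
    return [name for name, rx in HEURISTICS.items() if rx.search(source)]


if __name__ == "__main__":
    for name in sys.argv[1:]:
        if hits := flag(Path(name).read_text(errors="ignore")):
            print(f"{name}: {', '.join(hits)}")
```

And for point 2, a minimal impossible-travel check, the simplest of the behavioral signals named above. Thresholds and geo-resolution are the hard part in production; this sketch just flags login pairs whose implied speed exceeds a commercial flight:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


@dataclass
class Login:
    when: datetime
    lat: float
    lon: float


def km_between(a: Login, b: Login) -> float:
    """Great-circle distance between two logins (haversine formula)."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def impossible_travel(a: Login, b: Login, max_kmh: float = 900.0) -> bool:
    """Flag a login pair whose implied speed beats a commercial jet."""
    hours = abs((b.when - a.when).total_seconds()) / 3600
    if hours == 0:
        return km_between(a, b) > 0  # simultaneous logins from two places
    return km_between(a, b) / hours > max_kmh


# Example: London at 09:00 UTC, then Sydney forty minutes later.
a = Login(datetime(2026, 5, 11, 9, 0, tzinfo=timezone.utc), 51.5, -0.1)
b = Login(datetime(2026, 5, 11, 9, 40, tzinfo=timezone.utc), -33.9, 151.2)
assert impossible_travel(a, b)
```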

For deeper background on AI's role in defensive security, see our coverage of Claude Mythos Preview and Project Glasswing and the AI-Cybersecurity jagged frontier where small open models are now finding showcase vulnerabilities too.


Bottom Line

For a year, the question among security researchers was when — not if — a real attacker would use an AI to develop a working zero-day against an unsuspecting target. Google's May 11 disclosure is the closest thing yet to public confirmation that the moment has arrived. The threat actor's name, the target tool, and the model used remain undisclosed, but the forensic case is on the record: a hallucinated CVSS score and a textbook ANSI color class are now part of the IOC vocabulary. The race Hultquist described isn't around the corner. It's the one defenders have already been running, whether they noticed or not.


Footnotes

  1. Adversaries Leverage AI for Vulnerability Exploitation, Augmented Operations, and Initial Access — Google Cloud Threat Intelligence Blog, May 11, 2026

  2. Google Says It Likely Thwarted Effort by Hacker Group to Use AI for 'Mass Exploitation Event' — CNBC, May 11, 2026

  3. Google's AI Tool Big Sleep Finds Zero-Day Vulnerability in SQLite Database Engine — The Hacker News, November 2024

  4. The road to Top 1: How XBOW did it — XBOW, 2025

  5. Disrupting the First Reported AI-Orchestrated Cyber Espionage Campaign — Anthropic, November 13, 2025

  6. Google: Hackers are using AI to weaponize zero-day vulnerabilities — Fortune, May 12, 2026

  7. PromptSpy Is the First Known Android Malware to Use Generative AI at Runtime — BleepingComputer

  8. Google Uncovers PROMPTFLUX Malware That Uses Gemini AI to Rewrite Its Code Hourly — The Hacker News, November 5, 2025

  9. GTIG AI Threat Tracker: Advances in Threat Actor Usage of AI Tools — Google Cloud Blog, November 5, 2025

  10. Google spotted an AI-developed zero-day before attackers could use it — CyberScoop, May 11, 2026

Frequently Asked Questions

What exactly did GTIG disclose on May 11, 2026?

GTIG identified a criminal threat actor preparing a mass-exploitation campaign built around a previously unknown 2FA-bypass zero-day in a popular open-source web-based administration tool. The exploit was a Python script. Based on its structure and content, GTIG assesses with high confidence that the actor used a large language model to support discovery and weaponization of the flaw.1
