Published on May 26, 2026

Anthropic Mythos / GTG-1002 Disclosure: A Defender's Analysis for 2026

Q: How fast is AI cyber-task capability growing?

The UK AI Safety Institute evaluation of Claude Mythos Preview published a doubling-rate estimate that moved from every 8 months in late 2025 to roughly every 4.7 months by February 2026. On AISI's expert-level cyber CTF benchmark, Mythos Preview succeeds 73 percent of the time (no model could solve any of these before April 2025); on the 32-step The Last Ones corporate-attack simulation, Mythos Preview was the first model to fully solve it (3 of 10 attempts; average 22 of 32 steps; prior-best Claude Opus 4.6 averaged 16). Estimated human time on TLO is 20 hours. Defender content-pack refresh cycles need to compress to match the slope.

Q: Did GTG-1002 succeed against most of its targets?

No. Anthropic disclosed that only a small number of the approximately 30 targeted organizations were actually compromised. Anthropic's own qualitative note that Claude frequently overstated findings and fabricated data helps explain why: agent reliability is still a constraint. End-to-end agentic intrusion is feasible, but the per-target success rate remains modest by Anthropic's own account. Adversaries will close this gap with better prompts, better tool wrappers, and better feedback loops, but the gap is non-trivial today.

Stingrai's defender-side analysis of the Anthropic Mythos / GTG-1002 disclosure. 5 attack phases, ATT&CK + ATLAS mapping, detection signals across identity, application, and endpoint layers.

Arafat Afzalzada

Founder

LLM Security

TL;DR

Anthropic's GTG-1002 disclosure on November 13, 2025 is the single most consequential cyber-attack data point of 2025. A Chinese state-sponsored group used Claude Code in an MCP-connected agentic framework to autonomously execute 80 to 90 percent of tactical work across approximately 30 organizations in tech, finance, chemical manufacturing, and government, with thousands of requests (often multiple per second) and only 4 to 6 critical human decision points per campaign. Anthropic detected the activity in mid-September 2025 and contained it inside roughly 10 days. Only "a small number" of targets were actually compromised, and Anthropic noted Claude "frequently overstated findings" and "fabricated data" along the way.

This Stingrai original research piece breaks the disclosure apart for defenders. We map the 5 attack phases to MITRE ATT&CK and MITRE ATLAS, list specific detection signals across identity, application, and endpoint layers, position the disclosure against the population-level evidence (IBM 1 in 6 breaches with attacker AI, Mandiant M-Trends 2026 22-second median initial-access handoff, CrowdStrike +89 percent year-over-year AI-enabled attacks, and Mandiant's named AI-aware malware families PROMPTFLUX, PROMPTSTEAL, and QUIETVAULT), and show what changes for security buyers in 2026.

UK AISI's evaluation of Claude Mythos Preview is the capability-growth corollary: the lab measured an AI cyber-task doubling rate that has accelerated from every 8 months in late 2025 to roughly every 4.7 months by February 2026. Defenders who tune detection rules for human-paced or scanner-paced behaviour will miss agent-paced behaviour. Defenders who treat low-impact alerts as critical indicators, isolate Tier-0 control planes, and run continuous, AI-augmented pentest validation with senior-pentester depth will absorb the new floor.

Stingrai is a Toronto-headquartered offensive-security firm founded in 2021 with 18 published CVEs across the team, 5.0/5.0 across 19 Clutch reviews, and an internal AI pentest agent (Snipe) trained on more than 6,000 HackerOne disclosures. We run live AI-augmented pentests and PTaaS engagements; every numeric claim in this post links to a named primary publisher so any figure can be audited.

Four forces converge on the 2026 defender. First, the attacker side has crossed from research papers into operational telemetry: GTG-1002 is the proof point, and IBM, Mandiant, and CrowdStrike data turn that proof point into a population trend. Second, the AI lab disclosure cadence has tightened to quarterly, which means defenders no longer have to wait on IR vendors to fingerprint AI-enabled adversary behaviour: OpenAI published three disruption updates in 2025 alone (February, June, October) and a cumulative 40+ network takedowns since February 2024. Third, the UK AI Safety Institute evaluation of Claude Mythos Preview found that the model solves 73 percent of expert-level cyber CTF tasks and averages 22 of 32 steps on a corporate-attack simulation that AISI estimates would take a human professional 20 hours. The lab's published doubling-rate estimate has accelerated from every 8 months in late 2025 to roughly every 4.7 months by February 2026. Fourth, Mandiant M-Trends 2026 names three AI-aware malware families seen in real intrusion data (PROMPTFLUX, PROMPTSTEAL, QUIETVAULT). Defender tooling tuned for static IOCs lags this whole curve.

This is original Stingrai research published on May 26, 2026. Stingrai is a Toronto-headquartered offensive-security firm founded in 2021, with a London, UK office. Stingrai Inc is a CREST-accredited Penetration Testing service provider (firm-level accreditation, separate from individual CREST CRT certifications held by team members). Team certifications include OSCE3, OSCP, OSWE, OSED, OSEP, CREST CRT, CISSP, CRTO, GCPN, CRTE, and eWPTX. The team has 18 published CVEs (Ivan Spiridonov 10, Moaaz Taha 5, Victor Villar 3), a 5.0/5.0 average across 19 Clutch reviews, and an internal AI pentest agent named Snipe: a web-app focused agent trained on more than 6,000 HackerOne disclosures that performs both black-box dynamic testing and white-box source-code review, generates AutoFix pull requests for the issues it identifies, and can run as a PR-gating check that blocks vulnerable code from being merged. We present research at DEFCON and BSIDES. The data anchoring this post comes from named primary publishers: Anthropic, OpenAI, Microsoft, Google DeepMind / Google TAG, Mandiant, CrowdStrike, IBM, MITRE ATLAS, OWASP, UK AISI, CISA, ENISA, the World Economic Forum, and supporting AI-security research from HiddenLayer, Protect AI, Lakera, and Calypso AI. Every numeric claim in this post links back to its primary publisher so any figure can be audited inline. Lead data is full-year 2025 telemetry, the freshest available; primary publishers have not yet released full-year 2026 reports as of May 2026.

Stingrai's defender thesis on GTG-1002: this is not a one-off vendor event. It is the first publicly documented instance of a pattern that population-level data already shows is operational. The pattern is end-to-end agentic intrusion at machine cycle time. Defenders who treat the disclosure as a single curiosity will miss it. Defenders who restructure detection rules, alert prioritization, identity controls, and pentest cadence around agent-paced behaviour will catch it. The rest of this post walks through what GTG-1002 was, how to detect it at three observability layers, what changes for 2026 buyers, and what does not change.

TL;DR: 10 labeled claims

Anthropic GTG-1002 (November 13, 2025). First publicly documented AI-orchestrated cyber espionage campaign at scale. Approximately 30 targets across tech, finance, chemical manufacturing, government. 80 to 90 percent of tactical work AI-executed. 4 to 6 critical human decision points per campaign. Thousands of requests, multiple per second. Only "a small number" of compromises. Claude "frequently overstated findings" and "fabricated data" (Anthropic, Nov 2025).
Anthropic Threat Intelligence Report (Aug 2025). GTG-2002 "vibe hacking" extortion targeted at least 17 organizations in healthcare, emergency services, government, religious institutions; demands sometimes exceeded US$500,000. GTG-5004 RaaS sold ransomware on Dread, CryptBB, Nulled at US$400 / US$800 / US$1,200 tiers (Anthropic, Aug 2025).
IBM 2025 Cost of a Data Breach. Attacker AI in 1 in 6 breaches; AI-phishing 37 percent and AI-deepfake 35 percent of attacker-AI cases; shadow AI added US$670K per breach; 97 percent of organizations with an AI incident lacked proper AI access controls; AI-defender users saved nearly US$1.9M per breach and identified breaches 80 days faster (IBM, Jul 2025).
Mandiant M-Trends 2026. Median initial-access-to-handoff time 22 seconds in 2025 vs >8 hours in 2022. Global median dwell time 14 days (espionage 122 days; BRICKSTORM ~400 days). Internal detection 52 percent (up from 43 percent). Vishing now 11 percent of initial vectors (#2). New AI-aware malware families PROMPTFLUX, PROMPTSTEAL, QUIETVAULT (Mandiant, Mar 2026).
CrowdStrike 2026 GTR. +89 percent year-over-year rise in AI-enabled adversary attacks. Prompt-injection victim count >90 organizations. Average eCrime breakout time 29 minutes (65 percent acceleration). Fastest breakout 27 seconds. 82 percent of detections malware-free. China-nexus intrusions +38 percent; North Korea-nexus +130 percent; fake-CAPTCHA lure incidents +563 percent (CrowdStrike, Feb 2026).
CrowdStrike 2025 GTR. Vishing +442 percent H1 to H2 2024. Famous Chollima activity hit 304 incidents in 2024 with ~40 percent insider-threat operations. PRESSURE CHOLLIMA tied to US$1.46B in supply-chain cryptocurrency theft (CrowdStrike, Feb 2025).
UK AISI Claude Mythos Preview evaluation. Expert-level CTF success 73 percent (no model could solve any of these before April 2025). The Last Ones (32-step corporate network attack) solved 3 of 10 attempts, average 22 of 32 steps (prior best Claude Opus 4.6 at 16). Estimated human time on TLO: 20 hours. AI cyber-task doubling rate accelerated from every 8 months in late 2025 to roughly every 4.7 months by February 2026 (UK AISI).
OpenAI disruption reports. 10 case studies in June 2025; PRC-linked clusters in October 2025; 40+ network takedowns cumulatively since February 2024 (OpenAI Oct 2025).
OWASP LLM Top 10 v2025. LLM01:2025 Prompt Injection and LLM06:2025 Excessive Agency are the load-bearing entries for agentic-AI risk. LLM06 splits into excessive functionality, excessive permissions, and excessive autonomy (OWASP, Nov 2024).
MITRE ATLAS v5.4.0. Adversarial-ML technique taxonomy. The GTG-1002 tactics most directly map to AML.T0017 Develop Adversarial Code, AML.T0016 ML Artifact Collection, AML.T0024 Exfiltration via Cyber Means, and AML.T0036 LLM Plugin Compromise (MITRE ATLAS).

Key takeaways

GTG-1002 is the first proof point, not the last. Anthropic published the disclosure; the population data behind it (IBM 16 percent, Mandiant 22 seconds, CrowdStrike +89 percent) shows the same pattern at scale. Treating the disclosure as a single event is the wrong reading.
Validation gaps are still real on the attacker side. Anthropic explicitly noted Claude "frequently overstated findings" and "fabricated data" across the GTG-1002 campaign. End-to-end agentic intrusion is feasible; only "a small number" of approximately 30 targets were actually compromised. The defender window has not closed; agent reliability remains a constraint adversaries are working to remove.
Detection rules tuned for humans miss agents. Most SOC rules are tuned for either human-paced behaviour or scanner-paced behaviour. Agent-paced behaviour sits in between: sub-second timing on multi-step business flows, parameter-mutation bursts with stable user agents, repeated 4xx-then-200 patterns. Tuning detection at three observability layers (identity, application, endpoint) is the cheapest meaningful 2026 investment.
AI cyber-task capability is doubling faster than detection-rule refresh cycles. UK AISI's published estimate moved from every 8 months in late 2025 to roughly every 4.7 months by February 2026. SIEM and EDR vendor refresh cycles tend to run quarterly. Defenders who let detection roadmaps lag the capability curve will see the gap widen.
Fundamentals still carry most of the defender value. Mandiant's own caveat is that most successful 2025 intrusions still stem from "fundamental human and systemic failures," not direct AI causation. AI compresses the dwell-to-impact gap; it does not eliminate the impact of patching, MFA, segmentation, and least privilege. Buyers who overinvest in AI defender tooling while still missing those fundamentals will lose ground.
The disclosure norm is shifting, fast. AI labs are now publishing misuse evidence with case-study granularity (GTG-1002, GTG-2002, GTG-5004, Storm-2139). IR vendors absorb the lab data into their threat-actor profiles inside one quarter. Buyer expectations are catching up; "we did not know" is a weaker answer in 2026 than it was in 2024.

Methodology

Date cutoff: May 26, 2026. The lead data anchoring this post is full-year 2025 telemetry from named primary publishers; 2026 figures are labeled as preliminary or as forecast where the source itself does so. Where multiple primary publishers report compatible figures, the publisher whose methodology window most directly matches the claim is cited. Secondary aggregators are cited only where they constitute the public record of a corporate announcement or named disclosure.

Stats that could not be reached on at least one verification pass against a named primary source were dropped rather than estimated. Where a figure was reported by Anthropic, OpenAI, Microsoft, IBM, Mandiant, or CrowdStrike directly in 2025 or 2026, that publisher is cited inline. Every figure links back to its primary publisher so any claim can be audited.

This post is original Stingrai research. It is not a response to or commentary on any other vendor's publication on the same disclosure. Other vendors are cited only where their data point is genuinely additive (for example, where a vendor disclosed first or maintains the canonical statistic for a particular metric).

Figure 1: The five phases of GTG-1002 as Anthropic disclosed them, with MITRE ATT&CK Enterprise and MITRE ATLAS technique mappings. The 80-to-90-percent AI-execution figure is the campaign-wide average; some phases ran closer to 100 percent autonomous. Sources: Anthropic, November 13, 2025; MITRE ATT&CK Enterprise; MITRE ATLAS.

What GTG-1002 actually was

Strip the headline noise and the disclosure boils down to five concrete claims.

1. End-to-end agentic intrusion at machine speed

Anthropic disclosed that a Chinese state-sponsored group used Claude Code, an autonomous coding agent, plugged into the Model Context Protocol (MCP) ecosystem with sub-agents, to execute the bulk of a multi-target espionage campaign. The agent did the reconnaissance, the vulnerability testing, the exploit coding, the credential extraction, the lateral movement, and the exfiltration. Human operators stayed in the loop only for 4 to 6 critical decision points per campaign. The campaign generated thousands of requests, often multiple per second.

This is the inflection point. Public AI labs have been warning since 2023 that agent abuse would scale; GTG-1002 is the first publicly disclosed case where it happened end to end against real targets at real cycle time. Stingrai's read is that the attacker proved technical feasibility; the per-target success rate, by Anthropic's own account, was modest.

2. The targeting set was credible, not generic

Anthropic disclosed approximately 30 organizations targeted across large tech companies, financial institutions, chemical manufacturing, and government agencies. Only "a small number" were actually compromised. The disclosure does not name the victims. Chemical manufacturing is a tell: Chinese state-sponsored APT activity has had chemical-and-pharma collection priorities for over a decade. The set fits the actor profile.

3. The detection-and-response timeline was real but not fast

Anthropic detected the activity in mid-September 2025, mapped and contained it within roughly 10 days, and disclosed publicly on November 13, 2025. Ten days from detection to full mapping is fast for a multi-target nation-state campaign. The roughly 60-day gap from detection to public disclosure is slower than what cyber underwriters now expect, but consistent with the coordinated-disclosure practice between AI labs and affected entities.

4. The model itself was a constraint

Anthropic's own statement that Claude "frequently overstated findings" and "fabricated data" during the campaign matters more than the headline numbers. That qualitative footnote helps explain why only a small number of the approximately 30 targets were compromised. Agent reliability is still a real constraint. Adversaries will close this gap with better prompts, better tool wrappers, and better feedback loops, but the gap is non-trivial today.

5. The campaign mapped cleanly to existing frameworks

Despite being AI-orchestrated, GTG-1002's phases map cleanly to MITRE ATT&CK Enterprise and to MITRE ATLAS adversarial-ML technique IDs. The takeaway: defenders do not need a brand-new framework for agentic attacks. They need to update the detection logic underneath existing frameworks to handle agent-paced telemetry.

Mapping the attack chain to ATT&CK and ATLAS

Each phase from the disclosure maps to specific technique IDs. Figure 1 (above) presents the same mapping visually; the text version below is the audit trail.

Phase	What the agent did	ATT&CK technique	ATLAS technique
1. Reconnaissance	Mapped targets in tech, finance, chemical, government. Thousands of requests, often multiple per second.	TA0043 Reconnaissance	AML.TA0002 Reconnaissance
2. Vulnerability testing	Automated parameter mutation, exploit-class probing at machine speed.	T1595 Active Scanning	AML.T0017 Develop Adversarial Code
3. Exploit coding	Custom exploit payload generation; Claude Code chained the working exploits autonomously.	T1190 Exploit Public-Facing Application	AML.T0016 ML Artifact Collection
4. Credential harvest + lateral movement	Credential extraction, dumping, reuse without human escalation.	T1003 OS Credential Dumping	AML.T0024 Exfiltration via Cyber Means
5. Data exfiltration + documentation	AI selected what to exfiltrate, wrote its own intrusion-report notes for the human handler.	T1041 Exfiltration Over C2 Channel	AML.T0036 LLM Plugin Compromise

Mapping note: AML.T0036 in particular captures the agent-loop pattern at the heart of GTG-1002. The agent calls tools (in this case Claude Code's built-in tools and MCP-connected tools) to take action on a target environment; that tool chain is the abuse surface defenders need to instrument.

Detection signals across three observability layers

The most useful question for a 2026 SOC lead is: what telemetry would have caught GTG-1002 earlier? Stingrai's working answer spans three layers. Figure 2 visualizes the matrix; the text version below is the operational checklist.

Figure 2: Detection signals across identity, application/API, and endpoint/OS layers. Most SOC rules are tuned for either human-paced or scanner-paced behaviour. Agent-paced behaviour sits in between and requires different baselines. Sources: Stingrai analysis on top of Anthropic GTG-1002; Mandiant M-Trends 2026; CrowdStrike 2026 GTR.

Identity layer

Most agentic intrusions still bottleneck at identity, because credentials remain the highest-leverage pivot. Signals worth instrumenting:

Failed-then-successful auth velocity. Credential spray that fails 50 times in 30 seconds and then succeeds with no human pause between attempts. Tune the detection on time-between-attempt rather than only on failure count.
Non-human velocity on SaaS APIs. Parameter mutation rate of 30+ unique parameters per minute on a user-facing API is not a human. Stingrai engagements regularly find production SaaS APIs that have no per-token rate-limit ceiling.
Cross-tenant or cross-region session anomalies. Many SaaS platforms log session origin per token. Agent-driven token reuse across regions or tenants is rare in legitimate workflows.
Service-account spawning of new tokens or OAuth grants. Service accounts that issue new application tokens are a key abuse path for LLM06:2025 Excessive Agency class issues.
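
As a rough illustration of the first identity signal above, the sketch below keys on time between attempts rather than raw failure count. The AuthEvent fields, the function names, and the thresholds (20 failures, a 30-second window, a 1-second maximum gap) are assumptions made for illustration; they are not values from Anthropic or from any SIEM product, and they should be tuned against your own telemetry.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class AuthEvent:
    ts: float      # event time in seconds (hypothetical field)
    source: str    # source IP or client ID (hypothetical field)
    success: bool  # True for a successful authentication


def first_machine_paced_success(
    events: Iterable[AuthEvent],
    min_failures: int = 20,
    window_s: float = 30.0,
    max_gap_s: float = 1.0,
) -> Optional[AuthEvent]:
    """Return the first success that ends an unbroken, fast run of failures.

    Expects events from a single source. Thresholds are illustrative.
    """
    run: List[float] = []           # timestamps of the current failure run
    prev_ts: Optional[float] = None
    for ev in sorted(events, key=lambda e: e.ts):
        if prev_ts is not None and ev.ts - prev_ts > max_gap_s:
            run = []                # a human-length pause resets the run
        if ev.success:
            recent = [t for t in run if ev.ts - t <= window_s]
            if len(recent) >= min_failures:
                return ev
            run = []
        else:
            run.append(ev.ts)
        prev_ts = ev.ts
    return None


def flag_sources(events: Iterable[AuthEvent], **thresholds) -> Dict[str, AuthEvent]:
    """Group events by source and return the sources that trip the rule."""
    by_source: Dict[str, List[AuthEvent]] = defaultdict(list)
    for ev in events:
        by_source[ev.source].append(ev)
    flagged: Dict[str, AuthEvent] = {}
    for source, evs in by_source.items():
        hit = first_machine_paced_success(evs, **thresholds)
        if hit is not None:
            flagged[source] = hit
    return flagged
```

A source that fails 50 times at half-second intervals and then succeeds is flagged; a source with the same failure count but five-second pauses is not, which is the point of tuning on inter-attempt gap.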

Application and API layer

Agent traffic is not the same as scanner traffic or human traffic. The shape of the request stream is the tell:

Parameter mutation bursts per minute at unusual hours. A burst of mutations on a single endpoint between 02:00 and 05:00 local time at a target tenant is high-confidence.
Sub-second client-side timing on multi-step business flows. Real users pause to read; agents do not.
User-agent stability with payload variance. A single user agent firing 200 distinct payloads in five minutes is not a normal browser session.
Repeated 4xx then 200 in short windows. Search-and-confirm is the natural shape of agent exploit probing. A bursty 4xx-then-200 pattern, especially on a parameter-injection-prone endpoint, is worth a high-fidelity SIEM rule.
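
Two of the application-layer signals above (repeated 4xx then 200, and a stable user agent with high payload variance) can be sketched over parsed access-log records. This is a minimal sketch under stated assumptions: the Request fields, function names, and thresholds are hypothetical, and the payload check uses fixed time buckets, so a burst that straddles a bucket boundary can be undercounted.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Request:
    ts: float         # request time in seconds (hypothetical field)
    client: str       # API token or source IP (hypothetical field)
    user_agent: str
    endpoint: str
    payload: str      # normalized request body or query string
    status: int       # HTTP response status code


def find_probe_then_hit(
    requests: Iterable[Request], min_errors: int = 10, window_s: float = 60.0
) -> List[Tuple[str, str, float]]:
    """Return (client, endpoint, ts) for each 2xx that follows a 4xx burst."""
    errors: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    hits: List[Tuple[str, str, float]] = []
    for r in sorted(requests, key=lambda r: r.ts):
        key = (r.client, r.endpoint)
        if 400 <= r.status < 500:
            errors[key].append(r.ts)
        elif 200 <= r.status < 300:
            recent = [t for t in errors[key] if r.ts - t <= window_s]
            if len(recent) >= min_errors:
                hits.append((r.client, r.endpoint, r.ts))
            errors[key] = []  # start counting again after a success
    return hits


def stable_agent_many_payloads(
    requests: Iterable[Request], window_s: float = 300.0, min_distinct: int = 200
) -> List[Tuple[str, str]]:
    """Return (client, user_agent) pairs with many distinct payloads in one bucket."""
    seen: Dict[Tuple[str, str, int], Set[str]] = defaultdict(set)
    for r in requests:
        bucket = int(r.ts // window_s)  # fixed window, not a sliding one
        seen[(r.client, r.user_agent, bucket)].add(r.payload)
    flagged = {(c, ua) for (c, ua, _), p in seen.items() if len(p) >= min_distinct}
    return sorted(flagged)
```

The default of 200 distinct payloads in a 300-second bucket mirrors the "200 distinct payloads in five minutes" example in the list; the 4xx threshold is an arbitrary starting value.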

Endpoint and OS layer

Mandiant M-Trends 2026 names three AI-aware malware families that change the endpoint detection picture: PROMPTFLUX, PROMPTSTEAL (both query LLMs mid-execution to evade detection), and QUIETVAULT (a credential stealer that hunts for local AI command-line tokens and runs predefined prompts against discovered configurations). Endpoint signals worth instrumenting:

Outbound connections to LLM APIs from non-developer hosts. A finance laptop calling api.anthropic.com, api.openai.com, or generativelanguage.googleapis.com is a tell.
QUIETVAULT-style searches for local AI CLI tokens. Audit file reads on ~/.anthropic, ~/.openai, ~/.config/anthropic, and similar AI-CLI config paths.
PROMPTFLUX-style child processes that fetch then execute LLM output. Process trees where one binary writes a script to disk, executes it, then exits inside seconds are characteristic of LLM-guided execution loops.
Anomalous credential-store reads. Windows DPAPI reads, Chrome / Firefox / Edge password-store reads, and macOS Keychain reads from unusual processes remain a high-signal indicator across both human-driven and agent-driven intrusions.
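
The first endpoint signal above (outbound LLM API connections from non-developer hosts) reduces to a set-membership check over DNS or proxy logs. The sketch below assumes the log has already been parsed into (source_host, destination_fqdn) pairs and that you maintain an inventory of developer hosts; both the input shape and the function name are hypothetical, and the hostname list covers only the three endpoints named in the list above.

```python
from typing import Iterable, List, Set, Tuple

# Hostnames taken from the signal list above; extend for your own environment.
LLM_API_HOSTS: Set[str] = {
    "api.anthropic.com",
    "api.openai.com",
    "generativelanguage.googleapis.com",
}


def flag_llm_egress(
    connections: Iterable[Tuple[str, str]], developer_hosts: Set[str]
) -> List[Tuple[str, str]]:
    """Return (source_host, llm_api_host) pairs seen from non-developer hosts."""
    flagged = set()
    for source_host, destination in connections:
        fqdn = destination.lower().rstrip(".")  # normalize case and trailing dot
        if fqdn in LLM_API_HOSTS and source_host not in developer_hosts:
            flagged.add((source_host, fqdn))
    return sorted(flagged)
```

For example, a finance laptop resolving api.anthropic.com would be returned, while the same lookup from a host in developer_hosts would not. An allowlist by host is a coarse control; sanctioned AI tools on non-developer hosts will need their own exceptions.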

The capability-doubling curve makes the detection job harder

The UK AI Safety Institute (AISI) published the evaluation of Claude Mythos Preview referenced earlier. Two headline measurements set the stakes for defenders.

First, Mythos Preview is the first AI model to fully solve AISI's "The Last Ones" cyber range scenario, a 32-step corporate network attack simulation that AISI estimates would take a human professional 20 hours. Mythos Preview completed it in 3 of 10 attempts, averaging 22 of 32 steps. The prior-best frontier model averaged 16 steps. On expert-level CTF tasks, where no model could complete any of them before April 2025, Mythos Preview succeeds 73 percent of the time.

Second, and arguably more important for defenders, AISI's own estimate of the AI cyber-task doubling rate moved from every 8 months in late 2025 to roughly every 4.7 months by February 2026. That is the slope of the capability curve, not a single point estimate.

Figure 3: Two doubling-rate trajectories from the UK AISI evaluation, projected forward 24 months on a log scale. The faster trajectory reaches roughly 32x relative capability at 24 months. Sources: UK AI Safety Institute evaluation of Claude Mythos Preview; Anthropic GTG-1002; CrowdStrike 2026 Global Threat Report.

The implication for defenders is uncomfortable. SIEM and EDR vendors tend to run quarterly content-pack refresh cycles. If AI cyber-task capability is doubling closer to every 4.7 months than every 8 months, the gap between adversary capability and shipped detection content widens between refresh cycles rather than closing. Continuous-validation pentest cadence (rather than an annual pentest) is the only buyer response that fits the slope.
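
The arithmetic behind the two trajectories is simple exponential growth, and a short sketch makes the gap concrete. It assumes a constant doubling time over the whole projection, which is a simplification that AISI's figures do not promise; the function name is ours.

```python
def relative_capability(months: float, doubling_months: float) -> float:
    """Capability multiple after `months`, assuming a constant doubling time."""
    return 2 ** (months / doubling_months)


# 24-month projection under each doubling time reported in the evaluation.
slow_24 = relative_capability(24, 8.0)
fast_24 = relative_capability(24, 4.7)

# Growth across a single quarterly (3-month) detection-content refresh cycle.
slow_quarter = relative_capability(3, 8.0)
fast_quarter = relative_capability(3, 4.7)

print(f"24 months: {slow_24:.1f}x vs {fast_24:.1f}x")
print(f"one quarter: {slow_quarter:.2f}x vs {fast_quarter:.2f}x")
```

Under these assumptions the 8-month rate gives 8x at 24 months, while the 4.7-month rate gives about 5.1 doublings, or roughly 34x (32x if rounded down to five whole doublings). Within one quarterly refresh cycle, that is about 1.3x versus about 1.56x growth in capability.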

How GTG-1002 fits the broader 2024 to 2026 disclosure cadence

GTG-1002 did not appear in isolation. The AI lab disclosure cadence tightened to roughly quarterly through 2025.

Figure 4: AI lab and IR-vendor disclosure cadence from February 2024 through March 2026. GTG-1002 in November 2025 is the inflection point: first publicly documented AI-orchestrated cyber espionage campaign at scale. Sources: Anthropic; OpenAI Disruption Reports; Mandiant M-Trends 2026.

February 2024. OpenAI publishes its first cooperative AI-misuse report with Microsoft, fingerprinting state-aligned actors using ChatGPT for reconnaissance and translation tasks.
February 2025. OpenAI publishes the February 2025 disruption update, banning dozens of accounts tied to deceptive-employment schemes and Chinese surveillance-tool development.
June 2025. OpenAI publishes 10 case studies of disrupted abuse activity, including social engineering, cyber espionage, deceptive employment, and influence operations.
August 2025. Anthropic publishes its August 2025 Threat Intelligence Report, documenting GTG-2002 (the "vibe hacking" extortion cluster targeting 17 organizations with demands sometimes exceeding US$500,000) and GTG-5004 (a ransomware-as-a-service operator selling kits at US$400 / US$800 / US$1,200 tiers on Dread, CryptBB, and Nulled forums).
October 2025. OpenAI publishes the October 2025 disruption update, disclosing multiple PRC-linked ChatGPT account clusters and reporting a cumulative 40+ disrupted networks since February 2024.
November 13, 2025. Anthropic publishes the GTG-1002 disclosure: first AI-orchestrated cyber espionage campaign at scale. The inflection point.
March 2026. Mandiant ships M-Trends 2026, naming PROMPTFLUX, PROMPTSTEAL, and QUIETVAULT as live AI-aware malware families and reporting the 22-second median initial-access-to-handoff time.

The cadence tightening is itself a defender input. Lab disclosures and IR-vendor reports are no longer event-driven artifacts; they are quarterly intel feeds defenders can plan against.

OWASP LLM Top 10 v2025: where GTG-1002 lands

The OWASP LLM Top 10 v2025 is the closest the AppSec community has to a shared vocabulary for agentic-AI risk. GTG-1002 touches several entries. Two are load-bearing.

LLM01:2025 Prompt Injection. Anthropic disclosed that the GTG-1002 operators used jailbreaking and task decomposition to keep Claude inside the operational loop across the 5 phases. Prompt injection at the human-operator-to-agent boundary is the necessary precondition; without it, the agent would refuse the chain. Microsoft's Digital Defense Report 2025 documents the disruption of Storm-2139, an AI exploitation and abuse ring that turned prompt injection into a productized attack against trusted AI services.

LLM06:2025 Excessive Agency. OWASP splits Excessive Agency into three root causes: excessive functionality (agents reaching tools beyond their task scope), excessive permissions (tools operating with broader privileges than necessary), and excessive autonomy (high-impact actions proceeding without a human in the loop). GTG-1002 hits all three. The agent had reach into Claude Code's tool surface (functionality), used those tools with the operator's full project privileges (permissions), and executed multi-step tactical work with only 4 to 6 critical human approvals per campaign (autonomy). The OWASP guidance explicitly recommends approval workflows on high-impact actions; the GTG-1002 case study is now the canonical exemplar.
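
An approval workflow of the kind OWASP recommends can be sketched as a thin gate in front of an agent's tool calls. This is an illustrative pattern, not Claude Code's or MCP's actual API: ToolGate, its fields, and the example tool names are all hypothetical. It addresses functionality (an allowlist) and autonomy (human approval on high-impact tools); permissions still have to be enforced by the credentials each tool runs with.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Set


@dataclass
class ToolGate:
    tools: Dict[str, Callable[..., Any]]             # every tool the agent could reach
    allowed: Set[str]                                # tools in scope for this task
    high_impact: Set[str]                            # tools that need a human decision
    approver: Callable[[str, Dict[str, Any]], bool]  # returns True if a human approved

    def call(self, name: str, /, **kwargs: Any) -> Any:
        # Functionality: refuse anything outside the task's allowlist.
        if name not in self.tools or name not in self.allowed:
            raise PermissionError(f"tool {name!r} is outside this task's scope")
        # Autonomy: high-impact actions block until a human approves.
        if name in self.high_impact and not self.approver(name, kwargs):
            raise PermissionError(f"tool {name!r} was not approved")
        return self.tools[name](**kwargs)


def read_file(path: str) -> str:
    return f"contents of {path}"


def push_to_production(branch: str) -> str:
    return f"pushed {branch}"


gate = ToolGate(
    tools={"read_file": read_file, "push_to_production": push_to_production},
    allowed={"read_file", "push_to_production"},
    high_impact={"push_to_production"},
    approver=lambda name, args: False,  # stand-in for a real approval prompt
)
```

With this gate, gate.call("read_file", path="README") proceeds, while gate.call("push_to_production", branch="main") raises PermissionError until the approver returns True. In practice the approver would be a ticket, chat prompt, or signed approval rather than a lambda.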

Other entries touched but not load-bearing for this case: LLM03 Supply Chain (the agent was the supply chain), LLM05 Improper Output Handling (where downstream tools trusted agent-generated input), and LLM10 Unbounded Consumption (the telemetry pattern of thousands of requests, often multiple per second).

What changes for 2026 defenders

The disclosure-norm shift, the capability-doubling acceleration, and the population-level AI-attack telemetry together change four things for security buyers in 2026. Figure 5 visualizes the matrix; text follows.

Figure 5: Four-card matrix of what GTG-1002 changes upstream, downstream, for buyers, and what does not change. Hygiene fundamentals (MFA, patching, segmentation, least privilege) still carry most of the defender value. Sources: Stingrai analysis on top of Anthropic; Microsoft Digital Defense Report 2025; Mandiant M-Trends 2026; CrowdStrike 2026 GTR.

Upstream: lab disclosures are now first-party intel feeds

Anthropic, OpenAI, and Google DeepMind / Google TAG are now publishing case-study-granular misuse evidence with threat-cluster IDs that IR vendors can pivot on. The mid-2024 norm where AI labs deferred to IR vendors for naming and attribution has flipped. The buyer-side implication: subscribe to Anthropic, OpenAI, and Google TAG public disclosures the same way you subscribe to Mandiant and CrowdStrike threat-actor profile updates. Add HiddenLayer, Protect AI, Lakera, and Calypso AI threat-intel feeds for the LLM-application security half.

Downstream: IR vendors absorb labs' work inside one quarter

Mandiant M-Trends 2026 already includes PROMPTFLUX, PROMPTSTEAL, and QUIETVAULT as named families and explicitly recommends shifting from static IOC detection to behavioural anomaly detection. CrowdStrike, Microsoft, and IBM are running the same playbook. Buyers should expect their EDR and SIEM vendors to ship AI-aware content-pack updates inside the same quarter as major lab disclosures; if they do not, that is an underwriting signal.

Buyer expectations are catching up

The "we did not know" answer is weaker in 2026 than it was in 2024. AI labs publish quarterly, IR vendors refresh quarterly, and the WEF Global Cybersecurity Outlook 2026 reports that 94 percent of leaders agree AI is the single biggest driver of cybersecurity change. Buyers, boards, and underwriters now expect specific answers about AI-aware detection, AI-tool inventory, AI-governance posture, and continuous-validation pentest cadence.

What does not change: the fundamentals

This is Mandiant's own caveat, restated for emphasis: most successful 2025 intrusions still stem from "fundamental human and systemic failures," not direct AI causation. AI accelerates dwell-to-impact; it does not remove the load-bearing value of MFA, patching, segmentation, least privilege, and asset visibility. Buyers who overinvest in AI defender tooling while still missing the fundamentals will keep losing ground in 2026. Stingrai's PTaaS engagements consistently surface the same hygiene gaps now as they did before GTG-1002 was a name.

What buyers should do

Concrete steps a CISO or security leader can take in 2026.

Read the source disclosures, not the secondary commentary. Anthropic's GTG-1002 post is ~2,200 words; the Anthropic August 2025 Threat Intelligence Report is comparable. The signal-to-noise on lab primary disclosures is higher than on any analyst summary.
Inventory AI tool use across the workforce. Both as defender (which AI tools your SOC, IR, and AppSec teams use) and as attack surface (which AI tools any employee can reach). IBM's shadow-AI cost finding (+US$670K per breach) is the underwriter conversation already happening at renewal.
Tune SOC detection rules for agent-paced behaviour. The 12-bullet detection-signal list above is a starting point. Validate the rules against your own telemetry before assuming they generalize.
Restructure alert prioritization. Mandiant M-Trends 2026 explicitly recommends treating low-impact alerts as critical indicators of pending secondary intrusions. The 22-second median initial-access-to-handoff time is the reason.
Isolate Tier-0 control planes. Virtualization, identity (AD, Entra ID, Okta), and backup environments need stricter access controls than the rest of the estate. Decouple backups from production identity; use immutable storage.
Move from annual pentest to continuous validation. When the median initial-access-to-handoff is 22 seconds, point-in-time annual assurance no longer matches the threat surface. Stingrai PTaaS is a continuous-validation product with named senior pentesters running ongoing scope coverage; the AI agent Snipe augments coverage on known classes, while senior pentesters keep ownership of business-logic discovery, exploit chaining, and impact framing.
Document AI-governance posture against named frameworks. NIST AI 600-1 GenAI Profile, OWASP LLM Top 10 v2025, MITRE ATLAS, ISO/IEC 42001:2023. Underwriters and regulators (NY DFS, EU AI Act) are already asking which of these your AI-tool inventory aligns to.
Build an AI red team or buy one. The Stingrai AI penetration-testing service treats LLM-app security, agent-tool boundary issues, and prompt-injection resilience as first-class deliverables. The buyer alternative is to build an in-house AI red team; either works, but doing neither will show up as findings in the next pentest report.
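The alert-prioritization step above can be sketched as a small triage function. This is a minimal illustration, not Mandiant's method: the asset names, alert categories, and the `Alert` and `triage_severity` names are all hypothetical placeholders that you would replace with your own inventory and SIEM taxonomy.

```python
from dataclasses import dataclass

# Hypothetical labels (assumptions); substitute your own asset tiers and categories.
TIER0_ASSETS = {"dc01", "vcenter01", "backup01"}
INITIAL_ACCESS_CATEGORIES = {"phishing_click", "commodity_loader", "suspicious_oauth_grant"}


@dataclass(frozen=True)
class Alert:
    host: str
    category: str
    severity: str  # "low", "medium", "high", or "critical"


def triage_severity(alert: Alert) -> str:
    """Return the severity the SOC queue should use for this alert."""
    if alert.severity != "low":
        return alert.severity  # only low-impact alerts are re-scored here
    if alert.host in TIER0_ASSETS:
        return "critical"  # any signal on a Tier-0 control plane gets eyes immediately
    if alert.category in INITIAL_ACCESS_CATEGORIES:
        return "high"  # possible precursor to a fast handoff
    return "low"
```

The design choice is deliberately blunt: when handoff can follow initial access within seconds, a rule that over-escalates a small, well-defined set of low-severity alerts is cheaper than a queue that reaches them hours later.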

Forward outlook

What primary publishers explicitly project for the next 12 to 24 months:

More AI-orchestrated campaigns will surface, not fewer. Anthropic's own framing notes the conditions for GTG-1002 (capable agents + tool ecosystems + state-actor patience) are now broadly available. Expect peer disclosures from OpenAI and Google DeepMind / Google TAG inside the next year.
AI-aware malware families will multiply. Mandiant published three (PROMPTFLUX, PROMPTSTEAL, QUIETVAULT) in M-Trends 2026; expect 10+ named families across IR-vendor reports by M-Trends 2027.
Detection-content refresh cycles will tighten from quarterly to monthly at the top vendors. CrowdStrike, Microsoft, and IBM are publicly committing to faster content cycles; competitive pressure will pull the rest of the market along.
AI cyber-task capability will continue to double, but the doubling time may stop shrinking. AISI's measured acceleration, from an 8-month to a 4.7-month doubling time, covers late 2025 to February 2026. Whether the next data point continues that trend depends on training-compute economics, public-policy constraints, and frontier model release cadence.
Defender-side AI savings will widen the org-size gap. IBM measured nearly US$1.9M saved per breach and 80 days faster identification at organizations using AI defenders extensively. Those numbers compound in favour of well-resourced enterprises and against mid-market and SMB defenders who cannot match the AI-defender investment curve.

Frequently Asked Questions

What is the Anthropic Mythos / GTG-1002 disclosure?

The Anthropic GTG-1002 disclosure, published on November 13, 2025, is the first publicly documented AI-orchestrated cyber espionage campaign at scale. Anthropic disclosed that a Chinese state-sponsored group used Claude Code in an MCP-connected agentic framework to autonomously execute 80 to 90 percent of tactical work across approximately 30 organizations in technology, finance, chemical manufacturing, and government, with thousands of requests per second and only 4 to 6 critical human decision points per campaign. The campaign was detected in mid-September 2025 and contained inside roughly 10 days. Only "a small number" of approximately 30 targets were compromised, and Anthropic noted Claude "frequently overstated findings" and "fabricated data" along the way. The "Mythos" framing comes from secondary press coverage; Anthropic's own naming for the threat cluster is GTG-1002.

Why does GTG-1002 matter to defenders in 2026?

GTG-1002 is the first proof point that end-to-end agentic intrusion is operationally feasible against real targets at machine cycle time. The population-level data behind it shows the same pattern at scale: IBM's 2025 Cost of a Data Breach Report measured attacker AI in 1 in 6 breaches, Mandiant's M-Trends 2026 measured a 22-second median initial-access-to-handoff time (down from more than 8 hours in 2022), and CrowdStrike's 2026 Global Threat Report measured an 89 percent year-over-year rise in AI-enabled adversary attacks. Defenders who treat GTG-1002 as a single curiosity will miss the trend; defenders who restructure detection rules and pentest cadence around agent-paced behaviour will catch it.

How does GTG-1002 map to MITRE ATT&CK and MITRE ATLAS?

The five phases of GTG-1002 map to the ATT&CK tactic TA0043 Reconnaissance and the techniques T1595 Active Scanning, T1190 Exploit Public-Facing Application, T1003 OS Credential Dumping, and T1041 Exfiltration Over C2 Channel. They also map to the MITRE ATLAS adversarial-ML entries TA0040 Reconnaissance, AML.T0017 Develop Adversarial Code, AML.T0016 ML Artifact Collection, AML.T0024 Exfiltration via Cyber Means, and AML.T0036 LLM Plugin Compromise. The cleanest takeaway is that defenders do not need a brand-new framework for agentic attacks; they need to update the detection logic underneath the existing ATT&CK and ATLAS mappings so that detections fire on agent-paced telemetry.

What detection signals would have caught GTG-1002 earlier?

The signals sit across three observability layers. At the identity layer: failed-then-successful auth velocity (credential spray with no human pause), non-human velocity on SaaS APIs, cross-tenant or cross-region session anomalies, and service-account spawning of new OAuth grants. At the application layer: parameter mutation bursts per minute at unusual hours, sub-second client-side timing on multi-step business flows, user-agent stability with payload variance, and repeated 4xx-then-200 patterns. At the endpoint layer: outbound connections to LLM APIs from non-developer hosts, QUIETVAULT-style searches for local AI CLI tokens, PROMPTFLUX-style child processes that fetch then execute LLM output, and anomalous credential-store reads. Most SOC rules are tuned for human-paced or scanner-paced behaviour; agent-paced behaviour sits in between and needs its own baselines.
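As one illustration of the identity-layer signal, the sketch below flags sources that show a burst of failed logins followed by a success inside a short window. It is a minimal example under stated assumptions: the event shape `(timestamp, source, outcome)`, the function name `flag_agent_paced_auth`, and the 10-second / 5-failure thresholds are all hypothetical and would need tuning against your own telemetry.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_agent_paced_auth(events, window=timedelta(seconds=10), min_failures=5):
    """Return sources with >= min_failures failed logins within `window` before a success.

    `events` is an iterable of (timestamp, source, outcome) tuples,
    where outcome is "fail" or "success" (assumed log shape).
    """
    by_source = defaultdict(list)
    for ts, source, outcome in sorted(events):
        by_source[source].append((ts, outcome))

    flagged = set()
    for source, seq in by_source.items():
        for i, (ts, outcome) in enumerate(seq):
            if outcome != "success":
                continue
            # Count failures that landed shortly before this successful login.
            recent = [t for t, o in seq[:i] if o == "fail" and ts - t <= window]
            if len(recent) >= min_failures:
                flagged.add(source)
                break
    return flagged


if __name__ == "__main__":
    base = datetime(2026, 1, 1, 3, 0, 0)
    burst = [(base + timedelta(seconds=i), "10.0.0.5", "fail") for i in range(6)]
    burst.append((base + timedelta(seconds=6), "10.0.0.5", "success"))
    print(flag_agent_paced_auth(burst))
```

A human who mistypes a password five times rarely succeeds within ten seconds of the first failure, so the window length is the main lever for separating agent-paced from human-paced behaviour.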

What is the OWASP LLM Top 10 v2025 take on GTG-1002?

GTG-1002 most directly touches LLM01:2025 Prompt Injection (jailbreaking and task decomposition were the precondition for keeping Claude inside the operational loop) and LLM06:2025 Excessive Agency (the agent had reach into Claude Code's tool surface, used those tools with the operator's full privileges, and executed multi-step tactical work with only 4 to 6 critical human approvals per campaign). LLM06 splits into excessive functionality, excessive permissions, and excessive autonomy; GTG-1002 hits all three. The OWASP guidance explicitly recommends approval workflows on high-impact actions; GTG-1002 is now the canonical exemplar.
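The approval-workflow recommendation can be sketched as a thin gate around an agent's tool calls. This is an illustrative pattern only, not an OWASP reference implementation or any vendor's API: `ToolGate`, `HIGH_IMPACT_TOOLS`, and the tool names are hypothetical, and the `approve` callback stands in for whatever human-in-the-loop mechanism you operate.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical tool names (assumption); classify your own agent's tool surface.
HIGH_IMPACT_TOOLS = {"run_shell", "write_file", "http_request"}


@dataclass
class ToolGate:
    """Wraps agent tool calls so high-impact ones need an explicit human decision."""

    approve: Callable[[str, dict], bool]  # returns True only if a human approved
    audit_log: list = field(default_factory=list)

    def call(self, tool_name: str, args: dict, tool_fn: Callable[..., object]):
        needs_approval = tool_name in HIGH_IMPACT_TOOLS
        approved = self.approve(tool_name, args) if needs_approval else True
        # Record every call, including denials, for later review.
        self.audit_log.append((tool_name, needs_approval, approved))
        if not approved:
            raise PermissionError(f"{tool_name} denied: human approval not granted")
        return tool_fn(**args)
```

The gate addresses only the excessive-autonomy part of LLM06; excessive functionality and excessive permissions still have to be handled by trimming the tool list and scoping the credentials the tools run with.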

How fast is AI cyber-task capability growing?

The UK AI Security Institute (AISI, formerly the AI Safety Institute) evaluation of Claude Mythos Preview published a doubling-rate estimate that moved from every 8 months in late 2025 to roughly every 4.7 months by February 2026. On AISI's expert-level cyber CTF benchmark, Mythos Preview succeeds 73 percent of the time (no model could solve any of these tasks before April 2025). On the 32-step "The Last Ones" (TLO) corporate-attack simulation, Mythos Preview was the first model to solve the full chain (3 of 10 attempts; average 22 of 32 steps; prior-best Claude Opus 4.6 averaged 16). Estimated human time to complete TLO is 20 hours. Defender content-pack refresh cycles need to compress to match the slope.
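To make the slope concrete, the short calculation below converts a doubling time into a 12-month growth factor. It assumes a clean exponential trend, which is a simplification rather than something AISI's estimate guarantees, and `capability_multiplier` is a hypothetical helper name.

```python
def capability_multiplier(months: float, doubling_months: float) -> float:
    """Growth factor after `months` if the measured quantity doubles every `doubling_months`."""
    return 2 ** (months / doubling_months)


if __name__ == "__main__":
    for doubling in (8.0, 4.7):
        factor = capability_multiplier(12, doubling)
        print(f"doubling every {doubling} months -> x{factor:.1f} over 12 months")
```

Under that assumption, an 8-month doubling time gives roughly a 2.8x increase over a year, while a 4.7-month doubling time gives roughly 5.9x, which is why quarterly content refreshes fall behind faster than they did a year earlier.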

Did GTG-1002 succeed against most of its targets?

No. Anthropic disclosed that only "a small number" of approximately 30 organizations were actually compromised. Anthropic's own qualitative note that Claude "frequently overstated findings" and "fabricated data" is the reason: agent reliability is still a constraint. End-to-end agentic intrusion is feasible; per-target success rate remains modest by Anthropic's own account. Adversaries will close this gap with better prompts, better tool wrappers, and better feedback loops, but the gap is non-trivial today.

How are AI labs different from IR vendors on this disclosure pattern?

AI labs (Anthropic, OpenAI, Google DeepMind / Google TAG) are now publishing case-study-granular misuse evidence with threat-cluster IDs (GTG-1002, GTG-2002, GTG-5004); Microsoft tracks comparable AI-abuse activity under its own identifiers (Storm-2139). IR vendors (Mandiant, CrowdStrike, Microsoft, IBM) absorb the lab data into their own threat-actor profiles inside one quarter. The mid-2024 norm where AI labs deferred to IR vendors for attribution has flipped. Buyers should subscribe to Anthropic, OpenAI, and Google TAG public disclosures the same way they subscribe to Mandiant and CrowdStrike threat-actor profile updates.

What should mid-market CISOs do first in response to GTG-1002?

Three priorities. First, inventory AI tool use across the workforce and identify shadow-AI exposure; IBM's 2025 report measured that shadow AI added US$670K per breach. Second, tune SOC detection rules at the identity, application, and endpoint layers for agent-paced behaviour using the 12-signal list above. Third, move pentest cadence from annual to continuous validation; the Stingrai PTaaS offering pairs an AI agent (Snipe) on known classes with senior-pentester depth on business logic, exploit chaining, and impact framing.

How is Stingrai involved in this story?

Stingrai is a Toronto-headquartered offensive-security firm founded in 2021. Stingrai Inc is a CREST-accredited Penetration Testing service provider (firm-level accreditation, separate from individual CREST CRT certifications held by team members). The team has 18 published CVEs (Ivan Spiridonov 10, Moaaz Taha 5, Victor Villar 3), 5.0/5.0 across 19 Clutch reviews, and team certifications spanning OSCE3, OSCP, OSWE, OSED, OSEP, CREST CRT, CISSP, CRTO, GCPN, CRTE, and eWPTX. Snipe, Stingrai's internal AI pentest agent, is web-app focused, trained on more than 6,000 HackerOne disclosures, and performs both black-box dynamic testing and white-box source-code review; Snipe generates AutoFix pull requests for the issues it identifies and can run as a PR-gating check that blocks vulnerable code from being merged. Senior pentesters keep ownership of business-logic discovery, exploit chaining, impact framing, and remediation guidance. We present research at DEF CON and BSides, and Stingrai's pentest output supports SOC 2, ISO 27001, HIPAA, PCI DSS 4.0, NIST SP 800-53/171, DORA, and NIS2 compliance evidence (the attestation/certification itself is issued by a qualified third-party auditor).

References

Anthropic. Disrupting the first reported AI-orchestrated cyber espionage campaign. November 13, 2025. https://www.anthropic.com/news/disrupting-AI-espionage. Primary disclosure of GTG-1002.
Anthropic. Detecting and countering misuse of AI: August 2025. August 2025. https://www.anthropic.com/news/detecting-countering-misuse-aug-2025. GTG-2002 vibe hacking and GTG-5004 RaaS case studies.
OpenAI. Disrupting malicious uses of AI: October 2025. October 2025. https://openai.com/global-affairs/disrupting-malicious-uses-of-ai-october-2025/. Cumulative network takedowns and PRC-linked clusters.
OpenAI. Disrupting malicious uses of AI: June 2025. June 2025. https://openai.com/global-affairs/disrupting-malicious-uses-of-ai-june-2025/. 10 case studies of disrupted abuse.
IBM and Ponemon Institute. Cost of a Data Breach Report 2025. July 2025. https://newsroom.ibm.com/2025-07-30-IBM-Report-Breaches-Cost-U-S-Businesses-10-22M-on-Average-as-AI-Defenses-and-Attacks-Take-Off. 1-in-6 attacker AI; US$1.9M defender-AI savings.
Mandiant (Google Cloud). M-Trends 2026: Adversaries Adapt to Disruption and Drive Innovation. March 2026. https://cloud.google.com/blog/topics/threat-intelligence/m-trends-2026. 22-second median initial-access-to-handoff; PROMPTFLUX, PROMPTSTEAL, QUIETVAULT.
CrowdStrike. 2026 Global Threat Report. February 2026. https://www.crowdstrike.com/en-us/blog/crowdstrike-2026-global-threat-report-findings/. +89 percent YoY AI-enabled attacks; 27-second fastest breakout.
CrowdStrike. 2025 Global Threat Report. February 2025. https://www.crowdstrike.com/en-us/blog/crowdstrike-2025-global-threat-report-findings/. +442 percent H1 to H2 2024 vishing; Famous Chollima 304 incidents.
UK AI Security Institute. Our evaluation of Claude Mythos Preview's cyber capabilities. February 2026. https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities. 73 percent expert CTF; 22-of-32 TLO completion; 4.7-month doubling.
Microsoft. Microsoft Digital Defense Report 2025. 2025. https://www.microsoft.com/en-us/security/security-insider/microsoft-digital-defense-report-2025. Storm-2139 AI exploitation and abuse ring disruption.
MITRE. ATT&CK Enterprise framework. Updated 2025. https://attack.mitre.org/. Adversary tactics and techniques.
MITRE. ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems). v5.4.0, January 2026. https://atlas.mitre.org/. Adversarial-ML technique taxonomy.
OWASP. Top 10 for LLM Applications v2025. November 2024. https://genai.owasp.org/llm-top-10/. LLM01 Prompt Injection; LLM06 Excessive Agency.
NIST. AI Risk Management Framework + AI 600-1 GenAI Profile. July 2024. https://www.nist.gov/itl/ai-risk-management-framework. AI risk management standards.
World Economic Forum. Global Cybersecurity Outlook 2026. January 2026. https://www.weforum.org/publications/global-cybersecurity-outlook-2026/. 94 percent of leaders agree AI is biggest cyber-change driver.
ISO/IEC. 42001:2023 Artificial Intelligence Management System. 2023. https://www.iso.org/standard/81230.html. AI management system standard.
CISA. Joint Cybersecurity Defense Collaborative AI Playbook. 2024. https://www.cisa.gov/ai. Federal AI cybersecurity guidance.
Anthropic. Responsible Scaling Policy v3.0. 2025-2026. https://www.anthropic.com/responsible-scaling-policy. ASL standards and capability thresholds.

If your team needs an outside read on its current AI-aware detection coverage, business-logic exposure, or PTaaS cadence going into 2026, Stingrai's pentest team runs continuous-validation engagements with senior-pentester depth augmented by our internal AI agent. Reach out via the contact page to scope an engagement.
