Published on July 1, 2026

Non-Human Identity Attacks: When Leaked API Keys Become Your Perimeter (2026)

Non-human identities now outnumber humans 82 to 1, and 18.1M exposed API keys and tokens made NHIs the fastest-growing attack surface in 2025. Here is how exposure happens and how to defend it.

Ivan Spiridonov

Team Lead Penetration Tester

Network Security

TL;DR

Non-human identities (service accounts, API keys, tokens, machine credentials) are now the real perimeter, and they are exposed at record scale.

SpyCloud recaptured 18.1M exposed API keys and tokens in 2025, plus 6.2M credentials and cookies tied to AI tools.
Machine identities outnumber humans 82 to 1, and 42% of them hold privileged or sensitive access (CyberArk 2025 Identity Security Landscape).
GitGuardian found 28.65M new hardcoded secrets on public GitHub in 2025, up 34% year over year, and AI-service secret leaks jumped 81%.
Leaked machine credentials rarely have MFA, rotate slowly, and carry broad scopes, so a single exposed key can mean persistent access to production, cloud, and the software supply chain.
Defense is inventory, short-lived and scoped credentials, aggressive rotation, secrets management, and continuous detection of leaked keys, validated by testing that actually tries to use them.

The perimeter you are defending is no longer the login page. In 2025, SpyCloud recaptured 18.1 million exposed API keys and tokens across payment platforms, cloud providers, developer ecosystems, collaboration tools, and AI services, plus 6.2 million credentials and authentication cookies tied to AI tools (SpyCloud, 2026 Identity Exposure Report). Those are not passwords a human types. They are machine credentials, the silent identities that authenticate one system to another. Adversaries are no longer breaking in; they are signing in with credentials that were never supposed to leave a config file.

Three forces made non-human identities (NHIs) the fastest-growing attack surface of the year. Machine identities now outnumber humans 82 to 1, and 42% of them hold privileged or sensitive access (CyberArk, 2025 Identity Security Landscape). Developers pushed 28.65 million new hardcoded secrets to public GitHub in 2025, up 34% year over year, with AI-service secret leaks up 81% (GitGuardian, State of Secrets Sprawl 2026). And identity-based attacks rose 32% in the first half of 2025, with more than 97% of them being password attacks (Microsoft, Digital Defense Report 2025). For CISOs, platform teams, and security buyers, the message is blunt: identity is the pressure point, and the least-governed identities are the ones no person logs into.

This post is Stingrai's 2026 reference for non-human identity attacks and secrets sprawl. It draws on five primary publishers (SpyCloud, GitGuardian, CyberArk, Microsoft, and Verizon), and every figure carries its source and reporting year so any claim can be audited inline. Lead data is full-year 2025 telemetry, the freshest available, because the primary publishers have not yet released their full-year 2026 reports as of July 2026. We define what NHIs are, walk through exactly how exposure happens, quantify the impact, and spend the back half on defense.

TL;DR: the numbers that define NHI risk in 2026

Exposed API keys and tokens recaptured (2025): 18.1 million (SpyCloud, 2026 Identity Exposure Report).
Machine-to-human identity ratio: 82 to 1, with 42% of machine identities holding privileged or sensitive access (CyberArk, 2025 Identity Security Landscape).
New hardcoded secrets on public GitHub (2025): 28.65 million, up 34% year over year (GitGuardian, State of Secrets Sprawl 2026).
AI-service secret leaks (2025): 1,275,105, up 81% year over year (GitGuardian, State of Secrets Sprawl 2026).
Credentials and cookies tied to AI tools (2025): 6.2 million (SpyCloud, 2026 Identity Exposure Report).
Stolen session cookies and artifacts recaptured (2025): 8.6 billion, enabling MFA bypass (SpyCloud, 2026 Identity Exposure Report).
Secrets confirmed valid in 2022 that were still valid in January 2026: more than 64% (GitGuardian, State of Secrets Sprawl 2026).
Identity-based attacks (H1 2025): up 32%, with more than 97% being password attacks (Microsoft, Digital Defense Report 2025).
Compromised credentials as an initial access vector: 22% of breaches, and stolen credentials appeared in 88% of Basic Web Application attacks (Verizon, 2025 DBIR).
Secrets in MCP configuration files: 24,008 unique, of which 2,117 were valid credentials (GitGuardian, State of Secrets Sprawl 2026).

Key takeaways

The most exposed identities are the ones no human owns. NHIs rarely have MFA, rotate infrequently, and run with broad permissions, so a single leaked key can grant persistent access to production, cloud, and software supply chains (SpyCloud, 2026 Identity Exposure Report). The identity controls most organizations invested in over the last five years, phishing-resistant MFA and conditional access, largely do not apply to service accounts and API tokens.

Leaked secrets do not expire on their own. More than 64% of credentials confirmed valid in 2022 were still valid in January 2026 (GitGuardian, State of Secrets Sprawl 2026). A key that leaked years ago and was never rotated is not a historical footnote; it is a live door.

AI adoption is pouring fuel on secrets sprawl. AI-service secret leaks rose 81% in 2025, AI-assisted code leaked secrets at roughly double the GitHub-wide baseline (3.2% versus 1.5%), and SpyCloud recaptured 6.2 million credentials and cookies tied to AI tools (GitGuardian; SpyCloud). Every new agent, connector, and model integration is another identity that needs a credential.

Attackers are logging in, not breaking in. Microsoft's finding that adversaries "aren't breaking in; they're signing in" is reinforced by Verizon: compromised credentials were the initial access vector in 22% of breaches (Microsoft, Digital Defense Report 2025; Verizon, 2025 DBIR). Machine credentials are the quietest way in because their use looks like normal automation.

You cannot protect identities you have not inventoried. With machine identities outnumbering humans 82 to 1 and 68% of organizations lacking identity security controls for AI, the governance gap is structural, not a matter of a single missing tool (CyberArk, 2025 Identity Security Landscape).

Methodology

This reference draws on five primary publishers, cited inline throughout:

SpyCloud, 2026 Identity Exposure Report (published March 19, 2026; data window full-year 2025). Built from SpyCloud's recaptured identity datalake of breach, phishing, malware, and infostealer data.
GitGuardian, State of Secrets Sprawl 2026 (published March 17, 2026; data window full-year 2025). Based on secrets detected across public GitHub commits, plus analysis of AI-service credentials and MCP configuration files.
CyberArk, 2025 Identity Security Landscape (survey of 2,600 cybersecurity decision makers at organizations of 500-plus employees, conducted by Vanson Bourne).
Microsoft, Digital Defense Report 2025 (published October 16, 2025; data window first half of 2025). Based on Microsoft's cross-cloud threat signal.
Verizon, 2025 Data Breach Investigations Report (analysis of breaches and incidents in the DBIR corpus).

The research cutoff for this post is July 2026. Where a publisher reports full-year 2025 as its most recent complete window, we use and label that figure. Any statistic that could not be traced to one of these named primary sources on a verification pass was dropped rather than estimated. Currency and unit conventions follow each source; no figures were converted.

What is a non-human identity?

A non-human identity is any credential that authenticates a machine, workload, or automation rather than a person. The category is broad and it is exactly where most organizations have the least visibility:

Service accounts that let one application authenticate to another.
API keys and tokens that grant programmatic access to SaaS platforms, payment providers, and cloud APIs.
OAuth tokens and refresh tokens issued to connected apps and integrations.
Cloud workload identities and instance roles (for example, cloud IAM roles assumed by compute).
Certificates and signing keys used for mutual TLS, code signing, and CI/CD.
Secrets embedded in AI agents, connectors, and Model Context Protocol (MCP) servers that let a model reach tools and data.

The defining property is that no human logs in. That single fact breaks most of the identity security playbook. There is no person to complete an MFA prompt, no interactive session to expire, and often no clear owner to notice when a credential is misused. As SpyCloud puts it, unlike human credentials these NHIs "often lack MFA enforcement, rotate infrequently, and operate with broad permissions" (SpyCloud, 2026 Identity Exposure Report).

Why NHIs are the fastest-growing attack surface

Scale is the first reason. CyberArk's 2025 Identity Security Landscape puts the ratio of machine identities to humans at 82 to 1, and finds that 42% of machine identities have privileged or sensitive access (CyberArk, 2025 Identity Security Landscape). Every microservice, pipeline, SaaS integration, and AI agent adds identities faster than any human-centric process can track them.

Exposure is the second reason. GitGuardian recorded 28.65 million new hardcoded secrets on public GitHub in 2025, a 34% year-over-year jump and the largest single-year increase it has measured (GitGuardian, State of Secrets Sprawl 2026). SpyCloud, working from breach and malware data rather than code repositories, independently recaptured 18.1 million exposed API keys and tokens in the same year (SpyCloud, 2026 Identity Exposure Report). Two very different collection methods, one consistent conclusion: machine secrets are leaking at industrial scale.

AI is the accelerant. AI-service secret leaks rose 81% year over year to 1,275,105, and code written with AI assistance leaked secrets at roughly double the GitHub-wide rate (GitGuardian, State of Secrets Sprawl 2026). The rush to wire models into internal systems is minting new NHIs faster than governance can absorb them, which is why 68% of organizations report they lack identity security controls for AI (CyberArk, 2025 Identity Security Landscape).

How exposure happens

Machine secrets leak through a small number of well-worn paths. Understanding them is the prerequisite to closing them.

Hardcoded secrets and git leaks

The most common source is a secret written directly into source code and pushed to a repository. It happens in a rushed commit, a debugging snippet, a test fixture, or a config file that was never meant to ship. Once it lands in git history, deleting the line does not remove it: the secret stays retrievable from the commit history unless that history is rewritten, and it stays usable until the credential is rotated. GitGuardian's 28.65 million figure is drawn from public GitHub alone, and generic secrets that lack an obvious provider prefix are the hardest to auto-detect and revoke (GitGuardian, State of Secrets Sprawl 2026).
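To make the difference between prefixed and generic secrets concrete, here is a minimal scanning sketch. It is an illustration under stated assumptions, not how GitGuardian or any other vendor detects secrets: the two provider patterns reflect widely documented key formats, while the `GENERIC_ASSIGNMENT` heuristic, the entropy threshold of 3.5, and all function names are hypothetical choices made for this example.

```python
import math
import re
from collections import Counter

# Illustrative provider rules only; production scanners ship far more of them.
PROVIDER_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat_classic": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}

# Hypothetical heuristic for generic secrets: a quoted value of 16+ characters
# assigned to a name that contains secret, token, password, or api_key.
GENERIC_ASSIGNMENT = re.compile(
    r"(?i)(?:secret|token|password|api_?key)\w*\s*[:=]\s*['\"]([^'\"]{16,})['\"]"
)


def shannon_entropy(value: str) -> float:
    """Bits of entropy per character; random keys score higher than words."""
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_text(text: str, entropy_threshold: float = 3.5) -> list[tuple[int, str]]:
    """Return (line number, rule name) pairs for every suspected secret."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in PROVIDER_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule_name))
        match = GENERIC_ASSIGNMENT.search(line)
        if match and shannon_entropy(match.group(1)) >= entropy_threshold:
            findings.append((lineno, "generic_high_entropy"))
    return findings
```

The prefixed rules are cheap and precise, which is why provider-issued keys are the easier case. The generic rule has to guess from variable names and randomness, so it will both miss secrets and flag harmless strings. A scanner like this only helps if it runs over the full commit history, not just the current checkout.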

CI/CD pipelines and configuration files

Build systems are dense with secrets: registry credentials, deploy keys, cloud roles, and signing certificates. When those are stored as plaintext environment variables, printed in verbose build logs, or committed in pipeline definitions, they leak. The newest variant is the AI toolchain. GitGuardian found 24,008 unique secrets exposed in MCP configuration files on public GitHub, of which 2,117 were valid, in part because popular setup guides recommend pasting API keys straight into config files and connection strings (GitGuardian, State of Secrets Sprawl 2026).
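One way to catch the MCP variant before it is committed is to check config files for credentials pasted in as literal values. The sketch below assumes a JSON config shaped like the `mcpServers` and `env` layout that several MCP clients use; that shape, the `${VAR}` reference convention, and the name-based `SENSITIVE_NAME` filter are assumptions for illustration, so adapt them to whatever your tooling actually reads.

```python
import json
import re

# Assumption: values such as "${VAR}" or "$VAR" are references resolved at
# runtime, while anything else is a literal pasted into the file.
ENV_REFERENCE = re.compile(r"^\$\{?[A-Za-z_][A-Za-z0-9_]*\}?$")
# Hypothetical filter: only variables whose names suggest a credential.
SENSITIVE_NAME = re.compile(r"(?i)(key|token|secret|password)")


def find_inline_secrets(config_text: str) -> list[str]:
    """List config paths whose sensitive env values are hardcoded literals."""
    config = json.loads(config_text)
    findings = []
    for server_name, server in config.get("mcpServers", {}).items():
        for var_name, value in server.get("env", {}).items():
            if not SENSITIVE_NAME.search(var_name):
                continue
            if isinstance(value, str) and value and not ENV_REFERENCE.match(value):
                findings.append(f"{server_name}.env.{var_name}")
    return findings
```

A check like this reports where a literal sits without printing the value, so the scanner's own output does not become another leak.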

Infostealer malware and session theft

Not every leak is a developer mistake. Infostealer malware harvests credentials, tokens, and session cookies from infected machines and sells them in bulk. SpyCloud recaptured 642.4 million exposed credentials from 13.2 million infostealer infections in 2025, and 8.6 billion stolen cookies and session artifacts that let attackers replay authenticated sessions and bypass MFA entirely (SpyCloud, 2026 Identity Exposure Report). Session tokens are non-human by nature: they authenticate a session, not a person, so stealing one sidesteps the login screen.

Over-scoped and long-lived tokens

The final path is not a leak at all; it is bad hygiene that turns any leak into a catastrophe. A token scoped to a single read-only endpoint and expiring in an hour is a nuisance if it leaks. A token with account-wide write access and no expiry is a breach. Because NHIs rotate infrequently, leaked keys stay live for years: more than 64% of secrets confirmed valid in 2022 were still valid in January 2026 (GitGuardian, State of Secrets Sprawl 2026).

The impact: from one leaked key to full compromise

A leaked machine credential is rarely the end of an attack; it is the beginning. Because NHIs authenticate systems to each other, a single valid key drops an attacker into the middle of your environment, past the login page and past MFA.

The path is consistent. An attacker validates a leaked key against its provider, confirms it still works, then uses its scope to read data, assume additional roles, or push code. Because the activity looks like normal automation, it blends in. Microsoft frames the shift plainly: adversaries "aren't breaking in; they're signing in," and identity-based attacks rose 32% in the first half of 2025 (Microsoft, Digital Defense Report 2025). Verizon quantifies the outcome: compromised credentials were the initial access vector in 22% of breaches, and stolen credentials showed up in 88% of Basic Web Application attacks (Verizon, 2025 DBIR).

Three impact patterns dominate:

Session hijacking. Stolen cookies and tokens let an attacker resume an authenticated session without ever solving MFA. With 8.6 billion stolen session artifacts in circulation, this is a volume problem, not an edge case (SpyCloud, 2026 Identity Exposure Report).
Cloud and production access. An over-scoped cloud or service credential can expose production data, infrastructure, and the ability to pivot laterally between services.
Software supply-chain compromise. A leaked CI/CD or package-registry credential lets an attacker tamper with builds, publish malicious versions, or sign artifacts, poisoning every downstream consumer.

SpyCloud's Chief Intelligence Officer, Trevor Hilligoss, describes attackers "stealing authenticated access, including API keys, session tokens and automation credentials, and using this access to move faster, stay persistent, and scale attacks across cloud and enterprise environments" (SpyCloud, 2026 Identity Exposure Report). The through-line of every pattern is persistence: the credential keeps working until someone rotates it.

Human versus non-human identity: the control gap

The reason NHIs are so dangerous is that the controls that hardened human identities over the last decade mostly do not reach them. The comparison below shows where the gap lives.

Control	Human identity (typical)	Non-human identity (typical)
Multi-factor authentication	Enforced, increasingly phishing-resistant	Rarely applicable; no interactive prompt
Session lifetime	Short, with idle timeouts	Long-lived tokens, often no expiry
Credential rotation	Password policies, forced resets	Rotates infrequently or never
Scope of access	Role-based, least privilege trending	Often broad or account-wide
Ownership and offboarding	Tied to an employee lifecycle	Frequently orphaned, no clear owner
Monitoring	Login anomaly detection is mature	Use looks like normal automation

Source lines for the underlying claims: NHIs "often lack MFA enforcement, rotate infrequently, and operate with broad permissions" (SpyCloud, 2026 Identity Exposure Report); 42% of machine identities hold privileged or sensitive access (CyberArk, 2025 Identity Security Landscape).

What this means for defenders

The good news is that NHI risk is addressable with disciplined engineering practice. The work is inventory, scoping, rotation, storage, and detection, validated by adversarial testing.

1. Inventory every non-human identity

You cannot govern what you have not counted. Build and maintain an inventory of service accounts, API keys, tokens, workload identities, certificates, and AI-agent credentials, with an owner, a scope, and an expiry for each. With machine identities outnumbering humans 82 to 1, this is a program, not a spreadsheet (CyberArk, 2025 Identity Security Landscape). Prioritize the credentials with privileged or sensitive scope first.
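As a rough sketch of what an inventory record and its review queue could look like, the example below models each identity with an owner, scopes, and an expiry, then surfaces the ones with governance gaps, privileged identities first. The field names, gap labels, and the `"*"` wildcard convention are hypothetical; a real inventory would be fed from cloud IAM, SaaS admin APIs, and CI/CD systems.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class NonHumanIdentity:
    name: str
    kind: str                        # e.g. "service_account", "api_key"
    owner: Optional[str]             # None means nobody is accountable
    scopes: tuple[str, ...]          # "*" is treated as account-wide access
    expires_at: Optional[datetime]   # None means the credential never expires
    privileged: bool = False


def governance_gaps(identity: NonHumanIdentity) -> list[str]:
    """Return the labels of the inventory fields this identity is missing."""
    gaps = []
    if not identity.owner:
        gaps.append("no_owner")
    if not identity.scopes or "*" in identity.scopes:
        gaps.append("unscoped")
    if identity.expires_at is None:
        gaps.append("no_expiry")
    return gaps


def review_queue(inventory: list[NonHumanIdentity]) -> list[tuple[str, list[str]]]:
    """Identities with gaps: privileged first, then most gaps, then by name."""
    flagged = [(item, governance_gaps(item)) for item in inventory]
    flagged = [(item, gaps) for item, gaps in flagged if gaps]
    flagged.sort(key=lambda pair: (not pair[0].privileged, -len(pair[1]), pair[0].name))
    return [(item.name, gaps) for item, gaps in flagged]
```

Sorting privileged identities to the top mirrors the prioritization described above: an orphaned, account-wide service account deserves attention before a read-only key that merely lacks an expiry.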

2. Make credentials short-lived and tightly scoped

The single highest-leverage change is to shrink the blast radius of any one leak. Prefer short-lived, automatically issued credentials (workload identity federation, OIDC-based cloud roles, ephemeral tokens) over long-lived static keys. Scope every credential to the minimum access it needs. A leaked token that expires in minutes and can touch one endpoint is a very different incident from an account-wide key with no expiry.
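To illustrate how expiry and scope bound the damage, here is a toy token built only from the Python standard library. It is a sketch of the idea, not a substitute for workload identity federation or a provider's OIDC tokens: the token format, the five-minute default lifetime, and the function names are all assumptions made for this example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def issue_token(signing_key: bytes, subject: str, scopes: list[str],
                ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Issue a signed token that carries its own scopes and expiry time."""
    issued = time.time() if now is None else now
    claims = {"sub": subject, "scopes": sorted(scopes), "exp": issued + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    signature = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{signature}"


def authorize(signing_key: bytes, token: str, required_scope: str,
              now: Optional[float] = None) -> bool:
    """Accept only an untampered, unexpired token that holds the scope."""
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return False
    expected = hmac.new(signing_key, payload.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison; the payload is only parsed after it verifies.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    current = time.time() if now is None else now
    return current < claims["exp"] and required_scope in claims["scopes"]
```

If a token like this leaks, the attacker gets the listed scopes for at most `ttl_seconds`. Compare that with a static key, where both the scope and the lifetime are whatever was set on the day it was created.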

3. Rotate aggressively, and treat any leak as live

Because leaked secrets stay valid for years, rotation is not optional hygiene; it is incident containment (GitGuardian, State of Secrets Sprawl 2026). Automate rotation so it is routine rather than a fire drill, and when a secret is found in code, logs, or a paste site, revoke it first and investigate second. Deleting a line from the current file is not remediation; the credential remains in git history and stays usable until it is rotated.
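The policy above can be sketched as a small decision function: anything flagged as leaked is revoked immediately regardless of age, and everything else is rotated once it passes a maximum age. The record fields (`id`, `created_at`, `leaked`) and the action labels are hypothetical, and the actual revoke and rotate calls would go to your provider or secrets manager.

```python
from datetime import datetime, timedelta, timezone


def rotation_actions(credentials: list[dict], max_age: timedelta,
                     now: datetime) -> dict[str, str]:
    """Map credential id to the action it needs; healthy ones are omitted."""
    actions = {}
    for cred in credentials:
        if cred.get("leaked"):
            # A leak is treated as live: revoke first, investigate second.
            actions[cred["id"]] = "revoke_now"
        elif now - cred["created_at"] > max_age:
            actions[cred["id"]] = "rotate"
    return actions
```

Running a check like this on a schedule is what makes rotation routine; the leaked branch is the same code path an on-call engineer would trigger by hand.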

4. Get secrets out of code and into a manager

Store secrets in a dedicated secrets manager or vault, inject them at runtime, and keep them out of source, config files, build logs, and chat. Add secret scanning to pre-commit hooks and CI so a hardcoded key is caught before it is ever pushed. Given that AI-assisted commits leak secrets at roughly double the baseline rate, this guardrail matters more as AI-generated code grows (GitGuardian, State of Secrets Sprawl 2026).
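On the application side, runtime injection can be as simple as reading the secret from the process environment and refusing to start without it. The sketch below assumes the secrets manager or orchestrator has already placed the value in an environment variable; `require_secret`, `redact`, and `MissingSecretError` are hypothetical helper names.

```python
import os
from typing import Mapping


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not injected at runtime."""


def require_secret(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Fetch an injected secret, failing fast instead of using a fallback."""
    value = environ.get(name, "").strip()
    if not value:
        # The message names the variable but never includes a secret value.
        raise MissingSecretError(f"secret {name!r} was not injected into the environment")
    return value


def redact(message: str, secrets: list[str]) -> str:
    """Mask known secret values before a message reaches logs or chat."""
    for secret in secrets:
        if secret:
            message = message.replace(secret, "[REDACTED]")
    return message
```

Failing fast matters because the usual alternative, a hardcoded default for convenience, is exactly how keys end up in source. Redaction is a backstop for verbose build logs, not a reason to print secrets in the first place.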

5. Detect leaked keys and abused sessions continuously

Monitor for your own credentials appearing in public repositories, paste sites, and breach and infostealer datasets, and instrument your platforms to flag anomalous token use: a key called from a new geography, at an unusual rate, or against endpoints it never touched before. With 8.6 billion session artifacts in circulation, session anomaly detection is as important as credential detection (SpyCloud, 2026 Identity Exposure Report).
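The three signals named above (new geography, unusual rate, unfamiliar endpoint) can be sketched as a per-key baseline check. This is a deliberately simple illustration: the class and method names, the 60-second sliding window, and the fixed rate limit are assumptions, and a production detector would learn baselines from historical logs rather than from explicit `learn` calls.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class KeyBaseline:
    countries: set[str] = field(default_factory=set)
    endpoints: set[str] = field(default_factory=set)
    recent_calls: deque = field(default_factory=deque)  # call timestamps in seconds


class TokenUseMonitor:
    def __init__(self, max_calls_per_minute: int = 60):
        self.max_calls_per_minute = max_calls_per_minute
        self.baselines: dict[str, KeyBaseline] = defaultdict(KeyBaseline)

    def learn(self, key_id: str, country: str, endpoint: str) -> None:
        """Record behavior that is known to be legitimate for this key."""
        baseline = self.baselines[key_id]
        baseline.countries.add(country)
        baseline.endpoints.add(endpoint)

    def observe(self, key_id: str, country: str, endpoint: str,
                timestamp: float) -> list[str]:
        """Return alert labels for one API call made with the given key."""
        baseline = self.baselines[key_id]
        alerts = []
        if country not in baseline.countries:
            alerts.append("new_geography")
        if endpoint not in baseline.endpoints:
            alerts.append("new_endpoint")
        # Sliding window: keep only calls from the last 60 seconds.
        baseline.recent_calls.append(timestamp)
        while timestamp - baseline.recent_calls[0] >= 60:
            baseline.recent_calls.popleft()
        if len(baseline.recent_calls) > self.max_calls_per_minute:
            alerts.append("unusual_rate")
        return alerts
```

Because machine traffic is repetitive, even a crude baseline like this separates normal automation from a key that suddenly appears in a new country and starts calling admin endpoints.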

How Stingrai helps

Controls only count if they hold up when someone actually tries to abuse a leaked key, and that is where offensive testing earns its place. Stingrai is a Toronto- and London-based offensive-security firm and a firm-level CREST-accredited penetration testing service provider, founded in 2021. Our web application penetration testing hunts for the hardcoded secrets, over-scoped tokens, and broken authorization flaws that let a leaked machine identity pivot into real access, the same IDOR and access-control classes that turn one valid key into full compromise. Our network- and cloud-focused penetration testing and PTaaS validate whether an exposed credential can actually reach production, and our red teaming runs assume-breach scenarios that start exactly where an attacker starts: with a working credential.

Snipe, Stingrai's autonomous AI agent for web application penetration testing, reinforces this on the code side. Snipe performs white-box source-code review that surfaces hardcoded secrets and the authorization and IDOR flaws that let leaked machine identities move laterally, opens AutoFix pull requests for what it finds, and can run as a PR-gating check that blocks vulnerable code from merging. Trained on thousands of real HackerOne disclosure reports and on the methodology of Stingrai's human pentesters, Snipe is built to find the complex business-logic and authorization bugs that generic scanners miss. Stingrai's testing supports your SOC 2, ISO 27001, and PCI DSS compliance programs by producing the evidence your audits expect. See Stingrai pricing for engagement options.

For related reading on how exposed keys and misconfiguration become real access, see our breakdown of Supabase anon-key and RLS misconfigurations and our GitHub Actions security checklist for locking down CI/CD secrets.

Frequently asked questions

What is a non-human identity (NHI)?

A non-human identity is any credential that authenticates a machine, workload, or automation rather than a person: service accounts, API keys, OAuth tokens, cloud workload identities, certificates, and secrets embedded in AI agents. The defining trait is that no human logs in, so MFA and interactive session controls usually do not apply. SpyCloud notes that NHIs "often lack MFA enforcement, rotate infrequently, and operate with broad permissions," which is exactly why they are attractive targets (SpyCloud, 2026 Identity Exposure Report).

How many API keys and tokens leaked in 2025?

SpyCloud recaptured 18.1 million exposed API keys and tokens in 2025, spanning payment platforms, cloud providers, developer ecosystems, collaboration tools, and AI services (SpyCloud, 2026 Identity Exposure Report). Separately, GitGuardian counted 28.65 million new hardcoded secrets pushed to public GitHub in the same year, up 34% year over year (GitGuardian, State of Secrets Sprawl 2026).

Why are non-human identities so hard to secure?

Three reasons: scale, exposure, and missing controls. Machine identities outnumber humans 82 to 1, and 42% hold privileged or sensitive access, so the volume alone outpaces human-centric processes (CyberArk, 2025 Identity Security Landscape). They leak constantly through code and malware, and the MFA, short-session, and rotation controls that protect human accounts mostly do not apply to a token no person logs into.

How does a leaked API key lead to a breach?

An attacker validates the leaked key against its provider, confirms it still works, then uses its scope to read data, assume additional roles, or push code, all while the activity looks like normal automation. Because NHIs rotate slowly, the key often keeps working for years. Verizon found compromised credentials were the initial access vector in 22% of breaches, and stolen credentials appeared in 88% of Basic Web Application attacks (Verizon, 2025 DBIR).

Is AI making secrets sprawl worse?

Yes. GitGuardian found AI-service secret leaks rose 81% in 2025, and code written with AI assistance leaked secrets at roughly double the GitHub-wide baseline, 3.2% versus 1.5% (GitGuardian, State of Secrets Sprawl 2026). SpyCloud separately recaptured 6.2 million credentials and cookies tied to AI tools (SpyCloud, 2026 Identity Exposure Report). Every new agent, connector, and MCP server is another identity that needs a credential, and setup guides often push those secrets straight into config files.

How do you defend against non-human identity attacks?

Start by inventorying every NHI with an owner, scope, and expiry, then make credentials short-lived and tightly scoped, rotate aggressively, move secrets out of code into a secrets manager, and continuously detect leaked keys and abused sessions. Validate all of it with penetration testing that actually tries to use exposed credentials to reach production. Stingrai's web application penetration testing and PTaaS are built to find the hardcoded secrets and authorization flaws that turn one leaked key into full compromise.

References

SpyCloud. 2026 Identity Exposure Report. March 19, 2026. https://spycloud.com/newsroom/annual-identity-exposure-report-2026/. Built from SpyCloud's recaptured identity datalake of breach, phishing, malware, and infostealer data; source for the 18.1M exposed API keys and tokens, 6.2M AI-tool credentials, 8.6B session artifacts, and 65.7B total identity records.
GitGuardian. The State of Secrets Sprawl 2026. March 17, 2026. https://blog.gitguardian.com/the-state-of-secrets-sprawl-2026/. Annual analysis of secrets detected on public GitHub, AI-service credentials, and MCP configuration files; source for the 28.65M new secrets, 81% AI-service leak growth, MCP config exposures, and secret validity figures.
CyberArk. 2025 Identity Security Landscape. 2025. https://www.cyberark.com/press/machine-identities-outnumber-humans-by-more-than-80-to-1-new-report-exposes-the-exponential-threats-of-fragmented-identity-security/. Survey of 2,600 security decision makers at 500-plus-employee organizations (Vanson Bourne); source for the 82:1 machine-to-human ratio, 42% privileged access, and 68% lacking AI identity controls.
Microsoft. Microsoft Digital Defense Report 2025. October 16, 2025. https://blogs.microsoft.com/on-the-issues/2025/10/16/mddr-2025/. Cross-cloud threat intelligence; source for the 32% rise in identity-based attacks, the 97% password-attack share, and the "signing in, not breaking in" framing.
Verizon. 2025 Data Breach Investigations Report. 2025. https://www.verizon.com/business/resources/reports/2025-dbir-data-breach-investigations-report.pdf. Analysis of breaches and incidents; source for compromised credentials as the initial access vector in 22% of breaches and for stolen credentials appearing in 88% of Basic Web Application attacks.
