Anthropic’s Double Leak: Misconfigurations, npm Packaging Blunders, and the Rising Cyber Risks of “Agentic” AI Development

In the span of just five days in late March 2026, Anthropic—one of the AI industry’s most vocal proponents of safety and responsible development—suffered two major accidental data exposures. The incidents weren’t sophisticated nation-state hacks or zero-day exploits. They were textbook configuration and release-engineering mistakes: a publicly accessible CMS with lax defaults and a mispackaged npm artifact that shipped full source maps.

For a cybersecurity audience, these leaks are more than embarrassing PR hits. They expose the fragile operational security posture of frontier AI labs, reveal details of a next-generation model flagged internally for unprecedented offensive cyber capabilities, and hand competitors (and potentially adversaries) the blueprint for Anthropic’s flagship coding agent. Here’s the full story—with the security lessons that matter.

Leak #1: CMS Misconfiguration Exposes “Claude Mythos” (March 26, 2026)

Independent security researchers first spotted the trove: roughly 3,000 unpublished assets sitting in Anthropic’s content management system (CMS) and associated data store. No authentication required. Draft blog posts, images, PDFs, internal graphics, and—most critically—a complete pre-release announcement for Anthropic’s most powerful model yet, internally codenamed Claude Mythos (also referred to as Capybara).

The draft described Mythos as a “step change” in capabilities, delivering dramatically higher scores in reasoning, coding, and—crucially—cybersecurity benchmarks. Anthropic’s own assessment warned that the model is “far ahead of any other AI model in cyber capabilities” and could enable automated zero-day discovery, multi-stage attack orchestration, and highly autonomous operations. Early access testing was already underway with select enterprise security teams.

Cybersecurity red flags in the leak itself:

  • Default-public CMS buckets and data lakes remain a perennial weak point (remember the Apple, Tesla, and Epic Games pre-release exposures?). Here, assets were public unless explicitly marked private.
  • The exposure included operational details such as an invite-only CEO retreat and internal employee files—illustrating poor data classification and discovery controls.
  • Fortune was able to obtain the materials after researchers flagged them; Anthropic only restricted access after being contacted. Classic “security through obscurity” failure.
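The public-unless-marked-private failure mode above is easy to reproduce. Below is a minimal audit sketch: the asset schema and field names (`visibility`, `id`) are hypothetical, but the logic mirrors the reported behavior, where any record without an explicit private flag falls through to public.

```python
# Hypothetical CMS asset records -- adapt field names to your platform.
def find_exposed_assets(assets):
    """Return IDs of assets that are effectively public.

    Mirrors the failure mode described above: anything not explicitly
    marked "private" is treated as public by the platform.
    """
    exposed = []
    for asset in assets:
        # Danger: a missing or unexpected "visibility" value falls through to public.
        if asset.get("visibility") != "private":
            exposed.append(asset["id"])
    return exposed

inventory = [
    {"id": "customer-faq.md", "visibility": "private"},
    {"id": "internal-retreat-schedule.pdf"},                # no flag set at all
    {"id": "prerelease-blog-draft.md", "visibility": "draft"},  # not "private"!
]

print(find_exposed_assets(inventory))
```

Note that the "draft" asset leaks too: a deny-by-default check (`== "public"` to allow, rather than `!= "private"` to expose) would catch both cases.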

Anthropic called it “human error in the CMS configuration” involving an external tool. No core infrastructure, customer data, or model weights were compromised. Still, the incident underscores how AI labs’ rapid iteration cycles can outpace basic cloud security hygiene.

Leak #2: Claude Code Source Maps Spill 500,000+ Lines (March 31, 2026)

Just days later, during a routine update to Claude Code (Anthropic’s AI-powered coding agent), the company shipped an npm package that included a full JavaScript/TypeScript source map (.map file). That map pointed to a Cloudflare R2 object containing the complete source for the tool’s “agentic harness”—the layer of code that wraps the underlying LLM, supplies tools, enforces guardrails, and dictates behavior. Roughly 1,900–2,200 files and over 500,000 lines of internal logic were exposed.
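Why is one `.map` file such a complete leak? The Source Map v3 format carries original file paths in its `sources` array and, very often, the entire original code in `sourcesContent`. The sketch below uses a tiny synthetic map (the paths and strings are invented, not Anthropic's artifact) to show how trivially the originals come back out:

```python
import json

# A tiny synthetic Source Map v3 document. Real bundler output embeds the
# same fields; "sourcesContent" frequently holds the full pre-build code.
bundle_map = json.dumps({
    "version": 3,
    "file": "cli.js",
    "sources": ["src/agent/harness.ts", "src/agent/tools.ts"],
    "sourcesContent": [
        "export const SYSTEM_PROMPT = '...internal orchestration...';",
        "export function registerTools() { /* tool-calling logic */ }",
    ],
    "mappings": "AAAA",
})

def recover_sources(map_text):
    """Pair each original file path with its embedded source, if present."""
    m = json.loads(map_text)
    return dict(zip(m.get("sources", []), m.get("sourcesContent") or []))

for path, code in recover_sources(bundle_map).items():
    print(f"{path}: {len(code)} chars recovered")
```

No reverse engineering is required: recovering ~500,000 lines from a published map is a JSON parse, which is why mirrors appeared within hours.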

Security researcher Chaofan Shou discovered the artifact; mirrors proliferated on GitHub within hours. The leak did not include model weights or customer data, but it laid bare:

  • Internal feature flags and “undercover mode” logic (used by Anthropic employees for stealth contributions to open-source repos).
  • Agent tooling, prompt orchestration, and autonomous decision-making patterns.
  • Hints at the “Capybara” model architecture, including expanded context windows and planned fast/slow variants.

This was not Anthropic’s first rodeo.

A similar source-code exposure occurred with an early Claude Code version in February 2025. The company again blamed human error in the release pipeline—“a shortcut that bypassed normal safeguards”—and is now issuing DMCA takedowns for thousands of mirrored copies while promising process improvements.

Why This Matters for Cybersecurity

  1. Dual-Use AI Just Got a Public Datasheet. Mythos isn’t just another LLM. Anthropic’s own leaked materials flag it as a potential force multiplier for offensive operations: faster vulnerability research, recursive self-exploitation, and agentic attack chains that require minimal human oversight. The company has already documented real-world AI-orchestrated espionage using earlier Claude variants (November 2025 report). Leaking the capabilities profile of the very model designed to push those boundaries is the cybersecurity equivalent of publishing the specs for a next-gen weapon system.
  2. Agentic Harness = Blueprints for Malicious Agents. The Claude Code leak gives attackers and competitors the exact instruction set, tool-calling patterns, and guardrail implementations that make Anthropic’s agents effective. Reverse-engineering this could accelerate both defensive red-teaming and offensive agent development. Nation-state actors and ransomware groups now have a head start on building their own autonomous hacking tools.
  3. Persistent “Human Error” in High-Stakes Pipelines. These weren’t advanced persistent threats—they were preventable misconfigurations and build-process failures. In an industry racing to ship ever-larger models, release engineering and IaC (Infrastructure as Code) security are becoming the new perimeter. Source maps, debug artifacts, and public-by-default CMS buckets are low-hanging fruit for both automated scanners and curious researchers.

Lessons for Security Teams and AI Organizations

  • Treat internal CMS and build artifacts as production assets. Enforce strict ACLs, data classification, and automated scanning for public exposure (tools like Prowler, ScoutSuite, or custom DLP rules).
  • Harden CI/CD and release pipelines. Never ship source maps or debug symbols to public registries. Use SBOMs, artifact signing, and multi-approver gates. Exclude internal paths and secrets at build time.
  • Assume AI-assisted discovery. Modern crawlers and LLMs can enumerate misconfigured buckets and npm packages faster than ever. Regular external asset discovery (the same techniques adversaries use) should be part of every red-team exercise.
  • Zero-trust internal tooling. Even “non-core” systems like CMS instances and dev registries hold sensitive IP. Apply the same controls you’d use for customer data.
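As a concrete version of the pipeline-hardening point, here is a sketch of a pre-publish gate that fails a release if the publishable directory contains `.map` files or `sourceMappingURL` pointers. The directory layout and file names are illustrative; wire the check into whatever CI stage runs before your registry push.

```python
import os
import re
import tempfile

# Flags both shipped map files and references to external maps.
MAP_POINTER = re.compile(rb"sourceMappingURL\s*=")

def find_leaky_artifacts(dist_dir):
    """Return paths in dist_dir that would leak source maps if published."""
    offenders = []
    for root, _dirs, files in os.walk(dist_dir):
        for name in files:
            path = os.path.join(root, name)
            if name.endswith(".map"):
                offenders.append(path)        # the map file itself
                continue
            with open(path, "rb") as fh:
                if MAP_POINTER.search(fh.read()):
                    offenders.append(path)    # pointer to an external map
    return offenders

# Demo against a throwaway "dist" tree.
with tempfile.TemporaryDirectory() as dist:
    with open(os.path.join(dist, "cli.js"), "w") as f:
        f.write("console.log('hi');\n//# sourceMappingURL=cli.js.map\n")
    with open(os.path.join(dist, "cli.js.map"), "w") as f:
        f.write("{}")
    with open(os.path.join(dist, "clean.js"), "w") as f:
        f.write("console.log('ok');\n")
    print(len(find_leaky_artifacts(dist)), "artifact(s) would block this release")
```

A gate like this is cheap insurance: it turns “we shipped a 59.8 MB map to npm” into a failed build long before the tarball leaves your network.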

Anthropic has positioned itself as the responsible, safety-first alternative in the AI race. These back-to-back leaks—both chalked up to human error—reveal a gap between public rhetoric and operational reality. As frontier models grow more powerful and agentic, the security of the organizations building them becomes a national-level concern.

The real story isn’t that Anthropic leaked code and model details. It’s that even the best-funded, most safety-conscious AI lab can still trip over the same configuration and release mistakes that have plagued tech for decades.

In 2026, those mistakes don’t just expose roadmaps—they hand the next generation of cyber weapons to anyone with a GitHub account.

SquidSec will continue monitoring Anthropic’s remediation efforts and the downstream impact of the Mythos/Capybara rollout. Stay tuned.

Sources

CMS Misconfiguration Leak (Claude Mythos / Capybara – March 26, 2026)

  1. Fortune – “Exclusive: Anthropic ‘Mythos’ AI model representing ‘step change’ in capabilities leaked” (March 26, 2026) – Primary reporting on the ~3,000 exposed assets, draft blog posts, cybersecurity risk warnings, and Anthropic’s confirmation of human error in the CMS configuration.
  2. Fortune follow-up – Details on the public-by-default data store behavior and how assets became searchable without authentication.
  3. Mashable, The Decoder, and Wavespeed.ai – Coverage confirming dual naming (Mythos in public draft, Capybara as the internal tier), benchmark claims, and early-access testing notes.

Claude Code npm Source Map Leak (March 31, 2026)

  1. The Hacker News, VentureBeat, and Ars Technica – Reporting on the npm packaging error in version 2.1.88 of @anthropic-ai/claude-code, the 59.8 MB .map file, exposure of ~512,000 lines / 1,900+ files via Cloudflare R2, and discovery by security researcher Chaofan Shou.
  2. InfoWorld and The Register – Confirmation of the release pipeline shortcut, “undercover mode” logic hints, agentic harness details, and Anthropic’s statement attributing it to human error (not a breach).

Additional Context and Analysis

  1. LA Times and NDTV – Broader coverage of the back-to-back incidents, downstream mirroring on GitHub, and implications for supply-chain and agentic AI risks.
  2. Various technical breakdowns (Straiker.ai, Dev.to, Medium analyses) – Insights into exposed agent tooling, prompt orchestration, and why the harness leak accelerates both defensive and offensive agent development.

Anthropic spokespeople consistently described both events as “human error” rather than targeted intrusions, with no customer data or model weights affected. DMCA takedowns were issued for mirrored Claude Code source copies following the second incident.

All information in this article is based on publicly available reporting as of April 2, 2026. SquidSec encourages readers to review the primary sources directly and treat leaked proprietary materials with appropriate caution under applicable laws.


Need your attack surface actually tested — not just scanned?


I don’t do checkbox audits or automated-report spam. I do deep, adversary-emulated penetration testing that finds the chains attackers would actually use against you in 2026.

  • Web + API pentests
  • Cloud infrastructure & misconfig deep-dives (AWS, Azure, GCP)
  • Supply-chain & dependency risk assessments
  • Purple-team workshops and lunch-and-learn sessions for engineers
  • Custom tool development for emulating persistent threats

If you’re tired of vendors who patch CVEs but miss business logic bugs, nation-state persistence, or post-exploit pivots — let’s talk.

🕸️ Hire SquidSec
📩 contact@squidhacker.com
🔒 Encrypted comms (PGP / Signal) available on request

No fluff.
No scanner output.
No nonsense.
Just results that matter.


☣️ Mr. The Plague ☣️
squidhacker.com
