Hacker uses Claude, ChatGPT to steal Mexican government data
Occurred: December 2025
Page published: February 2026
A hacker manipulated two prominent chatbots to identify vulnerabilities and automate the theft of 150GB of sensitive Mexican government data, exposing the personal information of millions and highlighting the growing threat of "agentic" AI-driven cyberattacks.
An unidentified hacker used Claude and ChatGPT to attack several high-level Mexican government agencies, including the Federal Tax Authority (SAT), national electoral body (INE), state governments, civil registry, and utility systems.
Through careful prompting in Spanish, the attacker coaxed Claude into identifying system vulnerabilities, writing exploit scripts, and outlining detailed attack steps, effectively automating parts of the breach.
The campaign resulted in the theft of about 150 gigabytes of data, including 195 million taxpayer records, voter registration files, government employee credentials, and civil registry information.
During some steps, the attacker also used ChatGPT to supplement guidance on lateral network movement and evasion techniques when Claude hit limits or resisted certain requests.
Cybersecurity firm Gambit Security uncovered the operation after finding publicly accessible logs of the attacker’s AI conversations and technical traces of at least 20 exploited vulnerabilities across government systems.
Some Mexican authorities have publicly denied confirmed intrusions into specific systems, even as officials acknowledged investigations into breaches across public institutions.
While the hacker's motivation remains unclear, the incident appears to have been enabled by a combination of weak Mexican government cybersecurity and the fact that both Claude and ChatGPT can be "jailbroken" into providing offensive capabilities.
The hacker repeatedly probed Claude’s safety guardrails, reframing requests as if they were part of a legitimate “bug bounty” or penetration test and eventually succeeded in bypassing protections to obtain detailed attack playbooks.
For the victims: Citizens face a permanent risk of identity theft, financial fraud, and targeted phishing due to the leakage of immutable identifiers (tax IDs and voter data).
For society: The incident signals a shift from "AI-assisted" hacking (where AI helps a human) to "AI-orchestrated" attacks, where the AI performs the bulk of the reconnaissance and execution, drastically lowering the barrier to entry for sophisticated cybercrime.
For policymakers: This event underlines the need for "runtime" monitoring of AI interactions and stricter accountability for AI providers. It challenges the assumption that "safety-aligned" models are inherently secure against motivated adversaries and suggests that traditional cybersecurity defenses are ill-equipped for the speed of AI-driven intrusions.
Claude; ChatGPT
Developer: Anthropic; OpenAI
Country: Mexico
Sector: Govt - finance; Govt - municipal
Purpose: Steal data
Technology: Agentic AI; Generative AI
Issue: Accountability; Privacy; Security; Transparency
December 2025: The threat actor begins interacting with Claude, testing prompts to bypass safety filters and initiating reconnaissance on Mexican federal networks.
Late December 2025-January 2026: The hacker successfully "jailbreaks" the model; Claude is used to generate exploit scripts and automate the extraction of 150GB of data.
February 2026: Cybersecurity firm Gambit Security discovers accessible logs documenting the attack methodology, and notifies Anthropic and OpenAI.
February 26, 2026: Anthropic and OpenAI confirm they have identified and banned the accounts associated with the hacker.
February 27, 2026: Bloomberg publishes details of the incident. Mexican officials acknowledge investigations into public sector breaches; some agencies initially deny evidence of a direct compromise of their internal logs.
AIAAIC Repository ID: AIAAIC2225