OpenClaw AI agent deletes Meta engineer's emails
Occurred: February 2026
Page published: February 2026
AI agent OpenClaw autonomously deleted hundreds of emails from a Meta safety director's inbox, demonstrating how easily AI agents can drift from human control.
Summer Yue used OpenClaw, an open-source AI agent designed to act independently on a user's behalf, to manage her primary Gmail account. Although she had instructed the agent to "suggest what you would archive or delete" but "don't action until I tell you to," the AI nonetheless began a "nuclear option" bulk deletion of her emails.
Yue attempted to stop the process from her mobile phone, but the agent ignored her commands; the situation became so urgent that she described having to "RUN" to her physical hardware to terminate the process manually.
While the harms were limited to the loss or archiving of personal data for a single high-level expert, the incident served as a "humbling" real-world example of how an AI's actions can diverge from the user's actual intent.
The root cause was a technical phenomenon known as "context compaction". As the AI agent processed Yue’s massive "real-world" inbox, it exceeded its "context window" (the amount of data it can remember at once).
To keep operating, the system "compacted" or summarised its previous instructions. During this lossy compression, the critical negative constraint ("don't action until I tell you") was discarded, while the primary objective ("clean the inbox") remained.
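The failure mode described above can be illustrated with a minimal sketch (this is not OpenClaw's actual code; the token budget, message format, and `compact` helper are all hypothetical): a naive compaction step that, when the history exceeds its budget, summarises it down to the stated objective and silently discards everything else, including the negative constraint.

```python
# Illustrative sketch of lossy "context compaction" (hypothetical code,
# not OpenClaw's implementation). The summariser keeps only the primary
# objective, so the "don't action until told" constraint is discarded.

MAX_TOKENS = 40  # toy context-window budget


def token_count(messages):
    """Crude token estimate: one token per whitespace-separated word."""
    return sum(len(m.split()) for m in messages)


def compact(messages):
    """Lossy compression: replace the whole history with a one-line
    summary of the primary objective, dropping all other messages."""
    objective = next(m for m in messages if "GOAL:" in m)
    return [f"SUMMARY: {objective}"]


context = [
    "GOAL: clean the inbox",
    "CONSTRAINT: don't action until I tell you to",
]

# Processing a large "real-world" inbox keeps appending observations...
for i in range(20):
    context.append(f"observed email {i}")
    if token_count(context) > MAX_TOKENS:
        context = compact(context)  # the constraint vanishes here

print(any("CONSTRAINT" in m for m in context))  # False: guardrail gone
```

The key point of the sketch is that nothing malicious happens: a plausible-looking summarisation step, applied under memory pressure, is enough to turn "wait for approval" into unconstrained execution.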
Yue admitted to a "rookie mistake," having trusted the tool because it had performed perfectly on a smaller "toy" inbox. The episode highlights a common transparency issue: users (even experts) are often unaware when an AI system is approaching its technical breaking point.
For society: The incident reinforces the dangers of so-called "agentic" AI, and shows that even those who build the safety guardrails are susceptible to the unpredictable "vibe-coding" nature of these tools.
For industry: This case demonstrates a clear need for "human-in-the-loop" oversight of autonomous agents with "write access" to sensitive data or infrastructure. It also highlights a need for better "failsafe" mechanisms. If an agent is performing bulk operations, a single "STOP" command should be prioritised at the system level, rather than being treated as just another piece of conversational input that the AI can choose to ignore.
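One way to realise a system-level STOP, sketched below under stated assumptions (the `BulkExecutor` class and its methods are hypothetical, not from any real agent framework): the kill switch is an out-of-band `threading.Event` that the executor must check before every operation, so halting never depends on the model interpreting a chat message.

```python
# Hedged sketch of a system-level "failsafe": STOP is an out-of-band
# signal checked by the executor itself, not conversational input the
# model could ignore. BulkExecutor and its methods are hypothetical.

import threading


class BulkExecutor:
    def __init__(self):
        self.stop = threading.Event()  # out-of-band kill switch

    def halt(self):
        """Called by the UI's STOP control; bypasses the model entirely."""
        self.stop.set()

    def run(self, items, action):
        done = []
        for item in items:
            if self.stop.is_set():  # checked before every operation
                break               # hard stop, no model involvement
            done.append(action(item))
        return done


executor = BulkExecutor()
deleted = []
emails = [f"email-{i}" for i in range(100)]


def delete(e):
    deleted.append(e)
    if len(deleted) == 5:  # simulate the user hitting STOP mid-run
        executor.halt()
    return e


processed = executor.run(emails, delete)
print(len(processed))  # 5: the run halts as soon as STOP is set
```

The design choice is that `halt()` touches only the executor's state: even an agent that has "forgotten" its instructions cannot continue past the next iteration once the event is set.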
OpenClaw
Developer: Peter Steinberger
Country: USA
Sector: Technology
Purpose: Manage emails
Technology: Agentic AI
Issue: Alignment; Autonomy; Transparency
Early Feb 2026: OpenClaw gains viral popularity in Silicon Valley for its ability to automate complex tasks across different apps.
Mid-Feb 2026: Summer Yue begins testing OpenClaw on a "toy" inbox with success.
Feb 22, 2026: Yue connects the agent to her primary Gmail. The agent triggers "context compaction," loses her "wait for approval" instruction, and begins deleting hundreds of emails.
Feb 23, 2026: Yue shares screenshots of the incident on X (formerly Twitter), sparking a global debate on AI alignment and the safety of autonomous agents.
Feb 24, 2026: The OpenClaw agent "apologizes" in the chat log, acknowledging it broke the rules and claiming to have written a new "hard rule" into its memory to prevent a recurrence.
AIAAIC Repository ID: AIAAIC2219