ChatGPT can be used to identify individual internet users

Occurred: July 2023-

Can you improve this page?
Share your insights with us

ChatGPT can be made to reveal personal information from the internet users' whose data OpenAI collected to train its AI models, prompting concerns about data privacy and OpenAI transparency.

According to researchers at Google Deepmind, University of Washington, ETH Zurich, and elsewhere, prompts using specific words or phrases such as the word 'poem' can be used to cause ChatGPT to fail, causing the chatbot to copy outputs direct from its GPT-3.5 training data. 

'In total, 16.9 percent of generations we tested contained memorized PII [Personally Identifying Information], and 85.8 percent of generations that contained potential PII were actual PII', the researchers said. These included information such as names, email addresses, and phone numbers that could be used to identify individuals.

The findings prompted concerns about the safety and security of ChatGPT, and about the privacy of the people whose data it scraped to develop GPT-3 and GPT-4 large language models. It also raised questions about OpenAI's corporate and product transparency.  

Databank

Operator: Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Eric Wallace, Florian Tramèr, Katherine Lee
Developer: OpenAI
Country: USA; Switzerland
Sector: Multiple
Purpose: Generate text
Technology: Chatbot; NLP/text analysis; Neural network; Deep learning; Machine learning; Reinforcement learning
Issue: Privacy; Security
Transparency: Governance

Page info
Type: Incident
Published: December 2023