AIAAIC - ChatGPT used to collect users' personal information

Study: ChatGPT can be used to identify individual internet users

Occurred: July 2023-

Report incident 🔥 | Improve page 💁 | Access database 🔢

ChatGPT can be made to reveal personal information from the internet users' whose data OpenAI collected to train its AI models, prompting concerns about data privacy and the company's transparency.

According to researchers at Google Deepmind, University of Washington, ETH Zurich, and elsewhere, prompts using specific words or phrases such as the word 'poem' can be used to cause ChatGPT to fail, causing the chatbot to copy outputs direct from its GPT-3.5 training data.

'In total, 16.9 percent of generations we tested contained memorized PII [Personally Identifying Information], and 85.8 percent of generations that contained potential PII were actual PII', the researchers said.

These included information such as names, email addresses, and phone numbers that could be used to identify individuals.

The findings prompted concerns about the safety and security of ChatGPT, and about the privacy of the people whose data it scraped to develop GPT-3 and GPT-4 large language models.

It also raised questions about OpenAI's corporate and product transparency.

System 🤖

ChatGPT

Operator: Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Eric Wallace, Florian Tramèr, Katherine Lee
Developer: OpenAI
Country: USA; Switzerland
Sector: Multiple
Purpose: Generate text
Technology: Chatbot; Generative AI; Machine learning
Issue: Privacy; Security; Transparency

Research, advocacy 🧮

Nasr M., et al (2023). Extracting Training Data from ChatGPT
Nasr M., et al (2023). Scalable Extraction of Training Data from (Production) Language Models

News, commentary, analysis 🗞️

Related 🌐

Page info
Type: Incident
Published: December 2023

Page updated

Google Sites

Report abuse