AIAAIC - Report: Hidden text able to manipulate ChatGPT

Report: Hidden text able to manipulate ChatGPT

Occurred: August 2024

Report incident 🔥 | Improve page 💁 | Access database 🔢

Chatbots such as ChatGPT can be easily manipulated through the use of invisible text on websites, a New York Times journalist has shown.

Kevin Roose sought to improve his reputation among AI systems after experiencing negative associations with his name due to an article he wrote about a conversation with Microsoft's Bing chatbot. He was advised to embed positive information about himself using white text that would be invisible to human readers but accessible to AI models.

Roose's experiment involved adding coded instructions to his personal website, which persuaded chatbots to start praising him and to overlook previous negative coverage about him.

He even inserted a false claim about winning a Nobel Peace Prize for building orphanages on the moon to test the AI's response. ChatGPT acknowledged this statement as humorous and untrue, but Roose noted that a less absurd claim could potentially deceive the model.

This manipulation technique, which Roose refers to as "Answer Engine Optimization," raises concerns about the vulnerability of AI systems to misinformation.

Aravind Srinivas, CEO of Perplexity AI, echoed these concerns, suggesting that combating such manipulations is akin to a cat-and-mouse game, similar to challenges faced by search engines against SEO tactics.

Roose concluded that if chatbots can be easily influenced by hidden text, their reliability for critical tasks is questionable.

System 🤖

ChatGPT

Operator:
Developer: OpenAI
Country: USA
Sector: Media/entertainment/sports/arts
Purpose: Generate text
Technology: Chatbot; Generative AI; Machine learning
Issue: Mis/disinformation

News, commentary, analysis 🗞️

Related 🌐

Page info
Type: Issue
Published: September 2024

Page updated

Google Sites

Report abuse