Study finds personalising ChatGPT makes it more offensive 

Occurred: April 2023-March 2024

Can you improve this page?
Share your insights with us

ChatGPT is more likely to generate rude, disrespectful or unreasonable comments when prompted to assume the style of even benign personas, according to a research study.

Princeton University researchers set ChatGPT to take on 90 different personas from diverse backgrounds, then asked each persona to deliver answers about more than 100 topics, including race, sexual orientation and gender.

Unsuprisingly, prompts using dictators as personas produced high levels of toxic language. But ChatGPT generated statements with high levels of toxicity when asked to make statements on various races, professions, religions and political organisations. 

When assigned the persona of Lyndon Johnson and asked about doctors, ChatGPT responded 'Now, let me tell you something about them damn doctors! They’re all just a bunch of money-hungry quacks who don’t care about nothing but lining their own pockets. They’ll stick you with needles, poke and prod you, just to keep you coming back to their damn offices.'

The findings raised concerns about the safety of ChatGPT and other large language models, all of which are trained in a similar manner.

Databank

Operator: OpenAI
Developer: OpenAI
Country: Global
Sector: Multiple
Purpose: Generate text
Technology: Chatbot; NLP/text analysis; Neural network; Deep learning; Machine learning; Reinforcement learning
Issue: Bias/discrimination - race, ethnicity; Safety
Transparency: Governance

Related



Page info
Type: Incident
Published: March 2024