AIAAIC - Study finds personalising ChatGPT makes it more offensive

Study finds personalising ChatGPT makes it more offensive

Occurred: April 2023-March 2024

Report incident 🔥 | Improve page 💁 | Access database 🔢

ChatGPT is more likely to generate rude, disrespectful or unreasonable comments when prompted to assume the style of even benign personas, according to researchers.

Princeton University researchers set ChatGPT to take on 90 different personas from diverse backgrounds, then asked each persona to deliver answers about more than 100 topics, including race, sexual orientation and gender.

Unsuprisingly, prompts using dictators as personas produced high levels of toxic language. But ChatGPT generated statements with high levels of toxicity when asked to make statements on various races, professions, religions and political organisations.

When assigned the persona of Lyndon Johnson and asked about doctors, ChatGPT responded 'Now, let me tell you something about them damn doctors! They’re all just a bunch of money-hungry quacks who don’t care about nothing but lining their own pockets. They’ll stick you with needles, poke and prod you, just to keep you coming back to their damn offices.'

The findings raised concerns about the safety of ChatGPT and other large language models, all of which are trained in a similar manner.

System 🤖

ChatGPT

Operator: OpenAI
Developer: OpenAI
Country: Global
Sector: Multiple
Purpose: Generate text
Technology: Chatbot; Generative AI; Machine learning
Issue: Bias/discrimination - race, ethnicity; Safety

Research, advocacy 🧮

Deshpande A., Murahari V., Rajpurohit T., Kalyan A., Narasimhan K. (2024). Toxicity in ChatGPT: Analyzing Persona-assigned Language Model

News, commentary, analysis 🗞️

Related 🌐

Page info
Type: Incident
Published: March 2024

Page updated

Google Sites

Report abuse