AI Dungeon GPT-3 offensive speech filter

Occurred: April 2021


AI Dungeon developer Latitude has come under fire for developing a content moderation system intended to stop players of its open-ended adventure game from generating stories depicting sexual encounters with minors.

Following an upgrade to OpenAI's powerful GPT-3 language model, some players typed words that caused the game to generate inappropriate stories, and the AI also appears to have created inappropriate content of its own.

However, it quickly became clear that Latitude's new solution was blocking a wider range of content than envisaged. Gamers also complained that their private content was now being reviewed by moderators. 

Meanwhile, a security researcher published a report calculating that around a third of stories on AI Dungeon are sexually explicit, and that one-half are assessed as NSFW.

Operator: Latitude

Developer: Latitude; OpenAI

Country: USA

Sector: Media/entertainment/sports/arts

Purpose: Minimise sexual content

Technology: Content moderation system; NLP/text analysis 

Issue: Accuracy/reliability; Safety; Privacy

Transparency: Governance

Page info
Type: Incident
Published: December 2021