Google GoEmotions dataset


GoEmotions is a 'fine-grained' dataset that enables users to train AI applications, such as chatbots, content moderation tools, and customer support systems, to recognise emotional sentiment in text.

Released in October 2021, GoEmotions is described by Google as 'a human-annotated dataset of 58k Reddit comments extracted from popular English-language subreddits and labeled with 27 emotion categories.'

Operator: Alphabet/Google
Developer: Alphabet/Google

Country: USA

Sector: Research/academia; Technology

Purpose: Classify emotions

Technology: Database/dataset
Issue: Accuracy/reliability; Cheating/plagiarism; Ethics/values

Transparency: Privacy

Risks and harms 🛑

GoEmotions has been criticised for its high rate of mislabeled data and for violating the privacy of Reddit users by exploiting their content without consent.

Transparency and accountability 🙈

The GoEmotions dataset has several transparency limitations.

Research, advocacy 🧮

Page info
Type: Data
Published: December 2022
Last updated: June 2024