Google GoEmotions dataset

GoEmotions is a 'fine-grained' dataset for training AI applications, such as chatbots, content moderation tools, and customer support systems, to recognise emotional sentiment in text.

Google released GoEmotions in October 2021 and describes it as 'a human-annotated dataset of 58k Reddit comments extracted from popular English-language subreddits and labeled with 27 emotion categories.'

Operator: Alphabet/Google
Developer: Alphabet/Google

Country: USA

Sector: Research/academia; Technology

Purpose: Classify emotions

Technology: Database/dataset
Issue: Accuracy/reliability; Cheating/plagiarism; Ethics/values

Transparency: Privacy

Risks and harms 🛑

GoEmotions has been criticised for its high rate of mislabeled data and for violating the privacy of Reddit users by exploiting their content without consent. 

Research, advocacy 🧮

Page info
Type: Dataset
Published: December 2022
Last updated: June 2024