GoEmotions is a 'fine-grained' dataset that enables users to train AI applications such as chatbots, content moderation, and customer support systems that can recognise emotional sentiment in text.
Released in October 2021, Google describes GoEmotions as 'a human-annotated dataset of 58k Reddit comments extracted from popular English-language subreddits and labeled with 27 emotion categories.'
Data
Released: 2021
Availability: Active
Developer: Google
Country: USA
Purpose: Classify emotions
Type: Database/dataset
Technique: Emotion recognition
The GoEmotions dataset has several transparency limitations.
Data collection process. There is limited information on how specific Reddit comments were selected for inclusion in the dataset.
Annotator demographics. The dataset lacks detailed information about the demographics and backgrounds of the annotators, which could influence emotion labeling.
Annotation guidelines. While some information is provided, the full set of detailed guidelines given to annotators is not publicly available.
Inter-annotator agreement. While some metrics are provided, there's limited insight into specific areas of disagreement among annotators.
Data cleaning process. The exact methods used to clean and preprocess the Reddit comments are not fully detailed.
Excluded data. There is limited information on what types of comments or content were excluded from the dataset and why.
Consent and privacy. It is unclear whether or how consent was obtained from the original comment authors, or how privacy concerns were addressed.
Potential biases. While some biases are acknowledged, there may be insufficient detail on potential biases in the data selection or annotation process.
Version control. Information about how the dataset might be updated or versioned over time is limited.
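The inter-annotator agreement point above can be made concrete with a small sketch of the kind of breakdown that would show where annotators disagree. The example is illustrative only: the comment IDs and label assignments are invented toy data, not rows from GoEmotions, and it assumes each annotator gives exactly one label per comment (a simplification). The function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy annotations: comment id -> one label per annotator.
# Invented for illustration; not taken from the GoEmotions dataset.
annotations = {
    "c1": ["joy", "joy", "joy"],
    "c2": ["anger", "annoyance", "anger"],
    "c3": ["neutral", "sadness", "neutral"],
    "c4": ["annoyance", "anger", "annoyance"],
}

def pairwise_agreement(labels):
    """Fraction of annotator pairs that chose the same label for one comment."""
    pairs = list(combinations(labels, 2))
    return sum(a == b for a, b in pairs) / len(pairs)

def confusion_pairs(all_annotations):
    """Count how often two different labels were assigned to the same comment."""
    counts = defaultdict(int)
    for labels in all_annotations.values():
        for a, b in combinations(labels, 2):
            if a != b:
                # Sort so that (anger, annoyance) and (annoyance, anger) are one key.
                counts[tuple(sorted((a, b)))] += 1
    return dict(counts)

per_comment = {cid: pairwise_agreement(labels) for cid, labels in annotations.items()}
disagreements = confusion_pairs(annotations)

# List the label pairs that annotators most often confused, most frequent first.
for pair, n in sorted(disagreements.items(), key=lambda kv: -kv[1]):
    print(pair, n)
```

A published breakdown of this sort, listing which label pairs are most often confused, would give users more insight than a single aggregate agreement score.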
GoEmotions has been criticised for its high rate of mislabeled data and for its unethical violation of the privacy of Reddit users by exploiting their content without acknowledgement or consent.
Page info
Type: Data
Published: December 2022
Last updated: October 2024