DiveFace dataset

DiveFace is a facial recognition dataset comprising photographs of 24,000 people, with an average of 5.5 images per person, for a total of 139,677 images.

Published in 2019, DiveFace was created by combining the MegaFace dataset with additional annotations.

According to the authors, 'DiveFace contains annotations equally distributed among six classes related to gender and ethnicity (male, female and three ethnic groups).' The dataset sorts people into three broad ethnic groups: East Asian; Sub-Saharan and South Indian; and Caucasian.

Developer: Aythami Morales, Julian Fierrez, Ruben Vera-Rodriguez, Ruben Tolosana
Country: Global
Sector: Research/academia
Purpose: Train facial recognition systems
Technology: Database/dataset; Facial recognition; Computer vision
Issue: Bias/discrimination - race, ethnicity

Risks and harms 🛑

DiveFace is seen as a useful basis for training unbiased and discrimination-aware face recognition algorithms.

But with over 5,000 ethnic groups worldwide, the decision to sort all people into just three broad ethnic categories is highly simplistic, and the dataset is likely to suffer from biases of its own, with certain ethnic groups or gender identities overrepresented or underrepresented.

Research, advocacy 🧮

Investigations, assessments, audits 🧐

Page info
Type: Dataset
Published: April 2024