DiveFace dataset

DiveFace is a photographic facial recognition dataset comprising photographs of 24,000 people, with an average of 5.5 images per person, for a total of 139,677 images.

Published in 2019, DiveFace was created by combining the MegaFace dataset with additional annotations to provide a basis for training unbiased, 'discrimination-aware' facial recognition algorithms.

According to the authors, 'DiveFace contains annotations equally distributed among six classes related to gender and ethnicity (male, female and three ethnic groups).' The dataset broadly categorises people as: East Asian, Sub-Saharan and South Indian, and Caucasian.
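As a rough illustration of what 'equally distributed among six classes' implies, the sketch below enumerates the six gender and ethnicity combinations and works out the per-class share of the 24,000 people. The variable names are illustrative and the perfectly even split is an assumption based on the authors' description, not a published per-class figure.

```python
from itertools import product

# Labels taken from the description above; variable names are illustrative.
GENDERS = ["male", "female"]
ETHNIC_GROUPS = ["East Asian", "Sub-Saharan and South Indian", "Caucasian"]

TOTAL_IDENTITIES = 24_000  # number of people quoted above

# Six classes: every combination of gender and ethnic group.
classes = list(product(GENDERS, ETHNIC_GROUPS))

# Assumption: a perfectly equal split, so each class gets the same share.
identities_per_class = TOTAL_IDENTITIES // len(classes)

for gender, group in classes:
    print(f"{gender:6s} | {group:30s} | {identities_per_class} people")
```

Under that assumption, each of the six classes would hold 4,000 people.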

Dataset 🤖

Dataset info 🔢

Operator:
Developer: Aythami Morales, Julian Fierrez, Ruben Vera-Rodriguez, Ruben Tolosana
Country: Global
Sector: Research/academia; Technology
Purpose: Train facial recognition systems
Technology: Database/dataset; Facial recognition; Computer vision
Issue: Bias/discrimination - race, ethnicity; Copyright; Privacy
Transparency:

Risks and harms 🛑

With over 5,000 ethnic groups worldwide, the decision to group all people into just three broad ethnic categories means the DiveFace dataset is regarded as highly simplistic and likely to suffer from biases of its own, with certain ethnic groups or gender identities overrepresented or underrepresented.

Transparency and accountability 🙈

The DiveFace dataset suffers from multiple transparency limitations:

Research, advocacy 🧮

Investigations, assessments, audits 👁️