VGG Face is a dataset created by University of Oxford researchers that comprises 2.6 million facial images of 2,622 people. It was created to give researchers working on facial recognition systems access to biometric data.
The dataset mostly comprises celebrities, public figures, actors, and politicians whose names were chosen 'by extracting males and females, ranked by popularity, from the Internet Movie Data Base (IMDB) celebrity list.'
Information about ethnicity, age, and kinship was also collected from IMDB.
Facial recognition system
A facial recognition system is a technology potentially capable of matching a human face from a digital image or a video frame against a database of faces.
Source: Wikipedia
VGG Face data
VGGFace2 data
Released: 2015
Availability: Available
Purpose: Train facial recognition systems
Type: Database/dataset
Technique: Computer vision; Facial recognition; Machine learning
The VGG Face dataset is seen to suffer from several significant transparency limitations:
Lack of consent. The dataset was created by scraping images of 2,622 individuals from the internet without obtaining their consent or informing them about how their biometric data would be used.
Unclear data collection process. While some details are provided about using IMDB and Google Image Search to collect images, the full extent of the data collection and curation process is not entirely transparent.
Limited demographic information. Although some information on ethnicity, age, and kinship was collected from IMDB, it is unclear how comprehensive or accurate this demographic data is.
Potential biases. The dataset primarily consists of celebrities and public figures, which may not represent a diverse range of faces and could introduce biases in facial recognition technologies developed using this data. The dataset also lacks comprehensive documentation about potential biases, limitations, or ethical considerations that researchers and developers should be aware of when using the data.
Lack of clear usage guidelines. There appear to be no clear guidelines or restrictions on how the dataset can be used, potentially leading to misuse or unethical applications of the biometric data.
The VGG Face dataset has raised significant ethical concerns and potential harms by collecting and distributing biometric data of over 2,600 individuals without their consent, potentially enabling privacy and copyright violations, surveillance, and the development of biased facial recognition technologies.
Harvey, A., LaPlace, J. (2019). Exposing.ai
Page info
Type: Data
Published: January 2023
Last updated: October 2024