Simulated Masked Face Recognition Dataset (SMFRD)
SMFRD (or Simulated Masked Face Recognition Dataset) is a dataset of masked faces intended to enable facial recognition systems to identify the individuals behind the masks.
Released in March 2020 by researchers at Wuhan University in China, the set is a derivative of the Labeled Faces in the Wild (LFW) dataset, with facemasks superimposed. LFW was the first dataset to use facial images scraped from websites and applications.
Released at the height of the COVID-19 pandemic, SMFRD was seen as helpful in limiting its spread in China and is freely available to industry and academia.
Operator:
Developer: Wuhan University
Country: China
Sector: Health
Purpose: Train facial recognition systems
Technology: Database/dataset; Facial recognition; Computer vision
Issue: Privacy; Dual/multi-use; Surveillance
Transparency:
Risks and harms
The Simulated Masked Face Recognition Dataset has raised concerns about privacy violations and its potential misuse in surveillance systems, thereby potentially limiting human rights and civil freedoms.
Transparency and accountability
The Simulated Masked Face Recognition Dataset (SMFRD) is seen to suffer from multiple transparency limitations.
Lack of clear consent. It is unclear if and how consent was obtained from individuals whose images were used or simulated in the dataset.
Limited information on data generation. The exact methods used to simulate masked faces may not be fully disclosed, making it difficult to assess the dataset's representativeness and potential biases.
Inadequate documentation of intended use. The dataset's intended applications and potential misuses are not clearly outlined.
Incidents and issues
Research, advocacy
Peng K., Mathur A., Narayanan A. (2021). Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers
Page info
Type: Data
Published: February 2023
Last updated: June 2024