SMFRD dataset criticised for eroding privacy, enabling surveillance
Occurred: August 2021
A dataset which added face masks to images of people was criticised for potentially further eroding privacy and fueling mass surveillance.
In a study, Princeton University researchers revealed that computer vision datasets, particularly those containing images of people, present a range of ethical problems.
The study highlighted how derivative datasets can lead to unintended consequences, calling out SMFRD (the Simulated Masked Face Recognition Dataset) for potentially violating the privacy of people who wish to conceal their faces, fueling surveillance, and enabling governments to identify masked protestors.
SMFRD is a derivative of Labeled Faces in the Wild (LFW), an open-source dataset of facial images intended to provide researchers with a public benchmark for facial verification, but which came to be used in real-world applications despite a warning label on the dataset's website cautioning against such use.
Operator:
Developer: Wuhan University
Country: China
Sector: Health
Purpose: Train facial recognition systems
Technology: Database/dataset; Facial recognition; Computer vision
Issue: Privacy; Dual/multi-use; Surveillance
Transparency:
Peng K., Mathur A., Narayanan A. (2021). Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers
Page info
Type: Issue
Published: July 2024