DukeMTMC facial recognition dataset
DukeMTMC is a dataset of video footage taken on Duke University's campus in 2014 with the aim of accelerating advances in 'multi-target, multi-camera tracking' using person re-identification and low-resolution facial recognition.
Published (pdf) in 2016 by Duke University academics and researchers, the dataset consists of over 2 million frames of 2,000 students captured using 8 cameras expressly set up to capture students 'during periods between lectures, when pedestrian traffic is heavy'.
The project was shut down after the publication of researcher Adam Harvey's Exposing.ai project and a Financial Times investigation into facial recognition data sharing.
Operator: CloudWalk; Hikvision; Megvii; SenseNets; SeeQuestor; SenseTime; Beihang University; National University of Defense Technology, China; NEC; PLA Army Engineering University
Developer: Ergys Ristani; Francesco Solera; Roger Zou; Rita Cucchiara; Carlo Tomasi; Duke University
Country: USA
Sector: Technology; Research/academia
Purpose: Train facial recognition systems
Technology: Dataset; Facial recognition; Computer vision
Issue: Ethics/values; Dual/multi-use; Privacy
Transparency: Governance; Privacy
Risks and harms 🛑
The DukeMTMC facial recognition dataset drew criticism for its unethical collection and open availability of personal data, and for its subsequent use in academic, commercial, and military applications.
Transparency and accountability 🙈
The DukeMTMC dataset suffers from several significant transparency limitations.
Lack of informed consent. The dataset was created from surveillance camera footage of students and others on a university campus, without obtaining informed consent from the individuals captured.
Limited disclosure of collection methods. Little transparency was provided about how the data was gathered and processed.
Ethical review gaps. It is unclear if the dataset creation went through proper ethical review processes.
Potential for misuse. No information was provided about permissible uses of the dataset, including whether it could be used for surveillance purposes.
Unclear data retention policies. It was not well defined how long the data would be kept or how it might be used in the future.
Incidents and issues 🔥
Research, advocacy 🧮
Harvey, A., LaPlace, J. (2019). Exposing.ai
Peng K., Mathur A., Narayanan A. (2021). Mitigating Dataset Harms Requires Stewardship: Lessons from 1000 Papers (pdf)
Investigations, assessments, audits 🧐
Murgia M., Financial Times (2019). Who’s using your face? The ugly truth about facial recognition
Page info
Type: Data
Published: May 2022
Last updated: June 2024