Stanford University Brainwash cafe facial recognition dataset
Brainwash is a dataset of 11,917 images of 91,146 'labelled' people created by then Stanford University researchers Russell Stewart, Mykhaylo Andriluka, and Andrew Ng.
Video footage of customers at San Francisco's Brainwash Cafe was recorded over three days in October and November 2014, and the dataset was released in 2015. The principal aim of the dataset was to help create facial recognition algorithms.
Brainwash has been cited by high-profile organisations across the world, including by researchers affiliated with China's National University of Defense Technology for two research projects on advancing object recognition capabilities.
Dataset
Operator: Beijing University of Technology; Delft University of Technology; Honeywell Technology Solutions; Huawei; IDIAP Research Institute; IIT Madras; Megvii; National University of Defense Technology, China; North University of China; Shenzhen University; Qualcomm; University of Electronic Science and Technology of China
Developer: Stanford University; Russell Stewart; Mykhaylo Andriluka; Andrew Ng
Country: USA; China
Sector: Research/academia
Purpose: Train facial recognition systems
Technology: Database/dataset; Computer vision; Facial recognition; Object recognition
Issue: Dual/multi-use; Privacy; Surveillance
Transparency: Governance; Privacy
Risks and harms
Stanford University's Brainwash cafe facial recognition dataset raised significant privacy and ethical concerns due to its collection of people's facial images without consent in a public space, potentially enabling surveillance and violating individual privacy rights.
Transparency and accountability
The Stanford University Brainwash cafe facial recognition dataset had several significant transparency limitations.
Lack of informed consent. Individuals captured in the dataset were not informed that their images were being collected or used for research purposes.
Insufficient documentation. There was limited public information about the data collection process, methodology, and intended uses of the dataset.
Unclear data retention and access policies. It was not transparent how long the data would be kept, who would have access to it, or how it might be shared or used in the future.
Absence of opt-out mechanisms. There was no clear way for individuals to determine if they were included in the dataset or to request removal of their data.
Limited information on data processing. Details about how the images were processed, annotated, or anonymized (if at all) were not readily available.
Unclear ethical review process. Information about whether the project underwent a comprehensive ethical review before data collection was not transparently communicated.
Lack of stakeholder engagement. There was no apparent effort to engage with the public or affected communities about the implications of the dataset.
Incidents and issues
Research, advocacy
Li Y., Dou Y., Liu X., Li T. (2016). Localized region context and object feature fusion for people head detection
Zhao X., Wang Y., Dou Y. (2017). A Replacement Algorithm of Non-Maximum Suppression Base on Graph Clustering
Investigations, assessments, audits
Harvey, A., LaPlace, J. (2019). Exposing.ai
Murgia M., Financial Times (2019). Who's using your face? The ugly truth about facial recognition
Page info
Type: Data
Published: May 2022
Last updated: June 2024