Stanford University Brainwash cafe facial recognition dataset

Brainwash is a dataset of 11,917 images containing 91,146 'labelled' people, created by Stanford University researchers from footage captured at San Francisco's Brainwash Cafe. Its principal aim was to 'train and validate their algorithm’s effectiveness.'

The dataset was removed 'at the request of the depositor' from Stanford University's website in June 2019 following the publication of researcher Adam Harvey's Exposing.ai project and a Financial Times investigation into facial recognition data sharing.

Dataset 🤖

Dataset databank 🔢

Operator: Beijing University of Technology; Delft University of Technology; Honeywell Technology Solutions; Huawei; IDIAP Research Institute; IIT Madras; Megvii; National University of Defense Technology, China; North University of China; Shenzhen University; Qualcomm; University of Electronic Science and Technology of China
Developer: Stanford University; Russell Stewart; Mykhaylo Andriluka; Andrew Ng
Country: USA; China
Sector: Research/academia
Purpose: Train facial recognition systems
Technology: Dataset; Facial recognition; Computer vision
Issue: Privacy; Dual/multi-use
Transparency: Privacy

Data sharing 

The Brainwash dataset was published online and has been cited by high-profile organisations across the world, including researchers affiliated with China's National University of Defense Technology, who cited it for two research projects on advancing object recognition capabilities.

It 'also appears in a 2018 research paper affiliated with Megvii (Face++) ... who has provided surveillance technology to monitor Uighur Muslims in Xinjiang.'

Clips from the dataset remain available on YouTube.

Transparency, privacy 

Video footage was recorded over three days in October and November 2014 without the knowledge or consent of Brainwash Cafe customers, a matter the New York Times notes was not addressed in Stanford's research paper on the project.

The researchers behind Brainwash (Russell Stewart, Mykhaylo Andriluka, and Andrew Ng) refused to comment publicly on the nature or removal of the dataset.

Dataset documents 📚

Investigations, assessments, audits 🧐