People in Photo Albums (PIPA) dataset

Released: 2015


People in Photo Albums (PIPA) is a dataset of facial photographs intended for training systems to recognise people's identities in photo albums in an unconstrained setting.

Created by Facebook and UC Berkeley and published in 2015, the dataset comprises 60,000 facial images of approximately 2,000 people, of which 32,518 photographs were downloaded from Flickr.

Most of the photos are semi-public images of children, family dinners, weddings, and other personal events.


The PIPA research paper and its proposed methodology have proved popular and have been cited many times.

However, as Adam Harvey showed in his project, the uses of the data appear to have gone well beyond its stated purpose of processing personal photo albums.

For example, PIPA has been used by China's National University of Defense Technology and Tsinghua University, as well as by many commercial and industrial organisations.

Harvey also highlighted the personal nature of the PIPA dataset, alluding to the privacy implications for those whose images were used.

It has also been pointed out that PIPA's creators fail to mention the type of Creative Commons (CC) licence under which the photographs were used, despite some CC licences not permitting any type of re-use.

In January 2020, UC Berkeley stopped distributing the dataset, though it remains available via the Max Planck Institute for Informatics.

Operator: ETH Zurich; Max Planck Institute for Informatics; Toyota Motor Europe; SenseTime; National University of Singapore; National University of Defense Technology, China; Meta/Facebook
Developer: UC Berkeley; Meta/Facebook
Country: Germany; USA
Sector: Research/academia; Technology; Media/entertainment/sports/arts
Purpose: Train facial recognition systems
Technology: Dataset; Facial analysis; Facial recognition; Computer vision
Issue: Copyright; Privacy; Dual/multi-use
Transparency: Governance; Legal

Page info
Type: Data
Published: February 2023