People in Photo Albums dataset criticised for using sensitive personal images without consent
Occurred: September 2019
A dataset of facial photos intended to recognise people's identities in photo albums prompted controversy because its creators had not obtained the explicit consent of those pictured, and because the data was used beyond its original purpose.
An investigation by artist Adam Harvey found that the People in Photo Albums (PIPA) dataset included images of people in a range of personal and social settings, potentially infringing on their privacy, and that, despite attempts to anonymise the data, the personal nature of the photos could make individuals identifiable.
In addition, the uses of the data appear to have gone well beyond its stated purpose of processing personal photo albums. For example, Harvey discovered that PIPA was used by China's National University of Defense Technology and Tsinghua University, as well as by many commercial and industrial organisations.
It has also been pointed out that PIPA's creators failed to specify the CC licences under which the photographs were used, despite some CC licences restricting commercial use and other forms of re-use.
The controversy highlighted the need for researchers to obtain consent from people whose personal and biometric data they use, and for clearer ethical guidelines and consent processes in the development of datasets.
➕ January 2020. UC Berkeley stopped distributing the dataset. However, it remains available via the Max Planck Institute.