Iarpa Janus Benchmark-C (IJB-C) dataset
Iarpa Janus Benchmark-C (IJB-C) is a database of YouTube video still-frames and Flickr and Wikimedia photos used for face recognition benchmarking.
IJB-C was compiled in 2017 by US government subcontractor Noblis and contains 21,294 images of 3,531 people 'with diverse occupations' and of varying levels of fame.
The dataset averages six pictures and three videos per person, and is available on application to computer vision and facial recognition researchers.
According to Iarpa, 'the Janus program dramatically improved the performance of facial recognition software by increasing the speed and accuracy of identity matching.'
However, as discovered by activist Adam Harvey and highlighted by the Financial Times in September 2019, the dataset included a number of political activists, civil rights advocates, and journalists, including Ai Weiwei, Tracey Emin, Evgeny Morozov, John Maeda, and Ta-Nehisi Coates.
None of these individuals were made aware of their inclusion in the database by Noblis or Iarpa, and their images had been obtained without their explicit consent. Furthermore, the use of YouTube videos constituted a clear violation of the platform's terms of service.
Equally, as the FT pointed out, the use of the dataset by companies such as Chinese AI firm SenseTime and Japanese IT firm NEC, as well as by organisations such as China's National University of Defense Technology, raises concerns about its potential use for military and security purposes, including the mass surveillance of Uyghurs and other oppressed minorities.
Operator: SenseTime; NEC; National University of Defense Technology (NUDT)
Developer: Noblis; Iarpa
Sector: Govt - police; Govt - security; Govt - welfare
Purpose: Create facial recognition benchmark
Technology: Dataset; Facial recognition; Computer vision; Neural network; Machine learning
Issue: Privacy; Dual/multi-use; Surveillance
Transparency: Governance; Privacy