OkCupid psychological analysis dataset sharing
Occurred: May 2016
OkCupid is a US-based dating site on which users answer questions so that a 'one-of-a-kind algorithm' can match them with 'what actually matters'.
In May 2016, Emil Kirkegaard and two other students and researchers at Aarhus University and the University of Aalborg in Denmark published the 'OkCupid dataset', ostensibly to help psychologists investigate the social psychology of dating.
The team scraped data from OkCupid between November 2014 and March 2015 and created a dataset containing 2,620 variables on 68,371 users, including their usernames, age, gender, location, religion, sexual turn-ons, and sexual orientation.
They also assessed the cognitive abilities of OkCupid users in a paper 'The OKCupid dataset: A very large public dataset of dating site users.'
The researchers stated in their paper that 'It is our hope that other researchers will use the dataset for their own purposes.'
It is unclear how many times the data was downloaded, but researchers, academics and privacy advocates expressed concerns that, whilst the scraping may not have been illegal, it was unethical given the volume and sensitivity of the data and the likelihood that the data could be de-anonymised.
The fracas also led to an investigation by the Danish data protection authority, Datatilsynet, on the basis that research involving sensitive personal data must be approved by it. No action was taken against Kirkegaard and his collaborators.