OkCupid psychological analysis dataset sharing

Occurred: May 2016

OkCupid is a US-based dating site in which users answer questions so that a 'one-of-a-kind algorithm' can match them with 'what actually matters'.

In May 2016, Emil Kirkegaard and two fellow students and researchers from Aarhus University and Aalborg University in Denmark published the 'OkCupid dataset', ostensibly to help psychologists investigate the social psychology of dating.

The team scraped data from OkCupid between November 2014 and March 2015 and created a dataset containing 2,620 variables on 68,371 users, including their usernames, age, gender, location, religion, sexual turn-ons, and sexual orientation.

They also assessed the cognitive abilities of OkCupid users in a paper 'The OKCupid dataset: A very large public dataset of dating site users.'  

Reaction

The researchers stated in their paper that 'It is our hope that other researchers will use the dataset for their own purposes.' 

It is unclear how many times the data was downloaded, but researchers, academics and privacy advocates expressed concerns that, whilst the scraping may not have been illegal, it was unethical given the volume and sensitivity of the data and the likelihood that users could be re-identified.

OkCupid filed a DMCA copyright claim, prompting the Open Science Framework, the website on which the paper and data were published, to remove the data, an act akin to censorship according to Kirkegaard.

The controversy also led to an investigation by the Danish data protection authority, Datatilsynet, on the basis that research involving sensitive personal data must be approved by it. No action was taken against Kirkegaard and his collaborators.

Operator: Match Group/OkCupid; Emil Kirkegaard; Julius Bjerrekær; Oliver Nordbjerg
Developer: Match Group/OkCupid

Country: Denmark

Sector: Research/academia; Media/entertainment/sports/arts

Purpose: Assess dating psychology

Technology: Dataset
Issue: Privacy; Dual/multi-use; Ethics

Transparency: 

Page info
Type: Incident
Published: January 2023