"People have a check list of what they want, but if you look at who they are talking to, they break their own rules.They might list 'money' as an important quality in a partner, but then we see them messaging all the artists and guitar players," he said.
The final methodology used to access the data is not fully explained in the article, and the question of whether the researchers respected the privacy intentions of 70,000 people who used Ok Cupid remains unanswered.
I contacted Kirkegaard with a set of questions to clarify the methods used to gather this dataset, since internet research ethics is my area of study.
We must reframe the inherent ethical dilemmas in these projects. And we must continue to develop policy guidance focused on the unique challenges of big data studies.
That is the only way we can ensure innovative research—like the kind Kirkegaard hopes to pursue—can take place while protecting the rights of people and the ethical integrity of research broadly.
While he replied, so far he has refused to answer my questions or engage in a meaningful discussion (he is currently at a conference in London).
Numerous posts interrogating the ethical dimensions of the research methodology have been removed from the open peer-review forum for the draft article, since they constitute, in Kirkegaard’s eyes, “non-scientific discussion.” (It should be noted that Kirkegaard is one of the authors of the article and the moderator of the forum intended to provide open peer-review of the research.) When contacted by Motherboard for comment, Kirkegaard was dismissive, stating he “would like to wait until the heat has declined a bit before doing any interviews.”
Concerns over consent, privacy and anonymity do not disappear simply because subjects participate in online social networks; rather, they become even more important. The OkCupid data release reminds us that the ethical, research, and regulatory communities must work together to find consensus and minimize harm.
We must address the conceptual muddles present in big data research.
The claim that data is “already public” is an all-too-familiar refrain used to gloss over thorny ethical concerns.
The most important, and often least understood, concern is that even if someone knowingly shares a single piece of information, big data analysis can publicize and amplify it in a way the person never intended or agreed to.
Data is already public.” This sentiment is repeated in the accompanying draft paper, “The OKCupid dataset: A very large public dataset of dating site users,” posted to the online peer-review forums of Some may object to the ethics of gathering and releasing this data.