Not long after a scandal erupted over Facebook’s report on the experiment they conducted on users of the service, leading to an apology about how the study was communicated, the dating website OkCupid has waded into the debate with an admission that they regularly experiment on users and that they are not alone in doing so. Given the form taken by this pronouncement, it’s hard not to see it as a deliberate intervention intended to steer debate in an area of widespread anxiety that is only likely to grow with time. However, at the level of data science and web development, it’s hard not to recognise the logic of their argument, even if it obscures some ethical issues. Experimenting on users is “how websites work”, as the author of the blogpost puts it. He gave a convincing performance on UK radio yesterday, arguing that the OkCupid matching algorithm has been constructed through the activity of data scientists at the company, and that continued engagement with available data is integral to refining how the matching works day-to-day, along with the broader website of which it is such a key part. The Guardian offers a helpful summary of the news and the reaction that it has provoked:
Dating service OkCupid has cheerfully admitted to manipulating what it shows users, a month after Facebook faced a storm of protest when it revealed that it had conducted psychological experiments.
Christian Rudder, OkCupid’s co-founder and data scientist, posted three examples of experiments the firm had performed on to the site’s OkTrends blog, in an upbeat article entitled “We Experiment On Human Beings!”.
The blog, which used to chronicle the discoveries OkCupid made by observing its users’ behaviour, has been mothballed for three years, since OkCupid was purchased by dating behemoth Match.com in February 2011.
“OkCupid doesn’t really know what it’s doing,” writes Rudder in the most recent blogpost. “Neither does any other website. It’s not like people have been building these things for very long, or you can go look up a blueprint or something. Most ideas are bad. Even good ideas could be better. Experiments are how you sort all this out.”
Rudder refers specifically to Facebook’s troubles over its experimentation, when the firm tweaked the content of users’ news feeds in an effort to discover what their reaction was to a higher proportion of positive or negative posts. “Guess what, everybody,” he says, “if you use the internet, you’re the subject of hundreds of experiments at any given time, on every site. That’s how websites work.”
The outcry seems to relate to the particular form the experiments take. Read the original post for details about two of the experiments they’ve conducted. However, it’s the final experiment which has proved most contentious:
The ultimate question at OkCupid is, does this thing even work? By all our internal measures, the “match percentage” we calculate for users is very good at predicting relationships. It correlates with message success, conversation length, whether people actually exchange contact information, and so on. But in the back of our minds, there’s always been the possibility: maybe it works just because we tell people it does. Maybe people just like each other because they think they’re supposed to? Like how Jay-Z still sells albums?
To test this, we took pairs of bad matches (actual 30% match) and told them they were exceptionally good for each other (displaying a 90% match.)† Not surprisingly, the users sent more first messages when we said they were compatible. After all, that’s what the site teaches you to do.
† Once the experiment was concluded, the users were notified of the correct match percentage.
But we took the analysis one step deeper. We asked: does the displayed match percentage cause more than just that first message—does the mere suggestion cause people to actually like each other? As far as we can measure, yes, it does.
When we tell people they are a good match, they act as if they are. Even when they should be wrong for each other.
The four-message threshold is our internal measure for a real conversation. And though the data is noisier, this same “higher display means more success” pattern seems to hold when you look at contact information exchanges, too.
This got us worried—maybe our matching algorithm was just garbage and it’s only the power of suggestion that brings people together. So we tested things the other way, too: we told people who were actually good for each other, that they were bad, and watched what happened.
Here’s the whole scope of results (I’m using the odds of exchanging four messages number here):
As you can see, the ideal situation is the lower right: to both be told you’re a good match, and at the same time actually be one. OkCupid definitely works, but that’s not the whole story. And if you have to choose only one or the other, the mere myth of compatibility works just as well as the truth. Thus the career of someone like Doctor Oz, in a nutshell. And, of course, to some degree, mine.
Should this be the case? I find many aspects of this concerning, but my fear is that these media-hyped cases, which seem intuitively problematic to many, risk obscuring the broader picture, in which increasingly large parts of our lives are mediated through socio-technical systems whose workings are opaque to us at best. Is there a risk that a debate about the research ethics of these experiments conducted by corporate data scientists distracts us from the political implications of the social relations upon which their capacity to do this depends? Is there a risk that a debate of the utmost importance ends up being driven by intuition rather than reasoned argument? One thought these debates have left me with, however, is the need for a serious engagement with research ethics by data scientists. These issues are just too important to overlook.