MohammedAmin.com
Serious writing for
serious readers
Mohammed Amin's website

Review of “Dataclysm: What our online lives tell us about our offline selves” by Christian Rudder

When using the internet in private, people reveal their true thoughts, beliefs and desires. The author had access to large amounts of data from several dating sites.

Summary

7 January 2023

When answering surveys or opinion polls, people lie, or at the very least “shade the truth.” They do that even if the survey is anonymous. How many of us want to tick a box to say that we are racists?

We behave differently when using the internet anonymously. We behave differently when “swiping left” or “swiping right” on a dating site without thinking about what this behaviour says about us.

This book, based on hard data, is incredibly revealing about people's real behaviours and beliefs.

The author

When he wrote this book in 2014, Christian Rudder was president and co-founder of OkCupid, a large dating site. According to his Wikipedia page, he left in 2015.

When writing this book, he had full access to all of OkCupid’s data. He was also given anonymised data by Google, Facebook and other websites.

Overview of the book

The book comprises 255 pages + 63 pages of notes etc. It has the following contents:

Introduction

Part 1: What Brings Us Together

  1. Wooderson’s Law
  2. Death by a Thousand Mehs
  3. Writing on the Wall
  4. You Gotta Be the Glue
  5. There’s No Success like Failure

Part 2: What Pulls Us Apart

  1. The Confounding Factor
  2. The Beauty Myth in Apotheosis
  3. It’s What’s inside That Counts
  4. Days of Rage

Part 3: What Makes Us Who We Are

  1. Tall for an Asian
  2. Ever Fallen in Love?
  3. Know Your Place
  4. Our Brand Could Be Your Life
  5. Breadcrumbs

Coda
A Note on the Data
Notes
Acknowledgements
Index

I have included a few short extracts below to provide a taste of the book.

Introduction

The author led OkCupid’s analytics team from 2009. He writes:

“I have led OkCupid’s analytics team since 2009, and my job is to make sense of the data our users create. While my 3 founding partners have done almost all the hard work of actually building the site, I’ve spent years just playing with the numbers.

Some of what I work on helps us to run the business: for example, understanding how men and women view sex and beauty differently is essential for a dating site.

But a lot of my results aren’t directly useful – just interesting. There’s not much you can do with the fact that, statistically, the least black band on earth is Belle & Sebastian, or that the flash in a snapshot makes a person look seven years older, except to say huh, and maybe repeat it at a dinner party. That’s basically all we did with this stuff for a while; the insights we gleaned went no further than an occasional lame press release.

But eventually we were analysing enough information that larger trends became apparent, big patterns in the small ones, and, even better, I realised that I could use the data to examine taboos like race by direct inspection.

That is, instead of asking people survey questions or contriving small-scale experiments, which was how social science was often done in the past, I could go and look at what actually happens when, say, 100,000 white men and 100,000 black women interact in private. The data was sitting right there on our servers. It was an irresistible sociological opportunity.

I dug in, and as discoveries built up, like anyone with more ideas than audience, I started a blog to share them with the world. That blog then became this book, after one important improvement.

For Dataclysm, I’ve gone far beyond OkCupid. In fact, I probably put together a dataset of person-to-person interactions that’s deeper and more varied than anything held by any other private individual – spanning most, if not all, of the significant online data sources of our time.

In these pages I’ll use my data to speak not just to the habits of one site’s users but also to a set of universals.

As for the data’s authenticity, much of it is, in a sense, fact-checked because the internet is now such a part of everyday life.

Take the data from OkCupid. You give the site your city, your gender, your age, and who you’re looking for, and it helps you to find someone to meet for coffee or a beer. Your profile is supposed to be you, the true version. If you upload a better-looking person’s picture as your own, or pretend to be much younger than you really are, you will probably get more dates.

But imagine meeting those dates in person: they’re expecting what they saw online. If the real you isn’t close, the date is basically over the instant you show up. That is one example of the broad trend: as the online and off-line worlds merge, a built-in social pressure keeps many of the internet’s worst fabulist impulses in check.

The people using the services, dating sites, social sites, and news aggregators alike, are all fumbling their way through life, as people always have. Only now they do it on phones and laptops. Almost inadvertently, they’ve created a unique archive: databases around the world now hold years of yearning, opinion, and chaos. And because it is stored with crystalline precision it can be analysed not only in the fullness of time, but with a scope and flexibility unimaginable just a decade ago.”

Part 1 Chapter 1 — Wooderson’s Law

The author begins by revealing some shocking differences between women and men in the ages of the people they find most attractive.

“I’ll start with the opinions of women – all the trends below are true across my sexual datasets, but for specificity’s sake, I’ll use numbers from OkCupid. This table lists, for a woman, the age of men she finds most attractive. If I’ve arranged it unusually, you’ll see in a second why.”

A woman's age    The age of the men who look best to her
20               23
21               23
22               24
23               25
24               25
25               26
26               27
27               28
28               29
29               29
30               30
31               31
32               31
33               32
34               32
35               34
36               35
37               36
38               37
39               38
40               38
41               38
42               39
43               39
44               39
45               40
46               38
47               39
48               40
49               45
50               46

“Reading from the top, we see that 20 and 21-year-old women prefer 23-year-old guys; 22-year-old women like men who are 24, and so on down through the years to women at 50, who we see rate 46-year-olds the highest.

This isn’t survey data, this is data built from tens of millions of preferences expressed in the act of finding a date, and even from just following along the first few entries, the gist of the table is clear: a woman wants a guy to be roughly as old as she is. Pick an age in black under 40, and the number in grey is always very close. The broad trend comes through better” [with a graph]

Graph showing for a woman of each age the age of the men who look best to her

“The [yellow] diagonal line is the “age parity” line, where the male and female years would be equal. It is not a canonical math thing, just something I overlaid as a guide for your eye.

Often there is an intrinsic geometry to a situation – it was the first science for a reason – and we’ll take advantage wherever possible. This particular line brings out transitions, which coincide with big birthdays.

The first pivot point is at 30, where the trend of the [blue line] – the ages of the men – crosses below the [yellow age parity] line, never to cross back. That’s the data’s way of saying that until 30, a woman prefers slightly older guys; afterward, she likes them slightly younger.

Then at 40, the progression breaks free of the diagonal, going practically straight [horizontally] for nine years. That is to say, a woman’s tastes appear to hit a wall. Or a man’s looks fall off a cliff, however you want to think about it. If we want to pick the point where a man’s sexual appeal has reached its limits, it’s there: 40.

The two perspectives (of the woman doing the rating and of the man being rated) are two halves of a whole.

As a woman gets older, her standards evolve, and from the man’s side, the rough 1:1 movement of the [blue line] versus the [yellow line] implies that as he matures, the expectations of his female peers mature as well – practically year-for-year.

He gets older, and their viewpoint accommodates him. The wrinkles, the nose hair, the renewed commitment to cargo shorts – these are all somehow satisfactory, or at least offset by other virtues.

Compare this to the freefall scores going the other way, from men to women.”

A man's age    The age of the women who look best to him
20             20
21             20
22             21
23             21
24             21
25             21
26             22
27             21
28             20
29             20
30             20
31             20
32             20
33             20
34             20
35             20
36             20
37             22
38             20
39             20
40             21
41             21
42             20
43             23
44             21
45             24
46             20
47             20
48             23
49             20
50             22

Graph showing for a man of each age the age of the women who look best to him

“This graph – and it’s practically not even a graph, just a [horizontal blue line with a couple of low peaks] – makes a statement as stark as its own [blank] space.

A woman’s at her best when she’s in her very early twenties. Period. And really my [graph] doesn’t show that strongly enough. The four highest-rated female ages are 20, 21, 22, and 23 for every group of guys but one.

Again, the geometry speaks: the male pattern runs much deeper than just a preference for 20-year-olds. And after he hits 30, the latter half of our age range (that is, women over 35) might as well not exist.

Younger is better, and the youngest is best of all, and if “over the hill” means the beginning of a person’s decline, a straight woman is over the hill as soon as she’s old enough to drink.

Of course, another way to put this focus on youth is that males’ expectations never grow up. A 50-year-old man’s idea of what’s hot is roughly the same as a college kid’s, at least with age as the variable under consideration – if anything, men in their twenties are more willing to date older women.

In a mathematical sense, a man’s age and his sexual aims are independent variables: the former changes while the latter never does.

I call this Wooderson’s law, in honour of its most heinous proponent, Matthew McConaughey’s character from 'Dazed and Confused.'”
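One rough way to check the contrast described in this chapter is to fit a straight line to each of the two tables quoted above and compare the slopes. The sketch below is an illustration, not the author's method: it uses only the 31 rounded values per table reproduced in this review, and the helper name `least_squares_slope` is hypothetical. A slope near 1 would mean the preferred age rises year-for-year with the rater's own age; a slope near 0 would mean it barely moves.

```python
# Illustrative only: ordinary least-squares slope of "preferred age" against
# "own age", using the rounded table values quoted in this review.

AGES = list(range(20, 51))  # rater ages 20..50, as in both tables

# Age of the men who look best to a woman of each age (first table).
WOMEN_PREFER = [23, 23, 24, 25, 25, 26, 27, 28, 29, 29, 30, 31, 31, 32, 32,
                34, 35, 36, 37, 38, 38, 38, 39, 39, 39, 40, 38, 39, 40, 45, 46]

# Age of the women who look best to a man of each age (second table).
MEN_PREFER = [20, 20, 21, 21, 21, 21, 22, 21, 20, 20, 20, 20, 20, 20, 20,
              20, 20, 22, 20, 20, 21, 21, 20, 23, 21, 24, 20, 20, 23, 20, 22]


def least_squares_slope(xs, ys):
    """Slope of the best-fit line y = a + b*x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


women_slope = least_squares_slope(AGES, WOMEN_PREFER)
men_slope = least_squares_slope(AGES, MEN_PREFER)

print(f"Women: preferred age rises about {women_slope:.2f} years per year of her age")
print(f"Men:   preferred age rises about {men_slope:.2f} years per year of his age")
```

On these values the women's slope comes out at roughly 0.7 and the men's at well under 0.1, which is consistent with the contrast the author draws. The exact figures depend on the rounding in the published tables.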

Part 2 Chapter 1 — The Confounding Factor

This chapter shows starkly just how much more racist the USA is than the UK.

“If you stand on the south-west corner of Fifty-Eighth and Fifth with a clipboard and do a little people-watching, you can very quickly conclude that most New Yorkers are beautiful, thin, and above all, rich. Every thread, every grommet, every crease shines with money.

Of course, many New Yorkers are rich, but that’s not the whole story here. You’re standing outside Bergdorf Goodman, and that’s a confounding factor.

That is a technical term for something you haven’t accounted for in your analysis but that nonetheless affects its results.

Making sure you’re not perched in some bitwise version of the Upper East Side is one of the most time and thought-consuming parts of working with digital data. When you have seemingly every variable and every possibility available for analysis and speculation, your research is free to travel wherever your curiosity leads. But true to the cliché, that freedom requires eternal vigilance.

And here’s where I have an admission to make.

So far in these pages, wherever you’ve seen the data of a person-to-person opinion, in the votes, in the date results from Crazy Blind Date, the charts, the tables – in every ratio, in every total – whenever one user was judging another, both people involved were white.

I had to make it that way, because when you’re looking at how two American strangers behave in a romantic context, race is the ultimate confounding factor. And to make sure whatever I wanted to say about attraction or sex spoke to those ideas alone, I needed to cut it from the discussion.

On OkCupid, one of the easiest ways to compare a black person and a white person (or any two people of any race) is to look at their “match percentage.”

That’s the site’s term for compatibility. It asks users a bunch of questions, they give answers, and an algorithm predicts how well any two of them would get along over, say, a beer or dinner. Unlike other features on OkCupid, there is no visual component to match percentage. The number between two people only reflects what you might call their inner selves – everything about what they believe, need, and want, even what they think is funny, but nothing about what they look like.

Judging by just this compatibility measure, the four largest racial groups on OkCupid – Asian, black, Latino, and white – all get along about the same. In fact, race has less effect on match percentage than religion, politics, or education.

Among the details that users believe are important, the closest comparison to race is Zodiac sign, which has no effect at all. To a computer not acculturated to the categories, “Asian” and “black” and “white” could just as easily be “Aries” and “Virgo” and “Capricorn.”

But this racial neutrality is only in theory: things change once the user’s own opinions, and not just the colour-blind workings of an algorithm, come into play.

Given the full profile, with the photo dominating the page, this is how OkCupid’s users rate each other by race.”

Average ratings given from men to women on OkCupid

                    Her race
His race    Asian    black    Latina    white
Asian       3.16     1.97     2.74      2.85
black       3.4      3.31     3.43      3.23
Latino      3.13     2.24     3.37      3.19
white       2.91     2.04     2.82      2.98

“I’ve given the raw data above, unadorned, because by now you’re at least a little familiar with OkCupid’s 1 to 5-star system. But to make the trends easier to see, I am going to take that same matrix and “normalise” each row.

In the table below, each entry is the percentage difference (+/-) from the average (the “normal”) in the row. It’s the same information, just phrased a bit differently.

Think of the normalised number as the member’s relative preference for women. For example, as you can see, Asian men think Asian women are 18% better-looking than the average, while black men think they’re just 2% better. And so on:

Normalised ratings from men to women on OkCupid

                    Her race
His race    Asian    black    Latina    white
Asian       +18%     -27%     +2%       +7%
black       +2%      -1%      +3%       -4%
Latino      +5%      -25%     +13%      +7%
white       +8%      -24%     +5%       +11%
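The row normalisation described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the author's actual calculation: it takes the rounded averages from the raw table, uses a simple unweighted mean of the four entries in each row as the “normal”, and the names `RAW_RATINGS` and `normalise_row` are hypothetical.

```python
# Illustrative only: express each rating as a percentage difference from
# the (unweighted) mean of its row, using the rounded averages quoted above.

HER_RACE = ["Asian", "black", "Latina", "white"]

# Rows are keyed by his race; columns follow HER_RACE.
RAW_RATINGS = {
    "Asian":  [3.16, 1.97, 2.74, 2.85],
    "black":  [3.40, 3.31, 3.43, 3.23],
    "Latino": [3.13, 2.24, 3.37, 3.19],
    "white":  [2.91, 2.04, 2.82, 2.98],
}


def normalise_row(row):
    """Percentage difference of each entry from the row mean."""
    row_mean = sum(row) / len(row)
    return [(value / row_mean - 1) * 100 for value in row]


NORMALISED = {his_race: normalise_row(row) for his_race, row in RAW_RATINGS.items()}

for his_race, row in NORMALISED.items():
    cells = "  ".join(f"{her}: {pct:+.0f}%" for her, pct in zip(HER_RACE, row))
    print(f"{his_race:<7} {cells}")
```

Recomputing from the rounded inputs lands within one percentage point of every published entry. The small differences that remain presumably come from the author working with unrounded data.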

The two essential patterns of male-to-female attraction are plain: men tend to like women of their own race. Far more than that, though, they don’t like black women. Message data is highly correlated with these ratings, so they follow the pattern as well.

In the most superficial way, OkCupid’s members reflect the general composition of internet users, with of course the caveat that (almost) everyone on the site is single.

The site’s users are younger than the national average (OkCupid’s median age is 29), and they tend to be less religious. The racial composition is about what you’d expect.

...

Going one demographic level deeper, OkCupid users are, if anything, more urban, more educated, and more progressive than the nation at large. The site’s biggest markets by far are places like New York, San Francisco, Los Angeles, Boston, and Seattle. 85% of the users have gone to college. Self-described liberals outnumber self-described conservatives more than two to one. There is a broad, site-wide ethos of open-mindedness. And an unintentionally hilarious 84% of users answer this match question:

'Would you consider dating someone who has vocalised a strong negative bias toward a certain race of people?'

in the absolute negative (choosing “No” over “Yes” and “It Depends”). In light of the previous data, that means 84% of people on OkCupid would not consider dating someone on OkCupid.

Essentially anything that, in theory, would make a group of people “less racist,” that’s what OkCupid users are.

I point this out to people, who, like me, lead nice lives in large, diverse cities; who think of their opinions and tastes as nothing if not enlightened; who unwind at night with a glass of wine and a Facebook dose or two of progressive righteousness.

When I show here that black women, and later black men, get short shrift, and that adding whiteness to a user’s identity makes him or her more attractive, I'm not describing some Ozark [a range of mountains in some conservative-leaning rural states] fever dream. I'm describing our world, mine and yours. If you’re reading a popular science book about Big Data and all its portents, rest assured the data in it is you.

But look one more time at the match question above, which was written by one of OkCupid’s users and has been answered close to two million times: “vocalised” is an odd word.

Get rid of it, and it still more or less reads “Would you date a racist?” Which I once assumed was the question’s real intent.

The writer, however, understood the subtleties of the dataset before I did. On a dating site you can act on impulses that you might otherwise keep quiet. On some level, the users come to judge and be judged by others, and each person joins the site free of the context of their everyday life. The site doesn’t connect you to your family. Nothing gets posted to your friends’ timelines.

The game is: it shows you people, and you like them or you don’t; you talk to them or you don’t. There’s nothing else to it.

In a digital world that’s otherwise compulsively networked, there is an old-school solitude to online dating. Your experience is just you and the people you choose to be with; and what you do is secret. Often the very fact that you have an account – let alone what you do with it – is unknown to your friends. So people can act on attitudes and desires relatively free from social pressure.”

One of the striking points from the book is that American racial preferences are much stronger than elsewhere in the world.

“In fact, OkCupid’s patterns change in places outside the United States.

In the UK, the site’s black members get 98.9% of the messages white members do. In Japan, 97.8%. In Canada, 90%.

Many of the black users in the former two countries, especially Japan, are Americans abroad.”

Assessment of the book

I found the book fascinating.

This kind of data allows you to learn things about people which surveys just cannot accomplish. When people are clicking on a dating site, or carrying out searches on Google, they are revealing their true preferences and interests.

They will act in ways that they never would if they were responding to a survey face-to-face, or even an online survey where they are aware that another human will be looking at the survey results and in some way “judging them.”

For example, as shown above, when picking women to contact, men overwhelmingly select young women. This is very different from what they say when filling out preference forms on OkCupid.

While I quoted only a few short extracts, it is a treasure trove of information. For example, there is detailed data on searches for the word “nigger” on Google during the 2008 presidential election when Barack Obama was dominating the airwaves.

I believe everyone will benefit from reading it and will find it as illuminating as I did.

 
