Archive for September 8, 2010

OKCupid, Religion and Readability

OKCupid has been doing some interesting things recently with data mined from its users, which is fascinating on the one hand and extremely creepy on the other.

It all boils down to some pretty interesting data points (what do white/black/Asian/Hispanic people like or have in common, what is the focus of men and women of different races, etc.). The one thing I was a little shocked by was this graphic at the end.

I was a little surprised that the Protestant reading level was so low. And when I am surprised by data, I try to see if I can replicate it. (This strikes me as an eminently scientific thing to do and yet I am bemused when people think I’m “attacking” their data. Whatever.)

Note: I should mention at the outset that it does kind of irk me that OKCupid displays this data and then more or less assumes that it holds for all people across the board. It is obvious to anyone who devotes more than 5 seconds to thinking about it that the data really only holds for OKCupid users, who (I’m going out on a limb here) are probably disproportionately young, tech-savvy and single.

Moving along. To determine reading levels, they ran the Coleman-Liau Index on the profiles, so I went and typed up two sample religious profile summaries, one Christian and one atheist. They’re only a couple of sentences long, which I figure is fine since OKCupid profile summaries aren’t exactly known for their complex narrative arcs.

Here are the profiles that I typed up, attempting to mimic what I thought would be a fair religious summary at a similar reading level.

Atheist: I am an atheist. I believe that there is no God and that most people only believe religion because they are taught to do so by society and possibly also their parents.

Christian: I am a Christian. I believe that Jesus died on the cross for my sins and that he was raised again on the third day. I think that the Bible teaches us the truth and that God loves us very much.

I ran them through this Coleman-Liau tester to see what popped up.

Christian: 5.47 grade level

Atheist: 10.15 grade level

I laughed.

Really? Those little blurbs are so radically different that the Christian one is 4.68 grades stupider based on nothing more than a readability analysis? Sounds like BS to me.
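For reference, the published Coleman-Liau formula is 0.0588 x L - 0.296 x S - 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words. Below is a minimal Python sketch of it, run on the two blurbs above. The function name and the counting rules (whitespace-separated words, alphabetic characters as letters, runs of ".", "!" or "?" as sentence ends) are assumptions for illustration, not necessarily what the online tester or OKCupid used, so the exact scores should not be expected to match the numbers above.

```python
import re


def coleman_liau(text: str) -> float:
    """Coleman-Liau Index: 0.0588 * L - 0.296 * S - 15.8."""
    words = text.split()
    if not words:
        raise ValueError("text contains no words")
    # Assumption: every alphabetic character counts as a letter.
    letters = sum(ch.isalpha() for ch in text)
    # Assumption: each run of ., ! or ? ends one sentence; text with no
    # terminator at all is treated as a single sentence.
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    letters_per_100_words = letters / len(words) * 100
    sentences_per_100_words = sentences / len(words) * 100
    return 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8


atheist = (
    "I am an atheist. I believe that there is no God and that most people "
    "only believe religion because they are taught to do so by society and "
    "possibly also their parents."
)
christian = (
    "I am a Christian. I believe that Jesus died on the cross for my sins "
    "and that he was raised again on the third day. I think that the Bible "
    "teaches us the truth and that God loves us very much."
)

for label, profile in (("Atheist", atheist), ("Christian", christian)):
    avg_word_length = sum(ch.isalpha() for ch in profile) / len(profile.split())
    print(f"{label}: {avg_word_length:.2f} letters per word, "
          f"index {coleman_liau(profile):.2f}")
```

The formula sees only word length and sentence length, nothing about content. Under these counting rules the atheist blurb still comes out higher, because its words are longer on average ("religion", "possibly", "society") while the Christian blurb leans on short ones ("died", "sins", "day").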

Let’s try adding some evolution in there:

Atheist: Same as before + “When it comes to the world around us, evolution is the most likely explanation for everything.”

Score: 12.23

Christian: Same as before + “When it comes to the world around us, I think there are probably gaps in evolutionary theory and that evolution can’t explain everything.”

Score: 8.90

Well, that closes the gap by 1.35 points, but we’re still looking at a 3.33 grade gap between two positions that are transparently written with an identical textual style.

I progressively added more and more to the Christian profile to counteract the low score I got from starting with the basics of Christian belief.

Finally I ended up with:

I am a Christian. I believe that Jesus died on the cross for my sins and that he was raised again on the third day. I think that the Bible is true and that God loves us very much. I think there are probably gaps in evolutionary theory and that evolution can’t explain everything. Additionally, the philosophical underpinnings for views that argue against Christianity frequently neglect to apply the same standard of ideological rigor to their own faith based assumptions. Consequently, they hold Christianity to a double standard assuming that their position is the default one and that there is no need to defend it.

Score: 12.33

This “Christian profile view” scores about as well as an absurdly simple statement of atheism with a supporting line about evolutionary theory. Basically, the algorithm they used rates “6th grade atheism” at the same level of textual complexity as “basic Christian beliefs + a philosophy degree”. (I flatter myself somewhat, but the final two sentences are clearly a college-level writing style.)

Here is what I’m not saying: I don’t think there is any conspiracy behind any of this. No one designed the algorithm so that Christians would look stupid.


It seems likely that a simple statement of Christian belief like the one entered above anchors the score at the low end. The index only measures word length and sentence length, and the traditional vocabulary is full of short words. The more someone communicates their Christian belief in the language that has been familiar in churches for centuries, the less likely they are to score well, regardless of the remaining textual analysis of their profile. This anchoring effect might get lost if the profile were a three-page essay. But profiles tend toward being short, simple statements meant to clearly indicate basic beliefs, inclinations, or personality traits.

Note: I promise I’ll pull back on religious topics. I just get irritated when people pull out “evidence” of religious people being inferior in some way, shape, or form. It usually strikes me as hackery that the creators or purveyors of the data set are perfectly happy to accept, so they neglect to do any sort of skeptical follow-up.

You can actually see the same thing with a lot of war-based data. Half the time, the people pointing to the data didn’t even get the data right and a good chunk of the remaining examples strip context out of the data. Bugs the hell out of me.

A Visual Essay On The Koran-Burning Church

Religious Outliers Nonsense (or “Atheists Are Richer Than Religious People If You Take All Poor Atheists Out Of Your Sample”)

Charles Blow’s most recent New York Times op-ed is something of a boon for visualization enthusiasts. He replaces almost his entire article with a visualization. This shows that he recognizes the power of visual communication to make and reinforce a point in a way that is self-evident and can stick with the reader better than words.

Unfortunately, he has decided to use data that misleads his audience to such an extent that I can only conclude that he is unconcerned with the truth insofar as it undermines his desired objective.

Blow’s main point is that the US is an outlier in the world because we’re religious but also rich while “religiosity was highly correlated to poverty”.

I’ve reproduced the chart in question below.


Now, keep in mind that this is not charting religion as it is listed in the CIA World Factbook, but according to the specific question: “Is religion an important part of your daily life?” That will be important in a little bit.

This chart seems to prove his point. Until you realize what isn’t on the map.

Here is a list of the countries that didn’t make their way onto the map because Gallup didn’t poll them:

China – 1.33 billion people, heavily non-religious, poor

North Korea – 22 million people, heavily non-religious, unbelievably poor

Cuba – 11 million people, presumed non-religious, poor

Taiwan – 23 million people, 93% Buddhist*, rich (comparable to Japan)

Problem number one – Charles Blow has a duty to inform his audience of these omissions. The countries without data represent nearly 25% of the world population and skew heavily toward non-religious. They are too large and too important to the data set and visual reference to simply ignore. Yet Mr. Blow doesn’t seem interested in mentioning them.
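A rough check on the scale of the omission, using only the four populations listed above. The world population figure (about 6.9 billion for 2010) is an assumption that does not come from the original post, and any other unpolled countries are not counted, so treat the result as a lower bound on the missing share.

```python
# Populations as listed above (number of people).
unpolled = {
    "China": 1_330_000_000,
    "North Korea": 22_000_000,
    "Cuba": 11_000_000,
    "Taiwan": 23_000_000,
}

# Assumption (not from the post): roughly 6.9 billion people worldwide in 2010.
WORLD_POPULATION_2010 = 6_900_000_000

missing = sum(unpolled.values())
share = missing / WORLD_POPULATION_2010

print(f"Unpolled population: {missing:,}")
print(f"Share of world population: {share:.1%}")
```

On that assumption, these four countries alone account for about a fifth of the world, with China making up nearly all of it.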

Problem number two – Mr. Blow heavily implies that there is a causal relationship between religiosity and wealth. But (as we all know) correlation doesn’t imply causation. Western European countries (and countries filled with people from Western Europe) are richer, as are developed Asian countries. Eastern European and South American countries are less rich. Middle Eastern and African countries tend to be much poorer. There’s a correlation in geo-political histories here that is stronger than religion.

Of course, Mr. Blow could always go to rural India and inform people there that their poverty is related to their devotion to Hinduism and has nothing to do with British imperialism. Or perhaps to the Deep South, where he can proclaim to the 90%+ Christian black population that their economic woes are related to their religious tendencies.

Problem number three – The final problem is the worst one because it involves an outright lie:

Singapore is more religious and richer than the United States. And Mr. Blow didn’t map it. At all.

It’s possible that Mr. Blow is actually so numerically illiterate that he didn’t know he was supposed to tell people about key missing data points. But taking out data that doesn’t align with his point is disgusting manipulation. The end result of his deception (conscious or otherwise) is “If you take out all the poor atheists and take out all the rich religious people, then this pattern emerges…”

Mr. Blow should put Singapore back into the data set and add a correction to his article that acknowledges the enormous gaping holes in his data. And he should probably never be allowed to touch charting software again.

* The CIA Factbook has Taiwan listed at 93% Buddhist, but I’m not sure how they would answer the specific question that Gallup asked. I’ve heard some atheists claim Buddhism as an “atheistic religion” (no personal god) so it could be that the citizens of Taiwan wouldn’t say that religion plays a big role. I simply don’t know.