Standard Deviations.

While working on my own forthcoming book The Inside Game (due out April 21st from HarperCollins; pre-order now!), I stumbled across a chapter from Prof. Gary Smith’s book Standard Deviations: Flawed Assumptions, Tortured Data, and Other Ways to Lie with Statistics, a really wonderful book on how people, well-meaning or malicious, use and misuse stats to make their arguments. It’s a very clear and straightforward book that assumes no prior statistical background on the part of the reader, and keeps things moving with entertaining examples and good summaries of Smith’s points on the many ways you can twist numbers to say what you want them to say.

Much of Smith’s ire within the book is aimed at outright charlatans of all stripes who know full well that they’re misleading people. The very first example in Standard Deviations describes the media frenzy over Paul the Octopus, a mollusk that supposedly kept picking the winners of World Cup games in 2010. It was, to use the technical term for it, the dumbest fucking thing imaginable. Of course this eight-armed cephalopod wasn’t actually predicting anything; octopuses are clever escape artists, but Paul was just picking symbols he recognized, and the media outlets that covered those ‘predictions’ were more worthy of the “fake news” tag now applied to any media the President doesn’t like. Smith uses Paul to make larger points about selection bias and survivorship bias, about how some stories become news and some don’t, how the publish-or-perish mentality at American universities virtually guarantees that some junk studies (found via p-hacking or other dubious methods) will slip through the research cracks, and so on. This is more than just an academic problem, however: One bad study that can’t survive other researchers’ attempts to replicate its results can still generate significant media attention and even steer changes in policy.
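Smith’s selection-and-survivorship point lends itself to a quick back-of-the-envelope simulation (mine, not the book’s): give enough random guessers a run of two-outcome matches and a handful will compile streaks that look prescient, and those few are the ones who make the news. A minimal sketch in Python, with the number of guessers and matches chosen purely for illustration:

```python
import random

def simulate_prescient_streaks(n_predictors=10_000, n_matches=8, seed=42):
    """Count how many purely random 'predictors' call every match correctly.

    Each predictor guesses the winner of n_matches two-outcome games by
    coin flip; a perfect record happens by chance with probability 0.5**n_matches.
    """
    rng = random.Random(seed)
    perfect = 0
    for _ in range(n_predictors):
        # A predictor is 'perfect' only if every independent 50/50 guess lands.
        if all(rng.random() < 0.5 for _ in range(n_matches)):
            perfect += 1
    return perfect

if __name__ == "__main__":
    hits = simulate_prescient_streaks()
    print(f"{hits} of 10,000 random guessers went 8-for-8")
    # Expected value: 10,000 * 0.5**8 ≈ 39 -- and only those few become a story.
```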

Smith gives copious examples of this sequence of events – bad or corrupt study that leads to breathless news coverage and real-life consequences. He cites Andrew Wakefield, the disgraced former doctor whose single fraudulent paper claimed to find a link between the MMR vaccine and autism; the media ran with it, many parents declined to give their kids the MMR vaccine, and even now, twenty years and numerous debunking studies later, we have measles outbreaks and a reversal of the elimination the United States had achieved in 2000. Smith chalks some of this up to the publish-or-perish mentality of American universities, also mentioning Diederik Stapel, a Dutch ex-professor who has now had 58 papers retracted due to his own scientific misconduct. But these egregious examples are just the tip of a bigger iceberg of statistical malfeasance that’s less nefarious but just as harmful: conflating statistical significance with real-world significance, journals’ preference for publishing positive results over negative ones (the file drawer problem), “using data to discover a theory” rather than beginning with a theory and using data to test it, discarding outliers (or, worse, non-outliers), and more.

Standard Deviations bounces around a lot of areas of statistical shenanigans, covering some familiar ground (the Monty Hall problem and the Boy or Girl problem*) and some less familiar territory as well. Smith goes after the misuse of graphs in popular publications, particularly the issue of Y-axis manipulation (where the Y axis starts well above zero, making small changes across the X axis look much larger than they are), and the “Texas sharpshooter” problem, where people see patterns in random clusters and argue backwards into meaning. He also takes on the hot hand fallacy, which I touched on in Smart Baseball and will discuss again from a different angle in The Inside Game. He explains why the claims that people nearing death will themselves to live through birthdays or holidays don’t hold up under scrutiny. (One of my favorite anecdotes is the study of deaths before and after Passover that identified subjects because their names sounded “probably Jewish.”) Smith’s reach extends beyond academia; one chapter looks at how Long-Term Capital Management failed, including how the people leading the firm deluded themselves into thinking they had figured out a way to beat the market, and then conned supposedly smart investors into playing along.

* Smith also explains why Leonard Mlodinow’s explanation in The Drunkard’s Walk (which I read right after this book) of a related question, in which you know one girl’s name is Florida, is incorrect; thank goodness, because for the life of me I couldn’t believe what Mlodinow wrote.
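For readers who haven’t run into the Boy or Girl problem, a quick Monte Carlo check – my own sketch, not anything from either book – shows why learning only that a two-child family has at least one girl puts the chance of two girls at about 1/3, not 1/2. The contested “Florida” variant depends on assumptions about how the family comes to your attention, so the sketch sticks to the textbook version:

```python
import random

def boy_or_girl_simulation(trials=1_000_000, seed=1):
    """Estimate P(both girls | at least one girl) for two-child families.

    Assumes each child is independently a girl with probability 1/2; this is
    the textbook version of the puzzle, not the 'Florida' variant, whose
    answer hinges on how the family is selected.
    """
    rng = random.Random(seed)
    at_least_one_girl = 0
    both_girls = 0
    for _ in range(trials):
        kids = [rng.choice("BG") for _ in range(2)]
        if "G" in kids:
            at_least_one_girl += 1
            if kids == ["G", "G"]:
                both_girls += 1
    return both_girls / at_least_one_girl

if __name__ == "__main__":
    print(f"P(two girls | at least one girl) ≈ {boy_or_girl_simulation():.3f}")
    # Converges to 1/3, not the intuitive 1/2.
```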

I exchanged emails with Smith in September to ask about the hot hand fallacy and a 2018 claim by two researchers that they’d debunked the original 1985 hot hand paper co-authored by Amos Tversky; he answered with more detail that I ended up using in a sidebar in The Inside Game. That did not directly color my writeup of Standard Deviations here, but my decision to reach out to him in the first place stems from my regard for Smith’s book. It’s on my list now of books I recommend to folks who want to read more about innumeracy and statistical abuse, in the same vein as Dave Levitan’s Not a Scientist.

Next up: About halfway through Mary Robinette Kowal’s The Calculating Stars.

Everybody Lies.

Seth Stephens-Davidowitz made his name by using the enormous trove of data from Google search queries – that is, what users all over the world type into the search box – to measure things that researchers could previously gauge only through voluntary survey responses. And, as Stephens-Davidowitz says in the title of his first book, Everybody Lies, those surveys are not that reliable. It turns out, to pick one of the most notable results of his work (described in this book), that only 2-3% of men self-report as gay when asked in surveys, but the actual rate is probably twice that, based on the data he mined from online searches.

Stephens-Davidowitz ended up working for a year-plus at Google as a data scientist before leaving to become an editorial writer at the New York Times and author, so the book is a bit more than just a collection of anecdotes like later entries in the Freakonomics series. Here, the author is more focused on the potential uses and risks of this enormous new quantity of data that, of course, is being collected on us every time we search on Google, click on Facebook, or look for something on a pornography site. (Yep, he got search data from Pornhub too.)

The core idea here is twofold: there are new data, and these new data allow us to answer questions we couldn’t answer before, or simply couldn’t answer well. People won’t discuss certain topics with researchers, or even answer surveys truthfully, but they will spill everything to Google. Witness the derisive term “Dr. Google” for people who search for their symptoms online, where they may end up with information from fraudsters or junk science sites like Natural News or Mercola, rather than seeing a doctor. What if, however, you looked at people who reveal through their searches that they have something like pancreatic cancer, and then looked at the symptoms those same people were Googling several weeks or months before their diagnosis? Such an approach could allow researchers to identify symptoms that positively correlate with hard-to-detect diseases, to estimate the chances of false positives, or even to find intermediate variables that alter the probability that the patient has the disease. You could even build expert systems that really would work like Dr. Google – if I have these five symptoms, but not these three, should I see a real doctor?
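To make that concrete, here is a toy sketch of the kind of comparison such an approach implies – nothing from the book itself, and every user and symptom term below is invented – contrasting how often later-diagnosed users searched for each symptom against a control group:

```python
from collections import Counter

# Toy search logs: user -> set of symptom terms searched in the months before
# a hypothetical diagnosis window. Users and terms are invented for
# illustration; real work would use timestamped query logs at a vastly larger scale.
diagnosed_users = {
    "u1": {"back pain", "itchy skin", "dark urine"},
    "u2": {"indigestion", "dark urine", "weight loss"},
    "u3": {"back pain", "weight loss"},
}
control_users = {
    "u4": {"headache", "back pain"},
    "u5": {"indigestion"},
    "u6": {"headache"},
    "u7": {"back pain", "sore throat"},
}

def symptom_lift(diagnosed, control):
    """Rate of each symptom search among later-diagnosed users vs. everyone else.

    A crude relative-rate ('lift') estimate; a real analysis would need far
    larger samples, time alignment, and corrections for multiple comparisons.
    """
    diag_counts = Counter(s for terms in diagnosed.values() for s in terms)
    ctrl_counts = Counter(s for terms in control.values() for s in terms)
    lifts = {}
    for symptom, d in diag_counts.items():
        p_diag = d / len(diagnosed)
        p_ctrl = (ctrl_counts[symptom] + 1) / (len(control) + 1)  # smoothed to avoid /0
        lifts[symptom] = p_diag / p_ctrl
    return sorted(lifts.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    for symptom, lift in symptom_lift(diagnosed_users, control_users):
        print(f"{symptom:12s} lift ≈ {lift:.1f}")
```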

Sex, like medical topics, is another subject people don’t like to discuss with strangers, and it happens to sell books too, so Stephens-Davidowitz spent quite a bit of time looking into what people search for when they’re searching about sex, whether it’s pornography, dating sites, or questions about sex and sexuality. The Pornhub data trove reveals quite a bit about sexual orientations, along with some searches I personally found a bit disturbing. Even more disturbing, however, is just how many Americans secretly harbor racist views, which Stephens-Davidowitz deduces from internet searches for certain racial slurs; he even shows how polls underestimated Donald Trump’s appeal to the racist white masses by demonstrating from search data how many of these people are out there. Few racists reveal themselves as such to surveys or researchers, and such people may even lie about their voting preferences or plans – saying they were undecided when they planned to vote for Trump, for instance. If Democrats had bothered to gather and analyze these data, which are freely available, would they have changed their strategies in swing states?

Some of Stephens-Davidowitz’s queries here are less earth-shattering and seem more like ways to demonstrate the power of the tool. He looks at whether violent movies actually correlate with an increase in violent crime (spoiler: not really), and what first-date words or phrases might indicate a strong chance of a second date. But he also uses some of these queries to talk about new or revived study techniques, like A/B testing, or to show how such huge quantities of data can lead to spurious correlations, a problem known as “the curse of dimensionality,” as in studies that claim a specific gene causes a specific disease or physical condition, only to have other researchers fail to replicate the result.
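The spurious-correlation trap is easy to demonstrate with pure noise; this is my own illustration of the general idea, not a calculation from the book. Test enough random variables against a random outcome and a predictable fraction of them will clear a conventional significance threshold by chance alone:

```python
import random
import statistics

def spurious_correlations(n_subjects=100, n_variables=500, threshold=0.2, seed=7):
    """Count random 'predictors' whose correlation with a random outcome
    clears a threshold purely by chance.

    With 100 subjects, |r| > 0.2 corresponds roughly to p < 0.05, so about
    5% of the 500 pure-noise variables should clear it anyway.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_subjects)]

    def pearson(xs, ys):
        # Plain Pearson correlation coefficient, computed from scratch.
        mx, my = statistics.fmean(xs), statistics.fmean(ys)
        sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
        return cov / (sx * sy)

    false_hits = 0
    for _ in range(n_variables):
        noise = [rng.gauss(0, 1) for _ in range(n_subjects)]
        if abs(pearson(noise, outcome)) > threshold:
            false_hits += 1
    return false_hits

if __name__ == "__main__":
    print(f"{spurious_correlations()} of 500 pure-noise variables 'predict' the outcome")
```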

Stephens-Davidowitz closes with some consideration of the inherent risks of having this much information about us available to corporations like Google, Facebook, and … um … Pornhub, as well as the risks of having it in the hands of the government, especially with the convenient excuse of “homeland security” always on hand to justify any sort of overreach. Take the example in the news this week that a neighbor of Adam Lanza, the Sandy Hook mass murderer, warned police that he was threatening to do just such a thing, only to be told that the police couldn’t do anything because his mother owned the guns legally. What if he’d searched for this online? For ways to kill a lot of people in a short period of time, or to build a bomb, or to invade a building? Should the FBI be knocking on the doors of anyone who searches for such things? Some people would say yes, if it might prevent Sandy Hook or Las Vegas or San Bernardino or Pulse in Orlando or Columbine or Virginia Tech or Luby’s or Binghamton or the Navy Yard. Others would consider this an unreasonable abridgement of our civil liberties. Big Data forces the conversation to move to new places because authorities can learn more about us than ever before – and we’re the ones giving them the information.

Next up: J.M. Coetzee’s Waiting for the Barbarians.