Think Twice

Michael Mauboussin’s short book on the psychology of bad decisions, Think Twice, features an endorsement on its cover from Billy Beane, saying he hopes his competitors don’t read the book. While it doesn’t go anywhere near as deep into the psychology (and neurology) of decision-making as Daniel Kahneman’s Thinking, Fast and Slow, Mauboussin’s book covers much of the same ground and does so in a quick, superficial way that might reach more people than Kahneman’s more thorough but often dense treatise could.

Mauboussin’s book carries the subtitle “Harnessing the Power of Counterintuition,” but I would describe it more as a guide to keeping easily avoidable mental traps out of your decisions. Think Twice has eight chapters dealing with specific traps, most of which will be familiar to readers of Kahneman’s book: base-rate neglect, tunnel vision, irrational optimism, overreliance on experts, ignoring context, phase transitions (black and grey swans), and conflating skill and luck. Where Kahneman went into great depth with useful examples and sometimes less-useful descriptions of fMRI test results, Mauboussin writes like he can’t get to the point fast enough – an often desirable trait in the popular business non-fiction section of the bookstore, since the assumption is that business executives don’t have time to read (even if the book might save millions of dollars).

That lightweight approach still gives Mauboussin plenty of space to hammer home the critical lessons of the book. Some of his examples don’t need a lot of explanation, such as pointing out that playing French music or German music in a wine store aisle with wines from both countries skewed consumer choices – even though those consumers explicitly denied that the music affected their choices. (Context matters.) He targets sportswriters directly when discussing their (our) difficulty (or inability) in distinguishing skill from luck – and, in my experience, fans often don’t want to hear that something is luck, even when the sample size is so small that you couldn’t prove it was skill no matter how forgiving a statistical test you applied. He mentions The Boss going off in the papers when the Yankees started 4-12 in 2005, and writers buying right into the narrative (or just enjoying the free content Steinbrenner was providing). But we see it every October, and during every season; are the Giants really the best team in baseball, or is there an element of luck (or, to use the more accurate term, randomness) in their three championship runs in five seasons? Yet we see articles that proclaim players to be clutch or “big game” every year; my colleague Skip Bayless loves to talk about the “clutch gene,” yet I see no evidence to support its existence. I think Mauboussin would take my side in the debate, and he’d argue that an executive making a decision on a player needs to set aside emotional characterizations like that and focus on the hard data where the sample sizes are sufficiently large.

His chapter on the world’s overreliance on experts also directly applies to the baseball industry, both within teams and within the media. It is simply impossible for any one person to be good enough at predictions or forecasting to beat a well-designed projection system. I could spend every night from February 10th until Thanksgiving scouting players, see every prospect every year, and still wouldn’t be better on a macro level at predicting, say, team won-lost records or individual player performances than ZiPS or Steamer or any other well-tested system. The same goes for every scout in the business, and it’s why the role of scouting has already started to change. Once tracking systems (like Trackman) can provide accurate data on batted ball speeds/locations or spin rate on curveballs for most levels of the minors and even some major college programs, how much will individual scouts’ opinions on player tools matter in the context of team-level decisions on draft picks or trades? The most analytically-inclined front offices already meld scouting reports with such data, using them all as inputs to build better expert systems that can provide more accurate forecasts – which is the goal, because whether you like projection systems or not, you want your team to make the best possible decisions, and you can’t make better decisions without better data and better analysis of those data. (Mauboussin does describe situations where experts can beat computer models, but those are typically more static situations where feedback is clear and cause/effect relationships are simple. That’s not baseball.)
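If you want to picture what “melding scouting reports with such data” looks like mechanically, here is a minimal sketch – not any team’s actual system, with hypothetical feature names and invented numbers – of treating a scout’s grade and tracking data as joint inputs to a simple projection model:

```python
# Minimal sketch (hypothetical features, invented numbers): scout grades and
# tracking data treated as joint inputs to a simple projection model.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n = 200  # pretend sample of hitter-prospect seasons

scout_hit_grade = rng.normal(50, 10, n)      # 20-80 scouting scale
avg_exit_velo   = rng.normal(88, 4, n)       # mph, from a Trackman-style system
launch_angle    = rng.normal(12, 6, n)       # degrees

# Made-up "future wOBA" target, only to show the mechanics of fitting
future_woba = (0.002 * scout_hit_grade + 0.004 * avg_exit_velo
               + 0.001 * launch_angle + rng.normal(0, 0.02, n))

X = np.column_stack([scout_hit_grade, avg_exit_velo, launch_angle])
model = Ridge(alpha=1.0).fit(X, future_woba)

# Project a hypothetical prospect: 55 hit grade, 91 mph average EV, 14-degree launch angle
print(model.predict([[55, 91, 14]]))
```

The point isn’t the model choice; it’s that the scouting opinion becomes one well-defined input among several, rather than the whole forecast.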

Mauboussin’s first chapter describes the three central illusions that lead to irrational optimism, something we see all the time in baseball when teams are asked to evaluate or potentially trade their own prospects: the illusions of superiority, optimism, and control. Our prospects are better than everyone else’s because we scout better, we develop better, and we control their development paths. When you hear that teams are overrating prospects, sometimes that’s just another GM griping that he can’t get what he wants for his veteran starter, but it can also be this irrational optimism that leads many teams to overrate their own kids. There’s a strong element of base-rate neglect in all of these illusions; if you have a deep farm system with a dozen future grade-50 prospects, you know, based on all of the great, deep systems we’ve seen in the last few years (the Royals, Rangers, Padres, Red Sox, Astros), that some of those players simply won’t work out, due to injuries, undiscovered weaknesses, or just youneverknows. A general manager has to be willing to take the “outside view” of his own players, viewing them through objective lenses rather than the biased “inside view” – which also requires that he have the tools to do so and advisers who are willing to tell him “no.”
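To put a rough number on that outside view, here’s a back-of-the-envelope calculation with an invented per-prospect success rate (nobody actually knows the true odds for any individual grade-50 prospect):

```python
# Back-of-the-envelope "outside view" on a deep farm system, using an
# invented 50% success rate per grade-50 prospect.
from math import comb

p = 0.50   # hypothetical chance each grade-50 prospect becomes a useful big leaguer
n = 12     # prospects in the system

expected_hits = n * p
p_all_twelve = p ** n
p_at_most_eight = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(9))

print(expected_hits)              # 6.0 expected to pan out
print(round(p_all_twelve, 4))     # ~0.0002 chance they all work out
print(round(p_at_most_eight, 3))  # ~0.927 chance at least four of them don't
```

Even with a generous success rate, the chance that all twelve pan out is effectively zero, which is exactly what the inside view tempts a GM to forget.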

The passage on unintended consequences is short and buried within a chapter on complex adaptive systems, but if I could send just two pages of the book to new MLB Commissioner Rob Manfred, I’d send these. Mauboussin gives two examples, one of incompetent wildlife management in Yellowstone Park, one of the feds’ decision to let Lehman Brothers fail and thus start the 2008 credit crisis, both of which involve a single action taken on a complex system that the actors didn’t fully understand (or try to). So when MLB tinkers with the draft, or folds the July 2nd international free agents into the Rule 4 draft or a new one, or changes free-agent compensation rules … whatever they do, this is a complex system with hundreds of actors who will react to any such rules changes in ways that can’t be foreseen without a look at the entire system.

The seven-page concluding chapter is a great checklist for anyone trying to bring this kind of “counterintuitive” thinking into an organization or just into his/her own decision-making. It’s preventative: here’s how you avoid rushing into major decisions with insufficient data or while under a destructive bias. I can see why Beane doesn’t want other GMs or executives reading this; competing against people who suffer from these illusions and prejudices is a lot easier than competing against people who think twice.

The Checklist Manifesto

I learned of Atul Gawande’s brief business book The Checklist Manifesto: How to Get Things Right through a positive mention of it in Daniel Kahneman’s fantastic book on cognitive psychology, Thinking, Fast and Slow. Gawande, a successful surgeon in Boston, wrote two books on improving medical care through optimizing processes (rather than throwing money at new equipment or drugs). His third book is aimed at a more general audience, extolling the virtues of the checklist as a simple, effective way to reduce the frequency of the most avoidable errors in any complex system, sometimes eliminating them entirely, saving money and even lives at near-zero upfront cost.

When Gawande discusses checklists, he’s using the term in the sense of a back-check, a list that ensures that all essential steps have been taken before the main event – a surgery, a plane’s takeoff, a large investment – occurs. This isn’t a to-do list to get you through the day, the type of checklist I make every morning or the night before to make sure I don’t forget any critical tasks, work or personal, from paying bills to making phone calls to writing a dish post. Gawande instead argues for better planning before that first incision, saying that key steps are often overlooked due to a lack of communication, excessive centralization of authority in a single person (the surgeon, the pilot, etc.), or a focus on more urgent steps that detracts from routine ones.

Gawande illustrates his points about the design and use of checklists primarily through his own experiences in surgery and through his work with the WHO on a project to reduce complication rates from surgery in both developed and developing countries – a mandate that included the requirement that any recommendations involve little or no cost to the hospitals. That all but assured that Gawande’s group would only be able to recommend process changes rather than equipment or hiring requirements, which led to a focus on what steps were often skipped in the operating room, deliberately or inadvertently. Several common points emerged: other medical personnel in the room saw surgeons as authoritarian figures and wouldn’t speak up to enforce key steps, like ensuring antibiotics were delivered prior to incision, and critical information often wasn’t passed between team members before the operation began. To solve these issues, Gawande needed to devise a way to increase communication among team members despite superficial differences in rank.

The group took a cue from aviation, with Gawande walking the reader back to the creation of preflight checklists and visiting Boeing to understand the method of developing checklists that work. (There’s been some backlash to Gawande’s recommendations, such as the fact that surgeons can “game” a checklist in various ways, detailed in this NEJM subscriber-only piece.) A checklist must be concise and clear, and must grab the lowest-hanging fruit – the most commonly-missed steps and/or the steps with the greatest potential payoff. The checklist also has a secondary purpose – perhaps even more important than making sure the steps on the list have been followed – which is increasing communication. Gawande fills in the blanks with examples from medicine, aviation, and finance of how catching simple and perhaps “stupid” errors has helped avoid massive mistakes – or how skipping steps or hewing to old hierarchies of command has led to great tragedies, including the worst aviation disaster in history, the 1977 runway crash of two Boeing 747s at Tenerife North Airport in the Canary Islands, killing 583 people. (This isn’t a great book to read if you’re afraid of flying or of surgery.)

Gawande reports positive results from the implementation of pre-surgery checklists in both developed and developing countries, even in highly challenging conditions in Tanzania, Jordan, and India. Yet he also discusses difficulties with buy-in due to surgeons being unwilling to cede any authority in the operating room or to divert attention from what they see as more critical tasks. Acceptance of checklists appears to have been easier in aircraft cockpits, while in the investment world, Gawande suggests that checklists have made virtually no inroads, despite a few investors finding great success in using them to override their emotional (“fast thinking”) instincts.

Even if you’re in an industry where checklists don’t have this kind of immediate value, it’s easy to see how they might apply to other fields with sufficiently positive ROIs to make their implementation worth considering. A major league team might have a checklist to use before acquiring any player in trade, for example – looking at recent reports and game logs to make sure he’s not injured, talking to a former coach or teammate to ensure there are no character issues, etc. A well-designed blank scouting report is itself a checklist, a way of organizing information that also forces the scout to answer the most important questions on each player. (Of course, having pro scouts write up all 25 players on each minor league team they scout runs counter to that purpose, because they’re devoting observation time to players who are completely irrelevant to the scout’s employers.) The checklist is more than just a set of tasks; it’s a mindset, a way of forcing communication on group tasks while also attempting to avoid high-cost mistakes with a tiny investment of time and attention. If the worst thing you can say about an idea is that people need to be convinced to use it, that’s probably a backhanded way of saying it’s worth implementing.
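As a sketch of what that pre-trade back-check could look like if you actually encoded it – the items below are hypothetical examples, not any club’s real process – the mechanics are simple:

```python
# Hypothetical pre-trade "back-check": every item must pass before the deal
# is approved. The checks are illustrative, not any team's actual process.
def reviewed_recent_medicals(player):
    return player.get("medicals_reviewed", False)

def checked_recent_game_logs(player):
    return player.get("game_logs_checked", False)

def called_former_coach(player):
    return player.get("makeup_call_done", False)

PRE_TRADE_CHECKLIST = [
    ("Recent medical reports reviewed", reviewed_recent_medicals),
    ("Last 30 days of game logs checked for red flags", checked_recent_game_logs),
    ("Former coach or teammate contacted about makeup", called_former_coach),
]

def run_checklist(player):
    """Return the unfinished items; an empty list means the deal can proceed."""
    return [label for label, check in PRE_TRADE_CHECKLIST if not check(player)]

target = {"medicals_reviewed": True, "game_logs_checked": True}
print(run_checklist(target))  # -> ['Former coach or teammate contacted about makeup']
```

The value isn’t in the code, obviously; it’s in forcing every box to be checked, and discussed, before anyone signs off.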

Next up: I’m about halfway through Ursula K. Le Guin’s utopian/dystopian novel The Dispossessed.

Thinking, Fast and Slow

Daniel Kahneman won the 2002 Nobel Prize in Economic Sciences (yes, the ‘fake’ Nobel) for his groundbreaking work in behavioral economics, the branch of the dismal science that shows we are even bigger idiots than we previously believed. Kahneman’s work, and his best-selling book Thinking, Fast and Slow, identify and detail the various cognitive biases and illusions that affect our judgment and decision-making, often leading to suboptimal or undesirable outcomes that might be avoided if we stop and think more critically and less intuitively. (It’s just $2.99 for the Kindle right now, through that link.)

Kahneman breaks the part of our brain that responds to questions, challenges, or other problems into two separate systems, which he calls System 1 and System 2. System 1 is the fast-reaction system: When you hear or read a question, or face a specific stimulus, your brain brings back an answer, an image, or a memory without you having to consciously search the hard drive and call up the file. System 2 does what we would normally think of as “thinking:” slow calculations, considering variables, weighing options, and so on. The problem, as Kahneman defines it, is that System 2 is lazy and often takes cues from System 1 without sufficiently questioning them. System 1 can be helpful, but it isn’t always your friend, and System 2 is passed out drunk half the time you need it.

[Image: Thing 1 and Thing 2 – Systems 1 and 2 in a rare moment of concordance.]

The good news here is that Kahneman’s work, much of it done with his late colleague Amos Tversky (who died before he could share the Nobel Prize with Kahneman), offers specific guidance on the breakdowns in our critical thinking engines, many of which can be circumvented through different processes or detoured by slowing down our thinking. One of the biggest pitfalls is what Kahneman calls WYSIATI – What You See Is All There Is, the process by which the brain jumps to a conclusion on the basis of insufficient evidence, because that evidence is all the brain has, and the human brain has evolved to seek causes for events or patterns. This leads to a number of biases or errors, including:

  • The halo effect: You like someone or something, and thus you judge that person or object or story more favorably. This is why good-looking politicians fare better than ugly ones in polls.
  • The framing effect: How you ask the question alters the answer you receive. Kahneman cites differing reactions to the same number presented two ways, such as 90% lean vs 10% fat, or a 0.01% mortality rate versus 100 deaths for every 1 million people.
  • Base-rate neglect: A bit of mental substitution, where the brain latches on to a detail about a specific example without adequately considering the characteristics of that example’s larger group or type.
  • Overconfidence: This combines the WYSIATI problem with what I’ll call the “it can’t happen to me” syndrome, which Kahneman correctly identifies as a core explanation for why so many people open restaurants, even though the industry has one of the highest failure rates around.

Although Kahneman has crafted enough of a flow to keep the book coherent from chapter to chapter, Thinking, Fast and Slow is primarily a list of significant biases or flawed heuristics our brains employ and explanations of how they work and how to try to avoid them. This includes the availability heuristic, where we answer a question about probability or prevalence by substituting the easier question of how easy it is to remember examples or instances of the topic in question. If I give you a few seconds to tell me how many countries there are in Africa, you might name a few in your head, and the faster those names come to you, the larger your guess will be for the total.

Thinking, Fast and Slow also offers an unsettling section for anyone whose career is built on obtaining and delivering knowledge, such as subject-matter experts paid for their opinions, a category that includes me: We aren’t that good at our jobs, and we probably can’t be. One major reason is the representativeness fallacy, which leads to the base-rate neglect I mentioned earlier. The representativeness fallacy leads the subject – let’s say an area scout here, watching a college position player – to overvalue the variables he sees that are specific to this one player, without adequately weighting variables common to the entire class of college position players. It may be that college position players from that particular conference don’t fare as well in pro ball as those from the SEC or ACC; it may be that college position players who have or lack a specific skill have higher/lower rates of success. The area scout’s report, taken by itself, won’t consider those “base rates” enough, if at all, and to a large degree teams do not expect or ask the area scouts to do so. However, teams that don’t employ any kind of system to bring those base rates into their overall decision-making, from historical research on player archetypes to analysis of individual player statistics adjusted for context, will mistake a plethora of scouting opinions for a genuine variety of viewpoints, and will end up making flawed or biased decisions as a result.
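A toy example, with invented probabilities, shows just how hard a base rate can drag down even a glowing report once you fold the two together with Bayes’ rule:

```python
# Toy Bayes' rule example (all numbers invented): a rave scouting report on a
# college hitter, weighed against the base rate for his whole class of players.
base_rate_success = 0.10        # pretend 10% of this class of players become regulars
p_report_given_success = 0.70   # pretend scouts file this kind of rave on 70% of eventual successes
p_report_given_failure = 0.20   # ...and on 20% of eventual washouts

p_report = (p_report_given_success * base_rate_success
            + p_report_given_failure * (1 - base_rate_success))
p_success_given_report = p_report_given_success * base_rate_success / p_report

print(round(p_success_given_report, 3))  # 0.28 -- far from a sure thing, despite the report
```

With those made-up numbers, the rave report roughly triples the player’s odds, yet he’s still more likely to bust than to make it, which is precisely the information a room full of like-minded scouting reports won’t surface on its own.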

Kahneman’s explanation of regression to the mean, and how that should impact our forecasting, is the best and clearest I’ve come across yet – and it’s a topic of real interest to anyone who follows baseball, even if you’re not actually running your own projections software or building an internal decision-sciences system. Humans are especially bad at making predictions where randomness (“luck”) is a major variable, and we tend to overweight recent, usually small samples and ignore the base rates from larger histories. Kahneman lays out the failure to account for regression in a simple fashion, pointing out that if results = skill + luck, then the change in results (from one game to the next, for example) = change in luck, since skill barely moves over that span. At some point, skill does change, but it’s hard or impossible to pinpoint when that transpires. Many respected baseball analysts working online and for teams argue for the need to regress certain metrics back to the mean to try to account for the interference of randomness; one of my main concerns with this approach is that while it’s rational, it may make teams slower to recognize actual changes in skill level (or health, which affects skill) as a result. Then again, that’s where scouts can come in, noticing a decline in bat speed, a change in arm slot, or a new pitch that might explain why what looks like noise carries more signal than a regression algorithm would indicate.
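A quick simulation of that results-equals-skill-plus-luck framing (arbitrary numbers, purely to show the pattern) makes the regression effect plain: the players who “broke out” in one season land much closer to average the next:

```python
# Simulating results = skill + luck with arbitrary numbers to show regression
# to the mean: season 1's top performers fall back toward average in season 2.
import numpy as np

rng = np.random.default_rng(1)
n_players = 1000
skill = rng.normal(0.0, 1.0, n_players)               # fixed talent
results_1 = skill + rng.normal(0.0, 1.5, n_players)   # season 1 = skill + luck
results_2 = skill + rng.normal(0.0, 1.5, n_players)   # season 2 = same skill, new luck

breakouts = results_1 > np.percentile(results_1, 90)  # season 1's top 10%
print(results_1[breakouts].mean())  # far above average in season 1...
print(results_2[breakouts].mean())  # ...much closer to average in season 2
```

The skill of those players never changed; only the luck term did, which is the whole argument for regressing small samples.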

One more chapter relevant to sports analytics covers the planning fallacy, or what Christina Kahrl always referred to as “wishcasting:” Forecasting results too close to best-case scenarios that don’t adequately consider the results of other, similar cases. The response, promulgated by Danish planning expert Bent Flyvbjerg (I just wanted to type that name), is called reference class forecasting, and is just what you’d expect the treatment for the planning fallacy to include. If you want to build a bridge, you find as many bridge construction projects as you can, and obtain all their statistics, such as cost, time to build, distance to be covered, and so on. You build your baseline predictions off of the inputs and results of the reference class, and you adjust them accordingly for your specific case – but only slightly. If all 30 MLB teams did this, no free-agent reliever would ever get a four-year deal again.
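A bare-bones sketch of reference class forecasting applied to that free-agent reliever, with all numbers invented, would look something like this:

```python
# Bare-bones reference class forecast (invented numbers): start from the
# outcomes of comparable past contracts, then adjust only slightly for the
# specific player in front of you.
import statistics

# Hypothetical reference class: WAR produced over prior four-year reliever deals
reference_class_war = [1.5, 0.8, 2.0, -0.5, 1.2, 0.3, 1.8, 0.0, 1.1, 0.6]

baseline = statistics.mean(reference_class_war)   # the "outside view"
inside_view_adjustment = 0.3                      # small tweak for this player's specifics
forecast = baseline + inside_view_adjustment

print(round(baseline, 2), round(forecast, 2))     # 0.88 1.18
```

The discipline is in how little you let the inside view move the number away from that baseline.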

Thinking explains many other biases and heuristics that lead to inferior decision-making, including loss aversion, the endowment effect, and the one Ned Colletti just screwed up, the sunk cost fallacy, where money that is already spent (whether you continue to employ the player or not) affects decisions on whether or not to continue spending on that investment (or to keep Brandon League on the 40-man roster). He doesn’t specifically name recency bias, but discusses its effects at length in the final section, where he points out that if you ask someone how happy s/he is with his/her life, the answer will depend on what’s happened most recently (or is happening right now) to the respondent. This also invokes the substitution effect: It’s hard for me to tell you exactly how happy or satisfied I am with my life as a whole, so my brain will substitute an easier question, namely how happy I feel at this specific moment.

That last third of the book shifts its focus more to the psychological side of behavioral economics, with subjects like what determines our happiness or satisfaction with life or events within, and the difficulty we have in making rational – that is, internally consistent – choices. (Kahneman uses the word “rational” in its economic and I think traditional sense, describing thinking that is reasonable, coherent, and not self-contradictory, rather than the current sense of “rational” as skeptical or atheist.) He presents these arguments with the same rigor he employs throughout the book, and the fact that he can be so rigorous without slowing down his prose is Thinking‘s greatest strength. While Malcolm Gladwell can craft brilliant narratives, Kahneman builds his story up from scientific, controlled research, and lets the narrative be what it may. (Cf. “narrative fallacy,” pp. 199-200.) If there’s a weak spot in the book, in fact, it comes when Kahneman cites Moneyball as an example of a response (Oakland’s use of statistical analysis) to the representativeness fallacy of scouting – but never mentions the part about Tim Hudson, Mark Mulder, and Barry Zito helping lead to those “excellent results at low cost.” That aside – and hey, maybe he only saw the movie – Thinking, Fast and Slow is one of the most important books for my professional life that I have ever read, and if you don’t mind prose that can be a little dense when Kahneman details his experiments, it is an essential read.