I think that the general difficulty that many people have in understanding statistics is an important problem, because it leads people to misinterpret the world around them. General managers of baseball teams overpay for free agents coming off of good years because they underestimate the chances that the recent good year was just the result of variance around a mediocre mean – or at least they did until the Billy Beane era. Retail investors plow money into expensive mutual funds that have beaten the S&P 500 index for a few years in a row because they underestimate the chances that recent success is the result of pure, dumb luck; more importantly, the scandal of mutual fund expenses goes unchallenged because of the conventional wisdom that you should pay more to get into “better” funds. (I think it is possible, though unlikely, that some fund managers could actually be better than the market; but with all the statistical noise, you are not going to find them unless you look at a very long period of time.)
So I was happy to learn that my second-favorite radio show, Radiolab, was doing an episode on randomness. (You can stream it at that link, or download an MP3 from their podcast.) Their first segment does a good, clear job of debunking the human tendency to make too much of seemingly improbable events. For example, a woman in New Jersey wins the lottery in two consecutive years; what are the chances? But if you look at all the lotteries and all the lottery winners everywhere, it would be shocking if you didn’t have repeat winners.
Here’s an even simpler example from the show. Imagine blades of grass are sentient. There are millions of blades of grass in a fairway. Someone hits a drive and the golf ball crushes a single blade of grass. From the perspective of that blade, it’s a cruel freak of chance. But from the perspective of the fairway as a whole, it’s a near-certainty that some blade of grass will be crushed.*
But in the second segment, they take up that favorite example of statisticians everywhere: There are no streak shooters in basketball! (And if you think there are, you are just a weak creature of habit and prejudice who refuses to accept the pure truth of numbers.)
The story goes like this. Basketball players, announcers, and fans all believe that in certain games, or at certain times in a game, a player may become “hot” – he can’t miss, he’s on fire, he’s in the zone, etc. At that point, that player is shooting especially well, so his team should get him the ball. Well, say the statisticians, if you actually look at shooting percentages, you’ll see that his shooting percentage after making three consecutive shots is the same as it always is. In other words, if a player’s shooting percentage is 50%, and he hits five consecutive shots, that’s just random variation – there’s a 1/32 chance of that happening for any given sequence of five shots – so there’s no particular reason to think he’ll make the next one. Case closed.
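The 1/32 figure is just the shooting percentage raised to the power of the streak length. A quick sketch, using the 50% shooter and five-shot streak from the example above and assuming each attempt is an independent coin flip (which is exactly the assumption the statisticians are making):

```python
# Probability that a 50% shooter makes 5 consecutive shots,
# under the assumption that each attempt is independent.
shooting_pct = 0.5
streak_length = 5

p_streak = shooting_pct ** streak_length
print(p_streak)  # 0.03125, i.e. 1/32
```

Note that this is the chance for any one pre-specified sequence of five shots; as the lottery and fairway examples suggest, the chance that *some* five-shot streak shows up over a long season is far higher.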
This story has become such an article of faith in the “statistics are right/intuition is wrong” camp that it bears a little examination.
It dates back to a 1985 paper called “The Hot Hand in Basketball: On the Misperception of Random Sequences,” by Thomas Gilovich, Robert Vallone, and Amos Tversky. (It hasn’t hurt that Tversky was one of the modern founders of behavioral economics and almost certainly would have won the Nobel Prize with Daniel Kahneman had he not died in 1996 at the age of 59.) They define the “hot hand” hypothesis as “the belief that the performance of a player during a particular period is significantly better than expected on the basis of the player’s overall record” (pp. 295-96) and conclude that this belief reflects “the operation of a powerful and widely shared cognitive illusion.”
However, they actually prove something more modest: that the chances of a given player making a given field goal attempt are not related to the success or failure of his immediately preceding attempt or attempts (see Table 1, p. 299, and Table 2, p. 302). I can raise some quibbles here, like the fact that they don’t look at how much time passes between those shot attempts; if you hit two shots in the second quarter and then miss one in the fourth quarter, that’s the same to them as if you shoot on three consecutive possessions. (I believe some of their analyses even span separate games.) Remember, the median starter – say, your power forward or center on most teams – only takes about ten shots over the course of two hours. But the big issue is one they acknowledge: “The failure to detect evidence of streak shooting might also be attributed to the selection of shots by individual players and the defensive strategy of opposing teams” (p. 303). If someone is actually shooting better than usual, the other team will guard him more tightly, and he will also (rationally) choose to take slightly more difficult shots, both of which will push his actual field goal percentage down to his long-term average or even below it.
The authors deal with this objection in a clever way. They conducted an experiment with Cornell players, each of whom took 100 shots in succession from a variety of spots at a fixed distance from the basket. Then they analyzed those sequences of shots to look for correlations between one shot and the next, and also found that the results were remarkably similar to random sequences. However, it only takes about 10 minutes to take 100 shots (I assume someone was keeping the shooters fed with balls to make the experiment go more smoothly), so arguably those 100 shots are just one snapshot of a person’s shooting performance at one point in time. (Remember, it can easily take an NBA player in the rotation three weeks to take 100 game shots.)
So I will buy the conclusion that data about recent field goal attempts cannot be used to predict the outcome of the next field goal attempt. This is an analog to the efficient market hypothesis – you cannot predict which way an asset price will go based on its recent price movements. But I don’t think this proves that basketball players don’t shoot better in some periods and in some games than in others. The authors’ statement about the “performance” of a player is only necessarily true if we define performance narrowly to mean the likelihood of making his next field goal attempt, not if we define it to mean his shooting ability at that time.
Why do I cling to this difference, when I’m willing to believe that no one has the ability to beat the stock market? When it comes to stock prices, there is a very persuasive theory of why you can’t beat the market consistently; beating the market requires information, and if you have the information, then the people you are trading with already have that information, too. When it comes to basketball, it strains belief to think that your ability to shoot the ball is a constant, day after day, play after play, all the time. For one thing, sometimes you are tired, or sick (and few of us can replicate Michael Jordan’s “flu game”), or injured, or distracted; the idea that this wouldn’t affect your shooting seems preposterous. If your actual field goal attempts end up looking like random patterns, then I think that’s more likely a result of the complex and un-modelable way in which you, your team, and the other team adapt to each other.
* However, they did make one of those frustrating mistakes that leave you powerless in the car, helpless to stop the radio from saying things that are just not true. In the story, Jad and Robert flip a coin 100 times, and come up with a streak of 7 tails in a row. The chance of 7 tails in 7 flips is 1/128. The chances of getting a streak of 7 tails within a series of 100 flips are obviously much higher. But in the show they say that because 100/7 is about 14, there are 14 sets of 7 flips that you have to look at, and they calculate that the chance of getting a streak of 7 tails within a series of 100 flips is about 1/6.
This is just wrong; a series of 100 flips has not 14, but 94 different sets of 7 consecutive flips within it (one starting at each of positions 1 through 94, since 100 − 7 + 1 = 94), so the chances of getting 7 consecutive tails are much higher than 1/6. Those 94 sets are not all independent, however, so it’s not as simple as calculating 1 – ((1 – 1/128) ^ 94). I used the brute force method and simulated 100 trials of 100 flips, and 31 of those trials had a streak of 7 tails. But 7 heads are just as remarkable as 7 tails, so you have to count those streaks, too; there were 36 of them. In total, 53 trials had a streak of 7 tails or a streak of 7 heads – meaning that such a streak is completely unremarkable.
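The brute-force method described above can be sketched like this (a minimal Monte Carlo simulation, assuming a fair coin; the trial count is bumped up from 100 to 100,000 to get stabler estimates, and the function and variable names are just mine):

```python
import random

def has_streak(flips, symbol, length=7):
    """Return True if `flips` contains `length` consecutive `symbol`s."""
    run = 0
    for f in flips:
        run = run + 1 if f == symbol else 0
        if run >= length:
            return True
    return False

def simulate(trials=100_000, n_flips=100, streak=7, seed=0):
    """Estimate the chance of a 7-streak of tails, heads, or either
    within 100 flips of a fair coin."""
    rng = random.Random(seed)
    tails = heads = either = 0
    for _ in range(trials):
        flips = [rng.choice("HT") for _ in range(n_flips)]
        t = has_streak(flips, "T", streak)
        h = has_streak(flips, "H", streak)
        tails += t
        heads += h
        either += t or h
    return tails / trials, heads / trials, either / trials

p_tails, p_heads, p_either = simulate()
print(p_tails, p_heads, p_either)  # roughly 0.32, 0.32, 0.54
```

The estimates land close to the 100-trial run above: a streak of 7 tails shows up in roughly a third of all 100-flip series, and a streak of 7 of either kind in over half of them.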
By James Kwak