Statistics and Basketball for Beginners

I think that the general difficulty that many people have in understanding statistics is an important problem, because it leads people to misinterpret the world around them. General managers of baseball teams overpay for free agents coming off of good years because they underestimate the chances that the recent good year was just the result of variance around a mediocre mean – or at least they did until the Billy Beane era. Retail investors plow money into expensive mutual funds that have beaten the S&P 500 index for a few years in a row because they underestimate the chances that recent success is the result of pure, dumb luck; more importantly, the scandal of mutual fund expenses goes unchallenged because of the conventional wisdom that you should pay more to get into “better” funds. (I think it is possible, though unlikely, that some fund managers could actually be better than the market; but with all the statistical noise, you are not going to find them unless you look at a very long period of time.)

So I was happy to learn that my second-favorite radio show, Radiolab, was doing an episode on randomness. (You can stream it at that link, or download an MP3 from their podcast.) Their first segment does a good, clear job of debunking the human tendency to make too much of seemingly improbable events. For example, a woman in New Jersey wins the lottery in two consecutive years; what are the chances? But if you look at all the lotteries and all the lottery winners everywhere, it would be shocking if you didn’t have repeat winners.

Here’s an even simpler example from the show. Imagine blades of grass are sentient. There are millions of blades of grass in a fairway. Someone hits a drive and the golf ball crushes a single blade of grass. From the perspective of that blade, it’s a cruel freak of chance. But from the perspective of the fairway as a whole, it’s a near-certainty that some blade of grass will be crushed.*

But in the second segment, they take up that favorite example of statisticians everywhere: There are no streak shooters in basketball! (And if you think there are, you are just a weak creature of habit and prejudice who refuses to accept the pure truth of numbers.)

The story goes like this. Basketball players, announcers, and fans all believe that in certain games, or at certain times in a game, a player may become “hot” – he can’t miss, he’s on fire, he’s in the zone, etc. At that point, that player is shooting especially well, so his team should get him the ball. Well, say the statisticians, if you actually look at shooting percentages, you’ll see that his shooting percentage after making three consecutive shots is the same as it always is. In other words, if a player’s shooting percentage is 50%, and he hits five consecutive shots, that’s just random variation – there’s a 1/32 chance of that happening for any given sequence of five shots – so there’s no particular reason to think he’ll make the next one. Case closed.

This story has become such an article of faith in the “statistics are right/intuition is wrong” camp that it bears a little examination.

It dates back to a 1985 paper called “The Hot Hand in Basketball: On the Misperception of Random Sequences,” by Thomas Gilovich, Robert Vallone, and Amos Tversky. (It hasn’t hurt that Tversky was one of the modern founders of behavioral economics and almost certainly would have won the Nobel Prize with Daniel Kahneman had he not died in 1996 at the age of 59.) They define the “hot hand” hypothesis as “the belief that the performance of a player during a particular period is significantly better than expected on the basis of the player’s overall record” (pp. 295-96) and conclude that this belief reflects “the operation of a powerful and widely shared cognitive illusion.”

However, they actually prove something more modest: that the chances of a given player making a given field goal attempt are not related to the success or failure of his immediately preceding attempt or attempts (see Table 1, p. 299, and Table 2, p. 302). I can raise some quibbles here, like the fact that they don’t look at how much time passes between those shot attempts; if you hit two shots in the second quarter and then miss one in the fourth quarter, that’s the same to them as if you shoot on three consecutive possessions. (I believe some of their analyses even span separate games.) Remember, the median starter – say, your power forward or center on most teams – only takes about ten shots over the course of two hours. But the big issue is one they acknowledge: “The failure to detect evidence of streak shooting might also be attributed to the selection of shots by individual players and the defensive strategy of opposing teams” (p. 303). If someone is actually shooting better than usual, the other team will guard him more tightly, and he will also (rationally) choose to take slightly more difficult shots, both of which will push his actual field goal percentage down to his long-term average or even below it.

The authors deal with this objection in a clever way. They conducted an experiment with Cornell players, each of whom took 100 shots in succession from a variety of spots at a fixed distance from the basket. Then they analyzed those sequences of shots to look for correlations between one shot and the next, and again found that the results were remarkably similar to random sequences. However, it only takes about 10 minutes to take 100 shots (I assume someone was keeping the shooters fed with balls to make the experiment go more smoothly), so arguably those 100 shots are just one snapshot of a person’s shooting performance at one point in time. (Remember, it can easily take an NBA player in the rotation three weeks to take 100 game shots.)

So I will buy the conclusion that data about recent field goal attempts cannot be used to predict the outcome of the next field goal attempt. This is an analog to the efficient market hypothesis – you cannot predict which way an asset price will go based on its recent price movements. But I don’t think this proves that basketball players don’t shoot better in some periods and in some games than in others. The authors’ statement about the “performance” of a player is only necessarily true if we define performance narrowly to mean the likelihood of making his next field goal attempt, not if we define it to mean his shooting ability at that time.

Why do I cling to this difference, when I’m willing to believe that no one has the ability to beat the stock market? When it comes to stock prices, there is a very persuasive theory of why you can’t beat the market consistently; beating the market requires information, and if you have the information, then the people you are trading with already have that information, too. When it comes to basketball, it strains belief to think that your ability to shoot the ball is a constant, day after day, play after play, all the time. For one thing, sometimes you are tired, or sick (and few of us can replicate Michael Jordan’s “flu game”), or injured, or distracted; the idea that this wouldn’t affect your shooting seems preposterous. If your actual field goal attempts end up looking like random patterns, then I think that’s more likely a result of the complex and un-modelable way in which you, your team, and the other team adapt to each other.
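A toy model makes the gap between shooting ability and next-shot predictability concrete. The sketch below is an illustration only: it assumes a hypothetical shooter whose true ability is 40% in half of his games and 60% in the other half, with shots inside a game independent at that ability. The function name and the numbers are invented for the example and come from neither the paper nor the show.

```python
from fractions import Fraction


def make_rate_after_streak(abilities, streak):
    """P(make the next shot | made the previous `streak` shots in this game).

    Assumes each game's ability is drawn uniformly from `abilities` and that
    shots within a game are independent at that ability (a toy assumption).
    """
    streak_then_make = sum(p ** (streak + 1) for p in abilities)
    streak_only = sum(p ** streak for p in abilities)
    return streak_then_make / streak_only


# Hypothetical cold games (40%) and hot games (60%), equally likely.
abilities = [Fraction(2, 5), Fraction(3, 5)]

overall = make_rate_after_streak(abilities, 0)      # no conditioning at all
after_one = make_rate_after_streak(abilities, 1)    # after one make
after_three = make_rate_after_streak(abilities, 3)  # after three straight makes
```

Under these made-up numbers the shooter makes 52% of his shots after one make and about 55% after three straight makes, against 50% overall, even though his true ability differs by 20 points from one game to the next. A real difference in ability can leave only a faint trace in shot-to-shot data.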

* However, they did make one of those frustrating mistakes that leave you powerless in the car, helpless to stop the radio from saying things that are just not true. In the story, Jad and Robert flip a coin 100 times, and come up with a streak of 7 tails in a row. The chance of 7 tails in 7 flips is 1/128. The chance of getting a streak of 7 tails within a series of 100 flips is obviously much higher. But in the show they say that because 100/7 is about 14, there are 14 sets of 7 flips that you have to look at, and they calculate that the chance of getting a streak of 7 tails within a series of 100 flips is about 1/6.

This is just wrong; a series of 100 flips has not 14, but 94 different sets of 7 flips within it, so the chances of getting 7 consecutive tails are much higher than 1/6. Those 94 sets are not all independent, however, so it’s not as simple as calculating 1 – ((1 – 1/128) ^ 94). I used the brute force method and simulated 100 trials of 100 flips, and 31 of those trials had a streak of 7 tails. But 7 heads are just as remarkable as 7 tails, so you have to count those streaks, too; there were 36 of them. In total, 53 trials had a streak of 7 tails or a streak of 7 heads – meaning that such a streak is completely unremarkable.
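The same figures can be computed exactly rather than simulated. The sketch below assumes fair, independent flips and tracks the probability that no streak has appeared yet, split by how many tails the sequence currently ends in. The name `prob_run` is illustrative and not taken from any library.

```python
def prob_run(n_flips, run_len, either_side=False):
    """Exact chance of at least one run of `run_len` in `n_flips` fair flips.

    either_side=False counts runs of tails only; True counts runs of heads
    or tails (this branch assumes run_len >= 2).
    """
    if either_side:
        # A run of k identical faces is a run of k-1 "same as the previous
        # flip" outcomes among the n-1 transitions, each with probability 1/2.
        return prob_run(n_flips - 1, run_len - 1)
    # state[j] = probability that no run has occurred so far and the
    # sequence currently ends in exactly j consecutive tails.
    state = [1.0] + [0.0] * (run_len - 1)
    for _ in range(n_flips):
        new = [0.0] * run_len
        new[0] = 0.5 * sum(state)        # heads resets the tail count
        for j in range(run_len - 1):
            new[j + 1] = 0.5 * state[j]  # tails extends the current run
        state = new                      # mass that completes a run is dropped
    return 1.0 - sum(state)


p_tails = prob_run(100, 7)                     # at least 7 tails in a row
p_either = prob_run(100, 7, either_side=True)  # 7 in a row of either face
p_naive = 1 - (1 - 1 / 128) ** 94              # pretends the 94 windows are independent
```

With these inputs the exact values come out at roughly 0.32 for tails alone and roughly 0.54 for either face, in line with the 31 and 53 out of 100 from the simulation. The independence shortcut overshoots at roughly 0.52 for tails alone.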

By James Kwak

46 thoughts on “Statistics and Basketball for Beginners”

  1. Those guys have never played basketball at a championship level. The confidence of the shooter affects his performance. As he makes a string of difficult shots his mental and physical skills increase. This is not very pronounced in a game that has little value but in the finals it has real value. Michael Jordan was not any better a shooter than a dozen or more of his peers. He had peers who could leap as high and run as fast as he could. What he had that was unmatched was an irrepressible belief that he was the best at the moment when it mattered the most.

  2. The next Radio Lab show modified the story a little bit. They said that each shot was not an independent event. So the simplified explanation they used on the last show was not right. Basically players know they are on a streak and they modify their actions to increase the chances of continuing their streak. If each event is not independent the math becomes a lot harder.

  3. hhhmmnnn….so can this also be read to mean the SEC was completely incompetent in regards to Madoff? If so, well put.

  4. Anyone interested in probability, business and the stock or options markets should read Taleb’s first book “Fooled By Randomness”. It’s very readable. I personally found “Black Swan” quite a chore and much less useful.

    Another good one is “The Drunkard’s Walk” by Mlodinow but it doesn’t have an investment focus.

  5. One day at the supermarket my purchase total was an exact number of dollars – no cents. The clerk and bagger looked at each other and asked “I wonder what the chances of that are?”. I replied “one in a hundred”. Their response – “no way”.

    Admittedly my analysis didn’t take into account the habit of pricing items ending in .98 or .99, or multiples of a nickel, but I suspect my estimate remains pretty close to correct.

  6. Based on my own experience I would have estimated the probability to be much lower than 1/100.

    I would say that’s the kind of thing that happens to me once every 5 years. Taking into account all possible kinds of shopping (not grocery alone) I probably find myself at the checkout more than twice a week.

  7. I’m familiar with this from Stephen Jay Gould’s Full House, where he endorses the conclusion.

    I read that book years ago, but I recall being skeptical about the sample size – the data was compiled from one mediocre team over the course of one season, or something like that.

    Still, the idea that there’s no such thing as streak shooters, clutch hitters, etc. is a common (if minority) contrarian one among sports commentators, although they seldom offer any numbers to back it up.

  8. Cool. Obviously you know a bit about baseball referencing how Billy Beane has changed the analysis a bit. Baseball is such a small scale, but you can see how inefficient the pay structure is. Crappy older players get paid way too much money for mediocre years, and players in their prime get paid too little. As inefficient as baseball is, how much more are big Businesses? Big Business as we know it is a pyramid pay structure, where the cuts always come at the expense of the spenders.

    This is a flawed system. We will not recover sufficiently until labor is viewed as an investment, as opposed to something to always just cut cut cut. Especially since no one enforces anti-trust laws anymore.

  9. Actually, the idea that there is no such thing as a clutch hitter in baseball has been pretty well documented. This is assuming that when you say a player is “clutch” that you mean he will do better in future “clutch” situations.

    This is very different from saying that a player has performed better in the clutch. There are players who do, obviously, but their past performance is not predictive of future performance. The same could be said of a mutual fund. JC Bradbury covers clutch hitting pretty well in his book “The Baseball Economist.”

    I agree with ifaforo that Taleb’s “Fooled By Randomness” is a great read on this subject, especially for readers of this blog. I’d recommend “The Black Swan” as well.

    And for anyone that’s even a little bit interested in statistics and baseball and some really witty writing, you should go to the now-defunct blog firejoemorgan.com and go through the archives.

  10. Actually, the probability you are describing is extremely interesting mathematically.

    The chance of NOT having a series of 7 (or more) tails in 100 flips is about (c-1)*(c/2)^101, where c is a constant c(7) ~ 1.992.
    (see http://mathworld.wolfram.com/Fibonaccin-StepNumber.html)

    Apparently there is about a 66% chance of NOT having a series of 7 tails, so a 34% chance of having such a consecutive series (in 100 flips).
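The approximation in this comment can be set against an exact count. The sketch below assumes fair flips and uses the 7-step Fibonacci recurrence to count the head/tail sequences of length 100 that contain no run of 7 tails. The name `count_no_run` is illustrative, and `c = 1.992` is simply the constant as quoted in the comment.

```python
def count_no_run(n, k):
    """Number of length-n head/tail sequences with no run of k tails."""
    # For lengths 0..k-1 every sequence qualifies, so the count is 2**length.
    counts = [2 ** i for i in range(k)]
    for _ in range(k, n + 1):
        # k-step Fibonacci recurrence: split on how many tails follow the last head.
        counts.append(sum(counts[-k:]))
    return counts[n]


exact_no_run = count_no_run(100, 7) / 2 ** 100  # exact share with no run of 7 tails

c = 1.992                                       # constant as quoted in the comment
quoted_no_run = (c - 1) * (c / 2) ** 101        # the comment's shorthand expression
```

On this count the share of sequences with no run of 7 tails comes out near 0.68, a little above the roughly 0.66 that the quoted expression gives, which puts the chance of a run closer to 32% than 34%. The gap appears to come from a normalizing factor in the asymptotic formula for n-step Fibonacci numbers that the shorthand leaves out, though that reading is an assumption about what the commenter intended.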

  11. I’m surprised “StatsGuy” hasn’t put in his 2 cents on this. I think Statistics are especially important for people to have knowledge of. Especially as they are used for political interests to fool people. Statistics are very useful in the field of Quality Control, and are a large part of the reason the Japanese are kicking America’s butt in automobile sales now. Anyone interested in that can look up the work of W. Edwards Deming.

  12. This paper, and the love it gets, is one of my major pet peeves. The quick and dirty version of what follows, which is a quick and dirty version of a paper written with a couple of colleagues a few years ago, is that the GVT make claims well beyond what is supported by their statistical analysis and, in settings where you can control for strategic interactions between players/teams, there is ample evidence of a hot hand.

    Gilovich, Vallone, and Tversky (GVT) tells us nothing about the quality of inference made by people believing in the hot hand. The main issue is that it ignores both the influence of the belief on the decisions made by the player believing that he is hot and the strategic response of the team defending a hot player. In other words, a player believing that he is hot may take more difficult shots and/or be defended more aggressively. While spectators and players may see these types of adjustments, they don’t show up in the GVT data.

    Furthermore, they reject a null hypothesis of streakiness without specifying any alternative. This provides evidence only of a lack of streaks in the data, yet they go on to state that this supports the notion that players and fans are irrational, suffering from some heuristic bias. The best thing I can say about it is that the conclusions they draw from a statistically weak test with an unspecified alternative provide the best evidence against rationality; if Tversky and colleagues can’t understand that rejection of a null does not imply that you accept an unspecified alternative, then maybe there’s no hope for any of us.

    If you look at data from sports where there are no strategic interactions such as bowling, like we did, and horseshoes, as Gary Smith did, there is lots of evidence supporting a hot hand in athletic performance.

    If you (or anyone else here, for that matter) are interested, I would be happy to send a copy of our paper “Runs, regimes and rationality: The hot hand strikes back,” for your reading pleasure.

  13. I’ve always found it disingenuous that people argue there is no such thing as a hot hand. Rather, I think the correct statement is that you have no idea how much longer a hot streak will last. Having played baseball at a high level, there are obviously times when you are more likely to get a hit, but when that streak will occur is largely a random event.

  14. I have long believed that your (JK’s) point is correct, for reasons that would have been obvious to the “Streak” paper authors if they had thought beyond the level of activity they were examining. Randomness in human capacity over time suggests streaks. Deviation in our physical condition from its optimal state is decidedly not as brief as the time between one coin flip and the next. Sprains, strains, sleepless nights, jet-lag, irritation at team mates and sweeties, road-weariness all last for a while. Periods when none of those things trouble us can last for days.

    If one grants the assumption that athletic performance can reflect factors which affect physical well-being, that those factors can occur randomly, and that they persist for longer than a coin flip (for an entire game, for instance) then randomness means streaks. The argument that there are no streaks requires that physical well-being has no effect on performance, or is overwhelmed by other factors, or is very steady over time.

  15. I second the recommendation for “Fooled By Randomness”, it is very well written, and being semi-focused on trading and market performance should be of interest to readers of this blog.

    And yes, “Black Swan” is less useful, especially for those that have already read Fooled By Randomness, since a lot of the ideas are the same. It’s written for a more general audience, and is broader and more philosophical in nature. Personally I enjoyed it, but I would recommend Fooled By Randomness first to anyone who is interested in Economics/Markets/Finance.

  16. Is it possible to be a “streaky” shooter or batter? In the same way certain financial products have more volatility? Take two hitters, both of whom hit .300 but one was much more prone to “hot” and “cold” streaks due to physical and psychological reasons.

  17. I agree with you and the others that argue streaks should exist. The psychological component is important as well, confidence/nerves could/should impact performance. But to examine this one would have to only look at shots occurring within a short period of time.

    Treating one’s ability to sink a shot as an independent random variable is absurd. And their laboratory experiment fails to replicate the pressure of a real game (although maybe if there were some reward scheme for the length of a streak, that could simulate some of the real life pressure).

  18. Footballoutsiders.com also breaks down the myth that there are clutch kickers. Although the sample of game winning kicks is small, it does look like a “clutch” kicker simply has had more opportunities to succeed.

  19. I completely agree. And this clearly applies to a sport like basketball where shooting form relies on muscle memory.

    Additionally, when a player is referred to as being “on fire” it is normally based on a series of jump shots or long distance shots. Rarely are centers considered streaky because their shots are so close to the basket and rely largely on moves or passes to create a high percentage shot.

  20. I agree. I wonder why Taleb bothered to write “Black Swan”. 80% of its ideas are already contained in “Fooled by Randomness”.

  21. I was in Vegas a little while ago and there was a guy on our Roulette table who constantly bet on 23.

    Eventually it did come, and the next turn, he said, take my bet off 23, what are the odds of this one coming a 23 again.

    And the dealer (or is it roller) guy said just as before — 1 in 38.

    Vegas is a good place to see how ppl deal with odds.

  22. Traders use pep pills? America arrests poor for such behaviour but glorifies pilots and athletes. Sucks to have poor parents.

  23. Actually, in all of sports, belief in one’s ability plays a substantial part in the results of his efforts. Also, in all of sports, I believe that tennis provides the best evidence of this. Take Roger Federer for example. He struggled to win for the first three years of his career. Yet now you watch one of his matches, and, although he has lost a few times in the last few years, in tournaments that he considers “important” he is nearly unbeatable. As of today, he reached the semifinals or better in 21 consecutive grand slam tournaments. He has great skill, but not in any way substantially greater than many whom he defeats.

    This is really, as in the examples of Tiger, Michael, and Roger, what separates them from the chaff — they all believe that when push comes to shove that they will win. The other thing that contributes is the idea that others come to share that belief they have, making it more likely that the others will fail.

    But, except for baseball, which is driven by statistics, few other sports can be very accurately statistically analyzed.

  24. A couple of interesting statistical items:

    I play blackjack and use a system. As a part of that system, I halve my bet following any winning hand, and double it after any losing hand. I have NEVER lost money playing black jack, period.

    Second, did you know that there is better than a 50/50 chance that on any given page of a metropolitan white pages, there will be two numbers with the same last four digits. I find this interesting, but as a statistician, I have never run the numbers.

  25. I’m definitely not a statistician, but isn’t the hypothesis of the study incorrectly cast? While there are a myriad different influences that impact whether or not a player can make a shot (athletic ability, shot mechanics, location on the court, general health, the defensive abilities of the opponent, the game strategies employed by both teams, etc.), isn’t the real question one of reversion to a mean? And doesn’t that provide a rationale for streaks or hot hands?

    Take 3 or 4 seasons of a single player’s shooting stats, taking particular note of shot selection in relationship to position on the court. (Important because Shaq will score on average 60% of the time within the paint, but move him out another 5 feet and the FG% will drop dramatically; his career FG% from beyond the arc is .1% – 21/18,793. Which is a stupid stat in and of itself. Shaq knows that he has no business shooting from that position on the court and his career would have been remarkably short had he insisted on taking a couple of 3’s per game.) I would argue that if a player, who remains on the same team, with the same teammates (again, more relevant considerations to take into account) has a historical FG% of 45%, and has gone 0-15 in his last game, he will appear to be on a hot shooting streak over the course of the next few games, because he’s really a 45% shooter.

  26. it’s way less than 1/100 in the US and way more than 1/100 in Europe, where the culture of .99 prices is not that widespread.

  27. Very interesting. I would say that comparing the basketball and the mutual fund cases is misleading. In investing at least half (and probably more) of a calculation of an asset’s value involves not what it “should” be worth, but what other people (and which others) will think it’s worth. On second thought, perhaps this finds an analog in the effect of the defense reacting to a shooter’s “streak.”

  28. The influence of non-quantifiable factors is present in every example (basketball, equity markets, etc.) Sometimes the variables are information-based, other times they are based on physics, or weather, or other factors. The important question relates to the time horizon of the analysis, and the relevance of a snapshot in time that is a short period of either normal or abnormal behavior. It is statistically possible to have a shooting streak or a “tails” streak, or a winning stock market streak, regardless of how low the numerical probability. Drawing inferences based on small sample sizes or anomalous events is a very common occurrence in the media, and particularly in the “hot topics” of behavioral economics.

    Only when one looks at very long periods of time can broad mathematical theories be inferred, particularly statistical theories. This is certainly the case with regard to basketball and also with regard to the stock market.

  29. Bayard, will you tell me your system?

    I’m here to tell y’all, there is such a thing as a “hot streak” in sports. You must know that. I’m no athlete, by any means. In the little bit of sports I have played, golf, intramural basketball, whatever, occasionally I’ve gotten “hot”. Suddenly, for a time, I can’t miss a putt under ten feet. It’s not a statistical accident either. I can SEE the line in a way I’m normally unable to. The least athletic amongst those posting here have also played sports about like I have, I imagine. You know that on occasion your probability for success changes for a time, either for the better or for the worse. It’s a feedback loop thingy, I imagine.

    If studies don’t show that, perhaps the fault lies with the methodology… or the interpretation. I imagine with studies and statistics the devil is in the interpretation a lot.

  30. Statistics does after all deal with probabilities, very instructively, and sometimes very beautifully, but still not with certainties. One uses the results to make judgments, and sometimes choices, but if one mistakes them for hard reality one is missing the point. Even if the significance level is <.001 there is still that almost-one-in-a-thousand chance the implied conclusion is wrong.

  31. When it comes to “hot hands” and investing, I prefer to be in the market during those times when even a dart-throwing chimp can make money, and in cash the rest of the time.

  32. “But if you look at all the lotteries and all the lottery winners everywhere, it would be shocking if you didn’t have repeat winners.”

    Hah! You claim to understand probability and randomness and then you post this? Physician, heal thyself!

    Is it “shocking” to have repeat winners? No. Would it be “shocking” if we did not have repeat winners? Decidedly NOT.

  33. I don’t think it’s as simple as that, as most countries / states have some sort of sales tax.

    I think 1/100 is the right odds, unless you are in a US state without sales tax in which case I’d expect to see a bias towards .99, .98, .97… I wonder what the average number of items purchased is?

  34. The problem here is that in most cases, the words “clutch” and “streaky” are not defined specifically enough.

    If you define them specifically enough, you will have fewer disagreements, because really most of the arguments are about what the words actually mean and imply.

  35. Are we sure the paper states specifically that the “hot hand” doesn’t exist? I thought it said that there was no conclusive evidence supporting the “hot hand.” Those are very different statements.

    My intuition says that the “hot hand” exists, but is more of a lukewarm hand, or perhaps more the lack of a “cold hand.” (The idea in the latter case being that you have an optimal “true” shooting accuracy, and as long as nothing is hampering that, like an injury or an illness–a cold hand, essentially–you shoot with that true shooting accuracy, and everything else is random variation.) The hot hand, assuming it exists, is probably pretty weak and easily swamped by other factors (such as an uptick in the shooter’s confidence leading to poorer shot selection); otherwise, it would be easy to discern.

    I also think it’s important to remember the true value of the paper–not its conclusions, because anyone can draw mistaken conclusions–but the actual experiments and measurements. Don’t throw the baby out with the bathwater.

    @Pyramid: The terms are ill-defined in large part because the phenomenon hasn’t been clearly identified in the data yet. Once you identify the phenomenon (again, assuming it exists), then you can better define it. But at this point it’s still in the “black cat in a dark room” stage.

  36. jonboinAR: It is indisputable that shooters FEEL hot. Whether or not they actually ARE hot hasn’t yet been settled (to my satisfaction, at least). And the easiest person to fool is yourself, which is why your own feeling of being hot is not compelling. No offense intended, ’cause it’s human nature, but objectively it’s not convincing. (For what it’s worth, I do feel my shooting get hot from time to time. And cold, too.)

    One thing that happens a lot is result merchanting. You of course intend to put the ball in the hole, so when it does get in the hole, it obviously did what you wanted. And when that happens several times in a row, the obvious explanation is that you’re zoned in, that your body is executing what you want it to do better than usual. In a certain sense, it clearly is, but the question (as always) is whether it’s doing that for some basal physiological reason, or whether it’s just random variation. The infuriating thing is that you aren’t a reliable judge of which it is, because your brain insists on using results as a diagnostic test.

  37. That system is the well-known Martingale, and for it to work (assuming even money bets), there must not be a house limit, or at least you can’t run into it. And if you play enough hands, you WILL run into it; it’s unavoidable. Google Martingale and betting to see why the notion that it avoids losing in the long run (in any real world casino) is a fallacy.

    So my guess is that if you aren’t losing money playing blackjack, you’re probably keeping track, consciously or subconsciously, of whether the deck is high or low. That may provide enough bias to prevent you from losing in the long run. You may feel it’s the Martingale, but it isn’t.

    Your second point is probably an understatement. Assuming that the last four digits are uniformly distributed, you only need to have about 118 listings on a page to get a 50-50 chance of a duplicate. Google birthday paradox.
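The figure of about 118 listings can be checked with the standard birthday-problem product. The sketch below assumes, as the comment does, that the last four digits are uniformly distributed over 10,000 values and independent across listings. The name `listings_for_duplicate` is illustrative.

```python
def listings_for_duplicate(n_values=10_000, target=0.5):
    """Smallest number of listings at which a shared value is at least `target` likely."""
    p_all_distinct = 1.0
    listings = 0
    while 1.0 - p_all_distinct < target:
        # The next listing must avoid every value already taken.
        p_all_distinct *= (n_values - listings) / n_values
        listings += 1
    return listings


needed_for_phone_page = listings_for_duplicate()    # 10,000 possible endings
needed_for_birthdays = listings_for_duplicate(365)  # classic birthday problem
```

With 365 possible values the function returns the familiar 23, and with 10,000 values it lands within one listing of the 118 quoted above, so any page with more listings than that is more likely than not to contain a match under these assumptions.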

  38. “The null hypothesis of streakiness”? Streakiness is a bias, and the null hypothesis is always the lack of an underlying bias; in order to provide compelling evidence that a bias exists, you must reject the null hypothesis that there is no such bias, that the observed deviations arise from random variation. The cited paper failed to reject the null hypothesis, and therefore fails to establish the existence of the hot hand. That is a very far cry from establishing the NON-existence of the hot hand.

    However, I do think that belief in the hot hand, in the continued absence of conclusive statistical evidence in support of it, is irrational–and I say that as someone who thinks that there probably is such a thing as a hot hand.

    I think that if you contend that paper “disproved the hot hand” by “rejecting the null hypothesis of streakiness,” you’ve probably misinterpreted the paper (or else misstated your understanding of it).
