I realize this is a long and rather heady post - especially if you're not much into math. I posted this recently to Tom Tango's blog as a comment to an **article about statistical significance**. I dragged it over here for safekeeping.

* * * * *

I apologize if I’m butting in the middle of a conversation, but there’s been an issue that’s been bugging me for years and it seems appropriate here.

When I was in college (some 30 years ago), one of my statistics professors scolded me (as best I can remember), “You baseball guys have it all wrong. You treat a set of baseball statistics as if they are a random sample and run all of your significance tests. Baseball statistics are not a random sample of anything - they are a full and complete set of data. To analyze them using statistical techniques is a misuse of the science.” Being as stubborn as I was back then, I disagreed with him. But as the years have passed, I never forgot what he told me, and now in my middle-aged years I have come to believe that what he said has some merit.

Take clutch hitting. Suppose Larry bats 17 for 50 (.340) in clutch situations and 100 for 400 (.250) in non-clutch situations. A standard two-sided statistical test will tell you that the difference between these two proportions is not significant at either a 95% or 90% confidence level; therefore you cannot conclude with a high degree of certainty that Larry is a better hitter in the clutch.
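For anyone who wants to check the arithmetic on Larry, here is a minimal sketch of one common "standard statistical test": a pooled two-proportion z-test with a two-sided p-value. The choice of this particular test and the function name `two_proportion_z_test` are my assumptions for illustration; other tests (Fisher's exact test, for example) would give somewhat different numbers.

```python
import math

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Pooled two-proportion z-test (illustrative helper, not from any library)."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis that both rates are equal
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal: P(|Z| > |z|)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Larry: 17 for 50 in the clutch, 100 for 400 otherwise
z, p_value = two_proportion_z_test(17, 50, 100, 400)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Under these assumptions z comes out to roughly 1.37 and the two-sided p-value to roughly 0.17, which is above both the 0.05 and 0.10 cutoffs.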

But the fact is that Larry WAS a better hitter in clutch situations - a MUCH better hitter. Those 17 hits he got probably won his team a few games that they otherwise would not have won. That is very significant. To apply a statistical test to this data and conclude that we cannot prove that Larry is a good clutch hitter is to say that there are other at-bats that Larry had that we don’t know about, and that if you measured those, the difference in his average might not be as great. This simply isn’t true.

Another analogy would be the U.S. Senate voting on a bill. If 51 Senators vote in favor of the bill and 49 vote against, a statistical test would tell you that there is no significant difference between the proportions of Senators who favor and oppose the bill. This is nevertheless a very significant outcome, because there are no other Senators to ask. You can say with 100% certainty that more Senators are in favor of the bill than not, just like you can say with 100% certainty that Larry hit better in the clutch than he did otherwise.
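The Senate example can be sketched the same way. The snippet below treats the 100 votes as if they were a sample and runs an exact two-sided binomial test against a 50/50 split, then simply counts the votes as a complete census. The helper name `binomial_two_sided_p` and the choice of an exact binomial test are assumptions made for illustration.

```python
from math import comb

def binomial_two_sided_p(successes, n):
    """Exact two-sided p-value against a 50/50 null (illustrative helper)."""
    # Under a 50/50 null the distribution is symmetric, so double the larger tail
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

yes_votes, no_votes = 51, 49

# "Sample" view: is 51 of 100 distinguishable from an even split?
p_value = binomial_two_sided_p(yes_votes, yes_votes + no_votes)
print(f"two-sided p = {p_value:.2f}")

# "Census" view: every Senator voted, so the comparison is just a count
majority = yes_votes > no_votes
print(f"bill has a majority: {majority}")
```

The test reports a p-value of roughly 0.92, nowhere near significant, while the census view says without any uncertainty that the bill has a majority.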

When we analyze random samples of data, we do so to predict that which we cannot measure, or is impractical to measure, or is too expensive to measure. If we have a machine that fills boxes with breakfast cereal, we don’t tear open every box at the end of the assembly line to be sure they are filled properly. We only test a sample and then assume that the others have similar characteristics. But in baseball we do measure everything. EVERYTHING. There is no data that is unknown. When we apply statistical techniques to this type of data, we are only analyzing events that will never occur. Perhaps a fool's errand.

* * * * *

A couple of posters jumped all over this saying things like the set of at bats a player gets is a random sample of all of the at bats he *could have* gotten from an infinite set of possible at bats. Or this:

If the 50 clutch AB’s (in which the hitter got 17 hits) ARE the complete data set, then there can’t be any other elements in that set, and therefore any clutch AB that arises later must be from another set, right? In which case the prior data has no relevance, right? In which case, what value does the designation of this hitter as a clutch hitter have? None. --Greg R

I think these guys are wrong. I also think that their opposition to my point of view is in part their trying to justify what they've spent much of their lives doing.
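To be fair to the other side, here is a rough sketch of what their "infinite set of possible at bats" view computes. It assumes, purely for illustration, that Larry's true ability is the same .250 in every at bat and that at bats are independent, then asks how often such a hitter would get 17 or more hits in 50 tries. The function name `prob_at_least` and the .250 baseline are my assumptions, not anything the posters specified.

```python
from math import comb

def prob_at_least(hits, at_bats, true_avg):
    """Exact binomial probability of `hits` or more in `at_bats` (illustrative helper)."""
    return sum(
        comb(at_bats, k) * true_avg ** k * (1 - true_avg) ** (at_bats - k)
        for k in range(hits, at_bats + 1)
    )

# Assumed model: a true .250 hitter, 50 independent clutch at bats
chance = prob_at_least(17, 50, 0.250)
print(f"P(17+ hits in 50 AB for a true .250 hitter) = {chance:.3f}")
```

Under that model the chance works out to roughly one in ten. Whether that model applies to a complete record of what actually happened is exactly the point in dispute.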

Mathematical statistics are central to my day-job career of market research. In it, I've seen a lot of abuse of statistical techniques. Anyone with a computer can run a regression analysis, but as with many tools, in the wrong hands they can do more harm than good. Chainsaws come to mind.

If there are any math gurus reading (John R), I'd love to get your take.

UPDATE: There's a vigorous discussion going on about this topic in the comments to the **original post**, including a mea culpa by me for calling analysts who I respect "wrong". I should have simply said that I disagree with them. My bad.
