Too Much Noise About Small Samples

Alberto Callaspo continues to challenge Rogers Hornsby's record .424 season batting average.
Christian Petersen

The samples we have for 2014 all have two qualities in common: They are really small and they are all we have. (I will now pause while you make a penis joke in your head.) Some of the early commentary has suggested that fans are treating small-sample results as gospel and that all the results so far this season should essentially be ignored.

I beg to differ. Context is everything. A sample of 30 ABs may be only about 10% as useful or reliable as 300 ABs, but it is also not 0% as useful. Nor are all 30-AB samples equal. The trick is to know what to make of a small sample, not whether to make anything of it at all.
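One rough way to put numbers on "less reliable but not useless" is to treat each AB as an independent trial with a fixed chance of a hit. That is a simplification, and the .260 "true" average below is a hypothetical value chosen for illustration, not any player's actual talent level. Under that assumption, the noise in an observed batting average shrinks with the square root of the number of ABs:

```python
import math

def batting_avg_standard_error(true_avg: float, at_bats: int) -> float:
    """Standard error of an observed batting average, assuming each AB is an
    independent trial with the same hit probability (a binomial model)."""
    return math.sqrt(true_avg * (1.0 - true_avg) / at_bats)

ASSUMED_TRUE_AVG = 0.260  # hypothetical talent level, for illustration only

se_30 = batting_avg_standard_error(ASSUMED_TRUE_AVG, 30)
se_300 = batting_avg_standard_error(ASSUMED_TRUE_AVG, 300)

print(f"Standard error after  30 ABs: {se_30:.3f}")
print(f"Standard error after 300 ABs: {se_300:.3f}")
print(f"Noise ratio (30 vs 300 ABs):  {se_30 / se_300:.2f}")
```

By this particular yardstick, 30 ABs is roughly three times as noisy as 300 ABs rather than ten times. It is a lot of noise, but it is a long way from telling you nothing.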

In past years, for example, I have often drawn "small sample conclusions" that proved to be accurate over the ensuing large sample. Michael Taylor's long swing, Hiroyuki Nakajima's weak arm and flawed bat, and Josh Donaldson's excellent swing all caught my attention at various spring trainings. My conclusions after a sample of 2 remained the same over samples 10 and 100 times as large. Of course, these were "process" assessments, not "Wow, he's 2 for 2 with a double!!!!" evaluations.

In contrast, after several hundred innings I'm still not sure what I think of A.J. Griffin. Chris Carter's first 30 ABs, like Donaldson's first 100, were extremely poor predictors of future results. Alberto Callaspo's .409/.480/.636 slash line is nothing more than statistical noise, just as Josh Reddick, no matter what you think of him, will NOT finish the season batting or slugging .103.

However, some commenters have blasted criticism of Reddick and Daric Barton on the basis that "If a player went 3 for 29 in June no one would notice." That's not true at all. You don't think A's fans are aware when a player is 3 for his last 29? You don't think fans notice when a player looks as lost at the plate as hitters generally do during any 2 for 20 or 3 for 29 funk?

Also, the slow -- and by slow I mean somewhere between "Bengie Molina slow" and "continental drift" -- starts by Reddick and Barton come on the heels of many fans having significant concerns about the hitting abilities of both. The reaction to 20-30 ABs is coming not from the small sample itself but from the fact that the available sample confirms, and then some, concerns that were already there. Those concerns rest on a far larger sample: more than one season of data, reflecting both stat-based and eyeball-based forebodings.

Furthermore, there is bad and then there's bad. When you are worried about a player's ability to hit going forward and then they get off to a "meh" .225/.273/.425 start -- which was exactly Yoenis Cespedes' slash line entering Saturday's game -- you get a measured response of "It's early; let's see how the stats look when they stabilize." When that slow start looks like a .100-.103 batting average, .100-.103 slugging, and in Reddick's case also a 42% K-rate, you get a different response to a sample of the same size.

Small samples come inherently with a great deal of statistical noise, which is why they have to be taken with many grains of salt. 20-30 ABs in, a batting average can literally double in a week -- but in the cases of Barton and Reddick, that doesn't mean that if it doubles it will be good.
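To see how easily an average can double this early, and how little that can mean, here is a small arithmetic sketch. It starts from a 3 for 29 line (.103) and assumes a week is worth about 25 more ABs; that 25 is my own round number, not a figure from any schedule, and the function name is just for illustration.

```python
def hits_needed_to_double(hits: int, at_bats: int, upcoming_at_bats: int) -> int:
    """Smallest number of hits over the upcoming ABs that at least doubles
    the current batting average.

    Solves (hits + h) / (at_bats + upcoming_at_bats) >= 2 * hits / at_bats
    for the smallest whole number h, using integer math to avoid rounding.
    """
    total_at_bats = at_bats + upcoming_at_bats
    # Ceiling division: total hits required to reach double the current average.
    total_hits_required = -(-(2 * hits * total_at_bats) // at_bats)
    return max(total_hits_required - hits, 0)

hits, at_bats = 3, 29   # the 3 for 29 start
week_at_bats = 25       # assumed ABs in one week (hypothetical)

needed = hits_needed_to_double(hits, at_bats, week_at_bats)
new_avg = (hits + needed) / (at_bats + week_at_bats)

print(f"Hits needed in the next {week_at_bats} ABs: {needed}")  # 9
print(f"Resulting average: {new_avg:.3f}")                       # 0.222
```

A 9 for 25 week is a hot week but hardly an impossible one, and it leaves the hitter at .222: double where he started, and still not good.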

In summary, to take 20-30 ABs and extrapolate from it what a hitter's final slash line will be is a fool's errand. But for a player about whom you already had grave doubts, to start off that badly, and in the process to confirm your fears and then some? I think 20-30 ABs, if sufficiently extreme, can absolutely be enough to place your already existing concerns up one notch on the worriometer.
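For the numerically inclined, "up one notch" can be sketched as a simple weighted average of what you believed before and what you have seen since. This is a beta-binomial style update that I am swapping in for illustration, not a projection system anyone actually uses here. The prior of a .230 hitter and the weight of 500 "pseudo ABs" are hypothetical values, as is the 9 for 40 line standing in for a "meh" .225 start.

```python
def updated_estimate(prior_avg: float, prior_weight_abs: float,
                     hits: int, at_bats: int) -> float:
    """Blend a prior belief about a hitter with new results.

    The prior counts as `prior_weight_abs` at-bats' worth of evidence at
    `prior_avg`; the new hits and at-bats are simply added on top
    (the posterior mean of a beta-binomial model).
    """
    prior_hits = prior_avg * prior_weight_abs
    return (prior_hits + hits) / (prior_weight_abs + at_bats)

PRIOR_AVG = 0.230         # hypothetical pre-season expectation
PRIOR_WEIGHT_ABS = 500.0  # hypothetical: how much evidence the prior is worth

# An extreme start (3 for 29) versus a merely "meh" one (9 for 40 = .225).
extreme_start = updated_estimate(PRIOR_AVG, PRIOR_WEIGHT_ABS, hits=3, at_bats=29)
meh_start = updated_estimate(PRIOR_AVG, PRIOR_WEIGHT_ABS, hits=9, at_bats=40)

print(f"Prior expectation:        {PRIOR_AVG:.4f}")
print(f"After a 3 for 29 start:   {extreme_start:.4f}")
print(f"After a 9 for 40 start:   {meh_start:.4f}")
```

Under these assumptions the extreme start drags the estimate down by about seven points of batting average, while the "meh" start barely moves it. The sample size is similar in both cases; what changes the reading is how extreme the results are and what you believed going in.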

Time will tell what all these guys can and cannot do, and over time all the stats will stabilize considerably towards the mean. But a little time has already told us something, and there's no reason to ignore small samples. Just know that they're small and put them in proper context.