Sample Size
Description:
So we have all of these statistics, but when can we use them? Suppose a player goes three for three in their first game in the big leagues. Should we expect this player to continue batting 1.000 for the rest of the season? Of course not, that’d be silly. Three at-bats is way too small a sample to draw conclusions about a player, but then we’re left with the question: at what point do statistics become reliable? For the answers, see below:
Offense Statistics:
- 50 PA: Swing%
- 100 PA: Contact Rate
- 150 PA: Strikeout Rate, Line Drive Rate, Pitches/PA
- 200 PA: Walk Rate, Ground Ball Rate, GB/FB
- 250 PA: Fly Ball Rate
- 300 PA: Home Run Rate, HR/FB
- 500 PA: OBP, SLG, OPS, 1B Rate, Popup Rate
- 550 PA: ISO
Pitching Statistics:
- 150 BF: K/PA, Ground Ball Rate, Line Drive Rate
- 200 BF: Fly Ball Rate, GB/FB
- 500 BF: K/BB, Popup Rate
- 550 BF: BB/PA
In case it’s not obvious, you can tell a lot more about a hitter from one year of data than you can about a pitcher. All this data is from research that Pizza Cutter conducted, which can be found in the links below. If a statistic is not included, that means it did not stabilize over the intervals that Pizza Cutter tested (which went up to 750 PA / BF).
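For anyone who wants to check a player's sample against these thresholds, here is a minimal sketch of the lists above as a lookup table. The numbers are copied straight from the lists; the dictionary and function names are made up for illustration and are not part of Pizza Cutter's research.

```python
# Stabilization thresholds copied from the lists above.
# Keys are plate appearances (hitters) or batters faced (pitchers).
# The names below are illustrative, not from the original research.
OFFENSE_THRESHOLDS = {
    50: ["Swing%"],
    100: ["Contact Rate"],
    150: ["Strikeout Rate", "Line Drive Rate", "Pitches/PA"],
    200: ["Walk Rate", "Ground Ball Rate", "GB/FB"],
    250: ["Fly Ball Rate"],
    300: ["Home Run Rate", "HR/FB"],
    500: ["OBP", "SLG", "OPS", "1B Rate", "Popup Rate"],
    550: ["ISO"],
}

PITCHING_THRESHOLDS = {
    150: ["K/PA", "Ground Ball Rate", "Line Drive Rate"],
    200: ["Fly Ball Rate", "GB/FB"],
    500: ["K/BB", "Popup Rate"],
    550: ["BB/PA"],
}


def stabilized_stats(sample_size, thresholds):
    """Return the stats whose threshold is at or below sample_size."""
    stats = []
    for minimum in sorted(thresholds):
        if sample_size >= minimum:
            stats.extend(thresholds[minimum])
    return stats


# A hitter with 120 PA has only cleared the first two thresholds.
print(stabilized_stats(120, OFFENSE_THRESHOLDS))
```

A statistic missing from the returned list is not necessarily useless at that sample size; it just has not reached the threshold listed above.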
Also, a quote worth remembering: “In small sample sizes, a good scout is ALWAYS better than stats.”
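To put a rough number on how little three at-bats can tell us, here is a small sketch. It assumes every at-bat is an independent trial with the same chance of a hit (a simple binomial model, which real at-bats only approximate), and the .300 true-talent hitter is a hypothetical chosen for illustration.

```python
import math


def batting_average_standard_error(true_average, at_bats):
    """Standard error of an observed batting average under a binomial model."""
    return math.sqrt(true_average * (1 - true_average) / at_bats)


def chance_of_perfect_start(true_average, at_bats):
    """Probability of getting a hit in every one of the first at_bats."""
    return true_average ** at_bats


# Hypothetical .300 true-talent hitter (an assumption, not data).
TRUE_AVERAGE = 0.300

for at_bats in (3, 50, 500):
    error = batting_average_standard_error(TRUE_AVERAGE, at_bats)
    print(f"{at_bats:>3} AB: standard error of about {error:.3f}")

print(f"Chance of a 3-for-3 debut: {chance_of_perfect_start(TRUE_AVERAGE, 3):.3f}")
```

Under these assumptions the uncertainty after three at-bats is nearly as large as the batting average itself, and a perfectly ordinary hitter goes three for three every so often just by chance.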
Links for Further Reading:
525,600 Minutes: How Do You Measure a Player in a Year? – Statistically Speaking / Pizza Cutter
On the Reliability of Pitching Stats – Statistically Speaking / Pizza Cutter
The Law of Large Numbers.
Great site, just found it.
I would add, though I don’t have time to locate it, that tangotiger once studied when a pitcher’s BABIP can be called statistically significantly lower than the .300 mean that most pitchers regress to under DIPS. From what I recall, he found that it takes a starting pitcher seven full seasons to accumulate enough balls in play for that claim to hold up.
That, of course, means that the vast majority of relievers will never reach that number of balls in play. The exceptions are the few, like Eckersley, who were starting pitchers first and had already compiled a lot of BIP.
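For a back-of-the-envelope feel for why this takes so long, here is a rough sketch using a normal approximation to the binomial. This is not tangotiger's method, which is not reproduced here. The function name, the 1.96 cutoff (roughly a 95% two-sided test), and the example true BABIP values are all assumptions chosen for illustration.

```python
import math


def balls_in_play_needed(true_babip, league_babip=0.300, z=1.96):
    """Balls in play at which the gap between true_babip and league_babip
    equals z standard errors, treating each ball in play as an
    independent trial (normal approximation to the binomial)."""
    gap = abs(league_babip - true_babip)
    if gap == 0:
        raise ValueError("true_babip must differ from league_babip")
    variance = league_babip * (1 - league_babip)
    return math.ceil(z ** 2 * variance / gap ** 2)


# Hypothetical pitchers whose true BABIP sits a little below .300.
for true_babip in (0.290, 0.285):
    needed = balls_in_play_needed(true_babip)
    print(f"true BABIP {true_babip:.3f}: about {needed} balls in play")
```

Because the required sample grows with the inverse square of the gap, small real differences from .300 need thousands of balls in play before they stand out from noise.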