Do Different Umpires Really Make a Difference?

Update (9:39 AM): If you haven't seen salb918's Fanpost yet, Tom Tango is taking a playing time survey and he needs your help! Go here: ( As of this writing, Oakland is tied with Florida for least responses.


Last week, baseballgirl brought up a Beyond the Box Score article on grading umpires. In that thread, the idea of correcting for umpire bias was raised. Now, I don't have a data set large enough to even begin thinking about creating an umpire-adjusted pitching statistic, but what I can do is see if it's really necessary.

In the article that baseballgirl linked to, Jeff Zimmermann used two methods of umpire analysis and took the average of both results to determine an umpire rating that compares each umpire to the league-average umpire at calling strikes and balls. In his results, 0% is considered league-average, with positive and negative percentage numbers representing pitcher-friendly and hitter-friendly game calling, respectively. Do note that Zimmermann restricted his study to umpires who accumulated at least 3750 plate appearances over the last three years. That's a huge sample, and I applaud Jeff for having the patience to wrestle with numbers of that scale.

Just as an experiment, I grabbed the five most-used A's starting pitchers from last year and added up the number of plate appearances each of them racked up. Then, for each pitcher, I simply took the average of the umpire ratings that Jeff Zimmermann calculated, weighting each umpire's rating by the number of plate appearances the pitcher faced in front of him. The results are below.


Those numbers are awfully close to 0%, close enough that it's fair to assume that while a single game may be affected by a particularly pitcher- or hitter-friendly umpire, those effects tend to average out over the length of a season.
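For anyone who wants to try this at home, the plate-appearance-weighted average described above can be sketched in a few lines of Python. This is only a rough illustration: the function name `weighted_umpire_rating` is mine, and the numbers in `example_starts` are made-up placeholders, not the actual plate appearance totals or Zimmermann's ratings.

```python
def weighted_umpire_rating(starts):
    """Average umpire rating for one pitcher, weighted by plate appearances.

    `starts` is a list of (plate_appearances, umpire_rating_pct) pairs,
    one pair per start. Ratings use the 0% = league-average scale.
    """
    total_pa = sum(pa for pa, _ in starts)
    if total_pa == 0:
        raise ValueError("need at least one plate appearance")
    # Each start's umpire rating counts in proportion to the batters faced.
    return sum(pa * rating for pa, rating in starts) / total_pa


# Hypothetical placeholder values, NOT real data from the article.
example_starts = [(27, 1.5), (24, -2.0), (30, 0.5)]
season_rating = weighted_umpire_rating(example_starts)
print(f"Season umpire rating: {season_rating:+.2f}%")
```

Weighting by plate appearances rather than by starts means a two-inning disaster in front of an extreme umpire counts for less than a complete game in front of an average one.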

The A's travel to Camelback Ranch in Glendale, AZ to tango with the Chicago White Sox at 1:05 PM. Don't touch that dial.


Odds and Ends

  • Even if I assumed that Jeff Zimmermann's method of grading umpires is perfectly accurate, I'd still run into a million problems attempting to build an umpire-adjusted pitching statistic from it. First off, I'd need a huge set of data to draw from. I can't look at a single pitching performance and assume that it's representative of a pitcher's true talent, because a sample that small is far too likely to be skewed by luck. A larger data set helps in that regard because most luck-driven events average out, leaving behind the true talent. Now imagine how many other variables there are when trying to isolate umpiring and its effect on pitching performances. The data set required would be unreasonably large, and I only have one season for most of these guys.
  • Second reason why an umpire-adjusted pitching statistic would be problematic: Gio Gonzalez had one of his best starts last year (September 25: 6.1IP, 4H, 0R, 1BB, 7K) under the supervision of Tim McClelland, whom Jeff Zimmermann pegged as the most hitter-friendly umpire in baseball. Umpire bias isn't exactly the most influential factor in pitching performance. Take a look at this chart of Trevor Cahill's starts. The statistical correlation between umpire factor and strikes thrown is minimal at best. In other words, umpiring is a very small part of being able to throw strikes, dwarfed by a multitude of other factors.
  • A note about the numbers: the plate appearance totals in the table above are a little lower than the actual number of plate appearances each pitcher faced. I had to throw out a start or two from every pitcher because the umpire they drew that day didn't have a large enough body of work to make Jeff Zimmermann's cutoff of 3750 PAs over three years.
  • Seriously, go take a look at Beyond the Box Score. All of SBNation's non-team-specific baseball blogs are fantastic (like John Sickels' Minor League Ball and Kyle Boddy's Driveline Mechanics), but Beyond the Box Score boasts a phenomenal group of statistics-focused writers who are great at what they do.
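One more for the spreadsheet crowd: the correlation check mentioned in the Trevor Cahill bullet can be reproduced with a plain Pearson correlation between each start's umpire rating and the strike percentage in that start. The sketch below is a generic implementation, not necessarily the exact calculation behind the chart; `pearson_r` is a name I picked, and the two lists are hypothetical stand-ins rather than Cahill's real numbers.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT real data from the article.
umpire_ratings = [1.5, -2.0, 0.5, -0.5, 2.0]      # umpire rating per start, in %
strike_pcts = [61.0, 63.5, 58.0, 60.5, 59.5]      # strike percentage per start
r = pearson_r(umpire_ratings, strike_pcts)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

A value of r near 0 (and so an r² near 0) would say the umpire rating explains almost none of the start-to-start variation in strikes thrown, which is the point of that bullet.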