Like many of my FanPosts, this one began as a comment on another thread, in this case PL78's What to do with Mark Ellis?. But it has taken me a day to formulate my thoughts, and that post has already fallen off the list — and besides that, it had some squabbling that I'm happy to leave behind — so I'll start again with a fresh post.
In case you missed it, PL78 started a conversation about whether to trade Ellis and introduced it with some thoughts of his own. Grover opined that that introduction was error-ridden and poorly thought-out. When asked for examples of errors, one thing grover cited was PL's characterization of Ellis as "he of sparkling defense". In fact, grover said, Ellis's defense has been mediocre for the past year and a half and anyone who had done his homework should have known that.
This took me by surprise because I didn't know that Ellis's defense has been mediocre. I don't follow the stats as closely as the smart guys here, and as a non-resident fan who has had real world distractions this year I've missed a lot of games, but I don't consider myself clueless either. When I read "sparkling defense" in PL's post, that seemed perfectly reasonable to me. After all, Ellis does have good defense, doesn't he?
Well, I'll be darned. It turns out he doesn't. Like a good muppet, grover provided some numbers. Following up on them on FanGraphs, I see that it's true. Ellis's defense really has been mediocre since 2009.
This disturbs me. Why didn't I know this? That he plays (played) great defense is pretty much the one thing I do know about Ellis, and the reason I know (knew) it is that not too long ago it was all anyone would talk about. There were whole articles written about how fantastically awesome Ellis's defense is, and not just by A's-loving homers like us. The classic was this column on FanGraphs, written shortly after the A's re-signed Ellis for 2009 and 2010. Here Dave Cameron is so impressed by Ellis's great defense that he says for the A's to pay him only $5.5 million a year is "one of the best free agent bargains in the history of baseball", and Ellis "should have gotten about three times what he signed for." This article was widely cited around the Internet, including on AN, with some even suggesting that the players union should have stepped in and insisted Ellis ask for more.
UZR and WAR
Let's back up and look at the numbers. In 2008, the year the "great bargain" column was based on, Ellis had 14.7 UZR and 2.9 WAR. In 2009 he had 1.5 UZR and 1.2 WAR. For the first half of 2010 he has -1.0 UZR and 0.5 WAR.
For my readers not fluent in stat-speak, UZR stands for "ultimate zone rating" and it is one of the leading metrics that attempt to measure defensive skill. A high number is good, and if you make double-digits you're really good. A negative number means you're bad. So what UZR is telling us is that Ellis was awesomely awesome in 2008, but he was mediocre in 2009, and in 2010 his defense is actually kinda bad.
WAR stands for "wins above replacement". The WAR formula attempts to gather up everything that is both relevant and measurable and bundle it up into one single number that tells you how good a player is. The unit of choice is in the name. "Replacement" is defined as crappy roster filler that you can get essentially for free off the waiver wire. So if a guy has a WAR of 2.9 for the season, that means if you didn't have him and were forced to play crappy roster filler instead, you'd have won 2.9 fewer games that season. WAR numbers are designed to be additive, so if instead of that 2.9 WAR guy you had a 5.0 WAR guy, you would have won 2.1 more games (i.e., 5.0 minus 2.9).
Different stat-makers have different WAR formulas. The most commonly cited one is FanGraphs'. You can read more about FanGraphs' formulation in a series of articles starting here. Baseball-Reference has its own formula; it's the same basic idea but they measure things a little differently. Both are attempting to communicate the same thing: how good a player is in terms of wins above replacement.
WAR attempts to include all skills, and one of those is defense. The FanGraphs formula uses UZR to measure defensive value. Roughly speaking, 10 points of UZR translate to 1 point of WAR. That means that about half of Ellis's value in 2008 came from his defense, and roughly 80% of the decline in his value from 2008 to 2009 is due to the decline in his defense.
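For anyone who wants to check that arithmetic, here is a quick back-of-the-envelope sketch in Python. It assumes nothing beyond the rough "10 UZR equals 1 WAR" rule of thumb and the Ellis figures quoted above; the variable and function names are just made up for illustration, and this is not the actual FanGraphs formula.

```python
# Rule of thumb from the post (an approximation, not the real FanGraphs math):
# roughly 10 points of UZR are worth 1 win above replacement.
UZR_PER_WIN = 10.0

# Ellis's figures as quoted in the post: season -> (UZR, WAR)
ellis = {2008: (14.7, 2.9), 2009: (1.5, 1.2)}


def defensive_wins(uzr):
    """Convert UZR into approximate wins using the rule of thumb."""
    return uzr / UZR_PER_WIN


uzr_08, war_08 = ellis[2008]
uzr_09, war_09 = ellis[2009]

# Share of Ellis's 2008 value that came from defense.
defense_share_2008 = defensive_wins(uzr_08) / war_08

# Share of the 2008-to-2009 drop in WAR explained by the drop in UZR.
war_drop = war_08 - war_09
defense_drop = defensive_wins(uzr_08) - defensive_wins(uzr_09)
decline_share = defense_drop / war_drop

print(f"Defense share of 2008 WAR: {defense_share_2008:.0%}")
print(f"Share of WAR decline due to defense: {decline_share:.0%}")
```

This comes out to about 51% and 78%, which is where "half" and "roughly 80%" come from.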
This brings me close to the point of this post, but before I go there, let's take a look at Ryan Sweeney. During the 2009 season we heard a whole lot about how awesomely awesome Ryan Sweeney's defense was, and there were several articles reminiscent of the FanGraphs article about Ellis. For example, in a post titled Ryan Sweeney is Better Than You Think He Is, right here on AN, DFA concluded:
Sweeney gets a bad rap because it is hard to see his defensive value while his offensive value is just average. Moving forward it is clear that Ryan Sweeney is likely to be an excellent player.
According to his FanGraphs page, in 2009 Sweeney had 20.6 UZR and 3.8 WAR. For the first half of 2010 he has -2.6 UZR and 0.5 WAR. Sweeney's UZR has plummeted even more dramatically than Ellis's. Whereas Ellis went from awesomely awesome to bad in a year and a half, Sweeney has gone from even more awesomely awesome to even more bad in just half a year.
To give credit where it is due, many writers — including both Dave Cameron and DFA — gave the proper caveat that defensive metrics are imperfect and must be taken with a grain of salt. But these two examples require a whole lot of salt to explain. The numbers aren't just a little erratic from year to year, they're completely slip-on-a-banana-peel-and-fall-on-your-ass dead wrong.
Here's what concerns me. People are using UZR and WAR in order to project the future. When Dave Cameron said that Ellis's contract was "one of the best free agent bargains in the history of baseball", he wasn't talking about what Ellis was paid for 2008; he was talking about the new contract for 2009 and 2010. And he got it wrong. When DFA said Ryan Sweeney is better than we think, he wasn't just saying Sweeney was good in 2009; he said "moving forward" Sweeney is likely to be an excellent player. And moving forward, he's not.
This troubles me because I believed them both. I believed that Ellis's contract was a great deal, and I believed that Sweeney would continue to be good. I'm skeptical about WAR, sure, and even more skeptical about any defensive metric, but even I thought there was some predictive value in it. Now I'm not so sure. Are Ellis and Sweeney just two extreme cases? Because if they aren't, then there's something seriously wrong with UZR. Either it fails to accurately measure what it seeks to measure, or else it does accurately measure something which has no predictive value.
I'm posing the question as a poll because polls are fun and everyone can participate, but I assume 90% of those who answer will be like me: you really don't have a clue and you're just taking a guess based on gut feeling. If you're one of the few who does have an educated opinion and can back up your answer with logical argument, please do so in the comments. Because I'd really like to know the answer.
I'd like to know because I want to know whether I should call bullshit on the inevitable next round of proclamations of who is better than we think, based on WAR. Just today I saw a comment in one of the threads that Pennington is actually the most valuable member of the A's right now, on account of his superior WAR. If I look on Cliff's FanGraphs page, I see that his UZR was -6.0 in 2008 and -4.5 in 2009, but now it's 1.6 so far in 2010. In other words, he was consistently kind of bad, but now suddenly he's kind of good. What should I make of that? Is this just a random fluctuation that makes his WAR figure unreliable, or did Cliff genuinely figure it out and become good at defense? And even if he did, what's to say he won't genuinely unfigure it out and be bad again next year? Or next month, for that matter?
What's the deal with Ellis's and Sweeney's gigantic drops in UZR?
Defensive metrics are lousy. UZR is not a reliable measure of defensive skill. (37 votes)
UZR does measure defensive skill, but defensive skill fluctuates wildly and has little predictive value. (55 votes)
UZR is a good metric and has strong predictive value generally; Ellis and Sweeney are just flukey exceptions. (13 votes)
105 total votes