Best and Worst Playoff Teams Since 1995, and how they did
So, this all started with my dislike for the "Playoffs are a crapshoot" argument. I don't like it, because
- I like the thought that the playoffs select the best team.
- I think that the playoffs aren't completely random...the better team has a better chance of winning
- I fear that thinking of the Playoffs as a crapshoot causes complacency.
The feedback I got is that those 3 arguments are semantics, obvious, and/or wrong. I'm perfectly fine with that. This was my gut feeling, and actually, I'm pretty much done with those arguments.
The best explanation for the "Playoffs are a crapshoot" argument, I thought, came from WhiteElephants (near bottom):
the best strategy might not necessarily be to put together the best team in any given season if it means sacrificing playoff berths in future seasons. I think the main point in Moneyball is that the best strategy seems to be: just get to the playoffs as often as possible.
Anyways, I agree with this strategy...definitely want to get into the playoffs. So why am I writing a diary? Well, I wanted to look into how well regular season performance predicted playoff outcomes.
First, I'm using run differential (RD) as my measure of regular season performance. The reason for that is that I think RD is a better measure of how "objectively good" a team is. It's also reasonably accessible. We want a measure of "objective goodness" because our putative goal is to see if good teams win in the playoffs.
Next, I'm using data from 1995-2008. This is how much data I entered before getting tired. There's nothing scientific about the bounds. For those keeping count, w/ 8 playoff teams a year, that's 14*8 = 112 Playoff teams.
So, I took the average of the regular season run differentials for the last 14 World Series winners and losers. What I found is that regular season run differential is NOT a good predictor of the outcome of the World Series. In fact, the loser averages a higher RD than the champ (132.4 vs. 131.3)! So the World Series is a coin flip...good.
I took similar averages for each level of the playoffs. Here's what I got:
Average run differential of teams who reached a given level of the Playoffs:
Outcome          Avg. RD
WS Winner        131.3
WS Loser         132.4
LCS Losers       113.
LDS Losers       107.4
Playoff Teams    114.6
The good news is, although the WS itself is a coin toss, WS participants do tend to have better RDs than playoff teams in general. In the earlier rounds, the teams that advance average a better RD than the teams they knock out: LDS losers average 107.4, LCS losers about 113, and the two WS teams about 132.
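For anyone who wants to redo this kind of averaging, here is a minimal Python sketch of the grouping step. It only uses the ten "best" teams listed later in this post (not the full 112-team data set, which isn't reproduced here), so its averages will not match the numbers above. The function name and the simplified outcome labels are my own, for illustration only.

```python
from collections import defaultdict

# (team, regular season run differential, furthest round reached).
# Rows are copied from the "Best Teams" table in this post; ALCS/NLCS and
# ALDS/NLDS are collapsed into "LCS Loss" / "LDS Loss" (a labeling assumption).
teams = [
    ("1998 Yankees", 309, "WS Champs"),
    ("2001 Seattle", 300, "LCS Loss"),
    ("1998 Houston", 254, "LDS Loss"),
    ("1998 Atlanta", 245, "LCS Loss"),
    ("2001 Oakland", 239, "LDS Loss"),
    ("1995 Cleveland", 233, "WS Loss"),
    ("1999 Arizona", 232, "LDS Loss"),
    ("2007 Boston", 210, "WS Champs"),
    ("1997 Atlanta", 210, "LCS Loss"),
    ("2002 Angels", 207, "WS Champs"),
]


def average_rd_by_outcome(rows):
    """Return {outcome: mean run differential} for the given rows."""
    totals = defaultdict(lambda: [0, 0])  # outcome -> [sum of RD, team count]
    for _team, rd, outcome in rows:
        totals[outcome][0] += rd
        totals[outcome][1] += 1
    return {outcome: rd_sum / count for outcome, (rd_sum, count) in totals.items()}


averages = average_rd_by_outcome(teams)
for outcome, mean_rd in averages.items():
    print(f"{outcome:10s} {mean_rd:6.1f}")
```

Feeding in all 112 playoff teams instead of these ten rows would be the way to reproduce the averages quoted above.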
Ok, so next, I thought it'd be interesting to see who the "best" and "worst" playoff teams were, and how they fared:
Worst Teams to make the Playoffs.
Team RD OUTCOME
10. 2006 Oakland 44 ALCS Loss
9. 2003 Twins 43 ALDS Loss
8. 2003 Cubs 42 NLCS Loss
7. 1995 LA Dodgers 25 NLDS Loss
6. 2006 St. Louis 19 WORLD SERIES CHAMPS
5. 1995 Colorado 2 NLDS Loss
4. 1997 Giants -9 NLDS Loss
3. 2007 Arizona -20 NLCS Loss
2. 2005 San Diego -42 NLDS Loss
1. 1998 Cubs -100 NLDS Loss
So, the sixth worst playoff team in the last 14 years won the World Series. Good! The playoffs aren't a crapshoot at all! In fact, these ten teams won the division series 4/10 times. Sounds like a crapshoot. Haha.
So let's see how the best teams did.
Best Teams RD Outcome
- 1998 Yankees 309 WS Champs
- 2001 Seattle 300 ALCS Loss
- 1998 Houston 254 NLDS Loss
- 1998 Atlanta 245 NLCS Loss
- 2001 Oakland 239 ALDS Loss
- 1995 Cleveland 233 WS LOSS
- 1999 Arizona 232 NLDS Loss
- 2007 Boston 210 WS Champs
- 1997 Atlanta 210 NLCS Loss
- 2002 Angels 207 WS Champs
Yes folks, those 2001 A's were actually really good. Going by the table, this group won 7/10 Division Series, 4/7 Championship Series, and 3/4 World Series. As a whole, those numbers seem good. So if you're a top 10% regular season team (this is the top ten out of 112 playoff teams), this SMALL SAMPLE SIZE says you have a 30% shot at the title. Remember, this is not really predictive, it's what happened.
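The round-by-round records can be tallied straight from the outcome column of the table above. This is a small Python sketch of that bookkeeping; the function name is made up for illustration, and it assumes every team's outcome string ends in "DS Loss", "CS Loss", "WS Loss", or "WS Champs" (compared case-insensitively).

```python
# Furthest round reached by each of the ten best teams, in table order.
outcomes = [
    "WS Champs", "ALCS Loss", "NLDS Loss", "NLCS Loss", "ALDS Loss",
    "WS Loss", "NLDS Loss", "WS Champs", "NLCS Loss", "WS Champs",
]


def round_records(outcomes):
    """Return {round: (series won, series played)} implied by final outcomes."""
    labels = [o.upper() for o in outcomes]
    lost_ds = sum(o.endswith("DS LOSS") for o in labels)
    lost_cs = sum(o.endswith("CS LOSS") for o in labels)
    lost_ws = sum(o == "WS LOSS" for o in labels)
    champs = sum(o == "WS CHAMPS" for o in labels)

    played_ds = len(labels)            # every playoff team plays a Division Series
    played_cs = played_ds - lost_ds    # DS winners move on to the LCS
    played_ws = played_cs - lost_cs    # LCS winners move on to the World Series
    assert played_ws == champs + lost_ws, "outcome labels do not add up"

    return {
        "DS": (played_cs, played_ds),
        "CS": (played_ws, played_cs),
        "WS": (champs, played_ws),
    }


records = round_records(outcomes)
for rnd, (won, played) in records.items():
    print(f"{rnd}: {won}/{played}")
```

The same function works on the "worst teams" list or on any ten-team slice of the ranking.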
Lastly, I broke down the outcomes by RD rank in groups of ten (it was a histogram, but I don't want to have to upload graphics, so I'll do a table).
- Teams 1-10: 3 WS titles, 1 WS loss, 3 CS losses, 3 DS losses
- Teams 11-20: 1 WS title, 2 WS losses, 1 CS loss, and 6 DS losses
- Teams 21-30: 1 WS title, 3 WS losses, 1 CS loss, and 5 DS losses
- Teams 31-40: 1 WS title, 0 WS losses, 2 CS losses, and 7 DS losses
- Teams 41-50: 0 WS titles, 1 WS loss, 5 CS losses, and 4 DS losses
- Teams 51-60: 1 WS title, 1 WS loss, 6 CS losses, and 2 DS losses
- Teams 61-70: 1 WS title, 1 WS loss, 0 CS losses, and 8 DS losses
- Teams 71-80: 1 WS title, 2 WS losses, 3 CS losses, and 4 DS losses
- Teams 81-90: 1 WS title, 1 WS loss, 1 CS loss, and 7 DS losses
- Teams 91-100: 2 WS titles, 1 WS loss, 2 CS losses, and 5 DS losses
- Teams 101-112: 1 WS title, 0 WS losses, 4 CS losses, and 7 DS losses
Hope this all didn't bore you guys too much, but as you suspected all along, it appears the playoffs are pretty damn close to a crapshoot.
Comments
Didn't the 2007 Red Sox win the World Series?
It's not the results, it's how you look going about those results -- Tim McCarver
Yes, that was a data entry error
Thanks, it’s fixed now.
The question that immediately leaps into my mind, reading this, is:
how the f*** did the 1998 Cubs make the playoffs?
I think the more general point here is that it’s not worth building your team to project to more than about 95 wins. You pay a premium for those additional wins because they’re so hard to jam into a roster, and what you get back for them doesn’t make up for it. If you’re fortunate enough to have ended up with more than that amount of talent, you’re actually somewhat justified in “selling” during the offseason!
Your 2008 Athletics: It's Nothing Personal.
Good point.
What alarms me here is that only 30% of the 10 best teams in the last 14 years have won the World Series. Eesh. Look at 1998. 3 of the best 10 teams come from the same YEAR. Must have been a lot of weak teams back then.
"To this day and dating back 25 years, before every game he plays, Henderson stands completely naked in front of a full length locker room mirror and says, "Ricky’s the best," for several minutes."
by VORP is too nerdy on Dec 11, 2008 12:05 PM PST
That might be part of it.
But that’s only 2 out of 30 teams.
"To this day and dating back 25 years, before every game he plays, Henderson stands completely naked in front of a full length locker room mirror and says, "Ricky’s the best," for several minutes."
by VORP is too nerdy on Dec 11, 2008 3:38 PM PST
Thanks for the nod...
I’m glad you put the work in, I find it pretty interesting. I wonder though about comparing all the teams in one environment (not sure if that’s the right word, couldn’t think of something better).
If you're looking for the odds for any given year that the best team will win, I think it would be best to compare that team solely with the other seven teams of that year. I think right now your numbers might be a little skewed by the fact that three of your top four teams came from the same year and therefore at least two of them will definitely not win the WS.
For example: in your top ten there are only 7 different years of WS to win, so even if the best team won the WS every single year in the history of baseball, your numbers would say they only won 70% of the time.
This is true.
It’s tough to design a methodology for comparison within one year. I can’t think of anything more meaningful than just listing the data.
One design I considered, which I thought would be much better (and then didn’t do because it was too much work) was to go series by series, and see how well the difference between two teams’ RD’s predicted the winner of the series. So, the predictive variable would be x = RD(team A) – RD(team B)
In cases where x is large, one would hope team A wins most of the time. Then you could categorize P(series win) for 0<x<10 (toss-up), 10<x<20, 20<x<30 and so on. I think the best part of this method is that it really leverages the sample size, since we’d have 98 series (7 per year over 14 years), and probably enough in each bin of run differential.
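Roughly what that design could look like in Python, as a sketch only. The series rows below are HYPOTHETICAL placeholders (not real results), and the function name and the choice to flip each matchup so the higher-RD team is always the "favorite" are my own assumptions.

```python
from collections import defaultdict

# HYPOTHETICAL rows for illustration: (RD of team A, RD of team B, series winner).
series = [
    (150, 145, "B"),
    (100, 108, "B"),
    (200, 185, "A"),
    (90, 115, "A"),
    (309, 284, "A"),
]


def favorite_record_by_rd_gap(series, bin_width=10):
    """Bin series by the size of the RD gap x = RD(A) - RD(B).

    Returns {bin_start: (series won by the higher-RD team, series in bin)}.
    """
    bins = defaultdict(lambda: [0, 0])
    for rd_a, rd_b, winner in series:
        x = rd_a - rd_b
        if x == 0:
            continue  # no favorite when the run differentials are equal
        favorite = "A" if x > 0 else "B"
        bin_start = (abs(x) // bin_width) * bin_width
        bins[bin_start][1] += 1
        if winner == favorite:
            bins[bin_start][0] += 1
    return {start: tuple(counts) for start, counts in sorted(bins.items())}


for start, (won, played) in favorite_record_by_rd_gap(series).items():
    print(f"gap {start}-{start + 10}: favorite won {won}/{played}")
```

With real data, the bins would probably need to be wider than 10 runs to keep a usable number of series in each one.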
I don’t know how clear that was…it’s hard to type this kind of stuff.
I get what you're saying...
and I understand that one could spend a lot of time trying to get some accurate numbers on the last fourteen years of playoffs. Your work is appreciated.
A very basic way to test within one year is just find out how many times out of the last 14 years the best team actually won, divide it by 14 and you’ve got a percentage. This way might be easy enough to go back to the first 8-team playoffs for a larger sample.
by WhiteElephants on Dec 11, 2008 4:00 PM PST
but that's not qualitative
Run Differentials are harder to find for earlier years. I had to use RS and RA to calculate RD from 1995-2001 since I couldn’t find RD.
If the data is available, your design would be good, I think.
I’m working on something else now, so I’m not going to do it…
Good work, but a few other factors
This is all interesting, but it doesn’t go very far toward showing that the playoffs are a crapshoot. Yes, of course, any team can beat any team in a short series, but there are reasons to expect one team to beat another.
A few flaws with just using run differential as a proxy for team quality:
1) It ignores homefield advantage, which I think is a significant factor in the postseason (and a reason to want to maximize wins above PT’s 95 number). This is particularly important when looking at the World Series because A) homefield advantage isn’t determined by team quality, and B) a run differential in one league is not necessarily equal to a run differential in another league due to disparities in league quality.
2) Schedule differences. The AL West, for example, had the best record of any division in the AL every year from 2000-2007, which made it harder for an AL West team to rack up run differential than it would be for an AL Central team.
3) Raw run differential isn’t as good of an indicator as Pythag—a team that outscores its opponents 800-600 should be expected to have a better record than a team that outscores its opponents 1000-780, even though the latter has a better run differential.
4) Teams change in quality over the season with trades, promotions, and individual improvements/declines.
5) Player usage is different in the playoffs than in the regular season. In the playoffs, you don’t generally use your 5th starter or the back end of your bullpen, for example. You also don’t use reserves to rest starting position players.
We all know that a short series is a small sample—which leaves the outcome more subject to randomness. But we also know that it’s baseball being played, and that the better team generally wins.
In short, if your analysis is showing you that the better team doesn’t usually win, you’re probably using the wrong measure to determine who the better team is.
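Point 3 in the list above is easy to check with a quick calculation. This Python sketch assumes the classic Pythagorean expectation with an exponent of 2 (other exponents are in use, so treat that as an assumption), and the function name is just for illustration.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expected winning percentage: RS^e / (RS^e + RA^e)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# The two example teams from point 3.
team_a = pythag_win_pct(800, 600)    # run differential +200
team_b = pythag_win_pct(1000, 780)   # run differential +220

print(f"800-600:  RD {800 - 600:+d}, expected win pct {team_a:.3f}")
print(f"1000-780: RD {1000 - 780:+d}, expected win pct {team_b:.3f}")
```

Under that assumption, the 800-600 team projects to the better record even though its raw run differential is 20 runs smaller.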
The "home field" team only wins five series in every nine
and it’s rare indeed that the series is going to come down to home field advantage. Generally speaking being on your home field is worth around half a run for a single game (home teams win about 55% of the time). So having one extra home game in a series has a pretty minute impact on the win probabilities in the series.
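To put a rough number on that, here is a Python sketch that computes the chance of winning a series for two otherwise evenly matched teams, where the only edge is the 55% home win rate mentioned above. The 2-3-2 and 2-2-1 home/away patterns and the function name are assumptions for illustration.

```python
from functools import lru_cache


def series_win_prob(home_pattern, p_home=0.55):
    """Probability that the team with home-field advantage wins the series.

    home_pattern is a string of 'H' (home) and 'A' (away) from that team's
    point of view, one character per scheduled game, e.g. "HHAAAHH".
    Assumes evenly matched teams, so whoever is at home wins with p_home.
    """
    wins_needed = len(home_pattern) // 2 + 1

    @lru_cache(maxsize=None)
    def prob(game, wins, losses):
        if wins == wins_needed:
            return 1.0
        if losses == wins_needed:
            return 0.0
        p = p_home if home_pattern[game] == "H" else 1.0 - p_home
        return (p * prob(game + 1, wins + 1, losses)
                + (1.0 - p) * prob(game + 1, wins, losses + 1))

    return prob(0, 0, 0)


best_of_7 = series_win_prob("HHAAAHH")  # assumed 2-3-2 format
best_of_5 = series_win_prob("HHAAH")    # assumed 2-2-1 format

print(f"best of 7: {best_of_7:.3f}")
print(f"best of 5: {best_of_5:.3f}")
```

Both values come out only slightly above 0.5, which fits the point that one extra home game barely moves the series odds.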
The rest of the difference in performance is presumably caused by the home teams being, on average, better clubs. Not much of an edge there. There are no “1 vs. 16” matchups in the baseball playoffs— every team has a substantial chance of losing every series. A 105-win juggernaut will go down to an 83-win mediocrity that squeaked in from a weak division a non-negligible fraction of the time.
His analysis basically looks logical— there’s a clear gradient downward from the best teams to the worst ones in terms of playoff performance. It’s just not a very steep gradient. Certainly not steep enough to justify trading long-term assets for short-term upgrades (eg this year’s Mark Teixeira trade).
Your 2008 Athletics: It's Nothing Personal.
parsing through this...
A few flaws with just using run differential as a proxy for team quality:
1) It ignores homefield advantage, which I think is a significant factor in the postseason (and a reason to want to maximize wins above PT’s 95 number). This is particularly important when looking at the World Series because A) homefield advantage isn’t determined by team quality, and B) a run differential in one league is not necessarily equal to a run differential in another league due to disparities in league quality.
2) Schedule differences. The AL West, for example, had the best record of any division in the AL every year from 2000-2007, which made it harder for an AL West team to rack up run differential than it would be for an AL Central team.
It would be impossible to design this study to adjust for home-field. I’d have to come up with an adjustment factor, which would require a much better study than this one.
Same for #2…
3) Raw run differential isn’t as good of an indicator as Pythag—a team that outscores its opponents 800-600 should be expected to have a better record than a team that outscores its opponents 1000-780, even though the latter has a better run differential.
I wish I had done pythag. However, data entry was hard enough here. RD values have been available directly since 2001. Pythag would require twice the data entry, since I’d have to enter RS and RA separately. I still haven’t figured out how to get CSV data sets. Baseball Prospectus’ custom statistics reports never seem to work.
4) Teams change in quality over the season with trades, promotions, and individual improvements/declines.
Probably not too substantially. I guess a superstar would mean 2-3 Wins, which over a series, may not be too big. Also, impossible to account for…tracking acquisitions = too much.
5) Player usage is different in the playoffs than in the regular season. In the playoffs, you don’t generally use your 5th starter or the back end of your bullpen, for example. You also don’t use reserves to rest starting position players.
THIS IS A GOOD ARGUMENT. I like it. I made a similar one when arguing the playoffs are not a crapshoot. I might try to find a way to test this…but didn’t think of one last night. Plenty of regular season RD can be explained by horrendous #5 starters…This would deflate (or inflate, in the case of a great #5 starter) regular season RD without impacting playoff win odds whatsoever.
Definitely one reason why we didn’t see much predictive value in the above data.
as far as a crapshoot goes
is it a crapshoot if a team that is only a little inferior to the best wins?
perhaps another way of looking at it is this: how often does a vastly inferior team upset the odds and win? i mean, it’s not much of a shock if one of five fairly similarly rated teams comes through, even if they’re not statistically the best over the course of a season, but it would be a surprise if a team that’s easily the worst wins. and if they do this consistently, then, yeah, it’s a crapshoot. or at least, what i’d describe as a crapshoot.
BB should send scouts to watch cricket players.
by alea iacta est on Dec 11, 2008 1:53 PM PST
In 2006...
there were 15 teams in baseball better(by pythag) than the WS champs.
by WhiteElephants on Dec 11, 2008 3:11 PM PST
right...
but – 2006 looks like a one-off, to me, no? remember that there’s always going to be chance random fluctuations, so from time to time, we should see the worst team win.
BB should send scouts to watch cricket players.
by alea iacta est on Dec 11, 2008 3:35 PM PST
ya, I'm not a big fan of just citing the Cards
noise will happen.
sure...
for the last two years I’ve just been agonizing over the fact that a below average team won the WS and needed to write it down. This seemed like a good place.
haha I get it...
makes sense. It was cruel watching Detroit losing that series. OTOH, our 2006 team was probably not ALCS quality, so can’t complain about that playoffs.
once again, the results of a crapshoot are neither randomly distributed nor binary
I'll send you a postcard from Space Mountain. @('.')@
ok
First of all, I don’t know if you read the fanpost, but I actually just did all this work to disprove my own original position. So your acting as though I’m being repetitive (“once again”) is cruelly ironic…you’re the one repeating without listening.
On the actual substance of your (argument?) claim…
The outcome of a series IS binary (W/L).
The Outcome of playoffs, (IE WS Champ) IS binary…you are champ, or you are not.
You defined crapshoot as a normal distribution…
1. You are the only person I have ever seen define it that way. Urban Dictionary has it as ‘toss-up, roll of the dice’, while Random House (via dictionary.com) has it as “anything random or risky”.
2. That’s not the definition people use when they deploy the argument regarding the playoffs. What they’re saying is that you can’t build a team to win the playoffs, just make it and hope. If crapshoot meant normal distribution, that would mean teams SHOULD go for upgrades even if they are going to make the playoffs either way, since raising their mean talent level would raise the probability of winning it all. Your definition is the EXACT opposite of what they mean in this context. They mean that improving your talent (thereby increasing the “mean” of the normal distribution) is pointless, since the playoffs are random.
in case people were wondering
a lot of my post was responding to this:
that’s why they say “crapshoot” and not “coin flip”
coin flip: binary outcome
crapshoot: rough bell curve distribution of outcomes
That’s Monkeyball on the MikeA DLD, and it’s what he’s referring to with “once again”
