Having fun with Pythagoras, or, who got lucky during 2009
Well, another season has come and gone, without the good guys winning again. Being a numbers-guy, I'd like to take a look at how the teams in MLB performed against their Pythagorean records (in case you don't know what I'm talking about, and haven't backed out of this post, look here). I'm especially interested in who under- and over-performed against their Pythag records, and want to take a stab at figuring out why.
All of the team data in the graphs and tables below came from baseball-reference.com or fangraphs.com.
First of all, let's take a look at how the teams in MLB performed this past season (the AL West teams are in bold).
| Team | Rs | Ra | W% | PW% | Pythag diff |
| TOR | 798 | 771 | 0.463 | 0.516 | -8.55 |
| CLE | 773 | 865 | 0.401 | 0.449 | -7.70 |
| WSN | 710 | 874 | 0.364 | 0.406 | -6.78 |
| OAK | 759 | 761 | 0.463 | 0.499 | -5.80 |
| ATL | 735 | 641 | 0.531 | 0.562 | -5.09 |
| ARI | 720 | 782 | 0.432 | 0.462 | -4.89 |
| PIT | 636 | 768 | 0.385 | 0.415 | -4.75 |
| BAL | 741 | 876 | 0.395 | 0.424 | -4.69 |
| LAD | 780 | 611 | 0.586 | 0.610 | -3.80 |
| NYM | 671 | 757 | 0.432 | 0.445 | -2.10 |
| TBR | 803 | 754 | 0.519 | 0.529 | -1.66 |
| CHC | 707 | 672 | 0.516 | 0.523 | -1.24 |
| CHW | 724 | 732 | 0.488 | 0.495 | -1.19 |
| KCR | 686 | 842 | 0.401 | 0.407 | -0.99 |
| STL | 730 | 640 | 0.562 | 0.560 | 0.30 |
| MIN | 817 | 765 | 0.534 | 0.530 | 0.60 |
| PHI | 820 | 709 | 0.574 | 0.566 | 1.28 |
| BOS | 872 | 736 | 0.586 | 0.577 | 1.53 |
| SFG | 657 | 611 | 0.543 | 0.533 | 1.63 |
| TEX | 784 | 740 | 0.537 | 0.526 | 1.72 |
| MIL | 785 | 818 | 0.494 | 0.481 | 2.05 |
| CIN | 673 | 723 | 0.481 | 0.467 | 2.30 |
| COL | 804 | 715 | 0.568 | 0.553 | 2.34 |
| DET | 743 | 745 | 0.528 | 0.499 | 4.70 |
| LAA | 883 | 761 | 0.599 | 0.568 | 5.05 |
| FLA | 772 | 766 | 0.537 | 0.504 | 5.42 |
| HOU | 643 | 770 | 0.457 | 0.418 | 6.24 |
| SDP | 638 | 769 | 0.463 | 0.415 | 7.71 |
| NYY | 915 | 753 | 0.636 | 0.588 | 7.71 |
| SEA | 640 | 692 | 0.525 | 0.464 | 9.78 |
| average | | | 0.500 | 0.500 | 0.038 |
| sigma | | | 0.070 | 0.060 | 4.868 |
What I've labeled as 'Pythag diff' is how many more (or fewer) games the team won as compared to their Pythag record, most easily calculated by this: Pythag diff = Games played × (W% - PW%). Games played is, of course, 162, unless you are the Cubs or the Pirates (161) or the Tigers or the Twins (163). A team with a positive difference outperformed its Pythag record, etc. The average Pythag diff is roughly 0 (as it should be) with a standard deviation of 4.87. I have not rounded off any of these numbers in the analysis, even though, of course, it is impossible to record a fraction of a win.
Two teams that really stand out here are the Blue Jays, who outscored their opponents by 27 runs, but won only 75 games, and the Mariners, who were outscored by 52 runs (almost 1/3 of a run per game), yet managed to win 85 games. Also, the A's should've won 81 games, but only posted 75 real wins.
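For anyone who wants to check the table, here is a minimal Python sketch of the calculation. The exponent of 1.83 is an assumption: the post does not state which exponent was used, but 1.83 reproduces the PW% column above more closely than a plain exponent of 2. The function names are made up for illustration.

```python
def pythag_pct(runs_scored, runs_allowed, exponent=1.83):
    """Pythagorean winning percentage (the 1.83 exponent is an assumption)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythag_diff(wins, games, runs_scored, runs_allowed, exponent=1.83):
    """Actual wins minus Pythagorean wins: games * (W% - PW%)."""
    win_pct = wins / games
    return games * (win_pct - pythag_pct(runs_scored, runs_allowed, exponent))


# 2009 Mariners and Blue Jays, using the runs and win totals quoted above.
sea_diff = pythag_diff(85, 162, 640, 692)
tor_diff = pythag_diff(75, 162, 798, 771)
print(f"SEA: {sea_diff:+.2f}  TOR: {tor_diff:+.2f}")
```

Both results should land close to the table's Pythag diff values for those two teams.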
Tables of numbers are fine, but it is instructive to look at the distributions. First, here are how actual winning percentages looked.
The distribution is skewed towards the winning side - there were 16 winning teams and 14 losing teams this past season (meaning, there were a few really bad teams, like the Nats, Orioles, Indians, and Pirates).
Here's how the Pythagorean winning percentages looked.
Now, this isn't even remotely normally distributed. At least the count is correct - there are 15 winners and 15 losers. But wow, that is a nasty-looking histogram.
Lastly, here is how the Pythag diff is distributed.
This one would look better if not for that large number of teams that were between 6 and 4 wins below their Pythag records. Note here that we have 14 under-performers, and 16 over-achievers.
One of my pet beliefs (meaning not necessarily quantifiable) is that good teams have a tendency to make their own luck (conversely, bad teams like the A's play just good enough to lose). Of our 16 winning teams, 12 exceeded their Pythag records. So, does winning percentage correlate at all with Pythag diff?
There is something there, although clearly there are some other factors involved (an r-squared of 0.29 means that 29% of the variation in the Pythag diff data can be explained by the winning percentage).
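The r-squared figure can be recomputed from the table. This is a sketch, not the method actually used for the post: it assumes a simple linear (Pearson) correlation and works from the rounded W% values in the table, so the result may differ slightly from 0.29.

```python
# (W%, Pythag diff) pairs copied from the table above, in table order.
TEAMS = [
    (0.463, -8.55), (0.401, -7.70), (0.364, -6.78), (0.463, -5.80),
    (0.531, -5.09), (0.432, -4.89), (0.385, -4.75), (0.395, -4.69),
    (0.586, -3.80), (0.432, -2.10), (0.519, -1.66), (0.516, -1.24),
    (0.488, -1.19), (0.401, -0.99), (0.562, 0.30), (0.534, 0.60),
    (0.574, 1.28), (0.586, 1.53), (0.543, 1.63), (0.537, 1.72),
    (0.494, 2.05), (0.481, 2.30), (0.568, 2.34), (0.528, 4.70),
    (0.599, 5.05), (0.537, 5.42), (0.457, 6.24), (0.463, 7.71),
    (0.636, 7.71), (0.525, 9.78),
]


def r_squared(xs, ys):
    """Square of the Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


win_pct = [w for w, _ in TEAMS]
diffs = [d for _, d in TEAMS]
print(f"r-squared = {r_squared(win_pct, diffs):.2f}")
```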
It may be instructive to look at the distribution of game results (net scores per game, so a ho-hum 14-13 victory is the same as a 3-2 walk-off winner) for individual teams. I went ahead and did this for the AL West, where we have one losing team (guess who) which under-performed by 5.8 wins, and three winning teams that over-performed by 1.72 wins (Texas), 5.05 wins (LA of Anaheim), and 9.78 wins (Seattle). These next four graphs are all plotted on the same scale for easy comparison.
A's
Mariners
One standout feature of these four plots is the record in one-run games. The A's were bad in these games with a 15-23 record. Texas was 19-18 in one-run games, the Angels were 27-18, and the Mariners were 35-20. Seeing this, perhaps we should look at how one-run game record impacts Pythag diff.
Side note: Seattle played 55 one-run games - that's one third of the regular season. If you've ever wondered why the LL guys have a tendency to go off the deep end in their game threads, stop wondering.
OK, now we're cooking with gas. About half of the variation in Pythag diff can be explained by one-run winning percentage. Of course, teams with good one-run records also tend to have good overall records.
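For just the four AL West teams, the pattern is easy to see by lining up one-run winning percentage against Pythag diff. This is only a sketch using the records quoted above; four teams are far too few to say anything about the league-wide correlation.

```python
# (one-run wins, one-run losses, Pythag diff) for the AL West, from the text and table above.
AL_WEST = {
    "OAK": (15, 23, -5.80),
    "TEX": (19, 18, 1.72),
    "LAA": (27, 18, 5.05),
    "SEA": (35, 20, 9.78),
}


def one_run_pct(wins, losses):
    """Winning percentage in one-run games."""
    return wins / (wins + losses)


rows = [(team, one_run_pct(w, l), diff) for team, (w, l, diff) in AL_WEST.items()]

# Sort by one-run W% and print it next to Pythag diff.
for team, pct, diff in sorted(rows, key=lambda r: r[1]):
    print(f"{team}: one-run W% {pct:.3f}, Pythag diff {diff:+.2f}")
```

Sorting by one-run winning percentage puts the four teams in the same order as sorting by Pythag diff.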
Some other factors that could possibly influence under/over performing are:
- Head-to-head matchups between two teams. For example (and I hate bringing this up), the Mariners were 14-5 against the A's this season. In 2006 when the A's last won the West, they outperformed Pythag by 8 wins, and went 17-2 against Seattle (that still amazes me). Last season, when the A's were only one game below Pythag, they went 10-9 against Seattle, and were 25-24 in one-run games. Toronto was also 6-12 against the Yankees despite only being out-scored by 5 runs in those 18 games.
- Defense. Seattle was the best team in the majors at defensive run prevention (as measured by ERA-FIP, as listed here), while Cleveland was second worst (only the Royals were worse, and they also under-performed against their Pythag record). I did try a correlation here, but the r-squared value was only 0.1 or so. The A's ERA-FIP was +0.19, which should not surprise anyone considering we had Adam Kennedy most of the season "playing" 3B, and the occasional Cust adventure in the OF.
In conclusion, if a team wants to consistently out-perform Pythagoras, they should:
- Be good at winning baseball games,
- Be good at winning one-run games,
- Totally own some team that you play a lot,
- Be very good at converting balls in play into outs, and
- Get lucky.
Your thoughts?
Comments
You've inverted the causality in this part:
One of my pet beliefs (meaning not necessarily quantifiable) is that good teams have a tendency to make their own luck (conversely, bad teams like the A’s play just good enough to lose). Of our 16 winning teams, 12 exceeded their Pythag records. So, does winning percentage correlate at all with Pythag diff?
Winning teams aren’t making their luck— luck is making winning teams. (The best example of this is, of course, Seattle…)
Linda's in the cold ground, won't see her anymore
Somewhere out on the highway tonight, the drunken engines roar
It's just one of those things, one of those things
-- Al Stewart, "Accident on 3rd St."
In memory of Nick Adenhart and all victims of drunk driving
by PaulThomas on Nov 7, 2009 10:52 PM PST reply actions 0 recs
It does seem more or less true for Scioscia's Angels, right?
Over the last couple years they’ve been good at scoring those one or two runs they really need in tight ballgames. Hence they tend to consistently look “lucky” when compared to their Pythagorean.
by DDroney on Nov 8, 2009 2:01 AM PST up reply actions 0 recs
I think it's both...
Luck makes winning teams make their own luck…
or
Winning teams make their own luck which attracts even more luck…
Which came first, the chicken or the egg? Yes!
"Flea Markets aren't just for blind dates anymore!"- The Reverend Billy Lard
by Gaijin_Suketto on Nov 9, 2009 3:07 PM PST up reply actions 0 recs
Ah, the old correlation/causality conundrum

"Loyal? I'm the most loyal player money can buy." - Don Sutton
by vignette17 on Nov 8, 2009 4:02 AM PST up reply actions 1 recs
Three statisticians went duck hunting
Upon spying a duck, the first one shot, but missed a yard high.
The second one then shot, but missed a yard low.
Exclaimed the third; “we got him!”
by bobnothing on Nov 8, 2009 2:05 PM PST up reply actions 0 recs
LOL!!!
My religion is A'slamic.
by WhoNeedsReligionWhenYaGotBaseball on Nov 8, 2009 6:40 PM PST up reply actions 0 recs
total agreement from me
Winning close games is pretty much the definition of beating your pythagorean record. And since the pythag winning percentages range from 0.4 to 0.6 or so (because even the best baseball teams lose 35-40% of the time), beating your pythagorean record by a few games is a very good way to finish over .500.
I dig the histograms doctorK! (scatter plots too)
by colin on Nov 8, 2009 8:26 AM PST up reply actions 0 recs
You are probably right
One thing that would be interesting to do would be to see if there are any teams who consistently out-perform Pythagoras over several seasons. The easiest example are the Angels, who have been +5, +12, +4, and +5 their last four seasons, but before that they were not exceptionally lucky either way. Also, the typical standard deviation is about 4-5 games each year (I only took the time to do the last three seasons), so being +4 or +5 is not particularly notable. Clearly, this year’s Mariners at +10, last year’s Angels at +12, and 2007’s D-Backs at +11 were, in fact, quite notable.
I’m also not surprised you made the first comment on this post.
Hey Al, just go away, baby.
by doctorK on Nov 8, 2009 12:41 PM PST up reply actions 0 recs
it would be pretty easy to clear up this bias
with some Monte Carlo. And pythagorean wins are a simple enough statistic that it could probably be handled analytically.
I’ve been wanting to do a fANpost for a while on the mathematics of the pythagorean wins formula, but the thesis is taking precedence right now.
by colin on Nov 8, 2009 1:01 PM PST up reply actions 0 recs
and by taking precedence
I mean ruining my life, of course.
by colin on Nov 8, 2009 1:01 PM PST up reply actions 0 recs
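The Monte Carlo idea raised above could be sketched roughly like this in Python. Everything here is an assumption for illustration: runs per game are drawn from independent Poisson distributions (real run scoring is not Poisson), the team is exactly average at 4.6 runs scored and allowed per game, tied scores are simply redrawn, and the Pythagorean exponent is 1.83. The point is only to see how much Pythag diff varies from luck alone.

```python
import math
import random
import statistics


def poisson(rng, mean):
    """Draw one Poisson-distributed integer (Knuth's algorithm)."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_pythag_diff(rng, rs_per_game, ra_per_game, games=162, exponent=1.83):
    """Simulate one season and return actual wins minus Pythagorean wins."""
    wins = runs_for = runs_against = 0
    for _ in range(games):
        scored = allowed = 0
        while scored == allowed:  # no ties in baseball: redraw the game
            scored = poisson(rng, rs_per_game)
            allowed = poisson(rng, ra_per_game)
        wins += scored > allowed
        runs_for += scored
        runs_against += allowed
    pw = runs_for ** exponent / (runs_for ** exponent + runs_against ** exponent)
    return wins - games * pw


rng = random.Random(2009)  # fixed seed so the run is repeatable
season_diffs = [simulate_pythag_diff(rng, 4.6, 4.6) for _ in range(300)]
print(f"mean diff  = {statistics.mean(season_diffs):+.2f}")
print(f"sigma diff = {statistics.stdev(season_diffs):.2f}")
```

If the simulated sigma comes out near the 4 to 5 games seen in the real data, that would suggest most of the spread in Pythag diff needs no explanation beyond chance.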
Been there - done that
The last three months of 1990 and first four months of 1991 are only a blur today – in fact, they were a blur then.
Hey Al, just go away, baby.
by doctorK on Nov 8, 2009 1:05 PM PST up reply actions 0 recs
Simple solution: Quit school and write a fanpost for AN
I taked that aproach years’ ago and I’m still prety educatified.
I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal
by Nico on Nov 8, 2009 1:38 PM PST up reply actions 0 recs
It's "edumacated" Nico
"Flea Markets aren't just for blind dates anymore!"- The Reverend Billy Lard
by Gaijin_Suketto on Nov 9, 2009 3:09 PM PST up reply actions 0 recs
Same principle as "saxomophone"
It pays to study etymology.
"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan
by iglew on Nov 9, 2009 4:48 PM PST up reply actions 0 recs
Ha ha ha
You mean “etology.”
I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal
by Nico on Nov 9, 2009 7:42 PM PST up reply actions 0 recs
that word is so totally chromulent.
"Flea Markets aren't just for blind dates anymore!"- The Reverend Billy Lard
by Gaijin_Suketto on Nov 9, 2009 11:21 PM PST up reply actions 0 recs
The extra H embiggens the word.
"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan
by iglew on Nov 10, 2009 12:05 AM PST up reply actions 0 recs
This would be a more interesting question:
Does Pythagorean differential positively correlate with third-order record? (You can find third-order record at baseballprospectus. Basically, for those who aren’t familiar with the stat, it’s a linear-weights estimate of how many runs a team should have scored and allowed, adjusted for strength of schedule, and then converted into an expected win/loss record.)
Third-order record is (mostly) luck-independent.
Linda's in the cold ground, won't see her anymore
Somewhere out on the highway tonight, the drunken engines roar
It's just one of those things, one of those things
-- Al Stewart, "Accident on 3rd St."
In memory of Nick Adenhart and all victims of drunk driving
by PaulThomas on Nov 8, 2009 1:01 PM PST up reply actions 0 recs
For some reason
I couldn’t find 3rd order wins on that site anymore. How did the A’s do? And what about the Angels?
"If Bowden was a general contractor, he'd build houses with nine bedrooms, six garages, no bathrooms, and half a roof."
by DyeLongJustice on Nov 8, 2009 3:12 PM PST up reply actions 0 recs
so theoretically
we are still the worst team in the division? Well, at least we are close enough to luck into the playoffs.
"If Bowden was a general contractor, he'd build houses with nine bedrooms, six garages, no bathrooms, and half a roof."
by DyeLongJustice on Nov 8, 2009 4:23 PM PST up reply actions 0 recs
Yeah, ironically it doesn't change anything about the order of finish
just tightens the gaps.
Weirdly, Seattle was both “lucky” in the Pythagorean sense and also hugely “unclutch” in terms of runs scored relative to linear weights, so the two virtually cancel each other out.
Linda's in the cold ground, won't see her anymore
Somewhere out on the highway tonight, the drunken engines roar
It's just one of those things, one of those things
-- Al Stewart, "Accident on 3rd St."
In memory of Nick Adenhart and all victims of drunk driving
by PaulThomas on Nov 8, 2009 4:52 PM PST up reply actions 0 recs
I believe TB was that way in 2008
"Loyal? I'm the most loyal player money can buy." - Don Sutton
by vignette17 on Nov 9, 2009 3:47 PM PST up reply actions 0 recs
Pretty graphs
What program did you use? Excel? I always am trying to make my graphics prettier.
Hopefully I’ll be able to bust out some new ones in my newest FP in my series (ETA next week). I’ve been working on adding some new toys and haven’t had much time to write. Very interesting fanpost btw.
"Loyal? I'm the most loyal player money can buy." - Don Sutton
by vignette17 on Nov 8, 2009 4:06 AM PST reply actions 0 recs
I'm curious too
at first I thought it was gnuplot, but now I’m not so sure
by colin on Nov 8, 2009 8:28 AM PST up reply actions 0 recs
Graphics were generated from RS/1
RS/1 is an old BBN software package that we still have on our network at my place of employment. It makes really nice graphs and also has great software for experimental design and analysis.
Hey Al, just go away, baby.
by doctorK on Nov 8, 2009 12:17 PM PST up reply actions 0 recs
Although RS/1 specifically is no longer supported
it’s one of many variations of the more generic R, which is freely available. You can download it at this site, which also has a bit of explanation of exactly what R is. In brief, it is a programming language designed for statistical work.
R is a standard tool in the statistics world. If you have any aspiration of getting involved in the business, or even if you’re just an enthusiastic amateur who isn’t afraid of some simple coding, it’s well worth your while to learn it. R is a large and powerful system so it would be quite a project to really know all its abilities well, but to learn enough to create some basic charts isn’t so hard.
"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan
by iglew on Nov 8, 2009 12:33 PM PST up reply actions 0 recs
R is pretty cool
pet project of Cal profs…
by ohmangoAs on Nov 10, 2009 1:02 AM PST up reply actions 0 recs
Being a non numbers guy
I will just copy off PT’s paper.
But the cool graphs look like rollercoasters, and that makes me happy.
If any of this helps the A’s win, then thanks for posting!
"Tonto think Billy Beane need to make team full of squirrels and bears."
by OptimistPrime on Nov 8, 2009 8:11 AM PST reply actions 0 recs
Bell curves are like dinosaurs:
They’re small on one end, then much, much bigger in the middle, and then small again at the far end. This is my theory that it is, and it’s mine.
-Anne Elk.
I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal
by Nico on Nov 8, 2009 9:03 AM PST up reply actions 0 recs
Carl Everett doesn't believe in bell curves.
Somebody actually saw Adam and Eve. No one ever saw a bell curve.
"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan
by iglew on Nov 8, 2009 12:36 PM PST up reply actions 0 recs
It was the snake.
What a standard deviant that snake was.
I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal
by Nico on Nov 8, 2009 1:39 PM PST up reply actions 0 recs
1-run games
I’ve always wondered if it’s proper to treat all 1-run games as the same. Do teams tend to do as well in 1-0 games as they do in 5-4 or 8-7 games? It seems to me that teams with better offenses might be a bit better at winning the high-scoring one-run games because they have a better ability to push across a few extra runs than does a team with a poor offense, but I don’t think I’ve ever seen someone break down games in quite that way.
I dig the graphs too, very nice!
There is no gravity - the earth just sucks.
by JLeverenz on Nov 8, 2009 11:48 AM PST reply actions 0 recs
Actually, to build on that
As far as I know, the Pythagorean calculation doesn’t take into account the distribution of the scoring; in other words, how consistently teams give up large totals, and how consistently they score big themselves. If I score, say, 4,4,4,4, or 7,7,1,1, then the total runs scored over the four games is the same (obviously, this makes no analytical sense over four games, but you see where I’m going).
Do teams that are the most consistent in run scoring / prevention do best? Or is it the ones that put up big scores in some games, and then blow it in others? Is there a ‘minimum level’ (4.5 runs, say) at which consistency is a good thing (and conversely, below which it’s a bad thing?)
I think I’m wondering whether a low standard deviation from the median number of runs scored per game is a good thing, or a bad thing, and how much that relates to what the median value of runs per game actually is.
These thoughts are a little blurred, and I’m not sure how much it makes sense, but it’s been something that’s been ticking over in my head since the AS break last year, when the Giants were overperforming their Pythagorean level, with a rather unusual team.
Maybe I’ll see if I can find some data on this, unless someone knows that the answers to this are already out there.
by bobnothing on Nov 8, 2009 7:33 PM PST up reply actions 0 recs
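The 4,4,4,4 versus 7,7,1,1 example above can be worked through in a few lines of Python. The fixed opponent scores of 3 and 5 runs per game are hypothetical, picked only to show how the same run total can produce different win totals depending on where the opponent's level sits.

```python
def wins_against(scores, opponent_runs):
    """Count games won when the opponent scores opponent_runs in every game."""
    return sum(s > opponent_runs for s in scores)


steady = [4, 4, 4, 4]   # consistent scoring, 16 runs total
streaky = [7, 7, 1, 1]  # feast or famine, also 16 runs total

for opp in (3, 5):  # hypothetical opponent scoring levels
    print(
        f"opponent scores {opp} every game: "
        f"steady wins {wins_against(steady, opp)}, "
        f"streaky wins {wins_against(streaky, opp)}"
    )
```

In this toy case consistency helps when the steady level is above what the opponent scores and hurts when it is below, which is the 'minimum level' idea in the comment.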
distributions
Pythagorean wins assumes that teams don’t control their scoring distribution. Your offense has some skill level and, assuming that the team is out trying to score as many runs as possible in every game, then the runs scored distributions should converge to some common shape with a large enough sample.
I’m not really sure how well that particular hypothesis has been tested, but I think it’s pretty essential for the Pythag wins formula.
by colin on Nov 9, 2009 7:20 AM PST up reply actions 0 recs
to put it another way
there is a built-in assumption that only one number (i.e. runs scored per game) is sufficient to characterize a team’s ability to score runs (and the same for run prevention)
by colin on Nov 9, 2009 7:22 AM PST up reply actions 0 recs
Oh yeah - I don't believe it's possible to be able to control the consistency of runs scored
But that doesn’t mean that it’s not a statistic that reveals something. What, I don’t know.
And yeah – like you said, I’ve not seen any testing of the hypothesis you state above; it’d be interesting to do so.
I was thinking more about this last night – to do so, I’d need access to every game score over a number of seasons – is there a downloadable spreadsheet of this sort of thing?
by bobnothing on Nov 9, 2009 8:50 AM PST up reply actions 0 recs
Try the team schedule pages at baseball-reference.com
Example: A’s 2009
If you click the ‘CSV’ tab, then the table will be converted to comma-delimited, which is then easy to parse out in Excel.
Hey Al, just go away, baby.
by doctorK on Nov 9, 2009 9:01 AM PST up reply actions 0 recs
There is a perfectly reliable and achievable way
that any team could use to beat its Pythagorean record. It would simply have to play most games like normal but then any time it falls far behind, intentionally give away 30 or 40 runs in a game they’re going to lose anyway.
Do this and your W-L record will be well in excess of your Pythag record.
My point in making this absurd observation is to illustrate that beating one’s Pythagorean record is neither an accomplishment nor a worthwhile goal.
"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan
by iglew on Nov 8, 2009 12:46 PM PST reply actions 0 recs
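The arithmetic behind that scenario is easy to check. The team below is hypothetical: an 81-81 club with 700 runs scored and 700 allowed that then gives away 35 extra runs in each of 10 games it was losing anyway. The 1.83 exponent is also an assumption.

```python
def pythag_diff(wins, games, runs_scored, runs_allowed, exponent=1.83):
    """Actual wins minus Pythagorean wins (the 1.83 exponent is an assumption)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return wins - games * rs / (rs + ra)


# Hypothetical .500 team before and after conceding 35 extra runs in 10 losses.
before = pythag_diff(81, 162, 700, 700)
after = pythag_diff(81, 162, 700, 700 + 10 * 35)
print(f"before: {before:+.1f} wins vs Pythag, after: {after:+.1f} wins vs Pythag")
```

The win total never changes, yet the team goes from matching its Pythag record to beating it by a wide margin.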
In other words, sign Chris Bootcheck for the back of your bullpen?
I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal
by Nico on Nov 8, 2009 1:40 PM PST up reply actions 0 recs
Aha, now we know the Angels' pythag-breaking secret!
"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan
by iglew on Nov 8, 2009 3:03 PM PST up reply actions 0 recs
Now we know why they were +12 last year and +5 this year
Hey Al, just go away, baby.
by doctorK on Nov 8, 2009 3:34 PM PST up reply actions 0 recs
Fantastic, doctorK.
Don’t have much to add other than what’s already been said, but nice work.
No, there's no light,
in the darkest of your furthest reaches.
by danmerqury on Nov 8, 2009 2:10 PM PST reply actions 0 recs
Thanks for letting us know who got lucky during 2009
I was already pretty sure it wasn’t me.
I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal
by Nico on Nov 9, 2009 8:11 AM PST reply actions 0 recs
Nice work, doctorK.
I'm here to talk about the past.
by 67MARQUEZ on Nov 10, 2009 7:23 AM PST reply actions 0 recs
