Athletics Nation: An SB Nation Community

Having fun with Pythagoras, or, who got lucky during 2009

Well, another season has come and gone, without the good guys winning again. Being a numbers-guy, I'd like to take a look at how the teams in MLB performed against their Pythagorean records (in case you don't know what I'm talking about, and haven't backed out of this post, look here). I'm especially interested in who under- and over-performed against their Pythag records, and want to take a stab at figuring out why.

All of the team data in the graphs and tables below came from baseball-reference.com or fangraphs.com.


First of all, let's take a look at how the teams in MLB performed this past season (the AL West teams are OAK, TEX, LAA, and SEA).

Team RS RA W% PW% Pythag diff
TOR 798 771 0.463 0.516 -8.55
CLE 773 865 0.401 0.449 -7.70
WSN 710 874 0.364 0.406 -6.78
OAK 759 761 0.463 0.499 -5.80
ATL 735 641 0.531 0.562 -5.09
ARI 720 782 0.432 0.462 -4.89
PIT 636 768 0.385 0.415 -4.75
BAL 741 876 0.395 0.424 -4.69
LAD 780 611 0.586 0.610 -3.80
NYM 671 757 0.432 0.445 -2.10
TBR 803 754 0.519 0.529 -1.66
CHC 707 672 0.516 0.523 -1.24
CHW 724 732 0.488 0.495 -1.19
KCR 686 842 0.401 0.407 -0.99
STL 730 640 0.562 0.560 0.30
MIN 817 765 0.534 0.530 0.60
PHI 820 709 0.574 0.566 1.28
BOS 872 736 0.586 0.577 1.53
SFG 657 611 0.543 0.533 1.63
TEX 784 740 0.537 0.526 1.72
MIL 785 818 0.494 0.481 2.05
CIN 673 723 0.481 0.467 2.30
COL 804 715 0.568 0.553 2.34
DET 743 745 0.528 0.499 4.70
LAA 883 761 0.599 0.568 5.05
FLA 772 766 0.537 0.504 5.42
HOU 643 770 0.457 0.418 6.24
SDP 638 769 0.463 0.415 7.71
NYY 915 753 0.636 0.588 7.71
SEA 640 692 0.525 0.464 9.78
average - - 0.500 0.500 0.038
sigma - - 0.070 0.060 4.868

 

What I've labeled as 'Pythag diff' is how many more (or fewer) games the team won as compared to their Pythag record, most easily calculated by this: Pythag diff = Games played x (W% - PW%). Games played is, of course, 162, unless you are the Cubs or the Pirates (161) or the Tigers or the Twins (163). A team with a positive difference outperformed its Pythag record, and a team with a negative difference fell short of it. The average Pythag diff is roughly 0 (as it should be) with a standard deviation of 4.87. I have not rounded off any of these numbers in the analysis, although of course it is impossible to record a fraction of a win.
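For anyone who wants to check the arithmetic, here is a minimal sketch of the calculation. The exponent of 1.83 is an assumption on my part: the post does not say which exponent was used, but 1.83 seems to reproduce the PW% column, whereas the classic exponent of 2 does not. The function names are made up for illustration.

```python
# Exponent 1.83 is an ASSUMPTION; the post never states which one it used.
PYTHAG_EXPONENT = 1.83


def pythag_pct(runs_scored, runs_allowed, exponent=PYTHAG_EXPONENT):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythag_diff(wins, losses, runs_scored, runs_allowed):
    """Actual wins minus Pythagorean wins (positive means over-performed)."""
    games = wins + losses
    actual_pct = wins / games
    return games * (actual_pct - pythag_pct(runs_scored, runs_allowed))


if __name__ == "__main__":
    # Toronto 2009: 75-87, 798 runs scored, 771 allowed (from the table).
    print(round(pythag_pct(798, 771), 3))
    print(round(pythag_diff(75, 87, 798, 771), 2))
    # Seattle 2009: 85-77, 640 runs scored, 692 allowed.
    print(round(pythag_diff(85, 77, 640, 692), 2))
```

Under that assumed exponent, Toronto comes out about 8.5 games short and Seattle nearly 10 games ahead, in line with the table.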

Two teams that really stand out here are the Blue Jays, who outscored their opponents by 27 runs, but won only 75 games, and the Mariners, who were outscored by 52 runs (almost 1/3 of a run per game), yet managed to win 85 games. Also, the A's should've won 81 games, but only posted 75 real wins.

Tables of numbers are fine, but it is instructive to look at the distributions. First, here is how the actual winning percentages looked.

[Figure: histogram of actual winning percentages for all 30 teams]

The distribution is skewed towards the winning side - there were 16 winning teams and 14 losing teams this past season (meaning, there were a few really bad teams, like the Nats, Orioles, Indians, and Pirates).

Here's how the Pythagorean winning percentages looked.

[Figure: histogram of Pythagorean winning percentages for all 30 teams]

Now, this isn't even remotely normally distributed. At least the count is correct - there are 15 winners and 15 losers. But wow, that is a nasty-looking histogram.

Lastly, here is how the Pythag diff is distributed.

[Figure: histogram of Pythag diff for all 30 teams]

This one would look better if not for that large number of teams that were between 4 and 6 wins below their Pythag records. Note here that we have 14 under-performers and 16 over-achievers.

One of my pet beliefs (meaning not necessarily quantifiable) is that good teams have a tendency to make their own luck (conversely, bad teams like the A's play just good enough to lose). Of our 16 winning teams, 12 exceeded their Pythag records. So, does winning percentage correlate at all with Pythag diff?

[Figure: scatter plot of Pythag diff against actual winning percentage, with fitted line]

There is something there, although clearly there are some other factors involved (an r-squared of 0.29 means that 29% of the variation in the Pythag diff data can be explained by the winning percentage).
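As a rough way to reproduce that number, here is a sketch that computes r-squared straight from the W% and Pythag diff columns of the table above. I am assuming the chart was an ordinary least-squares line fit to those two columns; the table values are rounded, so the result can be compared against the quoted 0.29 but need not match it exactly.

```python
from math import sqrt

# (team, W%, Pythag diff), copied from the table above.
TEAMS = [
    ("TOR", 0.463, -8.55), ("CLE", 0.401, -7.70), ("WSN", 0.364, -6.78),
    ("OAK", 0.463, -5.80), ("ATL", 0.531, -5.09), ("ARI", 0.432, -4.89),
    ("PIT", 0.385, -4.75), ("BAL", 0.395, -4.69), ("LAD", 0.586, -3.80),
    ("NYM", 0.432, -2.10), ("TBR", 0.519, -1.66), ("CHC", 0.516, -1.24),
    ("CHW", 0.488, -1.19), ("KCR", 0.401, -0.99), ("STL", 0.562, 0.30),
    ("MIN", 0.534, 0.60), ("PHI", 0.574, 1.28), ("BOS", 0.586, 1.53),
    ("SFG", 0.543, 1.63), ("TEX", 0.537, 1.72), ("MIL", 0.494, 2.05),
    ("CIN", 0.481, 2.30), ("COL", 0.568, 2.34), ("DET", 0.528, 4.70),
    ("LAA", 0.599, 5.05), ("FLA", 0.537, 5.42), ("HOU", 0.457, 6.24),
    ("SDP", 0.463, 7.71), ("NYY", 0.636, 7.71), ("SEA", 0.525, 9.78),
]


def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)
    return r * r


if __name__ == "__main__":
    win_pcts = [w for _, w, _ in TEAMS]
    diffs = [d for _, _, d in TEAMS]
    print(round(r_squared(win_pcts, diffs), 2))
```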

It may be instructive to look at the distribution of game results (net scores per game, so a ho-hum 14-13 victory is the same as a 3-2 walk-off winner) for individual teams. I went ahead and did this for the AL West, where we have one losing team (guess who) which under-performed by 5.8 wins, and three winning teams that over-performed by 1.72 wins (Texas), 5.05 wins (LA of Anaheim), and 9.78 wins (Seattle). These next four graphs are all plotted on the same scale for easy comparison.

A's

[Figure: distribution of per-game net scores, A's]

Rangers

[Figure: distribution of per-game net scores, Rangers]

Angels

[Figure: distribution of per-game net scores, Angels]

Mariners

[Figure: distribution of per-game net scores, Mariners]

One standout feature of these four plots is the record in one-run games. The A's were bad in these games with a 15-23 record. Texas was 19-18 in one run games, the Angels were 27-18, and the Mariners were 35-20. Seeing this, perhaps we should look at how one-run game record impacts Pythag diff.
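A small sketch of how the net-score distributions and one-run records can be tabulated from a list of game scores. The helper names are hypothetical, and the only real inputs used here are the four one-run records quoted above.

```python
from collections import Counter


def margin_distribution(games):
    """Count games by net score (runs scored minus runs allowed).

    `games` is a list of (runs_scored, runs_allowed) pairs.
    """
    return Counter(rs - ra for rs, ra in games)


def one_run_record(games):
    """Return (wins, losses) in games decided by exactly one run."""
    margins = margin_distribution(games)
    return margins[1], margins[-1]


def win_pct(wins, losses):
    """Plain winning percentage."""
    return wins / (wins + losses)


if __name__ == "__main__":
    # One-run records quoted in the post.
    al_west = {"OAK": (15, 23), "TEX": (19, 18), "LAA": (27, 18), "SEA": (35, 20)}
    for team, (wins, losses) in al_west.items():
        print(team, round(win_pct(wins, losses), 3))
```

By this count the A's won fewer than 40% of their one-run games while Seattle won well over 60% of theirs.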

Side note: Seattle played 55 one-run games - that's one third of the regular season. If you've ever wondered why the LL guys have a tendency to go off the deep end in their game threads, stop wondering.

[Figure: scatter plot of Pythag diff against winning percentage in one-run games]

OK, now we're cooking with gas. About half of the variation in Pythag diff can be explained by one-run winning percentage. Of course, teams with good one-run records also tend to have good overall records.

[Figure: scatter plot of one-run winning percentage against overall winning percentage]

Some other factors that could possibly influence under/over performing are:

  • Head-to-head matchups between two teams. For example (and I hate bringing this up), the Mariners were 14-5 against the A's this season. In 2006 when the A's last won the West, they outperformed Pythag by 8 wins, and went 17-2 against Seattle (that still amazes me). Last season, when the A's were only one game below Pythag, they went 10-9 against Seattle, and were 25-24 in one-run games. Toronto was also 6-12 against the Yankees despite only being out-scored by 5 runs in those 18 games.
  • Defense. Seattle was the best team in the majors at defensive run prevention (as measured by ERA-FIP, as listed here), while Cleveland was second worst (only the Royals were worse, and they also under-performed against their Pythag record). I did try a correlation here, but the r-squared value was only 0.1 or so. The A's ERA-FIP was +0.19, which should not surprise anyone considering we had Adam Kennedy most of the season "playing" 3B, and the occasional Cust adventure in the OF.

In conclusion, if a team wants to consistently out-perform Pythagoras, they should:

  1. Be good at winning baseball games,
  2. Be good at winning one-run games,
  3. Totally own some team that you play a lot,
  4. Be very good at converting balls in play into outs, and
  5. Get lucky.

Your thoughts?

Comments

You've inverted the causality in this part:

One of my pet beliefs (meaning not necessarily quantifiable) is that good teams have a tendency to make their own luck (conversely, bad teams like the A’s play just good enough to lose). Of our 16 winning teams, 12 exceeded their Pythag records. So, does winning percentage correlate at all with Pythag diff?

Winning teams aren’t making their luck— luck is making winning teams. (The best example of this is, of course, Seattle…)

Linda's in the cold ground, won't see her anymore
Somewhere out on the highway tonight, the drunken engines roar
It's just one of those things, one of those things
-- Al Stewart, "Accident on 3rd St."
In memory of Nick Adenhart and all victims of drunk driving

by PaulThomas on Nov 7, 2009 10:52 PM PST reply actions   0 recs

It does seem more or less true for Scioscia's Angels, right?

Over the last couple years they’ve been good at scoring those one or two runs they really need in tight ballgames. Hence they tend to consistently look “lucky” when compared to their Pythagorean.

by DDroney on Nov 8, 2009 2:01 AM PST up reply actions   0 recs

I think it's both...

Luck makes winning teams make their own luck…

or

Winning teams make their own luck which attracts even more luck…

Which came first, the chicken or the egg? Yes!

"Flea Markets aren't just for blind dates anymore!"- The Reverend Billy Lard

by Gaijin_Suketto on Nov 9, 2009 3:07 PM PST up reply actions   0 recs

Ah, the old correlation/causality conundrum

"Loyal? I'm the most loyal player money can buy." - Don Sutton

by vignette17 on Nov 8, 2009 4:02 AM PST up reply actions   1 recs

Three statisticians went duck hunting

Upon spying a duck, the first one shot, but missed a yard high.

The second one then shot, but missed a yard low.

Exclaimed the third; “we got him!”

by bobnothing on Nov 8, 2009 2:05 PM PST up reply actions   0 recs

LOL!!!

My religion is A'slamic.

by WhoNeedsReligionWhenYaGotBaseball on Nov 8, 2009 6:40 PM PST up reply actions   0 recs

total agreement from me

Winning close games is pretty much the definition of beating your pythagorean record. And since the pythag winning percentages range from 0.4 to 0.6 or so (because even the best baseball teams lose 35-40% of the time), beating your pythagorean record by a few games is a very good way to finish over .500.

I dig the histograms doctorK! (scatter plots too)

by colin on Nov 8, 2009 8:26 AM PST up reply actions   0 recs

You are probably right

One thing that would be interesting to do would be to see if there are any teams who consistently out-perform Pythagoras over several seasons. The easiest example is the Angels, who have been +5, +12, +4, and +5 their last four seasons, but before that they were not exceptionally lucky either way. Also, the typical standard deviation is about 4-5 games each year (I only took the time to do the last three seasons), so being +4 or +5 is not particularly notable. Clearly, this year’s Mariners at +10, last year’s Angels at +12, and 2007’s D-Backs at +11 were, in fact, quite notable.

I’m also not surprised you made the first comment on this post.

Hey Al, just go away, baby.

by doctorK on Nov 8, 2009 12:41 PM PST up reply actions   0 recs

it would be pretty easy to clear up this bias

with some Monte Carlo. And pythagorean wins are a simple enough statistic that it could probably be handled analytically.

I’ve been wanting to do a fANpost for a while on the mathematics of the pythagorean wins formula, but the thesis is taking precedence right now.

by colin on Nov 8, 2009 1:01 PM PST up reply actions   0 recs
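To give a feel for the Monte Carlo idea in the comment above, here is a rough sketch that simulates seasons for a perfectly average team and measures how far luck alone moves its Pythag diff. Everything about the model is an assumption of the sketch, not anything from the post or the comment: runs per game are drawn from a Poisson distribution (real scoring is more spread out), both sides average 4.6 runs, tied scores are simply replayed, and the exponent is 1.83.

```python
import math
import random
import statistics

EXPONENT = 1.83  # ASSUMED Pythagorean exponent


def poisson(rng, lam):
    """Draw a Poisson variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_season(rng, runs_for_avg, runs_against_avg, games=162):
    """Return (wins, pythag_diff) for one simulated season."""
    wins = rs_total = ra_total = 0
    for _ in range(games):
        rs = poisson(rng, runs_for_avg)
        ra = poisson(rng, runs_against_avg)
        while rs == ra:  # crude stand-in for extra innings: replay ties
            rs = poisson(rng, runs_for_avg)
            ra = poisson(rng, runs_against_avg)
        wins += rs > ra
        rs_total += rs
        ra_total += ra
    rs_pow = rs_total ** EXPONENT
    ra_pow = ra_total ** EXPONENT
    expected_wins = games * rs_pow / (rs_pow + ra_pow)
    return wins, wins - expected_wins


def luck_spread(seasons=1000, seed=2009, runs_avg=4.6):
    """Mean and standard deviation of Pythag diff over simulated seasons."""
    rng = random.Random(seed)
    diffs = [simulate_season(rng, runs_avg, runs_avg)[1] for _ in range(seasons)]
    return statistics.mean(diffs), statistics.stdev(diffs)


if __name__ == "__main__":
    mean_diff, sd_diff = luck_spread()
    print(round(mean_diff, 2), round(sd_diff, 2))
```

The standard deviation it prints can be set beside the 4.87 in the table to see how much of the real-world spread a pure-luck model of this kind would account for.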

and by taking precedence

I mean ruining my life, of course.

by colin on Nov 8, 2009 1:01 PM PST up reply actions   0 recs

Been there - done that

The last three months of 1990 and first four months of 1991 are only a blur today – in fact, they were a blur then.

Hey Al, just go away, baby.

by doctorK on Nov 8, 2009 1:05 PM PST up reply actions   0 recs

Simple solution: Quit school and write a fanpost for AN

I taked that aproach years’ ago and I’m still prety educatified.

I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal

by Nico on Nov 8, 2009 1:38 PM PST up reply actions   0 recs

It's "edumacated" Nico

"Flea Markets aren't just for blind dates anymore!"- The Reverend Billy Lard

by Gaijin_Suketto on Nov 9, 2009 3:09 PM PST up reply actions   0 recs

Same principle as "saxomophone"

It pays to study etymology.

"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan

by iglew on Nov 9, 2009 4:48 PM PST up reply actions   0 recs

Ha ha ha

You mean “etology.”

I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal

by Nico on Nov 9, 2009 7:42 PM PST up reply actions   0 recs

that word is so totally chromulent.

"Flea Markets aren't just for blind dates anymore!"- The Reverend Billy Lard

by Gaijin_Suketto on Nov 9, 2009 11:21 PM PST up reply actions   0 recs

The extra H embiggens the word.

"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan

by iglew on Nov 10, 2009 12:05 AM PST up reply actions   0 recs

This would be a more interesting question:

Does Pythagorean differential positively correlate with third-order record? (You can find third-order record at baseballprospectus. Basically, for those who aren’t familiar with the stat, it’s a linear-weights estimate of how many runs a team should have scored and allowed, adjusted for strength of schedule, and then converted into an expected win/loss record.)

Third-order record is (mostly) luck-independent.

Linda's in the cold ground, won't see her anymore
Somewhere out on the highway tonight, the drunken engines roar
It's just one of those things, one of those things
-- Al Stewart, "Accident on 3rd St."
In memory of Nick Adenhart and all victims of drunk driving

by PaulThomas on Nov 8, 2009 1:01 PM PST up reply actions   0 recs

For some reason

I couldn’t find 3rd order wins on that site anymore. How did the A’s do? And what about the Angels?

"If Bowden was a general contractor, he'd build houses with nine bedrooms, six garages, no bathrooms, and half a roof."

by DyeLongJustice on Nov 8, 2009 3:12 PM PST up reply actions   0 recs

Try this

linky

Hey Al, just go away, baby.

by doctorK on Nov 8, 2009 3:40 PM PST up reply actions   0 recs

so theoretically

we are still the worst team in the division? Well, at least we are close enough to luck into the playoffs.

"If Bowden was a general contractor, he'd build houses with nine bedrooms, six garages, no bathrooms, and half a roof."

by DyeLongJustice on Nov 8, 2009 4:23 PM PST up reply actions   0 recs

Yeah, ironically it doesn't change anything about the order of finish

just tightens the gaps.

Weirdly, Seattle was both “lucky” in the Pythagorean sense and also hugely “unclutch” in terms of runs scored relative to linear weights, so the two virtually cancel each other out.

Linda's in the cold ground, won't see her anymore
Somewhere out on the highway tonight, the drunken engines roar
It's just one of those things, one of those things
-- Al Stewart, "Accident on 3rd St."
In memory of Nick Adenhart and all victims of drunk driving

by PaulThomas on Nov 8, 2009 4:52 PM PST up reply actions   0 recs

I believe TB was that way in 2008

"Loyal? I'm the most loyal player money can buy." - Don Sutton

by vignette17 on Nov 9, 2009 3:47 PM PST up reply actions   0 recs

Pretty graphs

What program did you use? Excel? I always am trying to make my graphics prettier.

Hopefully I’ll be able to bust out some new ones in my newest FP in my series (ETA next week). I’ve been working on adding some new toys and haven’t had much time to write. Very interesting fanpost btw.

"Loyal? I'm the most loyal player money can buy." - Don Sutton

by vignette17 on Nov 8, 2009 4:06 AM PST reply actions   0 recs

I'm curious too

at first I thought it was gnuplot, but now I’m not so sure

by colin on Nov 8, 2009 8:28 AM PST up reply actions   0 recs

Graphics were generated from RS/1

RS/1 is an old BBN software package that we still have on our network at my place of employment. It makes really nice graphs and also has great software for experimental design and analysis.

Hey Al, just go away, baby.

by doctorK on Nov 8, 2009 12:17 PM PST up reply actions   0 recs

Although RS/1 specifically is no longer supported

it’s one of many variations of the more generic R, which is freely available. You can download it at this site, which also has a bit of explanation of exactly what R is. In brief, it is a programming language designed for statistical work.

R is a standard tool in the statistics world. If you have any aspiration of getting involved in the business, or even if you’re just an enthusiastic amateur who isn’t afraid of some simple coding, it’s well worth your while to learn it. R is a large and powerful system so it would be quite a project to really know all its abilities well, but to learn enough to create some basic charts isn’t so hard.

"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan

by iglew on Nov 8, 2009 12:33 PM PST up reply actions   0 recs

R is pretty cool

pet project of Cal profs…

by ohmangoAs on Nov 10, 2009 1:02 AM PST up reply actions   0 recs

Being a non numbers guy

I will just copy off PT’s paper.

But the cool graphs look like rollercoasters, and that makes me happy.
If any of this helps the A’s win, then thanks for posting!

"Tonto think Billy Beane need to make team full of squirrels and bears."

by OptimistPrime on Nov 8, 2009 8:11 AM PST reply actions   0 recs

Bell curves are like dinosaurs:

They’re small on one end, then much, much bigger in the middle, and then small again at the far end. This is my theory that it is, and it’s mine.

-Anne Elk.

I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal

by Nico on Nov 8, 2009 9:03 AM PST up reply actions   0 recs

Carl Everett doesn't believe in bell curves.

Somebody actually saw Adam and Eve. No one ever saw a bell curve.

"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan

by iglew on Nov 8, 2009 12:36 PM PST up reply actions   0 recs

It was the snake.

What a standard deviant that snake was.

I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal

by Nico on Nov 8, 2009 1:39 PM PST up reply actions   0 recs

1-run games

I’ve always wondered if it’s proper to treat all 1-run games as the same. Do teams tend to do as well in 1-0 games as they do in 5-4 or 8-7 games? It seems to me that teams with better offenses might be a bit better at winning the high-scoring one-run games because they have a better ability to push across a few extra runs than does a team with a poor offense, but I don’t think I’ve ever seen someone break down games in quite that way.

I dig the graphs too, very nice!

There is no gravity - the earth just sucks.

by JLeverenz on Nov 8, 2009 11:48 AM PST reply actions   0 recs

Actually, to build on that

As far as I know, the Pythagorean calculation doesn’t take into account the distribution of the scoring; in other words, how consistently teams give up large totals, and how consistently they score big themselves. If I score, say, 4,4,4,4, or 7,7,1,1, then the total runs scored over the four games is the same (obviously, this makes no analytical sense over four games, but you see where I’m going).

Do teams that are the most consistent in run scoring / prevention do best? Or is it the ones that put up big scores in some games, and then blow it in others? Is there a ‘minimum level’ (4.5 runs, say) at which consistency is a good thing (and conversely, below which it’s a bad thing)?

I think I’m wondering whether a low standard deviation from the median number of runs scored per game is a good thing, or a bad thing, and how much that relates to what the median value of runs per game actually is.

These thoughts are a little blurred, and I’m not sure how much it makes sense, but it’s been something that’s been ticking over in my head since the AS break last year, when the Giants were overperforming their Pythagorean level, with a rather unusual team.

Maybe I’ll see if I can find some data on this, unless someone knows that the answers to this are already out there.

by bobnothing on Nov 8, 2009 7:33 PM PST up reply actions   0 recs
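The 4,4,4,4 versus 7,7,1,1 example in the comment above can be put into a few lines. This is only a toy: the fixed-opponent comparison is my own simplification to show why a 'minimum level' might exist, and it is not a claim about real run distributions.

```python
import statistics


def scoring_profile(runs_per_game):
    """Return (total runs, mean runs per game, population standard deviation)."""
    return (
        sum(runs_per_game),
        statistics.mean(runs_per_game),
        statistics.pstdev(runs_per_game),
    )


def wins_against_fixed(runs_per_game, opponent_runs):
    """Wins if the opponent scored exactly `opponent_runs` in every game."""
    return sum(runs > opponent_runs for runs in runs_per_game)


if __name__ == "__main__":
    steady = [4, 4, 4, 4]
    streaky = [7, 7, 1, 1]
    print(scoring_profile(steady), scoring_profile(streaky))
    for opponent_runs in (3, 5):
        print(
            opponent_runs,
            wins_against_fixed(steady, opponent_runs),
            wins_against_fixed(streaky, opponent_runs),
        )
```

Both lines score 16 runs, so Pythagoras treats them the same. Against an opponent who always scores 3, the steady team wins all four and the streaky team wins two; against an opponent who always scores 5, the steady team wins none and the streaky team still wins two. In this toy, consistency helps when you are above the opponent's level and hurts when you are below it.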

distributions

Pythagorean wins assumes that teams don’t control their scoring distribution. Your offense has some skill level and, assuming that the team is out trying to score as many runs as possible in every game, then the runs scored distributions should converge to some common shape with a large enough sample.

I’m not really sure how well that particular hypothesis has been tested, but I think it’s pretty essential for the Pythag wins formula.

by colin on Nov 9, 2009 7:20 AM PST up reply actions   0 recs

to put it another way

there is a built-in assumption that only one number (i.e. runs scored per game) is sufficient to characterize a team’s ability to score runs (and the same for run prevention)

by colin on Nov 9, 2009 7:22 AM PST up reply actions   0 recs

Oh yeah - I don't believe it's possible to be able to control the consistency of runs scored

But that doesn’t mean that it’s not a statistic that reveals something. What, I don’t know.

And yeah – like you said, I’ve not seen any testing of the hypothesis you state above; it’d be interesting to do so.

I was thinking more about this last night – to do so, I’d need access to every game score over a number of seasons – is there a downloadable spreadsheet of this sort of thing?

by bobnothing on Nov 9, 2009 8:50 AM PST up reply actions   0 recs

Try the team schedule pages at baseball-reference.com

Example: A’s 2009
If you click the ‘CSV’ tab, then the table will be converted to comma-delimited, which is then easy to parse out in Excel.

Hey Al, just go away, baby.

by doctorK on Nov 9, 2009 9:01 AM PST up reply actions   0 recs
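If Excel is not your thing, a comma-delimited export like the one described above can also be read with a few lines of Python. The column names "R" and "RA" and the sample rows below are assumptions made up for this sketch; check the header row of the actual export and pass the right names.

```python
import csv
import io


def read_scores(csv_text, runs_col="R", allowed_col="RA"):
    """Parse comma-delimited text into (runs_scored, runs_allowed) pairs.

    Rows with a missing or non-numeric score (for example a repeated
    header line) are skipped. The default column names are hypothetical.
    """
    games = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rs = (row.get(runs_col) or "").strip()
        ra = (row.get(allowed_col) or "").strip()
        if rs.isdigit() and ra.isdigit():
            games.append((int(rs), int(ra)))
    return games


if __name__ == "__main__":
    # Made-up sample rows, not real game data.
    sample = "Gm,Opp,R,RA\n1,AAA,3,2\n2,AAA,0,6\nGm,Opp,R,RA\n3,BBB,5,4\n"
    scores = read_scores(sample)
    one_run_wins = sum(rs - ra == 1 for rs, ra in scores)
    one_run_losses = sum(ra - rs == 1 for rs, ra in scores)
    print(scores, one_run_wins, one_run_losses)
```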

There is a perfectly reliable and achievable way

that any team could use to beat its Pythagorean record. It would simply have to play most games like normal but then any time it falls far behind, intentionally give away 30 or 40 runs in a game they’re going to lose anyway.

Do this and your W-L record will be well in excess of your Pythag record.

My point in making this absurd observation is to illustrate that beating one’s Pythagorean record is neither an accomplishment nor a worthwhile goal.

"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan

by iglew on Nov 8, 2009 12:46 PM PST reply actions   0 recs
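The absurd strategy in the comment above is easy to put numbers on. This sketch uses a made-up .500 team (81-81, 750 runs scored and allowed) and an assumed exponent of 1.83; none of these figures come from the post.

```python
EXPONENT = 1.83  # ASSUMED Pythagorean exponent


def pythag_diff(wins, games, runs_scored, runs_allowed):
    """Actual wins minus Pythagorean expected wins."""
    rs = runs_scored ** EXPONENT
    ra = runs_allowed ** EXPONENT
    return wins - games * rs / (rs + ra)


def give_away_runs(runs_allowed, blowout_losses, extra_runs_per_game):
    """Runs allowed after conceding extra runs in games already being lost."""
    return runs_allowed + blowout_losses * extra_runs_per_game


if __name__ == "__main__":
    wins, games, scored, allowed = 81, 162, 750, 750  # hypothetical team
    before = pythag_diff(wins, games, scored, allowed)
    # Concede 35 extra runs in each of 10 games that were lost anyway.
    padded = give_away_runs(allowed, blowout_losses=10, extra_runs_per_game=35)
    after = pythag_diff(wins, games, scored, padded)
    print(round(before, 1), round(after, 1))
```

The win total never changes, yet the team goes from dead even with Pythagoras to beating it by more than 20 games, which is the commenter's point: the differential rewards wasting runs in losses just as much as it rewards anything useful.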

In other words, sign Chris Bootcheck for the back of your bullpen?

I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal

by Nico on Nov 8, 2009 1:40 PM PST up reply actions   0 recs

Aha, now we know the Angels' pythag-breaking secret!

"Go ahead and overachieve, you scrappy Brett-Favre-colored walk-takers." —Rev Halofan

by iglew on Nov 8, 2009 3:03 PM PST up reply actions   0 recs

Fantastic, doctorK.

Don’t have much to add other than what’s already been said, but nice work.

No, there's no light,
in the darkest of your furthest reaches.

by danmerqury on Nov 8, 2009 2:10 PM PST reply actions   0 recs

Thanks for letting us know who got lucky during 2009

I was already pretty sure it wasn’t me.

I like Cindi. A. She never pretends to know more than she does. B. She has unbridled enthusiasm for her "Hotties," and isn't afraid to show it. -IM4Oakgal

by Nico on Nov 9, 2009 8:11 AM PST reply actions   0 recs

Nice work, doctorK.

I'm here to talk about the past.

by 67MARQUEZ on Nov 10, 2009 7:23 AM PST reply actions   0 recs

