How other pitchers "Control" The A's
So I've been wondering why Pitchers like Ervin Santana dominate us, and at the same time, we hit decently against better pitchers such as AJ Burnett. My theory, which may not be surprising to many of you, is that pitchers with good control (who do not issue walks) absolutely destroy the A's. Even though this may be obvious, I want to take a closer statistical look at the issue.
Scope: I used statistics from 95 starts against the A's in 2008. These 95 starts came from only 76 starters. This is because some starters have played the A's multiple times (for example, Garland has started 4 times vs. the A's this year).
The relationship between opposing starters' walk rate and the number of runs the A's score:
On the Vertical Axis, I've placed the ERA posted by the opposing starter in their start vs. the A's. On the horizontal axis is that same starter's Walks per 9 innings in 2008. So, for example, the point at (2.5,20) represents a start by James Shields, who posted a 20.32 ERA in his start against the A's, despite having a 2.33 BB/9.
Next, I ran a linear regression on this data, and found that OPPERA = 1.3*BB/9 + 1. In other words, this model predicts that a pitcher who has a 2.0 BB/9 would post a 3.6 ERA in his start against the A's. For each additional walk per 9 innings, the model has a given pitcher allowing an additional 1.3 runs per 9 innings. Note that this is only a model (and the r-squared value for the regression was only .2811). So BB/9 has substantial predictive value for how a pitcher will fare against the A's, but it isn't a perfect predictor.
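To make the arithmetic concrete, here is a minimal sketch that just plugs walk rates into the fitted line. The function and constant names are made up for illustration, and the coefficients are the rounded ones quoted above, so treat the outputs as ballpark figures only.

```python
# Rounded coefficients quoted in the post: OPPERA = 1.3 * BB/9 + 1
SLOPE_VS_AS = 1.3
INTERCEPT_VS_AS = 1.0


def predicted_era_vs_as(bb_per_9: float) -> float:
    """Model's predicted single-start ERA vs. the A's for a starter's season BB/9."""
    return SLOPE_VS_AS * bb_per_9 + INTERCEPT_VS_AS


# Walk a few hypothetical walk rates through the model.
for bb9 in (1.0, 2.0, 3.0, 4.0):
    print(f"BB/9 = {bb9:.1f} -> predicted ERA vs. A's = {predicted_era_vs_as(bb9):.1f}")
```

With an r-squared of about .28, any single start can land a long way from the line, so these are averages rather than forecasts.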
So now, I'm sure you're thinking that this was a lot of work for nothing. After all, it's obvious that pitchers who issue a lot of walks will allow a lot of runs. Well, here's where it gets interesting.
The Relationship between a starter's 2008 ERA and his 2008 BB/9
So I was curious whether walk rates predict overall ERAs vs. all MLB teams in the same way they do vs. the A's. In other words, is the model in the section above similar to the relationship we see for starters in general?
The plot above uses the same 76 starters (i.e., the ones who played the A's) and investigates the relationship between their walk rate and their ERA.
Here, the correlation is stronger (R squared = .4), but the slope is much less pronounced. For a typical starter, an increase of 1 walk per 9 innings resulted in only a .39 increase in their ERA.
So, against all of MLB, for these starters, giving up one more walk per 9 innings cost them only 0.4 of a run, but against the A's, it cost them 1.3 runs.
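A quick way to see how different the two fits are is to compare the slopes directly. This is only a sketch using the two rounded slopes quoted in the post; the names and the example walk-rate gap are assumptions for illustration.

```python
# Slopes as quoted in the post (runs per 9 innings for each extra walk per 9 innings)
SLOPE_VS_AS = 1.3    # single-start ERA vs. the A's
SLOPE_VS_MLB = 0.39  # full-season ERA vs. all of MLB


def extra_era(slope: float, delta_bb9: float) -> float:
    """Change in predicted ERA for a given change in BB/9 under a linear fit."""
    return slope * delta_bb9


ratio = SLOPE_VS_AS / SLOPE_VS_MLB
print(f"The slope vs. the A's is about {ratio:.1f} times the league-wide slope")

# Hypothetical comparison: two starters whose walk rates differ by 1.5 BB/9
delta = 1.5
print(f"Predicted ERA gap vs. the A's: {extra_era(SLOPE_VS_AS, delta):.2f}")
print(f"Predicted ERA gap vs. MLB:     {extra_era(SLOPE_VS_MLB, delta):.2f}")
```

Keep in mind the two fits use different kinds of data (single starts vs. full seasons), so the ratio is suggestive rather than an apples-to-apples measurement.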
I think this says a lot about how extremely patient the A's are, but also how dependent they are on walks. Another way of looking at this is that if a pitcher has a BB/9 of 1.0 against the A's (i.e., he focuses very, very hard on not walking players), the A's would only score 2.3 runs per 9 innings. (Remember, it's only a model, so don't take this as gospel.)
The ultimate point is this: it's good that each additional walk increases the A's scoring substantially, but the steepness of the slope shows that if a pitcher controls his walks, the A's are in deep trouble. They live and die by walks.
Remember, this was based on only 95 starts, and only 76 pitchers. The correlations were statistically significant, but not extremely strong. So, this isn't proof, per se, that the A's live and die by the walk, but it does help support our suspicion that more than other teams, the A's need walks to succeed.
Comments
If you have players with low batting averages and relatively high OBPs in comparison, they better damn well have power to go with it.
Unfortunately, the A’s have very low batting averages, decent OBPs, and no power. So when they do put the ball in play, it’s a single, and odds are it will not come at an opportune time.
Perhaps the A’s should, if they haven’t already, change their hitting philosophy and start trying to target players with high batting averages: guys whose OBP is batting-average dependent.
While these guys tend to lose a lot of value later in their careers as they lose the ability to hit for average, the A’s will have all these guys earlier in their careers. Indeed, it’s why so much of AN likes Ryan Sweeney.
Now if we could only get someone who could hit for average, get on base and hit for power. Unfortunately, .300/.400/.500 guys don’t grow on trees, and the A’s also seem reluctant to try to trade for such players. It sucks that the Indians managed to grab LaPorta from the Brewers for CC. I wish we could have gotten him (I remember many of us wishing the A’s could have traded Street, and maybe more, for just LaPorta himself). And with Street’s recent breakdown, the Brewers’ next prospect who is also a good hitter like LaPorta, Gamel, is unlikely to be dealt to the A’s, since Street blew it in front of their top assistant GM. Well, maybe they’ll try for Duke?
facepalm.jpg
This definitely highlights the effect of the A's batting Philosophy
I have a hard time believing that scouting reports don’t SCREAM that pitchers should throw strikes against the A’s. The A’s tendency to walk is well publicized…is there a point at which that becomes a disadvantage?
I am sure most every team’s scouting report says throw strikes, but Geren also tells the
same thing to Eveland. Not all pitchers can throw a strike when they want to.
by theblackpearl on Jul 22, 2008 9:10 AM PDT
What makes me cringe?
Is seeing A’s batter after batter caught “struck out looking.” See ball, hit ball.
I am very curious, if data exists,
to see where the A’s rank on strikeouts looking, not just strikeouts.
"PECOTA can pretty much kiss my ass."-Nico
Funny even Cotroneo and Fosse
keep mentioning that the A’s are just too pitch selective, to a misguided point. No wonder they lead MLB in Ks. I think most of the hitters are overthinking at the plate.
Do we even have decent OBPs?
...or is the difference between their OBP and their AVG just decent?
by Gallagher's Watermelons on Jul 22, 2008 12:23 AM PDT
The A's are tied for 4th in the AL in the difference between their OBP and AVG...
...and are closer to the top than the bottom. This is helped by them ranking 5th in walks. Even there, they have over 100 more than the last-place team, Kansas City. The A’s do have the most strikeouts in the AL, though.
They have the worst AVG (and worst SLG as the fewest total bases will do that) but three teams have a lower OBP (LA, KC, SEA). Here’s a bit sorted by the OBP/AVG difference:
TEAM   OBP    AVG    DIFF
TB     .337   .259   .078
TOR    .335   .259   .076
BOS    .356   .280   .076
NYY    .342   .269   .073
CHW    .338   .265   .073
CLE    .323   .250   .073
OAK    .320   .247   .073
TEX    .349   .279   .070
DET    .343   .274   .069
BAL    .329   .260   .069
LAA    .319   .259   .060
MIN    .337   .279   .058
SEA    .313   .256   .057
KC     .316   .263   .053
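For anyone who wants to re-sort or re-check the numbers, here is a small sketch that recomputes the OBP/AVG difference and the tie-aware ranking from the OBP and AVG figures listed above. The variable and function names are just illustrative.

```python
# (team, OBP, AVG) exactly as listed in the comment above
TEAMS = [
    ("TB", .337, .259), ("TOR", .335, .259), ("BOS", .356, .280),
    ("NYY", .342, .269), ("CHW", .338, .265), ("CLE", .323, .250),
    ("OAK", .320, .247), ("TEX", .349, .279), ("DET", .343, .274),
    ("BAL", .329, .260), ("LAA", .319, .259), ("MIN", .337, .279),
    ("SEA", .313, .256), ("KC", .316, .263),
]


def diff_points(obp: float, avg: float) -> int:
    """OBP minus AVG in thousandths, rounded to an int to avoid float noise."""
    return round((obp - avg) * 1000)


diffs = {team: diff_points(obp, avg) for team, obp, avg in TEAMS}


def rank_by_diff(team: str) -> int:
    """Rank where tied teams share the best position (1 = largest difference)."""
    return 1 + sum(1 for d in diffs.values() if d > diffs[team])


# Teams with a lower OBP than Oakland
oak_obp = next(obp for team, obp, _ in TEAMS if team == "OAK")
lower_obp = sorted(team for team, obp, _ in TEAMS if obp < oak_obp)

for team, d in sorted(diffs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{team:<4} diff = .{d:03d}  rank = {rank_by_diff(team)}")
print("Lower OBP than OAK:", lower_obp)
```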
Last of the Ninth - Photography Site / jamesvenes.com - Blog
Exactly my point
We have a solid DIFFERENCE, but our overall AVG and OBP both suck. We need guys that are just overall BETTER HITTERS. The downside is that there’s really nothing there for us. :(
by Gallagher's Watermelons on Jul 22, 2008 12:52 AM PDT
Yeah. When you're hitting .270 as a team that difference is nice. When you're hitting .247, it still sucks.
Last of the Ninth - Photography Site / jamesvenes.com - Blog
Does higher OBP really always = more runs scored,
or does it sometimes mean a lot more runners left on base? HMMMMMM…
by Gallagher's Watermelons on Jul 22, 2008 1:10 AM PDT
Not really
It’s almost impossible for higher OBP not to lead to higher runs scored.
Your 2008 Athletics: It's Nothing Personal.
And the A's have a low OBP and low run total, so that part holds true.
If you check the highest OBPs of the teams in the AL (that’s all I’m looking at for now) they’re also scoring the most runs. The Mariners have the lowest OBP and the fewest runs.
Last of the Ninth - Photography Site / jamesvenes.com - Blog
Both, really
You’re going to score more runs but you’re going to leave a few more on in the process.
Last of the Ninth - Photography Site / jamesvenes.com - Blog
The first graph almost cannot help but have a more extreme slope than the second ...
it’s simply the nature of the sample, with such a wider range of ERAs …
"It's for your own good. Big strong Devo knows whats best for Poppy" -- Mossback
Yeah, I've thought about this a lot
I’m not sure how to fix this…there’s not enough data for any one starter (causing huge variations as you note), but also no way to get more data without going beyond 2008 (which brings up other issues).
I think a good way to do it ...
would be to replace the season scores from the second graph with the numerous game scores that they are made up of …
"It's for your own good. Big strong Devo knows whats best for Poppy" -- Mossback
Impossible
The data collection would take literally a month.
Do you know a way to get runs allowed by a starter without manually getting it from a boxscore? The only way I know to accurately get the data is to check the game’s boxscore. Nothing else is game specific.
Yeah ... the game data is available ...
but you’d have to set up a database to use it …
Here’s a basic description of how it would be done:
http://fastballs.wordpress.com/2007/08/23/how-to-build-a-pitch-database/
Now if you set up such a database and wanted to let me use it …
"It's for your own good. Big strong Devo knows whats best for Poppy" -- Mossback
I just reran it using z-scores
and plotted how the z-scores for each sample of ERA’s varied with BB/9… the differences in Slope held up.
Is that method valid? Does plotting Z-scores for each data point instead of the point itself correct for high variability?
To recap, I took (Game ERA - Avg. Game ERA) / Std. Dev. of Game ERA and plotted it against BB/9.
I did the same for Season ERA,
then compared the slopes, and saw higher slopes for the Game ERA data.
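On the validity question, here is a minimal sketch of what z-scoring the ERA values does to a fitted slope. The data points are made up purely for illustration (they are not the starts from the post), and the helper names are hypothetical. The algebra it checks: dividing ERA by its standard deviation divides the slope by that same standard deviation, which leaves the correlation coefficient divided by the spread of BB/9.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pop_sd(values):
    """Population standard deviation (divide by n)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def zscores(values):
    m, s = mean(values), pop_sd(values)
    return [(v - m) / s for v in values]


def pearson_r(x, y):
    """Correlation coefficient, written as slope * sd(x) / sd(y)."""
    return ols_slope(x, y) * pop_sd(x) / pop_sd(y)


# Made-up illustrative data, NOT the real 2008 starts
bb9 = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
game_era = [1.0, 4.5, 0.0, 9.0, 3.0, 7.5]

raw_slope = ols_slope(bb9, game_era)
z_slope = ols_slope(bb9, zscores(game_era))

print(f"raw slope:            {raw_slope:.4f}")
print(f"z-scored slope:       {z_slope:.4f}")
print(f"raw slope / sd(ERA):  {raw_slope / pop_sd(game_era):.4f}")
print(f"r / sd(BB/9):         {pearson_r(bb9, game_era) / pop_sd(bb9):.4f}")
```

If this holds, comparing z-scored slopes is essentially comparing correlations rather than runs per walk, so it may answer a slightly different question than the original slope comparison.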
Seems to me that this should also naturally bring more extreme results ...
but I could be wrong …
"It's for your own good. Big strong Devo knows whats best for Poppy" -- Mossback
And the moral of the story is
the A’s suck at hitting. Not walking, hitting. Hitting is more useful than walking—doing both, of course, is even better.
"PECOTA can pretty much kiss my ass."-Nico
statistically speaking
The R^2 value basically tells you how much of the variation in the sample can be explained by the model. So these two plots are telling me that overall, variation in walk rate accounts for about 40% (R^2 = 0.4) of the variation in a pitcher’s ERA, and for starts against the A’s, this is down to 28%. My gut feeling is that this is not nearly enough sample size to estimate any variance contribution.
I would also like to know what the 95% confidence bounds on the linear fit slope are. When you say something is statistically significant, you always have to quantify your confidence level.
by asfansince1989 on Jul 22, 2008 10:20 AM PDT
Your Gut is right on the sample size
I’m very sure that this sample is not large enough…the reason it’s not large enough is because the Game ERA values vary so much (st. Dev = 5).
Speaking of a confidence interval, I believe (and I could very easily be wrong…I’m stretching the limits of my stats knowledge here) that one isn’t appropriate, since the data above is not a sample of a larger population. I used all the 2008 games (i.e., the entire population). For that matter, I guess that means that the numbers above are not, strictly speaking, statistics…they’re parameters.
The only reason I included the r^2 values was to provide a quick and dirty look at the correlation. Also, the lower R2 value for the A’s number is mostly because the ERA’s for a game vary a lot more than ERA’s for a season. As a result, any predictor would have a lower R2 value because the fluctuation in Game ERA’s is more random than the fluctuation in Season ERAs.
I’m sorry the methodology is unclear…there’s always a fine line on how much methodology to explain in a fanpost, and also, I’m not the biggest expert on this stuff either. This kind of statistical analysis hasn’t been published yet, as far as I know, and part of the reason for that, I think, may be that the results aren’t very clear. Game-by-game ERA against a given team (the A’s) varies a lot, making it hard to predict.
I’ve already collected a good amount of data…would anyone want me to post similar analysis of K/9 and WHIP as predictors of performance against the A’s? It would probably have the same flaws as this fanpost…so I’ll only post it if people want it.
Yes, I missed one obvious difference
between the top and bottom graph: one is per game, and one is per pitcher per season. The fact that the bottom graph’s data points are aggregated by pitcher over all his games would remove a certain amount of variance from the data, so to speak. The real test would be to do variance components analysis on a game basis, including factors such as pitcher, opponents, home/road, # of hits, # of walks, etc. If you do this for the A’s alone, and for the rest of the league, you can then see which of the categories differ the most. For that I’m sure you’ll need to collect a whole lot of data.
As for the confidence interval, it’s completely unrelated to whether you are sampling or not. It only depends on the model you are assuming and the number of data points you have. A quick and dirty way to assess how confident you are in the fitted slope: randomly sample half of your data points and do a fit, then draw another sample and do another fit. Do this a few times, and see how different the slope values from each fit are. That would give you an idea of how confident you are about the slope.
by asfansince1989 on Jul 22, 2008 1:40 PM PDT
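The quick and dirty half-sample check described above could be sketched roughly like this. The data points are invented for illustration (not the real starts), the function names are hypothetical, and the number of rounds and the seed are arbitrary choices.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def half_sample_slopes(x, y, rounds=200, seed=0):
    """Refit the slope on random halves of the data and collect the results."""
    rng = random.Random(seed)
    n = len(x)
    half = n // 2
    slopes = []
    for _ in range(rounds):
        idx = rng.sample(range(n), half)  # half the points, without replacement
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # slope is undefined if every sampled x is identical
        slopes.append(ols_slope(xs, ys))
    return slopes


# Made-up illustrative data, NOT the real 2008 starts
bb9 = [1.2, 1.8, 2.1, 2.4, 2.9, 3.3, 3.6, 4.0, 4.4, 5.1]
era = [0.0, 3.0, 1.5, 6.0, 2.3, 9.0, 4.5, 5.4, 12.0, 7.2]

slopes = sorted(half_sample_slopes(bb9, era))
print(f"full-data slope:   {ols_slope(bb9, era):.2f}")
print(f"half-sample range: {slopes[0]:.2f} to {slopes[-1]:.2f}")
print(f"middle 90% or so:  {slopes[len(slopes) // 20]:.2f} to {slopes[-(len(slopes) // 20) - 1]:.2f}")
```

A wide spread in the refitted slopes would suggest the 1.3 figure is not pinned down very tightly; a narrow one would be reassuring. This is a rough stability check, not a formal confidence interval.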
More to come?
I’ve already collected a good amount of Data…would anyone want me to post similar analysis of K/9 and WHIP as predictors of performance against the A’s? It would probably have the same flaws as this fanpost…so I’ll only post it if people want it.
I collected all this hoping to see stronger results, but it didn’t pan out fully.
There are three kinds of lies
Lies, damned lies, and statistics.
Post it! Don’t cost nothing.
it is not possible to strategize while the ball is coming towards you
by eastcoasta'sfan on Jul 22, 2008 8:22 PM PDT
Cool post.
If I may ask, are you using R to do your analysis and graphs?
Raw Data
I have raw data (game-by-game) of each start against the A’s, including the number of runs allowed and innings pitched.
If anyone wants it, for curiosity, or to do something with it themselves, let me know.
Collecting the data was the hardest part of this post.
If the A's Don't Walk, They Don't Score ...
and you’ve demonstrated it. What’s amazing, though, is that even though people on AN have known for years that you beat the A’s simply by throwing strikes (all the more so when there’s no one on the team with any power), so many guys just don’t throw strikes, and the A’s actually win a lot of games in which runners got on base because of walks. If I were a manager of a team facing the A’s, I would pay a pitcher for giving up home runs when behind in the count, and charge him for issuing walks, just to get the point across.
The most amazing guy was Kendall and how often relative to his power he would draw a walk. You kept wondering: how in the world can the pitcher ever even get to 3 balls – what’s he worried about, that Kendall might bleed a ball into right field? But despite everything, Kendall worked a lot of deep counts. I guess it is hard to throw strikes, but there are some pretty crappy guys on the Twins who used to just wear the A’s out. Carlos Silva comes to mind.
One conclusion may be significant when BB/9 <2
Looking at the top chart, one thing that I’m still chewing over is how tight of a distribution the per game ERA’s are when the A’s get < 2.5), even the 3 data points at ERA ~6, which are probably 4 runs in ~6 innings of work, and are not overwhelming offensive productions. This says to me that the slope may be arguable due to the small sample size, but there may be enough evidence to say that the A's just don't score when they don't walk. Mathematically, the intercept is the number that bears this out. On the league-average line, at zero BB/9 a team can still get 3 runs, but the A's only get 1 run. That is a huge difference.
Basically, I agree with everything that is shown here, as well as what is obvious to human eyes if we had just watched a couple of those 0 walk games.
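For reference, the "4 runs in ~6 innings is an ERA of ~6" arithmetic can be sketched as below. The function name is made up, and it takes outs recorded rather than innings to sidestep the 6.1/6.2 box-score notation. Note that official ERA counts earned runs only, while the post's raw data is described as runs allowed.

```python
def game_era(runs: int, outs: int) -> float:
    """ERA for one start: runs allowed per 27 outs (9 innings)."""
    if outs <= 0:
        raise ValueError("a start with no outs recorded has no finite ERA")
    return 27 * runs / outs


# 4 runs over 6 full innings (18 outs)
print(f"{game_era(4, 18):.2f}")

# A hypothetical short outing: 6 runs over 2 2/3 innings (8 outs)
print(f"{game_era(6, 8):.2f}")
```

Short outings inflate the number quickly, which is one reason single-start ERAs swing so much more than season ERAs.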
