Stats, Projections and the Future: Joint Diary (hopefully)

Grover and Sal were commenting on projections and stats in the LD yesterday, and I thought this diary was as good a place and time as any to talk about which sets of information and which ranking systems we think are most trustworthy.  Because I'm further exploring a discussion that was mostly between two people, I'm calling this the first-ever 'joint-diary' on AN.

Moneyball taught us that the A's have been one of the most stat-oriented ballclubs, and it seemed for a while that Stat-Ball was winning out over Old-Ball (gut feelings, hunches, non-quantifiable experience).  But with the recent overhaul of the Dodgers, perhaps the pendulum of baseball ideology is swinging back in the other direction.

Grover commented before that it's a thin line he treads, navigating the waters of stat-land and mixing in projections.

PECOTA isn't omnipotent and they can't say for certain... because no one can... how anyone will pitch next year. If they were willing to sacrifice their 1st born children if they mess up a projection THEN their data should be given more importance.

All I'm saying is a preseason projection shouldn't have the power to sway opinion on deals from the previous years. Their 2006 performances should be the measuring stick for any such judgement.

And then he says...

The A's have a couple prospects (Putnam and Buck off the top of my head) who should be able to match Barton's current projected production at the big league level. The difference between Barton and those guys is age, Daric's younger so he has more "potential" than the other two.

Sal read it as apathy, and I agree to an extent, but I definitely hear what Grover is saying.  In reality all PECOTA is attempting to do is quantify players' abilities, and create projections about how a guy will perform.  But it's not a perfect science by any means.  In fact, it all boils down to which stat set or method of projection you believe does the best job of doing so.

With this being said, I originally was just going to ask Grover to elaborate on how he handles the intersection of stats, projections and intangibles, but as I was writing my response to his post, I got curious as to how everyone regards this complicated intersection.

In terms of religion, I'm agnostic, because there's no way to know the truth-- like to actually know it.  And I guess I'd term myself 'agnostic' in terms of baseball too.  What I mean is that stats are all we have in order to find 'truth' in baseball, so to disregard them would be foolish, but at the same time, I wonder if statistics and numbers tell the whole story, or what other factors might be in play?

Statheads put their faith in stats because stats most often correlate to results, but perhaps we must step back yet again and consider the idea that something else causes numbers in the first place.  What I'm talking about is basically like the reason that Daric Barton is a professional ballplayer and not, say, me or you.  And that certain players perform consistently better than others.  Some people call this 'talent,' others call it 'potential.'

But I have a lot of problems with these words.  Talent and potential are two things that don't exist intrinsically, or maybe they do, but in reality they are things that can only be measured definitively in retrospect.

Sure, stats tell a lot of the story, but what percentage of the story do they tell?

100%? 99%? 90%? 75%? 50%??

I think most fans will concede that the stats we have tell at least 50% of the 'story of baseball,'  as they provide a quantified history of the game, as well as an up-to-date snapshot of what is occurring in baseball.  

But I have a really hard time believing that we're anywhere near being able to totally quantify the game-- unless, of course, you have a mathematical breakdown at an atomic or quantum level and can completely understand the way particles interact (and if you do, we should talk, because then we can go beyond baseball and revolutionize the modern paradigm of thought).

So, assuming we don't have the second coming of Einstein reading AN, who's ready to apply quantum theory to baseball, we're left with a slew of stat-sets to choose from.  To which do you subscribe, AN?

[And just one last caveat to the uber-statheads who I (and many other AN'ers, I'm sure) admire: there are many people who know less than you, but want to learn.  So if you can classify what is specific to each stat-set, what makes it unique in itself, and why you favor it, I (and countless others) would appreciate it.  Thanks and looking forward to some good discussion!]

Comments

a couple of thoughts on stats
Your diary was getting into some very philosophical areas, so my thoughts seem very pedestrian, but I'll share anyway.

To me there are stats that try to tell you what happened and stats that try to tell you what will happen.

1. Stats to Give Credit for the Past

Traditional baseball stats (BA, ERA, etc.) are based on arbitrary notions of morality (like walks are the pitcher's fault, errors are always bad and the fault of the fielder touching the ball).  

Ultimately, the only stat that counts is The Team Win. Therefore, in my mind, the ideal credit stat would be an ideal version of Win Expectancy Added, with play by play fielding data added and somehow weighted for lineup, bench and manager strategy. In a past diary, I noted an intriguing relationship between Win Expectancy Added and Win Shares Above Bench, which made me feel better about both. Also due to the James Pythagorean Formula, Wins are almost completely reduced to Runs Scored and Runs Allowed. Thus all the stats like Runs Created and VORP (based on runs contributed) are all 'credit' stats using Runs as a proxy for Wins.

These stats should not be used in deciding on a deal until you've established their predictive power.
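
For anyone who wants to see the Pythagorean idea in numbers, here is a minimal Python sketch. It assumes the classic exponent of 2 and a 162-game season; the function name and the run totals in the usage note are made up for illustration and are not taken from any real team.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Estimate season wins from runs scored and allowed.

    Uses the Bill James Pythagorean form RS^x / (RS^x + RA^x).
    The exponent of 2 is the classic choice and is an assumption here.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    win_pct = rs / (rs + ra)   # expected winning percentage
    return games * win_pct     # expected wins over the season


# Hypothetical team: 800 runs scored, 700 allowed
print(round(pythagorean_wins(800, 700), 1))
```

A team that scores exactly as many runs as it allows comes out at 81 wins, which is the sense in which Wins reduce to Runs Scored and Runs Allowed.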

2. Stats to Predict the Future

The traditional baseball stats (pitcher wins, errors, BA, etc.) try to morally assign credit to what happened. But people tried to use them to predict what will happen. It turns out these stats do not predict the future well, thus the saber-revolution with PECOTA and Defense Independent Pitching Stats, study of development of players by age, etc.  These stats are all about trying to isolate numbers that are consistent through time.

These stats are the only ones that can be used in deciding on a deal.

3. On Assessing Deals

I don't spend much time myself thinking about who 'won' a deal (it's not zero-sum... teams have short-term goals and long-term goals, so both teams can win or lose) or even how 'good' a deal was (you'd need to know all the deals actually available at that time with the GMs in power at the time).  You make the deal with the best of your knowledge of the range of probable future outcomes. After the deal, history will tell you how to refine your understanding of the stats that predict.

As fans, we can't resist deciding if a deal was a benefit to the team. To that extent, of course it's only the stats that give credit that count. But if the contracts are still up, PECOTA could give you reason to hope. But I don't think fans have enough info to really decide if a deal was the best possible or not.

by Apricot on Feb 18, 2006 7:58 AM PST

Clutch Hitting
I have never seen a stat that measures clutch hitting.  Can we measure how successful a batter is in an RBI situation?  Total RBIs depend on so many variables and do not really do the trick.  For example, could we measure RBIs per RBI opportunity? Say bases-empty home runs do not count, and walks do not count unless they force in a run.  We could develop our own AN clutch hitting stat.
Jim
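
One way to read the proposal above, as a rough Python sketch rather than an established stat: count a plate appearance as an RBI opportunity only when at least one runner is on base, and skip walks that do not force in a run. The `PlateAppearance` record and its field names are hypothetical, and counting any runner on base (not just runners in scoring position) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PlateAppearance:
    runners_on: int        # runners on base when the batter came up (0-3)
    rbi: int               # runs batted in on the play
    is_walk: bool = False  # True if the plate appearance ended in a walk


def rbi_per_opportunity(plate_appearances):
    """RBIs divided by RBI opportunities, under the rules described above."""
    opportunities = 0
    rbi = 0
    for pa in plate_appearances:
        if pa.runners_on == 0:
            continue  # bases empty: not an opportunity, so solo homers are ignored
        if pa.is_walk and pa.rbi == 0:
            continue  # a walk only counts when it forces in a run
        opportunities += 1
        rbi += pa.rbi
    return rbi / opportunities if opportunities else 0.0
```

Whether a rate like this measures "clutch" or just lineup context is a separate question; the sketch only shows that the bookkeeping is simple once play-by-play data is available.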

by jarforcefatherofforce on Feb 18, 2006 10:11 AM PST

funny you should say that...
...Back in October of last year, I actually dedicated an entire diary to this question, whether or not clutch hitting can be statistically quantifiable.  Unfortunately, it was not met with the enthusiasm I was hoping for (8 measly comments), so I didn't continue my efforts.  If you want to see what I was working on, check it out and email me if you want to see the preliminary work I did.
Beware of the Elephant...

by rungood on Feb 18, 2006 12:29 PM PST

If I recall correctly...
Bill James and company have tried a few times to come up with a way to measure "clutch", and have failed. It's apparently not possible to differentiate between players' "clutchness" in a significant way. I may be wrong, and I don't recall where I read that.

Here's a link to Stats Inc.'s AL BA Late & Close, which is an attempt at a "clutch stat":

http://snap.stats.com/premium/sfa/stats/getleaders.asp?rank=056&Submit=Go

My problem with these types of stats isn't that I dispute the existence of "clutchness", but that it's invariably based on too small of a sample size. I think some people do better under pressure, and change their swing style, but I haven't seen any data as of yet that creates a significant difference between their "clutch" and normal performance.

by Ryan Armbrust on Feb 18, 2006 12:32 PM PST

Clutchness
There may be a correlation between normal and clutch hitting, but there should be a stat to tell us that.  I don't know if Stats Inc.'s Late & Close stat is a major indicator.  Is driving in two runs with runners in scoring position in the 8th inning more important than in the third inning?  I don't think so. Couldn't we develop a simple stat that tells us the percentage of runners in scoring position per attempt?  If Nick Swisher drives in 75 runs, doesn't it depend a great deal on how many runners were in scoring position when he came to bat?  He may have a higher percentage than many cleanup hitters.  I think this might be a good measure of "clutch" hitting.
Jim

by jarforcefatherofforce on Feb 18, 2006 2:18 PM PST

Isn't that
batting average with runners in scoring position? Or something similar? Basically you're asking for a stat saying how many times a hitter hit with runners in scoring position... % of runners in scoring position per attempt to hit would essentially be saying, OK, there are an average of, say, .86 runners on base when Nick Swisher is at the plate, but that doesn't help much because his OBP/AVG would not change just by the fact that there are more players in front of him who can get on base. Or would it? Seeing as Swish probably won't play much more than 10 years in MLB, any sample size would be way too small... I think generally speaking, what you're looking for is BA/RISP, but that isn't a good predictor of "clutchness" since it inherently implies that a hitter can "choose" when he wants to hit a ball...

by Alon on Feb 20, 2006 10:26 AM PST

Someone post this in a diary, please
http://humbug.baseballtoaster.com/archives/321248.html
I'd do it but I don't know how! It's too hilarious and ingenious not to post, someone do it now!
"We don't start nothin', but we don't take nothin' either" - Dusty Baker

by Philip Christy on Feb 18, 2006 12:33 PM PST

That's crazy.
I liked "Notably Milder".
"How much room do I have to cover out here?" -- Kotsay

by Sharon on Feb 18, 2006 1:22 PM PST

lol
I hate "Ma Killers".
"How much room do I have to cover out here?" -- Kotsay

by Sharon on Feb 18, 2006 1:46 PM PST

Philosophically
I often hear something like, "Player X does A, B, and C, but it doesn't show up in the stat sheet.  Ergo, statistical analysis is inherently flawed."  That's the wrong way to look at it.  The correct way to think about the problem, as with science, is to think about how we might answer the question.  So, we should say, "Player X does A, B, and C. How can we quantify the effects of A-C?"

To think that we should not try to do so is lazy.  To think that we cannot do so is defeatist.  I am sure that somebody once said - maybe in the 20s, maybe in the 60s - "Player X sure takes a lot of walks - how can we quantify that?"  Boom, OBP is born.  At some level, everything can be made more objective, if not quantifiable.  Even PECOTA takes into account factors that usually fall into the "scouts" realm: body type, handedness, weight, height, etc.

What's next?  Objectifying and quantifying health, physical and mental, the dreaded "clutch" debate, defense, stuff like that.  We're far from done, but we've made a good dent.

Bill James has an interesting philosophical take on things:

There has never been a time in baseball history when 10-game winners were paid as much as 20-game winners. There has never been a time when .250 hitters were paid as much as .300 hitters. There has never been a time when outfielders who drove in 100 runs didn't make the All-Star team, or when outfielders who drove in 60 runs did. There has never been a time when you would trade a 30-homer guy for a 20-homer guy, unless there was something else in the deal.  

The stats have always played a huge role in how players are evaluated, and a huge role in every decision. What has changed is two things. First, there are more stats around. And second, there is some change in the emphasis on different stats.   On base percentage is more important than it used to be.

(Emphasis mine)

I can't say I disagree.  Even the notorious anti-stat Rev Halofan uses stats.  He just happens to mistakenly think that OPS and ERA aren't statistics.

Copernicus felt the same way about the geocentric crew.

by salb918 on Feb 18, 2006 6:13 PM PST

agreed.
We have a LONG way to go.  It can be said for baseball, and it can be said for life in general.  In baseball it's the reason new stats keep getting created year in and year out.

So, the next logical questions are these:

a) What stats do you trust most today?

b) What are the stats of tomorrow?  --> i.e.- do you really think something like a "health %" will exist and be taken seriously?

c) How might defense be better quantified?  I've heard of DIPS, but I don't know much about it.  Does it leave a lot out or would you say it's pretty complete and meaningful?  I ask because I don't know.

Beware of the Elephant...

by rungood on Feb 18, 2006 6:25 PM PST

I think I'm going to cry.
That's just great.

lmao

"How much room do I have to cover out here?" -- Kotsay

by Sharon on Feb 19, 2006 5:56 PM PST

Better than ice-dancing.
Copernicus felt the same way about the geocentric crew.

by salb918 on Feb 19, 2006 5:58 PM PST

He's not serious, right?
"How much room do I have to cover out here?" -- Kotsay

by Sharon on Feb 19, 2006 6:02 PM PST

There's no way to know
That's what makes it great.
Copernicus felt the same way about the geocentric crew.

by salb918 on Feb 19, 2006 6:21 PM PST

He's a machine.
Wow.
Copernicus felt the same way about the geocentric crew.

by salb918 on Feb 20, 2006 8:10 AM PST

Is it just me
Or is he projecting the Angels to score 1000 plus runs? And he compliments Stoneman on signing Weaver instead of Byrd and Washburn... Just browsing through Halo Heaven is an interesting experience: a whole lot of anti-Billy Beane stuff (they even have a name... Beane-eaters?), plus a token, ridiculously long and tedious list of the Angels' top 100 players (which isn't really quantifiable until you reach maybe the top 15...)

On Darin Erstad:
"Before you yammer on with your stats and disbelief in the intangibles, the poetry and unquantifiable foggy grit that make great team players, look at your Rosetta Stone of sabermetric stats - Win Shares. Guess what Stat-Ass? Darin Erstad has the 8th most Win Shares in Angel history. So he is not overrated. He's Top Ten material all around."

I love the eloquence of the first sentence, followed by "Guess what Stat-Ass?" and then a Win Shares stat, as though all of stats-land is against Erstad.

by Alon on Feb 20, 2006 10:47 AM PST

heh heh
I hate to be on the Rev's side even for an instant, but Erstad is definitely a whipping boy for Der Stat-Asses. Kind of the way Jeter is a whipping boy for Der Defensive Stat-Asses and A-Rod is a whipping boy for Der Grit-Asses.

by Apricot on Feb 20, 2006 10:52 AM PST

Baseball Prospectus ran 100 game simulations
with the A's getting anywhere from 94 - 109 wins.

That is some of the most sophisticated projecting done today, and the results still span 15 wins, roughly a 15% spread.

15%!

Our readers are that good going with gut instincts.

Last year I predicted 94 wins for the A's. They only had 88, with 114 runs as their margin over rivals. I consider myself very close, since the Angels got 95 wins with their cushion of 118 runs.

This year I see A's pitching and defense improved adding maybe 3 runs to that margin. I also see offense improved with the youngsters a year older and wiser and newcomers MB & Frank adding pop and protection. Offense improves and adds 45 runs to that margin.

A 162 run margin will add many wins. But how many?
I'm thinking the A's get more laughers and end up with 105 wins.

Swisher: "If we're healthy, we could be a monster."

by A s Eh on Feb 18, 2006 11:44 PM PST

10 runs = 1 win
And basically the A's are projected to win 10 or so more games than they won last year, so look for an average of about 98.
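
As a quick check on the "10 runs = 1 win" rule of thumb, here is a small Python sketch. It assumes a team with a zero run margin wins half its games (81 of 162) and that every 10 runs of margin adds one win; the function name is made up, and the inputs are the run margins quoted in this thread.

```python
def wins_from_run_margin(run_margin, games=162, runs_per_win=10.0):
    """Rule-of-thumb wins: a .500 baseline plus one win per `runs_per_win` runs."""
    baseline = games / 2  # assumption: a zero run margin means a .500 team
    return baseline + run_margin / runs_per_win


# Last year's 114-run margin and the 162-run margin projected above
print(wins_from_run_margin(114))
print(wins_from_run_margin(162))
```

By this rule a 162-run margin works out to about 97 wins, in line with the ~98 figure and well short of 105. The same rule puts last year's 114-run margin at about 92 wins against the actual 88, which shows how loose the rule is for any single season.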

by Alon on Feb 20, 2006 10:54 AM PST

Stats etc.
One of the nicest things about statistics is that they make no pretense of infallibility. If you say the A's are going to win 100 games, and your standard deviation is 20 games, you know exactly how accurate the prediction is (in that case, not so accurate). When PECOTA gives a .300 EqA prediction, but the 10th percentile is .250 and the 90th percentile is .350, you know that there just isn't enough information available to make an accurate projection.
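
To make the idea of a projection's spread concrete, here is a toy Python sketch. It is not how PECOTA or Baseball Prospectus run their simulations; it simply assumes every game is an independent coin flip with a fixed win probability, and all function names are made up for illustration.

```python
import math
import random


def season_sd(win_prob, games=162):
    """Standard deviation of season wins when each game is an independent coin flip."""
    return math.sqrt(games * win_prob * (1 - win_prob))


def simulate_seasons(win_prob, n_seasons=100, games=162, seed=0):
    """Return sorted win totals for `n_seasons` simulated seasons."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same seasons
    return sorted(
        sum(rng.random() < win_prob for _ in range(games))
        for _ in range(n_seasons)
    )


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


seasons = simulate_seasons(0.6)
print(season_sd(0.6), percentile(seasons, 10), percentile(seasons, 90))
```

Even with the true talent level known exactly, luck alone gives a standard deviation of a little over six wins per season under these assumptions, so a wide range across 100 simulated seasons is not in itself a sign of a bad model.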

Secondly, projections based on personal knowledge and instinct are notoriously inaccurate. There are far too many details for a person to track. People may assign improper value to different factors. When you're speaking of statistics, your model is always consistent, and it allows you to very quickly address previous errors.

One of the fundamental flaws in human reasoning is each person's belief in the accuracy of their own observations. Rain dancing does not change weather patterns, no matter how many generations of very intelligent and observant people believed it did. The only way to avoid propagating error is to constantly challenge, test, and re-evaluate our own beliefs. To do this we must have objective data, or statistics.

by MrIncognito on Feb 19, 2006 2:27 PM PST

Einstein hated quantum mechanics
He was really the last (and arguably the greatest) classical physicist.

If you want to apply quantum theory to baseball then (i) you probably want the uncertain Heisenberg  or the catty Schroedinger or the constant Planck or someone, and (ii) things are going to get very weird (especially in an infinite universe).

by green star oakland on Feb 19, 2006 10:36 PM PST

I had to read your post twice
to realize it wasn't about how well Hatteberg matches up against Schoeneweis.

by Nico on Feb 19, 2006 10:57 PM PST

you're right, but I was talking about these guys:

But yea, I should have written Planck or Schroedinger....or his cat. ;)

Beware of the Elephant...

by rungood on Feb 20, 2006 10:03 AM PST

Why would
knowing where an atom is but not how fast it's going or the reverse help in baseball?

by Alon on Feb 20, 2006 11:02 AM PST

It might help
if you had Atom Hyzdu in a rundown.

by Nico on Feb 20, 2006 11:19 AM PST
