Stats, Projections and the Future: Joint Diary (hopefully)
Grover and Sal were commenting on projections and stats in the LD yesterday, and I thought this diary was as good a place and time as any to talk about which sets of information and which ranking systems we think deserve the most trust. Because I'm expanding on a discussion that was mostly between two people, I'm calling this the first-ever 'joint-diary' on AN.
Moneyball taught us that the A's have been one of the most stat-oriented ballclubs, and it seemed for a while that Stat-Ball was winning out over Old-Ball (gut feelings, hunches, non-quantifiable experience). But with the recent overhaul of the Dodgers, perhaps the pendulum of baseball ideology is swaying back in the other direction.
Grover commented before that it's a thin line he treads-- navigating the waters of stat-land and mixing in projections.
PECOTA isn't omnipotent and they can't say for certain... because no one can... how anyone will pitch next year. If they were willing to sacrifice their 1st born children if they mess up a projection THEN their data should be given more importance. All I'm saying is a preseason projection shouldn't have the power to sway opinion on deals from the previous years. Their 2006 performances should be the measuring stick for any such judgement.
And then he says...
The A's have a couple prospects (Putnam and Buck off the top of my head) who should be able to match Barton's current projected production at the big league level. The difference between Barton and those guys is age, Daric's younger so he has more "potential" than the other two.
Sal read it as apathy, and I agree to an extent, but I definitely hear what Grover is saying. In reality all PECOTA is attempting to do is quantify players' abilities, and create projections about how a guy will perform. But it's not a perfect science by any means. In fact, it all boils down to which stat set or method of projection you believe does the best job of doing so.
With this being said, I originally was just going to ask Grover to elaborate on how he handles the intersection between stats and projections and intangibles, but as I was writing my response to his post, I got curious as to how everyone regards this complicated intersection.
In terms of religion, I'm agnostic, because there's no way to know the truth-- like to actually know it. And I guess I'd term myself 'agnostic' in terms of baseball too. What I mean is that stats are all we have in order to find 'truth' in baseball, so to disregard them would be foolish, but at the same time, I wonder if statistics and numbers tell the whole story, or what other factors might be in play?
Statheads put their faith in stats because stats most often correlate to results, but perhaps we must step back yet again and consider the idea that something else causes numbers in the first place. What I'm talking about is basically the reason that Daric Barton is a professional ballplayer and not, say, me or you, and the reason that certain players perform consistently better than others. Some people call this 'talent,' others call it 'potential.'
But I have a lot of problems with these words. Talent and potential are two things that don't exist intrinsically, or maybe they do, but in reality they are things that can only be measured definitively in retrospect.
Sure, stats tell a lot of the story, but what percentage of the story do they tell?
100%? 99%? 90%? 75%? 50%??
I think most fans will concede that the stats we have tell at least 50% of the 'story of baseball,' as they provide a quantified history of the game, as well as an up-to-date snapshot of what is occurring in baseball.
But I have a really hard time believing that we're anywhere near being able to totally quantify the game-- unless, of course, you have a mathematical breakdown at an atomic or quantum level and can completely understand the way particles interact (and if you do, we should talk, because then we can go beyond baseball and revolutionize the modern paradigm of thought).
So, assuming we don't have the second coming of Einstein reading AN, who's ready to apply quantum theory to baseball, we're left with a slew of stat-sets to choose from. To which do you subscribe, AN?
[And just one last caveat to the uber-statheads who I (and many other AN'ers, I'm sure) admire: there are many people who know less than you, but want to learn. So if you can classify what is specific to each stat-set, what makes it unique in itself, and why you favor it, I (and countless others) would appreciate it. Thanks and looking forward to some good discussion!]
Comments
a couple of thoughts on stats
To me there are stats that try to tell you what happened and stats that try to tell you what will happen.
1. Stats to Give Credit for the Past
Traditional baseball stats (BA, ERA, etc.) are based on arbitrary notions of morality (like walks are the pitcher's fault, errors are always bad and the fault of the fielder touching the ball).
Ultimately, the only stat that counts is The Team Win. Therefore, in my mind, the ideal credit stat would be an ideal version of Win Expectancy Added, with play by play fielding data added and somehow weighted for lineup, bench and manager strategy. In a past diary, I noted an intriguing relationship between Win Expectancy Added and Win Shares Above Bench, which made me feel better about both. Also due to the James Pythagorean Formula, Wins are almost completely reduced to Runs Scored and Runs Allowed. Thus all the stats like Runs Created and VORP (based on runs contributed) are all 'credit' stats using Runs as a proxy for Wins.
These stats should not be used in deciding on a deal until you've established their predictive power.
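For anyone who wants to see how Runs stand in for Wins, here is a minimal sketch of the Pythagorean idea mentioned above. It assumes the classic exponent of 2 and a 162-game schedule. The runs scored and allowed below are illustrative inputs picked to give the 114-run margin quoted elsewhere in this thread; they are an assumption, not numbers from this comment, and the function names are made up for the example.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimated winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Scale the estimated winning percentage to a full schedule."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Assumed, illustrative inputs: 772 runs scored, 658 allowed (a 114-run margin).
print(round(expected_wins(772, 658), 1))
```

A team that scores exactly as many runs as it allows comes out at .500, which is a quick sanity check on the formula.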
2. Stats to Predict the Future
The traditional baseball stats (pitcher wins, errors, BA, etc.) try to morally assign credit to what happened. But people tried to use them to predict what will happen. It turns out these stats do not predict the future well, thus the saber-revolution with PECOTA and Defense Independent Pitching Stats, study of development of players by age, etc. These stats are all about trying to isolate numbers that are consistent through time.
These stats are the only ones that can be used in deciding on a deal.
3. On Assessing Deals
I don't spend much time myself thinking about who 'won' a deal (it's not zero-sum... teams have short-term goals and long-term goals, so both teams can win or lose) or even how 'good' a deal was (you'd need to know all the deals actually available at that time with the GMs in power at the time). You make the deal with the best of your knowledge of the range of probable future outcomes. After the deal, history will tell you how to refine your understanding of the stats that predict.
As fans, we can't resist deciding if a deal was a benefit to the team. To that extent, of course it's only the stats that give credit that count. But if the contracts are still up, PECOTA could give you reason to hope. But I don't think fans have enough info to really decide if a deal was the best possible or not.
Clutch Hitting
by jarforcefatherofforce on Feb 18, 2006 10:11 AM PST
funny you should say that...
If I recall correctly...
Here's a link to Stats INC's AL BA Late & close, which is an attempt to have a "clutch stat"
http://snap.stats.com/premium/sfa/stats/getleaders.asp?rank=056&Submit=Go
My problem with these types of stats isn't that I dispute the existence of "clutchness", but that it's invariably based on too small a sample size. I think some people do better under pressure, and change their swing style, but I haven't seen any data as of yet that shows a significant difference between their "clutch" and normal performance.
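To put a rough number on the sample size problem, here is a minimal sketch. It assumes each at-bat is an independent trial with a fixed chance of a hit (a simple binomial model, which is a simplification), and the .270 average and the 80 and 550 at-bat totals are hypothetical values chosen for illustration, not Stats INC data.

```python
import math


def batting_avg_standard_error(true_avg, at_bats):
    """Standard error of an observed batting average under a binomial model."""
    return math.sqrt(true_avg * (1 - true_avg) / at_bats)


# Hypothetical hitter: a .270 true average over a "late & close" sample
# (80 AB) versus a full season (550 AB).
for at_bats in (80, 550):
    se = batting_avg_standard_error(0.270, at_bats)
    print(f"{at_bats} AB: standard error of about {se:.3f}")
```

Under those assumptions the noise in the small sample is a good deal larger than in the full season, so an apparent "clutch" gap of a few dozen points could easily be luck.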
by Ryan Armbrust on Feb 18, 2006 12:32 PM PST
Clutchness
by jarforcefatherofforce on Feb 18, 2006 2:18 PM PST
Isn't that
Someone post this in a diary, please
I'd do it but I don't know how! It's too hilarious and ingenious not to post, someone do it now!
by Philip Christy on Feb 18, 2006 12:33 PM PST
That's crazy.
lol
Poppy's on it...
Philosophically
To think that we should not try to do so is lazy. To think that we cannot do so is defeatist. I am sure that somebody once said - maybe in the 20s, maybe in the 60s - "Player X sure takes a lot of walks - how can we quantify that?" Boom, OBP is born. At some level, everything can be made more objective, if not quantifiable. Even PECOTA takes into account factors that usually fall into the "scouts" realm: body type, handedness, weight, height, etc.
What's next? Objectifying and quantifying health, physical and mental, the dreaded "clutch" debate, defense, stuff like that. We're far from done, but we've made a good dent.
Bill James has an interesting philosophical take on things:
The stats have always played a huge role in how players are evaluated, and a huge role in every decision. What has changed is two things. First, there are more stats around. And second, there is some change in the emphasis on different stats. On base percentage is more important than it used to be.
(Emphasis mine)
I can't say I disagree. Even the notorious anti-stat Rev Halofan uses stats. He just happens to mistakenly think that OPS and ERA aren't statistics.
agreed.
So, the next logical questions are these:
a) What stats do you trust most today?
b) What are the stats of tomorrow? For example, do you really think something like a "health %" will exist and be taken seriously?
c) How might defense be better quantified? I've heard of DIPS, but I don't know much about it. Does it leave a lot out or would you say it's pretty complete and meaningful? I ask because I don't know.
Speakin' of the REV
I think I'm going to cry.
lmao
He's not serious, right?
There's no way to know
Is it just me
On Darin Erstad:
"Before you yammer on with your stats and disbelief in the intangibles, the poetry and unquantifiable foggy grit that make great team players, look at your Rosetta Stone of sabermetric stats - Win Shares. Guess what Stat-Ass? Darin Erstad has the 8th most Win Shares in Angel history. So he is not overrated. He's Top Ten material all around."
I love how the eloquence of the first sentence is followed by "Guess what Stat-Ass?" and then by a Win Shares stat, as though all of stats-land were against Erstad.
heh heh
Baseball Prospectus ran 100 game simulations
That is some of the most sophisticated projecting done today and they have a 15% variance.
15%!
Our readers are that good going with gut instincts.
Last year I predicted 94 wins for the A's. They only had 88 with 114 runs as their margin over rivals. I consider myself very close since the Angels got 95 wins with their cushion of 118 runs.
This year I see A's pitching and defense improved adding maybe 3 runs to that margin. I also see offense improved with the youngsters a year older and wiser and newcomers MB & Frank adding pop and protection. Offense improves and adds 45 runs to that margin.
A 162 run margin will add many wins. But how many?
I'm thinking the A's get more laughers and end up with 105 wins.
by A s Eh on Feb 18, 2006 11:44 PM PST
10 runs = 1 win
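Taken at face value, that rule of thumb can be applied to the run margins quoted in the comment above. This is a minimal sketch, assuming a 162-game season, a break-even team at 81 wins, and a straight-line relationship between run margin and wins; the function name is made up for the example.

```python
def wins_from_run_margin(run_margin, games=162, runs_per_win=10.0):
    """Rule of thumb: start at .500 and add one win per ten runs of margin."""
    return games / 2 + run_margin / runs_per_win


# Margins quoted in the thread: 114 and 118 from last year,
# and the projected 162 (114 + 3 + 45).
for margin in (114, 118, 162):
    print(f"{margin:+d} runs -> about {wins_from_run_margin(margin):.1f} wins")
```

By this rule a 162-run margin works out to roughly 97 wins rather than 105. The rule is only a linear approximation, so actual records can land several wins either side of it.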
Stats etc.
Secondly, projections based on personal knowledge and instinct are notoriously inaccurate. There are far too many details for a person to track. People may assign improper value to different factors. When you're speaking of statistics, your model is always consistent, and it allows you to very quickly address previous errors.
One of the fundamental flaws in human reasoning is each person's belief in the accuracy of their own observations. Rain dancing does not change weather patterns, no matter how many generations of very intelligent and observant people believed it did. The only way to avoid propagating error is to constantly challenge, test, and re-evaluate our own beliefs. To do this we must have objective data, or statistics.
Einstein hated quantum mechanics
If you want to apply quantum theory to baseball then (i) you probably want the uncertain Heisenberg or the catty Schroedinger or the constant Planck or someone, and (ii) things are going to get very weird (especially in an infinite universe).
by green star oakland on Feb 19, 2006 10:36 PM PST
I had to read your post twice
only with Johnny Bench
by green star oakland on Feb 19, 2006 11:05 PM PST
you're right, but I was talking about these guys:
But yeah, I should have written Planck or Schroedinger... or his cat. ;)
