Staturday Presents: The Do-It-Yourself Guide to Statistical Scouting

Good morning.

Actually, I take that back. There's rarely anything good about mornings, and then there's the fact that it may not be morning when you read this, so let me greet you with a more AN-appropriate salutation:

Meh.

As many of you know, I enjoy keeping track of Oakland's farm system and have often taken the time to point out rising stars to the good (and some not so good, you know who you are) folks here at AN. That's something I plan on continuing this season in addition to my Staturday duties, and I wouldn't bet against seeing a Minor League report show up on the Front Page before all is said and done. Now, as much as I like to be a primary provider of minor league info to AN, the fact is my job tends to keep me busy over the summer, and after trading Swisher and Haren for 9 not-necessarily-ready-for-primetime players there are going to be more than the usual number of folks keeping an eye on the farm.

So today I'm going to show you how to create a Major League Equivalency (MLE) for a minor league hitter.

So what is an MLE? Simply put, it's a way to take a prospect's minor league numbers and convert them into their potential major league production. It's important to remember that the MLE does not try to predict any kind of development or progression in the target prospect; it merely puts a big league spin on what the player has already done in the minors. We'll be weighing the player's raw data against the level of competition, league and park factors before seeing how he'd manage in the Show.

Today's volunteer is former D'Back wonder-stud Carlos Gonzalez. We are going to take his 2007 performance for the AA Mobile BayBears and convert it into a 2007 MLE. First we start with the raw data:

458 AB 131 H 33 D 16 HR 32 BB 103 K

The above is the core data; our formula is designed to turn that data into the MLE. Gonzalez also scored 63 runs and hit 3 triples, but we aren't worried about those numbers right now.

The next step is to modify CarGon's stats into park-neutral numbers. Relax, it sounds more complicated than it is. I come bearing Mobile's 2007 Park Multipliers: http://www.baseballthinkfactory.org/files/oracle/discussion/2007_minor_league_park_multipliers/.

What's a park multiplier, you ask? Well, every ballpark has its quirks, and the multiplier is the number you multiply the raw data by to come up with a non-park-influenced (i.e. neutral) number. Since I'm focusing on Gonzalez's 2007 numbers I've chosen to use Mobile's 1 Year data for this example. There are also 3 and 4 year averages available and I recommend using them when trying to use 2008 data to create an MLE. Back to Mobile's multipliers.

Let's focus on Hits for a moment. If you followed the link above you'd see that Mobile's 2007 Hits Column says 0.99. Ah-hah! So that's our multiplier, right? Not quite. What that 0.99 number is saying is that Mobile was 1% better than league average at preventing hits... it took away hits, and that means we need a multiplier greater than 1 to use with our raw Hits number to come up with a neutral number. For right now, remember this formula: (X+.99)/2 = 1, with 1 = neutral. Solving for X gives X = 2 - .99, so our multiplier (X) is 1.01.
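If you'd rather let a computer do the solving, here's a minimal sketch of that formula in Python. The function name neutral_multiplier is made up for this example, and it assumes the column values are published to two decimal places. It only covers the plain case; it does not handle column values of 1.10 and up.

```python
def neutral_multiplier(column_value):
    """Solve (X + column_value) / 2 = 1 for X.

    Hypothetical helper for illustration; rounds to two decimals
    because the park columns are quoted to two decimals.
    """
    return round(2.0 - column_value, 2)


# Mobile's 2007 Hits column is 0.99, so the multiplier comes out above 1.
print(neutral_multiplier(0.99))  # 1.01
```

Going the other way, a column value of 1.01 would give a multiplier of 0.99.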

Pretty simple, right? Well, it stays simple until you hit 1.10 in the Column; then you're supposed to modify the equation some. The A's AAA and AA affiliates don't hit 1.10, so all I'm going to say is: cut the part over 1.00 in half (1.10 becomes 1.05, etc). Back to our example.

So, having looked at the Mobile column our multipliers are (Hits=1.01) (Doubles=.99) (HR=.88) (BB=.93) (K=.98)

Adjusted Hits = Raw Hits(131) X Multiplier(1.01)

Adjusted Hits = 131(1.01) = 133

Adjusted Doubles = 33(.99) = 33

Adjusted Home Runs = 16(.88) = 15

Adjusted Walks = 32(.93) = 30

Adjusted Strike Outs = 103(.98) = 101

Just to head off any questions... yes, walks and strike outs can be influenced by the ballpark. Accept it, move on. Actually... let's look at the Adjusted HR total again. 16(.88) = 14.08 exactly, but I rounded up to 15. Why did I do that? Personal preference. You cannot hit 14 and 1/10th HRs; you either hit 14 or 15 bombs. So if I've got any fraction beyond 14 I round up. If you choose to round up or down depending on the fraction, that's your choice and it's perfectly fine; just be consistent about what you choose to do.
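For anyone who wants to check the arithmetic, here is a rough Python sketch of the park adjustment using the always-round-up rule described above. The names round_up, raw, mobile and adjusted are invented for this example, and the small round() call inside round_up is my own guard against floating-point noise, not part of the method.

```python
import math


def round_up(x):
    # Any fraction past a whole number rounds up (the author's preference).
    # Rounding to 6 decimals first keeps float noise such as
    # 110.00000000000001 from bumping an exact whole number up by one.
    return math.ceil(round(x, 6))


# Carlos Gonzalez's raw 2007 line at AA Mobile.
raw = {"H": 131, "D": 33, "HR": 16, "BB": 32, "K": 103}

# Mobile's 2007 multipliers as listed in the article.
mobile = {"H": 1.01, "D": 0.99, "HR": 0.88, "BB": 0.93, "K": 0.98}

adjusted = {stat: round_up(raw[stat] * mobile[stat]) for stat in raw}
print(adjusted)  # {'H': 133, 'D': 33, 'HR': 15, 'BB': 30, 'K': 101}
```

If you prefer conventional rounding, swap math.ceil(...) for round(...) and use it everywhere.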

So our Adjusted (and now neutral) line looks like this:

458 AB 133 H 33 D 15 HR 30 BB 101 K

Next we need to handicap Gonzalez for being a AA player. We do that with two factors, "m" and "M". When a player jumps from AAA to the Show he ordinarily loses about 20% of his offensive ability relative to league. When the jump is from AA he takes a roughly 30% hit. We'll write this as:

m = .80 when jumping from AAA; m = .70 when jumping from AA. "M" is the square root of "m", therefore:

M(AAA) = .894

M(AA) = .837
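A quick Python check of those two square roots, rounded to three decimals as above. The dictionary names are just for this sketch.

```python
import math

# "m" is the share of offensive ability kept when jumping to the majors.
m = {"AAA": 0.80, "AA": 0.70}

# "M" is the square root of "m", rounded to three decimals.
M = {level: round(math.sqrt(value), 3) for level, value in m.items()}
print(M)  # {'AAA': 0.894, 'AA': 0.837}
```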

Now we need to know the park multipliers for the Coliseum. We'll use 3 year averages for this and refer to these figures as PM. I'm sure you can figure out which PM to use as appropriate.

PM(H) = .95 PM(D) = .98 PM(HR) = .84 PM(BB) = .98 PM(K) = .88

Now we find the respective MLEs! Remember, Gonzalez was in AA last year so use the correct "m" and "M" multipliers.

MLE(H) = AdjH(.98)(M)(PM)

Heads up! Don't confuse that (.98) with some other multiplier; it's a constant that's been included in the formula to make the numbers work right. So...

MLE(H) = 133(.98)(.837)(.95) = 104 Hits

MLE(D) = AdjD(M)(PM)

MLE(D) = 33(.837)(.98) = 28 Doubles

MLE(HR) = AdjHR(m)(PM)

MLE(HR) = 15(.70)(.84) = 9 HR

MLE(BB) = AdjBB(m)(PM)

MLE(BB) = 30(.70)(.98) = 21 BB

MLE(K) = AdjK(1.05)(PM)

MLE(K) = 101(1.05)(.88) = 94 K
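Here are the five MLE formulas in one place as a Python sketch, using the park-neutral line, the AA factors and the Coliseum PMs from the article. The variable names and the round_up helper are assumptions made for this example; the rounding follows the always-round-up preference.

```python
import math


def round_up(x):
    # Round any fraction up; the inner round() only guards against float noise.
    return math.ceil(round(x, 6))


# Park-neutral 2007 line for Gonzalez.
adj = {"H": 133, "D": 33, "HR": 15, "BB": 30, "K": 101}

# Coliseum 3 year park multipliers (PM).
pm = {"H": 0.95, "D": 0.98, "HR": 0.84, "BB": 0.98, "K": 0.88}

# AA factors: m = .70 and M = sqrt(m), rounded to .837.
m, M = 0.70, 0.837

mle = {
    "H": round_up(adj["H"] * 0.98 * M * pm["H"]),   # .98 is the formula's constant
    "D": round_up(adj["D"] * M * pm["D"]),
    "HR": round_up(adj["HR"] * m * pm["HR"]),
    "BB": round_up(adj["BB"] * m * pm["BB"]),
    "K": round_up(adj["K"] * 1.05 * pm["K"]),       # 1.05 is the formula's constant
}
print(mle)  # {'H': 104, 'D': 28, 'HR': 9, 'BB': 21, 'K': 94}
```

For a AAA hitter you would set m = 0.80 and M = 0.894 and swap in that player's adjusted line.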

Clear as mud? Don't worry, we're almost done.

MLE(AB) = RawAB-RawH+MLE(H). So...

MLE(AB) = 458-131+104 = 431 AB

We'll throw in his 3 Triples for flavor and voilà!

Carlos Gonzalez's 2007 MLE:

431 AB 104 H 28 D 3 T 9 HR 21 BB 94 K 241/277/383 660 OPS.
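To double-check the at-bats and the slash line, here is a small Python sketch. It assumes OBP is figured from hits, walks and at-bats only (no HBP or sacrifice flies, since the line doesn't include them) and that OPS is the sum of the already-rounded OBP and SLG; both assumptions reproduce the numbers above. The variable names are made up for this example.

```python
raw_ab, raw_h, triples = 458, 131, 3
mle = {"H": 104, "D": 28, "HR": 9, "BB": 21}

# MLE(AB) = RawAB - RawH + MLE(H)
ab = raw_ab - raw_h + mle["H"]

# Total bases: every hit is worth one base, plus the extra bases
# for doubles (1), triples (2) and home runs (3).
total_bases = mle["H"] + mle["D"] + 2 * triples + 3 * mle["HR"]

avg = round(mle["H"] / ab, 3)
obp = round((mle["H"] + mle["BB"]) / (ab + mle["BB"]), 3)  # assumes no HBP or SF
slg = round(total_bases / ab, 3)
ops = round(obp + slg, 3)

print(ab, avg, obp, slg, ops)  # 431 0.241 0.277 0.383 0.66
```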

Certainly not the most impressive line, especially when you consider that last year AL RFers averaged an .824 OPS. I hope this was clear enough for everyone. I know the formula got scattered a bit by my explanations, so if anyone would like me to re-do the formula in its entirety (minus my ramblings) in a comment I'd be happy to do so. If no one needs me to do that then I'm not going to do it voluntarily because I'm not out of my frakking mind! One last piece o' yummy goodness I will provide is the 3 year multipliers for Sac and Midland; just plug them into your formula and you'll be good to go. I'm not messing with A-ball numbers because the A's rarely run a guy up from A-ball to the Show in a year.

Sac: H=1.03 D=1.03 HR=1.04 BB=1.03 K=.91

Midland: H=.99 D=.99 HR=1.07 BB=1.03 K=1.01

Edit: It seems my intent behind all this wasn't as clear as I had hoped. MLE is not designed to predict future big league production, it merely tries to place minor league numbers into a major league context. My hope is that whoever is following the Sacramento River Cats in 2008 will be able to look at Gonzalez's 850 OPS and be able to evaluate how well his numbers could carry over into the Show. There's no point in calling these guys up before they're ready.