## Does Plate Discipline Matter?

In my last post on the struggles of Yoenis Cespedes, I made the case that the variations in Cespedes' plate discipline (small ones, I might add) are not a large factor in his performance drop from 2012 to 2013. A lot of the comments suggested that I glossed over the differences. So I thought I would dig into the matter a little more deeply.

I ran linear regressions of every offensive category against each plate discipline stat for 698 hitters from 2000 through 2013 (plate discipline stats aren't available before 2000). Below is the data. (If you want to see all the plots, go here.)

A little background. O-swing is the percentage of pitches outside the strike zone that a batter swings at. O-contact is the percentage of those swings at pitches outside the zone on which he makes contact. Z-swing is the percentage of pitches in the strike zone that a batter swings at. Z-contact is the percentage of those in-zone swings on which he makes contact. Swing is the percentage of all pitches a batter swings at.
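To make the definitions concrete, here is a minimal Python sketch that turns raw pitch counts into the five rates. The `PitchCounts` type, its field names, and the sample numbers are all made up for illustration, and the sketch assumes the contact rates are measured per swing rather than per pitch.

```python
from dataclasses import dataclass


@dataclass
class PitchCounts:
    """Hypothetical season totals for one hitter (illustrative names)."""
    zone_pitches: int    # pitches seen inside the strike zone
    zone_swings: int     # swings at those pitches
    zone_contacts: int   # swings in the zone that made contact
    out_pitches: int     # pitches seen outside the strike zone
    out_swings: int      # swings at those pitches
    out_contacts: int    # swings outside the zone that made contact


def plate_discipline(c: PitchCounts) -> dict:
    """Return the rates as fractions between 0 and 1."""
    return {
        "O-Swing": c.out_swings / c.out_pitches,
        # Assumption: contact rates are per swing, not per pitch.
        "O-Contact": c.out_contacts / c.out_swings,
        "Z-Swing": c.zone_swings / c.zone_pitches,
        "Z-Contact": c.zone_contacts / c.zone_swings,
        "Swing": (c.zone_swings + c.out_swings)
                 / (c.zone_pitches + c.out_pitches),
    }


# Invented counts, not any real player's numbers.
sample = PitchCounts(1000, 650, 560, 1200, 360, 216)
for name, rate in plate_discipline(sample).items():
    print(f"{name}: {rate:.3f}")
```

The ratio rows in the tables (for example O-Contact_div_O-Swing) would then just be one of these rates divided by another.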

Linear regression shows how well a set of data can be represented by a straight line. For the data below, I show two values: the r² (Rsq) and the coefficient (Coeff). The r² value corresponds to how well a line represents the data. Zero is a terrible fit, one is a perfect fit. The coefficient is the slope of the line, the m in y = mx + b. For example, say a set of data has an r² value of .8 and a coefficient of 10. That means the data is well represented by a line (.8) and y changes by 10 for every one-unit change in x, which is a strong dependence if y is something on the scale of a batting average.
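For anyone who wants to see where those two numbers come from, here is a small pure-Python sketch of a one-variable least-squares fit. It is not the code used for the tables in this post; the function name `fit_line` and the sample data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept, plus r-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With a single predictor, r-squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Invented data: a contact rate (x) against a strikeout rate (y).
contact = [0.80, 0.84, 0.88, 0.92]
strikeouts = [0.26, 0.21, 0.18, 0.13]
slope, intercept, r_squared = fit_line(contact, strikeouts)
print(f"Coeff = {slope:.3f}, Rsq = {r_squared:.3f}")
```

Running the same function once per pairing of plate discipline stat and offensive category would give a grid of Rsq and Coeff values shaped like the tables that follow.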

| | OPS Rsq | OPS Coeff | HR Rsq | HR Coeff | AVG Rsq | AVG Coeff | BB% Rsq | BB% Coeff |
|---|---|---|---|---|---|---|---|---|
| O-Swing | 0.021334 | -0.175719 | 0.000059 | -10.021932 | 0.000265 | -0.005266 | 0.344052 | -0.253627 |
| Z-Swing | 0.026056 | 0.218331 | 0.036769 | 282.474002 | 0.001213 | 0.012677 | 0.055523 | -0.114551 |
| Swing | 0.014279 | -0.196365 | 0.000393 | 35.483822 | 0.000013 | 0.001567 | 0.424816 | -0.384965 |
| O_Contact | 0.036423 | -0.129274 | 0.01637 | -94.390557 | 0.090233 | 0.054757 | 0.073718 | -0.066101 |
| Z_Contact | 0.069737 | -0.441019 | 0.052596 | -417.140229 | 0.129464 | 0.161709 | 0.082134 | -0.172024 |
| O-Contact_div_O-Swing | 0.000047 | 0.000672 | 0.008932 | -10.035454 | 0.043308 | 0.00546 | 0.136997 | 0.01297 |
| Z-Contact_div_Z-Swing | 0.053327 | -0.113771 | 0.060097 | -131.541957 | 0.020205 | 0.018846 | 0.001815 | 0.007544 |
| Z-Contact_div_O-Contact | 0.01582 | 0.035226 | 0.001018 | 9.734726 | 0.053505 | -0.017434 | 0.050727 | 0.022672 |

| | K% Rsq | K% Coeff | BB_K Rsq | BB_K Coeff | OBP Rsq | OBP Coeff | SLG Rsq | SLG Coeff |
|---|---|---|---|---|---|---|---|---|
| O-Swing | 0.029642 | 0.139306 | 0.347776 | -2.011276 | 0.193944 | -0.189868 | 0.000245 | 0.013858 |
| Z-Swing | 0.024581 | 0.142623 | 0.095617 | -1.185679 | 0.022592 | -0.072857 | 0.085785 | 0.291367 |
| Swing | 0.005198 | 0.079683 | 0.328069 | -2.668343 | 0.221724 | -0.277305 | 0.004472 | 0.080824 |
| O_Contact | 0.349892 | -0.269478 | 0.066978 | 0.496966 | 0.000519 | -0.005531 | 0.061722 | -0.123771 |
| Z_Contact | 0.71858 | -0.952137 | 0.182205 | 2.020909 | 0.00022 | 0.008868 | 0.134239 | -0.45003 |
| O-Contact_div_O-Swing | 0.292536 | -0.035466 | 0.575468 | 0.209673 | 0.154187 | 0.01372 | 0.033006 | -0.013028 |
| Z-Contact_div_Z-Swing | 0.242359 | -0.163125 | 0.175613 | 0.585295 | 0.014563 | 0.021307 | 0.139056 | -0.135122 |
| Z-Contact_div_O-Contact | 0.167052 | 0.076988 | 0.022368 | -0.118746 | 0.001243 | 0.003539 | 0.023718 | 0.031723 |

| | BABIP Rsq | BABIP Coeff | wRCplus Rsq | wRCplus Coeff |
|---|---|---|---|---|
| O-Swing | 0.007833 | 0.030193 | 0.011021 | -32.44824 |
| Z-Swing | 0.002156 | -0.017811 | 0.000921 | 10.546182 |
| Swing | 0.001767 | -0.019591 | 0.045625 | -90.180507 |
| O_Contact | 0.00067 | 0.004973 | 0.006371 | -13.89066 |
| Z_Contact | 0.004317 | -0.031115 | 0.042298 | -88.242624 |
| O-Contact_div_O-Swing | 0.004077 | -0.001765 | 0.001993 | 1.118129 |
| Z-Contact_div_Z-Swing | 0.000099 | 0.001391 | 0.012201 | -13.981491 |
| Z-Contact_div_O-Contact | 0.001872 | -0.003437 | 0.000271 | 1.184201 |

Now I will take a stab at breaking down each category.

OPS

Plate discipline's effect on OPS is minimal at best. The strongest correlation is with Z-Contact (r² of about 0.07), but with a negative impact. Keep in mind OPS is typically less than one, so coefficients of this size carry real weight, but they sit on a poor fit.

HR

Not much correlation here. Among the raw stats, Z-contact is the highest (the Z-Contact/Z-Swing ratio edges it out slightly), but again with a negative impact. Presumably this is the impact of Juan Pierre-type players.

AVG

This one isn't so bad. We get a pretty good relationship between Z-contact and AVG, with a decent coefficient (remember AVG is less than 1). We also see swinging and making contact outside of the zone isn't a death sentence to batting average.

Walks

Not bad. Second-best fit of any group if you set aside the ratio stats, with Swing% at an r² of about 0.42. Swinging at pitches results in fewer walks. Duh.

Strikeouts

Best fit! Again, duh. Making more contact results in fewer strikeouts.

BB/K ratio

Some decent fits here. Plate discipline seems to play a role in walks and strikeouts.

OBP

The best fit is Swing% with a negative impact. Free swingers seem to have a lower OBP. Who would have thought?

SLG

The best fits are with Z-Contact and the Z-Contact/Z-Swing ratio (r² of about 0.13 and 0.14), both with a negative impact. Damn you Juan Pierre!

BABIP

Positively the worst fit of any group. I would be comfortable saying any small deviations in Cespedes' plate discipline are not affecting his BABIP. I was a little surprised by this one. I thought swinging at better pitches might result in better-hit balls and a higher BABIP, but that is not the case.

wRC+

Finally. The best (and still not good) fit is Swing%, which shows the negative impact of free swinging. But overall there are no fits good enough to draw any real conclusions from.

Conclusion

While I don't hold a PhD in Mathematics, I think the take-away from this is that predicting year-to-year performance from plate discipline ALONE is ill-advised. Over a significant number of at-bats, you could possibly conclude from plate discipline that Player A will get more walks or strikeouts than Player B, but the correlation numbers are just too small to predict any significant batted-ball differences in the short term. So while I am not thrilled that Cespedes is making contact on 4% fewer pitches in the strike zone, it is nothing to sit and cry about.
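As a rough back-of-the-envelope check on that last point, the Python sketch below multiplies a few Z_Contact coefficients from the tables by a four-point drop in Z-contact. It rests on two assumptions the post does not state: that the rates in the regressions are expressed as fractions (0.85 rather than 85), and that "4%" means four percentage points. The helper name `predicted_change` is made up for illustration.

```python
# Z_Contact slopes and fits copied from the regression tables above.
Z_CONTACT_COEFF = {"OPS": -0.441019, "AVG": 0.161709, "K%": -0.952137}
Z_CONTACT_RSQ = {"OPS": 0.069737, "AVG": 0.129464, "K%": 0.71858}


def predicted_change(coeff, delta_x):
    """Change in y that the fitted line predicts for a change in x."""
    return coeff * delta_x


# Assumption: a 4-point drop in Z-contact, written as a fraction.
delta_z_contact = -0.04

for stat, coeff in Z_CONTACT_COEFF.items():
    change = predicted_change(coeff, delta_z_contact)
    print(f"{stat}: {change:+.4f} (Rsq {Z_CONTACT_RSQ[stat]:.3f})")
```

Under those assumptions the lines predict shifts of roughly +.018 in OPS and -.006 in AVG, both on poor fits, while the only well-fit line (K%) predicts about four more strikeouts per hundred plate appearances.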

(apparently your BP/HR Derby swing is fine.)

