Phyllis Schlafly: Not happy with Win-Loss, either.
In a fascinating bit of hot stove nerdery, Nick Steiner at Hardball Times uncovers a new possible weakness in the ERA statistic in an innovative, defense-independent way. Long story short, he took AJ Burnett’s ten best 2009 outings (ERA: 1.06) and ten worst (ERA: 9.13), then looked at his stuff, location and pitch selection, and found that AJ throws just about exactly the same when he’s getting shelled as when he’s dealing.
In his 10 best starts, he averaged a Game Score of 70.9. In his 10 worst starts, he averaged a Game Score of 31.9. More intuitively, his ERA was 1.06 in his good starts compared to 9.13 in his bad starts… quite the difference.
I then grabbed all of the PITCHf/x information on those two groups of starts. In case you are unfamiliar with it, PITCHf/x is a ball-tracking technology powered by SportVision, which measures certain key characteristics of each pitched ball, including speed, spin deflection (movement) and location. After manually classifying Burnett’s pitches game-by-game (yes this was a pain), I was ready to look at the data.
My agenda was simple. I wanted to see, using the intrinsic qualities of each pitch, exactly how differently he pitched in his best and worst starts of the season. I looked at three variables: stuff, location and approach.
…[I] found no meaningful differences in terms of what he threw, the velocity/movement of his pitches, where he threw them and when he threw them. I think I’ve established that there was practically no difference in how he pitched in his good starts compared to his bad starts.
Were his ten best outings and his ten worst outings facing the exact same lineups? If not, there’s a confounding variable. Was this accounted for?
It doesn’t appear it was. The method aimed to look at the degree of independence between an AJ outing’s ERA outcome and the factors that AJ has under his direct control. Lineups, count, baserunners and defense all go unaddressed in the analysis.
No argument (and no statistical training speaking) from me when you mention lineup being a confounding variable (a term I just looked up, and found that it doesn’t mean “Chase Utley”). I am curious: since lineups are never static anyway, isn’t it fair to ignore them here like they’re ignored elsewhere in numberville?
I vote for throwing out the lineup confounding variable because I’m sick of hearing the brain scientists at ESPN make the claim that Mike Mussina belongs in the HOF because he faced tougher lineups by playing in the AL East.
I also just looked up “confounding variable”, which I had incorrectly believed to be a short-lived Jowe Head/Julian Cope collaboration. But if the Wikipedia article is accurate, and “The methodologies of scientific studies therefore need to control for these factors [confounding variables] to avoid a type 1 error; an erroneous ‘false positive’ conclusion that the dependent variables are in a causal relationship with the independent variable,” then I’m not sure how this is damaging to the thesis of the HBT article. The article seems to be trying to weaken the notion of direct causality between “the pitcher’s stuff” and his effectiveness, not prove a causal relationship.
I do find this construction a little troubling, though:
ERA means less than you might think in terms of measuring a pitcher’s effectiveness. We can see this by looking at a pitcher’s best and worst starts of the season and then looking at velocity, pitch location, and pitch selection and checking to see if there is really a difference between when the results he gets are brilliant and when the results he gets are abysmal. We will determine these results by looking at his ERA.
Still a really interesting article and the above criticism may result from me misreading something.
A good pitch thrown to a guy who can hit it is a bad pitch. A bad pitch thrown to a guy who will swing at it is a good pitch. That a pitcher throws similar pitches to different lineups and has different results against them shouldn’t surprise anybody.
Btw a confounding variable lineup is two fat chicks and a tranny.