Changes in home run rates during the Retrosheet years
The Hardball Times found this 2/15/2008 on www.hardballtimes.com
Tags: MLB
THT: Tango: Changes in home run rates during the Retrosheet years
Published 2/15/2008 at BBTF's Baseball Primer Newsblog
Just once, I’d like to sit in my stewified bar and have the scemo next to me say... “Mantle wudda hit 90 HR’s...during this live ball era”, instead of you know what. The Real Culprit Now, consider the most dedicated baseball researcher alive regarding the home run, Greg Rybarczyk of HitTracker Online, when he says: “In 2006 there were 1,454 homers in the “Just Enough” category, which means clearing the fence by approximately 10 feet or less. They are spread fairly smoothly from 0-10 feet of clearance.” Greg ...
Changes in home run rates during the Retrosheet years
Published 2/15/2008 by Tangotiger (tangotiger@yahoo.com) at THE BOOK--Playing The Percentages In Baseball
An article I did over at THT.
High End Balls
Published 2/15/2008 by StatsGuru at Baseball Musings
Tom Tango looks at the reasons for the increase in home runs from 1993 on, and comes down on the ball. It's important to know that the ball is not technically juiced, but is manufactured at the high end of the allowable range. My theory (which I could never get ESPN to pursue) was that there was a change in manufacturing practices that produced a more consistent ball. That consistency was set at the high end. My guess is that under older manufacturing techniques, hundreds of thousands of balls were manufactured before they were tested. I'm guessing as time went on, these ...
Friday Morning Rockpile: ". . . embrace those challenges of greatness"
Published 2/15/2008 by Russ <info@purplerow.com> at Purple Row: Front Page Posts
... Don't blame Coors Field and expansion for the increase in home run rates starting in '93, according to Tom M. Tango. What did? He doesn't know, but juicing the ball may be one answer. ...
Juicy balls, not players
Published 2/15/2008 by Bradford Doolittle at DoolittleBrothers.com
... I bring this up because Tom Tango put his considerable skills to work in looking at the issue over at HBT. This is powerful stuff, folks. Give it a read. ...
Friday lunchtime links
Published 2/15/2008 by Pat at Where have you gone, Andy Van Slyke?
... Tom Tango at The Hardball Times makes an interesting argument that the baseballs themselves are the reason for baseball's power boom since 1993. ...
Links and Tidbits
Published 2/17/2008 by Paul SF at YFSF
... Things I learned while surfing the Web last night:
Don't blame the juice, unless you mean the ball.
Never mind Engel Beltre, the Red Sox may have traded the best change-up in baseball for the post-roid shell of Eric Gagne. Never has a trade looked so good and so bad within just six months.
Not only are the Red Sox the best team in baseball, but they're the nicest, too!
Carl Pavano is a menace to everyone around him.
Joe Girardi is making sure the Yankees are as boring as ever -- "each Yankee ...

This is an interesting take on the issue, but I don't think I understand the ball-juicing thesis here. The difference between the 1998 major league ball and the post-1996 minor league ball is 8.7 feet, which is approximately the same as the required juice distance to raise HR totals from average to 2006 actuals? Ok, interesting coincidence. What's the theory? That MLB was using the minor league balls from 1996 to 2005 (though apparently not 1998), and then switched to the major league balls in 2006?
I have another set of questions, and I don't mean this to be disrespectful, but do you have any training in statistics? Because to me, this appears statistically naive. I am no great expert on statistics and I could be wrong. But let's take an example: the 8.7 feet doesn't actually get you to the historical average of 2.9%, does it? It gets you to 3.04%. To get to 2.9% would imply a juicing distance of 10 feet, or 15% more than the 8.7 feet claimed. That doesn't appear to be statistically significant. Not to mention that it's not clear why 3.0% is "close enough" but not 3.1% or 3.2% (both highly plausible numbers given historical variance).
Or consider the use of the 1992/1993 "jump." Sure, it's a big jump, but that's mostly because 1992 was an especially low year. The 1991 rate was closer to the 1993 rate than the 1992 rate was, and 1991 was closer to 1993 than it was to 1992.
Finally, a few words about the "matching" plate appearances. I'm not sure I fully understand your methodology, but it appears to me that you are compressing data in a non-uniform way. If X faces Y in park P 5 times in 1992 and 1 time in 1993, then X-Y-P will count as 1 matched PA; but if Z faces Y in park P 3 times in 1992 and 3 times in 1993, Z-Y-P will count as 3 matched plate appearances. You're in essence assigning random weights to different plate appearances, and randomly compressing data in different amounts. Who knows what vagaries that will introduce.
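To make the weighting concern concrete, here is a minimal sketch of the matching rule as I read it. It assumes each batter-pitcher-park combination contributes the smaller of its two yearly PA counts as "matched" PAs; that rule is my interpretation, not something confirmed by the article, and the names `matched_pa` and `retained_share` are made up for illustration. The counts are the X-Y-P and Z-Y-P numbers from the example above.

```python
from collections import Counter

def matched_pa(year_a, year_b):
    """Matched PAs per (batter, pitcher, park) key.

    Assumption: a combination seen in both years contributes
    min(count_a, count_b) matched PAs. This is the commenter's reading
    of the method, not a confirmed description of it.
    """
    shared = year_a.keys() & year_b.keys()
    return {key: min(year_a[key], year_b[key]) for key in shared}

def retained_share(key, year_a, year_b, matched):
    """Fraction of a combination's PAs (both years) that survive matching."""
    return 2 * matched[key] / (year_a[key] + year_b[key])

# Illustrative counts taken from the example in the text.
pa_1992 = Counter({("X", "Y", "P"): 5, ("Z", "Y", "P"): 3})
pa_1993 = Counter({("X", "Y", "P"): 1, ("Z", "Y", "P"): 3})

matched = matched_pa(pa_1992, pa_1993)
for key in sorted(matched):
    share = retained_share(key, pa_1992, pa_1993, matched)
    print(key, "matched PAs:", matched[key], "share retained:", round(share, 2))
```

Under this reading, X-Y-P keeps 2 of its 6 PAs while Z-Y-P keeps all 6, which is the uneven compression described above.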
This is especially important since a relatively small number of pitcher-player combos will account for most of the differences between seasons, since the majority of home runs are hit by a minority of players and are hit off a minority of pitchers. So suppose player X's true rate of home run hitting off pitcher Y is 10%. In year 1, he hits 1 out of 2. In year 2, he hits 1 out of 18. Your data compression technique turns that into two data points of 50% and 5.6%, which average out to about 28%, even though the combined record is 2 out of 20, or exactly 10%. That's a big distortion. Will these distortions even out across all the data points? Not if the distribution is both non-normal and heavily skewed (which it very likely is), and if the year-to-year variances are large (which again, they almost certainly are, as your Raines example illustrated).
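The arithmetic in that example can be checked directly. The sketch below compares the unweighted mean of the two seasonal rates with the pooled rate over all PAs, using only the hypothetical 1-for-2 and 1-for-18 seasons from the paragraph above; the variable names are mine.

```python
from fractions import Fraction

# (home runs, plate appearances) for the two hypothetical seasons in the text.
seasons = [(1, 2), (1, 18)]

# Treat each season as one equally weighted data point.
per_season_rates = [Fraction(hr, pa) for hr, pa in seasons]
mean_of_rates = sum(per_season_rates) / len(per_season_rates)

# Weight every plate appearance equally instead.
total_hr = sum(hr for hr, _ in seasons)
total_pa = sum(pa for _, pa in seasons)
pooled_rate = Fraction(total_hr, total_pa)

print("mean of seasonal rates:", mean_of_rates, "=", round(float(mean_of_rates), 3))
print("pooled rate:", pooled_rate, "=", round(float(pooled_rate), 3))
```

The mean of the seasonal rates comes out to 5/18 (about 28%), while the pooled rate is 1/10, matching the assumed true rate. The gap comes entirely from giving a 2-PA season the same weight as an 18-PA season.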
Perhaps this explains why the matched and non-matched data don't track each other very well. In both 1992/93 and 93/94, the growth rates in matched and non-matched plate appearances differed by about 33%. It just so happened that the differences canceled each other out -- which would happen by luck 50% of the time. Perhaps you could compute the correlation between these two data sets over a longer period of time?
I don't mean to be overly critical, but I've seen a reference to this article on Neyer's ESPN blog as having a great deal of precision, as somehow proving ball juicing. But I see a lot of questions raised by this analysis, and not that many reliable answers. I think this is an interesting approach, but I would like to see more sophisticated analysis.