Batted balls and park effects

Beyond mere run factors. [link]

Comments (16)

  • studes studes
    +1
    Awesome job, David.  Thanks for updating my work!
    Posted 3/20/2008 [reply] [flag]
  • Matthew Matthew
    +1

    Camden Yards suppresses Ks by 7%, Safeco Field inflates them by 9%.

     Erik Bedard may strike out 1/3 of all batters he faces this year.

    Posted 3/20/2008 [reply] [flag]
  • sdanne sdanne
    +1

    David, did you use a method similar to what MGL used in his park factors?

    http://docs.google.com/View?docid=dfvdvsgn_26f49mrgk6

    In the case of the NL West, I'd imagine it would have a large effect on the results, considering the extreme parks in that division.

    Posted 3/20/2008 [reply] [flag]
  • pizzacutter pizzacutter
    +1

    You're not allowed to call it Jacobs Field any more.  It's now "Progressive Field."  Also, is there any consistency year to year on these park effects?

    Posted 3/20/2008 [reply] [flag]
    • David Gassko David Gassko
      +1
      Yep, all these numbers are regressed based on their y-t-y correlations, so the "r" has to be positive for the park factors not to all be 1.
      Posted 3/20/2008 [reply] [flag]
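
As a rough illustration of the regression David describes: one common way to regress a ratio-style park factor is to shrink its distance from 1.0 by the correlation "r". The function name and the sample numbers below are hypothetical, and the article may weight things differently (for instance, by the number of years of data), so treat this as a sketch under those assumptions.

```python
def regress_park_factor(raw_pf: float, r: float) -> float:
    """Shrink a raw park factor toward the neutral value 1.0.

    Assumption: simple linear shrinkage, regressed = 1 + r * (raw - 1).
    With r = 0 every park comes out at exactly 1.0, which is why "r"
    has to be positive for the park factors not to all be 1.
    """
    return 1.0 + r * (raw_pf - 1.0)


# Hypothetical raw factor of 1.10 (10% inflation) at a few values of r.
for r in (0.0, 0.32, 1.0):
    print(r, regress_park_factor(1.10, r))
```

With r = 0 the observed 10% effect is treated entirely as noise; with r = 1 it is taken at face value.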
  • MGL MGL
    +1

    David, it was not real clear until the last paragraph that the PF's you list in all the charts are regressed.  You talk about how and why (you would regress) but you don't specifically say that you are presenting the regressed values.

    Also, how much data (how many years) are these regressed PF's based on?  And, as sdanne mentions, how are the raw (before regression) PF's computed?  Do you take into consideration the unbalanced schedule?

    Nice work, but lots of important pieces of info you left out, I think.

    Posted 3/20/2008 [reply] [flag]
    • David Gassko David Gassko
      +1

      Mitchel,

       The park factors are based on up to five years of data. They do not take into account an unbalanced schedule. They are computed by taking, for example, (Home K/PA)/(Road K/PA).

      Posted 3/21/2008 [reply] [flag]
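
The raw calculation David gives, (Home K/PA)/(Road K/PA), can be sketched as follows. The function name and the counts are made up for illustration; the totals are meant to combine the home team's batters and pitchers with their opponents, pooled over however many seasons are used.

```python
def raw_park_factor(home_events: int, home_pa: int,
                    road_events: int, road_pa: int) -> float:
    """Raw (unregressed) park factor: home rate divided by road rate.

    home_* are totals from games played in the park (both teams);
    road_* are totals from the same team's road games (both teams).
    """
    home_rate = home_events / home_pa
    road_rate = road_events / road_pa
    return home_rate / road_rate


# Hypothetical strikeout totals, not taken from the article.
pf = raw_park_factor(home_events=1150, home_pa=6200,
                     road_events=1050, road_pa=6150)
print(f"raw K park factor: {pf:.3f}")
```

A value above 1 means the event happens more often per plate appearance at home than on the road.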
      • MGL MGL
        +1
        O.K., good.  How about new parks or changes to parks?  Did you only use the number of years that a park stayed constant - for example, how many years did you use for COL, LAD, PHI, or SD, as they all had small changes in the park over the last few years?  Also, are the PF's home/road stats or are they home/all parks (which is essentially home/road+1/n-1*home)?  Just trying to put the numbers into context.  Thanks.
        Posted 3/21/2008 [reply] [flag]
  • David Gassko David Gassko
    +1
    I used 4 years for DET, KC, PHI, and SD; 3 years for WAS; and 2 for STL. The PFs were calculated as home/road -- like I said, the simplest construct possible.
    Posted 3/21/2008 [reply] [flag]
  • tetepoov tetepoov
    +1

    "The spread here isn’t quite as great as it is for strikeouts, but it is certainly meaningful. What is unclear is why this effect exists. The parks at the top and bottom seem fairly random, but there is a definite correlation (.32) in walk park factors from year to year."

    I have a question here... How did you control for the pitchers? The "home" pitchers account for 50% of all Ks, flyballs, line drives etc. in a home ball park. It stands to reason that the Padres have a K+ ballpark because the Padres have K+ pitchers. 

    In turn, Colorado may have a K- ballpark because they have K- pitchers. I am willing to bet that if you take out the home team from the equation you would have less effect.

    Also, .32 (in a year to year correlation) isn't very strong. I am skeptical. I will look at your data and see if I should believe you :-)

    Posted 3/22/2008 [reply] [flag]
    • David Gassko David Gassko
      +1
      I control for pitchers because we're comparing the team's stats at home to their stats on the road. So in a neutral park, the pitchers would have the same K-rate at home as on the road, whether it is 4.0 or 8.0. (Technically, it would be a bit higher at home because of home field advantage, but that's cancelled out by the batters, who would have a slightly lower K-rate at home due to HFA).
      Posted 3/22/2008 [reply] [flag]
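
A toy check of that point, under simplifying assumptions that are not from the article: the park multiplies the staff's true K rate at home, the road parks average out to neutral, and home field advantage is ignored. The function name and rates are hypothetical.

```python
def observed_park_factor(true_k_rate: float, park_effect: float) -> float:
    """Home/road ratio for a staff with a given true K rate per PA."""
    home_rate = true_k_rate * park_effect  # the park scales the rate at home
    road_rate = true_k_rate                # assumed neutral on the road
    return home_rate / road_rate


# A low-strikeout staff and a high-strikeout staff in the same park.
for k_rate in (0.12, 0.24):
    print(k_rate, observed_park_factor(k_rate, park_effect=1.09))
```

Both staffs yield the same ratio, because the staff's own K rate appears in the numerator and the denominator and divides out.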
  • MGL MGL
    +1

    Tetepoov, I'll answer for Dave since I'm here.  When you do a PF, the home team is equally represented in the home and road data.  They comprise 50% of the data at home AND 50% of the data on the road. The formula that David used is (home team data plus road team data at home) divided by (that same home team's data on the road plus the data from those same opponents in their own parks).  So the pool of players in the numerator is exactly the same (more or less) as the pool of players in the denominator when you are doing a PF.

    As far as whether a .32  y-t-y correlation is "strong", "weak" or in between, you can call it whatever you want.  If you work with y-t-y correlations in baseball as much as people like David and I (sabermetricians) do, you would call that "strong," especially for a PF, but it doesn't really matter what you call it, therefore there can be no argument whether it is weak or strong - it is what it is (that would be like if I said that there were 3,618 murders in the U.S. last year - boy is that a lot, and someone engaged me in an argument over whether that is a lot or a little).

    Anyway, it is too complicated to explain here, but the correlation depends on how much data you have.  If a y-t-y is .32, 5 years to 5 years is something like .70, so there is no "magic" to quoting a (one) year to year correlation.  It is just convenient to do so.  David might as well have said, "The correlation was .70 when I correlated 5 years of K PF to another 5 years of K PF," and you might have said, "Boy that is strong."  So, it is not really a matter of weak or strong.  Any correlation from y-t-y or from 2 years to 2 years, or whatever, tells us that there is some "real" stuff going on (i.e., not all parks have the same K rate), within a margin of sample error of course, and the magnitude of the "r" for any given sample size (in this case, David quoted us an "r" using one year as his sample size for the x and y variables in his regression) simply tells us how much to regress a sample PF given the sample size (1 year in this case), if we want to estimate how much of that observed PF is "real".

    And as I said, .32 is pretty "good" for a PF and a little surprising to many people who don't even think much in terms of there being a K PF other than random year to year fluctuations among parks.

    Posted 3/22/2008 [reply] [flag]
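
MGL does not say how he gets from .32 to "something like .70". One standard tool that reproduces the figure is the Spearman-Brown formula, which predicts the correlation between two n-year samples from the one-year correlation. Using it here is an assumption, and the function name is hypothetical.

```python
def multi_year_correlation(r_one_year: float, n_years: int) -> float:
    """Spearman-Brown: expected correlation between two n-year samples.

    r_n = n * r / (1 + (n - 1) * r)
    """
    return n_years * r_one_year / (1 + (n_years - 1) * r_one_year)


# One-year r of .32 from the article, stepped up to five years.
print(f"{multi_year_correlation(0.32, 5):.2f}")
```

For r = .32 and n = 5 this works out to 1.6 / 2.28, or about .70, in line with the number MGL quotes.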
  • philosofool philosofool
    +1

    The thing you didn't include which seems to me a really important park factor to affect pitching is foul ball rates. I conjecture (but don't have the data or knowledge to test it) that foul balls account for a large amount of variation in pitching PF's like strike outs and walks.

    Also, is there such a thing as a foul ball pitcher, i.e. a pitcher that is capable of producing a larger than ordinary number of foul balls? And could foul ball pitchers be a hidden but significant variable affecting this data? What about foul ball batters? Could those be unaccounted for in the data so that the rate that certain batters produce foul balls makes certain parks look strike out likely?
    Posted 3/25/2008 [reply] [flag]
  • philosofool philosofool
    +1

    Oh, one other important question:

    The home team pitches 9 innings in every game, but bats only about half of the time in the ninth: how does this affect PF calculations? Is this as simple as stating all statistics in per plate appearance terms? 

    Posted 3/25/2008 [reply] [flag]
    • David Gassko David Gassko
      +1
      Yeah, every number is expressed with the correct denominator: PA for Ks and BBs, batted balls for IF/OF/LD/GB/Bunts, and the batted ball type for events on batted balls.
      Posted 3/26/2008 [reply] [flag]

Links (7)

Batted ball park factors
Published 3/20/2008 by MB at Friar Forecast
... David Gassko has an article up at the Hardball Times on batted ball park factors (as well as other stuff like k’s and bb’s). ...

Batted balls and park effects—The Hardball Times
Published 3/20/2008 at BBTF's Baseball Primer Newsblog
Batted balls and park effects—The Hardball Times But if we’re talking about outfield flies, the most important possible event is a home run of course. So let’s take a look at the park factors for home runs per outfield fly. Team HROF White Sox 1.26 Rockies 1.22 Blue Jays 1.19 Phillies 1.16 Cubs 1.16 --- Cardinals 0.87 Angels 0.87 Mets 0.87 Giants 0.86 Padres 0.86 How’s that for a surprise? The Cell makes more outfield flies into home runs than Coors Field. At first, I thought this might be due to the installation of the humidor, but the park factor for ...

Thursday Morning Rockpile:
Published 3/20/2008 by Rox Girl <info@purplerow.com> at Purple Row: Front Page Posts
... The Rockies have been watching both he and Morales this pre-season to see how they respond to adversity -as pitching at Coors, that's going to be an unavoidable job hazard. Jimenez passed his test, much moreso than Morales in his last start, let's see if Frankie can make things up today. ...

Daily Link Roundup
Published 3/20/2008 by R.J. Anderson <info@beyondtheboxscore.com> at Beyond the Box Score: Front Page Posts
I'm proud to announce that BTB has a new addition to the staff. His name is Dan Turkenkopf and you can read some of his work, including his great research on catcher's defense over at his blog Stealing First. He'll be debuting here within the next week. David Gassko over at The Hardball Times looks at batted balls and park effects. Remember the Obama baseball themed shirts? Yeah, the MLB finally ended that. Did the Yanks never actually offer Phil Hughes or Melky Cabrera for Johan Santana? Jim Callis hints at it during a new podcast with No Bias ...

Spring Training Game #24: White Sox @ Dodgers -- Mark Buehrle vs. Brad Penny
Published 3/20/2008 by thewizardsofoz <info@southsidesox.com> at South Side Sox: Front Page Posts
... John Shelby will also start the year at Winston-Salem at 2nd base. ***** In other news, Jim on the Uribe waiver speculation. THT: Batted balls and park effects. StatSpeak: Liveblogging `Moneyball'. Dick Vitale says Dusty told him he'll start Corey Patterson and not 20 year old No. 1 prospect Jay Bruce at Centerfield (hat-tip BBTF). UPDATE: The Reds did send Bruce to AAA indeed (hat-tip BBTF).

Diamondbacks 9, Dodgers 8 - Chris-mass at Easter
Published 3/22/2008 by Jim McLennan <info@azsnakepit.com> at AZ Snakepit: Front Page Posts
... David Gassko takes a look at Batted balls and park effects over at the Hardball Times, and it turns out that parks affect more than just runs scored. For example, Petco is very K-friendly - 3rd best in the majors - while Chase is much less so, being 26th. If you adjust the strikeout numbers for Peavy and Webb to take this into account, Peavy fanned 222 last year, and Webb 209, compared to the raw figures of 240 and 194. On the hitting front, Chase did a good job of suppressing singles, but on outfield flies, doubles were 8% above average and triples a monstrous 42% higher ...

Back to Miami Gardens
Published 3/28/2008 by photi at Fish Chunks
The Marlins return to the best strikeout park, the third-worst park (tied with Jacobs Field) for walks, the park tied for second-highest in producing infield flies, and tied for second-worst in inducing groundballs, in the majors. (And it's not even among the 5 worst parks for home runs. Hmmm...) (HardballTimes) ...
