Batters and BABIP
Why Nick Swisher will rebound in 2009: Looking beyond Line Drive Percentage to build a better model for predicting BABIP. Order the Hardball Times Annual 2009 today!

33 Comments
  • JinAZ JinAZ
    +2

    Nice work--reminds me of JC Bradbury's work on PrOPS in a lot of ways.

    Any chance that you'll be willing to release your regression model so that we can make use of this tool ourselves?  Otherwise, it's not much help, and we'll have to continue to use either the old BABIP model or PrOPS.

    Thanks,
    Justin

    Posted 12/2/2008 respond (flag)
    • CSDutton33 CSDutton33
      +2

      Hey Justin,

      I will gladly send you the regression model that we used.  However, the downside of building a more accurate model is that it tends to become less flexible for casual use.  To use the complete model, you would have to do a lot of calculations using stats that aren't readily available (to get spray, speed score, etc).

       I realize it would be nice to provide readers with a tool they can actually use, so I'm planning to release a regression model using only the handful of common stats.

      Posted 12/2/2008 respond (flag)
  • tangotiger tangotiger
    +2

    I think providing a "basic" estimator as you are intending is very nice.

    I would also say posting the full estimator would be very good too, even if the particular components are not necessarily easily available.  I'd love to see what you have, either via email, or right here.

    Posted 12/2/2008 respond (flag)
  • Mike Podhorzer Mike Podhorzer
    +2
    Agreed, great stuff, and I'd love to see some sort of formula made available so we could use it for 2009 BABIP projections.
    Posted 12/2/2008 respond (flag)
    • CSDutton33 CSDutton33
      +2

      I'll aim to post both the full and "basic" estimators tonight. 

      FYI - The original academic paper includes the full regression model as well as more technical details.  If you're interested, just send me an e-mail to request a copy.

      Posted 12/2/2008 respond (flag)
      • tangotiger tangotiger
        +1
        I could post it on my site, if you like.  That would certainly make it easier to get that research into people's hands than one-by-one emails would.
        Posted 12/2/2008 respond (flag)
      • mitchiapet mitchiapet
        +1
        Did you submit your original paper anywhere outside of a classroom setting?
        Posted 12/2/2008 respond (flag)
  • APV APV
    +2
    I'd be interested in getting the details, too. For example, I'm not sure what you mean by your use of independent variables, as some of the factors you identify don't seem to be conceptually or statistically independent. Cool stuff, though.
    Posted 12/2/2008 respond (flag)
  • blcartwright blcartwright
    +1

    In your xls file of the results, could you at least sort the batters by last name, first name, or even better include their RetroSheet ID? Thanks

    I would like to compare your results to other numbers, but it's hard in the current sort order.

    Posted 12/2/2008 respond (flag)
    • CSDutton33 CSDutton33
      +1
      I don't have RetroSheet ID numbers on hand, so I don't know if I can help you there.  As for sorting the names, you can use the Excel sort function (in the "data" menu) to sort alphabetically by first name.  Last name is a bit trickier, but there are functions to break out the "name" column into first and last.
      Posted 12/2/2008 respond (flag)
  • CSDutton33 CSDutton33
    +1

    So there have been several requests for a simplified model that people can use on their own.  The problem is that to make the model simpler, you must remove factors that have a proven influence on BABIP, thus sacrificing predictive power.  For instance, speed score is a key variable that is quite a pain to calculate.  Removing it makes it easier to plug numbers into the equation, but the expected BABIP will be considerably less accurate than the original model.

     I can remove the team and year binary variables, as well as pitches_per_extra_base_hit, speed_score and spray, and still get a model more than 3x as accurate as using ld% alone, but I can't guarantee results.  The *very* simplified model follows:

     x-BABIP =  (.0104)*hitter_eye + (.0952)*ld_per - (.0357)*fb_gb_per - (.1046)*contact_rate + .3863

     

    Posted 12/3/2008 respond (flag)
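The very simplified model quoted above can be written as a small function. This is only a sketch: the coefficients are copied from the comment, but the function name, the scale of each input (for example, line drive rate as a fraction rather than a percentage) and the sample values are assumptions made for illustration, since the thread does not spell them out.

```python
def simple_xbabip(hitter_eye, ld_per, fb_gb_per, contact_rate):
    """Very simplified x-BABIP using the coefficients quoted in the thread.

    The units of each input are an assumption; check them against the
    original article before relying on the output.
    """
    return (0.0104 * hitter_eye
            + 0.0952 * ld_per
            - 0.0357 * fb_gb_per
            - 0.1046 * contact_rate
            + 0.3863)


# Hypothetical inputs for illustration only (not a real player's stat line).
example = simple_xbabip(hitter_eye=0.5, ld_per=0.20, fb_gb_per=1.0, contact_rate=0.80)
print(round(example, 4))
```

As the comment itself warns, this drops speed score, spray and the other omitted variables, so the result should be treated as a rough estimate.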
  • pizzacutter pizzacutter
    +1

    A couple of things:

    1) I've done some reliability checks on hitter BABIP by type of ball hit.  BABIP is more consistent (split half r = .29) for grounders than for say pop ups (IIRC, almost nil).  Some guys are good at beating out grounders.  A base hit on a  popup is usually just good luck.

    2) SO/BB might be a coherent measure, but can we not call it batting eye?  SO come from a bad eye.  BB come from not swinging.

    Posted 12/3/2008 respond (flag)
  • Logodaedalus Logodaedalus
    +1

    Chris and Peter: Very cool work -- nice job.

    A few technical-type questions for you which I guess I'll post here in case there are others who are interested:

    I'm wondering how you selected your list of variables... Were these all a priori choices or did you do an exploratory regression with a larger set first?  I'd be really impressed if you were able to come up with a list a priori and have every one be significant, but I'm guessing that's not what happened...

     If you whittled down your variables to the present list from a larger one, how did you protect against inflated significance levels?  Was there any explicit cross-validation of your variable sets and coefficients from season to season, or from player subset to player subset?

    Finally, I noticed by counting that the actual BABIP - xBABIP residuals in 2007 were uncorrelated (in sign at least) with those for 2008 for the 28 players you listed as having the most extreme deviations from their xBABIP in 2007.  Have you done any actual tests on the residuals from season to season to explicitly check if there's any remaining correlation?

    If you discuss these things in the full paper, feel free to just refer me there.

    Posted 12/3/2008 respond (flag)
    • CSDutton33 CSDutton33
      +1

      Thanks for the feedback.  To address some of your questions:

      - Our original dataset contains dozens of additional variables not included in the final regression model.  We spent a significant amount of time selecting independent variables (based on significance and multicollinearity tests) and testing variations of the model to identify which factors seemed to be most appropriate.  Our adjusted R-squared is over 33% (vs. 34.8% unadjusted), which suggests that our significance levels aren't inflated by the sheer number of explanatory variables included.

      - Haven't done much testing based on residuals, but I think that the fact that they are uncorrelated helps to validate the accuracy of the model.  We hypothesized that even the most "lucky" players would have an equal chance of being "unlucky" the following year (and vice versa) due to random chance.  If there was a strong correlation of residuals from year to year for particular players, this would suggest that these players share some omitted variable that we failed to account for.  

      Hope this answers your question.  I'll keep working with the model this week and run some additional tests. 

      Posted 12/3/2008 respond (flag)
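The season-to-season residual check discussed here can be sketched as a plain Pearson correlation between each player's (actual BABIP - xBABIP) residual in one year and the next. The data layout and function names are hypothetical; the authors say they have not yet run this test, so this is not their method.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def paired_residuals(residuals_by_player, year_a, year_b):
    """Collect residuals for players who have a value in both seasons.

    residuals_by_player is assumed to look like
    {player_name: {season: actual_babip - expected_babip}}.
    """
    first, second = [], []
    for seasons in residuals_by_player.values():
        if year_a in seasons and year_b in seasons:
            first.append(seasons[year_a])
            second.append(seasons[year_b])
    return first, second
```

A correlation near zero across the whole dataset would support the "luck" reading; a clearly positive one would point to an omitted variable.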
      • Logodaedalus Logodaedalus
        +1

        Thanks for the quick response!

        Glad to see your adjusted R^2 is still as high as it is.  I was actually more concerned about "cherry picking" of the variables that ended up in the final model -- if you chose them based on significance levels in the initial regression, there's a greater-than-0.01 chance for each one that it's a Type I Error.  My suggestion would be to randomly split your dataset in half (or maybe just do something like "odd numbered seasons" vs "even numbered seasons"), and run the regression on both halves separately, keeping only the variables that are significant in both halves.  If that's all of them, great.  If not, you end up with a somewhat smaller R^2 value, but it should do a better job of generalizing to future seasons.  If you really want to be sure you could try splitting it different ways -- by seasons, by players, etc.  But that's what I mean by cross-validation.

         And yeah, I was just wondering if the residuals truly were uncorrelated in the entire dataset, using more than just the sign.  If not then that's an even stronger validation of the model.

        Thanks again for indulging me!

        Posted 12/3/2008 respond (flag)
        • Logodaedalus Logodaedalus
          +1
          I have some double negative problems in that third paragraph -- should read "If the residuals are not correlated, then that's an even stronger validation of the model"
          Posted 12/3/2008 respond (flag)
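Logodaedalus's split-half suggestion can be sketched as follows. The regression itself is left to a caller-supplied function (here called significant_vars, a hypothetical name) that returns the set of variables found significant on a given subset of rows; only the odd/even season split and the "keep what survives in both halves" step are shown.

```python
def split_by_season_parity(rows):
    """Split player-season rows into odd-numbered and even-numbered seasons.

    Each row is assumed to be a dict with an integer "season" key.
    """
    odd = [row for row in rows if row["season"] % 2 == 1]
    even = [row for row in rows if row["season"] % 2 == 0]
    return odd, even


def stable_variables(rows, significant_vars):
    """Keep only the variables that are significant in both halves.

    significant_vars is a caller-supplied callable: it fits the regression
    on the rows it is given and returns a set of variable names.
    """
    odd, even = split_by_season_parity(rows)
    return significant_vars(odd) & significant_vars(even)
```

The same pattern works for other splits (random halves, by player) by swapping out the splitting function.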
  • Bindlestiff Bindlestiff
    +1

    Curious why the "actual" BABIPs seem to be a bit different (typically lower I think) than those posted elsewhere (which are themselves not always consistent)?  Are there several different formulas for BABIP? 

    For example:

    2008 Giambi

    Yours:  .234

    Hardball Times:  .250

    FanGraphs: .257

    Posted 12/3/2008 respond (flag)
    • CSDutton33 CSDutton33
      +1

      There seem to be a few calculations for BABIP floating around out there... I think the main difference is that we used PA - BB rather than AB's, which doesn't capture IBB or sacrifices.

       I think it might be worth changing in the next iteration of the analysis - although I doubt it would change our regression estimates much.

      Posted 12/4/2008 respond (flag)
  • pizzacutter pizzacutter
    +1

    Guys, one other question.  Did you test for any interaction effects?

    Posted 12/4/2008 respond (flag)
  • lookatthosetwins lookatthosetwins
    +1

    Thanks for the great read!  I've been going nuts since the season ended and actually having something to read besides trade rumors is very nice.  Anyway, here's a couple questions from someone who's never taken a statistics class, but still finds this stuff very interesting.

    I really don't understand the inclusion of batting eye into the formula.  I would agree that batting eye helps a player, but I would think it would only help the player walk more and maybe hit more line drives... so it would be covered in the line drive rate.

     Second, don't you think some of the categories make more sense when they are combined?  What I mean is, that being a pull hitter is more detrimental if you're left handed than right handed, because they can shift easier if you're left handed.  Also, the speed score is much more important if you hit a lot of balls on the ground, but won't factor in as much if you hit a lot of balls in the air.  This could explain why Ichiro outperformed his expected BABIP two years in a row.

    Posted 12/4/2008 respond (flag)
    • CSDutton33 CSDutton33
      +1

      Good points brought up by pizzacutter and lookatthosetwins...

      We only did a limited amount of interaction testing, so I think it's definitely a good next step.  (Speed_score)*(FB_GB_ratio) seems like a very plausible one, and I'll also test some others.

      Posted 12/4/2008 respond (flag)
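An interaction term such as (Speed_score)*(FB_GB_ratio) is simply the product of the two variables, added as an extra column before the regression is re-run. A minimal sketch, with hypothetical key names and a hypothetical naming scheme for the new column:

```python
def add_interaction(row, a="speed_score", b="fb_gb_ratio"):
    """Return a copy of the row with an extra a*b interaction column."""
    out = dict(row)
    out[f"{a}_x_{b}"] = row[a] * row[b]
    return out


# Hypothetical player-season row for illustration only.
row = {"speed_score": 6.0, "fb_gb_ratio": 0.5}
print(add_interaction(row))
```

If the coefficient on the new column turns out significant, the effect of speed depends on how often the hitter puts the ball on the ground, which is the point lookatthosetwins raised.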
      • lookatthosetwins lookatthosetwins
        +1
        So what you're saying is, Pizza cutter asked the same thing as me, in 400 less words.  At least now I know what interaction effects are.
        Posted 12/4/2008 respond (flag)
        • Peter Bendix Peter Bendix
          +1

          One more small thing to add to this: all of the components we used were significant at the 1% level. That doesn't necessarily mean that there wasn't interaction, but batting eye, for example, was statistically significant.

          It's possible, I imagine, that some players are able to be selective enough at choosing which pitches to swing at that they can improve their BABIP by hitting the ball to a certain area, not necessarily just hitting more line drives.

          Posted 12/5/2008 respond (flag)
  • blcartwright blcartwright
    +1

    we used PA - BB rather than AB's, which doesn't capture IBB or sacrifices.

    Make sure you accurately represent balls in play. The shortest calculation is AB-SO-HR+SF. This excludes SH. I'm not sure if your formula excludes HP or IBB. I want to compare my results with yours, so we need the same denominator.

     Pitches_perEBH

     

    Pitches per extra base hit, which is a measure of how often a hitter makes solid contact (pitches/(doub+trip+hr)).

    Contact_Rate

    A measure of the ability to make contact and avoid striking out, simply calculated as ((ab-so)/ab).

    One measures contact, the other solid contact. Shouldn't they have the same denominator? (measure of opportunities). To me, the basic opportunity to make contact is a swing.

    Contact Rate = (ab-so+sf)/swings

    Solid Contact = (do+tr+hr)/swings

     

     

    Posted 12/4/2008 respond (flag)
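The formulas in this comment can be sketched in code. The balls-in-play denominator (AB - SO - HR + SF) is the one blcartwright gives; the numerator (H - HR) is the conventional BABIP numerator and is an addition here, not something stated in the thread. The authors' own variant, built on PA - BB, is not reproduced because its exact form is not given. Function and parameter names are hypothetical.

```python
def babip(h, hr, ab, so, sf):
    """Batting average on balls in play: (H - HR) / (AB - SO - HR + SF)."""
    balls_in_play = ab - so - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


def contact_rate(ab, so):
    """Contact rate as defined in the article: (AB - SO) / AB."""
    return (ab - so) / ab


def pitches_per_ebh(pitches, doubles, triples, hr):
    """Pitches seen per extra-base hit: pitches / (2B + 3B + HR)."""
    extra_base_hits = doubles + triples + hr
    if extra_base_hits == 0:
        return float("inf")  # no extra-base hits at all
    return pitches / extra_base_hits
```

Using one shared definition of balls in play is what makes "actual" BABIP figures from different sources comparable.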
    • CSDutton33 CSDutton33
      +1
      Is swing data available?  Might be easier to use pitches...  I may update the BABIP formula so I'll also play with some variations on Pitches_perEBH and Contact_Rate as well, although I don't think the lack of a common denominator is necessarily a problem.
      Posted 12/4/2008 respond (flag)
  • blcartwright blcartwright
    +1
    Swing data is not available at BPro, but it is in the RetroSheet play by play for about the last 20 years. Pitches are coded as balls, called strikes, swing and miss, foul ball, in play
    Posted 12/4/2008 respond (flag)
  • WillH WillH
    +1

    That's really interesting. I think you hinted at something else that was interesting when you asked if Matt Kemp does anything particularly special to explain his higher than expected BABIP.

    I took your spreadsheet and manipulated it a little bit to figure out which players have consistently outperformed expectations over the last 4 years (I arbitrarily chose a 5% threshold to determine whether they outperform). Interestingly, only Magglio Ordonez outperformed in all 4 years. The players that outperformed in 3 of the 4 years are Chipper Jones, Chone Figgins, David Wright, Derek Jeter, Derrek Lee, Ichiro Suzuki, Jason Bay, Mark Teixeira, Michael Young, Miguel Cabrera, and Willy Taveras.

    Interestingly, despite Jason Bay being among the 20 "luckiest" hitters in 2008, he was only 10.4% above expectations, while he was over 12% above expectations in both 2005 and 2006, so maybe he's doing something that is not being captured in the model and it's not an issue of luck.

    On the other side, certain players consistently perform below expectations (which I have again arbitrarily chosen as 5% below). Players who performed below expectations in all 4 years are Adrian Beltre (who is singled out in the article as being unlucky in '08 but maybe it wasn't all luck after all) and Craig Counsell. 3 year underperformers are Adam Dunn, Andruw Jones, Austin Kearns, Brandon Inge, Carlos Beltran, Chris Snyder, Gary Matthews Jr., Jack Wilson, Jason Giambi (maybe not such a free agent steal as a result of the defensive overshift), Jay Payton, Jose Bautista, Jose Castillo, Juan Uribe, and Rickie Weeks.

    I don't know if there are any other extensions of the research that could be done with this, but I find it really interesting to think about what players may be doing differently to achieve consistent performance above expectations. I haven't done anything difficult with the spreadsheet, but if anyone is interested in seeing the sorted data, I'd be happy to e-mail it.

    Posted 12/5/2008 respond (flag)
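WillH's screen can be sketched like this. It assumes the 5% threshold is relative, i.e. (actual - expected) / expected, which matches the "10.4% above expectations" phrasing but is not stated outright; the record layout and function names are hypothetical.

```python
def relative_gap(actual, expected):
    """How far actual BABIP sits above (+) or below (-) xBABIP, as a fraction."""
    return (actual - expected) / expected


def seasons_above(records, threshold=0.05):
    """Count, per player, the seasons where actual beat expected by > threshold.

    records is assumed to be an iterable of
    (player, season, actual_babip, expected_babip) tuples.
    """
    counts = {}
    for player, _season, actual, expected in records:
        counts.setdefault(player, 0)
        if relative_gap(actual, expected) > threshold:
            counts[player] += 1
    return counts


def consistent_outperformers(records, min_seasons=3, threshold=0.05):
    """Players who beat expectations in at least min_seasons seasons."""
    counts = seasons_above(records, threshold)
    return sorted(player for player, n in counts.items() if n >= min_seasons)
```

Flipping the comparison to `< -threshold` gives the matching list of consistent underperformers.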
    • CSDutton33 CSDutton33
      +1

      We went through the same process as we continually tweaked the model - looking at guys who consistently outperformed or underperformed their expectations and brainstorming what quantifiable factors these players share that isn't accounted for.

      That's how we discovered the importance of the "spray" variable, as initially we found a lot of dead pull hitters consistently underperforming.   Once that variable was included in the model, we saw the accuracy improve significantly.

       Glad you brought this up, I would love to hear some ideas about what makes these particular players unique so that we can try to account for it.

      Posted 12/5/2008 respond (flag)
  • greenstampede greenstampede
    +1

    >One more small thing to add to this: all of the components we used were significant at the 1% level. That doesn't necessarily mean that there wasn't interaction, but batting eye, for example, was statistically significant.<

     You tested for correlation of your predictors, right?  I.e., if batting eye and LD% are highly correlated, there is no point in keeping both in your model, and if you do it will bring down your adjusted R^2.  I'm probably telling you something you already know, but it didn't seem to be brought up above or in the article.

    Otherwise great work, this is a good step in the right direction to answering a very important question.

    Posted 12/5/2008 respond (flag)
    • CSDutton33 CSDutton33
      +1
      Good question - Yes we ran a correlation matrix on all independent variables, and the only one that stood out at all was pitches per plate appearance (which has since been removed from the model, with virtually no change to the R^2)
      Posted 12/5/2008 respond (flag)
  • ceolaf ceolaf
    +1

    These are the results for all of MLB in the last few years.
    I'm curious how the coefficients might have shifted in the so-called "steroid era." Well, actually I am curious about differences across era generally.
    I also am curious about subpopulations. Do the best hitters do it differently than the MLB average? More specifically, do the hitters who are especially good at hitting for *average* put together that ability?
    (I know, I am killing your sample sizes, making it largely impossible to run regressions with that many variables.)
    Posted 12/6/2008 respond (flag)
Blog Reactions

Ty Wigginton: A Sell High Opportunity for the Astros
The Crawfish Boxes — ...  and Chris Dutton (though I'm not sure he's one of SBN's own), posted one of the most comprehensive updates and inquires in to BABIP I've seen in awhile at The Hardball Times. ...

BABIP and baseball's luckiest/unluckiest
It's About The Money - A New York Yankees Blog — ... This one today about BABIP (batting average on balls in play) and xBABIP (predicted BABIP) is a perfect example. Here are their list of baseball's luckiest and unluckiest hitters of ...

Estimating BABIP
THE BOOK--Playing The Percentages In Baseball — ... Good stuff.  Unfortunately, it is presented as a black box, but I like all the different components that were presented. ...

Bendix's Insanely Good xBABIP Piece
DRaysBay: Bendix's Insanely Good xBABIP Piece. Outstanding work.

Bendix's Insanely Good xBABIP Piece
Beyond the Box Score: Bendix's Insanely Good xBABIP Piece. Now for him to spill the beans on it.

A's moving in the direction of possibly contending
ESPN Feed: neyer rob — ... but just to review a few key points: •He started 30 games this year, and in fact has started at least 30 games in four of the last five seasons •His underlying stats were even better than his 3.91 ERA •Relatedly, his strikeout rate this year was the sixth best in the National League. I don't know that I've written about Giambi, so I'll note here that Giambi hit 32 homers and drew 76 walks in 2008. As a hitter, his only weakness was that he batted only .247. … But according to this important study (about which more later), Giambi was terribly unlucky in 2008 and was ...

Friday Filberts
ESPN Feed: neyer rob — ... . • I've referenced this already, but I have to again recommend Chris Dutton's and Peter Bendix's newfangled analysis of batters and BABiP . I'm not smart enough to know if their work is essential … but I'm fairly sure that it is. And while maybe not essential, Josh Kalk's answer to the question ...

Willy Aybar
FanGraphs Baseball — ... 745 plate appearances, or a little more than one full season’s worth of playing time. His career wOBA is .339, thanks to a good contact/gap power skillset, making him an above average major league hitter. 2008 was his worst year from a raw statistics perspective, with his .321 wOBA and -0.18 WPA/LI. However, he was remarkably unlucky in terms of batting average on balls in play - a .267 BABIP that simply wasn’t supported by how he hit the ball. Chris Dutton’s BABIP predictor had Aybar’s 2008 expected batting average on balls in play to come in at ...

Straws Within Our Grasp
Ghostrunner on First — ... The good people at The Hardball Times recently took to changing the way we look at Batting Average on Balls into Play. A neat little stat we can use to determine who may have been lucky, who may have been unlucky, and who may be terrible. When predicting BABIP, one adds .120 to the batter's line drive rate. Where the .120 comes from I honestly have no idea. But these fine gentlefolk decided to dig a little deeper into BABIP, adding many more factors to create a more telling predication of a batter's success. ...

AK, TAWH due for offensive explosions in 09?
Federal Baseball: AK, TAWH due for offensive explosions in 09? Don't click through unless you're a stat-geek. It's a longish article about finding a better way to predict a hitter's batting average on balls in play (BABIP), which is a way of measuring "luck" with where balls get ...

Premium Tease - Is Evan Longoria the Next Ryan Braun?
FantasyPros911.com Latest News — ... An old, reliable rule of thumb is that a batter's BABIP can be expected to equal about his line drive rate plus .120. So, using that rule, Braun's BABIP in 2007 and 2008 "should" have been about .283 and .293, respectively. However, there are a number of exceptions to this rule, which have been the subject of some very deep and intriguing analysis by The Hardball Times' Chris Dutton and Peter Bendix, among others. I won't presume to summarize their findings, but the upshot of their critique of this rule of thumb is that a number of players consistently out- or ...

BABIP, PROJECTION, AND NEW STATISTICS
The Good Phight — ... means is that far less of BABIP can be explained than other outcomes of plate appearances from simply figuring out previous BABIP numbers.  In fact, the best prediction of 2008 BABIP that BABIPs from 2005-2007 that could be gotten with a regression only has a .3787 correlation with BABIP.  However, we can probably do better.  There is a lot of other data out there.  Chris Dutton and Peter Bendix recently posted THIS at The Hardball Times explaining some major correlates of BABIP, including BB/K ...

Best for Yankees to Keep Swisher, Not Nady
Dugout Central — ... Swisher, it seems, was done in by sheer bad luck, causing his average to hover near the Mendoza line. He posted an unusually low batting average on balls in play, producing a .251 mark. His ’08 BABIP was by far the lowest of his career – his previous totals fall at .277, .266, .287, .308 from ’04-’07, respectively. Interestingly, his line drive rate of 20.9% was a personal best. According to an article co-authored by Peter Bendix and Chris Dutton in the Hardball Times, he was the fifth-unluckiest player in the Majors in ’08, based on his xBABIP ...

More on BABIP and Player Performance
The Crawfish Boxes — ... This article at Hardball Times is over a month old  (OK, I get behind in my reading sometimes).  So maybe you have already seen it.  But I find this subject area very interesting.   In a previous ...

Top Five Stories For 2009--Chicago Cubs
FantasyPros911.com Latest News — ... . Clearly Geovany took offense to those who doubted him as he won the NL ROY by proving that his 2007 bust out was no fluke. There really is nothing in his statistical profile that screams out that a huge fall is in order or that further growth will occur in 2009. However, if you look hard enough, there are a couple of small warning signs. His BABIP was .337, yet his xBABIP was just .295 based on new research by Chris Dutton and Peter Bendix on THT. Soto's OPS also dropped nearly 50 points and his Power Index dropped 32 points in the 2nd half, which could have been the result ...

Bracing for Disappointment
Mets Geek — ... If anyone thought Tatis could repeat his awesome 2008 performance (123 OPS+) there wouldn’t be nearly as much discussion about signing Manny Ramirez, Adam Dunn, or Bobby Abreu. Unfortunately, it looks like we may see a decent decline from Tatis. He’s 34 years old and his .338 BABIP in 2008 was a bit lucky. According to a piece at the Hardball Times by Chris Dutton and Peter Bendix, Tatis’s xBABIP (expected BABIP, which is based on factors like line-drive percentage, plate discipline and baserunning) was .317, which is 21 points lower than his actual BABIP. CHONE projects a .766 ...

BABIP: Slicing and Dicing Groundball Out Rates
Baseball Analysts — ... for the past five years. Led by Dave Studeman, THT has written several articles on this subject, including two recent studies on BABIP by co-authors Chris Dutton and Peter Bendix and ...

Improving BABIP Estimation
The Good Phight — ... In December, Peter Bendix and Chris Dutton published an important article in this area of research at The Hardball Times discussing a regression formula for BABIP using many other statistics.  This article was meant to approximate BABIP in the same year as the data was recorded.  While useful as an attempt to describe who was lucky and who was not, there is added value to being able to predict future BABIP from historical variables.  This type of research can be used by general managers and fantasy players alike. ...

Linky Dinky Doo
Red Reporter — ... One of the big topics in sabermetrics this off-season has been BABIP.  It all started with the Batters and BABIP article at THT back in December.  Since then, lots of ...

Fantasy and Sabermetrics for Beginners - Hitting Skills Part II
Roto Savants — ... Until recently many used (LD% + .120 = BABIP) to calculate an expected BABIP, but there has been much more work put into this recently and found some more variables to incorporate. Here is a study by Chris Dutton and Peter Bendix and you can see the variables they use to estimate BABIP. Looking through their research you can see the important factors to be a player who shows an elevated BABIP consistently. High LD%, good Speed score, high number of extra base hits, ability to hit ...

BABIP: Progressing and Regressing Groundball Out Rates
Baseball Analysts — ... what variables account for extraordinarily low groundball out rates. So, using a similar method to that which Peter Bendix and Chris Dutton used to find expected BABIP, we dug deeper and ran a regression to find expected average on groundballs. ...

Alcides Escobar and the Case of the Mysterious BABIP
Brew Crew Ball — ... Thanks to a lot of great research by Peter Bendix and Chris Dutton, we know that there are a variety of other factors that can influence BABIP, meaning that it is unreasonable to simply expect a .300 BABIP for every batter, or just use the line drive rate +.12 method described in Jeff's post linked above.  ...

Talented NL East Shortstops
Amazin' Avenue — ... In 2008, his xBABIP per Peter Bendix and Chris Dutton was eighteen points lower than actual BABIP, so he was probably a bit lucky.  Jim Bowden might have been wise to work out a trade for some pitching and sell high on Guzman given the shortstop's unpredictable performance.  Then again, the Nationals ...

Will Robinson Cano Rebound in 2009?
Beyond the Box Score — ... Studies have shown that the average BABIP tends to hover somewhere around .300 and that batters tend to have some influence over their BABIP - although nowhere near as much as you might think. ...

A Stupid Position Battle: Yankees Right Field, or, How Much Better Would the Yankees Be With Manny Instead of The Swish?
Driveline Mechanics — ... , indeed, he was one of the most luckless hitters in baseball in 2008. Despite worries from even some more informed Yankees followers, he's offense more than good enough, given even average defense, to "play" in the corner outfield, where a +7.5 bat is good enough to be average, given the positional adjustment. But it's not all about the offense, is it? ...

Beware the BABIP
The Cub Reporter (TCR) | A Chicago Cubs Blog — ... People smarter than me and with way more time have been trying to calculate exactly what influences BABIP and a recent study posted at The Hardball Times has made some headway on the topic. Now with the advent of keeping track of batted ball types (line drives, ground balls, flyballs, infield flys, etc)  a bit of an illuminance has been shed on the subject.  At one time, ...

Can Albert Pujols Win the Triple Crown?
Baseball Analysts — ... Chipper and Pujols also excel at earning surefire hits by putting the ball out of play and over the fence. Low strikeout and high homerun totals give players a good chance at having a high average. The rest is dependent on BABIP. The factors that go into BABIP, according to an article by Peter Bendix and Chris Dutton, boil down to pitch recognition, speed, the ability to make solid contact, and the ability to spread the ball to all fields. Pujols hits a lot of line drives (20% career), and has incredible power (22.7% HR/FB, 84 XBH/year). He rarely swings, but when he does ...

Are the Nationals not only the Worst Team in MLB, but also the Luckiest?
FJB — ... There are more sophisticated ways to measure expected BABIP, but just eyeballing it, if the Rangers' 22.1% LD rate got them to a .329 BABIP over the course of a full season, even if the Nationals could maintain a 23.8% LD rate (and they can't, but let's just say), they'd be due for some significant regression from their current sky-high .345 BABIP. And if you knock 15-20 points off that number, that's a significant number of seeing-eye grounders, bloopers, and also sharply hit balls becoming outs. ...

What Does Vegas Think of the AL?
Vegas Watch — ... between the average of PECOTA and CHONE (77.5) and their closing O/U (83.0) was the highest in baseball.  The most impressive part is that all of this has come without their ridiculously valuable catcher.  Their 11-11 record would indicate that Minnesota has lived up to Vegas' high billing in the early going, but their -23 run differential tells us otherwise.  I'm not sure how relevant this is to their projection, but one of the commenters on their depth chart pointed out that traditional BABiP models underrate 2/3 of their lineup, so it's possible that is having a small ...

Player Profile: Jimmy Rollins by Marc Normandin and John Perrotto
Baseball Prospectus — ... home runs as well, going deep just 11 times and seeing his Isolated Power drop from a career high of .235 to a mark closer to his career rate. Part of this was because Rollins started to hit the ball on the ground as often as he had in the past (his non-2007 seasons), and he also hit line drives more often than he had since 2005. While liners correlate most closely with hits, Rollins ended up being hit-unlucky, with a BABIP roughly 30 points lower than it should have been ( based on recent work with BABIP models). Some of that poor luck from 2008 has spilled ...

Can David Wright Sustain His Torrid Pace?
Amazin' Avenue — ... Line Drives, however, are only one influence on whether balls put into play fall for hits. Examining some other factors that influence BABIP provides an opportunity to both determine whether David's BA prognosis is less grim than it seems, and also test the claim that he's become more of a "speed guy". Dutton and Bendix's famous article on predicting BABIP doesn't provide an alternative formula to the .120 rule, but does identify other factors that should be considered. These factors are, roughly: batting eye, both pitches seen and BB and K; LD%; FB/GB ratio; speed; contact rate; ...

Tuesday Nats Stats: Bad Luck or Curse?
Federal Baseball — ... BABIP is something a hitter has some control over, based on how hard he hits the ball and how many line drives he can hit. Predicting what a batter's BABIP "should" be (as opposed to assuming it should be the league-average .300 or so) is not the most reliable statistical endeavor out there, but I took a look at a simplified "expected" BABIP model based on this work at The Hardball Times. Anyhow, according to the magic formula, we'd expect Guz, NJ, and Zimmy all to have BABIPs around .305. They hit the ball hard, so we expect them to do better ...

Placido’s Quiet Lumber
FanGraphs Fantasy Baseball — ... and .344. So, what has caused Polanco’s forgettable 2009 season? The first thing that catches one’s eye is a .263 BABIP, leaps and bounds below his .321 mark in 2008. How much should we expect that figure to bounce back? To try and answer that question, let’s use a BABIP estimator from The Hardball Times. Derek Carty of THT developed a BABIP calculator, based on the great work that former Rotographs writer Peter Bendix (along with Chris Dutton) conducted this past winter. In their study, Bendix and Dutton included many more ...

The Taming of the Drew?
AZ Snakepit — ... analysis missed, is that not everyone deserves a league average BABIP: hitters have more control over it than pitchers. The classic example is Chris Young, whose BABIP this year is very low, because of all the infield pop-ups which are almost guaranteed outs. Different kinds of balls in play have radically different results, and other factors such as speed, ballpark, etc. also affect BABIP. The guys over at The Hardball Times have done some excellent work in this area - see this article for an explanation of xBABIP, an enhanced version of BABIP, which ...

Astros batters: Do you feel lucky?
The Crawfish Boxes — ... Two graduate students, Dutton and Bendix, developed a multiple regression analysis for predicting a player's "expected" BABIP, which they described in  a late 2008 Hardball Times article.  The spreadsheet and quick tool for applying their model to a given player was later provided ...

Granderson’s Just Fine
FanGraphs Fantasy Baseball — ... It would be easy to simply declare, “he’s been unlucky” and move on. But thanks to some outstanding work done by Chris Dutton and Peter Bendix on what factors influence BABIP for hitters, we can go much further than such a cursory statement. ...

Swisher’s Resurgence
FanGraphs Fantasy Baseball — ... and 2008: 2007: 15.6 BB%, 24.3 K%, 17.5 LD%, 0.81 GB/FB, 9.5 IF/FB%, 16.6 O-Swing%, 85.8 Z-Contact%. 2008: 14.2 BB%, 27.2 K%, 20.9 LD%, 0.78 GB/FB, 11.1 IF/FB%, 18.9 O-Swing%, 86.2 Z-Contact%. There are slight changes, but certainly nothing earth-shattering. Yet, Swisher’s BABIP plummeted from .308 in ‘07 to .251 in ‘08. According to this expected BABIP tool from The Hardball Times (based off research done by Chris Dutton and Bendix), Nick was terribly unlucky. Swisher’s rate ...

Player Profile: Nick Swisher by Marc Normandin
Baseball Prospectus — ... like Swisher does, it's a little easier to deal with in the short term. Almost immediately after that piece was published, Swisher caught fire: he hit .315/.402/.630 in June with seven homers. This didn't continue though, as his second half line was an ugly .191/.298/.427; the only good news there is that his power was clearly back, he just couldn't catch a break on getting a ball to land somewhere besides a defender's glove. Following the season, Bendix co-developed a system that blows both of our previous batted-ball adjustment efforts away. It's much more complicated and ...

Buy Low on Soto
FanGraphs Fantasy Baseball — ... Carty’s tool is based upon the excellent research of Peter Bendix and Chris Dutton. Their work found a positive relationship between BABIP and batter’s eye (BB/K rate), line drive percentage, Speed Score and P/PA. Dutton and Bendix’s XBABIP model ...

Related Content
Statistical Sleepers-Batter's BABIP
fantasypros911.com 1/2/2009
Getting to know BABIP
riveraveblues.com 1/28/2009 — Sometimes in baseball things happen that we just can’t explain, and when it does happen we call it luck. Good luck, bad luck, whatever. One of the biggest statistical luck fiends is BABIP, or Batting Average on Balls in Play. Nick Swisher ...
What’s the best BABIP estimator?
hardballtimes.com 1/26/2009 — A head-to-head comparison of seven BABIP estimators. Which one should we be using?
Fantasy Phenoms - BABIP
fantasyphenoms.com 2/4/2009 — Batting Average on Balls In Play, or BABIP, is generally around .290 for pitchers. Looking at the 15% on both sides of the spectrum, we can identify pitchers who should have lower or higher WHIPs the following year ...
Nick Swisher: NOT Dating Danielle Gamba
fantasybaseballdugout.com 11/17/2008 — Now that Nick Swisher has joined the New York Yankees, the city must be buzzing about his “relationship” with Danielle Gamba. Thanks to a hott tip from “Trish” on August 25, we learned that Swish NEVER dated Danielle. So, your friends at Fantasy Baseball Dugout want to ...
Cleveland Indians Sabermetrics 101: BABIP
kankasports.blogspot.com 12/11/2008 — An introduction to pitcher's batting average on balls in play, using the Indians as an example.
2008 Statistical Sleepers - Batting Average on Balls in Play (BABIP)
fantasypros911.com 1/2/2009 — A look at 2008 players who had bad luck in average from a decreased BABIP.
Nick Swisher: Overrated?
hardballtimes.com 11/17/2008 — Monday, November 17, 2008 Nick Swisher: Overrated? Posted by Victor Wang at 1:06am For this article I was originally planning to write about value picks for next year. One of the guys I was planning to include was Nick Swisher. However, some recent discussion has caused me to ...
Swisher looking for a much better year in ‘09.
zellspinstripeblog.com 11/19/2008 — Everybody..let’s welcome Nick Swisher to the New York Yankees. [clap, clap, clap] It seems to me like Nick Swisher is very excited about wearing Yankee pinstripes. I have seen Swisher play, and many say he gives the game everything he’s got. He had an “off-year” in 2008, but he is looking to put th
Swisher Happy to be a Yankee and Wants CC to Join Him
slidingintohome.blogspot.com 11/20/2008 — From Pete Caldera: At Nick Swisher’s personal Web site, there is a photo of the Yankees’ newest acquisition with a smiling CC Sabathia, taken at a charity event in December of 2006. "I’ve known CC a couple of years now. He’s just a great guy," Swisher said during a ...
How the New York Yankees Can Rebuild Their Dynasty
Bleacher Report - MLB 12/5/2008 — The Yankees missed the playoffs this year, to the Rays. The surprise told the Yankees "you need to do something." What is that something? The Yankees should aim young; at first base, sign Tex. This will be the biggest ...
Re-Building a Dynasty: Yankees Round Table Options
Bleacher Report - MLB 12/5/2008 — The Yankees missed the playoffs this year for the first time in over a decade, while the Rays had October baseball in Tampa for the first time. The surprise told the Yankees "you need to do something." What is that something? The Yankees should ...
Re-building A Dynasty; The Yankees Round Table Options
Bleacher Report - MLB 12/5/2008 — The Yankees missed the playoffs this year for the first time in over a decade, while the Rays had October baseball in Tampa for the first time. The surprise told the Yankees "you need to do something." What is that something? The Yankees should ...