
Strike Zone: Fact vs. Fiction
Did you know that right-handed batters have to defend a bigger strike zone than lefties?
18 Comments
  • ChiDan
    +1

    Great article; very thought-provoking.  One question, though: it isn't obvious that setting the "reality" line at 50% balls/50% strikes is analytically correct.  The "rulebook" line is set at (essentially) 0% balls/100% strikes.  You clearly could not choose a reality line at 100% strikes (because there is error, etc.), but perhaps a line at 20% balls/80% strikes would more accurately capture umpire consensus on what was a ball and what was a strike.  From eyeballing the charts, I note that this would change the conclusions somewhat.

    Either way, a great article -- good idea, great analysis.

    Posted 7/11/2007
    • walshj58
      +1

      Thanks, ChiDan.

      I suppose you could choose the 20/80 point to define your strike zone. But, if you did that, you'd be faced with the situation where just outside your defined zone, i.e. balls by definition, around 80% of pitches are called strikes.

      Posted 7/11/2007
      • pizzacutter
        +1

        John, 50/50 is the accepted cutoff in signal detection research, which this would fall squarely into.

        Posted 7/14/2007
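
A minimal sketch of the cutoff question raised in this thread, assuming a made-up strike-fraction curve (the numbers are illustrative only, not taken from the article's data). The function names `strike_fraction_by_bin` and `edge_at_threshold` are hypothetical. The idea is to bin called pitches by horizontal location, compute the fraction called strikes in each bin, and interpolate where that fraction drops through a chosen cutoff.

```python
import math


def strike_fraction_by_bin(pitches, bin_width=0.1):
    """Group (x_feet, is_strike) pairs into bins of width bin_width.

    Returns a list of (bin_center, fraction_called_strike), sorted by location.
    """
    counts = {}
    for x, is_strike in pitches:
        idx = math.floor(x / bin_width)
        total, strikes = counts.get(idx, (0, 0))
        counts[idx] = (total + 1, strikes + (1 if is_strike else 0))
    return [((idx + 0.5) * bin_width, s / t) for idx, (t, s) in sorted(counts.items())]


def edge_at_threshold(curve, threshold=0.5):
    """Find where the strike fraction first drops through `threshold`.

    `curve` is a list of (location, strike_fraction) ordered from the middle
    of the plate outward. Uses linear interpolation between adjacent points.
    Returns None if the curve never crosses the threshold.
    """
    for (x0, f0), (x1, f1) in zip(curve, curve[1:]):
        if f0 >= threshold > f1:
            return x0 + (f0 - threshold) * (x1 - x0) / (f0 - f1)
    return None


# Illustrative curve only: distance from center of plate (feet) vs. strike fraction.
ILLUSTRATIVE_CURVE = [(0.5, 1.00), (0.7, 0.90), (0.9, 0.60), (1.1, 0.20), (1.3, 0.00)]

if __name__ == "__main__":
    for cutoff in (0.5, 0.8):
        edge = edge_at_threshold(ILLUSTRATIVE_CURVE, cutoff)
        print(f"cutoff {cutoff:.0%} strikes -> zone edge at {edge:.2f} ft")
```

With a curve shaped like this one, the 80% cutoff puts the edge closer to the plate than the 50% cutoff does, so a band of pitches that are mostly called strikes ends up outside the defined zone, which is the situation walshj58 describes.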
  • constancio
    +1

    John,

    Very nice work.

    Have you considered using the regular old scorer-inputted x,y values to help debug the erroneous Enhanced Gameday data (such as the intentional ball recorded as down the middle of the plate)?

    Also, I would be interested in how the combination of pitcher and batter handedness influences the size of the strike zone.

    Posted 7/11/2007
    • walshj58
      +1

      Hey Chris,

      Thanks.  I have a few ideas for trying to figure out what's going on; using the stringer's x,y points was actually one of them.  I'm also working on a simple simulation to see if I can reproduce my ball fraction curves with some basic assumptions.

      There are many ways you could slice and dice the data, of course: pitcher handedness, home plate umpire, ball-strike count, etc.  I think once we've figured out the data quality issue, we can start tackling that stuff in earnest.

      Posted 7/11/2007
  • coryd1866
    +1
    As an NCAA umpire I deal with this on a daily basis. The college strike zone is much larger than that of our professional colleagues. We are told to get the high strike, which is actually a ball above the belt, and to call strikes at the bottom of the knees. We also call a much wider zone than the pros. Typically, if the pitcher hits his spot and the pitch is within the vertical boundaries of the strike zone, it's a strike, even as much as six inches off the corners. Otherwise we will be there all day.
    Posted 7/11/2007
  • ultxmxpx
    +1
    I have noticed strikes that caught the edge of the plate shown quite a distance from the plate on Gameday. As a matter of fact, most of those were away from lefties, or at least on the left side of the plate from the catcher's perspective. That's based on a small set of occasions (15 or so?) when I've watched both the game and Gameday. It could be an error in the system and the location where the pitch is placed... it seems like they should be able to fix that, though. Maybe it's located on the right side of the field or something like that. It does seem like strikes up in the zone are rarely called, though.
    Posted 7/11/2007
  • thrower25usr
    +1

    Great read; I just wish the data were more reliable.

    coryd brings up the point of pitchers hitting their targets, and I think that influences the umpire greatly. Missing a target matters even more. If the catcher is set up on either corner and has to reach to the other side, the umpire may call it a ball, even when the pitch is pretty close to the middle.

    Posted 7/11/2007
    • walshj58
      +1

      Thanks.

      Just one comment: the data, while not perfect, is light-years ahead of anything we had before. I, for one, am not complaining! It's important to understand the limitations, but even with imperfect data, we can learn an awful lot.  Plus, I'm convinced that things will improve: we are in the infancy of this industry, and once some of the early glitches are corrected, things will be much better.

      Posted 7/11/2007
  • gdc
    +1
    I would not expect a vertical line at the corners of the plate even with an electronic system, as the pitch location data is probably on a plane such as the beginning of the plate.  Some of the overhand pitches just off the plate would be a ball whereas some of the sidearm pitches with late horizontal movement would just miss the front edge of the plate and cut a bit afterwards.
    Posted 7/11/2007
  • cthulujones
    +1

    Great article!

    If this is like other statistical samplings based on the performance of several individuals, aren't the results also likely being skewed by a certain % of umps?

    Can you tell from the data who's more likely to call the off-the-plate strike... and whether who's pitching has any correlation with those calls?

    Posted 7/12/2007
    • walshj58
      +1

      Thanks.

      Sure, this is an average of the umpires who worked the games in the sample.  When more data is available, I think we'll be able to break it down according to umpire.

      As for who's pitching, Dan Fox recently looked at this issue in an article on BPro.  He didn't look at the size of the strike zone, but rather the percentage of "missed" calls.  He didn't find a strong effect, either for hitters or pitchers.

      Posted 7/12/2007
  • t orssten
    +2

    Thank you for addressing a much-needed discussion. Since I was catching as a kid in the '50s, the accepted rulebook-defined zone was knees to letters (nipples) and the width of the plate, based on stance (think Rickey Henderson to Jose Canseco). Now we have umpires with an ever-shifting, personal-discretion strike zone who are ruining the game! There have always been bad calls and BAD umpires; we're only human, and dirt does blow around the plate, and heat causes sweat to drip in the eyes, and the occasional foul tip will rattle your head, but the fact is you (WE) have a disproportionate number of very bad umpires. They set themselves up as infallible; you can't look or stare or question them 'cause they'll toss you. The so-called scrutiny they're under is lacking teeth and needs to be more strictly enforced and refreshed. A refresher course and testing should be mandated, especially directed at the umps who are always being barked at. When neither the pitcher, hitter, umpire nor fan knows the strike zone until the game starts, something is very wrong. We're not talking about the soup of the day, this is America's Pastime! Before things get worse, is there a way to make our thoughts known to MLB?

    Posted 7/12/2007
    • coryd1866
      +1
      @ t orssten I can't disagree with you enough. MLB umpires are the most professional, well-trained, and experienced people in their field of work. To say there are a disproportionate number of bad umpires just shows your ignorance and lack of baseball knowledge. It seems you just don't like umpires in general, which is fine, but don't throw a blanket statement over all MLB umpires; it's just not warranted. And if you think you can do it better, go out and umpire a Little League game and see if you have what it takes.
      Posted 7/13/2007
  • pizzacutter
    +1
    John, your usual excellent work.  As someone who occasionally cleans data for a living (we all have to do something!), what are your general impressions about the "purity" of the data?  You mentioned the particularly awful example (the pitchout/intentional ball scored as a fastball down Main St), but how often does something like that happen?  What about other things that look fishy but might have a legit explanation?
    Posted 7/13/2007
    • walshj58
      +1

      Thanks.

      I think we're still a ways from understanding the purity of the data, and we probably haven't been doing due diligence in running checks. The problem is, this data is so cool, it's too tempting to dive in and start doing stuff.

      I don't know what percentage of pitches have problems, but I'm working on estimating that number.  Another problem I just noted today involves the lower and upper limits of the strike zone, which are set for each batter by an MLB operator. I believe these should be very stable (especially the lower limit), but in fact I find a fair amount of variation.

      I'm sure there are other things that will come up.

      Posted 7/13/2007
  • watercott
    +1
    It appears from the data that umpires are calling the top of the zone correctly. I am 100% sure that this is not the case. I almost never see a ball more than 3-4 inches above the belt being called a strike. I'd guess that the method for defining the upper bound of the strike zone on Gameday is not using the rulebook definition.
    Posted 7/13/2007
    • walshj58
      +1

      Another person made the same observation.  There may be some problem with the data, but if you are judging the strike zone from TV broadcasts, you are always going to judge a pitch lower than it actually is. That's because the pitch is moving downward and the catcher is positioned around 2.5 feet back from the batter.  Many pitches will drop 3-6 inches (or more) over that distance. So, your 3-4 inches becomes 6-10 inches and now we're in the ballpark of the rulebook strike zone.

      As I mentioned above, though, there are some issues with the setting of each batter's strike zone that need to be addressed. But keep in mind that it's very hard to call (vertical) balls and strikes from the TV.

      Posted 7/13/2007
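
The arithmetic in the reply above can be written out as a short sketch. It only adds the two ranges quoted in the thread (apparent height above the belt as judged from TV, plus the drop between the plate and the catcher); the function name `plate_height_range` is hypothetical, and the simple addition of ranges is an assumption about how the numbers combine.

```python
def plate_height_range(apparent_in, drop_in):
    """Add an apparent-height range and a drop range, both in inches.

    apparent_in: (low, high) height above the belt as judged at the catcher.
    drop_in: (low, high) drop between the plate and the catcher.
    Returns the (low, high) height above the belt at the plate.
    """
    return (apparent_in[0] + drop_in[0], apparent_in[1] + drop_in[1])


if __name__ == "__main__":
    # 3-4 inches as seen on TV, plus a 3-6 inch drop over roughly 2.5 feet.
    print(plate_height_range((3, 4), (3, 6)))  # (6, 10)
```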
Blog Reactions

Size and Shape of Strike Zone Dependant on Batter Handedness
Another Baseball Blog — ... Hardball Times: Strike Zone: Fact vs. Fiction ...

Tip of the Iceberg
Dan Agonistes — ... Strike Zone: Fact vs. Fiction. John Walsh totally steals my thunder by examining the actual dimensions of the strike zone as it is called by major league umpires. What I find interesting is that he notes that right-handed hitters end up having to defend a strike zone that is slightly larger while I've found that left-handers are getting 10% more strikes called against them on pitches out of the strike zone. In looking at John's data I think the reason for this is that left-handers have to defend more territory on the outside part of the plate and pitchers concentrate on this area throwing a disproportionate number of their pitches in that region. ...

The eye of the umpire
The Hardball Times — ... wrote about his ability to judge where a pitched ball actually goes, from his book The Science of Hitting : It's very likely that once you've made yourself sensitive to the strike zone, you'll be a little more conscious of what you think are bad calls by the umpire ... I would say umpires are capable of calling a ball within an inch of where it is. As a hitter, I felt I could tell within a half-inch. Well, I'm skeptical by nature, and those estimates seem a trifle too good to me. But Williams was a very smart guy and he wasn't one to throw a lot of bullshit around, so I wouldn't dismiss his claims outright. And it turns out that we can shed some light on the subject by looking at MLB's fabulous pitch data, the so-called pitch-f/x data. Today I'm going to build on some work I did last time ( Strike zone: fact vs. fiction ) on determining the size of the strike zone using pitch data. As we'll see in a f ...

Kameron Loe and the random strike zone
Go Rangers! — ... First of all, a little explanation of what I did. I broke down each start by the description of the pitch in the data. I then took the balls and called strikes and charted them, to get an idea of where the strike zone was being called on that day. This can differ quite significantly. By using just balls and called strikes, we’re not seeing how the batter influenced the call (by swinging his bat or hitting the ball), so this should be what the umpire’s influence is. John Walsh has done some great work on how the strike zone is being interpreted compared to Gameday, much more detailed that I am doing here. In my analysis of these four starts, I found the strike zone varied a little, but the left side (from the catcher’s viewpoint) was around 1.2 feet off the center of the plate, and the right side about 0.5 feet off center, except the last start where it was 1.1 feet off. This se ...

The slings and arrows of outrageous strike zones
Go Rangers! — ... The more I dig into the Gameday data, the more I find these annoying inconsistencies. If he’d struck out on pitch four, I’d have no problem with it. If he’d struck out on pitch five I’d have no problem. But pitch six I do. The Hardball Times recently reviewed the strike zone, and showed that while pitches are overall called fairly well, there is a huge gulf around the edges of the strike zone. Yes, you’d expect that, but not to the extent that it happens, and not horizontally as much as vertically (because the horizontal strike zone is fixed, while the vertical varies with the height of the batter). ...

May I have Seconds?
Baseball Analysts — ... There are a number of cases where pitches are badly tracked, and another problem with the system is that it occasionally picks up a ball transfer between the umpire and pitcher. I haven't done any digging into this, so this is pure speculation, but knowing more about how the values are calculated, I think perhaps these two problems are related. If the initial values are somehow wrong (they correspond with the ball exchange), the x,y coordinates for where the ball crosses the plate are going to be calculated correctly for the ball exchange, but will not match the reality of the pitch. ...

A.J. Burnett Pitch Analysis - Then and Now
Toronto Blue Jays — ... I tweaked the location a little- a nice feature of pitch f/x is that it measures the strike zone for each batter. I think it’s more important where in the zone the pitches are than the actual physical location, so each pitch is adjusted to the % height of the batters strike zone. 0 is at the knees, and 100 just below the letters. The Horizontal zone is defined as it’s actually called, a foot either way from the middle of the plate. Burnett actually threw 5 changeups in his first start as well, but for some reason the pitch f/x was having trouble recording them. ...

Eric O'Flaherty
Lookout Landing — ... I generated a pair of charts that I think should be pretty easy to understand. The first shows O'Flaherty's pitch location against lefties, split up by type (he pretty much only throws a fastball and a slider). The estimated strike zone is an approximation of that derived by John Walsh. ...

StatSpeak World Famous Roundtable: April 21
MVN RSS — ... it is still a strike?  I’ve seen it happen plenty and an umpire looking straight ahead would clearly see if a pitch crosses the plate.  I’m not too positive how it would not work, if the ultimate goal of umpiring is to be as accurate as possible, but some potential arguments against it would stem from a part of the majesty of chance being non-existent.  Pizza Cutter: Umpires don’t call the high strike and haven’t for several years, although there’s some evidence to suggest that they call the strike zone a little wider than it’s written in the rule book.  It makes sense.  If ...

Quite An Achievement for Adam Eaton
Baseball Digest Daily — ... : the Washington’s Jon Rauch. Two others are equally as tall as Johnson: San Diego’s Chris Young and the Mets’ Eric Hillman, whose brief Major League career started in 1992 and ended in 1994. Randy Johnson’s strike zone is freakin’ huge, but exactly how huge is it? We know home plate is 17 inches wide, but according to John Walsh’s excellent data-driven article on the strike zone: It appears from this data that the umpires' strike zone is about two inches too wide on each side [for right-handed hitters], compared to the rulebook strike zone. So, generally speaking, Johnson’s ...

18-33, Some Stuff
Lookout Landing — ... Yeah, that's a called strike three on a pitch well off the plate. Giambi obviously didn't think much of it, but then, this isn't a new phenomenon - umpires just have a crazy different strike zone for left-handed hitters. Check out this article by John Walsh, and, if you want to skip the meat, scroll down to the bottom. Hello, outside strikes. I don't know how or why this is the way it is, but a smart pitcher - that is, someone who's aware of PITCHf/x - that is, Brian Bannister - - should absolutely be using this to his advantage. Unfair or not, that's the reality, and it ...

40-65
Lookout Landing — ... And finally, even with all of those built-in excuses, Felix's performance just wasn't as mediocre as his pitching line would suggest. More telling than the four walks is that he threw 63% strikes - above-average - and had another eight pitches in the strike zone called balls by HP umpire Tom Hallion. He generated 12 swinging strikes, all against left-handed hitters, and whiffed six of the 29 batters he faced. And of the 19 balls the Rangers put in play, 11 were grounders and two were infield pop-ups. So the BIP distribution was right where you'd like it to be. Really, aside from ...

45-70
Lookout Landing — ... JJ got five fastballs up past 97mph tonight. All of them strikes. This was the best he's looked in a while (not that he's really made that many appearances in a while), and his strikeout of Navarro to end the inning was completely unfair. After getting him started with one of those lefty strikes just off the outer black, he came back with an outside fastball above the belt, then an outside fastball at the letters, and followed that up with a low-away splitter for the 0-2 swinging strikeout. That's the kind of thing that JJ gained by learning the splitter in the first ...

Do the A's take too many strike threes looking?
Athletics Nation — ... So there you go fans. Yes, the A's may be taking a few too many called strike threes they should be swinging at. And yes, a lot of them are pitches right over the plate, but probably not the amount you might think. It also doesn't help that they aren't getting the proper strikezone called, especially the left handed batters. This is likely a trend throughout MLB though, as this older THT article from John Walsh highlights. That article is outstanding by the way, and I suggest if you have the time to give it a full read. ...

Do the A's take too many strike threes looking?
Beyond the Box Score — ... So there you go fans. Yes, the A's may be taking a few too many called strike threes they should be swinging at. And yes, a lot of them are pitches right over the plate, but probably not the amount you might think. It also doesn't help that they aren't getting the proper strikezone called, especially the left handed batters. This is likely a trend throughout MLB though, as this older THT article from John Walsh highlights. That article is outstanding by the way, and I suggest if you have the time to give it a full read. ...

Are LHB being unfairly struck out looking on pitches away?
Beyond the Box Score — ... Well, the easiest thing is just to tell the umpires to call the correct strike zone and it would make things nice and simple. Considering this fishy LHB business has been going on for a while though, it may not change anytime soon. So what else is to be done? ...

Deconstructing the Fastball Run Value Map
Baseball Analysts — ... called strike zone compares to the rulebook strike zone. The inside and the top of the zone are called fairly well (the 50% contour runs along the rulebook zone on these edges), but the outside edge is shifted away a couple inches (the 75% contour runs along the rulebook zone's outside edge) and the bottom of the zone is shifted significantly up (the 25% contour is ABOVE the bottom edge). In addition, the strike zone is rounded rather than rectangular. These results are not new. John Walsh, ...