Does reliever over-use lead to poor subsequent performance?
The Hardball Times found this 12/19/2007 on www.hardballtimes.com
Tags:
MLB
Comments (51)
-
-
John Beamer Sorry MGL -- just dropping comments as I go through this:
You say: Now, just to be clear, even with regression toward the mean, I am not saying that these pitchers did not pitch better in year X+1 than in year X. They certainly did. It is just that if regression toward the mean were the only “force” at work here, their performance in year X would have been, by definition, better than their true talent —in other words, these pitchers as whole, or collectively, were a little unlucky in year X and then regressed to or at least toward their true talent in year X+1.
In the bold can you pls explain how year X performance is better than their true talent. If you regress them to a lower ERA in year x+1 doesn't that mean that as a group those pitchers were worse than their true talent in year x? If I am not mistaken I think you imply that in your next paragraph.
-
tangotiger I noticed that too. I think it's a typo. I'm sure he meant 70 or more in year x.
-
-
etan8_today With all due respect to statistical genius and the baseball world's zeroing in on measuring reality through numbers, I wonder what has happened to the human arm. What happened to the prototype work horse?...the complete game?....the 300 strikeout season?
It seems strange that we have witnessed the rise of the reliever over the past 30 years and simultaneously witnessed the increase in personal trainers and medical breakthroughs. We live longer, physically healthy lives and yet arms are falling off like leaves. Something doesn't add up here. There was a controversial article about Mike Marshall printed on Yahoo last season describing his Florida academy and alternative approach to pitching techniques. I say controversial because Marshall sees himself as the black sheep of baseball because MLB clubs blow him off as a charlatan. I know nothing about pitching mechanics, but what do we lose by trying new methods?
-
skyking162 Sorry I don't have links, but there's been research done that shows pitchers these days aren't any more fragile than pitchers of yesteryear. Going back forty years, pitchers threw fewer pitches per batter and recorded outs at a much higher rate. Pitchers these days throw the same number of pitches, but need more per hitter and record fewer outs, thus keeping their IP totals lower. Going back even further, top pitchers used to need less effort to record outs thanks to shallower leagues.
We also tend to remember the workhorses of the past and compare them to the babies of today, which isn't fair. Looking back in 20 years, we've got plenty of workhorses right now -- Randy Johnson and Roger Clemens are two all-time greats who seemed to last forever.
-
halejon Not to mention they're throwing a heck of a lot harder these days...
-
-
-
RedsManRick An interesting point etan. But in that vein, I have to wonder if in the past, our comparative ignorance to the health issues forced a natural selection of only those pitchers capable of handling such great workloads. Many great pitchers may have fallen by the wayside in college or the minors due to abuse. Also, I believe that career length, as measured by years has increased over time.
Perhaps it's not that pitchers today are weaker in some way, but that teams are choosing to use their pitcher career's worth of innings over a longer number of seasons, as doing so minimizes risk of catastrophic injury and helps maintain the highest level of effectiveness. I really don't know how many promising young careers in the 50's and 60's were ruined at age 27 or 28 after the strain of a few 280+ IP seasons. I'm sure there's been a study on this done by the guys at THT or BP.
-
MGL John, thanks, you caught 2 "typos." One, I meant "year X" as Tango says, and two, I meant "worse" rather than better.
When I first read the article quoting Fregosi, I thought, "Well, that is an easy thesis to test." After working on it though, I realized that with all of the regression and selective sampling issues, it is not so easy, especially if we are looking for small effects. Which is one reason why I "concluded" that there is no significant evidence that an "over-use" effect exists, rather than something stronger, like, "There is clearly no over-use effect."
Another thing that I thought was interesting was how many relievers do in fact pitch in one year and not in the subsequent year. In fact, there were more innings or relievers (I can't remember which) who did NOT pitch in a subsequent year than who DID pitch. That creates a huge (.2 or so in ERA) selective sampling issue if you want to look at things like aging or what I was looking at. I don't think you have nearly the same magnitude of the "drop out" syndrome with starting pitchers. The only starting pitchers who drop out from one year to the next, I think, are the truly awful ones who have not had a track record of being good, and the old ones who also probably pitched badly in their last year.
So, for pitching aging studies, I think you should only use starters (of course, then the results are really only limited to starters, although I am not sure that relievers should have significantly different aging patterns), or try and reconstruct what would happen the next year if the pitchers who dropped out were still pitching, or something like that.
-
fjm235 1) We all know ERA is a rather poor indicator of pitcher performance, especially for relievers (due to inherited/bequeathed runners). So why not use DERA or some other measure tied to the 3 true outcomes?
2) You could solve most (though not all) of your qualitative issues between the two groups by simply imposing a lower bound on IP for Group 1. In 2007 there were 43 pitchers who had 70+ IP in relief and 44 who had between 60 and 69 2/3. It's not critical that the 2 groups be that close in number, but it helps. The difference in ERA between them is modest (3.27 vs. 3.46). The biggest difference between the two is that the 70+ group contains only 6 full-time closers vs. 12 in the other group.
-
MGL 1) When you are doing a pitching study involving large groups of pitchers and your sample or samples are unbiased (with respect to parks, defense, opponents, etc.), ERA is just fine. Who said that ERA is a rather poor indicator of performance? That is not true. There is more noise as compared to something like ERC, but in the long run, ERA will converge with any better short-term measure. And the (old) notion that ERA is really bad for relievers is nonsense. Again, because some relievers come in or leave in the middle of innings, there will be more noise, but when working with large unbiased samples of relievers, ERA is fine, and just as good as it is for starters.
2) I'm not really sure what you are getting at here.
-
The_Real_Neal I had to check the calendar on this one and check to make sure that it wasn't near the start of the season.
There's a few real 'huh?'s in here.
" any group of pitchers who show a better than average (for them) ERA in one year will always display a worse ERA in any other year "
Now what you're trying to say is if you have a large enough group of guys who performed well over their 'true ERA' for a given year, the next year their ERA will tend to rise. But it's certainly not true for any group of pitchers. For instance, my group is going to be guys who finished in the top 10 in Cy Young award voting who did not receive any votes the previous year.
you find that their collective ERA is less than that of their true mean (the mean of the population of pitchers from whence they come),
I've read this article three times, and never once have you correctly identified a pitcher's 'True Mean'.
Let me give a very brief example of what you're saying here.
Greg Maddux has pitched from 1987 to 2007, 21 seasons. During that time his ERA was 3.07 (making numbers up). The Mean ERA for all pitchers for that period was 4.12. Therefore Maddux's 'True Mean' is 4.12, and any season in which his ERA is under 4.12 is the vagaries of luck.
To do the study correctly, you can't say 'all pitchers have an average ERA of 3.88, therefore each pitcher's 'true' ERA is 3.88'. Each pitcher's 'true' ERA is going to be their own ERA. Whether you want to define it as the years before and after your data point or the career ERAs of the players, your definition of 'True ERA' is totally incorrect, so everything that you've done afterwards is, sadly, worthless.
1. Identify your players
2. Identify the 'true mean' for each of those players
3. Analyze how each player's ERA changed in relation to their own individual 'true mean' in X+1
4. I think the bounceback effect Fregosi describes would be better illustrated by a three year cycle. 70+ innings, 60 or fewer innings, 70+ innings
5. Delete this and put up your new, corrected study, preferably with a little less of the 'Fregosi and his ilk' condescension.
-
MGL The_Real_Neal, thanks for the lessons in sabermetrics and statistics.
"Delete this and put up your new, corrected study, preferably with a little less of the 'Fregosi and his ilk' condescension."
I'll get right on it.
In all seriousness, my wording in some of the sentences you quoted, and probably others from the article, were not so great. However, anyone who has a mentality over the age of around 12 AND understands a bit about sabermetrics or statistics should understand exactly what I was saying, which was 100% correct. You apparently do not fall into that category, although I am not certain which shortcoming you have, only that it is one or the other.
-
The_Real_Neal MGL,
Very mature response.
You said that the 'True Mean' ERA for all pitchers in your study is the same.
That's not 100% correct. That's not even 5% correct. Each pitcher has their own 'true mean ERA'.
If you want to compare IQ's or Statistic course grades (are we talking about undergrad or graduate school?) that's fine. The fact remains that you've misused a very basic concept. The fact that I've even given you a very easy to understand example to illustrate how you've misused it, which you can't or refuse to grasp, makes me feel like I'm Tom Cruise trying to explain to Rainman that 'Who's on First' is a joke.
You've bought a statistics book, skipped right to the formulae but failed to understand how to correctly apply the formulas. You'll notice that I didn't challenge any of the mathematics or statistics, I am sure you did all that reasonably well. What I am challenging is your underlying theory that all pitchers are equal. All pitchers are not equal and it baffles me how you could have ever come to that conclusion.
Let me make this very easy for you to understand, we are going to use a pitcher from your study.
Lee Smith. His career ERA was 3.03. The League average ERA for his career was 3.97. Which of these two values should represent his 'True Mean'? The value which includes other, lesser, pitchers or the one that only includes his performance?
In 1983 Smith pitched 103.1 relief innings and had a 1.65 ERA.
The following year he pitched 101 innings and had a 3.65 ERA.
The way you've done your study is saying that in 1984 Lee moved towards the 'true mean ERA' of 3.97, which is what you should expect. This is incorrect. Lee's 'true mean ERA' should not be impacted by the ERA of Jeff Dedmon. Lee's 'true mean ERA' can only be derived by looking at Lee's statistics, not by looking at what an average reliever did over the given time period. If 3.97 was in fact the true mean ERA for Lee Smith, you would expect that he would have roughly the same number of innings pitched over a 3.97 ERA as he did under it. I am not going to add them all up now, but over his 18 seasons he had only two seasons where his ERA was higher than the league average. What are the odds of that happening, if his True Mean ERA was really 3.97? I think it's about 4000:1. That's more pitchers than were in your sample, and I could find dozens of others who would exhibit the same phenomenon. How is that possible? Could it be that you can't derive a 'true mean' for a given player by aggregating the averages for hundreds of players? I think so.
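The odds quoted above can be checked with a quick binomial calculation. This is only a sketch under a simplifying assumption that is not in the comment itself: each of the 18 seasons is treated as an independent coin flip with a 50% chance of landing above the league-average ERA. The function name `prob_at_most` is made up for illustration.

```python
from math import comb

def prob_at_most(k: int, n: int, p: float = 0.5) -> float:
    """P(at most k 'above league average' seasons out of n independent seasons)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Assumption: 18 seasons, each independently 50/50 to finish above league average.
p_tail = prob_at_most(2, 18)
odds_against = (1 - p_tail) / p_tail

print(f"P(2 or fewer above-average seasons) = {p_tail:.6f}")
print(f"odds against: roughly {odds_against:.0f} to 1")
```

Under that coin-flip assumption the odds come out nearer 1,500 to 1 than 4,000 to 1. Either way the result would be very unlikely if his true talent really sat at league average, which is the point being made.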
-
halejon "You've bought a statistics book, skipped right to the formulae but failed to understand how to correctly apply the formulas."
Oh, that's just priceless. You have no idea what you are talking about or who you are talking to. Regression to the mean has nothing to do with saying every pitcher's true ERA is the league average. I guess grad school ain't what it used to be.
-
-
MGL Snarkiness (yours and mine) aside, as I said, some of my words and sentences were awkward, such as "true mean," however it should have been fairly obvious what I meant.
When a pitcher posts an ERA (or any other sample of performance) of X, "our estimate of his true mean ERA," and by that I mean his "true talent ERA" over that same time period, or what his ERA would be if he were to pitch an infinite number of innings (thus removing all sample error) with the same average talent he had during that sample time period, AND what we would expect his ERA to be in any other time period (assuming that his true talent has not changed due to age or anything else), is X regressed toward the mean (will fall between X and that mean) of the population of pitchers (we are sampling from a population) from which this pitcher comes, whether we define that population as all MLB pitchers, all relievers, all LH pitchers, all LH relievers, etc. That also assumes that we know that there is SOME spread (variance) of true talent among the pitchers in the population. If there is not, then the ERA in that "other time period" will be exactly equal to the population mean of course.
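The estimate described in this paragraph can be written as a one-line shrinkage formula. The sketch below is illustrative only: the function name, the sample ERA of 3.00, the population mean of 4.00 and the 60% regression amount are all made-up values, not figures from the article.

```python
def regress_toward_mean(observed: float, population_mean: float, regression: float) -> float:
    """Shrink an observed ERA toward the population mean.

    regression = 0.0 keeps the observed value unchanged;
    regression = 1.0 returns the population mean (no spread in true talent).
    """
    if not 0.0 <= regression <= 1.0:
        raise ValueError("regression must be between 0 and 1")
    return observed + regression * (population_mean - observed)

# Hypothetical numbers: a reliever posts a 3.00 ERA in a population averaging 4.00.
estimate = regress_toward_mean(observed=3.00, population_mean=4.00, regression=0.6)
print(f"estimated true-talent ERA: {estimate:.2f}")
```

The estimate always lands between the observed ERA and the population mean. It is an estimate of the individual pitcher's talent, not a claim that his talent equals the population mean.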
That is all I meant. "A pitcher's true mean" makes no sense anyway. (We usually refer to a pitcher's "true talent.") If I stated that in the article, what I meant simply was "the mean of the population of which the pitcher is part."
All of the analyses and calculations were done based on the correct assumptions, regardless of whether I spoke (in the article) awkwardly or incorrectly, which I admit that I did.
An individual pitcher certainly has their own "true talent" which may or may not (probably not, again, assuming some spread of true talent among pitchers within that population) be equal to the population (of which the pitcher is part) mean. We never know exactly what that true talent is. Never. We can only estimate it from the statistical record of samples of his performance as well as scouting and other observational and other data. In the case of the article and most sabermetric research, we are only dealing with samples of statistical data of course, in this case, ERA.
In any case, I am not arguing a controversial point or trying to defend a position of which you or I may be right (or wrong). I am presenting incontrovertible facts above.
Honestly, I am not sure whether you and I are on the same page (that you understand how regression toward the mean works with respect to baseball - sabermetric - analysis, such as that done in my article) and that you are simply not understanding the true meaning of some of my admittedly awkward words and sentences, or that you do not fully comprehend how regression toward the mean works in the context of these types of analyses.
In either case, I am quite sure that I have a full and correct grasp of the issues at hand. I have been a professional sabermetrician for almost 20 years, I have an extensive background in statistics, and ALL of these issues have been vetted countless times among numerous other statisticians and sabermetricians. If you would like, send me your address and I will send you a free copy of "The Book."
-
The_Real_Neal Let me try another tack to explain it. I don't have any problem with the concept that pitchers who have exceptional seasons are likely going to regress towards their true talent level in the subsequent year (or pitchers who had exceptionally poor ones). It's the way that you're determining the pitcher's talent level.
what I meant simply was "the mean of the population of which the pitcher is part."
You should never use a mean of a population to determine regressing towards the mean in this fashion.
Lets say the study consisted of six pitchers.
In year one the pitchers had the following ERA's and innings:
1. 2.5 ERA 80 innings
2. 3 , 80
3. 3.5, 80
4. 4.5, 40
5. 5.0, 40
6. 5.5, 40
Now the mean ERA for all these pitchers (non-weighted) is four. So we are going to use that as your 'mean of the population' to regress to. The 80 inning group had a mean ERA of 3, and the 40 innings one of 5. According to the theory we should expect to see both groups ERA's regress towards the mean in the next year.
In year two the pitchers had these ERA's
1. 2
2. 4.75
3. 3.25
4. 5
5. 5.5
6. 3.5
Your group one pitchers have a mean ERA of 3.33 and your group two pitchers have a mean ERA of 4.67. Therefore both groups regressed towards the mean, just as we suspected and Elia is wrong... except that when we look at the individual pitchers we see something different.
1. Moved away from the mean
2. Moved away from the mean
3. Moved away from the mean
4. Moved away from the mean
5. Moved away from the mean
6. Moved towards the mean
This is the second reason you can't aggregate like you've done.
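The six-pitcher example can be tabulated directly. The sketch below uses only the ERAs listed above and makes one assumption that the comment does not spell out: "toward the mean" is measured as absolute distance from the unweighted year-one mean of 4.00.

```python
year1 = [2.5, 3.0, 3.5, 4.5, 5.0, 5.5]
year2 = [2.0, 4.75, 3.25, 5.0, 5.5, 3.5]

def mean(values):
    return sum(values) / len(values)

sample_mean = mean(year1)       # unweighted mean of all six year-one ERAs
group1_year2 = mean(year2[:3])  # the three 80-inning pitchers
group2_year2 = mean(year2[3:])  # the three 40-inning pitchers

# Assumption: a pitcher moved "toward" the mean if his year-two ERA is
# closer to 4.00 (in absolute distance) than his year-one ERA was.
moves = [
    "toward" if abs(y2 - sample_mean) < abs(y1 - sample_mean) else "away"
    for y1, y2 in zip(year1, year2)
]

print(f"sample mean {sample_mean:.2f}, group 1 year 2 {group1_year2:.2f}, "
      f"group 2 year 2 {group2_year2:.2f}")
for number, move in enumerate(moves, start=1):
    print(f"pitcher {number}: {move}")
```

By this measure pitcher 2 (3.00 to 4.75) ends up slightly closer to 4.00 after crossing it, so the per-pitcher tally depends on how "toward" is defined. The group averages of 3.33 and 4.67 match the figures given above.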
The first reason is that the 'true talent of the pitcher' for each pitcher just happened to be reached exactly in year 1. The 'mean of the population' of which they are part is 4 ERA, but none of their true levels was actually 4 ERA , so in year two all of the pitchers moved away from their mean. No pitcher in the study moved towards their mean, but because you have compared them to the entire population rather than to their own talent level, the study shows that the pitchers are 'regressing towards the mean'.
Clear?
You cannot aggregate to determine the 'true talent level' when applying the concept of regression towards the mean. You have to do each pitcher separately, by first identifying their own true talent level.
I'd be happy to have a copy of your book (though since I live in Europe, I'll probably give you my brother's address), but keep in mind just because you two had that book published doesn't preclude you from making mistakes. There were many people smarter than you and I who published articles proving that the Earth is flat and that mice are generated by moldy rags.
Oh, I would still like to see Elia's theory checked.
If a pitcher pitches +80 innings in year 1, then less than 60 in year 2, do they tend to be better in year 3 than year 2?
-
-
MGL Oy Neal! There is nothing to "try" (and tell me). Although I am not a pure statistician by trade, I am a trained and experienced "applied statistics person." At least you are persistent though!
You don't need to "clear up" anything with me. I know exactly what I am doing (at least with respect to the statistics stuff in this article). As I said, either you don't (know exactly what you are doing - no disrespect intended - just making a statement), or you are misunderstanding my words (and are apparently the only one - that I know of at least - so my words cannot be THAT unclear). Let me see if I can help YOU (to either understand the concept of regression towards the mean or to understand what I wrote). Before you get all bent out of shape (if you haven't already), just be calm, and carefully (and with as much dispassion as possible) read the following:
"You should never use a mean of a population to determine regressing towards the mean in this fashion."
Without the "in this fashion" that is exactly what regression towards the mean IS. When we sample a population more than once, the results of the first sampling will "regress" towards the mean of the population in subsequent samplings. That is the DEFINITION of "regression towards the mean." Of course, we often do not know what the mean of the population is. In fact, we usually don't. In these kinds of baseball analyses, we often have an idea or we can estimate it. When we use the data to compute a regression equation, inherent in the equation will be the population mean, so we can infer that value from the equation.
I don't know what you mean by "in this fashion" but regardless - you are misinterpreting my analysis, as evidenced by the next thing you wrote.
"Lets say the study consisted of six pitchers."
OK we are sampling some pitcher performance and we get the following sample results. No problem there.
"In year one the pitchers had the following ERA's and innings:
1. 2.5 ERA 80 innings
2. 3 , 80
3. 3.5, 80
4. 4.5, 40
5. 5.0, 40
6. 5.5, 40
Now the mean ERA for all these pitchers (non-weighted) is four. "
Still no problem there.
"So we are going to use that as your 'mean of the population' to regress to. "
Big problem there! I don't know if you mean "we" as in "you" or as in "what you think I would do." But that is not how regression toward the mean works. "We", or at least "I" will not be using the SAMPLE mean of those 6 pitchers to regress ANYTHING towards. Those 6 pitchers are not a "population," therefore that is NOT the mean we regress towards. "Regression towards the mean" involves taking a sample, which you did in your example. The sample is N pitchers with performance Y for X innings each, or whatever you want to call it. It is actually 6 independent samples, but it doesn't really matter.
Keep in mind as you read this that I am reading each section for the first time and commenting one section at a time.
I did read ahead that you are going to look at another year (the next year, which is fine) and that we should expect to see each pitcher or group of pitchers regress toward the mean. That is absolutely correct.
The mean they would regress to is not necessarily known. As I said, it is NOT the mean of the sample of pitchers or groups of pitchers. That would make no sense. It is the mean ERA of the ENTIRE population of pitchers that this sample is drawn from. We don't know exactly what that is and you have not told us what "kind" of pitchers they are in your example. The population could be relief pitchers, starters, LHP, RHP, etc. What we usually do in baseball studies that require us to do a regression toward the mean is to simply narrow down as best as we can the population that we think a sample of players comes from, using pertinent, relevant, and material characteristics. In my article/study I basically used the mean ERA of all exclusive (no starts) relievers who pitched for 2 or 3 years straight. Sometimes it is not that important to "nail" the population that your sample comes from. In baseball studies, it usually isn't.
"The 80 inning group had a mean ERA of 3, and the 40 innings one of 5. According to the theory we should expect to see both groups ERA's regress towards the mean in the next year."
Again, that is exactly true, but not towards the mean of this SAMPLE of pitchers.
OK, I lost you in the rest of the post so I cannot comment beyond this.
Of course, any individual pitcher can and will have any ERA in year 2. Each individual pitcher has a "true ERA" which as I said in my last post, is always unknown. We "expect" any individual pitcher's ERA in ANY year, regardless of what it was in any other year to be exactly equal to his "true ERA" (duh). However, since we don't know what it is, we HAVE TO assume that each pitcher's true ERA (and what we expect in any other year or time period) is somewhere between that which we see (in year 1 or whenever) and the mean of the population that we think the pitcher comes from. Using your example, we will expect all of those pitchers to have an ERA in year 2 (assuming that their true talent has not changed significantly of course, from year 1 to year 2) of somewhere between their year 1 ERA and the mean ERA of the population they come from. Obviously for 6 pitchers, the chance of that occurring is not that great because of random fluctuation among only 6 pitchers. However, if each pitcher were a really large group of pitchers (with each GROUP having an average ERA of 2.5, 3.5, etc.), then it is almost certain (depending on how many pitchers there are in each group) that all 6 groups will show an average ERA of somewhere between what they showed in year 1 and their population mean.
I sort of take back what I said about not using the mean of all 6 pitchers as our mean to regress to. If that is all we have (we don't know the mean ERA of ALL pitchers or all relievers, or whatever), then the (weighted by innings) average ERA of all 6 pitchers IS in fact an unbiased estimate of the population mean AND we can in fact use it as the mean that we regress to in year 2. That is assuming that these 6 pitchers are randomly (or somewhat randomly in practice) sampled from a certain population. Again, because we only have 6 pitchers, that unbiased estimate is going to have a large uncertainty as well and could easily be off by quite a lot (e.g. even though it is 4.00, these pitchers could easily come from a population of pitchers with a mean ERA of 3.00, in which case their year 2 ERA's will tend to regress toward that 3.00 rather than 4.00, and we won't know it).
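The large-group claim can be illustrated with a small simulation. Everything in this sketch is an assumption chosen for illustration, not data from the article: a population mean ERA of 4.00, a true-talent spread of 0.50, season-to-season noise of 1.00, a selection cutoff of 3.20, and a fixed random seed. Each simulated pitcher has his own true talent, and talent does not change between years.

```python
import random

def simulate(n_pitchers=20000, pop_mean=4.00, talent_sd=0.50,
             noise_sd=1.00, cutoff=3.20, seed=1):
    """Return (year-1 mean, year-2 mean) for pitchers selected on a good year 1."""
    rng = random.Random(seed)
    selected_y1, selected_y2 = [], []
    for _ in range(n_pitchers):
        talent = rng.gauss(pop_mean, talent_sd)  # each pitcher has his own true ERA
        year1 = rng.gauss(talent, noise_sd)      # observed ERA = talent + luck
        year2 = rng.gauss(talent, noise_sd)      # same talent, fresh luck
        if year1 < cutoff:                       # keep only the "great year" group
            selected_y1.append(year1)
            selected_y2.append(year2)
    return (sum(selected_y1) / len(selected_y1),
            sum(selected_y2) / len(selected_y2))

y1_mean, y2_mean = simulate()
print(f"selected group: year 1 ERA {y1_mean:.2f}, year 2 ERA {y2_mean:.2f}, "
      f"population mean 4.00")
```

With these settings the selected group's year-two average falls between its year-one average and the population mean, even though no individual pitcher is assumed to have a true talent of 4.00.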
Other than that, I really don't know what you are trying to say. There are probably many mistakes in The Book, and I did not write the whole thing, but the way we handle regression toward the mean (which comes up a lot in sabermetric research) and the way I discuss and handle it in the article AIN'T one of them! Trust me on that.
If you write me at mglcardinals@yahoo.com and give me your brother's or your address, I'll be happy to send a book.
-
The_Real_Neal "Those 6 pitchers are not a "population," therefore that is NOT the mean we regress towards."
Incorrect. In my example those six pitchers are the population. You can expand it to include all the pitchers in your original study, and you would get the same result (in fact you already did!), which is my point.
Fregosi's original comment "The biggest reason is, when they have a good year, they're overused. When they have a bad year, they're not used at all. So then they can come back and have a good year. It's that simple".
Fregosi is only talking about your Group 2 pitchers. If you want to argue his point, take everything you have for the group one pitchers and throw it out. He never said "relievers who pitch comparatively fewer innings will tend to be “fresh” or perhaps less prone to injury in the subsequent year, and will tend to pitch better." Fregosi is never talking about the pitchers who are never good. He's only talking about the pitchers who for at least one season are Group 2. Later you're still confusing his argument "Interestingly, for Fregosi, or anyone else, to say that relievers who throw fewer innings in one year will appear to improve the next year, is disingenuous."
This is, ironically, disingenuous. Fregosi never says a poor pitcher will become good by pitching infrequently. What he says is that if a pitcher is overworked in year 1 then underworked in year 2 he will bounce back in year 3.
To refute Fregosi's argument correctly you have to do two things. First, prove that there are not pitchers who have good performance in year one, followed by bad in year two, and then good again in year three. If you can do that, you then need to show that the 'outlier years' for a pitcher, whether good or bad, have no correlation to the amount of innings pitched in the prior year. You haven't done either of these things in your article. All your article really is is a confirmation of the value of DIPS.
(edited out a lot more blah blah here)
So down at the end of your article you turn your focus on to Fregosi's guys, sort of:
Finally, let’s look at the group II pitchers, the ones who were ridden hard in year X because they were having great years: a collective ERA of 3.16.
In year X-1, they had an ERA of 3.94 for 53 innings each. In year X, a 3.16 ERA with an average of 80 innings pitched.
OK, you're already off the road here. Fregosi is talking about years X, X+1 and X+2. X-1 is not part of his discussion.
Using the same 2/1 weights we used for the group I pitchers, that is a weighted average of 3.35. If we regress that 70% toward the mean (we have a bigger sample), we get a projection of 3.71.
This is a head scratcher. What are you doing here? You seem to be trying to predict the ERA for year X+1, correct? Now Fregosi will tell you to throw year X-1 out, because it's not part of his argument; remember, his argument starts with the first year the pitcher has the exceptional season, X. If all the data available to do a projection for X+1 is year X, then the projection for X+1 = X. Even if you're allowed to use X-1, which you shouldn't be, your whole theory is that year X is an outlier due to luck. If that's the case then your prediction for year X+1 should be equal to X-1. It makes no sense (and believe me I am not trying to be mean, snarky or condescending) to do your projection the way you have. Your theory is that the year X is an outlier due to luck, but when you project for year X+1, you not only intentionally include this outlier, but you double accrue it by weighing it more heavily than X-1! But then you do something even more bizarre, specifically, ' If we regress that 70% toward the mean'...
Year X-1 is Z
Year X is Y
Year X +1 = A
(2Y*80 + Z*53)/213 + (Z - (2Y*80 + Z*53)/213)*7/10 = A
A = (160/213)Y + (53/213)Z + (Z - ((160/213)Y + (53/213)Z))*7/10
A = (160/213)Y + (53/213)Z + (7Z - ((1120/213)Y + (371/213)Z))/10
A = (160/213)Y + (53/213)Z + (7/10)Z - (112/213)Y - (371/2130)Z
A = .751Y + .249Z + .700Z - .526Y - .174Z
A = .225(Y) + .775(Z)
X+1 = .225(3.16) + .775(3.94)
X+1 = 3.76... hmm a bit off, I know the innings you've rounded off in the article, maybe the ERA's too? Anyway, by 'regressing towards the mean', and using the X-1 as your mean to regress to, all you do is swing your projection back into being weighed roughly 3.5:1 based on year X-1. We don't know what the true mean is, so you can't do that. You're just guessing, and the data you have on hand supports the idea that's a pretty bad guess, the pitchers performed better than that mean in 2 of 3 years and hit it exactly in the third, so that mean is almost certainly too high.
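The projection arithmetic being argued over can be reproduced in a few lines. This sketch uses only the rounded figures quoted above (3.16 ERA over 80 innings in year X, 3.94 ERA over 53 innings in year X-1, 2/1 weights, 70% regression, a stated projection of 3.71). The function name is made up, and the "implied mean" is a back-calculation from those rounded numbers, not a value the article states. Note that the weighted denominator is 2*80 + 53 = 213.

```python
def weighted_era(eras, innings, weights):
    """Innings- and recency-weighted average ERA."""
    numerator = sum(w * ip * era for era, ip, w in zip(eras, innings, weights))
    denominator = sum(w * ip for ip, w in zip(innings, weights))
    return numerator / denominator

y_era, y_innings = 3.16, 80  # year X
z_era, z_innings = 3.94, 53  # year X-1

base = weighted_era([y_era, z_era], [y_innings, z_innings], [2, 1])

# If the mean being regressed toward were the year X-1 ERA (the reading above):
projection_if_mean_is_z = base + 0.7 * (z_era - base)

# Back-solve for the mean that would turn the base into the quoted 3.71 projection.
implied_mean = base + (3.71 - base) / 0.7

print(f"weighted average: {base:.2f}")
print(f"projection if regressing toward 3.94: {projection_if_mean_is_z:.2f}")
print(f"mean implied by a 3.71 projection: {implied_mean:.2f}")
```

The weighted average reproduces the quoted 3.35. Regressing toward 3.94 gives about 3.76 rather than 3.71, which suggests the article regressed toward a somewhat lower population mean (about 3.86 on these rounded inputs) rather than toward the year X-1 ERA.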
A minor quibble is the .1 you add for 'drop out' effect. If a pitcher pitches 70+ innings in one year and doesn't pitch in the big leagues the next season... wouldn't that support the overuse theory if the pitcher can't make a ML roster the next year due to injury or ineffectiveness in spring training?
So what can we conclude?
Nothing really. We know that taken collectively pitchers who pitch greater than 70 innings in a given year see their average effectiveness go down the subsequent season. But is that useful information? No it's not. Because we don't know if 20% of pitchers see their ERA go up 60% and their innings go down 60%, while everyone else remains constant thus ruining the averages for the Group 2. If that were the case, then you could say there's an 80% chance (give or take) that for a given pitcher high usage won't have any effect. We haven't even used the aggregated data to check Fregosi's hypothesis, good pitcher, bad pitcher, good pitcher, since we haven't looked at year X+2 at all.
If you do decide to look at X to X+2, please don't aggregate the players, but aggregate the results. It should be something like:
412 of the players saw their ERA (or innings) change significantly over a three year cycle, which started with 70+ innings in year X
Of those 412, 17% had a Up, down, Up cycle
45% had no discernible change
21% had an up down, down
etc
Then you could say 'Fregosi doesn't know what he's talking about! That phenomenon is only displayed in one of six pitchers'. All the original article shows is that pitchers who do very well tend not to do so well the next season and bad/mediocre pitchers are bad/mediocre pitchers. Neither of these is a life-altering revelation, and regression to the mean can be part of it, but we have no idea how big or small a part.
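The per-pitcher tally proposed here could be sketched as follows. The four three-year ERA lines are invented purely to show the mechanics, the 0.25 ERA threshold for a "significant" change is an arbitrary assumption, and the labels describe each year-to-year step (better means a lower ERA) rather than the up/down wording used above.

```python
from collections import Counter

def step(previous_era, next_era, tolerance=0.25):
    """Label one year-to-year change; 'better' means the ERA dropped."""
    if next_era < previous_era - tolerance:
        return "better"
    if next_era > previous_era + tolerance:
        return "worse"
    return "flat"

def tally_patterns(three_year_eras, tolerance=0.25):
    """Share of pitchers showing each (X to X+1, X+1 to X+2) pattern."""
    counts = Counter(
        f"{step(x, x1, tolerance)}, {step(x1, x2, tolerance)}"
        for x, x1, x2 in three_year_eras
    )
    total = sum(counts.values())
    return {pattern: count / total for pattern, count in counts.items()}

# Hypothetical ERAs for years X, X+1, X+2 (year X being the 70+ inning season).
careers = [
    (2.80, 4.10, 3.00),
    (3.10, 3.20, 3.15),
    (2.90, 4.00, 4.60),
    (3.00, 4.20, 3.10),
]
shares = tally_patterns(careers)
for pattern, share in sorted(shares.items()):
    print(f"{pattern}: {share:.0%}")
```

A tally like this counts pitchers rather than averaging ERAs, so one subgroup with a large swing cannot move the headline number on its own. It says nothing by itself about how much of each pattern is luck.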
-
The_Real_Neal I just realized the fallacy that's being applied here.
The populations you use should be player specific, rather than league wide.
I don't have a stats book, but from Wikipedia:
"Consider an extreme example: a class of students takes a 100-item true/false test on a subject on which none of the students knows anything at all. Therefore, all students choose randomly on all questions leading to a mean score of about 50. Naturally, some students will score substantially above 50 and some substantially below 50 just by chance."
This is how you need to apply regression to the mean. All your data points need to be on a level playing field. Once some of your data points are Bob Scanlans and some of them are Goose Gossages, you can't mix them, get a 'true mean' and determine anything meaningful.
Suppose a true-false biology test with 50 questions was given to 100 Chinese people who can't read English and to 100 Pre-med students. The average score for the entire population of 200 was 65%. The Pre-med students were equally intelligent and had the same education and averaged 80%.
Now you segregate out the people who scored above 70% and retest those with another test of the same difficulty. In this second sample you have 100 students, 10 Chinese language students and 90 Pre-med students.
Your average score is going to be 77%.
Your study has done the same thing: mixed populations with different skill sets, given them the same test, and expected them to perform at the 65% level, but the 65% level isn't the mean for anyone. The Chinese students have a mean of 50% and the Med students 80%.
In the baseball world Eckersley had a mean of 2.85, and Smith 3.04 and K-Rod 2.37 and Buddy Groom 4.64. You can't just throw them all into one big bucket and say 'Regress to league average', because they won't do it. Your Chinese Students and Med Students will never regress to 65% just like Mariano Rivera is not going to regress to 3.94 ERA.
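The arithmetic in the test example above can be checked with a few lines of Python. This is only a sketch of the expected scores, assuming each person's expected retest score equals their own group's true mean (50% for the guessers, 80% for the Pre-med students); the function name is hypothetical.

```python
def mixture_mean(groups):
    """groups: list of (head_count, true_mean_score) pairs."""
    total = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total

# Everyone on the first test: 100 guessers and 100 Pre-med students
first_test = mixture_mean([(100, 0.50), (100, 0.80)])

# Retest of those who cleared 70%: 10 lucky guessers and 90 Pre-med students
retest = mixture_mean([(10, 0.50), (90, 0.80)])

print(f"first test: {first_test:.0%}, retest: {retest:.0%}")
```

Note what happens inside the retest group: the 10 lucky guessers fall back to about 50%, the mean of the population they came from, not to the blended 65%.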
-
-
cyberwulf [ Begin quote ]
Each individual pitcher has a "true ERA" which as I said in my last post, is always unknown. We "expect" any individual pitcher's ERA in ANY year, regardless of what it was in any other year to be exactly equal to his "true ERA" (duh). However, since we don't know what is, we HAVE TO assume that all pitchers true ERA (and what we expect in any other year or time period) is somewhere between that which we see (in year 1 or whenever) and the mean of the population that we think the pitcher comes from.
[ End quote]
I (and I think Real_Neal) disagree with the bolded part of the statement. A "bad year" for Johan Santana may still be better than league average for starting pitchers, so wouldn't we expect him, in subsequent years, to regress to his individual mean rather than to league average?
As for an alternative approach, it seems like a simple regression approach might go a long way towards answering the question of interest. For simplicity, consider only two consecutive years of data. Then regress Year 2 ERA on Year 1 ERA and innings pitched in Year 1. While there are some subtleties that this method does not fully address, it will quantify the effect of Year 1 innings pitched on Year 2 ERA among "similar" pitchers (in this case those with similar ERAs in Year 1).
Bear in mind that I am a student of statistics and not sabermetrics, and am not attempting to denigrate what was a serious, in-depth study of a difficult question.
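The two-year regression proposed above could be sketched as follows. Everything here is assumed for illustration: the data are simulated, the built-in workload effect of 0.01 runs of ERA per extra inning is invented, and the tiny least-squares solver stands in for whatever statistics package one would really use.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

random.seed(0)
rows, y = [], []
for _ in range(5000):
    talent = random.gauss(4.0, 0.5)        # hypothetical true ERA
    era1 = talent + random.gauss(0, 1.0)   # noisy Year 1 ERA
    ip1 = random.uniform(20, 95)           # Year 1 innings pitched
    # Assumed effect, for illustration only: +0.01 ERA per extra Year 1 inning
    era2 = talent + 0.01 * (ip1 - 50) + random.gauss(0, 1.0)
    rows.append([1.0, era1, ip1])
    y.append(era2)

intercept, b_era1, b_ip1 = ols(rows, y)
print(f"Year 1 ERA coefficient: {b_era1:.3f}, Year 1 IP coefficient: {b_ip1:.4f}")
```

In this simulation the Year 1 ERA coefficient comes out well below 1, which is regression toward the mean showing up inside the regression itself. As MGL notes in his reply, a coefficient on innings describes a relationship and does not by itself establish cause and effect.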
-
-
cyberwulf I've just skimmed these latter comments, but I have to say that I think Real_Neal is correct. MGL, your approach appears to suggest that *all* pitchers should regress towards ONE common mean, which simply doesn't hold up to scrutiny; as mentioned by Real_Neal, this would imply that Greg Maddux has simply been "lucky" to avoid this "regression" for his entire career. In fact, a much more reasonable explanation is that his "individual mean" is drawn from a *population* of pitcher means (talent levels), and his particular talent level is in the tail of *that* distribution.
More to come later, hopefully.
-
David Gassko Jonathan Papelbon had a 1.85 ERA last season. His career ERA is 1.62. Is your contention that his ERA next season will therefore be somewhere between 1.85 and 1.62 (his "true mean")?
cyberwulf Yes, I believe so.
-
-
MGL "I (and I think Real_Neal) disagree with the bolded part of the statement. A "bad year" for Johan Santana may still be better than league average for starting pitchers, so wouldn't we expect him, in subsequent years, to regress to his individual mean rather than to league average? "
Nothing to disagree with. Honestly. What I wrote is exactly correct. You guys don't know enough about statistics (with all due respect) to be debating this. You are mixing up terms, for one thing (among many mistakes).
No one "regresses to their individual mean." No one has "an individual mean." If you mean his "true talent," (I think you do), or what his ERA would be if he pitches an infinite number of innings, then in subsequent years, we "expect" him (of course, he probably won't, but that is our best estimate) to post an ERA EXACTLY what his true ERA (I think you call this his "individual mean" or something like that) is. But of course, we NEVER know a pitcher's true mean. We can only estimate it from his sample performance, which is the whole point of these types of analyses and the whole goal in forecasting players. If you read up a little about sabermetrics and how regression to the mean and other statistical techniques are incorporated into it, rather than shooting off your less-than-fully-informed mouths (again, with all due respect - I'm sure there are plenty of things you know better than I - I am somewhat of an "idiot savant" with respect to baseball), you would KNOW that our best estimate of a pitcher's true talent (or individual mean or whatever you call it) is his sample performance regressed toward the mean of a population of players he comes from (whatever and however we determine that to be - absent any other information other than he is an MLB pitcher, we use the mean ERA for all MLB pitchers, perhaps from a certain era, if we know when this pitcher pitched).
You could do a regression (not to be confused with "regression" as in "regression to the mean" although they are related) of year 2 ERA on year 1 ERA and IP, and it might yield some useful information. I would have to think about that and/or do it (which I won't because I don't have the time) and see what comes up. The problem with regressions of course, in these types of analyses, is that they do not necessarily address cause/effect, only relationships, as you well know.
You are welcome to do that and see what you come up with. If you are a student at a University or high school (I assume the former), you might get some extra credit!
Cyberwolf, in your last post about Maddux, etc., you are simply not understanding what we mean in these kinds of analyses, and I don't think you fully understand regression toward (or towards) the mean in general. I suggest you look around the web for some primers on sabermetrics and statistics, particularly with respect toward regression toward the mean and similar statistical concepts. There is a difference between asking questions about something which you know a little about, but not enough to nearly be considered as an expert, and "arguing" with people who are genuinely experts in the field. David and I are truly experts in the field (sabermetrics and statistics), although neither one of us are "expert" statisticians (though I can really only speak for myself). Not to say that you can't add to the discussion. You both seem like intelligent, if not obstinate, fellows. It would be like if I took some courses on cosmology or astrophysics and then thought I could correct Stephen Hawking. Not that I am comparing Hawking to myself. He is way smarter than I (though he could not stand up to me in sabermetrics of course - I assume).
Santana and Papelbon will of course "most likely" pitch to their own true talent (ERA or whatever) next year. But since we don't know what that is, our best estimate (that is where all these projections come from!) is their prior ERA regressed toward the mean of the population we think they come from. That is how it works. Actually not too complicated.
You can test this quite easily! Remember that when you test things like this, you take lots of similar players (including similar in the sense of what you want to test - e.g. ERA, IP, etc.) in the past and then look at what they did in subsequent years. For example, look at all pitchers (you can limit it to relievers if you want or RH relievers, and/or of a certain age, or however you want to classify your pitcher in question, as long as you don't classify him by the stat - like ERA - that you are testing) in history who had an ERA of less than 2 in their first 2 years (or any 2 years) and then look at their ERA in the next (or any other) year. That should approximately equal your projection for Papelbon.
If you do that you will find that all similar pitchers will in fact have an ERA of those 2 years regressed toward the mean of all similar pitchers, similar pitchers pretty much being young RH relievers (NOT pitchers who posted low ERA's in their first 2 years or any 2 years!).
In the case of Papelbon, you could even use as your mean, all closers, although that is a bit problematic as one reason he is a closer is BECAUSE of his great first 2 years. Regardless of what you use, you will find that he (and all other similar pitchers in history) will likely post an ERA in the subsequent year of his prior years' ERA (say, 1.7), regressed toward something like 3.50 (the mean of the population of pitchers he comes from - NOT pitchers who are great pitchers - that is what we are trying to find out - to what extent he is likely to be a great pitcher! We don't KNOW that he is a great pitcher - there is a finite chance, albeit very small, that he is an average pitcher who got REALLY lucky or even a better than 1.7 ERA pitcher who got a little unlucky, etc., although we don't think a pitcher like that exists, so that won't be part of the equation).
How much we regress the 1.7 toward the 3.5 (or whatever that mean is) depends on two things: One, how many innings of data we are using for his sample performance (the more IP, the less we regress, because our sample error on that 1.7 ERA would be less), and two, our estimate or knowledge (which is never perfect, but we can estimate it with a decent degree of reliability) of the spread (variance) of true talent in the population of pitchers we think he comes from (i.e. how many "true" 1.80 pitchers there are - probably close to zero - how many true 1.90, true 2.00, true 3.00, true 4.00, etc.).
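The two factors just described (innings in the sample, spread of true talent) can be put into a simple shrinkage formula. The sketch below uses a standard normal-normal weighting, which is an assumption on my part and not necessarily the method used in the article; the talent spread of 0.5 and the one-inning noise figure of 9.0 are placeholder values, not estimates.

```python
def regressed_era(sample_era, innings, pop_mean, talent_sd, one_inning_sd=9.0):
    """Shrink a sample ERA toward the population mean.

    talent_sd     -- spread of true ERA in the population (assumed value)
    one_inning_sd -- SD of an ERA measured over one inning (placeholder)
    """
    noise_var = one_inning_sd ** 2 / innings   # sampling noise falls as IP rises
    talent_var = talent_sd ** 2
    weight_on_sample = talent_var / (talent_var + noise_var)
    return pop_mean + weight_on_sample * (sample_era - pop_mean)

# A 1.7 ERA regressed toward 3.5, over a small and a large sample
short = regressed_era(1.7, innings=130, pop_mean=3.5, talent_sd=0.5)
long_ = regressed_era(1.7, innings=1300, pop_mean=3.5, talent_sd=0.5)
print(f"130 IP: {short:.2f}   1300 IP: {long_:.2f}")
```

More innings put more weight on the pitcher's own number, and a wider talent spread does the same. With a talent spread of zero the estimate is simply the population mean, which is the fair-coin case.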
Read, listen, stop arguing, and learn! I try to tell my son this all the time.
-
The_Real_Neal Let me put it very simply for you MGL,
You are wrong.
You do not understand the difference between random samples and non-random samples.
We've already established that:
1. I am better educated than you.
2. I am better at mathematics than you.
3. I understand statistics better than you.
I broke down your math and you went running like a seven year old to hide behind a trite cliche 'You guys don't know enough about statistics (with all due respect) to be debating this.'
We already proved statistically that your analysis was incorrect. You haven't been able to articulate a defense or even, with your 7th grade algebra skills, been able to defend any of your points.
I know at one point you were a consultant for the Cardinals, who brought in Kip Wells of the 5.70 ERA this off season. How impassioned was your argument that he would regress towards the ML Mean in 2007? Oh, wait, he did regress towards the mean, and the Cardinals dropped two places in the standings.
You simply don't understand when statistical analysis is appropriate and when it's not. I called you and Tony the Tiger out on it three years ago when you tried to apply your 'UZL' adjustments to sample sizes that were less than 35, and pretty much any non-imbecile (you're obviously exempt from that group) would realize it was time to back off. Still, you continue to mis-use statistical analysis (you're like the Joe Morgan of Sabermetricians, in case you're confused) to try to prove non-sensical hypotheses.
Seriously, the baseball community and the world as a whole would be better served if you go back to being a postman or librarian until someone invents PED for your cranium.
So here you go, fucktard.
" you would KNOW that our best estimate of a pitcher's true talent (or individual mean or whatever you call it) is his sample performance regressed toward the mean of a population of players he comes from"
So, here's your chance. Prove it. If you do, I will fully apologize, and admit that someone with a sub-average IQ can do Sabermetrics just fine. When you can't prove it, retire.
-
The_Real_Neal One more quick quote of the monkey from Friends, talking out of his ass:
" If you mean his "true talent," (I think you do), or what his ERA would be if he pitches an infinite number of innings, then in subsequent years, we "expect" him (of course, he probably won't, but that is our best estimate) to post an ERA EXACTLY what his true ERA (I think you call this his "individual mean" or something like that) is"
Yet when Marcel the Monkey actually goes to 'regress' the ERA's he uses the league mean. He never explains why he uses the league mean, even though he has just stated, "we "expect" him to post an ERA EXACTLY what his true ERA is". Again it goes back to his underlying theory that all pitchers are equal even though we have irrefutable statistical evidence that they are not.
Hey, Marcel the Monkey, Roger Clemens was a better pitcher than Paul Kilgus.
-
-
cyberwulf OK, I'd rather not start a flame war here, so I'll skip over the ad hominem attacks on our statistical expertise (which is just as insulting as you feel our under-informed questioning of your methods is).
For me, the crux of the problem is *not* an understanding of regression to the mean as a concept, but *why* it provides the best prediction of future performance for an individual player. From a statistical perspective, if we think that a player's N seasons come from a symmetric distribution with (unknown) mean M and (unknown) variance V, then that player's mean performance over those N seasons gives an unbiased estimate of M. Regressing that mean performance towards some other mean L (say, league average) is a biased estimate of M.
Intuitively, it would seem to be logical to use an unbiased estimate of a player's "true talent" M than a biased one to predict future performance. However, I'm prepared to believe that the regression to the mean method (i.e., biased estimate), does better in predicting future performance.
So, can someone perhaps point me to a reference (or give a good intuitive explanation) which suggests why regression to the mean outperforms simple averaging of a player's past performance?
I'm prepared to learn, as long as that learning isn't preceded by unnecessary personal attacks. I don't think it does the sabermetrics community any favors to alienate people who question the way in which an analysis was performed - isn't that ivory tower attitude the same one that sabermetricians have fought so hard to overcome in traditional baseball organizations?
-
The_Real_Neal I have to admit that I am drunk.
Even so, is this not the crux of Marcel's argument?
The average salary for the US employee in 2002 was $36,764.
In 2002 Alex Rodriguez (signing bonus excluded) made $21,000,000, so in 2003 we would expect his salary to regress towards the mean, let's call it 50%. So in 2003 A-Rod should have expected to make somewhere around $10,518,382.
I have to agree with Cyberwulf, that this mis-application of statistics to baseball is what gives 'professional sabermetricians' a bad name. I don't have any problem what-so-ever with using statistics to analyze baseball, but this "square peg, round hole" kind of approach certainly contributes nothing to the overall wealth of baseball knowledge.
This is the point where I give up with Marcel, though. I've hit the "What is an idiot doing when you're arguing with him? The same thing as you" point.
MGL, I will be happy to logic check things for you in the future (because, by God you need it), to explain to you when the rules of statistical analysis can be reasonably applied and when they can't. Just send me an email.
-
GuyM I have to admit that I am drunk.
Now there's a shocking revelation. Neal, you're in way over your head. If you want to understand these issues, take MGL up on his offer of a free book (assuming that offer's still on the table). If you don't, that's fine, but then stop embarrassing yourself with these posts.
Cyberwulf: A player's own mean would be our best estimate of his talent if that's all we knew. But we also have other information, namely the mean and variance for all MLB pitchers. So we know, for example, that no pitcher in the modern game has sustained an ERA as good as Papelbon's over a career. By regressing to the league mean, we're saying that it's more likely that he's a good pitcher who got lucky than that he's the greatest pitcher who ever lived. We don't know that for sure, it's just much more likely. On the other hand, if league average ERA were 1.80, our estimate of Papelbon's real skill would then be quite close to his own mean.
It's easy to check this yourself. Just do the work.
-
GuyM Cyberwolf: Here's an example for you. You'll find all relievers who had an ERA under 2.50 over their first 3 seasons here: http://www.bb-ref.com/pi/shareit/Nn3c. If you look at their subsequent performance -- let's say, seasons 4-6 -- you'll find that some managed to maintain a comparable ERA, a lot of them got worse, and virtually none improved. If you want to figure out their overall average performance in the second 3-year period, you'll find that it is significantly higher than in the first 3 seasons.
It's certainly a fair question to ask what the correct population mean is to regress a player to. Papelbon isn't just a "reliever," he's a closer and throws 100 MPH. So maybe he should get regressed to a lower mean. On the other hand, Bob Wickman is a "closer" and Derrick Turnbow throws 100 MPH, so it's not clear how much extra credit we should award for these characteristics. So you can certainly debate which "population" of pitchers is most relevant, but it still won't be true that a pitcher's own mean is the correct estimate of his talent.
-
cyberwulf It certainly makes intuitive sense to regress for outliers, i.e. those whose performance doesn't appear to be maintainable. However, in regressing to the mean, you will (in some cases) regress "away" from a pitcher's "true ERA" if they happen to have a bad season in year 1. For example, if a pitcher's true ERA is 3.0, league average is 4.0, and he posts an ERA of 3.5 in his first year, then his predicted ERA in year 2 is going to be somewhere between 3.5 and 4.0, which is further from 3.0 than if we had simply used his first-year performance.
Now, this being said, it's of course possible that the corrections we make for outliers are more "important", statistically speaking, than the errors of the type described above that we introduce when we use regression to the mean.
In fact, I realized this morning that the statistical justification for regression to the mean is Stein's paradox, which states that if you want to estimate separate means for K individuals, then you do better (in a precise statistical sense) by "shrinking" (i.e. regressing) each individual's mean towards the group average of all the individuals. A quick Google search on Stein's paradox will yield some more in-depth explanations.
I'm sure I'm rediscovering the wheel to some degree, but I wasn't finding the "we regress to the mean because, well... look at this example to see that it works!" completely satisfying.
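Here is a small simulation of the shrinkage idea behind Stein's paradox. It is a sketch under assumed numbers: 50 pitchers per "league", true ERAs spread with SD 0.5 around 4.0, one season of noise with SD 1.0, and the positive-part James-Stein estimator that shrinks toward the grand mean. The function name and all parameters are illustrative.

```python
import random

def james_stein(observed, noise_var):
    """Shrink each observation toward the grand mean (positive-part form).

    Needs more than 3 observations; noise_var is assumed known.
    """
    k = len(observed)
    grand = sum(observed) / k
    spread = sum((x - grand) ** 2 for x in observed)
    keep = max(0.0, 1.0 - (k - 3) * noise_var / spread)
    return [grand + keep * (x - grand) for x in observed]

random.seed(1)
raw_err = js_err = 0.0
for _ in range(200):                                   # 200 simulated leagues
    talents = [random.gauss(4.0, 0.5) for _ in range(50)]
    observed = [t + random.gauss(0, 1.0) for t in talents]
    shrunk = james_stein(observed, noise_var=1.0)
    raw_err += sum((o - t) ** 2 for o, t in zip(observed, talents))
    js_err += sum((s - t) ** 2 for s, t in zip(shrunk, talents))

print(f"total squared error, raw: {raw_err:.0f}  shrunk: {js_err:.0f}")
```

Individual pitchers can end up further from their true ERA after shrinking, exactly as in the 3.0/3.5/4.0 example above, but summed over the whole group the shrunk estimates miss by less.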
-
GuyM For example, if a pitcher's true ERA is 3.0, league average is 4.0, and he posts an ERA of 3.5 in his first year, then his predicted ERA in year 2 is going to be somewhere between 3.5 and 4.0, which is further from 3.0 than if we had simply used his first-year performance.
Yes, that's true. But here's the trick: among all 3.50 pitchers, there are far more 4.00/lucky pitchers than 3.00/unlucky pitchers. Why? Because there are a whole lot more 4.00 than 3.00 pitchers in general (while good and bad luck are randomly distributed). So if you were to place bets on whether any given 3.50 pitcher will improve or get worse the next season, you'd quickly impoverish yourself by betting on "improve" (unless someone gave you odds, of course).
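The betting argument above is Bayes' rule in disguise, and it can be sketched in a few lines. The population shares (20% true 3.00 pitchers, 80% true 4.00 pitchers) and the luck SD of 0.7 are invented for illustration; only the 3.00/3.50/4.00 figures come from the thread.

```python
from statistics import NormalDist

def posterior(observed, priors, luck_sd):
    """P(true ERA | observed ERA) over a discrete set of talent levels.

    priors: dict mapping true ERA -> share of the population (assumed).
    """
    weights = {true_era: share * NormalDist(true_era, luck_sd).pdf(observed)
               for true_era, share in priors.items()}
    total = sum(weights.values())
    return {true_era: w / total for true_era, w in weights.items()}

post = posterior(3.50, {3.0: 0.2, 4.0: 0.8}, luck_sd=0.7)
for true_era, prob in post.items():
    print(f"true {true_era:.2f}: {prob:.0%}")
```

Because 3.50 is equally far from both talent levels, luck is equally likely either way and the answer is driven entirely by how common each kind of pitcher is.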
-
-
-
-
-
-
-
skyking162 If you have the mean ERA for one pitcher, why would you regress any one season's ERA toward's that mean? You'd just use his mean (or a weighted mean).
The ARod example is ridiculous.
You regress an individual pitcher's ERA towards the population he's a part of, which is MLB pitchers. The more data you have, the less you regress. We're not projecting Johan Santana to post a 4.50 ERA next year because that's MLB average. He only gets regressed a touch, because there's a very small likelihood he's been extremely lucky over the past four seasons. If you had evidence that a certain set of MLB pitchers performs differently from MLB pitchers overall (say, knuckleballers), then it would make sense to regress pitchers in that group towards the group mean. If we had scouting data on a player, it might make sense to regress past performance towards the mean of players with similar scouting reports.
-
cyberwulf Now that I've got my head on straight about the utility of regression to the mean, I've still got a bit of an issue with the methodology of this study, with the main issue being the choice of ERA to regress towards.
Your analysis assumes that all relievers will regress towards 3.86, a number arrived at by averaging the performance of all relievers in the study. However, in your introduction, you point out (correctly, I believe), that relievers who pitched 70+ innings in year X may have a "population" ERA lower than those who pitched < 70 innings. If this were the case, then wouldn't you want to regress them towards this mean (say, 3.5) rather than towards 3.86? Despite the fact that you consider "comparable" pitchers in year X-1, I would argue that they are *not* comparable for the purposes of projecting future performance by virtue of the fact that one group pitched 70+ innings and the other <70, which likely correlates strongly with pitcher skill.
In other words, it seems reasonable that in your "experiment", group I pitchers are those regarded as having relatively little talent in year X-1 and hence are not used extensively in year X. Group II pitchers, on the other hand, are regarded as talented/promising, and are given more opportunities not only because of their better-than-average performance, but because they are thought to be performing up to their actual talent level. If this is the case, then Group I pitchers constitute a sample from the population of "journeyman" pitchers (say, with a true population ERA of 4.0), but Group II constitute a sample from the population of "better-than-average" pitchers (say, with a true population ERA of 3.6).
So, with this in mind, does regressing towards 3.86 for both groups still make sense? I'd be interested in hearing an explanation.
-
MGL Thanks for the help, Guy and SkyKing. Another example for cyber is if you take a coin out of your pocket and flip it 10 times and come up with 6 heads and 4 tails or 6 tails and 4 heads (which is fairly likely of course). That is an unbiased estimate of the coin's true heads/tails ratio, yet you would NOT use that to estimate the coin's true ratio and you would NOT predict that to be the most likely result if you flipped it another 10 times. Why is that?
Because, as Guy explained, you have other information which makes it a Bayesian probability problem (look up "Bayesian probability" on the web). The information you have is multi-fold. One, you can see that the coin has two sides that appear to be equally likely to land on, two, you know from past experience that coins tend not to be biased, etc. Quantitatively, you KNOW that pretty much all coins are fair and that the mean heads/tail ratio for all coins is around 1.
So in estimating the true ratio for this particular coin that you flipped 10 times, you would actually regress the 6/4 ratio that you got (your sample result) toward the mean of the population (of all coins or even all coins that might be in your pocket), which is around .5 heads and .5 tails (you can also establish a good estimate of that population mean by flipping lots of coins lots of times).
How much do you regress? Remember, we said that how much you regress a sample result depends on two things: One, the sample size of your sample result (in this case, 10) AND the variance (spread) of true heads/tails ratio in the population (in this case, zero). If the variance is zero in the population, as it is in this case, then the regression is exactly 100%.
Keep in mind that if we are sampling only one element from the population (one coin or one pitcher) regression occurs only if we have "other" information, which we do with pitchers and coins (this "other" information is called "a priori" probabilities in Bayesian terms). If we have no other information (a priori probabilities), then yes, that one element's result becomes our best estimate for that population and for that element itself.
Now, if we change the coin flipping scenario a little, we can create a situation which is exactly like a pitcher. Let's say that we do in fact have some coins, even all coins, that are biased. Let's say that all coins are biased and that the true heads/tails ratio of coins varies from coin to coin with a mean of 1 (.5 heads and .5 tails). So that is the mean of the population of coins. However, this time we have some coins that actually are .6/.4 heads/tails, .4/.6 heads/tails, .7/.3, etc. IOW, if we flipped one of those .6/.4 coins an infinite number of times, it would land on heads 60% of the time.
Let's also say that the distribution of true heads/tails ratio is approx. normal, with a mean and median of .5.
Now what happens if we flip a coin 10 times chosen randomly from this population and come up with 8 heads and 2 tails? You might be tempted to say that our estimate of its true heads/tails ratio is 8/2, since there are indeed some coins in our population that are true 8/2 coins. No! Why is that? Because in a normal (or roughly normal, or at least one where the area around the mean is overrepresented) distribution there are many more coins that are near true .5/.5 than .8/.2. Mathematically (and this can be easily calculated using Bayesian probability and the properties of normal distributions, namely z scores) it is much more likely that we picked a .5, .45, .4, etc. (close to .5) coin and just got a little lucky or unlucky (with our 8/2 flips) than that we picked a true .8/.2 coin and got what was most likely in 10 flips. Why is that? Because there are many more close to fair coins in our distribution to choose from and it is not all that hard to get 8/2 in 10 flips with any coin.
Now, mathematically, if we use Bayes Theorem and all the information we have, we can actually estimate the true head/tails ratio of that coin by regressing the 8/2 that we got in 10 flips, or .8 heads, toward the mean of all coins that we drew this coin from, which is .5. How much we regress depends on 2 things: One, the 10 flips. Why is that? Because, as I said, it is not that hard to get 8/2 in 10 flips whether I have a true .5 coin or a true .8 coin. If I got 80/20 in 100 flips it would be a lot harder to do that with a .5 coin than with a .8 coin, so there would be a much greater chance that I happened to pick a .8 coin even though there were many more .5 (and close to that) coins to choose from in my population of coins.
The second thing that affects how much we regress is the variance in head/tails ratio in my population of coins. Why is that? Because if my variance is small, then there are WAY more coins near .5 to choose from so the chances that I have a near .5 coin and just got 8/2 as a fluctuation is WAY larger than the chances that I chose a true .8 coin (or true .9, .85, .78, etc.) simply because there were hardly any of those coins in my population.
But, the larger the variance of true heads/tails ratios among my coins, the more of those very unfair coins there are (as a percentage of the population), so there is a much better chance that I actually chose a very biased coin and that is why I got a .8/.2 sample result (even though it is STILL more likely that I chose a coin closer to .5 than .8).
This is EXACTLY the same scenario we have with pitchers. We have a population of pitchers (coins) with a mean ERA of whatever (say, 4.50), which is the same as the coins' .5. We have a distribution of true ERA's among that population of pitchers with a variance of whatever. So when we randomly select a pitcher from that population, no matter what ERA we get in any period of time it is always more likely that we chose a pitcher who has a true ERA closer to the population mean than his own sample ERA, simply because our assumption is that there are many more pitchers with a true ERA around average than above or below average, which is a true assumption, more or less.
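The biased-coin example above can be worked through numerically. This is a minimal sketch that applies Bayes' theorem on a grid of possible heads probabilities, assuming a normal prior centered on .5; the prior SDs of 0.10 and 0.02 are invented, and the function name is hypothetical.

```python
from math import comb
from statistics import NormalDist

def posterior_mean_heads(heads, flips, prior_sd, prior_mean=0.5, steps=999):
    """Posterior mean of a coin's true heads rate, computed on a grid."""
    prior = NormalDist(prior_mean, prior_sd)
    grid = [(i + 1) / (steps + 1) for i in range(steps)]   # 0.001 ... 0.999
    weights = [prior.pdf(p)                                  # how common such a coin is
               * comb(flips, heads) * p ** heads * (1 - p) ** (flips - heads)
               for p in grid]                                # chance of this result
    total = sum(weights)
    return sum(p * w for p, w in zip(grid, weights)) / total

ten = posterior_mean_heads(8, 10, prior_sd=0.10)        # 8 heads in 10 flips
hundred = posterior_mean_heads(80, 100, prior_sd=0.10)  # 80 heads in 100 flips
tight = posterior_mean_heads(8, 10, prior_sd=0.02)      # nearly identical coins
print(f"8/10: {ten:.3f}   80/100: {hundred:.3f}   8/10, tight prior: {tight:.3f}")
```

Both factors show up: the same .8 sample rate is regressed less when it comes from 100 flips, and more when the coins in the population barely differ from one another.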
-
fjm235 After all the flamewars you've gotten back to the point I tried to make days ago. Yes, you have to regress to the mean. But it's not just a question of how far you regress (which depends upon the amount of data you have on each individual pitcher) but also what population mean you regress toward. If you know nothing about a pitcher but his 2007 ERA you have no choice but to regress to the MLB average ERA. But if you know he throws 95+ you will almost certainly do better by using the mean of the population of pitchers throwing 95+ than you will with the overall population. And if you know he throws 95+ AND he's a lefty you will probably do even better by looking only at that population mean.
Of course, for each characteristic you add you cut down the size of the population. Before long you will find yourself with nobody left but the pitcher you started with. So it is a constant balancing act: you need enough data to calculate a reliable mean for the population as a whole while at the same time excluding pitchers that have little in common with the one you're interested in except their MLB employment status.
In this particular case all we really know is that we have a group of pitchers who may (or may not) have been overworked by throwing 70+ innings. We now need a control group of pitchers who are similar to them but who did not throw as much to create the population. Clearly we can't use all MLB pitchers or even all relievers because there are major qualitative differences between them. The combined ERA for our 70+ group was 3.27. The combined ERA for all relievers (20+ innings) was 4.22. The combined ERA for all those between 20 and 70 innings was 4.40.
Now you have to ask yourself: what is the likelihood that an ERA differential of 1.13 was caused by "luck", defense or other factors having nothing to do with the true talent of the pitchers? Simple intuition tells us the probability is very low. Sometimes intuition is wrong. But in this case statistics strongly supports us in rejecting the null hypothesis.
So now we need to find a population larger than our 70+ group which includes them and which does not appear to be significantly different in true talent. That might be very hard to do. Fortunately in this case it's quite easy. Here are the ERA's by IP groups.
70-94 3.27
60-69 3.46
50-59 4.26
40-49 4.34
30-39 4.91
20-29 5.05
Aha! We have a pattern. Pitchers who pitch a lot do better than those who don't. No surprise there. But what was surprising (at least to me) is that

MGL
You say: only included those who had no starts in back-to-back years. First I split each pitcher included in the study (back-to-back years with no starts) into two groups: One, less than 70 innings in year X (and any number of innings in year X+1), and two, 70 or more innings in year X+1.
And then you say: Group I, the low innings pitchers, averaged 33.7 innings per pitcher in year X and 35.5 in year X+1. In group II, the high innings pitchers, each pitcher averaged 86.7 innings in year X and 64.9 innings in year X+1.
How can group 2 only average 64.9 innings in year X+1 if your criterion specifies that to fall in that group IP > 70?