Saturday, November 05, 2005

Run Production

I recently charted some seasonal statistics from baseball history into a huge spreadsheet. I looked at every major league season and determined the following stats; some were easy to find, while others I had to compute myself:

Runs per Game: This is the most basic indicator of run production in baseball. It can be applied to a team or to an entire league. 5 runs/game is a lot, 4 isn't very much, and 3 is basically dead-ball.
League ERA: This isolates pitching as part of the equation. Runs per game counts all runs, those allowed by the pitching and by the defense. The League ERA tells us, roughly speaking, the quality of pitching in the league, although I must point out that defense plays a part in ERA as well.
League AVG/OBP/SLG: These separate raw offense into a finer understanding. It's one thing to say that a lot of runs were scored; how were they scored? Was it high-average and low-slugging, which would indicate a singles-happy, powerless offense? Or was it low-average and high-slugging, which would indicate raw power? Or some combination of the two? This helps us isolate just what part of offense is prevalent.
League ISO (Isolated Power): Isolated power is the most accurate way to measure just that: power. Slugging percentage is a tally of how many bases an offense accumulates, but it doesn't say how those bases are accumulated. A high-average team of singles hitters can rack up a respectable slugging percentage simply by piling up hits, even if nobody tops 15 HR.
Isolated power is calculated by subtracting batting average from slugging percentage. How does this measure power? Batting average measures hits per at bat. Every hit is weighted equally, so that a home run is the same as a single. A .350 hitter might be Babe Ruth or he might be Tony Gwynn; there's no way to tell. Slugging percentage builds on batting average: a single is worth 1 base, so it counts the same as it does in batting average, but a double is worth 2, a triple 3, and a home run 4. If a hitter hit only singles, his batting average would equal his slugging percentage. Subtracting batting average therefore eliminates singles from the equation; it tells you how many extra bases a batter accumulates. So Ichiro Suzuki might have a .450 slugging percentage (which looks good), but when you subtract his .370 batting average, you get a .080 ISO, which is low. Dave Kingman might also have a .450 slugging percentage, and I don't need to tell you that Kingman and Ichiro are polar opposites in terms of hitting. Kingman's .450 slugging percentage looks even better when you factor in his .220 batting average: his ISO is .230, which is fantastic. Other hitters, such as Adam Dunn, don't have a terribly high SLG, but when you consider that their average is in the .250 range, you see them as they are: batters who don't often get hits, but when they do, they're big ones.
ISO is a much finer stat; the difference between great and awful is small. When looking at a whole team (or a league, in this case), an ISO of .150 would be very high, whereas anything below .100 would be pretty low.
Home Run Percentage (PA): This is the percentage of plate appearances (walks + at bats) which result in home runs. We can't just look at the raw number of home runs hit; there are more home runs hit today than in 1930, because there are 30 teams today, up from the 16 back in '30. We could look at HR/G, but not all games are created equal; some are rain-shortened and others go to extra innings. The most precise measure is HR/PA, which tells us simply how often home runs are hit, as a percentage of total plate appearances. The all-time leader in home run percentage is Mark McGwire, who hit a home run in 7.77% of his total PAs. Babe Ruth's mark is 6.83%. For an entire league, a 3% HR percentage is very high; anything below 1% is very low. (A quick sketch of how these derived stats are computed follows.)
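For anyone who wants to follow along with the arithmetic, here is a minimal sketch of the derived stats defined above, written in Python. The function names and inputs are my own illustration, not from any stats library, and "plate appearances" here follows this post's simplified definition (at bats + walks) rather than the official one (which also counts hit-by-pitches, sacrifices, and so on).

```python
def batting_avg(hits, at_bats):
    # AVG: hits per at bat; every hit counts the same
    return hits / at_bats

def slugging(singles, doubles, triples, homers, at_bats):
    # SLG: total bases per at bat
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / at_bats

def isolated_power(slg, avg):
    # ISO: extra bases per at bat; subtracting AVG removes the singles
    return slg - avg

def hr_pct(homers, at_bats, walks):
    # HR%: home runs per (simplified) plate appearance
    return homers / (at_bats + walks)

# The Ichiro/Kingman contrast from the text: identical .450 SLG, opposite hitters
print(f"{isolated_power(0.450, 0.370):.3f}")  # 0.080 -- singles hitter, little power
print(f"{isolated_power(0.450, 0.220):.3f}")  # 0.230 -- low average, raw power
```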

Looking at these numbers over each season of baseball history, what conclusions can we draw? I'll start with a look at the season we just completed, 2005. It's refreshing to see the stats from 2005 placed in a historical context. We know we're in the midst of a hitter's era, but how much of a hitter's era? And how much of that is power?
Well, in 2005, there were 4.76 R/G scored in the AL and 4.45 R/G scored in the NL. We would expect this, because the DH ensures a higher level of offense in the AL. While the actual difference between the two leagues varies, the NL hasn't outscored the AL since 1974. The DH was introduced in 1973, so how the NL managed to outscore the AL in 1974, with the DH already in place, is beyond me at this moment.
How does this level of offense compare with the past few years? Both levels are actually down from 2004. In 2004, the numbers were 5.01 R/G in the AL and 4.64 R/G in the NL. That's a not-insignificant decline. Could this decline be traced to the new steroid policies? It's possible, but before we go a-leaping to conclusions, I have to tell you that these numbers fluctuate madly for reasons no one can fathom. It's possible that the steroid policies resulted in the drop in offense in 2005, and it's just as possible that it was pure chance, that the levels would have dropped either way.
The downward trend in R/G has been going on for a while. The hitter's era that I referred to earlier hit its peak from 1996-2000. The AL peaked at 5.39 R/G in 1996, making it back up to 5.30 in 2000 before starting to decline. The NL peaked at 5.00 R/G in 1999 and 2000. It, too, has since declined. So the hitter's era I referred to earlier appears to be over -- at least the worst of it is. It's impossible to determine exactly what the trend is. We might be in the middle of a temporary trough getting ready to sail back up again, or this might be a definite decline back to normalcy.
How does this hitter's era rank against other hitter's eras in the past? Well, while this is certainly one of the highest run-scoring environments ever, it is not the highest run-scoring environment in history, which is a bit of context I'd like most baseball columnists to grasp. Before the 1990s explosion, the American League last topped 5 runs/game in 1950, with a 5.04 mark. The NL was close behind at 4.66. But this did not compare with the '90s. You have to go back to the 1930s to find a run-scoring environment comparable to the '90s. And folks, would you believe that runs were scored at a higher rate in the 1930s than in the 1990s? While the 1990s featured just a few seasons above 5 R/G, the AL stayed above 5 R/G every season in the decade of the 30's. The NL topped the mark just once, in 1930 itself. In fact, the 1930 National League mark of an amazing 5.68 R/G is much higher than the 1996 peak of 5.39 R/G in the AL. The American League in the 1930's was a noticeably higher run-scoring environment than the 1990s.
What if I told you that that wasn't even the peak? The 1930's was the deadball era compared to the 1890's. Nobody remembers the 1890's as an era heavily favoring hitters, but it absolutely puts the 1990's to shame. Would you believe it if I told you that the 1894 National League averaged 7.36 R/G? The average team scored more than 7 runs per game! This is partly because the league ERA was 5.32 that year, and partly because defense wasn't what it is now. Rudimentary gloves and prehistoric groundskeeping made errors and unearned runs a common occurrence.
Due to the chaotic nature of leagues and league organization at this time, run production fluctuated much more wildly than at any other point in history. Four years before the historic 1894, the NL averaged a pedestrian 3.56 R/G. This was in 1890, and must be partly explained by the brief emergence of a third major league, the Players League, which spread the available talent thinner and would naturally have an effect on league run production.
The story of run production in baseball can be explained, briefly, thus: After the chaotic years of early baseball and the hitter-happy 1890s, the emergence of the American League served as the shock that ushered the game into the "Deadball Era." Too often, people think of the Deadball Era as an all-encompassing pre-history, sort of as "everything before Babe Ruth." But the Deadball Era didn't begin until the AL and NL had made peace, somewhere around 1904. In 1901, both leagues were hitter-happy. In 1902, just the AL was, and then in 1903 it switched, with just the NL posting high run production. But in 1904, both leagues settled down into what we know as the Deadball Era. The AL averaged 3.54 R/G and the NL posted 3.91. It basically stayed this way until the end of World War I. (There are two seasons, 1911 and 1912, where both leagues' run production rose before falling back in 1913. I don't have a definitive explanation, but the introduction of the livelier cork-centered ball around 1910 is one likely culprit.)
And then Babe Ruth made baseball a hitter's paradise. Well, it didn't exactly work out like that. Babe Ruth didn't cause baseball to become more hitter-friendly in 1920, any more than Bob Gibson and Sandy Koufax caused it to become pitcher-friendly in the 1960s. Babe was the biggest symbol of an overall trend. While it is true that Babe's fame and success helped spread the new offensive philosophy of pure slugging, he cannot claim credit for an entire era. Historians would suggest that the banning of the spitball and other trick pitches was a prime reason. There was also the unfortunate death of Ray Chapman in 1920. Whereas before, teams would use one ball until it literally became unusable, Chapman's death at the hands of a dirty, nearly-invisible ball caused baseball officials to use more fresh, clean baseballs that jumped off hitters' bats. All these factors came together to bring about the hitting renaissance.
Offense didn't reach its peak in the 1920s; that would come in the 30's, as I said earlier. But it all began in the 1920s. The slow rise out of the Deadball Era into a hitter's era took place over that decade. In 1918, the two leagues combined for about 3.63 R/G, a very low figure. In 1920, the two leagues together scored about 4.3 R/G, an incredibly large increase in just two seasons. Offense would continue to climb slowly until it peaked at about 5.5 R/G in 1930.
Offense dropped steeply in the 1940s, a drop probably caused chiefly by World War II. Again, it's impossible to isolate one factor, but it can't be a coincidence that run production fell off so steeply when all the superstar hitters went off to war. Sure enough, the numbers rebounded after the war was over, and the 1950s were a pretty stable decade altogether. You could look back at the 50's as something of a middle ground, halfway between the homer-happy 30's and the Deadball teens.
Of course, things fell off quite steeply in the 1960s. For whatever reason, offense dwindled drastically, until it reached its nadir in 1968. Again, we have in '68 two memorable figures, Bob Gibson (1.12 ERA) and Denny McLain (31 wins), who represent fully the era in which they played, although they can't be said to have caused it. The AL averaged 3.41 R/G in 1968, whereas the NL sported a 3.43 mark. This is about even with the lowest point of the Deadball Era.
The leagues reacted, taking measures to favor the hitter, most notably lowering the mound and shrinking the strike zone for 1969. The result was a sharp increase in 1969, with both leagues back over 4 R/G, although not by much. Whereas the early 70's were very pitcher-friendly, offense rebounded in the late 70's. This was due partially to the introduction of the DH in the AL. After a 1972 (3.47 R/G) that was almost as bad as 1968, the AL voted to institute the DH rule. Did it increase offense? All the way to 4.28 R/G, one of the biggest single-season increases on record. But this probably wasn't due entirely to the DH; the NL jumped from 3.91 in 1972 to 4.15 in 1973, which suggests either some phantom DH effect or other forces at work.
This evened out in the 1980s to another sort of middle ground. Not only were the 1980's a competitively balanced decade not dominated by any one team, they were also a moderate run-scoring environment, usually staying in the mid-4s.
The exception would be 1987. No one in the mainstream media would remember 1987 for any specific reason; the Twins won their first World Series against the Cardinals in 7 games, although no one remembers that Series at all. But in Sabermetric circles, 1987 is known as the Year of the Hitter. It manifested itself on the individual level: rookie Mark McGwire and Andre Dawson each hit 49 HR, the highest single-season total since George Foster's 52 in 1977. Journeyman hitters looked like stars and pitchers suffered. This was part of a baseball-wide increase. In 1986, the two leagues averaged about 4.35 R/G. In 1987, that average jumped to about 4.75. It was one of the biggest increases on record, especially considering that it happened for no apparent reason. It was mainly a jump in slugging; there were small increases in AVG and OBP, but league slugging percentage shot up a combined 41 points across the two leagues. After years of an ISO around .145 and .146, the AL ISO shot up to .160, which was (at that time) a record. The NL mark of .143 was the highest since 1961. For the first time in history, a league HR% rose above 3%, as the AL stood at 3.08%. The NL mark of 2.68% was a record for that league.
And just as quickly, it was gone. After the AL and NL posted 4.90 and 4.52 R/G in 1987, respectively, the two leagues dropped to 4.36 and 3.88 in 1988, both historic drops. The AL and NL SLG were .425 and .404 in 1987, but in 1988 they were .391 and .363. I don't think anyone has yet discovered why the levels in both leagues rose so dramatically. Some have said that MLB tried out a new type of baseball and pulled it after the season when it proved so lively. Someone (I think it was Ken Macha) has actually kept a box of 1987 baseballs to test them in the future. But I must say that it's just possible that it was the randomness of luck. Sure, it's nearly impossible that such an increase could happen due to coincidence. What are the odds? Less than 1%, surely. Well, MLB has played more than 100 seasons now, so we'd have to expect at least one season that was completely f***ed up simply due to chance.
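That intuition actually checks out with a little arithmetic. Here is a quick back-of-the-envelope calculation, assuming (purely for illustration) that every season independently carries a 1% chance of a freak offensive spike:

```python
# Chance of seeing at least one 1-in-100 fluke season over n independent seasons
p = 0.01   # assumed per-season probability of a 1987-style spike
n = 100    # MLB had played roughly 100 seasons as of 2005

p_at_least_one = 1 - (1 - p) ** n
print(f"{p_at_least_one:.0%}")  # ~63%: a fluke year somewhere is more likely than not
```

Under those assumptions, a 1987 showing up somewhere in the record books is not just possible but probable.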
But back to the timeline: after 1987, things normalized, as I said. And everything in the early 90's looked just like the 80's: pretty normal. In 1992, the AL scored 4.32 R/G, and the NL scored 3.88 R/G. Both were a bit low, but not out of line. Here's what happened after that:
Year    AL      NL
1992    4.32    3.88
1993    4.71    4.49
1994    5.23    4.62
1995    5.06    4.63
1996    5.39    4.68

And we were off to the races. If you look at a graph of run production across history, the biggest and steepest increase is, by far, the one from 1992-1994. In just two years, the AL increased production by nearly a full run and the NL by three-quarters of one. We'd gone from a quiet, semi-pitcher-friendly environment into the second coming of the 1930's.
Why did it happen? There may not be a why; things just happen sometimes. I'm sure there were 100 factors, all of which played a part. The biggest ones? New bat technology; stricter umpire guidelines; more new balls used per game; expansion; Coors Field; the boom in free agent money for big hitters; etc. And the hitter's era perpetuated itself, as everyone in the lineup became a home run hitter; whatever purists may think of that approach, it kept run production climbing.

The day of the light-hitting shortstop was all but gone. Keep in mind that some 75-90% of shortstops were basically without offense during the early 80's. Now any shortstop who is all-defense and no-offense goes by another name: utility. Someone like Pokey Reese would have gotten a long-term job with any team in the 80's; now, he's a backup. It wasn't so much a change in technology as a change in ideology. The emergence of shortstops like A-Rod, Jeter, Nomar, and Tejada, along with catchers like Mike Piazza and Ivan Rodriguez, led more teams to realize that wasting a lineup spot on a defensive specialist might be just that: a waste. Now a player who can't hit has to be a truly excellent defender (Orlando Hudson) to get a full-time lineup spot. And maybe that's how it should be. The days of an average defender who hits .250 with 2 HR earning a spot in the lineup may be over. And anyone who remembers Jackie Gutierrez should be happy about that.
So what about the 2000s? The 2005 levels of run production are roughly equal to those of, say, 1961. So we're still in a hitter's era, but we appear to be through the worst of it. And, as I've illustrated above, this is not the best run-scoring environment in history.
But what about home runs and power? Okay, so this isn't the biggest run-scoring era in history. But is it the biggest home run-hitting era in history?
The short answer is: yes, and it's not even close. A chart of run production over baseball history has its ups and downs, as I've said. And while we're on a big up right now, it's not much higher than other eras.
But what about a chart of Isolated Power? Or a chart of HR%? Folks, that chart is almost a straight line going up, up, and up. All those high-scoring eras of the past I told you about? They didn't even approach our current level of power. In 1894, when the NL scored over 7 R/G, its ISO was .126. That's equal to the AL in 1992, which, as I said, was pretty pitcher-friendly with just 4.32 R/G. The 1894 NL's HR%? An even 1%. Home run rates have been consistently higher than that ever since the 1920s.
What about the 1930s? Surely the peak of the Babe Ruth years rivaled ours in home runs? Not quite. The ISO in the 1930 NL (5.68 R/G) was .145. That's very high. But it's lower than both the AL (.156) and NL (.152) marks in 2005, which was a step down from the 90's peak. The HR% in the 1930 NL? 1.89%. That's roughly equal to the 1981 AL, and 1981 was a pitcher-friendly year even by the standards of the 1980s.
So what is the all-time record for a league's ISO? That goes to the 1996 AL, which sported an ISO of .168. Outside of the 1987 AL, which made it to .160, no league from an earlier era even tops .150. So, although Isolated Power was down in 2005, both leagues are still historically high. In fact, both the AL and the NL were more powerful in 2005 than any league in history prior to 1987. That's sobering.
What's the all-time high in HR%? That would be 3.13%, again in the 1996 AL. The 2005 marks were 2.87% in the AL and 2.67% in the NL. But again, both leagues in 2005 hit home runs at a better pace than any other league prior to 1987.
How can we explain this? How can we be hitting so many more home runs and not be scoring more runs, period? It has to do with the nature of run scoring. Whereas modern offense is all about homers, the 1890s were all about singles. The NL batting average in 1894 was .309; the average hitter hit .300. What was the league batting average in, for example, the 1996 AL, the highest-scoring league known to man? A pedestrian .262. The same holds for the 1930's. The 1930 NL hit .303 as a league, the last time any league has hit above .300, or, for that matter, come close.
But what about the difference in OBP? Maybe modern hitters are making up for the lack of singles by drawing more walks! Well, this is true to a certain extent; walks in the 90's were at an all-time high (as were strikeouts). The AL OBP in 1996 was .348, which is darn good. In the 1894 NL? It was .373. That closes the gap somewhat, but it still leaves the past hitters getting on base at a far higher clip than their modern counterparts.
And while modern hitters are reaching base on walks more often than ever, hitters in the past reached base on errors more than ever. This doesn't show up in AVG or OBP, but a run scored off an error is still a run. And far more errors were made in the 19th century, and even in the 1930's, than are made now. In the 1894 NL, 31% of runs were unearned; in the 2004 NL, just 8% were. This is a striking trend in baseball, and it helps explain how those old-timers were scoring so many runs without hitting homers.

So what is the point to my story? It helps to understand things in context. When you hear that Bill Terry was a lifetime .341 hitter, remember that the league average was .291 during his career, whereas Tony Gwynn was a .338 lifetime hitter against a measly .265 league average. When you hear that Sandy Koufax's career ERA was 2.76, consider that the league average for his career was 3.70. Pedro Martinez's career ERA (through 2004) was 2.71, compared to a league average of 4.48. If you just looked at the stats in the record books, you might think that Pedro and Koufax were equally good at preventing runs. But when you put them in context, you find that it's not even close. Mainstream historians are inept at separating players from their context, which is why a lot of hitters from the 1930s got elected to the Hall of Fame despite not being that good. It also explains why everyone considers Sandy Koufax to be one of the best pitchers ever, when he simply wasn't. Koufax was very good, but a lot of his quality didn't come from himself: it came from Dodger Stadium and the fact that he was lucky enough to pitch in the 1960s.
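To make that Koufax/Pedro comparison concrete, here is a quick sketch of a league-relative ERA, essentially a crude ERA+ with no park adjustment (a real park adjustment would widen the gap further, given Dodger Stadium). The function is my own illustration, using the career figures quoted above:

```python
# League-relative ERA: league ERA divided by the pitcher's ERA, scaled to 100.
# 100 is exactly league average; higher is better.
def relative_era(era, league_era):
    return 100 * league_era / era

print(round(relative_era(2.76, 3.70)))  # Koufax: ~134, or 34% better than his league
print(round(relative_era(2.71, 4.48)))  # Pedro through 2004: ~165, another class entirely
```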
Consider this, and my history lesson shall not have been in vain.
