Monday, December 11, 2006

The Whiz Kid's Glossary

It recently occurred to me that it would be helpful to add a glossary of sorts to this site. It gets in the way sometimes to re-explain a concept again and again when it would be easier to have one central resource that I could refer people to. This way I wouldn't have to explain WXRL every time I mention it; I could just link to the glossary.
I'd like to include not just advanced stats in the glossary but also different concepts that I refer to from time to time. These can be old baseball concepts such as the "5-tool player," new sabermetric concepts such as TINSTAAP or the LOOGY, or humorous concepts of my own creation. Hopefully this will make the blog more accessible for old readers and newcomers alike.

I'd like to keep updating this, as new ideas develop, or if I just remember something I've forgotten. I'll try to link to the glossary from articles to make the reading experience smoother. Because I'm just that nice.
NOTE: I'll refrain from defining the more basic, i.e. baseball-card stats such as RBI, batting average, or saves. These definitions can be easily accessed at mlb.com's glossary of basic stats.


STATS
AVG/OBP/SLG
This is the standard way to denote the three essential hitting stats in baseball, in this specific order. So if I say that Justin Morneau hit 321/375/559 last season, it means that his batting average was .321, his OBP was .375, and his slugging percentage was .559. In some early entries on this site, I used the MLB.com notation of OBP/SLG/AVG, which is not the standard way. I'm working on going back and changing the early entries, but bear with me.
BABIP
Batting Average on Balls In Play. This stat can refer to a hitter or pitcher. It denotes the batting average on "balls in play," which excludes home runs, strikeouts, walks, hit batsmen, etc.
For a pitcher especially, BABIP is very informative. As Voros McCracken first illustrated (see concepts), a pitcher has almost no ability to control his BABIP (later research found that knuckleballers do have some control). So, other than strikeouts, walks, and HR, a pitcher's BABIP is determined almost entirely by chance and the defense behind him.
Therefore, if a pitcher has a particularly high (or low) BABIP, we should account for this good (or bad) luck when evaluating him. An average BABIP is between .280 and .290, depending on the context.
A pitcher with a better-than-expected BABIP is said to be "hit-lucky," that is, relatively few of his balls in play are going for hits.
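For readers who want the arithmetic, here is a minimal sketch using the commonly published BABIP formula, (H - HR) / (AB - K - HR + SF). The function name and the sample stat line are made up for illustration. Walks and hit batsmen drop out on their own because they never count as at-bats.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play to measure")
    return (hits - home_runs) / balls_in_play


# Made-up pitcher line, not a real player: 200 H, 20 HR, 800 AB, 150 K, 5 SF.
sample = babip(hits=200, home_runs=20, at_bats=800, strikeouts=150, sac_flies=5)
print(f"{sample:.3f}")  # 180 hits on 635 balls in play
```

A result near the .280-.290 range would be roughly average; a number well outside it is a hint that luck or defense is involved.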
DER
Defensive Efficiency Ratio. A very basic concept that measures what percentage of balls in play a defense turns into outs. This is a very rough measure, and does not account for ballpark factors or the nature of balls in play (it's harder to turn line drives into outs than ground balls). But as a rough tool, there's no better measure of a team's defensive efficiency.
This stat can also be used for a pitcher, to determine the quality of the defense behind him and to see whether he was getting better or worse defense from his team than expected.
ERA+
ERA+ is the ratio of the league ERA to that of the pitcher (ballpark adjusted). ERA+ takes ERA and puts it in perspective, both for the pitcher's ballpark and for his league environment. 100 is average (the pitcher's ERA is the same as that of the league), while anything over 100 is above-average, and anything below 100 is below-average. Anything above 150 is pretty excellent; anything below 80 or so is untenable.
Bob Gibson's 1968 ERA of 1.12 adjusts to an ERA+ of 258 (the park-adjusted league ERA was 2.58 times his). Pedro Martinez's 2000 ERA of 1.74 adjusts to an ERA+ of 285, the best mark of the century. 2006 AL Cy Young Winner Johan Santana's ERA+ was 161; NL winner Brandon Webb's was 154.
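The basic ratio can be sketched in a few lines. This is a simplified version that skips the ballpark adjustment entirely, so it will not reproduce published ERA+ figures; the function name and the sample ERAs are hypothetical.

```python
def era_plus(pitcher_era, league_era):
    """100 * (league ERA / pitcher ERA); the park adjustment is left out here."""
    if pitcher_era <= 0:
        raise ValueError("ERA must be positive")
    return 100 * league_era / pitcher_era


# Hypothetical pitchers in a league with a 4.50 ERA.
print(round(era_plus(3.00, 4.50)))  # 150: excellent
print(round(era_plus(4.50, 4.50)))  # 100: exactly league average
print(round(era_plus(5.00, 4.00)))  # 80: about where things become untenable
```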
EQA
Equivalent Average. A Baseball Prospectus metric that condenses a player's entire offensive contribution into an average. The scale is similar to that of batting average; .300 is very good, .200 is bad, and .400 is historic. 2006 AL MVP Justin Morneau posted an EQA of an even .300; teammate Joe Mauer's was .313. Cleveland DH Travis Hafner's was .350. Barry Bonds' 73-home run performance in 2001 rated an EQA of .427.
FRA
Fair Run Average. Fair Run Average takes a pitcher's Earned Run Average and adjusts for runners inherited and runners bequeathed.
Regular ERA gives all credit (and blame) for inherited runners to the pitcher who leaves them behind, and none to the relief pitcher who inherits them. FRA portions out credit and blame to both the starter who leaves the runners, and the reliever who either strands them or lets them score.
A pitcher with a very good bullpen will end up with a better ERA than they deserve, whereas the same pitcher with a bad bullpen will have a higher ERA due to no fault of their own. The difference between a good bullpen and a bad one can have a significant effect on a pitcher's ERA.
FRA also applies to relief pitchers, making them bear the brunt of any inherited runners they let score. But it also gives them credit if they enter in a "jam" and strand the runners.
FRAA
Fielding Runs Above Average. FRAA condenses a player's defensive output into a number of runs, Fielding Runs. That is compared to the league average of 0, with a good fielder being above 0 and a poor one below. A similar measure is FRAR, or Fielding Runs Above Replacement, which sets the bar lower for comparison (see below to read about the concept of Replacement Level).
FRAA usually differs widely based on position. A first baseman or left fielder has relatively little effect on the defense, and so isn't likely to be much above or below average. Shortstop, on the other hand, is a key position. The difference between a good shortstop and a bad shortstop can be more than 20 Fielding Runs, whereas there is rarely such disparity between a good and bad first baseman.
Different sources use different adjustments to determine FRAA, but I go by Clay Davenport's numbers at Baseball Prospectus. According to Davenport, the most valuable shortstop in the AL last year was Jhonny Peralta of Cleveland (+23 FRAA); the worst was Detroit's Carlos Guillen (-13 FRAA). Derek Jeter rated at +4.
There are many different systems to rate defense and, unlike offense or pitching, there's no clear consensus as to which is best. There is Range Factor, Adjusted Range Factor, Zone Rating, Ultimate Zone Rating, and John Dewan's new plus/minus system. I tend to use FRAA as a default, while recognizing that it's not the absolute authority.
ISO
Isolated Power. This is a more basic measure; it seeks to define a player's power, measured in extra-base hits. Slugging Percentage measures a player's total bases, but doesn't really measure power. Pete Rose and Ichiro Suzuki accumulated a lot of total bases, but neither one was by any means powerful. ISO accounts for the difference.
It's calculated by subtracting a player's AVG (batting average) from their slugging percentage. The reasoning is this: batting average measures every hit as 1; either a hit or not. Slugging percentage measures hits 1-4, with 1 being a single and 4 being a home run. If all your hits were singles, your slugging percentage and batting average would be equal. By subtracting AVG from SLG, you're left with how many extra bases a hitter accumulates.
Ichiro Suzuki's 2006 SLG of .416 would indicate a player with a little power. However, when we subtract his .322 AVG, we're left with an ISO of .094, not much at all. This accurately reflects Ichiro's 9 HR, 9 triples, and 20 doubles in over 700 PAs (plate appearances).
Adam Dunn's 2006 SLG of .490 is good, but not quite as good as you would think, given his reputation. We would expect him to be much better than Ichiro in this regard. And once you subtract his AVG (.234), he is: his ISO is .256, which is truly excellent. He's about three times the slugger Ichiro is, which is a much more accurate statement than their SLG would suggest, given that Dunn hit 40 HR and 24 doubles (but no triples).
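The subtraction is simple enough to check by hand, but here it is as a small sketch using the 2006 figures quoted above. The function name is my own.

```python
def isolated_power(slg, avg):
    """ISO = SLG - AVG, rounded to the usual three decimal places."""
    return round(slg - avg, 3)


ichiro = isolated_power(slg=0.416, avg=0.322)
dunn = isolated_power(slg=0.490, avg=0.234)

print(f"Ichiro ISO: {ichiro:.3f}")
print(f"Dunn ISO:   {dunn:.3f}")
print(f"Dunn / Ichiro: {dunn / ichiro:.1f}x")
```

The ratio comes out a little under three, which is where "about three times the slugger" comes from.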
K rate
Rate of strikeouts. For a batter, this is usually determined per at-bat or plate appearance. Ichiro Suzuki's K rate is tiny (71 K in over 700 PAs). Adam Dunn's is monstrous (194 in just under 700 PAs).
For a pitcher, this is determined per inning pitched. A pitcher's strikeout rate is one of the most important pieces of information you can get. It's more predictable and constant than almost any other basic pitching statistic, including ERA. K rate is much less susceptible to luck, chance, and environment than almost any other pitching stat. Therefore, it is much more indicative of a pitcher's true level of talent and is invaluable.
Generally speaking, a K rate of about 6 per 9 IP is average. A pitcher with a K rate below that is likely heading for danger. Truly excellent pitchers often have a K rate of more than 9 per 9 IP. So if you see a pitcher with more strikeouts than innings pitched, you're almost definitely looking at a star.
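Both versions of K rate are one-line calculations. In this sketch the plate-appearance totals for Ichiro and Dunn are rounded to 700 (an approximation, since the text only says "over" and "just under" 700), and the pitcher is hypothetical. Innings are passed as true innings, so a box-score 165.2 would need converting to 165 and two-thirds first.

```python
def k_per_pa(strikeouts, plate_appearances):
    """Batter strikeout rate per plate appearance."""
    return strikeouts / plate_appearances


def k_per_9(strikeouts, innings_pitched):
    """Pitcher strikeouts per nine innings."""
    return 9 * strikeouts / innings_pitched


print(round(k_per_pa(71, 700), 3))   # Ichiro, with PA approximated as 700
print(round(k_per_pa(194, 700), 3))  # Dunn, with PA approximated as 700
print(k_per_9(200, 180))             # hypothetical ace: more K than IP
```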
LEV
Leverage. Leverage indicates the importance of the situation a pitcher faces. If a pitcher only appears in the 1st inning with the bases empty, they'll have a very low leverage. If they only pitch in the 9th inning with the bases loaded, their leverage will be very high. Leverage also takes the score into account; a 1-run game is a higher-leverage situation than a blowout.
Leverage tells us how important a pitcher's innings were. This is especially important for closers or middle relievers; an ERA of 1.00 may be meaningless if it was compiled in easy chances, but if it was done in high-stakes competition it's fantastic.
Ideally, a manager would deploy his best relief pitcher to face the highest-leverage situations. Many analysts complain that modern managers aren't doing this, though -- at least not to the extent they used to. The save statistic has become more important than the leverage of a situation, when it should be the other way around.
It should be said that leverage is biased toward opportunity, so a pitcher's "leverage" score doesn't so much indicate inner talent as it shows how he was used.
PAP
Pitcher Abuse Points. I spoke about this at length during the blog about managers and, specifically, Dusty Baker.
PAP was developed by Rany Jazayerli and Keith Woolner in an attempt to quantify how much stress a pitcher was placed under. The formula for PAP has been recalibrated in an attempt to reflect reality, so that a high number of PAP strongly correlates with injury risk.
Jazayerli and Woolner defined five different "categories" of abuse for a starting pitcher, with Category I being little abuse and Category V being strong abuse. PAP not only lets us examine managers and their abuse of pitchers, it's also a great tool to examine and predict a player's future.
2006's leader in pitcher abuse was Livan Hernandez. Livan has led the category on and off for years now and appears to be the rare pitcher with a true "rubber arm." In 2nd place was Carlos Zambrano, who has also appeared on the leaderboard for years now. Zambrano, unfortunately, is still quite young, and so we won't know if he can handle such high stress until injury strikes. Thank you, Dusty Baker.
It should be noted that the AL's leader in PAP was free agent Barry Zito. Zito also led the league in 2005, and was 3rd in the AL in 2004. Buyer beware.
PECOTA
Baseball Prospectus' tool to predict a player's performance. PECOTA establishes a three-year baseline for a player's level of performance and makes adjustments based on age, position, playing time, level of play, etc. It then compares the player to every other player in history to find comparables and looks at how those comparables performed. The system is continually revised to keep its predictions accurate.
PF
Park Factor. Determines how often runs are scored at the ballpark compared to the league average. Can be calculated for hitters or pitchers. A Park Factor of 130 indicates that runs were scored 30% more often than at other parks (think Coors Field in its heyday). A Park Factor of 85 indicates that runs were scored 15% less often than at other parks (the Astrodome at its most cavernous). Park Factors should always be kept in mind when viewing raw stats, as any Rockies fan can attest.
Park Factors are usually calculated with data from more than one year, to increase the sample size and improve accuracy. Park Factors are usually based on either 3-year or 5-year data sets. The trouble is that expanding your data sets increases your sample size, but also further distances you from the year you're actually measuring.
Most Park Factors are broad, indicating only run scoring. But the same principle can be applied to almost any other event. You can get a Park Factor for doubles, foul outs, and triples. These can be very important when looking at individual hitters. J.D. Drew, for example, is moving from a pitcher's park (Dodger Stadium) to a hitter's park (Fenway Park). Does that mean his offense will increase? Not exactly. Dodger Stadium is actually a friendly park to left-handed home run hitters, whereas Fenway Park is certainly not.
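To make the basic run-scoring Park Factor concrete, here is a deliberately simplified one-season sketch: total runs per game in a team's home games divided by total runs per game in its road games, times 100. Published Park Factors use multi-year data and further adjustments, so treat this as an illustration of the idea only. The function name and the team totals are invented.

```python
def simple_park_factor(home_rs, home_ra, home_games, road_rs, road_ra, road_games):
    """100 * (runs per game at home) / (runs per game on the road).

    Counts runs by both teams, so the team's own quality mostly cancels out.
    """
    home_rpg = (home_rs + home_ra) / home_games
    road_rpg = (road_rs + road_ra) / road_games
    return 100 * home_rpg / road_rpg


# Invented totals for a team in an extreme hitter's park.
pf = simple_park_factor(home_rs=550, home_ra=503, home_games=81,
                        road_rs=400, road_ra=410, road_games=81)
print(round(pf))  # 13.0 runs per game at home vs 10.0 on the road
```

The same function works for doubles, triples, or any other event if you feed it those counts instead of runs.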
Pythagorean Win Totals (pWins or pLosses)
This metric was originally developed by Bill James, who established that a team's win-loss record is a predictable result of their runs scored and runs allowed. This strong correlation flies in the face of more traditional thinking that a team is more than just their stats. James found that, on the contrary, a team is for the most part just that. In fact, a team's Pythagorean Won-Loss Record is a more accurate predictor of future performance than their real won-loss record.
The formula is so named because it bears a resemblance to the Pythagorean Theorem of geometry.
The Pythagorean Totals come with some caveats. It's been shown that a team with a good bullpen can succeed despite a deceptively low pW-pL total; this is because they are timing their runs and runs prevented well. The Pythagorean Theory is based on the concept that a run is a run, but reality is that runs scored in later innings are more valuable. A team that is more talented at scoring (and preventing) runs in the late innings can succeed beyond the appearance of their pWin totals.
A further extension of the theory, accounting for some of these factors, was developed by Clay Davenport. These are known as the Pythagenport numbers.
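For reference, the basic Bill James formula is runs scored squared, divided by the sum of runs scored squared and runs allowed squared. Here is a minimal sketch of it; the function names and the sample team are hypothetical, and the exponent is left as a parameter only because later refinements change it.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage: RS^x / (RS^x + RA^x), with x = 2 by default."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythagorean_record(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected (pWins, pLosses) over a season, rounded to whole games."""
    wins = round(pythagorean_win_pct(runs_scored, runs_allowed, exponent) * games)
    return wins, games - wins


# Hypothetical team: 800 runs scored, 700 allowed.
print(round(pythagorean_win_pct(800, 700), 3))
print(pythagorean_record(800, 700))
```

A team whose real record sits well above or below its pW-pL is the kind of team the caveats above are talking about.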
Playing Time Requirements
In order to be considered on the leaderboard for a rate statistic (ERA, AVG, OBP), a player has to meet a minimum of playing time. This is obvious, because otherwise, the guy who went 1-for-1 would win the batting title.
Generally speaking, these requirements are:
For hitters, at least 3.1 PA/team game (502 plate appearances, if your team played 162 games).
For pitchers, at least 1 IP/team game (162 IP).
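As a quick sketch, the two thresholds look like this. The function names are my own, and rounding the hitter requirement to a whole number is an assumption that happens to reproduce the 502 figure for a 162-game season.

```python
def hitter_qualifies(plate_appearances, team_games=162):
    """At least 3.1 plate appearances per team game (502 over 162 games)."""
    required = round(3.1 * team_games)
    return plate_appearances >= required


def pitcher_qualifies(innings_pitched, team_games=162):
    """At least 1 inning pitched per team game (162 over 162 games)."""
    return innings_pitched >= team_games


print(hitter_qualifies(1))       # the 1-for-1 guy does not win the batting title
print(hitter_qualifies(502))
print(pitcher_qualifies(150.0))
```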
QS
Quality Starts. The number of starts where a pitcher a) pitched at least 6 innings and b) allowed 3 or fewer earned runs. The Quality Start was developed as a way to determine just that -- how many quality starts a pitcher made, as opposed to how many wins they were credited with.
With the development of more advanced pitching stats, Quality Starts is of less importance. But it is a good answer to a simple question: how many times did the pitcher keep his team in the game?
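The two-part test translates directly into code. This sketch uses invented game lines, and innings are given as true innings rather than box-score notation.

```python
def is_quality_start(innings_pitched, earned_runs):
    """At least 6 innings pitched and 3 or fewer earned runs."""
    return innings_pitched >= 6 and earned_runs <= 3


def count_quality_starts(starts):
    """Count quality starts in a list of (innings_pitched, earned_runs) pairs."""
    return sum(is_quality_start(ip, er) for ip, er in starts)


# Invented game log: (innings pitched, earned runs).
game_log = [(7.0, 2), (6.0, 3), (5.0, 0), (6.0, 4), (8.0, 1)]
print(count_quality_starts(game_log))
```

Note that five shutout innings don't count, and neither do six innings with four earned runs.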
RA
Run Average. The same as ERA, except it takes into account all runs a pitcher allows, earned and unearned. Since the rules for separating earned and unearned runs are so antiquated and ineffective, many analysts choose to take all of a pitcher's runs into account. We know how to account for a pitcher's defense without the ineffective concept of "errors."
Runs (broad)
This doesn't refer to runs in the strict baseball sense (as in, Ichiro scored 100 runs or Johan Santana allowed 2 runs in the game). It refers to the attempt to take a ballplayer's total contribution and express it in one number: runs.
We've already discussed fielding runs, which attempt to express a ballplayer's defense in runs. There are also batting runs and pitching runs. In this context, all runs are created equal. A hitter worth 100 batting runs and -20 defensive runs contributes 80 total runs.
These are often qualified by determining how far they are above average or above replacement level. Hence, BRAA and BRAR; PRAA and PRAR; and FRAA and FRAR.
Generally speaking, 10 runs equals one win. So if one player is worth 50 runs above replacement, and another is worth 70, we can see that the latter player is worth 2 more wins than the former.
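The bookkeeping here is just addition and a division by ten. A small sketch, with function names of my own choosing and the 10-runs-per-win rule of thumb taken at face value:

```python
RUNS_PER_WIN = 10  # rule of thumb from the text, not an exact constant


def total_runs(batting_runs=0, fielding_runs=0, pitching_runs=0):
    """All runs are created equal, so the components simply add up."""
    return batting_runs + fielding_runs + pitching_runs


def runs_to_wins(runs):
    """Convert a run total into wins at 10 runs per win."""
    return runs / RUNS_PER_WIN


print(total_runs(batting_runs=100, fielding_runs=-20))  # 80 total runs
print(runs_to_wins(50), runs_to_wins(70))               # 5.0 and 7.0 wins
print(runs_to_wins(70 - 50))                            # a gap of 2.0 wins
```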
VORP
Value Over Replacement Player. Building on the entry above, VORP seeks to boil down a player's contribution into one number: the runs they provided above replacement level. VORP accounts for league, ballpark, and many other contextual factors.
VORP can be determined for hitters or pitchers. VORP does not take into account a player's defensive contribution, but it does take their position into account when determining value (i.e. a good-hitting shortstop is more valuable than a good-hitting DH).
Derek Jeter led all AL position players in VORP last season with 77.9. AL MVP Justin Morneau finished with a VORP of 50.9, ranking him 16th in the league, behind teammates Joe Mauer (65.4) and Johan Santana (79.7).
Seattle pitcher Joel Pineiro ranked near the bottom; his 6.36 ERA in 165.2 IP gave him -13.4 VORP. This means that the Mariners would have gained more than one full win had they replaced Pineiro with a readily-available replacement-level player.
WARP
Wins Above Replacement Player. WARP is essentially VORP translated to wins, plus accounting for defense. WARP1 includes a number of adjustments to take into account a player's season. WARP2 adjusts the player's performance for all-time. WARP3 is essentially WARP2 with an accounting for season length; it credits players who played in shorter, 154-game seasons.
WS
Win Shares. Amount a player contributed toward his team's wins. 3 WS = 1 win. A statistic developed by Bill James that accounts for offense, defense, and pitching. It multiplies a team's wins by 3 to determine its Win Shares and then apportions out credit or blame accordingly. James developed the statistic to work the opposite way from conventional statistics. Instead of trying to take HR, BB, etc. and turn them into wins, James started with wins and then parcelled out credit.
Win Shares is probably not as accurate as other models, and James is admittedly not a statistician, per se. The stat has been tweaked somewhat by the staff of The Hardball Times, who still use it as their basic measuring stick.
WXRL
Win eXpectation above Replacement, Lineup-Adjusted. See below on Win Expectancy for more background.
In short: WXRL measures the likelihood that a team will win a game based on the state of the game, the teams and players involved, etc. Every event in a baseball game changes the win expectancy. WXRL determines the net positive, or negative, effect that every pitcher has upon a ball game. If the pitcher comes in with a 50% chance of his team winning and leaves with a 70% chance of his team winning, then that pitcher would get (roughly) a WXRL of 0.2 wins.
This is a rough explanation of the concept, but the pitcher's net effect on his ballclub in terms of wins is expressed by WXRL. It seeks to take into account not just the total number of strikeouts, etc., but when they accomplish these things and what effect they have on the game. Leverage (see above) plays very strongly here.
Leaders in WXRL are usually closers, since they tend to pitch when the game is most on the line. The 2006 leader in WXRL was Francisco Rodriguez at 7.301, far above second-place Jonathan Papelbon (6.605).
WXRL is, like leverage, biased for opportunity. It doesn't just measure how a pitcher pitches, but when he does it. This isn't just a reflection upon them, but upon their manager (especially for relievers).
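The core idea, a pitcher being credited with the change in his team's chance of winning, can be sketched as below. This is only the raw win-expectancy difference; it leaves out the replacement-level and lineup adjustments that the real WXRL applies, and the appearances are invented.

```python
def win_expectancy_added(we_on_entry, we_on_exit):
    """Change in the team's win probability while the pitcher was in the game."""
    return we_on_exit - we_on_entry


def season_total(appearances):
    """Sum the change over a list of (we_on_entry, we_on_exit) pairs."""
    return sum(win_expectancy_added(entry, exit_) for entry, exit_ in appearances)


# Invented appearances: a good middle-inning outing, a converted save, a blown lead.
appearances = [(0.50, 0.70), (0.80, 1.00), (0.60, 0.10)]
print(round(win_expectancy_added(0.50, 0.70), 3))
print(round(season_total(appearances), 3))
```

One blown lead wipes out two good outings here, which is why leverage matters so much to the final number.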
CONCEPTS
5-Tool Player
Baseball scouts measure 5 different "tools" to evaluate a prospect. The tools are: hitting for average, hitting for power, running speed, arm strength, and fielding ability. "5-tool player" is still the term used to describe the very best prospects.
Partially because of the statistics revolution and the book Moneyball, the concept of a 5-tool player isn't as prevalent as it was. Scouts don't tend to go strictly by tools anymore, and more and more teams are using statistical analysis. This makes sense; having the "tools" to be a good baseball player does not indicate the likelihood that you will become one. The minors are full of former 5-tool players who never mastered the most important tool: playing good baseball.
The tools themselves are a bit flawed. They indicate that offense and defense are each 40% of a player's game, and baserunning the other 20%. A more realistic weighting would be: offense 80%, defense 17%, baserunning 3%. There is no tool for plate discipline or batting eye; in fact, many failed 5-toolers are guys whose power never survived to the majors, because their poor plate discipline led to too many strikeouts (think Bo Jackson). The defensive requirements are less clear: what exactly is fielding ability? And while throwing arm is important, it's totally irrelevant if you're a first base prospect.
But, like most scouting principles, the 5 tools were based upon a good idea taken to impractical extremes. A well-rounded, athletic prospect -- one who does many things well -- is more likely to make it to the majors and thrive than a one-dimensional, unathletic prospect with the exact same statistics.
But athleticism and being well-rounded aren't everything. As Billy Beane said, we're not selling jeans here. Baseball is full of unathletic superstars and ultra-athletic washouts. But as a general idea, it's not bad.
As a note, the idea of the 5-tool player and the notion of being multi-dimensional are very important in the scouting world but irrelevant when discussing the past. This becomes evident when baseball writers -- most of them pro-scouting -- end up voting for the Hall of Fame. A player's past accomplishments are already there; it doesn't matter whether they used 5 tools to get them or zero. But you'll still hear guys like Mark McGwire criticised as being one-dimensional. But, looking at the past, dimensionality is irrelevant; how much did they produce?
This is a good example of how a good idea can be taken too far and used too much.
10-and-5 Player
Under baseball rules, a player with at least 10 years of major league service time and at least 5 years with the same club. Under current rules, such players automatically gain the power to veto trades.
There was some noise recently that the Braves would try to trade Andruw Jones before he reached the 10-and-5 threshold and was able to veto a deal. But it didn't happen.
Age 27 Peak
Refers to the conventional wisdom that the average baseball player peaks at age 27. For years, the age of 30 was considered to be a baseball player's peak, but recent research by Bill James (basically confirmed by other studies) indicates that it is much earlier, around age 27. It's different for every player, especially different types of player (see "Old Player's Skills"). But this is generally true.
Many people reflect an ignorance of the age 27 peak when they expect 31- and 32-year-olds to continue performing at their peak level for a multi-year contract. In fact, one basic fact of free agency is that almost every free agent who enters the market is past his peak; most in fact are over the age of 30. This further illustrates the danger of the Big-Name Free Agent and the importance of developing your own talent.
AS
All-Star. If I refer to a player as All-Star quality, it usually means that they're an above-average player among the two or three best at their position. They might be in the MVP race once or twice, but they're a step below what I consider to be an MVP-caliber player.
Examples include: Melvin Mora, Julio Lugo, Michael Barrett, Brett Myers, and many more.
BB:K ratio
Like K rate, a good BB:K ratio is key for a hitter or a pitcher. For either one, you want to keep it on the right side of the 1:2 margin. A hitter who strikes out more than twice as often as he walks is probably trouble. Ditto for a pitcher who fails to strike out twice as many batters as he walks.
The 1:2 (or 2:1, if you prefer K:BB ratio) is a generalization and the rate of strikeouts and walks must be taken into account individually.
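Here is the rule of thumb as a rough sketch, expressed as strikeouts per walk. The function names and stat lines are hypothetical, and as noted above, a pass or fail here is a generalization, not a verdict.

```python
def k_per_bb(strikeouts, walks):
    """Strikeouts per walk; infinite if there are no walks at all."""
    if walks == 0:
        return float("inf")
    return strikeouts / walks


def hitter_passes(strikeouts, walks):
    """A hitter wants no more than 2 strikeouts per walk."""
    return k_per_bb(strikeouts, walks) <= 2


def pitcher_passes(strikeouts, walks):
    """A pitcher wants at least 2 strikeouts per walk."""
    return k_per_bb(strikeouts, walks) >= 2


# Hypothetical season lines.
print(hitter_passes(strikeouts=90, walks=60))    # 1.5 K per BB
print(hitter_passes(strikeouts=150, walks=40))   # 3.75 K per BB
print(pitcher_passes(strikeouts=180, walks=50))  # 3.6 K per BB
print(pitcher_passes(strikeouts=100, walks=70))  # about 1.4 K per BB
```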
BBWAA
Baseball Writers Association of America. The group (often spoken of derisively by yours truly) that votes on the MVP, Cy Young, and Rookie of the Year awards and inducts players into the Hall of Fame. Referred to recently (and more accurately) as the BBRAA -- the Baseball Reporters Association of America. Television and internet commentators need not apply.
Career Year
One year, which most players have, when they play above their established levels of production. Most everyone has one, just as most people will have off-years (the opposite of a career year). Most career years are simply the result of chance and good luck and are rarely the sign of any meaningful change. They are, however, usually mistaken for a sign that the player has undergone a miraculous improvement. I could mention dozens -- perhaps hundreds -- of examples of players who had a career year that the club mistook for a new, higher level of production.
You'd think that after being proven wrong so much, otherwise intelligent baseball men would wrap their heads around this concept. But it hasn't happened yet.
Examples from 2006 include Alfonso Soriano, Gary Matthews, Jr., Mark DeRosa, and a few others.
When in doubt, remember the concept of the Age 27 peak. Also consider the type of player to see if they really have ascended to a new level.
CBA
The Collective Bargaining Agreement. Ever since Marvin Miller came along and brought baseball's labor relations into the 20th century, labor and management have lived by the terms negotiated in the CBA. Whenever a CBA expires, it is usually a chance for labor (or management) to demand harsh terms for renewal. Every CBA renewal in baseball history led to a work stoppage of some sort until the 2002 negotiations were completed at the last minute. The 2006 negotiations were historically peaceful.
The terms of the CBA dictate all baseball policy that affects management and the players, from the amateur draft to the pension plan.
Clutch
The concept that certain players perform better under pressure conditions. Thrown about to label pretty much every player in the game as either "clutch" or "not clutch."
It could be said that the cause celebre of sabermetrics is to refute the idea of clutchness. Some of the earliest sabermetric studies looked for evidence of clutchness (they didn't find it), and some of Bill James' early work was to look for evidence that certain people consistently performed better in the clutch than others (he couldn't find it).
Later, more sophisticated research has indicated that there is a certain level of clutch performance that can't be explained by pure chance. That level is somewhere between 2 and 20%, depending upon whom you ask. But even the most optimistic sabermetric studies indicate that good (or bad) clutch performance is due more to chance than skill. In fact, much of the magical "clutch" performance is probably just good situational hitting; changing your approach to fit the situation.
But the idea of clutch is one of the most treasured in all of mainstream sports. The very mythology of baseball was written with clutch as its ink; almost every baseball film or glorious baseball memory has to do with someone "coming through in the clutch." It's a fantasy that the public (and the media) will not easily be disabused of.
But it doesn't hurt to try.
Davenport Translations
Clay Davenport of Baseball Prospectus specializes in "translating" stats; that is, removing all situational and contextual biases to provide a "true" sense of the player. Clay translates minor league stats to reflect what they would be in the majors; he translates stats from one ballpark or league into another; he can even translate the stats of a player in the Japanese League to the majors. His translations are at the heart of the PECOTA system.
Referred to as "DTs" for short.
Defensive Spectrum
An idea put forward by Bill James to establish the idea of defensive skill as a spectrum from most difficult to least difficult. It reads thus:
C -- SS -- 2B -- CF -- 3B -- RF -- LF -- 1B -- DH
Some have argued over certain aspects of it (whether to switch center field and third base), but the basic concept is still in use today. You'll often hear players referred to as being from the "right side" of the defensive spectrum, meaning that of little skill, or value. Therefore players from the left side are more valuable. It's rather like mounting a horse: the left side is the right side, and the right side is the wrong side.
James observed that, as players age, they drift from the left to the right side of the defensive spectrum. This holds true in almost every way. In fact, you can almost trace Pete Rose's career from a young second baseman to an old first baseman, hitting most every stop along the way.
James also remarked that a player is rarely moved left on the defensive spectrum. The teams that have tried often meet with disastrous results.
A great deal of discussion has raged recently over where the catcher's position should be on the spectrum. This may be worth an entry of its own, but there's been a great deal of research lately trying to prove the theory that the catcher is the most valuable defender on the field. Now, it's hard to argue against the idea that catching is the roughest position on the field, or that catchers have short careers. Therefore, a person who can man catcher is more valuable, since that position experiences higher attrition.
But the other aspects of catcher defense -- mainly that a catcher has a positive effect on a pitching staff -- are amazingly elusive to researchers. Recent research by a number of sources couldn't find any evidence of a catcher's effect on a pitching staff. Try though we might, we're coming more and more to the conclusion that the catcher isn't nearly as valuable as the shortstop.
But, since the concept of a catcher's defensive superiority is central to our understanding of baseball, it would be wise not to make up our minds just yet. Obviously, just because we can't find something doesn't mean it's not there. But this is something that merits further study. And from what I've seen, we may have to drastically revise the way we view a catcher.
GG
Gold Glove. Refers to a fielder who is one of the best -- if not the best -- defenders at their position. Our understanding of defense is still quite incomplete, and what wisdom we do have is rarely utilized by those who vote for the awards. So Gold Glove winners aren't necessarily the stars they're made out to be.
HOF
Hall of Fame. Also, HOF-er for Hall-of-Famer.
LAIM
League-Average Innings-Muncher.
Acronyms are all the vogue, and they are admittedly fun to use. LAIM refers to a pitcher who isn't really that productive, but whose chief quality resides in "munching" innings and keeping his team in games.
This term isn't meant as an insult. A LAIM has value; being average is still better than replacement-level, and guys who can throw 175-200 innings with a league-average ERA are few and far between.
While Greg Maddux used to be an elite pitcher, now he's become just another LAIM.
League Average
When looking at any raw stat, it's important to keep in mind what the league average is and compare a player's performance to that. See the above entry on "ERA+" to see why Pedro Martinez's 2000 ERA was better, in context, than Bob Gibson's 1.12 mark in 1968.
Matt Nokes had a .536 slugging percentage in 1987 as a catcher with the Tigers. In 1988, it fell to .424. Did Nokes experience a major power outage? Not exactly:
1987: Nokes' SLG -- .536; AL average -- .425
1988: Nokes' SLG -- .424; AL average -- .391
Nokes' power outage certainly happened, but part of the drop belongs to the league rather than to him. The AL average fell 34 points, which is a very big shift for one season.
If a player has an abnormally good or bad year -- or experiences a sudden increase or decrease in productivity -- first make sure that it's really the player that's changing and not the league.
And of course, always account for the league average when looking at any raw stat. One of the last NL players to hit .400 was Rogers Hornsby, who hit .403 in 1925. Are all the players since then pansies (Hornsby would think so)? The NL hit for a .292 average in 1925; Hornsby was 111 points above average. In 2006, the NL hit .265. Freddy Sanchez hit .344, 79 points above average. Is Sanchez really just 30-odd points away from becoming Rogers Hornsby? Not exactly; he had a "career year" (see above). But Hornsby doesn't look so immortal when viewed in context.
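Here is a quick sketch of that comparison in Python, using only the numbers quoted in this entry. The function name `vs_league` is my own invention, and "points above average" plus a simple ratio are just two rough ways to put a raw stat in context, not an official metric.

```python
def vs_league(player: float, league: float) -> tuple[int, float]:
    """Return (points above league average, ratio to league average)."""
    points = round((player - league) * 1000)  # 1 "point" = .001
    ratio = player / league
    return points, ratio


# (player's mark, league average), all figures taken from the entry above
seasons = {
    "1987 Nokes SLG": (0.536, 0.425),
    "1988 Nokes SLG": (0.424, 0.391),
    "1925 Hornsby AVG": (0.403, 0.292),
    "2006 NL leader AVG": (0.344, 0.265),
}

for label, (player, league) in seasons.items():
    points, ratio = vs_league(player, league)
    print(f"{label}: {points:+d} points, {ratio:.2f}x league average")
```

The ratio is handy because a 30-point gap means more in a .265 league than in a .292 league.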
LOOGY
Left-handed One-Out GuY. It refers to the practice, now endemic to baseball, of carrying a left-handed reliever in the bullpen who is used to pitch to a single left-handed batter and then removed from the game. The practice was popularized by a young Tony La Russa in Oakland, but now almost every team has one.
The concept of the LOOGY is often criticized; yes, left-handed pitchers perform better against left-handed hitters, but that's such a broad generalization that it shouldn't require us to breed a new species of relievers. It also serves to make games longer and much more boring, as the manager now uses three pitchers (R-L-R) whereas in the past he would use one (R). It has also contributed to the ever-growing bullpens, some of which have reached 13 men!
The LOOGY is an example of a good idea gone too far; although it did have the added benefit of giving Jesse Orosco a job after his 46th birthday. Luckily, the concept of the ROOGY hasn't caught on yet.
Mallpark
Pejorative term for the modern ballparks, which are often a part of broader "entertainment complexes," signalling the death of the Wrigley-era "neighborhood ballparks." It also refers to the absurd amount of cash owners make from ballparks, most of which are funded and subsidized with public money.
Old Player's Skills
Another Bill James concept. James observed that players with a certain skill set (walks and power, but with a low batting average and little athleticism) tended to lose their value much faster than players of a different skill set (high batting average but relatively little power or "patience," good athleticism resulting in strong defense and/or stolen bases). James noticed that as a player aged, their skills tended to evolve from the more athletic, average-driven game ("young player's skills") to the slower approach typified by walks and homers ("old player's skills").
James noted that if a young player arrives in the majors already displaying "old player's skills," they're not likely to have a long career. In fact, many big-time prospects have flamed out in precisely this manner (Kevin Maas, Pete Incaviglia). Any player who possesses the dreaded "old player's skills" is not a safe investment after age 30 and might even be out of the majors by 35. On the other hand, players who retain their "young player's skills" can stay in the game into their 40s (Rickey Henderson, Craig Biggio, Pete Rose).
If your list of comparable players includes any members of the "old player's skills" Hall of Fame -- guys like Greg Luzinski, Ron Kittle, Nate Colbert, and Jim Gentile -- plan on drawing your pension before your 40th birthday.
And yes, I'm talking about Adam Dunn.
PEDs
Performance-Enhancing Drugs. I already mentioned these in the Mark McGwire comment, but these can refer to steroids, nutritional supplements, amphetamines, etc. -- anything that gives a ballplayer an extra, artificial edge over his opponents.
All steroids are PEDs, but not all PEDs are steroids. The two terms are often used interchangeably, but they're not the same (see Will Carroll's book The Juice). Human Growth Hormone is not a steroid, but a substance produced naturally by the body.
The line between an illegal PED (such as a "greenie" or amphetamine) and a legal PED (caffeine) is uncomfortably thin.
Peripherals
Analysts will often refer to a pitcher's "peripheral" stats; they are referring to his walks, strikeouts, and home runs allowed. Since these are "peripheral" in the mainstream consciousness compared to Wins and ERA, they are referred to as such. But, as the research of Voros McCracken and others has illustrated, they are anything but secondary. (See "Three True Outcomes.")
Postseason Halo Effect
This is the only entry of my creation -- so far (I'm trying). The concept isn't new, but my term for it is. The Postseason Halo is similar to the career year, but is more enduring and built of less substance.
The idea is that a player who plays a key role in the postseason will enjoy a postseason halo that obscures their shortcomings for months, or even years to come. One could also call this the Don Larsen effect, after the Yankee pitcher who threw a perfect game in the 1956 World Series but was never really good in the regular season again. Nonetheless, his career lasted for 14 seasons, until 1967. (To be fair, Larsen enjoyed some good seasons later in his career after converting to relief.)
Another example could be Scott Brosius, who combined his Postseason Halo with his Yankee Halo (needs no explanation) to earn himself a totally unmerited spot on the Hall of Fame ballot.
The concept is related to the idea of "clutch," with the theory being that if you can win in October, then you must be really good. But good players in October are no less lucky than good players in April.
David Eckstein is probably the best example from the past postseason. Bobby Jenks is a candidate from 2005. And the Boston Red Sox' unlikely championship in 2004 contributed at least half a dozen postseason halos (Kevin Millar, Derek Lowe, Orlando Cabrera).
NOTE: It's not a postseason halo if the player really is that good (Albert Pujols, Manny Ramirez).
Regression to the Mean
This is a statistics principle that states that an extreme performance -- good or bad -- is likely to regress back to the average in the future. This is true in baseball; MVPs rarely repeat their performances in the following year. Teams rarely win (or lose) 100 games for several consecutive years.
There are many other factors that determine a baseball team's performance, of course, but a strong dose of regression must be taken into account.
Replacement Level
This is the backbone of the advanced VORP and WARP statistics and one of the more recent additions to the baseball ideology.
For years, many metrics compared a player to the league average. This was nothing new, as it was common to classify someone as above- or below-average. But recent studies have stressed that the baseline shouldn't be set at average. As we've seen, a below-average player is not just valuable on the field, but valuable in terms of salary. A player should be compared to the replacement level -- that is, their contributions are only as important as they are difficult to replace.
This isn't just a cosmetic difference. If you're consistently below-average, then an average-based metric (such as FRAA) would rate you in the negative numbers. But you can be below-average and still be valuable; a player shouldn't be punished for being consistently below average. The concept of "average" doesn't really agree with our concept of value, so we'll have to look elsewhere.
There is, across baseball, a pool of readily-available talent that can be had for a minimal amount of money (the waiver price, the league minimum). If your performance drops so low that it can be replaced by an easily-available, low-price alternative, then you are no longer valuable to the team. So our basic definition of "valuable" is how difficult you are to replace.
And so we introduce (and try to measure) the "replacement level." The replacement level differs by position. Offensively, it's much higher for a first baseman (easy to find a replacement) than it is for a shortstop (hard to find one). This also works on the positive side of the ledger. A great-hitting shortstop is more valuable than an equally great-hitting DH, simply because it's a lot harder to find a great-hitting shortstop. Oddly enough, although this is a new concept to sabermetrics, the free agent market has understood this for years. Good-hitting center fielders make a lot more money than good-hitting first basemen. And the biggest free agent contract of all time was given to a shortstop who hit like an MVP (A-Rod).
I don't do the concept justice here. It's best explained in the book Baseball Between the Numbers in the chapter: "Why is Mario Mendoza so important?" That's the basic idea: picture the Mendoza line (a .200 average), but draw it out for a player's entire offensive game. If you're above that line, you're worth having around. If not, you can be replaced.
Teams consistently make mistakes by paying big money to players who aren't far above replacement level. The most gratuitous example this offseason might be Gary Matthews, Jr.
Sample Size
Another concept from statistics. The idea is that the larger the sample you draw your data from, the more closely it will reflect the underlying reality.
Let's say I flip a coin ten times. It comes up 10 heads. Do I have an other-worldly coin? Of course not, you say; 10 flips doesn't prove anything. I agree. So why do we think that a hitter who goes 8-for-10 against a pitcher "owns" him?
Baseball men make mistakes every day owing to small sample size. They usually happen early in the season, when you only have a month or two of data to draw from. Our star player hits .200 in April? What's wrong with him? Let's try benching him or giving him extra batting practice on Mondays. Then we can fix what's wrong with him!
Of course, the idea that nothing's wrong with him isn't usually mentioned. This error is more often made in newsrooms. Granted, beat reporters make their living by taking what happens in one game and looking for the meaning of life. But no truly colossal decision should be made upon 10, 25, or even 100 at-bats' worth of data. Because what you're looking at might not be a true reflection of reality; it could just be a bad time to be flipping coins.
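To put rough numbers on the coin-flip idea, here is a small Python sketch. It treats every flip or at-bat as an independent trial with a fixed chance of success (a plain binomial model), which is a simplifying assumption on my part; real at-bats aren't that tidy. The "true" .300 hitter and the 80 at-bats in April are made-up figures for illustration.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability of k or more successes in n trials."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def prob_at_most(k: int, n: int, p: float) -> float:
    """Probability of k or fewer successes in n trials."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


# Ten heads in ten flips of a fair coin: 1 chance in 1,024.
print(prob_at_least(10, 10, 0.5))

# Hypothetical: a true .300 hitter gets 80 at-bats in April.
# How often does he hit .200 or worse (16 hits or fewer) by luck alone?
print(prob_at_most(16, 80, 0.300))
```

The second number is small but far from zero, and with a few hundred regulars in the league, somebody good is going to have that April every year.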
Similarity Scores
What started as more of a toy than a tool is now a key part of performance analysis. It's fun to take one player's stats and see how closely they compare with another's. It's also neat to take a player's stats and see which players in history they match up with. (Baseball-Reference.com has the 10 most similar players for every major leaguer in history).
But -- as PECOTA has shown us -- when you know who a player is really similar to, you can start to project his future. If a player's top-10 list is full of guys who suffered injuries and were out of the majors by age 30, that's a pretty big red flag.
It should be noted that most similarity scores make no adjustment for era. A player with 400 home runs is similar to another player with 400 home runs, whether they're playing in 1966 or 1996.
Stolen Base Break-Even Rate
It's long been accepted baseball wisdom that a caught stealing is more harmful than a stolen base is helpful. So you have to do better than 50/50 in stolen bases to actually provide value to your team.
The general rate to break even in stolen bases is 70%. If you're successful less than 70% of the time, then you're hurting the team rather than helping it. However, the breakeven rate changes depending upon the offensive environment. In a high offensive environment (Arlington), it's not worth stealing a base when you're likely to score anyway. The out is much more harmful than the base is helpful. However, in a low-run environment, outs are plentiful, so it's a lower risk to try and take the valuable extra base.
So the break-even rate ranges from about 60% to 80% at the far extremes. However, in general, 70% is a rough benchmark.
It's interesting to apply this principle to most basestealers and see the results. Almost every newspaper in America reports stolen bases, and when players are mentioned, they're usually mentioned with their raw number of steals. But that's as useless as knowing how many hits someone has without knowing their batting average. Stealing 20 bases is great if you're Carlos Beltran and can manage to go 20-for-22 or something. But if you've gone 20-for-30, then you're just hurting your team.
Even the most prolific basestealers rarely succeed often enough to make a really significant difference for their team. In 1982, Rickey Henderson set a new record by stealing 130 bases in a season. Every "serious" baseball fan knows that. But he was also caught 42 times, for a success rate of 76%. Now, 76% isn't bad. In fact, given that 1982 was a lower run-scoring environment, it's safely above the break-even rate. But even if we put the rate at 67% (roughly), every time Rickey was caught stealing would cancel out two stolen bases. So when you take the CS into account, Rickey only gained a net total of 46 bases (130 minus 2 x 42). Wait a minute -- Rickey Henderson, the greatest base-stealer of all time, was only worth 14 home runs worth of stolen bases at his all-time best?
Well, yes, roughly. And 14 home runs is a hell of a lot, especially when you're already an MVP like Rickey was. But if you look at any basestealer in the modern game and think they're making a major dent in terms of runs scored -- compared to the sluggers and true hitters -- you're just fooling yourself.
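For anyone who wants to check the stolen-base math, here is a rough Python sketch. It assumes the simplification used in this entry: a caught stealing wipes out a fixed number of stolen bases (`cs_cost`, a parameter name I made up), which puts the break-even rate at cs_cost / (1 + cs_cost). Actual break-even rates come from run-expectancy tables, which this doesn't try to model.

```python
def success_rate(sb: int, cs: int) -> float:
    """Share of steal attempts that succeed."""
    return sb / (sb + cs)


def break_even_rate(cs_cost: float) -> float:
    """Success rate at which steals and caught-stealings cancel out,
    if one caught stealing costs `cs_cost` stolen bases."""
    return cs_cost / (1 + cs_cost)


def net_steals(sb: int, cs: int, cs_cost: float = 2.0) -> float:
    """Stolen bases left over after charging for every caught stealing."""
    return sb - cs_cost * cs


# Rickey Henderson, 1982: 130 steals, 42 times caught
print(f"success rate: {success_rate(130, 42):.0%}")    # 76%
print(f"break-even:   {break_even_rate(2.0):.0%}")     # 67%
print(f"net steals:   {net_steals(130, 42):.0f}")      # 46

# The 20-steal examples: 20-for-22 versus 20-for-30
print(net_steals(20, 2), net_steals(20, 10))           # 16.0 0.0
```

Bump `cs_cost` up to 3 and the break-even rate becomes 75%; drop it to 1.5 and it's 60%, which is roughly the range between the high-offense and low-offense extremes.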
Three True Outcomes -- Voros McCracken
I've already mentioned McCracken and his ground-breaking claim that a pitcher has almost no control over the balls put in play against him. It doesn't matter if you're Greg Maddux or Greg Jones -- once the ball leaves the bat, it's up to the defense. And while this isn't literally true, it's more true than we ever knew, and it changed the way we view pitchers.
And so, given how little control pitchers have over balls in play, we were left to consider the Three True Outcomes that a pitcher does control: walks, strikeouts, and home runs. I've already mentioned that these are the three most important basic statistics of any pitcher anywhere.
But some fans took a liking to the idea of Three True Outcomes. In fact, the Rob Deer fan club soon sprang up to honor those hitters who give opposing defenses absolutely nothing to do. In an 11-year career, Deer came to the plate about 4,500 times. In those trips, he hit 230 home runs, drew 575 walks, and struck out 1,409 times. That comes to a Three True Outcomes % of nearly 50%, which is phenomenally high. The opposing defense was irrelevant nearly 50% of the time Rob Deer came to bat.
There's nothing especially good or bad about being a 3TO sort of guy, although they tend to be "old player's skills" types. But some statheads get a kick out of those that turn baseball into a game of 1-on-1. Deer's heir apparent in today's game is either part-timer Russell Branyan or our old friend Adam Dunn.
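The Three True Outcomes percentage is simple enough to compute yourself. A minimal Python sketch, using the Deer totals quoted above and the rounded figure of 4,500 plate appearances (the function name `tto_rate` is just mine):

```python
def tto_rate(hr: int, bb: int, so: int, pa: int) -> float:
    """Share of plate appearances ending in a homer, walk, or strikeout."""
    return (hr + bb + so) / pa


# Rob Deer's career totals as given in this entry (PA is approximate)
print(f"{tto_rate(230, 575, 1409, 4500):.1%}")  # 49.2%
```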
TINSTAAPP
There Is No Such Thing As A Pitching Prospect.
This is a relatively new sabermetric truism designed to confront people with the harsh realities facing pitching prospects. The difficulty of predicting a young pitcher's performance, combined with the high probability of debilitating or career-ending injury, makes it very dangerous to count those pitchin' chickens before they hatch.
Walk Year
The term reserved for the last year of a player's contract before they reach free agency. Conventional wisdom was that players tended to try harder in their walk year -- stay in the lineup and play through minor injuries -- to make their big payday. It doesn't sound that significant, but a study in Baseball Between the Numbers showed that players in their walk year did tend to do better than expected.
So GMs should be aware of this when signing free agents. And players should try their hardest to make sure that their "career year" coincides with their "walk year." They will become filthy rich, although they probably won't make their future employers very happy (Adrian Beltre).
Win Expectancy
See above: WXRL. Win Expectancy measures the "state" of a baseball game and determines the likelihood of a team's scoring a run or winning the game based on the current state. Measuring a change in the state of the game -- and the players who bring about those changes -- can be very enlightening. It can also tell us a lot about when stealing and sacrificing really do help you win games -- and when they're wasted outs.
Win Nexus
I've already dedicated one blog entry to this. But this concept (known by many other names) is that certain wins are more valuable to a team than others; therefore, they can be excused for spending more to achieve them. The Win Nexus is roughly between 87 and 92 wins -- depending on the league and division -- where anything below the nexus is out of the playoffs and anything above is in. The difference between making and not making the playoffs represents so much income for a team that they can understandably overpay to achieve those wins.
It's important to know which side of the Nexus you're on. Many teams -- the current Chicago Cubs, for example -- think they're much closer than they really are, and end up spending money for wins that are, as it turns out, irrelevant (the Cubs will thank Alfonso Soriano for bringing them all the way to 75 wins).
More to come, as I will be updating and improving this glossary in the future.
