Monday, March 27, 2006

Baseball Prospectus 2006

I recently picked up a copy of Baseball Prospectus 2006, the annual book of analysis and predictions put out by the staff of Baseball Prospectus, the premier online baseball "think tank." The book is fascinating and overwhelming at the same time; it holds so much information and wisdom that it takes a while to digest. I'm only about halfway through it right now, but couldn't resist stopping to discuss it.
  • Having just made my own preseason predictions, it's a nice feeling to get reassurance from the book. While I'm not always right (an understatement I'll get back to in a minute), many of my ideas -- sprung not from the mainstream (un)consciousness but from my own head -- have some merit. While my predictions on individual players are hit and miss, my analysis on a team level seems to be pretty accurate. I haven't come across a general team prediction in the book that I strongly disagree with -- it's usually just a matter of degree. So it's a mini ego boost.
  • Now that I've said that, let me just say how very humbling it is to be confronted with such superior information and analysis. BP has access to a lot of info and stats that I don't, it's true, but I keep coming across mistakes and inaccuracies in my thinking that are just the result of poor logic or laziness. I'm not going to be making the staff of BP anytime soon -- shocking, I know -- but hopefully I can just soak up as much of this as possible and do better next time.
  • The book does a brilliant job of combining statistical analysis with traditional scouting. As much as BP has assumed the mantle of the stathead/Moneyball movement, the writers shy away from the extremes of iconoclasm and scout-hating that typify many neo-sabermetricians (myself included). This is not to say that they're just willing to accept someone's word for it. They hold everything up to the reasoning power of evidence; it's just that not all of that evidence is numbers. Much of it is the anecdotal information provided by scouting. Put the two together and you have a sense of realism that is superior to any other examination of the game that I know of.
  • The key difference that I come across in this book is that so much of it is concerned with predicting the future, especially the future of players currently in the minor leagues. Much of my own stats work has been with analyzing the past and the present -- not with predicting the future. It's much easier to say who is good or who used to be good than it is to say who will be good. It takes another set of skills and standards entirely to project the career of a young baseball player. I was only moderately aware of these standards in the broadest sense, so hopefully this book will help me improve in this regard.

But the most impressive part of the book is the PECOTA forecasting system. Simply put, the statheads at BP have developed a system for predicting a player's future performance. This system takes into account everything: playing circumstances, age, injury history, usage patterns, etc. What it then does is find similarities among players of the past. By looking at every player in the history of the game, patterns emerge. So while we don't know what Joe Smith will hit in his age 30 season, we can find the 100 most similar players to Joe and find out what they hit at age 30. This is essentially what PECOTA does: it digs through the thousands of player seasons in history to find similar players and determine trends.
For example, I've always known that, generally speaking, a player peaks at age 27, with his top performance coming from 25-29. Players tend to improve pretty quickly before 25, then decline slowly after they turn 30. After 35, the decline accelerates, and past age 40, they drop like flies.
This is generally true, but generally true is pretty useless when looking at an individual. If 57% of players (I made up the number) peak at age 27, it also means that 43% do not. So which is Joe Smith?
PECOTA narrows down the "type" of player to find a more exact match and determine a more specific career path. If Joe is a catcher, he's going to see his decline come much earlier. If he has "old player skills," i.e. just walks and homers, he's likely to decline soon. If he has more "young player skills," i.e. just batting average and speed, he can be counted upon for more. If he's playing a skill position, such as catcher or shortstop, he will likely have to be shifted elsewhere as he gets older, if he's a good enough hitter to justify keeping himself in the lineup. PECOTA takes all of this into account.
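To make the "find the most similar players" idea concrete, here is a minimal sketch in Python. It is not PECOTA itself, whose details the book does not spell out here; the `Season` record, the two made-up skill numbers, and the plain straight-line distance are all assumptions chosen for illustration, and the players are invented.

```python
from dataclasses import dataclass
from math import dist
from statistics import mean


@dataclass
class Season:
    name: str
    age: int
    skills: tuple[float, ...]  # hypothetical, pre-scaled skill measures
    next_avg: float            # batting average in the following season


def most_comparable(target_age, target_skills, history, k):
    """Return the k historical seasons at the same age closest to the target."""
    same_age = [s for s in history if s.age == target_age]
    return sorted(same_age, key=lambda s: dist(s.skills, target_skills))[:k]


def forecast_avg(target_age, target_skills, history, k=100):
    """Average what the k most comparable players hit the next year."""
    comps = most_comparable(target_age, target_skills, history, k)
    if not comps:
        raise ValueError("no comparable seasons at this age")
    return mean(s.next_avg for s in comps)


# Invented history: (walk rate, home run rate) stand in for a skill profile.
history = [
    Season("Player A", 29, (0.10, 0.05), 0.290),
    Season("Player B", 29, (0.11, 0.05), 0.280),
    Season("Player C", 29, (0.04, 0.01), 0.250),
    Season("Player D", 35, (0.10, 0.05), 0.240),
]

# "Joe Smith" at age 29, forecast from his 2 nearest comparables.
joe_forecast = forecast_avg(29, (0.10, 0.05), history, k=2)
print(round(joe_forecast, 3))
```

The real system presumably weighs far more than two numbers (position, injury history, and so on), but the shape is the same: narrow the pool to players of the same "type," then look at what actually happened to them.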
The BP writers also list 4 different factors for each player (Breakout, Improve, Collapse, and Attrition), with a percentage next to each. So a player's line might read like this:
Moises Alou: Breakout (11%), Improve (32%), Collapse (34%), Attrition (19%).
Breakout represents the likelihood a player's performance will increase 20% above his weighted performance over the past 3 seasons. So there's an 11% chance (pretty low) that Alou, at age 39, will be 20% better than he was over the past 3 years.
Improve represents the likelihood that a player's performance will improve at all over his weighted performance over the past 3 seasons. Alou scores 32%, not awful, but it means he's more likely to get worse than get better, which makes perfect sense.
Collapse represents the likelihood that a player's performance will decrease by at least 25% from his weighted performance over the past 3 seasons. Alou has a 34% chance of collapse, which isn't terrifyingly high, but it is high enough to be dangerous; again, not unusual for a 39-year-old.
Attrition is a different sort of measure entirely. It represents the likelihood that a player's plate appearances (or innings pitched) will decrease by at least 50% relative to his weighted performance over the past 3 seasons. This could be due to injuries, age, or several other factors. It's a pretty darn important thing to know if one of your stars is suddenly going to be limited to 80 games. Most young position players, if they don't have a history of injury, will score about 1-2% attrition. Alou scores 19%. It's not as high as a has-been marginal guy like Marquis Grissom (49%), but still uncomfortable.
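One way to picture where those four percentages could come from is to count how a player's comparables actually fared. The sketch below assumes exactly that, which is my guess and not something the book confirms: each comparable is reduced to two hypothetical ratios against his 3-year weighted baseline (performance and playing time), and each factor is just the share of comparables that crossed the threshold described above.

```python
def outlook(comps):
    """Share of comparables crossing each threshold.

    comps: list of (perf_ratio, playing_time_ratio) pairs, where 1.0 means
    "same as the weighted baseline of the past 3 seasons" (an assumed encoding).
    """
    if not comps:
        raise ValueError("need at least one comparable")

    def share(predicate):
        return sum(1 for c in comps if predicate(c)) / len(comps)

    return {
        "breakout": share(lambda c: c[0] >= 1.20),   # at least 20% better
        "improve": share(lambda c: c[0] > 1.00),     # any improvement at all
        "collapse": share(lambda c: c[0] <= 0.75),   # at least 25% worse
        "attrition": share(lambda c: c[1] <= 0.50),  # playing time cut in half
    }


# Five invented comparables for an aging outfielder.
comps = [(1.25, 1.0), (1.05, 0.9), (0.95, 1.0), (0.70, 0.4), (0.90, 0.45)]
result = outlook(comps)
print(result)
```

Note that the categories overlap: every breakout is also an improvement, and attrition is measured on playing time, not performance. That is why a line like Alou's doesn't have to add up to 100%.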
With this in mind, the 2006 guide lists PECOTA predictions for every player in 2006 -- adjusted for the home ballpark they will presumably play in -- as well as refined measures of quality that are park-adjusted. Here are some random examples of the BP guide's PECOTA predictions for 2006:

Andruw Jones: 276/353/525, 35 HR, 2 FRAA (Fielding Runs Above Average)
BP predicts that while Andruw isn't going to hit 50 HR again, he will keep up a strong level of production. But he is also no longer an elite center fielder -- 2 years ago, Jones produced an ungodly 19 FRAA in center.
Derrek Lee: 298/383/570, 38 HR
BP thinks that 2005 was a career year for Lee, but also thinks that he has ascended to another level as a hitter. I was hesitant to say so -- PECOTA thinks Lee will hit far above his career numbers next year -- because Lee was 29 last year, an odd (but not impossible) time to ascend to superstardom. BP also lists a player's 3 most comparable players -- the 3 players from history most similar to this player at the player's age. So while it's great to be a 19-year-old Dwight Gooden, not so much if you're 30. But Lee's 3 most comparable players are Dave Winfield, Cliff Floyd, and Eric Karros. All good players, with Winfield especially a star long after he turned 30.
A.J. Burnett: 3.76 ERA, 201.1 IP, 72 BB : 170 K
Although BP also thinks that Burnett's contract was profligate, they're much more generous in their predictions for next year -- thinking him less likely to get injured than I do.
Alex Rodriguez: 298/391/576, 43 HR, 0 FRAA
A-Rod will still be a superstar, but will trend slightly downward -- perfectly natural for an MVP to regress a bit. A-Rod's defense at third isn't quite as good as its reputation, but who really gives a damn when he hits like that?
A-Rod's 3 most comparable players? Eddie Mathews, Dave Winfield, and Frank Robinson. Not only are they 3 Hall-of-Famers, but 3 guys who kept producing long after they turned 30. This is good news for someone with A-Rod's skill set, but not exactly a surprise.

Perhaps the best part of it is that the staff writes with a straightforward, unpretentious style. They're married to the facts and the results, not to their own egos. They're not above mixing in some humor as well, as sampled here:

  • And I thought I was hard on the Reds: "The Cincinnati player development system is worthy of the term 'farm' only in the sense that the Stalinist collectives of the 1930s Ukraine were a 'farm system.'"
  • Talking about Reds slugger (and strikeout king) Adam Dunn: "The Brobdingnagian slugger can't carry the team to a pennant if every angry peasant in the Ohio Valley is poking him in the ankles with their pitchforks because they think he strikes out too much."
  • A lesson in park factors in the minor leagues: "The right field wall in Asheville is closer to the plate than this book is to you right now."
  • On a Florida prospect: "Johnson is so tall that the Marlins have replaced the button on his cap with a winking red beacon lest low-flying aircraft run into him."
  • This isn't funny, but it's the honest truth: "Few teams in history have been as consistently bad, with as few extenuating circumstances, as the 21st-century Royals."
  • On the Royals' overestimation of SS Angel Berroa's skills: "The Royals consider Berroa to be the anchor of their defense, which is true on a purely literal level."
  • On the platoon split of left-handed pitcher Mike Myers: "Mike Myers is limited to being a one-out guy; bring a righty pinch-hitter into the game and he kneels before Zod."
  • On slow-footed Jason Giambi: "Giambi runs as if knee-deep in lobster bisque and at top speed can stretch a double into a close out at first base."
  • On the idea that the Yankees need a "traditional leadoff man" i.e. pure base-stealer, instead of Derek Jeter at the top of the order: "It's rank stupidity . . . The suggestion seems to be that the Yankees would be better with a leadoff man who stole more bases but scored fewer runs. Only in professional sports journalism, where many of the writers makes Judith Miller look like a paragon of evenhanded, informed reporting, could that make sense."

I've focused on the funny, off-the-wall insights, but once I finish the book, I'll take a look at the many interesting ideas and spot-on analyses the BP staff express about different teams in the book.
