Monday, October 03, 2005

My stats background

If you are new to me and my blog (as most of you are), I want to share a short account of my background in baseball analysis. This is not just a biography (although it is partly that), but a way to let you know how I approach my work, so you can get a sense of where I come from and what my particular biases or advantages may be.
I've been a baseball fan as long as I can remember. I've just always enjoyed the game on a basic level, apart from statistics, as a game to be played, watched and enjoyed. I played baseball in my backyard in Somerset, KY all through childhood. I played in my hometown all the way through Little League before having to quit due to my theatre commitments (which is another blog entirely).
My first interest in baseball statistics is rooted in the 1986 Fleer baseball card set. If I had a prized possession as a precocious little 5-year-old, it was the complete set of 1986 Fleer that my brother and I collected. Oh, we had Topps as well, and even a few Donruss, but it was the Fleer that stuck out in my mind.
This would be the first evidence of my mania for organization and understanding. My brother and I would go through every team (Fleer being the only company that organized its sets by team, though why the others did not, I have no idea) and make out starting lineups based on who we thought the best players were. We'd pick the hitter at each position with the best batting average, and the pitcher with the most wins. But after a while, we altered this system. We noticed that some utility guys hit .350 in 12 games, and they certainly weren't what we had in mind. Brad Komminsk wasn't going to take Dale Murphy's place in my lineup, by God. So we instituted a 100-game minimum. We also started to notice that ERA was maybe as important as wins (and we had to take losses into account as well). That's where my stats mind started. The '86 Fleer set still rests in the cupboard with thousands of other cards. It is not in any sort of mint condition, but it holds a special place in my heart. And my brother and I can still recognize the names of even the most random players from the '86 set (Jeff Dedmon pitched for the Braves, and he's holding his glove in the picture; Ronn Reynolds was a backup catcher for the Mets, and he's squatting in his picture).
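For the curious, the lineup rule we stumbled into can be sketched in a few lines of Python. This is only a rough illustration of the idea, not anything we actually wrote down as kids: the `Hitter` class, the `pick_lineup` function, and the player names and numbers are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Hitter:
    # Hypothetical record for one baseball card; the fields are assumptions.
    name: str
    position: str
    games: int
    at_bats: int
    hits: int

    @property
    def average(self) -> float:
        # Batting average is hits divided by at-bats.
        return self.hits / self.at_bats if self.at_bats else 0.0


def pick_lineup(hitters, min_games=100):
    """Return the best batting average at each position among qualifiers."""
    best = {}
    for h in hitters:
        if h.games < min_games:
            continue  # skip the utility guy who hit .350 in 12 games
        current = best.get(h.position)
        if current is None or h.average > current.average:
            best[h.position] = h
    return best


# Made-up players and numbers, purely for illustration.
roster = [
    Hitter("Regular A", "CF", 162, 600, 180),  # .300 over a full season
    Hitter("Utility B", "CF", 12, 20, 7),      # .350 in 12 games
    Hitter("Regular C", "1B", 140, 500, 135),  # .270
    Hitter("Regular D", "1B", 110, 400, 116),  # .290
]

lineup = pick_lineup(roster)
for position, hitter in sorted(lineup.items()):
    print(position, hitter.name, f"{hitter.average:.3f}")
```

Drop the minimum to zero (`pick_lineup(roster, min_games=0)`) and the 12-game utility man takes over center field, which is exactly the problem the 100-game rule was meant to fix.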
Other than the fact that I would sort the cards into alphabetical order and back again, my stats passion stayed dormant for a while. We compiled a complete set from every major card company in 1989 (except Upper Deck, which was more expensive; of course, that means it's worth more now, while the 1989 Score set is less valuable than the gum that came with it). My next foray into stats was when I bought Ken Griffey, Jr. Baseball for my Super Nintendo. It had the real stats of every major league player from the 1993 season. But (I guess due to contracting issues) it didn't have the players' real names. Darren Daulton looked like Darren Daulton and had Darren Daulton's stats, but his name was Rocky Balboa. So I went through and corrected every player (to the best of my knowledge) to the real name, with the help of a well-thumbed 1993 Athlon Sports Annual. I played a full season on that game as the Chicago White Sox, and I think I enjoyed keeping track of my stats as much as playing the actual games. I couldn't get Frank Thomas to break Maris' HR record, but he did hit .400 one year.
My baseball interest was still no more than that of any other fan with a slight interest in stats. And it pretty much stayed that way until 2002. I was getting back into baseball and started compiling lists of statistics myself just for kicks. I also started reading some baseball books. I noticed that my Dad had a book called "Whatever Happened to the Hall of Fame?" by some guy named Bill James. I read it, and it changed my baseball life.
I guess most people can recall the moment they were "born again" as baseball fans, with a Bill James book usually the key. I came along too late for the Abstracts, but Bill's Hall of Fame book was great. I enjoyed the common-sense analysis, the historical appreciation, and especially Bill's straightforward tone and sense of humor. But the thing that got me most was that he gave me new ways to quantify baseball that I had never had before. I was frustrated by having to use the same old HR, Wins, etc. that I had used in 1986. Bill had things like the Black Ink Test, which I calculated myself for all the Hall of Fame hopefuls. Soon I had to have these new baseball books, with Bill's Historical Abstract at the top of the list. He introduced me to Win Shares, a form of measurement that I loved at the time and still enjoy (though to a lesser extent) now. Of course, as some of you may remember, this was about the same time that a book called "Moneyball" came out. I read (devoured) Moneyball. I read it twice. Critics of the book get distracted by the specifics; I saw the essential story of the book, which was that of someone who didn't just take conventional wisdom at face value; he did his own thinking, pioneering a new way of baseball, and using it to great effect. The use of enlightening logic and evidence to dispel nagging old wives' tales about baseball gave me energy. It made me want to do it myself.
Well, to make a long story short, I started keeping track of players' statistics, compiling and updating them myself, halfway through 2003. I'm still doing it now, although I keep learning more and more about how to get wiser. Reading and watching Peter Gammons has given me new insight into the real world of baseball, and Rob Neyer's column kept teaching me new things while it never stopped entertaining me.
Now, I'm entering into the more advanced scientific level of analysis. I consider myself to be pretty good at math; I got an A+ in AP Calculus in high school and got a 5 on the AP Test. But some of this statistical work (the book Curve Ball, Tangotiger's essay on how runs are created, sorting out VORP, WARP1, WARP2, and RCAA and DHK3.5c2) is just mind-boggling.
My approach is to use these stats for a greater understanding, but still stick to the basics (HR, OBP, ERA, etc.) to communicate a basic understanding of my argument. The problem in baseball isn't that people are using the wrong statistics so much as that people's basic understanding of statistics is wrong. That and the fact that many people refuse to even accept logic and evidence make me feel sometimes like Galileo trying to convince the Pope that the earth revolves around the sun. It's not a matter of analysing evidence differently. It's a matter of analysing it at all. If we can just get people to start looking at baseball from an entirely new angle, then the road to the acceptance of OPS, RCAA, VORP and their descendants will be much smoother.
I get a kick out of a .400 OBP. And I get a kick out of seeing my Braves win. I hope I never lose either one.

