The Hall of Stats was conceived because the Hall of Fame voting process has become a political nightmare. A massive backlog of worthy candidates is piling up—some because of association with PEDs (or simply suspicion), but some because voters just don’t realize how good they were. There seems to be a false perception of what the Hall of Fame actually is. It’s not all Babe Ruth, Christy Mathewson, Ty Cobb, and Honus Wagner. For every Walter Johnson in the Hall of Fame there’s a Jesse Haines. For every Hank Aaron there’s a Tommy McCarthy.
Should each player better than Haines and McCarthy get in? No. But a player shouldn’t have to be Babe Ruth—or even Bert Blyleven—to get into Cooperstown.
The Hall of Stats uses a formula called Hall Rating to rank every player in baseball history. Hall Rating combines the value of a player’s peak and longevity into a single number that represents the quality of that player’s Hall of Fame case. It’s not perfect, but there’s a lot to be said for rating all players in history according to the same objective criteria.
There are 220 players in the Hall of Fame based on their MLB careers. According to the Hall of Stats, Blyleven ranks #33 among eligible players. He should have breezed into the Hall, but instead it took fourteen tries. Curt Schilling ranks #47. Jeff Bagwell ranks #55. Kenny Lofton—who received less than 5% of the vote—ranks #97. There’s no reason to keep these players out of the Hall of Fame… if you look at things objectively.
That’s what the Hall of Stats does. It ignores anything that happened off the field. The Hall of Stats takes the current number of players in Cooperstown (220), kicks everybody out, and re-populates itself with the top 220 players according to Hall Rating. What you get is an objective Hall free of politics, grandstanding, and double jeopardy.
Hall Rating is based on Wins Above Replacement (WAR) and Wins Above Average (WAA) from Baseball-Reference. A series of adjustments is made to account for shorter 19th century schedules, greater 19th century pitching workloads, the grueling act of catching, and more. The adjusted WAR component represents longevity while the adjusted WAA component represents peak. They are combined and indexed so that the Hall of Stats borderline is represented by a Hall Rating of 100.
Babe Ruth has a Hall Rating of 399. Blyleven’s is 190. Lofton’s is a robust 132 while Hall of Famer McCarthy’s is merely 28. Of the 220 players in the Hall of Fame, 67 (a bit under one third) do not make the cut for the Hall of Stats.
The Hall of Stats aims to show how run- and win-value statistics can be used to measure a player’s Hall of Fame case. It evolves to reflect the best data currently available (players can be added and removed, meaning this Hall doesn’t cling to its mistakes). The Hall of Stats also visualizes what a “default Hall” would look like if it were populated simply by the numbers. Should numbers be the only arbiter of who gets into Cooperstown? Certainly not. The Hall of Stats is merely meant to serve as a conversation starter. That objective starting point is one thing that’s sorely lacking in the Hall of Fame voting process today.
The Formula
The Hall of Stats is populated by Hall Rating, a mathematical formula based on the Baseball-Reference versions of Wins Above Average (WAA) and Wins Above Replacement (WAR). WAA combines all aspects of a player’s game—hitting, pitching, baserunning, fielding, positional value, and more—and estimates how many more wins that player was worth than an average player. WAR takes that a step further and estimates how many more wins the player is worth than a replacement player. (I wrote an article with more detail about Wins Above Average vs. Wins Above Replacement.)
The precursor to the Hall of Stats was called the Hall of wWAR. wWAR stands for “weighted Wins Above Replacement”, which basically means the formula starts with WAR and applies a series of weights. wWAR is still a big part of the Hall of Stats, but it now has a completely different formula.
wWAR = adjWAR + (1.8*adjWAA)
Before I go into what adjWAR and adjWAA are (and where the 1.8 comes from), I want to explain what Hall Rating is.
Hall Rating is simply wWAR expressed in a more intuitive way (you’ll see Hall Rating displayed on the Hall of Stats, but not wWAR). The Hall of Stats borderline for induction is represented by a Hall Rating of 100. This is similar to how 100 represents league average in OPS+ or wRC+.
With a Hall Rating of 399, you could say that Babe Ruth’s career was worth about four Hall of Fame careers. Meanwhile, Billy Pierce essentially sits on the Hall of Stats borderline with a Hall Rating of 101. Hall of Famer Lou Brock is not included in the Hall of Stats because his Hall Rating is just 71.
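As a rough illustration, the relationship between wWAR and Hall Rating can be sketched in Python. The wWAR formula is the one given above. The linear indexing against a borderline wWAR value is an assumption (the text only says that the borderline maps to 100), and the function names and sample numbers are made up for the example.

```python
PEAK_WEIGHT = 1.8  # weight applied to adjWAA in the wWAR formula above


def wwar(adj_war: float, adj_waa: float, peak_weight: float = PEAK_WEIGHT) -> float:
    """wWAR = adjWAR + (1.8 * adjWAA)."""
    return adj_war + peak_weight * adj_waa


def hall_rating(player_wwar: float, borderline_wwar: float) -> float:
    """Index wWAR so that the borderline player lands at 100.

    Assumption: a simple linear scaling; the real indexing step may differ.
    """
    return 100.0 * player_wwar / borderline_wwar


if __name__ == "__main__":
    # Made-up inputs, not real player data.
    career = wwar(adj_war=60.0, adj_waa=30.0)
    print(round(career, 1))
    print(round(hall_rating(career, borderline_wwar=57.0)))
```

Under this assumption, a player with twice the borderline wWAR would show a Hall Rating of 200, which matches the "worth about four Hall of Fame careers" reading of Ruth's 399.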
adjWAR (Adjusted Wins Above Replacement)
adjWAR attempts to capture the value of the player above a replacement player. It starts with a player’s WAR and undergoes a series of adjustments:
- Position player WAR is adjusted for schedule length. In this case, a hitter gets more credit for a 3.0 WAR season during an 80-game schedule than he does for a 3.0 WAR season in a 162-game season.
- This same adjustment is not given for pitchers, since shorter schedules allowed pitchers to be used more often. The exception is strike- or war-shortened years (where both pitchers and hitters are given an adjustment for how long the schedule would have been).
- It is important to note that this adjustment is made based on the schedule length and not the number of games the player appeared in. A player who appeared in 120 games of a 162-game schedule does not receive any extra credit. But a player who appeared in 120 games of a 140-game schedule would receive some credit—but only for the difference between 140 and 162 games (not 120 and 162 games).
- Players are not given 100% of the credit for games they did not play. Instead, they are awarded the average of their actual WAR and their projected WAR. This keeps us from over-adjusting for 19th century players.
- Catchers receive a generous positional adjustment from WAR. But this adjustment only rewards them for time actually spent on the field. Catchers play fewer games in a season and have shorter careers. Therefore, catchers are given an extra 20% boost by adjWAR. Without this adjustment, there would be very few catchers in the Hall of Stats. And that just wouldn’t be right.
- Relievers are similar to catchers in that they get a boost from WAR (via the leverage index), but it is not nearly enough to bring their WAR values close to their starting counterparts. I’m actually not sure what type of adjustment relievers should get (if any). Without an adjustment, we would have no relievers in the Hall of Stats. I decided to simply use the same adjustment I used for catchers. (This helped Hoyt Wilhelm gain induction while Rich Gossage fell short).
- Most recently, I added an adjustment for 19th century pitchers (specifically, before the mound moved back to its current distance in 1893). These pitching seasons also get a 20% adjustment, but this one impacts them negatively (because of the ease of compiling a ridiculous number of innings).
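The adjustments above can be sketched for a single season as follows. This is only an approximation under stated assumptions: projected WAR is taken to scale linearly with schedule length, the 20% adjustments are applied as simple multipliers (1.2 and 0.8), and the order of the steps, the function names, and the flags are hypothetical.

```python
FULL_SCHEDULE = 162  # modern schedule length used as the baseline


def schedule_adjusted_war(war: float, schedule_games: int,
                          intended_games: int = FULL_SCHEDULE) -> float:
    """Average of actual WAR and WAR projected to the intended schedule length."""
    projected = war * intended_games / schedule_games  # assumed linear projection
    return (war + projected) / 2


def adj_war_season(war: float, schedule_games: int, *,
                   is_pitcher: bool = False,
                   shortened_season: bool = False,
                   is_catcher: bool = False,
                   is_reliever: bool = False,
                   pre_1893_pitcher: bool = False) -> float:
    """Apply the schedule, catcher/reliever, and pre-1893 pitcher adjustments."""
    value = war
    # Hitters always get the schedule adjustment; pitchers only get it
    # in strike- or war-shortened years.
    if not is_pitcher or shortened_season:
        value = schedule_adjusted_war(value, schedule_games)
    if is_catcher or is_reliever:
        value *= 1.20  # 20% boost
    if pre_1893_pitcher:
        value *= 0.80  # 20% reduction for pre-1893 pitching seasons
    return value


if __name__ == "__main__":
    # A 3.0 WAR hitter season in an 80-game schedule.
    print(adj_war_season(3.0, 80))
```

Note that the schedule length, not the player's games played, drives the projection, so a full-schedule season passes through unchanged.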
adjWAA (Adjusted Wins Above Average)
While adjWAR measures total career value, adjWAA aims to measure peak value. It begins with Wins Above Average and also undergoes some adjustments:
- Seasons with negative WAA are ignored. adjWAA only wants the seasons where the player was above average. For example, Pete Rose has a huge discrepancy between his WAA and his adjWAA. This is because he hung on for several years as a below average player pursuing the all-time hits record. adjWAA doesn’t penalize him for this as it is already captured in adjWAR.
- In cases where a player’s WAR and WAA are very close to each other, no WAA is counted. This happens where the talent level is low, for example:
  - The 1884 Union Association had the lowest talent level of all Major Leagues. For this reason, the league average is essentially replacement level. I didn’t count any Wins Above Average from the Union Association at all.
  - League average for pitchers’ batting value is also typically at replacement level.
- Catchers, relief pitchers, and 19th century (pre-1893) pitchers are adjusted the same way as they are for adjWAR.
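A minimal sketch of the adjWAA rules above, assuming each season carries its own WAA value, an optional multiplier for the catcher, reliever, or pre-1893 pitcher adjustments, and a flag for contexts where league average is effectively replacement level. The `SeasonWAA` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SeasonWAA:
    waa: float
    multiplier: float = 1.0  # e.g. 1.2 for a catcher season (assumed form)
    replacement_level_context: bool = False  # e.g. 1884 Union Association


def adj_waa(seasons: Iterable[SeasonWAA]) -> float:
    """Sum WAA over above-average seasons only."""
    total = 0.0
    for season in seasons:
        if season.replacement_level_context:
            continue  # no WAA counted where average is about replacement level
        if season.waa <= 0:
            continue  # below-average seasons are ignored, not subtracted
        total += season.waa * season.multiplier
    return total


if __name__ == "__main__":
    # Made-up career: two good years, one decline year, one excluded year.
    career = [SeasonWAA(2.5), SeasonWAA(4.0), SeasonWAA(-1.0),
              SeasonWAA(3.0, replacement_level_context=True)]
    print(adj_waa(career))
```

Because negative seasons are skipped rather than subtracted, a long decline phase lowers adjWAR but leaves adjWAA untouched, as in the Pete Rose example.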
The Hall of Stats weights a player’s career value (adjWAR) and peak value (adjWAA) equally. These numbers, however, are on different scales, so adjWAA is multiplied by 1.8 to put it on the same scale as adjWAR.
To get 1.8 (actually 1.7824241636184), I collected all Hall of Fame inductees and divided their total adjWAR by their total adjWAA.
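The weight calculation described above amounts to a ratio of two totals. A small sketch, using made-up numbers rather than the actual Hall of Fame data (the function name and the input shape are assumptions):

```python
from typing import Iterable, Tuple


def peak_weight(inductees: Iterable[Tuple[float, float]]) -> float:
    """Total adjWAR divided by total adjWAA across a group of players.

    Each item is an (adj_war, adj_waa) pair for one player.
    """
    pairs = list(inductees)
    total_adj_war = sum(adj_war for adj_war, _ in pairs)
    total_adj_waa = sum(adj_waa for _, adj_waa in pairs)
    return total_adj_war / total_adj_waa


if __name__ == "__main__":
    # Made-up (adjWAR, adjWAA) pairs, chosen so the ratio is easy to check.
    sample = [(90.0, 50.0), (54.0, 30.0)]
    print(peak_weight(sample))
```

Multiplying adjWAA by this ratio makes the group's total peak value equal to its total career value, which is what puts the two components on the same scale.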
More About Baseball-Reference’s WAR
If you are interested in what exactly goes into Baseball-Reference’s implementation of WAR, they have written about the calculations in incredible detail.
- It pains me, but the Hall of Stats does not include Negro League players. The data simply isn’t reliable yet—but it is getting better. As soon as I figure out a way to do it, Negro League stars will be recognized by the Hall of Stats.
- The Hall of Stats doesn’t adjust for time lost to military service. This is something I go back and forth on. I’d love to hear your feedback about this.
- I’m also not happy yet with how the Hall of Stats handles relief pitchers. Your feedback is welcome here as well.
- The Hall of Stats doesn’t recognize postseason performance (as the Hall of wWAR did). I’m thinking of a good way to do this, but again feedback is welcome.
- There are some quirks I’d like to improve with the franchise pages. I documented them in the launch announcement.
Similarity Scores
Baseball-Reference uses Bill James’ similarity scores on their player pages. While Baseball-Reference and Bill James are both wonderful, I don’t think their similarity scores are all that useful.
What James’ scores show is that two players’ raw numbers were similar. Here’s an excerpt from the point system used to identify a pair of "similar" batters:
- One point for each difference of 2 home runs.
- One point for each difference of .001 in batting average.
The issue here is that these numbers are not adjusted for era, park, or anything else. A .300 batting average with 8 home runs in the deadball era made you a star. A player with those same numbers in the steroid era actually may have been a below average player, depending on his position.
Speaking of position, here is part of James’ positional adjustment:
- 240 - Catcher
- 168 - Shortstop
- 132 - Second Base
The 240-point adjustment is applied to all players who primarily caught, regardless of the player’s time spent behind the plate or at other positions.
How We Do It
The Hall of Stats similarity scores are calculated with one thing in mind: value. We don’t care how many home runs a player hit or what his batting average was. We care how many runs above average his total offensive game was. Similarly, we don’t care what his primary position was. We care about the run value of the time he spent at each of his positions.
Our similarity scores are calculated using:
- WAR Batting Runs
- WAR Baserunning Runs
- WAR Double Play Runs
- WAR Defensive Runs
- WAR Positional Runs
- WAR Pitching Runs
- Plate Appearances
- Innings Pitched
The closer a pair’s score gets to zero, the more similar the players are. Because most of the inputs are centered around league average, the better a player gets, the harder it is for him to have closely similar players. For example:
- Ken Boyer and Sal Bando are very similar players (right down to three characters in their first name and five in their last). Their similarity score is 80. Once you see a pair of players of their caliber that close, you know they provided very similar value.
- Rob Deer, on the other hand, has a score of 80 or lower with 22 players. Deer is closer to an average player, and there are many more players in that part of the bell curve for him to resemble.
- Lastly, there is basically no good comparison to Babe Ruth. Barry Bonds is the closest with a staggering 896 similarity score.
(Note: Similarity scores are currently available for all players with 1500+ plate appearances or 500+ innings pitched.)
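The exact calculation is not spelled out here, so the following is only a generic illustration of a distance-style score over the eight inputs listed above, where zero means identical profiles. The weighted Euclidean distance, the `ValueProfile` type, and the optional weights are all assumptions, not the site's actual method.

```python
import math
from dataclasses import astuple, dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ValueProfile:
    batting_runs: float
    baserunning_runs: float
    double_play_runs: float
    defensive_runs: float
    positional_runs: float
    pitching_runs: float
    plate_appearances: float
    innings_pitched: float


def similarity_score(a: ValueProfile, b: ValueProfile,
                     weights: Optional[Sequence[float]] = None) -> float:
    """Weighted Euclidean distance between two profiles (0 = identical)."""
    values_a, values_b = astuple(a), astuple(b)
    if weights is None:
        weights = (1.0,) * len(values_a)  # hypothetical default: equal weights
    return math.sqrt(sum(w * (x - y) ** 2
                         for w, x, y in zip(weights, values_a, values_b)))


if __name__ == "__main__":
    # Made-up profiles, not real player data.
    p1 = ValueProfile(150, 10, 2, 40, 20, 0, 8000, 0)
    p2 = ValueProfile(153, 10, 2, 44, 20, 0, 8000, 0)
    print(similarity_score(p1, p2))
```

In practice the inputs would need sensible weights, since plate appearances and innings are on a much larger scale than run values. A distance of this kind also shows why extreme players have no close matches: few other careers sit anywhere near them on every component.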
Special thanks to Tim Vaughan (@MechanicalTim) for giving us a crash course in how to calculate similarity scores.
More About the Project
- All data is based on the Baseball-Reference version of Wins Above Replacement (WAR). This version was originally created by Sean Smith (aka RallyMonkey) and made available at BaseballProjection.com.
- I also utilized the Sean Lahman Baseball Database for things not made available in Baseball-Reference’s WAR downloads.
- The number of Hall of Stats inductees is kept consistent with the Hall of Fame (220 inducted as players). This is to show the difference in quality between the two Halls from top to bottom.
- The Hall of Stats honors lifetime bans imposed by Major League Baseball so that both the Hall of Fame and Hall of Stats are pulling from the same pool of eligible players.
- The Hall of Stats ignores performance enhancing drugs. There’s just no reliable way to fairly account for them.
- The Hall of Stats only has a “player” designation while the Hall of Fame has other designations. At one time, Al Spalding was a Hall of Stats inductee while he is not considered a Hall of Fame player (he was inducted as a Pioneer/Executive).
- How players are added to the Hall of Stats:
  - A player can be added as soon as he becomes eligible, as long as he is one of the best x eligible players by Hall Rating (where x is the number of players in the Hall of Fame). For example, when Greg Maddux becomes eligible, he could simultaneously be voted into the Hall of Fame and added to the Hall of Stats.
  - A player can be added if he is the next best eligible player by Hall Rating and a player is inducted into the Hall of Fame who already is a member of the Hall of Stats. For example, when Deacon White was inducted into the Hall of Fame, he was already a member of the Hall of Stats. Therefore, the next-best player outside the Hall of Stats (by Hall Rating) was inducted (in order to keep both Halls the same size).
  - A player can also be added or removed if the Hall Rating formula is updated. The Hall of Stats strives to use the best data possible at all times. The Hall of Stats does not cling to “mistakes”; it corrects them.
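All three rules follow from one operation: take the top x eligible players by Hall Rating, where x is the size of the Hall of Fame, and recompute whenever x, eligibility, or the ratings change. A small sketch, using Hall Ratings quoted earlier on this page; the `Player` type, the eligibility flag, and the ineligible example entry are hypothetical.

```python
from typing import List, NamedTuple


class Player(NamedTuple):
    name: str
    hall_rating: float
    eligible: bool = True


def populate_hall(players: List[Player], hall_size: int) -> List[Player]:
    """Return the top `hall_size` eligible players by Hall Rating."""
    eligible = [p for p in players if p.eligible]
    ranked = sorted(eligible, key=lambda p: p.hall_rating, reverse=True)
    return ranked[:hall_size]


if __name__ == "__main__":
    pool = [
        Player("Babe Ruth", 399),
        Player("Bert Blyleven", 190),
        Player("Kenny Lofton", 132),
        Player("Billy Pierce", 101),
        Player("Lou Brock", 71),
        Player("Tommy McCarthy", 28),
        # Hypothetical entry to show that ineligible players are skipped.
        Player("Not Yet Eligible", 250, eligible=False),
    ]
    for player in populate_hall(pool, hall_size=4):
        print(player.name, player.hall_rating)
```

Raising `hall_size` by one models the Deacon White case: the next-best eligible player outside the Hall of Stats moves in.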
The Team
Ever since introducing the Hall of wWAR in March of 2011, Adam Darowski has been obsessed with the idea of the Hall of Stats. He conceived the idea for the site, developed the formula, crunched the numbers, designed the site, and built the site’s front end. A product designer by day for Dribbble, he continues to modify the site and the formula while writing the articles. Adam lives in New Hampshire with his wife and three young children. He tweets about baseball at @baseballtwit and everything else at @adarowski. He also runs the Hall of Stats Twitter (@HallOfStats) and Facebook accounts.
Jeffrey, a software developer who also works for Dribbble, helped Adam get the Hall of Stats up and running by building many of the site’s original features. Adam and Jeffrey previously collaborated on the Red Sox Hall of wWAR, a Baseball Hack Day project. You can follow Jeffrey on Twitter at @semanticart.
Michael, also a software developer, originally took on special projects like player search, similarity scores, and season stats. Since the original launch, he has built most of the new features like positional pages, player rankings, and franchise pages and charts. A former co-worker of Adam’s, Michael currently works at MeYouHealth. In addition to writing code, he is a singer/songwriter and linguist. You can follow Michael on Twitter at @hal678.
The Tech
The Hall of Stats is open source and available on GitHub.
I’ve received multiple requests to make my data available. The following files are available as a CSV:
Thank You
Thank you to my three favorite baseball Seans: Sean Forman (@sean_forman) of Baseball-Reference, Sean Smith for originally creating this WAR framework, and Sean Lahman (@seanlahman) for his work on the Lahman Baseball Database. Also, thank you to Dan McCloskey (@_LeftField) and Sky Kalkman (@Sky_Kalkman) for letting me bounce ideas off them along the way. Thank you to the brilliant readers of High Heat Stats and Beyond the Box Score for providing wonderful feedback ever since I introduced wWAR. Finally, a huge thank you to Jeffrey and Michael (and Tim!) for helping me build the site of my dreams.