
Sabermetrics: All you know about baseball statistics is wrong — hitting

Sports Columnist
Friday, April 5, 2013

The theatrical release of “Moneyball” in 2011 marked the first time many baseball fans — as well as non-sports enthusiasts — were ever exposed to the next generation of baseball statistics, affectionately known as sabermetrics. With the college baseball season well underway and Major League Baseball just getting started, now is the perfect time to explain some of these new ways of thinking about the game, starting today with hitting.

Conventional hitting numbers may be the easiest family of baseball statistics to understand, but that doesn’t mean the stats you’ll find on a baseball card are worth much. Luckily, sabermetrics has been able to quantify batting better than any other major aspect of the game, though these advances have yet to be publicly applied at the college level.

Arguably the most meaningless popular statistic is runs batted in (RBI), the number of runs that score as the result of a hitter’s plate appearances. As a matter of storytelling, it makes sense to take note of who knocks in whom within a game. But RBI is far less effective as a measure of skill over the course of a season.

The obvious problem that even high-profile baseball analysts overlook is that not all hitters have the same opportunities to drive runners in. The situation is similar to that of “clutch” hitters. While some argue that certain clutch hitters possess a preternatural ability to come up with big hits in important situations, it is generally accepted in the sabermetric community that clutchness, if it exists at all, is far less of a factor in those situations than overall skill and random chance. Hitters with especially good clutch statistics are usually creations of small sample sizes.
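To see why small samples can manufacture apparent clutch hitters, consider a rough sketch that treats every at-bat as an independent trial with a fixed chance of a hit. That is a simplifying assumption rather than a claim about how real at-bats behave, and the function name and sample sizes are chosen only for illustration.

```python
import math

def batting_average_std_error(true_average: float, at_bats: int) -> float:
    """Standard deviation of an observed average under a simple binomial model."""
    return math.sqrt(true_average * (1 - true_average) / at_bats)

# Compare a small sample with a full-season-sized one for a true .300 hitter.
for at_bats in (50, 500):
    spread = batting_average_std_error(0.300, at_bats)
    print(f"{at_bats} at-bats: typical swing of about {spread:.3f} around .300")
```

Under this model, a true .300 hitter is only about one standard deviation away from hitting .235 or .365 over 50 at-bats, which is why a handful of big hits in late innings says little about underlying skill.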

Looking at runs scored as a measure of a player’s skill is a similar conflation of individual performance and circumstances beyond the batter’s control. Reaching base and getting in position to score is something for which a hitter could justly be credited, but obviously he has nothing to do with the subsequent batters’ abilities to drive him in.

Perhaps the most misunderstood statistic in baseball is batting average, if only because of its pervasiveness. A player who hits .300 is universally understood to be a very good hitter. It takes a true legend to hit .400, and someone batting under .200 (the so-called Mendoza Line) will probably be benched in short order. A closer look at the iconic statistic reveals some major problems that render it ill-suited for judging a player.

Batting average depends heavily on the luck of how batted balls bounce, which means averages take a long time to stabilize. But its most obvious problem is that it does not include walks. As those who saw (or better yet, read) “Moneyball” know, the specific ability to get hits is not as important as the more general ability to get on base. A batter who goes 30-for-100 with no walks is not as valuable as a hitter who goes 20-for-80 with 20 free passes.
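The arithmetic behind that comparison can be sketched in a few lines. This is a simplified illustration that counts only hits, at-bats and walks; hit-by-pitches and sacrifice flies, which the full on-base formula includes, are assumed to be zero, and the function names are my own.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Hits per at-bat; walks appear nowhere in this formula."""
    return hits / at_bats

def simple_on_base_rate(hits: int, at_bats: int, walks: int) -> float:
    """Simplified on-base rate: ignores hit-by-pitches and sacrifice flies."""
    return (hits + walks) / (at_bats + walks)

# The two hitters described in the column.
no_walks = {"hits": 30, "at_bats": 100, "walks": 0}
patient = {"hits": 20, "at_bats": 80, "walks": 20}

for label, line in (("30-for-100, no walks", no_walks),
                    ("20-for-80, 20 walks", patient)):
    avg = batting_average(line["hits"], line["at_bats"])
    on_base = simple_on_base_rate(**line)
    print(f"{label}: AVG {avg:.3f}, on-base rate {on_base:.3f}")
```

The second hitter has the lower batting average (.250 against .300) but reaches base in 40 of his 100 trips to the plate rather than 30.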

Maybe more importantly, batting average treats all hits — singles, doubles, triples and home runs — as equal. This, as baseball writer F.C. Lane famously analogized over 100 years ago, is like taking a pile of change and just counting the number of coins rather than separating the nickels from the dimes. It should be blatantly obvious that a home run is worth more than a single, but batting average makes no such distinction.

Over the last several decades, many statisticians have built more nuanced substitutes for batting average that distinguish between outcomes beyond the oversimplified binary of hits and outs. The best-known metric today is weighted on-base average, which puts singles, doubles, triples, home runs, walks and even hit-by-pitches on the same scale to estimate how many runs a hitter produces.
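As a rough sketch of the idea, weighted on-base average multiplies each outcome by a run value and divides the total by the hitter's opportunities. The weights below are round placeholder numbers chosen only for illustration, not the published values, which are recalculated each season. The stat line is hypothetical as well, and intentional walks are assumed to be zero.

```python
# Placeholder weights for illustration only; real wOBA weights are
# recalculated every season and differ from these round numbers.
ILLUSTRATIVE_WEIGHTS = {
    "walks": 0.7,
    "hit_by_pitch": 0.7,
    "singles": 0.9,
    "doubles": 1.25,
    "triples": 1.6,
    "home_runs": 2.0,
}

def weighted_on_base_average(line: dict, weights: dict = ILLUSTRATIVE_WEIGHTS) -> float:
    """Sum of weighted outcomes divided by opportunities."""
    numerator = sum(weights[event] * line[event] for event in weights)
    # Opportunities: at-bats, walks, sacrifice flies and hit-by-pitches.
    denominator = (line["at_bats"] + line["walks"]
                   + line["sacrifice_flies"] + line["hit_by_pitch"])
    return numerator / denominator

# Hypothetical stat line, not a real player.
hypothetical_hitter = {
    "at_bats": 100, "walks": 10, "hit_by_pitch": 0, "sacrifice_flies": 0,
    "singles": 20, "doubles": 5, "triples": 0, "home_runs": 5,
}
woba = weighted_on_base_average(hypothetical_hitter)
print(f"wOBA with illustrative weights: {woba:.3f}")
```

The point of the design is that a home run and a single no longer count the same: swapping the weights changes the result, while batting average would treat both as one hit.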

Unfortunately, no publicly available versions of these statistics exist for college baseball, but two better alternatives to batting average are at our disposal. The first is on-base percentage, the proportion of a batter’s plate appearances in which he reaches base safely (basically batting average that includes walks). The second is slugging percentage, a hitter’s total bases per at-bat (a single is one base, a double is two bases, etc.). The two are often added together to make on-base plus slugging (OPS), a crude but pretty good measure of a player’s overall hitting ability.
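For readers who want to compute these from a stat sheet, here is a minimal sketch using the standard definitions. The stat line is hypothetical, and the function and variable names are my own.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """Times on base divided by at-bats, walks, hit-by-pitches and sacrifice flies."""
    reached = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sacrifice_flies
    return reached / chances

def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats

# Hypothetical season line, not a real player.
singles, doubles, triples, home_runs = 30, 8, 1, 4
at_bats, walks, hit_by_pitch, sacrifice_flies = 150, 20, 2, 1
hits = singles + doubles + triples + home_runs

obp = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies)
slg = slugging_percentage(singles, doubles, triples, home_runs, at_bats)
ops = obp + slg  # on-base plus slugging is a plain sum of the two
print(f"OBP {obp:.3f}  SLG {slg:.3f}  OPS {ops:.3f}")
```

Note that the two rates use different denominators, which is one reason OPS is described as crude: it adds two fractions that are not measured on the same scale.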

As an example of how these statistics tell a different story than batting average, let’s turn to the final stat sheets for the 2012 Brown Bears. Of the 10 hitters who got at least 75 plate appearances last year, third baseman Nick Fornaca ’15 led the way with a .328 batting average. But despite hitting just .318, first baseman Cody Slaughter ’13 was the team’s best hitter because he led the Bears in both on-base percentage (.403) and slugging percentage (.496). In a similar vein, designated hitter Mike DiBiase ’12 ranked dead last among Brown regulars with a .256 batting average, but his power helped him to a .419 slugging percentage, higher than the team average of .394, and his team-leading 26 walks boosted his on-base percentage to .401, the second highest on the roster.

Though it is too early for the sample sizes to be meaningful, we can also apply this thinking to the Bears’ current season. As of Wednesday, outfielder and designated hitter Daniel Massey ’14 leads the team with a .367 batting average, but Slaughter (.387) and right fielder Will Marcal ’15 (.371) have gotten on base at a higher rate than Massey (.365). In fact, due to a peculiarity in how the two statistics are calculated, Massey’s OBP is actually lower than his batting average: sacrifice flies are not counted as at-bats, a convention meant to reward batters for putting the team interest ahead of individual success, but they do count in the denominator of on-base percentage.
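A quick sketch shows how an on-base percentage can dip below a batting average: a sacrifice fly is left out of at-bats but is counted in the denominator of on-base percentage. The numbers below are hypothetical, not Massey's actual stat line. They describe a hitter with no walks or hit-by-pitches and a single sacrifice fly.

```python
def batting_average(hits, at_bats):
    """Hits per at-bat; sacrifice flies are not at-bats."""
    return hits / at_bats

def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """Sacrifice flies are added to the denominator here."""
    reached = hits + walks + hit_by_pitch
    return reached / (at_bats + walks + hit_by_pitch + sacrifice_flies)

# Hypothetical line: 11 hits in 30 at-bats, no walks, one sacrifice fly.
avg = batting_average(11, 30)
obp = on_base_percentage(11, 0, 0, 30, 1)
print(f"AVG {avg:.3f}, OBP {obp:.3f}")
```

With no walks to lift the numerator, the extra plate appearance from the sacrifice fly is enough to pull the on-base percentage under the batting average.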

There’s still a month left of Brown baseball and nearly a full MLB season yet to be played. I hope this will give you some better perspective on how good the players are the next time you’re at the ballpark. Next week, we’ll take a look at pitching statistics and why they may be even worse.

  • Pablo Cruz

    Outstanding review of measuring hitting performance. Another factor in successful hitting is the number of pitches taken. Unfortunately scouts, satisfying the economic appetite of team owners, only account for traditional stats to evaluate players. Fans’ knowledge of the game has been diminishing over time; as such, the excitement of swinging for the fences is the satisfaction owners attempt to deliver.