Tuesday, March 13, 2012

Batting Average on Balls in Play: fun, games and common sense.

http://mlb.mlb.com/network/promotions/clubhouse_glossary.jsp

BABIP: Batting Average on Balls in Play ... (H - HR)/(AB - K - HR + SF)
How many of a player's balls in play go for hits. BABIP removes HR, BB, and K -- outcomes not impacted by defense -- from the player's average. It can serve as a rough estimate of a player's luck, and can help predict future performance. The league rate stays around .300 (it was .295 in 2011). A higher BABIP often means a player is on the right side of luck (hits are falling in) and will regress. A lower number often implies a player is on the wrong side of luck and will improve. The metric can be used mostly for pitchers, and somewhat for hitters.
____________________

On MLB Network's Clubhouse Confidential TV program I had not noticed BABIP being applied any less forcefully to batters than to pitchers, but maybe it was.  BABIP seems to make the most sense for fielding, especially team fielding, but we'll leave that aside for now.

I have stressed the silliness of baseball being played in non-uniform playing areas since the first radical baseball post in 2006, and that non-uniformity is behind a fundamental flaw in BABIP: home runs are not included.

Almost all home runs in recent decades go over the fence, i.e., they are not inside-the-park home runs, so they are hits that travel at least 300 feet on the fly and are therefore hard-hit balls.  To discount them is silly.

For a player who keeps the same real batting average (BA) in the same number of at bats (AB) from one season to the next, an increase in home runs will decrease his BABIP: fewer hits in fewer AB, with the hits shrinking proportionally more.  Conversely, an increase in strikeouts (SO, the K in the formula), which are also not counted, will increase his BABIP: the same number of hits in fewer AB.
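To make that concrete, here is a minimal sketch using the formula quoted above. The batter and all of his numbers are made up for illustration: 150 hits in 500 AB (.300 BA) in every case, with only the home run or strikeout total changing. Sacrifice flies are set to zero to keep it simple.

```python
def babip(h, hr, ab, so, sf=0):
    """BABIP as quoted above: (H - HR) / (AB - K - HR + SF)."""
    return (h - hr) / (ab - so - hr + sf)

# Hypothetical batter, NOT a real player: same 150 H in 500 AB each time.
base = babip(h=150, hr=10, ab=500, so=100)     # 140 / 390
more_hr = babip(h=150, hr=30, ab=500, so=100)  # 120 / 370
more_so = babip(h=150, hr=10, ab=500, so=150)  # 140 / 340

for label, value in [("base", base), ("more HR", more_hr), ("more SO", more_so)]:
    print(f"{label:8s} {value:.3f}")
```

Same batting average in all three lines, yet twenty extra home runs push BABIP down and fifty extra strikeouts push it up.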

In 2011 pitcher Justin Verlander had 250 SO in 969 plate appearances against him and 904 AB, with 174 hits, 24 home runs and 3 sacrifice flies allowed.  Using the BABIP equation, the denominator is 904 - 250 - 24 + 3 = 633.  BABIP numerator: 174 - 24 = 150.  That's 150/633 = .237.  Verlander's career BABIP, including 2011, is .287.  So the conventional wisdom is that Verlander was lucky and is very unlikely to repeat.
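The arithmetic can be checked with a few lines of Python. This is just a sketch that plugs the 2011 numbers from the paragraph above into the quoted formula; the "expected hits" line is my own back-of-the-envelope step, applying the .287 career rate to the same 633 balls in play.

```python
def babip(h, hr, ab, so, sf):
    """BABIP as quoted above: (H - HR) / (AB - K - HR + SF)."""
    return (h - hr) / (ab - so - hr + sf)

# Verlander 2011, numbers as given in the text.
h, hr, ab, so, sf = 174, 24, 904, 250, 3

balls_in_play = ab - so - hr + sf          # 633
hits_in_play = h - hr                      # 150
verlander_2011 = babip(h, hr, ab, so, sf)  # 150 / 633

# Hits in play he would have allowed at his .287 career rate.
expected_hits = 0.287 * balls_in_play
extra_outs = expected_hits - hits_in_play

print(f"BABIP: {verlander_2011:.3f}")
print(f"Hits below career rate: {extra_outs:.1f}")
```

So the whole "luck" argument comes down to roughly 32 balls in play over a full season that were outs instead of hits.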

I don't believe in luck over six months and several hundred AB (633 by the BABIP count).  I'm thinking that those 250 SO are an indication that Verlander was overpowering and that in many of the AB in which batters put the ball in play the ball was not hit hard, leading to more outs than Verlander usually gets on balls in play.

There is some indication that other factors are being addressed, like how many line drives were hit.  Recently, MLB CC host Brian Kenny provided these numbers for 2011 BABIP:
- flies: .152
- grounders: .233
- liners: .707

Of course, there are no definitions and no counts behind those averages.  What percentage of balls in play falls into each category?
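The percentages matter because overall BABIP is just the three rates weighted by how often each type of ball is hit. Here is a rough sketch of that. The three rates are the ones Kenny gave; the two mixes of flies, grounders and liners are invented purely for illustration, since the real shares were not provided.

```python
# 2011 BABIP by batted-ball type, as quoted above.
RATES = {"flies": 0.152, "grounders": 0.233, "liners": 0.707}

def overall_babip(shares):
    """Weighted average of RATES; shares must add up to 1."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(shares[kind] * RATES[kind] for kind in RATES)

# HYPOTHETICAL mixes, not real data: the second trades 5% liners for grounders.
mix_a = {"flies": 0.35, "grounders": 0.45, "liners": 0.20}
mix_b = {"flies": 0.35, "grounders": 0.50, "liners": 0.15}

print(f"mix A: {overall_babip(mix_a):.3f}")
print(f"mix B: {overall_babip(mix_b):.3f}")
```

Shifting a small slice of liners into grounders moves the overall number noticeably, which is why the rates alone don't tell you much.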

Plus, flies omits home runs, all of which are hits.  Home runs are omitted because the pitcher supposedly has no control over balls hit so far that his fielders cannot catch them.  Say what?  That would be silly even if all parks had the same home run distances in all directions and the same wall heights, you know, like it should be, like the NFL and NBA do it.  Logically.  But with those factors being random the pitcher should be held accountable for 300 foot fly balls.

What's the BABIP for 300 foot fly balls, including home runs of course?

And since BA is in disrepute generally, why is it used here? What about reach on error (ROE)?  What about slugging (SLG)?  What about adjusting for ball parks?  Why not mush all that together and then come up with a number?

Sportvision is generating detailed data, which individual MLB teams are using in proprietary ways to evaluate performance.  This supposedly includes trajectory and speed of batted balls.  It seems to me that what is needed is time and distance, which should be available in the videos.  Time could be calculated from the number of frames between when the ball is hit and when the ball touches something.  Distance must be in the system, perhaps from Google Earth images.  Combine them and you can then better define terms.
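The frame-counting idea is simple enough to sketch. Everything here is an assumption on my part: the 30 frames per second rate, the function names, and the example ball. It has nothing to do with how Sportvision actually does it, and the speed is only average speed over the ground, not speed off the bat.

```python
def travel_time_seconds(frames, fps=30.0):
    """Time between contact and the ball touching something.

    fps=30.0 is an ASSUMED video frame rate; use the real one if known.
    """
    return frames / fps

def average_ground_speed_mph(distance_ft, seconds):
    """Horizontal distance over elapsed time, converted from ft/s to mph."""
    feet_per_second = distance_ft / seconds
    return feet_per_second * 3600 / 5280

# Hypothetical fly ball: 120 frames in the air, landing 380 feet away.
seconds = travel_time_seconds(120)
speed = average_ground_speed_mph(380, seconds)
print(f"{seconds:.1f} s in the air, {speed:.1f} mph over the ground")
```

With both numbers attached to every batted ball, "liner" and "fly" could be defined by ranges of time and distance instead of by somebody's eyeball.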

We think we know what a line drive is.  How about these:
1. liner to pitcher who drops it and throws out batter; probably a grounder
2. liner over an infielder that lands on the infield dirt; beats me
3. fly caught by an outfielder

Time and distance should provide a good idea of whether a ball should be caught and how hard it was hit.

And if you want to measure luck, why not just categorize all events?  For instance, a 400 foot rocket to center field that is caught should be considered a home run.  In fact all flies over 380 feet should be home runs and home runs shorter than 380 feet should be doubles.  Heck, they track every play, so why use a guesstimate like BABIP at all except for prior seasons that were not tracked?  And for them use something better.
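The 380 foot rule could be written down in a few lines. This is only a sketch of the idea: the function name and the outcome labels ("OUT", "HR", "2B") are mine, and a ball at exactly 380 feet keeps whatever it actually was, since the rule above doesn't say.

```python
def reclassify_fly(distance_ft, actual, threshold_ft=380.0):
    """Re-score a fly ball by distance alone, ignoring the park.

    actual is what really happened, e.g. "OUT", "HR", "2B" (labels assumed).
    """
    if distance_ft > threshold_ft:
        return "HR"   # long enough to be a home run anywhere
    if actual == "HR" and distance_ft < threshold_ft:
        return "2B"   # short home run becomes a double
    return actual     # everything else stands

# Hypothetical fly balls: (distance in feet, actual result).
for distance, actual in [(400, "OUT"), (350, "HR"), (300, "OUT")]:
    print(distance, actual, "->", reclassify_fly(distance, actual))
```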
