Saturday, March 2, 2013

Lefty/Righty splits: historical data is difficult to find.

I don't have access to aggregate data below the season totals.  No home/road.  No lefty/righty.  Those splits are only available for some individuals.  To continue my handedness analysis I thought about using the top ten batters in each of four major categories.

Definitions:
Both: switch hitters
RR: bats right, throws right
LR: bats left, throws right
LL: bats left, throws left
BR: bats both, throws right

However, even individual splits do not go back much past 1950.  That's about as far back as reliable computer-readable play-by-play data goes, as opposed to just box score data.  It's the difference between having individual plate appearances (PA) and having only totals for a game.

retrosheet.org seemed to have data from which the splits could be derived.  However, the Player Batting Splits page states:

vs RHP       - Against Right-Handed Pitchers
vs LHP       - Against Left-Handed Pitchers
vs RHS*      - Against Right-Handed Starters
vs LHS*      - Against Left-Handed Starters ...

* - The RHS and LHS categories will be used when we don't have play-by-play data for a game and not all the pitchers on the opposing team in the game threw from the same side. In those cases, we use RHS if the starter was right-handed and LHS if the starter was left-handed.
_____________________________________________________________

retrosheet splits for Rogers Hornsby:

P         AB      H     BA
RHP    5,038  1,769  0.351
RHS    1,083    432  0.399
R      6,121  2,201  0.360

LHP    1,276    436  0.342
LHS      775    293  0.378
L      2,051    729  0.355

Tot    8,172  2,930  0.359

Using the retrosheet splits, I derived Hornsby's splits for batting average (BA) against all RHP/LHP as .360/.355.
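The derivation is just pooling: add the at bats and hits of the two rows on each side, then divide hits by at bats. Here is a minimal sketch in Python using the retrosheet figures from the table above. The `pooled_ba` helper and the variable names are made up for illustration; nothing here comes from retrosheet's own tools.

```python
def pooled_ba(rows):
    """Pool (AB, H) rows; return (total AB, total H, batting average)."""
    ab = sum(r[0] for r in rows)
    h = sum(r[1] for r in rows)
    return ab, h, h / ab

# (AB, H) pairs copied from the retrosheet table for Rogers Hornsby
vs_right = [(5038, 1769), (1083, 432)]  # vs. righty pitchers + vs. righty starters
vs_left = [(1276, 436), (775, 293)]     # vs. lefty pitchers + vs. lefty starters

for label, rows in (("R", vs_right), ("L", vs_left), ("Tot", vs_right + vs_left)):
    ab, h, ba = pooled_ba(rows)
    print(f"{label:<4}{ab:>6,}{h:>7,}{ba:>7.3f}")
```

Note that pooling weights each row by its at bats, which is not the same as averaging the two batting averages.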

Rogers Hornsby's splits at baseball-reference.com are only vs. RHS and LHS: .362/.351.

That's a five-point difference on retrosheet and an eleven-point difference on baseball-reference.  Plus, the total at bats (AB) don't agree: 8,172 on retrosheet and 8,115 on baseball-reference.

Neither seems satisfactory.

retrosheet has season data in the Year Split Page:

R vs. R      - Right-Handed Hitters Against Right-Handed Pitchers
R vs. L      - Right-Handed Hitters Against Left-Handed Pitchers
L vs. R      - Left-Handed Hitters Against Right-Handed Pitchers
L vs. L      - Left-Handed Hitters Against Left-Handed Pitchers

If we don't have play-by-play data for a game and not all the pitchers on the opposing team in the game threw from the same side, these plate appearances are not included in the "R vs. R", "R vs. L", "L vs. R" and "L vs. L" categories.

Unfortunately, every season seems to point to that same page description of the data.  (The seasons are in separate pages and download files; that's what I mean when I say the data is not aggregated.)

The Splits for 1920 ML

The Splits for 2010 ML

Click those links and check for yourself.

I'm pretty sure that there is little, if any, play-by-play data for 1920 but that the 2010 data is 100% play-by-play.

I'll probably take a sample, like the census, every ten years from 1920 through 2010.
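For reference, that sampling plan works out to ten seasons. A tiny sketch in Python; the variable name is arbitrary.

```python
# Census-style sample: every tenth season from 1920 through 2010, inclusive.
# The stop value is 2011 because range() excludes its upper bound.
sample_years = list(range(1920, 2011, 10))

print(sample_years)
print(len(sample_years), "seasons")
```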

Comments and suggestions are welcome.
