I don't have access to aggregate data below the season totals. No home/road. No lefty/righty. Those splits are only available for some individuals. To continue my handedness analysis I thought about using the top ten batters in each of four major categories.
Both: switch hitters
RR: bats right, throws right
LR: bats left, throws right
LL: bats left, throws left
BR: bats both, throws right
However, even individual splits do not go back much past 1950. That's about as far back as reliable computer-readable play-by-play data goes, as opposed to just box score data. It's the difference between individual plate appearances (PA) and totals for a game.
retrosheet.org seemed to have data from which the splits could be derived. However, the Player Batting Splits page states:
vs RHP - Against Right-Handed Pitchers
vs LHP - Against Left-Handed Pitchers
vs RHS* - Against Right-Handed Starters
vs LHS* - Against Left-Handed Starters ...
* - The RHS and LHS categories will be used when we don't have play-by-play data for a game and not all the pitchers on the opposing team in the game threw from the same side. In those cases, we use RHS if the starter was right-handed and LHS if the starter was left-handed.
retrosheet splits for Rogers Hornsby:
Pitcher     AB      H     BA
RHP      5,038  1,769  0.351
RHS      1,083    432  0.399
R        6,121  2,201  0.360
LHP      1,276    436  0.342
LHS        775    293  0.378
L        2,051    729  0.355
Tot      8,172  2,930  0.359
Using the retrosheet splits I derived Hornsby's splits for batting average (BA) against all RHP/LHP as .360/.355.
Rogers Hornsby's splits at baseball-reference.com are only vs. RHS and LHS: .362/.351.
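The derivation is just a matter of adding at bats and hits across the play-by-play rows and the starter-only rows, then dividing. Here is a minimal Python sketch of that arithmetic. The numbers are the ones in the retrosheet table above; the dictionary keys follow the retrosheet legend (RHP, RHS, LHP, LHS), and the names `splits` and `combine` are my own for illustration, not anything retrosheet provides.

```python
# (AB, H) pairs copied from the Hornsby table above.
# Keys follow the retrosheet legend; the variable names are illustrative only.
splits = {
    "RHP": (5038, 1769),  # play-by-play, right-handed pitchers
    "RHS": (1083, 432),   # no play-by-play, right-handed starter
    "LHP": (1276, 436),   # play-by-play, left-handed pitchers
    "LHS": (775, 293),    # no play-by-play, left-handed starter
}


def combine(*keys):
    """Sum AB and H over the given rows and return (AB, H, BA)."""
    ab = sum(splits[k][0] for k in keys)
    h = sum(splits[k][1] for k in keys)
    return ab, h, round(h / ab, 3)


right = combine("RHP", "RHS")   # all at bats credited to right-handers
left = combine("LHP", "LHS")    # all at bats credited to left-handers
total = combine(*splits)        # career line

print(right)  # (6121, 2201, 0.36)
print(left)   # (2051, 729, 0.355)
print(total)  # (8172, 2930, 0.359)
```

Note that the RHS and LHS rows are approximations by definition: they credit every at bat in a game to the starter's throwing hand, so the combined figures are only as good as that assumption.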
That's a five-point difference on retrosheet and an eleven-point difference on baseball-reference. Plus, total at bats (AB) are 8,172 and 8,115 respectively.
Neither seems satisfactory.
retrosheet has season data in the Year Split Page:
R vs. R - Right-Handed Hitters Against Right-Handed Pitchers
R vs. L - Right-Handed Hitters Against Left-Handed Pitchers
L vs. R - Left-Handed Hitters Against Right-Handed Pitchers
L vs. L - Left-Handed Hitters Against Left-Handed Pitchers
If we don't have play-by-play data for a game and not all the pitchers on the opposing team in the game threw from the same side, these plate appearances are not included in the "R vs. R", "R vs. L", "L vs. R" and "L vs. L" categories.
Unfortunately, every season, each on its own page with its own download file (that's what I mean when I say the data is not aggregated), seems to point to that same page description of the data.
The Splits for 1920 ML
The Splits for 2010 ML
Click those links and check for yourself.
I'm pretty sure that there is little if any play-by-play for 1920 but that 2010 data is 100% play-by-play.
I'll probably take a sample, like the census, every ten years from 1920 through 2010.
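For what it's worth, the sample would be ten seasons. A tiny Python sketch, assuming both endpoints are included (the name `sample_years` is just for illustration):

```python
# Every tenth season from 1920 through 2010, endpoints included.
# range() excludes its stop value, so stop at 2011 to keep 2010.
sample_years = list(range(1920, 2011, 10))

print(len(sample_years))  # 10
print(sample_years)
# [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010]
```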
Comments and suggestions are welcome.