Average Batting Skill Through Major League History: Landmarks of Sabermetrics, Part I

Today still finds me in Arizona but at another conference, this one the inaugural SABR Analytics Conference in Mesa. Among the several compelling presentations on the first day of this three-day event was Bill Squadron’s stunning demonstration of the Bloomberg Analytics program, now adopted by twenty-four of MLB’s thirty clubs, which integrates pitch-by-pitch results with batted-ball locations and coordinated video. To one who had a hand in the early days of sabermetrics, it was simply amazing.

Seated next to me was my old friend Richard D. Cramer, author of the first such program for an MLB club: the EDGE system created by Stats Inc. for the Oakland A’s in 1981. Dick is a man of manifold accomplishments, but for me it is hard to overestimate the brilliance and enduring influence of the article below, which he first published in SABR’s Baseball Research Journal in 1980. To celebrate SABR’s pioneering role, I offer it here as the first of three landmark essays that appeared in that publication.

It is by now commonplace wisdom that players of different eras may be compared by their relative dominance over league-average performance. By this method Bill Terry’s .401 in 1930, when the NL batting average was .303, may be viewed as the same achievement as Carl Yastrzemski’s .301 in 1968, when the AL batted .230, in that each exceeded his league average by roughly 31 to 32 percent.
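
As a minimal sketch of that relative-average arithmetic (Python; the helper name is mine, the figures are the ones cited above):

```python
def relative_ba(player_ba, league_ba):
    """Relative batting average: the player's BA as a multiple of the league BA."""
    return player_ba / league_ba

# Bill Terry, 1930 NL: .401 against a .303 league average
print(round(relative_ba(0.401, 0.303), 3))  # 1.323, about 32 percent above the league

# Carl Yastrzemski, 1968 AL: .301 against a .230 league average
print(round(relative_ba(0.301, 0.230), 3))  # 1.309, about 31 percent above the league
```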

However, as Pete Palmer and I wrote in The Hidden Game of Baseball in 1984, “The trouble with this inference, reasonable though it is on its face, lies in a truth Einstein would appreciate: Everything is relative, including relativity. The National League batting average of .266 in 1902 does not mean the same thing as the American League BA of .266 in 1977, any more than Willie Keeler’s .336 in 1902 means the same thing as Lyman Bostock’s .336 in 1977: It does violence to common sense to suppose that, while athletes in every other sport today are measurably and vastly superior to those of fifty or seventy-five years ago, in baseball alone the quality of play has been stagnant or in decline. Keeler’s and Bostock’s Relative Batting Averages are identical, which signifies that each player exceeded his league’s performance to the same degree. But the question that is begged is ‘How do we measure average skill: What do the .266s of 1902 and 1977 mean?’”

Here, without further preamble, is Dick Cramer’s article:

Average Batting Skill Through Major League History

Is the American or the National a tougher league in which to hit .300? How well would Babe Ruth, Ty Cobb, or Cap Anson hit in 1980? What effect did World War II, league expansion, or racial integration have on the caliber of major league hitting? This article provides definitive answers to these types of questions.

The answers come from a universally accepted yardstick of batting competitiveness, comparing the performances of the same player in different seasons. For example, we all conclude that the National League is tougher than the International League because the averages of most batters drop upon promotion. Of course, factors other than the level of competition affect batting averages. Consider how low were the batting averages of the following future major leaguers in the 1971 Eastern League:

Player            1971 Eastern    Lifetime BA, majors (thru ’79)
Bill Madlock          .234                  .320
Mike Schmidt          .211                  .255
Bob Boone             .265                  .268
Andre Thornton        .267                  .252
Bob Coluccio          .208                  .220
Pepe Frias            .240                  .239
From these data Double A might seem a bit tougher than the major leagues, but that appearance arises because (1) this player sample is biased: most Eastern Leaguers never reach the majors, and I haven’t shown all the 1971 players who did, and (2) large and poorly lighted parks made the 1971 Eastern League tough for any hitter, as shown by its .234 league average. My study tries to avoid these pitfalls, minimizing bias by using all available data for each season-to-season comparison, and avoiding most “environmental factors,” such as ball resilience or rule changes that affect players equally, by subtracting league averages before making a comparison. Of course, direct comparisons cannot be made for seasons more than 20 years apart; few players appeared much in both periods, say, 1950 and 1970. But such seasons can be compared indirectly, by comparing 1950 to 1955 to 1960, etc., and adding the results.

Measures of batting performance are many. In the quest for a single accurate measure of overall batting effectiveness, I have developed the “batter’s win average” (BWA) as a “relative to league average” version of the Palmer/Cramer “batter’s run average” (BRA). (See Baseball Research Journal 1977, pp. 74–79.) Its value rests on the finding that the scoring of major league teams is predicted from the BWA’s of their individual players with an error of ±21 runs (RMS difference) when all data are available (SB, CS, HBP, and GIDP as well as AB, H, TB, and BB), and about ±30 runs otherwise.
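
The article does not reproduce the BWA formula itself, but the ±21-run figure is a root-mean-square (RMS) difference between predicted and actual team run totals. A minimal sketch of that kind of check, with invented run totals standing in for a real prediction:

```python
import math

def rms_error(predicted, actual):
    """Root-mean-square difference between predicted and actual team run totals."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Hypothetical numbers only: in the study, 'predicted' would come from each team's
# player BWA's; 'actual' is the team's real runs scored over a season.
predicted = [715, 683, 742, 655]
actual = [698, 701, 731, 668]
print(round(rms_error(predicted, actual), 1))  # an RMS difference in runs, analogous to the cited ±21
```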

A property useful in visualizing the BWA in terms of conventional statistics is its roughly 1:1 equivalence with batting average, provided that differences among players arise only from singles. To make this point clearer by example, Fred Lynn’s +.120 BWA led the majors in 1979. His value to the Red Sox was the same as that of a hitter who obtained walks, extra bases, and all other statistical oddments at the league average, but who hit enough extra singles to have an average .120 above the league, that is, a BA of .390. The difference between .390 and Lynn’s actual .333 is an expression mostly of his robust extra-base slugging.
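
Read this way, a BWA translates into conventional terms simply by adding it to the league batting average (the .270 league figure below is implied by the article’s own arithmetic, .390 minus .120):

```python
def equivalent_ba(bwa, league_ba):
    """BA of a hypothetical hitter who is league average in everything except singles,
    whose extra singles are worth the given BWA."""
    return league_ba + bwa

# Fred Lynn, 1979 AL: BWA of +.120 against a league that hit about .270
print(round(equivalent_ba(0.120, 0.270), 3))  # 0.39; versus Lynn's actual .333, the gap is his extra-base power
```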

The first stage in this study was a labor of love, using an HP-67 calculator to obtain BWA’s for every non-pitcher season batting record having at least 20 BFP (batter facing pitcher) in major league history. The second stage was merely labor, typing all those BFP’s and BWA’s into a computer and checking the entries for accuracy by comparing player BFP sums with those in the Macmillan Encyclopedia. The final stage, performing all possible season-to-season comparisons player by player, took 90 minutes on a PDP-10 computer. A season/season comparison involves the sum of the differences in BWA’s for every player appearing in both seasons, weighted by his smaller number of BFP’s. Other weighting schemes tried seemed to add nothing to the results but complexity.
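
A sketch of that comparison, and of the indirect chaining described earlier. The player records are invented and the normalization by total weight is an assumption of mine; the weighting itself, each player’s BWA difference weighted by his smaller BFP total, follows the description above:

```python
def season_comparison(season_a, season_b):
    """Weighted average BWA change (season B minus season A) over players who
    appeared in both seasons, each weighted by his smaller BFP total."""
    num = den = 0.0
    for player, (bfp_a, bwa_a) in season_a.items():
        if player in season_b:
            bfp_b, bwa_b = season_b[player]
            weight = min(bfp_a, bfp_b)
            num += weight * (bwa_b - bwa_a)
            den += weight
    return num / den if den else 0.0

# Invented records: player -> (BFP, BWA)
s1950 = {"A": (600, 0.020), "B": (450, -0.010), "C": (300, 0.000)}
s1955 = {"A": (580, 0.005), "B": (400, -0.020), "C": (350, -0.005)}
s1960 = {"A": (200, -0.010), "B": (420, -0.030), "C": (500, -0.010)}

# Seasons far apart are compared indirectly by adding the intermediate differences;
# a negative total suggests the same players fared worse later, i.e., tougher competition.
print(round(season_comparison(s1950, s1955) + season_comparison(s1955, s1960), 4))
```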

Any measurement is uncertain, and if this uncertainty is unknown the measure is almost useless. The subsequent treatment of these season/season comparisons is too involved for concise description, but it allowed five completely independent assessments of the level of batting skill in any given American or National League season, relative to the respective 1979 levels. The standard deviation of any set of five measurements from their mean averaged ±.007, ranging from .002 to .011. This implies that the “true” average batting skill in a season has a 2 in 3 chance of being within ±.007 of the value computed, and a 19 in 20 chance of being within ±.014, provided that errors in my values arise only from random factors, such as individual player streaks and slumps that don’t cancel. However, no study can be guaranteed free of “systematic error.” To cite an example of a systematic error that was identified and corrected: if a player’s career spans only two seasons, it is likely, irrespective of the level of competition, that his second season was worse than his first. (If he had improved, he would likely have kept his job for at least a third season!) Another possible source of error, which proved unimportant, was the supposed tendency of batters to weaken with age (the actual tendency appears to be fewer hits but more walks). It appears that overall systematic error is less than 20 percent of the total differences in average levels. One check: the 1972-to-1973 American League difference is attributable entirely to the calculable effect of excluding pitchers from batting, plus a general rising trend in American League skill in the 1970s.
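
The quoted bands come from treating the five independent estimates of a season’s level as a small sample; a sketch with invented numbers:

```python
import statistics

# Invented example: five independent estimates of one season's skill level relative to 1979.
estimates = [-0.052, -0.047, -0.058, -0.044, -0.051]

mean = statistics.mean(estimates)
spread = statistics.stdev(estimates)  # sample standard deviation of the five estimates
print(round(mean, 4), round(spread, 4))

# The article reads such a spread (about .007 on average in the real data) as a 2-in-3
# band around the computed value, and twice the spread as a 19-in-20 band, assuming
# the errors are purely random.
```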

Assessment of the relative strength of the major leagues, as might be expected, comes from players changing leagues. Results again were consistent and showed no dependence on the direction of the change. Results from the two eras of extensive interleague player movement, 1901 to 1905 and post-1960, agreed well also.

The results of my study are easiest to visualize from the graphical presentation [below]. (Because few readers will be familiar with the BWA units, I have not tabulated the individual numbers, but later convert them to relative BA’s and slugging percentages.) Theories on the whys and wherefores of changes in average batting skill I leave to others with greater personal and historical knowledge of the game. But the major trends are clear:

(1) The average level of batting skill has improved steadily and substantially since 1876. The .120-point difference between 1876 and 1979 implies that a batter with 1979-average skills would in 1876 have had the value of an otherwise 1876-average batter who hit enough extra singles for a .385 batting average.

(2) The American and National Leagues were closely matched in average batting strength for the first four decades (although not in number of superstars, the AL usually having many more). About 1938 the National League began to pull ahead of the American, reaching its peak superiority in the early ’60s. A resurgence during the ’70s makes the American League somewhat the tougher today, mainly because of the DH rule.

(3) The recent and also the earliest expansions had only slight and short-lived effects on batting competitiveness. However, the blip around 1900 shows the substantial effect on competition that changing the number of teams from 12 to 8 to 16 can have!

(4) World War II greatly affected competitiveness in 1944 and 1945.

Many baseball fans, myself included, like to imagine how a Ruth or a Wagner would do today. To help in these fantasies, I have compiled a table of batting average and slugging percentage corrections, based again on forcing differences in league batting skill overall into changes in the frequency of singles only. However, league batting averages and slugging percentages have been added back in, to reflect differences in playing conditions as well as in the competition. To convert a player’s record in year A to an equivalent performance in season B, one should first add to his year A batting and slugging averages the corrections tabulated for season A and then subtract the corrections shown for season B. The frequency of such other events as walks or stolen bases then can, optionally, be corrected for any difference in league frequencies between seasons A and B.

One interesting illustration might start with Honus Wagner’s great 1908 season (BWA = +.145). What might Wagner have done in the 1979 American League, given a livelier ball but tougher competition? The table yields a batting average correction of −.059 − (+.003) = −.062 and a slugging correction of −.020 − (−.029) = +.009, which applied to Wagner’s 1908 stats gives a 1979 BA of .292 and slugging percentage of .551. (In 600 ABs he would have, say, 30 HRs, 10 3BHs, 35 2BHs.) Wagner’s stolen base crown and tenth-place tie in walks translate directly to similar positions in the 1979 stats. That’s impressive batting production for any shortstop, and a “1979 Honus Wagner” would doubtless be an All-Star Game starter!
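
Working that example through in code. The routine follows the conversion procedure described two paragraphs above; Wagner’s 1908 line of .354/.542 is implied by the article’s result and matches his actual record:

```python
def convert(ba, slg, corr_a, corr_b):
    """Translate a year-A batting line into year-B terms: add year A's tabulated
    (BA, SLG) corrections, then subtract year B's."""
    ba_a, slg_a = corr_a
    ba_b, slg_b = corr_b
    return ba + ba_a - ba_b, slg + slg_a - slg_b

# Honus Wagner, 1908 NL (.354 BA, .542 SLG) translated to the 1979 AL,
# using the quoted corrections: 1908 NL (-.059, -.020) and 1979 AL (+.003, -.029).
ba_1979, slg_1979 = convert(0.354, 0.542, (-0.059, -0.020), (0.003, -0.029))
print(round(ba_1979, 3), round(slg_1979, 3))  # 0.292 0.551
```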

These results are fairly typical. Any 20th century superstar would be a star today. Indeed, a young Babe Ruth or Ted Williams would outbat any of today’s stars. But of course, any of today’s stars (Parker, Schmidt, Rice, Carew) would before 1955 have been a legendary superstar. Perhaps they almost deserve their heroic salaries!

Facts are often hard on legends, and many may prefer to believe veterans who belittle the technical competence of today’s baseball as compared, say, with that of pre-World War II. Indeed, “little things” may have been executed better by the average 1939 player. However, so great is the improvement in batting that if all other aspects of play were held constant, a lineup of average 1939 hitters would finish 20 to 30 games behind a lineup of average 1979 hitters, by scoring 200 to 300 fewer runs. This should hardly surprise an objective observer. Today’s players are certainly taller and heavier, are drawn from a larger population spanning more countries and races, and are more carefully taught at all levels of play. If the host of new Olympic track and field records established every four years is any indication, they can run faster and farther. Why shouldn’t they hit a lot better?

