March 2012

Relative Batting Average: Landmarks of Sabermetrics, Part III

This 1976 article by David Shoebotham, whom I did not know then–but was amazed to bump into at this week’s SABR Analytics Conference in Mesa, Arizona–was a revelation to me. No one, to my knowledge, had ever taken this approach to cross-era comparison of baseball statistics. It may seem like common sense now, as many folks have compared single-season dominance in one year to that of another, across all batting, pitching, and fielding stats, but then that is the hallmark of a truly great idea: It seems simple after someone else has come up with it. Think of David’s contribution as Baseball’s Theory of Relativity.

Chatting with David at the conference, I speculated that no one had ever applied his approach to batter walks, and that Roy Thomas, Phillies outfielder in the first decade of the last century, might well emerge as the Babe Ruth of that category once his totals were weighed against the league average. David thought this was a possible outcome and said he would go back home and check it out. I suspect we have not heard the last from him or his revolutionary innovation.

Enough prologue. Here is where normalization to league average began. Read on, from SABR’s Baseball Research Journal of 1976.

Relative Batting Averages

By David Shoebotham

Who has the highest single season batting average in major league history? The modern fan would probably say that Rogers Hornsby’s .424 in 1924 is the highest. Old timers would point out that Hugh Duffy hit .438 in 1894. But the correct answer is Ty Cobb with .385 in 1910.

How can .385 be higher than .438? The answer lies in comparing each average to the average of the entire league for the year in question. This is the only way performances from different seasons and leagues can be compared. Thus a hitter’s relative batting average, which is the true measure of his ability to hit safely, is computed as follows:
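
[The formula itself is not reproduced here; as the surrounding text describes, it is simply the player’s batting average divided by that of the league as a whole:

Relative Average = (Player’s Hits / Player’s At Bats) ÷ (League Hits / League At Bats)]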

As a further refinement (since it is unfair to compare a player to himself) the player’s own hits and ABs can be subtracted from the league totals, thus giving an average relative to the remainder of the league.
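
[In code, the refined calculation amounts to something like the sketch below; the function and its inputs are illustrative, not from the original article.]

    def relative_average(player_hits, player_ab, league_hits, league_ab):
        """Relative batting average: the player's average divided by the
        average of the remainder of the league, i.e., with his own hits
        and at bats removed from the league totals (the refinement above)."""
        player_avg = player_hits / player_ab
        rest_avg = (league_hits - player_hits) / (league_ab - player_ab)
        return player_avg / rest_avg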

As an example, compare Bill Terry’s National League leading .401 in 1930 to Carl Yastrzemski’s American League leading .301 in 1968. At first glance the 100-point difference would make it appear that Yastrzemski’s average should not be mentioned in the same breath as Terry’s. But look at the calculations of relative averages:
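
[The calculations are likewise not reproduced here; using the rounded league figures cited in the next paragraph, they work out to roughly .401 / .303 = 1.32 for Terry and .301 / .230 = 1.31 for Yastrzemski, before the refinement of removing each man’s own hits and at bats from the league totals.]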

The relative averages are almost identical, meaning that had the two performances occurred in the same season, the batting averages would have been within a few points of each other. The big difference, of course, is that in 1930 the National League had a combined average of .303, the highest of any major league in this century (and two points higher than Yastrzemski’s 1968 average), whereas in 1968 the American League had a combined average of .230, the lowest for any major league ever. (A relative average of 1.30 indicates that a player’s batting average was 30% higher than the remainder of his league.)

The following two graphs show league averages since 1900. It can be seen that the 1920s and ’30s, following the introduction of the lively ball, were fat times for hitters. Both leagues reached their recent lows in 1968, the “Year of the Pitcher.” Note that for the last three seasons the American League’s Designated Hitter rule has artificially raised the league’s average and thus lowered individual relative averages.

The table below shows the highest single season relative averages since 1900. The list is clearly dominated by Ty Cobb, who has 10 of the top 19 averages, including the highest of all: 1.594 in 1910. Interestingly, the second highest relative average is Nap Lajoie’s 1.592, also in 1910. That epic batting race, enlivened by the offer of a new car to the winner, resulted in a major scandal, the awarding of two automobiles, and incidentally the two highest relative averages of all time. Rogers Hornsby’s .424 produced the highest National League mark of 1.51, but this ranks only 14th on the list. (Duffy’s .438 reduces to a relative average of about 1.42.) Note that five of this century’s .400 averages do not qualify for this list.
Single Season Relative Average Greater Than 1.45

Rank  Player           Year  League  AB   Hits  Avg.  Lea. Avg.  Rel. Avg.
 1    Ty Cobb          1910  Amer.   509  196   .385  .242       1.594
 2    Nap Lajoie       1910  Amer.   591  227   .384  .241       1.592
 3    Nap Lajoie       1904  Amer.   554  211   .381  .243       1.570
 4    Tris Speaker     1916  Amer.   546  211   .386  .247       1.570
 5    Ty Cobb          1912  Amer.   553  227   .410  .263       1.560
 6    Ty Cobb          1909  Amer.   573  216   .377  .242       1.560
 7    Ty Cobb          1917  Amer.   588  225   .383  .246       1.560
 8    Ty Cobb          1911  Amer.   591  248   .420  .271       1.550
 9    Nap Lajoie       1901  Amer.   543  229   .422  .275       1.530
10    Ty Cobb          1913  Amer.   428  167   .390  .254       1.530
11    Ted Williams     1941  Amer.   456  185   .406  .265       1.530
12    Ted Williams     1957  Amer.   420  163   .388  .254       1.530
13    Ty Cobb          1918  Amer.   421  161   .382  .252       1.520
14    Rogers Hornsby   1924  Nat.    536  227   .424  .281       1.510
15    Joe Jackson      1911  Amer.   571  233   .408  .271       1.510
16    Joe Jackson      1912  Amer.   572  226   .395  .263       1.500
17    Ty Cobb          1916  Amer.   542  201   .371  .247       1.500
18    Ty Cobb          1915  Amer.   563  208   .369  .247       1.500
19    Ty Cobb          1914  Amer.   345  127   .368  .246       1.490
20    Honus Wagner     1908  Nat.    568  201   .354  .237       1.490
21    Cy Seymour       1905  Nat.    581  219   .377  .253       1.490
22    George Sisler    1922  Amer.   586  246   .420  .283       1.490
23    Joe Jackson      1913  Amer.   528  197   .373  .254       1.470
24    Tris Speaker     1912  Amer.   580  222   .383  .263       1.450
25    Stan Musial      1948  Nat.    611  230   .376  .259       1.450
26    George Stone     1906  Amer.   581  208   .358  .247       1.450
27    Joe Torre        1971  Nat.    634  230   .363  .251       1.450
28    George Sisler    1920  Amer.   631  257   .407  .282       1.450
29    Honus Wagner     1907  Nat.    515  180   .350  .242       1.450

With the modern preoccupation with home runs, high relative averages (not to mention high absolute averages) have become rare. The only relative average over 1.45 in recent years is Joe Torre’s 1971 mark.

For a look at other recent high marks, the next table shows the highest relative averages of the last 20 years. It is interesting to note that Rod Carew’s 1974 and 1975 marks would probably be well over 1.45 except for the Designated Hitter rule in the American League.

The final table shows the all-time leaders in career relative average. Not surprisingly, Ty Cobb tops the list with an average that is just a few hits short of 1.40. Close behind Cobb is Shoeless Joe Jackson, though the closeness of their averages is deceptive. Jackson’s career was abruptly terminated while he was still a star performer, and therefore he did not have the usual declining years at the end of his career that would have lowered his average. During the years that Jackson averaged 1.38, Cobb was averaging a fantastic 1.50.

It can be seen that despite the preponderance of pre-1920 hitters in the single season leaders, the career list contains players from all periods since 1900, including four who are active. Rod Carew, who in 1975 moved past Ted Williams into third place, seems destined to be one of the all-time leaders in relative average. Whether all four active players will finish their careers among the leaders is an open question, but at least they show that hitting for high average is not altogether a lost art.

Highest Single Season Relative Averages During Last 20 Years (1956-1975)

Rank  Player            Year  League  AB   Hits  Avg.  Lea. Avg.  Rel. Avg.
 1    Ted Williams      1957  Amer.   420  163   .388  .254       1.530
 2    Joe Torre         1971  Nat.    634  230   .363  .251       1.450
 3    Roberto Clemente  1967  Nat.    585  209   .357  .248       1.440
 4    Mickey Mantle     1957  Amer.   474  173   .365  .254       1.440
 5    Rico Carty        1970  Nat.    478  175   .366  .257       1.420
 6    Norm Cash         1961  Amer.   535  193   .361  .255       1.420
 7    Rod Carew         1974  Amer.   599  218   .364  .257*      1.410
 8    Harvey Kuenn      1959  Amer.   561  198   .353  .252       1.400
 9    Rod Carew         1975  Amer.   535  192   .359  .257*      1.400
10    Pete Rose         1969  Nat.    627  218   .348  .249       1.390
11    Carl Yastrzemski  1967  Amer.   579  189   .326  .235       1.390
12    Ralph Garr        1974  Nat.    606  214   .353  .254       1.390
13    Pete Rose         1968  Nat.    626  210   .335  .242       1.390
14    Roberto Clemente  1969  Nat.    507  175   .345  .250       1.380
15    Bill Madlock      1975  Nat.    514  182   .354  .256       1.380
16    Hank Aaron        1959  Nat.    629  223   .355  .259       1.370
17    Matty Alou        1968  Nat.    558  185   .332  .242       1.370
18    Tony Oliva        1971  Amer.   487  164   .337  .246       1.370
19    Roberto Clemente  1970  Nat.    412  145   .352  .257       1.370
20    Ralph Garr        1971  Nat.    639  219   .343  .251       1.370

*Designated Hitter rule in effect
Lifetime Relative Average Greater Than 1.20 (Over 4000 ABs)

Rank  Player            Years       AB     Hits  Avg.  Lea. Avg.  Rel. Avg.
 1    Ty Cobb           1905-1928   11429  4191  .367  .263       1.390
 2    Joe Jackson       1908-1920    4981  1774  .356  .258       1.380
 3    Rod Carew         1967-1975*   4450  1458  .328  .247       1.330
 4    Ted Williams      1939-1960    7706  2654  .344  .261       1.320
 5    Nap Lajoie        1896-1916    9589  3251  .339  .258       1.310
 6    Rogers Hornsby    1915-1937    8173  2930  .358  .275       1.300
 7    Tris Speaker      1907-1928   10208  3515  .344  .266       1.290
 8    Stan Musial       1941-1963   10972  3630  .331  .258       1.280
 9    Honus Wagner      1897-1917   10427  3430  .329  .258       1.280
10    Eddie Collins     1906-1930    9949  3311  .333  .265       1.260
11    Roberto Clemente  1955-1972    9454  3000  .317  .254       1.250
12    Tony Oliva        1962-1975*   6178  1891  .306  .246       1.240
13    Pete Rose         1963-1975*   8221  2547  .310  .251       1.230
14    Harry Heilmann    1914-1932    7787  2660  .342  .278       1.230
15    Sam Crawford      1899-1917    9579  2964  .309  .252       1.230
16    George Sisler     1915-1930    8267  2812  .340  .278       1.230
17    Babe Ruth         1914-1935    8399  2873  .342  .279       1.230
18    Matty Alou        1960-1974    5789  1777  .307  .252       1.220
19    Joe Medwick       1932-1948    7635  2471  .324  .266       1.210
20    Paul Waner        1926-1944    9459  3152  .333  .275       1.210
21    Lou Gehrig        1923-1939    8001  2721  .340  .281       1.210
22    Bill Terry        1923-1936    6428  2193  .341  .282       1.210
23    Joe DiMaggio      1936-1951    6821  2214  .325  .269       1.210
24    Hank Aaron        1954-1975*  12093  3709  .307  .254       1.210
25    Jackie Robinson   1947-1956    4877  1518  .311  .260       1.200

*Active player

On Base Average for Players: Landmarks of Sabermetrics, Part II

This is the second of three pioneering statistical articles published in the years before Bill James coined the term sabermetrics, which has endured as an honor to the Society for American Baseball Research–and has brought me and 300 others to the first SABR Analytics Conference in Mesa, Arizona. In 1984, Pete Palmer and I collaborated on The Hidden Game of Baseball, in which the now commonplace OPS (On Base Plus Slugging) made its debut. One component of that stat, Slugging Percentage, was developed in the 1860s but was not accepted as an official statistic by the National League until 1923 and by the American League until 1946. It is hard today to imagine that when we wrote Hidden Game, On Base Average was not yet an official stat. Here is Pete’s landmark article on the OBA, from SABR’s Baseball Research Journal in 1973. Some of the tabular material (league leaders in lifetime OBA by position, through 1972) is not offered here as it has become largely outdated.
On Base Average for Players

By Pete Palmer

There are two main objectives for the hitter. The first is to not make an out and the second is to hit for distance. Long-ball hitting is normally measured by slugging average. Not making an out can be expressed in terms of on base average (OBA), where:

OBA = (Hits + Walks + Hit-by-Pitch) / (At Bats + Walks + Hit-by-Pitch)

For example, if we were figuring out Frank Robinson’s career on base average, it would be compiled like this:  2641 hits + 1213 walks + 178 hit-by-pitch (4032), divided by 8810 at bats + 1213 walks + 178 HBP (10201). His OBA is .395, which happens to be the tops among active players, but does not compare very well with players of the past. Sacrifice hits are ignored in this calculation.
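
[A quick sketch of that arithmetic in code, using Robinson’s figures as quoted above; the function name is illustrative, not Palmer’s.]

    def on_base_average(hits, walks, hbp, at_bats):
        """On base average: times on base divided by at bats plus walks
        plus hit-by-pitch (sacrifice hits ignored, as noted above)."""
        return (hits + walks + hbp) / (at_bats + walks + hbp)

    # Frank Robinson, career through 1972
    print(round(on_base_average(2641, 1213, 178, 8810), 3))  # 0.395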

On base average can be quite different from batting average. Take for example Joe DiMaggio and Roy Cullenbine, once outfield teammates for the Yankees.  DiMag had a lifetime batting average of .325 and Cullenbine .276. But Roy was walked much more frequently than Joe and made fewer outs; he had an OBA of .404, compared to .398 for the Yankee Clipper.

In calculating OBA, the Macmillan Baseball Encyclopedia was used for hits, at bats, and bases on balls. Hit by pitch data are from official averages back to 1920 in the AL and 1917 in the NL. Figures back to 1909 have been compiled by Alex Haas from newspaper box scores. Some data before then come from Haas, John Tattersall, and Bob Davids. Additional information is available in some of the old newspapers, but has not yet been compiled. Players with incomplete totals are credited with HBP at the known rate from available data for those unknown appearances. When no data are available, league averages are used. Before 1887, a batter was not awarded first base when hit by a pitch.

Who is the all-time leader in on base average [remember, this is as of 1973, when Barry Bonds was nine years old]? It is Ted Williams with a spectacular .483 mark. Not surprisingly, Babe Ruth is second with .474. It is no secret that Williams and Ruth were both exceptionally good hitters as well as being among the most frequent walk receivers. It was not unusual for them to get on base 300 times a season. Ranking third on the all-time list is John McGraw, who was elected to the Hall of Fame as a manager, but was also a fine hitter. In addition, he was adept at getting on base from walks and HBP. He holds the all-time NL record for OBA both lifetime and season. Billy Hamilton, the stolen base king, and Lou Gehrig are next in line, followed by such big names as Rogers Hornsby, Ty Cobb, Jimmie Foxx and Tris Speaker. Rounding out the top ten is Ferris Fain, former first baseman of the A’s, who quietly attained a very high OBA to go with his two batting titles.

Some players whom many fans might not think to be among the leaders in OBA are Max Bishop, second baseman of the A’s last super teams of 1929-31; Clarence “Cupid” Childs, Cleveland second sacker in the 1890s; Roy Thomas, Phils’ center fielder at the turn of the century; and Joe Cunningham, who played with the Cardinals and White Sox just a few years ago. On the other hand, some of the famous hitters of baseball are not included in the accompanying list of players with lifetime on base averages of .400 or better. Missing are such stars as Willie Keeler, Bill Terry, George Sisler, Nap Lajoie, Al Simmons, Hans Wagner, Cap Anson, Joe DiMaggio, and Roberto Clemente.

Since most of the players in the .400 list are either outfielders or first basemen, an additional table is shown that provides data on the top ten players at each position [tables not offered here]. Many unheralded players are high in the OBA figures, such as Wally Schang, who played for many AL clubs in the teens and twenties and is second among catchers, and Elmer Valo, another Connie Mack product, who ranks sixth in right field.

There are no active players with OBA’s of .400 or better, and only a few among the leaders by position. The level of OBA in the majors is presently quite low. This could be attributed to many factors, such as improved pitching (bigger and stronger pitchers throwing from the unchanged distance of 60 feet 6 inches, more use of relief pitchers, and the widespread use of the slider as an extra pitch), larger ball parks, and increased emphasis on hitting home runs. Those players with high OBA’s that are now active are shown below:

Frank Robinson 0.395 Harmon Killebrew 0.385
Carl Yastrzemski 0.389 Al Kaline 0.383
Willie Mays 0.388 Joe Morgan 0.383
Dick Allen 0.388 Henry Aaron 0.381
Willie McCovey 0.387 Norm Cash 0.379

It is interesting to note that if hit by pitch were not included in figuring OBA, Frank Robinson would rank only fourth.

In regard to season averages, Dick Allen led the majors in OBA in 1972 with a mark of .422. Joe Morgan was the NL leader with .419. The only others with .400 or better on base average were Carlos May at .408, and Billy Williams at .403. These season averages are far, far below the top season averages of the past. The list of top season marks, which includes all instances of OBA of .500 or better, is dominated by another Williams named Ted, the all-time season leader, and by Ruth.

Ted Williams, 1941 .551 Babe Ruth, 1926 .516
John McGraw, 1899 .546 Mickey Mantle, 1954 .515
Babe Ruth, 1923 .545 Babe Ruth, 1924 .513
Babe Ruth, 1920 .530 Babe Ruth, 1921 .512
Ted Williams, 1957 .528 Rog. Hornsby, 1924 .508
Billy Hamilton, 1894 .521 Joe Kelley, 1894 .502
Ted Williams, 1946 .516 Hugh Duffy, 1894 .501

Ted Williams led the league in OBA every year he qualified except for his rookie season, and he had a higher OBA than the leader in three of his four seasons shortened by injury.  Those leading the league most often in OBA are:

AL                         NL
Ted Williams      12       Rogers Hornsby   8
Babe Ruth         10       Stan Musial      5
Ty Cobb            6       Billy Hamilton   4
Lou Gehrig         5       Richie Ashburn   4
Carl Yastrzemski   5       Mel Ott          4
                           Honus Wagner     4

It is important to remember that OBA is only one component of hitting, and that slugging is equally valuable. Of course, the best long-ball hitters usually rank high in both departments because they are generally walked more frequently. One thing the OBA does is give percentage recognition to the player’s ability to get on via the walk and the HBP as well as the hit. He has saved his team an out and he is in a good position to score a run.

ON BASE AVERAGE LEADERS

1000 games minimum – through 1972

Player            Years       AB     Hits  BB    HBP   OBA
Ted Williams      1939-1960    7706  2654  2018   39   .483
Babe Ruth         1914-1935    8399  2873  2056   42   .474
John McGraw       1891-1906    3924  1309   836  105+  .462
Billy Hamilton    1888-1901    6268  2158  1187   50*  .452
Lou Gehrig        1923-1939    8001  2721  1508   45   .447
Rogers Hornsby    1915-1937    8173  2930  1038   48   .434
Ty Cobb           1905-1928   11437  4192  1249   90   .433
Jimmie Foxx       1926-1945    8134  2646  1452   13   .430
Tris Speaker      1907-1928   10205  3514  1381  101   .427
Ferris Fain       1947-1955    3930  1139   903   18   .425
Eddie Collins     1906-1930    9949  3310  1503   76   .424
Joe Jackson       1908-1920    4981  1774   519   59   .423
Max Bishop        1924-1935    4494  1216  1153   31   .423
Mickey Mantle     1951-1968    8102  2415  1734   13   .423
Mickey Cochrane   1925-1937    5169  1652   857   29   .419
Stan Musial       1941-1963   10972  3630  1599   53   .418
Dan Brouthers     1879-1904    6711  2296   840   32*  .418
Jesse Burkett     1890-1905    8421  2850  1029   63*  .414
Clarence Childs   1890-1901    5615  1720   990   44*  .414
Mel Ott           1926-1947    9456  2876  1708   64   .414
Hank Greenberg    1930-1947    5193  1628   852   16   .412
Roy Thomas        1899-1911    5296  1537  1042   42*  .411
Charlie Keller    1939-1952    3790  1085   784   10   .410
Harry Heilmann    1914-1932    7787  2660   856   40   .410
Jackie Robinson   1947-1956    4877  1518   740   72   .410
Eddie Stanky      1943-1953    4301  1154   996   34   .410
Ed Delahanty      1888-1903    7505  2597   741   55*  .409
Roy Cullenbine    1938-1947    3879  1072   852   11   .408
Joe Cunningham    1954-1966    3362   980   599   49   .406
Riggs Stephenson  1921-1934    4508  1515   494   40   .406
Arky Vaughan      1932-1948    6622  2103   937   46   .406
Paul Waner        1926-1945    9459  3152  1091   38   .404
Chas. Gehringer   1924-1942    8858  2839  1185   51   .404
Joe Kelley        1891-1908    6977  2213   910   99+  .403
Lu Blue           1921-1933    5904  1696  1092   43   .402
Pete Browning     1882-1894    4820  1646   466   20*  .402
Denny Lyons       1885-1897    4294  1333   621   32*  .401

+Hit by pitch estimated from partial career totals

*Hit by pitch estimated from league average

Average Batting Skill Through Major League History: Landmarks of Sabermetrics, Part I

Today still finds me in Arizona but at another conference, this one the premier edition of the SABR Analytics Conference in Mesa. Among the several compelling presentations on the first day of this three-day event was Bill Squadron’s stunning demonstration of the Bloomberg Analytics program, now adopted by twenty-four of MLB’s thirty clubs, which integrates pitch-by-pitch results with batted ball locations and coordinated video. To one who had a hand in the early days of sabermetrics, it was simply amazing.

Seated next to me was my old friend Richard D. Cramer, author of the first such program for an MLB club: the EDGE system created by Stats Inc. for the Oakland A’s in 1981. Dick is a man of manifold accomplishments, but for me it is hard to overestimate the brilliance and enduring influence of his article below, which he first published in SABR’s Baseball Research Journal in 1980. To celebrate SABR’s pioneering role, I offer it below as the first of three landmark essays that appeared in that publication.

It is today a commonplace wisdom that players of different eras may be compared by their relative dominance over league average performance. By this method Bill Terry’s .401 in 1930, when the NL batting average was .303, may be viewed as the same achievement as Carl Yastrzemski’s .301 in 1968, when the AL batted .230, in that each exceeded his league average by about 32 percent.

However, as Pete Palmer and I wrote in The Hidden Game of Baseball in 1984, “The trouble with this inference, reasonable though it is on its face, lies in a truth Einstein would appreciate: Everything is relative, including relativity. The National League batting average of .266 in 1902 does not mean the same thing as the American League BA of .266 in 1977, any more than Willie Keeler’s .336 in 1902 means the same thing as Lyman Bostock’s .336 in 1977: It does violence to common sense to suppose that, while athletes in every other sport today are measurably and vastly superior to those of fifty or seventy-five years ago, in baseball alone the quality of play has been stagnant or in decline. Keeler’s and Bostock’s Relative Batting Averages are identical, which signifies that each player exceeded his league’s performance to the same degree. But the question that is begged is ‘How do we measure average skill: What do the .266s of 1902 and 1977 mean?’”

Here, without further preamble, is Dick Cramer’s article:

Average Batting Skill Through Major League History

Is the American or the National a tougher league in which to hit .300? How well would Babe Ruth, Ty Cobb, or Cap Anson hit in 1980? What effect did World War II, league expansion, or racial integration have on the caliber of major league hitting? This article provides definitive answers to these types of questions.

The answers come from a universally accepted yardstick of batting competitiveness, comparing the performances of the same player in different seasons. For example, we all conclude that the National League is tougher than the International League because the averages of most batters drop upon promotion. Of course, factors other than the level of competition affect batting averages. Consider how low were the batting averages of the following future major leaguers in the 1971 Eastern League:

Player          1971 Eastern  Lifetime BA, majors (thru ’79)
Bill Madlock    .234          .320
Mike Schmidt    .211          .255
Bob Boone       .265          .268
Andre Thornton  .267          .252
Bob Coluccio    .208          .220
Pepe Frias      .240          .239

Double A seems a bit tougher than the major leagues from these data because (1) this player sample is biased: most Eastern Leaguers don’t reach the majors, and I haven’t shown all the 1971 players who did, and (2) large and poorly lighted parks made the 1971 Eastern League tough for any hitter, as shown by its .234 league average. My study tries to avoid these pitfalls, minimizing bias by using all available data for each season-to-season comparison, and avoiding most “environmental factors” such as ball resilience or rule changes that affect players equally, by subtracting league averages before making a comparison. Of course, direct comparisons cannot be made for seasons more than 20 years apart; few played much in both periods, say, 1950 and 1970. But these seasons can be compared indirectly, by comparing 1950 to 1955 to 1960, etc., and adding the results.

Measures of batting performance are many. In the quest for a single accurate measure of overall batting effectiveness, I have developed the “batter’s win average” (BWA) as a “relative to league average” version of the Palmer/Cramer “batter’s run average” (BRA). (See Baseball Research Journal 1977, pp 74-9.) Its value rests on the finding that the scoring of major league teams is predicted from the BWA’s of its individual players with an error of ±21 runs (RMS difference) when all data are available (SB, CS, HBP, and GiDP as well as AB, H, TB, and BB) and about ±30 runs otherwise.

A property useful in visualizing the BWA in terms of conventional statistics is its roughly 1:1 equivalence with batting average, provided that differences among players arise only from singles. To make this point more clear by an example, Fred Lynn’s +.120 BWA led the majors in 1979. His value to the Red Sox was the same as that of a hitter who obtained walks, extra bases, and all other statistical oddments at the league average, but who hit enough extra singles to have an average .120 above the league, that is, a BA of .390. The difference between .390 and Lynn’s actual .333 is an expression mostly of his robust extra-base slugging.

The first stage in this study was a labor of love, using an HP67 calculator to obtain BWA’s for every non-pitcher season batting record having at least 20 BFP (batter facing pitcher) in major league history. The second stage was merely labor, typing all those BFP’s and BWA’s into a computer and checking the entries for accuracy by comparing player BFP sums with those in the Macmillan Encyclopedia. The final stage, performing all possible season-to-season comparisons player by player, took 90 minutes on a PDP10 computer. A season/season comparison involves the sum of the difference in BWA’s for every player appearing in the two seasons, weighted by his smaller number of BFP’s. Other weighting schemes tried seemed to add nothing to the results but complexity.
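
[A rough sketch of the season-to-season comparison just described; the data structures and the normalization by total weight are my assumptions, since the article gives only the weighting rule.]

    def season_to_season(season_a, season_b):
        """Average change in batter's win average (BWA) between two seasons,
        summed over every player who appeared in both, each difference
        weighted by his smaller number of BFP appearances."""
        weighted_diff, total_weight = 0.0, 0.0
        for player in season_a.keys() & season_b.keys():
            a, b = season_a[player], season_b[player]
            weight = min(a["bfp"], b["bfp"])  # the smaller of the two BFP totals
            weighted_diff += weight * (b["bwa"] - a["bwa"])
            total_weight += weight
        # Dividing by the total weight is an assumption; the article specifies
        # only the weighted sum of the differences.
        return weighted_diff / total_weight if total_weight else 0.0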

Any measurement is uncertain, and if this uncertainty is unknown the measure is almost useless. The subsequent treatment of these season/season comparisons is too involved for concise description, but it allowed five completely independent assessments of the level of batting skill in any given American or National League season, relative to their respective 1979 levels. The standard deviation of any set of five measurements from their mean was ±.007, ranging from .002 to .011. This implies that the “true” average batting skill in a season has a 2 in 3 chance of being within ±.007 of the value computed, and a 19 in 20 chance of being within ±.014, provided that errors in my values arise only from random factors, such as individual player streaks and slumps that don’t cancel. However, no study can be guaranteed free of “systematic error.” To cite an example of a systematic error that was identified and corrected: If a player’s career spans only two seasons, it is likely, irrespective of the level of competition, that his second season was worse than his first. (If he had improved, he was likely to have kept his job for at least a third season!) Another possible source of error which proved unimportant was the supposed tendency for batters to weaken with age (the actual tendency appears to be fewer hits but more walks). It appears that overall systematic error is less than 20 percent of the total differences in average levels. One check is that the 1972 to 1973 American League difference is attributable entirely to the calculable effect of excluding pitchers from batting, plus a general rising trend in American League skill in the 1970s.

Assessment of the relative strength of the major leagues, as might be expected, comes from players changing leagues. Results again were consistent and showed no dependence on the direction of the change. Results from the two eras of extensive interleague player movement, 1901 to 1905 and post-1960, agreed well also.

The results of my study are easiest to visualize from the graphical presentation [below]. (Because few readers will be familiar with the BWA units, I have not tabulated the individual numbers, but later convert them to relative BA’s and slugging percentages.) Theories on the whys and wherefores of changes in average batting skill I leave to others with greater personal and historical knowledge of the game. But the major trends are clear:

(1) The average level of batting skill has improved steadily and substantially since 1876. The .120-point difference implies that a batter with 1979-average skills would in 1876 have had the value of an otherwise 1876-average batter who hit enough extra singles for a .385 batting average.

(2) The American and National Leagues were closely matched in average batting strength for the first four decades (although not in number of superstars, the AL usually having many more). About  1938 the National League began to pull ahead of the American, reaching its peak superiority in the early ’60s. A resurgence during the ’70s makes the American League somewhat the tougher today, mainly because of the DH rule.

(3) The recent and also the earliest expansions had only slight and short-lived effects on batting competitiveness. However, the blip around 1900 shows the substantial effect on competition that changing the number of teams from 12 to 8 to 16 can have!

(4) World War II greatly affected competitiveness in 1944 and 1945.

Many baseball fans, myself included, like to imagine how a Ruth or a Wagner would do today. To help in these fantasies, I have compiled a table of batting average and slugging percentage corrections, based again on forcing differences in league batting skill overall into changes in the frequency of singles only. However, league batting averages and slugging percentages have been added back in, to reflect differences in playing conditions as well as in the competition. To convert a player’s record in year A to an equivalent performance in season B, one should first add to his year A batting and slugging averages the corrections tabulated for season A and then subtract the corrections shown for season B. The frequency of such other events as walks or stolen bases then can, optionally, be corrected for any difference in league frequencies between seasons A and B.

One interesting illustration might start with Honus Wagner’s great 1908 season (BWA = +.145). What might Wagner have done in the 1979 American League, given a livelier ball but tougher competition? The table yields a batting average correction of -.059 - (+.003) = -.062 and a slugging correction of -.020 - (-.029) = +.009, which applied to Wagner’s 1908 stats gives a 1979 BA of .292 and SPct of .551. (In 600 ABs, he would have, say, 30 HRs, 10 3BHs, 35 2BHs.) Wagner’s stolen base crown and tenth place tie in walks translate directly to similar positions in the 1979 stats. That’s impressive batting production for any shortstop, and a “1979 Honus Wagner” would doubtless be an All-Star Game starter!
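
[In code, that correction procedure reads as below; the function is illustrative, and Wagner’s 1908 slugging figure (.542) is inferred from the corrections and the .551 result quoted above.]

    def convert(average_year_a, correction_a, correction_b):
        """Convert a batting or slugging average from season A to an equivalent
        in season B: add season A's tabulated correction, subtract season B's."""
        return average_year_a + correction_a - correction_b

    # Honus Wagner, 1908 NL converted to the 1979 AL
    print(round(convert(0.354, -0.059, 0.003), 3))   # 0.292 batting average
    print(round(convert(0.542, -0.020, -0.029), 3))  # 0.551 slugging percentage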

These results are fairly typical. Any 20th century superstar would be a star today. Indeed a young Babe Ruth or Ted Williams would outbat any of today’s stars. But of course, any of today’s stars–Parker, Schmidt, Rice, Carew–would before 1955 have been a legendary superstar. Perhaps they almost deserve their heroic salaries!

Facts are often hard on legends, and many may prefer to believe veterans belittling the technical competence of today’s baseball as compared, say, to pre-World War II. Indeed, “little things” may have been executed better by the average 1939 player. However, so great is the improvement in batting that if all other aspects of play were held constant, a lineup of average 1939 hitters would finish 20 to 30 games behind a lineup of average 1979 hitters, by scoring 200 to 300 fewer runs. This should hardly surprise an objective observer. Today’s players are certainly taller and heavier, are drawn from a larger population, especially more countries and races, and are more carefully taught at all levels of play. If the host of new track and field Olympic records established every four years is any indication, they can run faster and farther. Why shouldn’t they hit a lot better?

Stats and History, Part 4

This is the fourth and final installment on the subject of how baseball’s statistics evolved. The text below is from the opening chapter of The Hidden Game of Baseball (1984), on which Pete Palmer and I collaborated.  On to the pitching statistics, the ones you commonly see. First is wins, with its correlated average of won-lost percentage. Wins are a team statistic, obviously, as are losses, but we credit a win entirely to one pitcher in each game. Why not to the shortstop? Or the left fielder? Or some combination of the three? In a 13–11 game, several players may have had more to do with the win than any pitcher. No matter. We’re not going to change this custom, though Ban Johnson gave it a good try.

To win many games a pitcher generally must play for a team which wins many games (we discount relievers from this discussion because they rarely win 15 or more), or must benefit from extraordinary support in his starts, or must allow so few runs that even his team’s meager offense will be enough, as Tom Seaver and Steve Carlton did in the early 1970s. Verdict on both wins and won-lost percentage: situation-dependent. Look at Red Ruffing’s W-L record with the miserable Red Sox of the 1930s, then his mark with the Yankees. Or Mike Cuellar with Houston, then with Baltimore. Conversely, look at Ron Davis with the Yanks and then the Twins. There is an endless list of good pitchers traded up in the standings by a tailender to “emerge” as stars.

The recognition of the weakness of this statistic came early. Originally it was not computed by such men as Chadwick because most teams leaned heavily, if not exclusively, on one starter, and relievers as we know them today did not exist. As the season schedules lengthened, the need for a pitching staff became evident, and separating out the team’s record on the basis of who was in the box seemed a good idea. It was not and is not a good statistic, however, for the simple reason that one may pitch poorly and win, or pitch well and lose.

The natural corrective to this deficiency of the won-lost percentage is the earned run average—which, strangely, preceded it, gave way to it in the 1880s, and then returned in 1913. Originally, the ERA was computed as earned runs per game because pitchers almost invariably went nine innings. In this century it has been calculated as ER times 9 divided by innings pitched.
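
[For the record, the modern calculation is trivial to state in code; a sketch:]

    def earned_run_average(earned_runs, innings_pitched):
        """Modern ERA: earned runs scaled to a nine-inning game."""
        return 9 * earned_runs / innings_pitched

    print(earned_run_average(70, 210))  # 3.0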

The purpose of the earned run average is noble: to give a pitcher credit for doing what he can to prevent runs from scoring, aside from his own fielding lapses and those of the men around him. It succeeds to a remarkable extent in isolating the performance of the pitcher from his situation, but objections to the statistic remain. Say a pitcher retires the first two men in an inning, then has the shortstop kick a ground ball to allow the batter to reach first base. Six runs follow before the third out is secured. How many of these runs are earned? None. (Exception: If a reliever comes on in mid-inning, any men he puts on base who come in to score would be classified as earned for the reliever, though unearned for the team statistic. This peculiarity accounts for the occasional case in which a team’s unearned runs will exceed the individual totals of its staff.) Is this reasonable? Yes. Is it a fair depiction of the pitcher’s performance in that inning? No.

The prime difficulty with the ERA in the early days, say 1913, when one of every four runs scored was unearned, was that a pitcher got a lot of credit in his ERA for playing with a bad defensive club. The errors would serve to cover up in the ERA a good many runs which probably should not have scored. Those runs would hurt the team, but not the pitcher’s ERA. This situation is aggravated further by use of the newly computed ERAs for pitchers prior to 1913, the first year of its official status. Example: Bobby Mathews, sole pitcher for the New York Mutuals of 1876, allowed 7.19 runs per game, yet his ERA was only 2.86—almost a perfect illustration of the league’s 40 percent proportion of earned runs.

In modern baseball, post-1946, with 88 out of every 100 runs being earned, the problem has shifted. The pitcher with the bad defense behind him is going to be hurt less by errors than by balls that wind up recorded as base hits which a superior defensive team might have stopped. Bottom line: You pitch for a bad club, you get hurt. There is no way to isolate pitching skill completely unless it is through play-by-play observation and meticulous, consistent bookkeeping.

In a column in The Sporting News on October 9, 1976, Leonard Koppett, in an overall condemnation of earned run average as a misleading statistic, suggested that total runs allowed per game would be a better measure. It is a proposition worth considering, now that the proportion of earned runs has been level for some forty years; one can reasonably assume that further improvements in fielding would be of an infinitesimal nature. [In 2012, this comment seems at least debatable.] However, when you look at the spread in fielding percentage between the worst team and the best, and then examine the number of additional unearned runs scored, pitchers on low-fielding-percentage teams probably still have a good case for continuing to have their effectiveness computed through the ERA. In 1982, for example, in the American League, only 39 of the runs scored against Baltimore were the result of errors; yet Oakland, with the most error-prone defense in the league, allowed 84 unearned runs.

What gave rise to the ERA, and what we appreciate about it, is that like batting average it is an attempt at an isolation stat, a measure of individual performance not dependent upon one’s own team. While the ERA is a far more accurate reflection of a pitcher’s value than the BA is of a hitter’s, it fails to a greater degree than BA in offering an isolated measure. For a truly unalloyed individual pitching measure we must look to the glamour statistic of strikeouts, the pitcher’s mate to the home run (though home runs are highly dependent upon home park, strikeouts to only a slight degree).

Is a strikeout artist a good pitcher? Maybe yes, maybe no, as indicated in the discussion of the Carlton-Ryan-Johnson triad [in the Introduction, not republished in this blog series]; an analogue would be to ask whether a home-run slugger was a good hitter. The two stats run together: periods of high home-run activity (as a percentage of all hits) invariably are accompanied by high strikeout totals. Strikeout totals, however, may soar even in the absence of overzealous swingers, say, as the result of a rules change such as the legalization of overhand pitching in 1884, the introduction of the foul strike (NL, 1901; AL, 1903), or the expansion of the strike zone in 1963.

Just as home-run totals are a function of the era in which one plays, so are strikeouts. The great nineteenth-century totals—Matches Kilroy’s 513, Toad Ramsey’s 499, One Arm Dailey’s 483—were achieved under different rules and fashions. No one in the century fanned batters at the rate of one per inning; indeed, among regular pitchers (154 innings or more), only Herb Score did until 1960. In the next five years the barrier was passed by Sandy Koufax, Jim Maloney, Bob Veale, Sam McDowell, and Sonny Siebert. Walter Johnson, Rube Waddell, and Bob Feller didn’t run up numbers like that. Were they slower, or easier to hit, than Sonny Siebert?

Even in today’s game, which lends itself to the accumulation of, by historic standards, high strikeout totals for a good many pitchers and batters, the strikeout is, as it always has been, just another way to make an out. Yes, it is a sure way to register an out without the risk of advancing baserunners and so is highly useful in a situation like man on third with fewer than two outs; otherwise, it is a vastly overrated stat because it has nothing to do with victory or defeat—it is mere spectacle. A high total indicates raw talent and overpowering stuff, but the imperative of the pitcher is simply to retire the batter, not to crush him. What’s not listed in your daily averages are strikeouts by batters—fans are not as interested in that because it’s a negative measure—yet the strikeout may be a more significant stat for batters than it is for pitchers.

On second thought, maybe it’s just the same. So few errors are being made these days—2 in 100 chances, on average—maybe there’s not a great premium on putting the ball into play anymore. Sure, you might move a runner up with a grounder hit behind him or with a long fly, but on the other hand, with a strikeout you do avoid hitting into a double play. At least that’s what Darryl Strawberry said in his rookie season when asked why he was unperturbed about striking out every third time he came to the plate!

Bases on balls will drive a manager crazy and put lead in fielders’ feet, but it is possible to survive, even to excel, without first-rate control, provided your stuff is good enough to hold down the number of hits. Occasionally you will see a stat called Opponents’ Batting Average, or Opponents’ On Base Average, or Opponents’ Slugging Percentage, all of which seem at first blush more revealing than they are. In fact these calculations are all academic, in that it doesn’t matter how many men a pitcher puts on base. Theoretically he can put three men on base every inning, twenty-seven baserunners allowed in all, and pitch a shutout. A man who gives up one hit over nine innings can lose 1–0; it’s even possible to allow no hits and lose. Who is the better pitcher? The man with the shutout and twenty-seven baserunners allowed, or the man who allows one hit? No matter how sophisticated your measurements for pitchers, the only really significant one is runs. [Today I might add, “unless you’re evaluating players for purposes of salary offer or acquisition.”]

The nature of baseball at all points is one man against nine. It’s the pitcher against a series of batters. With that situation prevailing, we have tended to examine batting with intricate, ingenious stats, while viewing pitching through generally much weaker, though perhaps more copious, measurements. What if the game were to be turned around so that we had a “pitching order”—nine pitchers facing one batter? Think of that for one minute. The nature of the statistics would change, too, so that your batting stats would be vastly simplified. You wouldn’t care about all the individual components of the batter’s performance, all combining in some obscure fashion to reveal run production. You’d care only about runs. Yet what each of the nine pitchers did would bear intense scrutiny, and over the course of a year each pitcher’s Opponents’ BA, Opponents’ OBA, Opponents’ SLG, and so forth, would be recorded and turned this way and that to come up with a sense of how many runs saved each pitcher achieved.

A stat with an interesting history is complete games. This is your basic counter stat, but it’s taken to mean more than most of those measurements by baseball people and knowledgeable baseball fans. When everyone was completing 90–100 percent of his starts, the stat was without meaning and thus was not kept. As relief pitchers crept into the game after 1905, the percentage of complete games declined rapidly…. By the 1920s it became a point of honor to complete three quarters of one’s starts; today the man who completes half is quite likely to lead his league. [Another sentence that raises an eyebrow in 2012.] So with these shifting standards, what do CGs mean? Well, it’s useful to know that of a pitcher’s 37 starts, he completed 18. That he accepted no assistance in 18 of his 37 games is indisputable; that he required none is a judgment for others such as fans or press to make. There is managerial discretion involved: it is seldom a pitcher’s decision whether to go nine innings or not, and there are different managerial styles and philosophies.

There are the pilots who will say give me a good six or seven, fire as hard as you can as long as you can, and I’ll bring in The Goose to wrap it up. There are others who encourage their starting pitchers to go nine, feeling that it builds team morale, staff morale, and individual confidence. Verdict: situation-dependent, to a fatal degree. CGs tell you as much about the manager and his evaluation of his bullpen as they tell you about the arm or the heart of the pitcher.

Can we say that a pitcher with 18 complete games out of 37 starts is better than one with 12 complete games in 35 starts? Not without a lot of supporting help we can’t, not without a store of knowledge about the individuals, the teams, and especially the eras involved. The more uses to which we attempt to put the stat, the weaker it becomes, the more attenuated its force. If we declare the hurler with 18 CGs “better,” how are we to compare him with another pitcher from, say, fifty years earlier who completed 27 out of 30 starts? Or another pitcher of eighty years ago who completed all the games he started? (Jack Taylor completed every one of the 187 games he started over five years.) Or what about Will White, who in 1880 started 76 games and completed 75 of them? But the rules were different, you say, or the ball was less resilient, or they pitched from a different distance, with a different motion, or this, or that. The point is, there are limits to what a traditional baseball statistic can tell you about a player’s performance in any given year, let alone in comparing his efforts to those of a player from a different era.

Perhaps the most interesting new statistic of [the last] century is the one associated with the most significant strategic element since the advent of the gopher ball—saves. Now shown in the papers on a daily basis, saves were not officially recorded at all until 1960; it was at the instigation of Jerry Holtzman of the Chicago Sun-Times, with the cooperation of The Sporting News, that this statistic was finally accepted. The need arose because relievers operated at a disadvantage when it came to picking up wins, and at an advantage in ERA. The bullpenners were a new breed, and as their role increased, the need arose to identify excellence, as it had long ago for batters, starting pitchers, and fielders.

The save is, clearly, another stat that hinges on game situation and managerial discretion. If you are a Ron Davis on a team that has a Goose Gossage, the best you can hope for is to have a great won-lost record, as Davis did in 1979 and ’80. To pile up a lot of saves, you have to be saved for save situations, as Martin reserves Gossage; Howser, Quisenberry; or Herzog, Sutter. These relief stars are not brought in with their teams trailing; the game must be tied or, preferably, the lead in hand. The prime statistical drawback is that there is no negative to counteract the positive, no stat for saves blown (except, all too often, a victory for the “fireman”).

In April 1982, Sports Illustrated produced a battery of well-conceived, thought-provoking new measurements for relief pitchers which at last attempted, among other things, to give middle and long relievers their due. Alas, the SI method was too rigorous for the average fan, and the scheme dropped from sight. It was a worthy attempt, but perhaps the perfect example of breaking a butterfly on the wheel. The “Rolaids Formula,” which at least takes games lost and games won into account, is a mild improvement over simply counting saves or adding saves and wins. It awards two points for a save or a win and deducts one point for a loss. The reasoning, we suppose, is that a reliever is a high-wire walker without a net—one slip may have fatal consequences. His chances of drawing a loss are far greater than his chances of picking up a win, which requires the intervention of forces not his own.
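
[In code, the Rolaids points amount to no more than this; the function name is mine.]

    def rolaids_points(wins, saves, losses):
        """'Rolaids Formula': two points for each win or save, minus one per loss."""
        return 2 * (wins + saves) - losses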

So today, when we have BABIP, WHIP, VORP, plus video analysis to back up the late-night noodling, we have better ways to evaluate pitching, and especially to correlate it, or unshackle it, from fielding.

Stats and History, Part 3

This is the third installment on the subject of how baseball’s statistics evolved from Outs and Runs in 1845. The text below continues the publication online, for the first time, of the opening chapter of The Hidden Game of Baseball (1984). Other statistics introduced before the turn of the century were stolen bases (though not caught stealing), sacrifice bunts, doubles, triples, homers, strikeouts for batters and for pitchers, bases on balls, hit by pitch (HBP), and erratically, grounded into double play (GIDP). Caught stealing figures are available on a very sketchy basis in some of the later years of the century, as some newspapers carried the data in the box scores of home-team games. From 1907 on, [Ernie] Lanigan recorded CS in box scores of the New York Press, but the leagues did not keep the figure officially until 1920. The AL has CS from that year to the present, excepting 1927 … [while] the NL kept CS from 1920 to 1925, then not again until 1951. […]

The sacrifice bunt became a prime offensive weapon of the 1880s and began appearing as a statistical entry in box scores by 1889. The totals piled up in the years when a single run was precious—from 1889 to 1893, then again from 1901 to 1920—were stupendous by modern standards (sacrifices counted as at bats until the early 1890s). Hardy Richardson had 68 sacrifice hits in 1891 (in 74 games!), Ray Chapman 67 in 1917; today it is unusual to see a player with as many as 20.

Batter bases on balls (and strikeouts) were recorded for the last year of the American Association, 1891, by Boston’s Clarence Dow, and for some years of the mid-’90s in the National League, but didn’t become an official statistic until 1910 in the NL, 1913 in the AL. Caught stealing, hit by pitch and grounded into double plays were not kept steadily in the nineteenth century, making it impossible for modern statisticians to apply the most sophisticated versions of their measures to early players.

The [next century added] little in the way of new official statistics—ERA and RBIs and SLG are better regarded as revivals despite their respective adoption dates of 1912, 1920, and 1923. These are significant measures, to be sure, but they represent official baseball’s classically conservative response to innovation: wait forty or fifty years, then “make it new.” Running counter to that trend have been baseball’s two most interesting new stats of the century [to 1984], the save and the game winning RBI. Both followed in fairly close relationship to a perception that something was occurring on the field yet, because it was not being measured, it had no verifiable reality [a later analogous situation became the middle reliever’s Hold]. (Another such stat which did not survive, alas, was stolen bases off pitchers, which the American League recorded only in 1920–24.)

The same could have been said back in 1908, in a classic case of a statistic rushing in to fill a void, as Phillies’ manager Billy Murray observed that his outfielder Sherry Magee had the happy facility of providing a long fly ball whenever presented with a situation of man on third, fewer than two outs. Taking up the cudgels on his player’s behalf, Murray protested to the National League office that it was unfair to charge Magee with an unsuccessful time at bat when he was in fact succeeding, doing precisely what the situation demanded. Murray won this point, but baseball flip-flopped a couple of times on this stat, in some years reverting to calling it a time at bat, in other years not even crediting an RBI.

The most delightfully loony stat of the century (though the GWRBI [gave] it a run for the money) was unofficial: the “All-American Pitcher” award, given to the Giants’ reliever Otis Crandall after the 1910 season, then sinking into deserved oblivion. It went like this: Add together a pitcher’s won-lost percentage, fielding percentage, and batting average, and voila, you get an All-American. Crandall’s combined figures of .810, .984, and .342, respectively, gave him 2,136 points and, according to those in the know, the best mark of all time, surpassing Al Spalding’s 2,096 points of 1875. Who is the all-time All-American since 1910? You tell us. But seriously, folks, the idea wasn’t a bad one—measuring the overall ability of pitchers—it was just that the inadequacies of the individual statistics were magnified by lumping them in this way.

There have been other new statistical tabulations in this century, but of a generally innocuous sort: counting intentional bases on balls, balks, wild pitches, shutouts, and sacrifice bunts and sacrifice flies against pitchers. Other new stats of a far superior quality appeared in the 1940s and ’50s but have not yet [as of 1984] gained the official stamp of approval. […]

Now that the genealogy of the more significant official measures has been described, it’s time to evaluate the important ones you saw in the newspapers over breakfast, and a few which are tabulated officially only at year’s end, or are found in the weekly Sporting News. [This sentence today seems particularly wistful and quaint.]

The first offensive statistic to consider will be that venerable, uncannily durable fraud, the batting average. What’s wrong with it? What’s right with it? We’ve recited the objections for the record, but we know as well as anyone else that this monument just won’t topple; the best that can be hoped is that in time fans and officials will recognize it as a bit of nostalgia, a throwback to the period of its invention when power counted for naught, bases on balls were scarce, and no one wanted to place a statistical accomplishment in historical context because there wasn’t much history as yet.

Time has given the batting average a powerful hold on the American baseball public; everyone knows that a man who hits .300 is a good hitter while one who hits .250 is not. Everyone knows that, no matter that it is not true. You want to trade Bill Madlock for Mike Schmidt? Bill Buckner for Darrell Evans? BA treats all its hits in egalitarian fashion. A two-out bunt single in the ninth with no one on base and your team trailing by six runs counts the same as Bobby Thomson’s “shot heard ‘round the world.” And what about a walk? Say you foul off four 3–2 pitches, then watch a close one go by to take your base. Where’s your credit for a neat bit of offensive work? Not in the BA. And a .250 batting average may have represented a distinct accomplishment in certain years, like 1968 when the American League average was .230. That .250 hitter stood in the same relation to an average hitter of his season as a .277 hitter did in the National League in 1983—or a .329 hitter in the NL of 1930! If .329 and .250 mean the same thing, roughly, what good is the measure?

So in attempting to assess batting excellence with the solitary yardstick of the batting average, we tend to diminish the accomplishments of (a) the extra-base hitter, whose blows have greater run-scoring potential, both for himself and for whatever men may be on base; (b) the batter whose talent is to extract walks from pitchers who do not wish to put him on base, or whose power is such that pitchers will take their chances working the corners of the plate rather than risk an extra-base hit; (c) the batter whose misfortune it is to be playing in a period dominated by pitching, either because of the game’s evolutionary cycles or because of rules-tinkering to stem a previous domination by batters; and (d) the man whose hits are few but well timed, or clutch: they score runs. In brief, the BA is an unweighted average; it fails to account for at least one significant offensive category (not to mention hit by pitch, sacrifices, steals, and grounded into double play); it does not permit cross-era comparison; and it does not indicate value to the team.

And yet, the batting champion each year is declared to be the one with the highest batting average, and this will not soon change. The Hall of Fame is filled with .300 hitters who couldn’t carry the pine tar of many who will stay forever on the outside looking in. Knowledgeable fans have long realized that the abilities to reach base and to produce runs are not adequately measured by batting average, and they have looked to other measures, for example, the other two components of the “triple crown,” home runs and RBIs. Still more sophisticated fans have looked to the slugging percentage or On Base Average. […]

How well do these other stats compensate for the weaknesses of the BA when viewed in conjunction with it or in isolation? The slugging percentage does acknowledge the role of the man whose talent is for the long ball and who may, with management’s blessing, be sacrificing bat control and thus batting average in order to let ‘er rip. (The slugging percentage is the number of total bases divided by at bats rather than hits divided by at bats, which is the BA.) But the slugging percentage has its problems, too.

It declares that a double is worth two singles, that a triple is worth one and a half doubles, and that a home run is worth four singles. All of these proportions are intuitively pleasing, for they relate to the number of bases touched on each hit, but in terms of the hits’ value in generating runs, the proportions are wrong. A home run in four at bats is not worth as much as four singles, for instance, in part because the run potential of the four singles is greater, in part because the man who hit the four singles did not also make three outs; yet the man who goes one for four at the plate, that one being a homer, has the same slugging percentage of 1.000 as a man who singles four times in four at bats.
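To make that comparison concrete, here is a small Python sketch of the slugging percentage as defined above (total bases divided by at bats), applied to the two hypothetical four-at-bat lines just described:

```python
def slugging_pct(singles, doubles, triples, homers, at_bats):
    # Total bases weighted 1-2-3-4, divided by at bats.
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / at_bats

print(slugging_pct(0, 0, 0, 1, 4))  # one homer in four at bats -> 1.000
print(slugging_pct(4, 0, 0, 0, 4))  # four singles in four at bats -> 1.000, identical
```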

Moreover, it is possible to attain a high slugging percentage without being a slugger. In other words, if you have a high batting average, you must have a decent slugging percentage; it’s difficult to hit .350 and have a slugging percentage of only .400. Even a bunt single boosts not only your batting average but also your slugging percentage. […]

Other things the slugging percentage does not do are: indicate how many runs were produced by the hits; give any credit for other offensive categories, such as walks, hit by pitch, or steals; permit the comparison of sluggers from different eras (if Jimmie Foxx had a slugging percentage of .749 in 1932 and Mickey Mantle had one of .705 in 1957, was Foxx 7 percent superior? The answer is no). […]

Well, how about On Base Average? It has been around for quite a while and still [in 1984] is not an official statistic of the major leagues. But it does appear on a daily basis in some newspapers’ leaders section, weekly in The Sporting News, and annually in the American League’s averages book (since 1979, when Pete Palmer put it there). The OBA has the advantage of giving credit for walks and hit by pitch, but is an unweighted average and thus makes no distinction between those two events and, say, a grand-slam homer. A fellow like Eddie Yost, who drew nearly a walk a game in some years in which he hit under .250, gets his credit with this stat, as does a Gene Tenace, one of those guys whose statistical line looks like zip without his walks. Similarly, players like Mickey Rivers or Mookie Wilson, leadoff hitters with a lot of speed, no power, and no patience, are exposed by the OBA as distinctly marginal major leaguers, even in years when their batting averages look respectable or excellent. In short, the OBA does tell you more about a man’s ability to get on base than the BA does, and thus is a better indicator of run generation, but it’s not enough by itself to separate “good” hitters from “average” or “poor” ones.
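A minimal sketch of the On Base Average follows; the sacrifice-fly term in the denominator reflects the form later made official, and the two player lines are hypothetical, meant only to show a Yost-style walker outpacing a free-swinging .300 hitter:

```python
def on_base_average(hits, walks, hit_by_pitch, at_bats, sac_flies=0):
    # Times reached base divided by opportunities; leave sac_flies at 0 for
    # the simpler version discussed in the text.
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)

# Hypothetical lines, 600 plate appearances each:
print(round(on_base_average(112, 120, 0, 480), 3))  # a .233 hitter with 120 walks -> 0.387
print(round(on_base_average(174, 20, 0, 580), 3))   # a .300 hitter with 20 walks  -> 0.323
```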

RBIs? Don’t they indicate run production and clutch ability? Yes and no. They tell how many runs a batter pushed across the plate, all right, but they don’t tell how many fewer he might have driven in had he batted eighth rather than fourth, or how many more he might have driven in on a team that put more men on base. They don’t even tell how many more runs a batter might have driven in if he had delivered a higher proportion of his hits with men on base. (The American League kept RBI Opportunities—men on base presented to each batter—as an official stat for the first three weeks of 1918, then saw how much work was involved and ditched it.) […]

The RBI does tell you something about run-producing ability, but not enough: It’s a situation-dependent statistic, inextricably tied to factors which vary wildly for individuals on the same team or on others. And the RBI makes no distinction between being hit by a pitch to drive in the twelfth run of a game that concludes 14–3 and, again for comparison, the Thomson blast. [The counting stats are limited in their usefulness, except to say that the fellow who hit 39 doubles was better at that skill than the fellow who hit 38.]

It’s an odd fact that from being the most interesting stat in the early days of baseball, runs has become the least interesting stat of today; it’s odd in that runs remain the essence of baseball, remain the key to victory. What has happened over the years is that the gap between runs and times reached base has been almost constantly widening. In 1875 the number of hits allowed per nine innings was, incredibly, not much different from what it is today. Tommy Bond of Hartford allowed only 7.95 hits per nine innings (facing underhand pitching was easy?). Bases on balls were in force at the time, but eight balls were required to get one, which accounts for their scarcity in the 1870s. Today, with walks greatly increased and hits only somewhat reduced, the number of runs per nine innings has dropped dramatically, although not the number of earned runs. Indeed, as the ratio of runs to hits has diminished through the years, the ratio of earned runs to total runs has increased. In 1876, for example, the National League scored 3,066 runs, of which only 1,201—39.2 percent—were earned. By the early 1890s this figure reached 70 percent, an extraordinary advance. It took until 1920 to reach 80 percent, and by the late 1940s it leveled off in the 87–89 percent range, where it remains.

In the fourth and final installment, we will move on to pitching statistics, thus setting the scene for the sabermetric revolution that, in 1984, was still regarded as nerdville and nothing more.

Stats and History, Part 2

Let’s return to the subject of how baseball’s statistics came into being and changed over time. The text below continues, in excerpted form, the publication online, for the first time, of the opening chapter of The Hidden Game of Baseball (1984). Chadwick’s bias against the long ball was in large measure responsible for the game that evolved and for the absence of a hitter like Babe Ruth until 1919. When lively balls were introduced—as they were periodically from the very infancy of baseball—and long drives were being belted willy-nilly, and scores were mounting, Chadwick would ridicule such games in the press. What he valued most in the early days was the low scoring game marked by brilliant fielding. In the early annual guides, he listed all the “notable” games between significant teams—i.e., those in which the winner scored under ten runs!

Chadwick prevailed, and Hits Per Game became the criterion for the Clipper batting championship and remained so until 1876, when the problem with using games as the denominator in the average at last became clear. If you were playing for a successful team, and thus were surrounded by good batters, or if your team played several weak rivals who committed many errors, the number of at bats for each individual in that lineup would increase. The more at bats one is granted in a game, the more hits one is likely to have. So if Player A had 10 at bats in a game, which was not unusual in the ’60s, he might have 4 base hits. In a more cleanly played game, Player B might bat only 6 times, and get 3 base hits. Yet Player A, with his 4-for-10, would achieve an average of 4.00; the average of Player B, who went 3-for-6, would be only 3.00. By modern standards, of course, Player A would be batting .400 while Player B would be batting .500.
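A short Python sketch of the two denominators at work, using the Player A and Player B lines above:

```python
def hits_per_game(hits, games):
    # The Clipper's pre-1876 criterion: games as the denominator.
    return hits / games

def batting_average(hits, at_bats):
    # The modern criterion: at bats as the denominator.
    return hits / at_bats

# Player A: 4 hits in 10 at bats, all in a single game.
print(hits_per_game(4, 1), batting_average(4, 10))  # 4.0 versus .400
# Player B: 3 hits in 6 at bats, all in a single game.
print(hits_per_game(3, 1), batting_average(3, 6))   # 3.0 versus .500
```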

In short, the batting average used in the 1860s is the same as that used today except in its denominator, with at bats replacing games. Moreover, Chadwick posited a primitive version of the slugging percentage in the 1860s, with total bases divided by number of games; change the denominator from games to at bats and you have today’s slugging percentage—which, incidentally, was not accepted by the National League as an official statistic until 1923 and by the American until 1946 (the game was born conservative). Chadwick’s “total bases average” represents the game’s first attempt at a weighted average—an average in which the elements collected together in the numerator or the denominator are recognized numerically as being unequal. In this instance, a single is the unweighted unit, the double is weighted by a factor of two, the triple by three, and the home run by four. Statistically, this is a distinct leap forward from, first, counting, and next, averaging. The weighted average is in fact the cornerstone of today’s statistical innovations.
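The same denominator swap applies to Chadwick’s weighted figure; here is a brief sketch with the weights 1 through 4 as described above (the hit totals in the example are hypothetical):

```python
def total_bases(singles, doubles, triples, homers):
    # Chadwick's weighting: single 1, double 2, triple 3, home run 4.
    return singles + 2 * doubles + 3 * triples + 4 * homers

def total_bases_average(singles, doubles, triples, homers, games):
    # Chadwick's 1860s "total bases average": weighted hits per game.
    return total_bases(singles, doubles, triples, homers) / games

def modern_slugging(singles, doubles, triples, homers, at_bats):
    # Same numerator, at bats as the denominator: today's slugging percentage.
    return total_bases(singles, doubles, triples, homers) / at_bats

# A hypothetical season: 100 singles, 25 doubles, 5 triples, 10 homers.
print(total_bases_average(100, 25, 5, 10, games=80))           # 2.5625 per game
print(round(modern_slugging(100, 25, 5, 10, at_bats=500), 3))  # 0.41 per at bat
```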

The 1870s gave rise to some new batting stats and to the first attempt to quantify thoroughly the other principal facets of the game, pitching and fielding. Although the Clipper recorded base hits and total bases as early as 1868, a significant wrinkle was added in 1870 when at bats were listed as well. This was a critical introduction because it permitted the improvement of the batting average, first introduced in its current form in the Boston press on August 10, 1874, and first computed officially—that is, for the National League—in 1876.

Since then the BA has not changed. [NOTE: later research revealed an earlier inception of the concept for a modern batting average, by Hervie Alden Dobson in the Clipper of March 11, 1871.] The objections to the batting average are well known, but to date [i.e., 1984] have not dislodged the BA from its place as the most popular measure of hitting ability. First of all, the batting average makes no distinction between the single, the double, the triple, and the home run, treating all as the same unit—a base hit—just as its prototype, Runs Per Game, treated the run as its unvarying, indivisible unit. This objection was met in the 1860s with Total Bases Per Game. Second, it gives no indication of the effect of that base hit; in other words, it gives no indication of the value of the hit to the team. This was probably the objection that Chadwick had to tabulating base hits originally, because it is not likely that the idea just popped into his head in 1867, upon which he decided to act immediately; he must have thought of a hit-constructed batting average earlier and rejected it.

A third objection to the batting average is that it does not take into account times first base is reached via base on balls, hit by pitch, or error. This, too, was addressed at a surprisingly early date. In 1879 the National League adopted as an official statistic a forerunner of the On Base Average; it was called “Reached First Base.” Paul Hines was the leader that year with 193, which included times reached by error as well as base on balls and base hits. But the figure was dropped after that year. […]

The year 1876 was significant not only for the founding of the National League and the official debut of the batting average in its current form; it was also the Centennial of the United States, which was marked by a giant exposition in Philadelphia celebrating the mechanical marvels of the day. American ingenuity reigned, and technology was seen as the new handmaiden of democracy. Baseball, that mirror of American life, reflected the fervor for things scientific with an explosion of statistics far more complex than those seen before, particularly in the previously neglected areas of pitching and fielding. The increasingly minute statistical examination of the game met a responsive audience, one primed to view complexity as an indication of quality.

When the rule against the wrist-snap was removed in 1872, permitting curve pitching, and as the number of errors declined through the early 1870s—thanks to the heightened level of competition provided by baseball’s first professional league, the National Association—the number of runs scored dropped off markedly.

With the pitcher unshackled—transformed from a mere delivery boy of medium-paced, straight balls to a formidable adversary—the need to identify excellence, to plot the stars, arose just as it had for batters in the 1860s. Likewise, as fielding errors became more the exception than the rule, they became at last worth counting and contrasting with chances accepted cleanly; in other words, the fielding percentage. Fielding skill was still the most highly sought after attribute of a ballplayer, but the balance of fielding, batting, and pitching was in flux; by the 1880s pitching and batting would begin their long rise to domination of the game, Chadwick’s tastes notwithstanding.

The crossroads of 1876 highlights how the game had changed to that point, and how it has changed since.

In that year, the number of offensive stats tabulated at season’s end … was six: games, at bats, runs, hits, runs per game, and batting average. Of these, only runs and runs per game were common in the 1860s, while that decade’s tabulation of total bases vanished. The number of [official] offensive stats a hundred years later? Twenty. (Today [i.e., 1984] the number is twenty-one, with the addition of the game winning RBI.)

The number of pitching categories in 1876 was eleven, and there were some surprises, such as earned run average, hits allowed, hits per game, and opponent’s batting average. Strikeouts were not recorded, for Chadwick saw them strictly as a sign of poor batting rather than good pitching (his view had such an impact that the pitchers’ K’s were not kept officially until 1887). The number of [official] pitching stats today [i.e., 1984]? Twenty-four.

The number of fielding categories in 1876 was six. One hundred years later it was still six (with the exception of the catcher, who gets a seventh: passed balls), dramatizing how the game—at least the hidden game of statistics—had passed fielding by. The fielding stats of 1876 were combined to form an average, the “percentage of chances accepted,” or fielding percentage. A “missing link” variant, devised by Al Wright in 1875, was to form averages by dividing the putouts by the number of games to yield a “putout average”; dividing the assists similarly to arrive at an “assist average”; and dividing putouts plus assists by games to get a “fielding average.” These averages took no account of errors. (Does Wright’s “fielding average” look familiar? You may have recognized it as Bill James’s Range Factor! Everything old is new again.)
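A brief sketch of Wright’s 1875 averages and the 1876 fielding percentage as described above; the player line is hypothetical, and the point to notice is that Wright’s “fielding average” is arithmetically the same as the later Range Factor, putouts plus assists per game:

```python
def putout_average(putouts, games):
    return putouts / games

def assist_average(assists, games):
    return assists / games

def wright_fielding_average(putouts, assists, games):
    # Putouts plus assists per game; errors are ignored, which is exactly
    # why it resembles the Range Factor of a century later.
    return (putouts + assists) / games

def fielding_percentage(putouts, assists, errors):
    # The 1876 "percentage of chances accepted."
    return (putouts + assists) / (putouts + assists + errors)

# A hypothetical shortstop: 150 putouts, 300 assists, 40 errors in 100 games.
print(wright_fielding_average(150, 300, 100))        # 4.5 chances per game
print(round(fielding_percentage(150, 300, 40), 3))   # 0.918
```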

This is all testimony to the changing nature of the game—not just to the evolving approaches of statisticians, but to fundamental changes in the game itself. […] The public’s appetite for new statistics was not sated by the outburst of 1876. New measures were introduced in dizzying profusion in the remaining years of the century. Some of these did not catch on and were soon dropped, some for all time, others only to reappear with renewed vigor in the twentieth century.

The statistic that never resurfaced after its solitary appearance in 1880 was “Total Bases Run,” a wonderfully silly figure which signified virtually nothing about either an individual’s ability in isolation or his value to his team. It was sort of an RBI in reverse, or from the baserunner’s perspective. Get on with a single, proceed to score in whatever manner, and you’ve touched four bases. Abner Dalrymple of Chicago was baseball history’s only recorded leader in the category with 501. Now there’s a major league trivia question.

Another stat that was stillborn in the 1870s was times reached base on error (it was computed again in 1917–19 by the NL, then dropped for all time). Its twentieth-century companion piece, equally short-lived after its introduction in the 1910s, was runs allowed by fielders. Lanigan records this lovely bit of doggerel written to “honor” Chicago shortstop Red Corriden, whose errors in 1914 let in 20 runs:

Red Corriden was figuring the cost of livelihood.

“‘Tis plain,” he said, “I do not get the money I should.

According to my figurin’, I’d be a millionaire

 If I could sell the boots I make for 30 cents a pair.”

Previously mentioned was another stat which blossomed in only one year (1879), Reached First Base. This resurfaced, however, in the early 1950s in an improved form called On Base Average, which may be the most widely familiar of all unofficial statistics. [It was made official in 1985, the year after publication of Hidden Game.] In the same manner, the “total bases per game” tabulation of the 1860s vanished only to be named an official stat decades later in its modified version, the slugging percentage. And yet another 1860s stat, earned run average, dropped from sight in the 1880s only to return triumphant to the NL in 1912 and the AL in 1913, when Ban Johnson not only proclaimed it official but also dictated that the AL compile no official won-lost records (this state of affairs lasted for seven years, 1913–19).

Another stat which was “sent back to the minors” before settling in for good in 1920 was the RBI. Introduced by a Buffalo newspaper in 1879, the stat was picked up the following year by the Chicago Tribune, which, in the words of Preston D. Orem, “proudly presented the ‘Runs Batted In’ record of the Chicago players for the season, showing Anson and Kelly in the lead. Readers were unimpressed. Objections were that the men who led off, Dalrymple and Gore, did not have the same opportunities to knock in runs.” The paper actually wound up “almost apologizing for the computation.” Even then astute fans knew the principal weakness of the statistic to be its extreme dependence on situation—in a particular at bat, whether or not men are on base; over a season or career, one’s position in the batting order and the overall batting strength of one’s team. It is a curious bit of logical relativism to observe that fans of the nineteenth century rejected ribbies because of their poor relation to run-producing ability while twentieth-century fans embrace the stat for its presumed indication of that same quality.

More tomorrow!

Stats and History

In an odd rush of events, last weekend I attended the MIT Sloan Sports Analytics Conference in Boston, where I took part in a panel with Bill James, John Dewan, Dean Oliver, and John Walsh. After one day back at home, I departed for Phoenix to attend the NINE Spring Training Conference, a baseball history and culture affair I have addressed previously, but this time I am a noncombatant, with no obligation other than fun. I will stay out here until March 15, when I will attend the SABR Analytics Conference, a new and promising venture at which I’ll be part of a “how sabermetrics began” panel with Dick Cramer, Gary Gillette, and Sean Forman.

Bill James was not the first to think about baseball in sabermetric terms, though he coined the term and is the godfather of the burgeoning movement, now prevalent in all individual and team sports. When MIT held its first conference five years ago, two hundred people attended. Last year the attendance topped a thousand, barely. This year it exceeded 2,200. Bill must have been unimaginably gratified to see the fruits of his labors. I was, too, a little. I tweeted from the conference, “Feeling like Rip Van Winkle to be here and see how huge sports analysis has become since Pete Palmer and I partnered 30 years ago.”

I am not and have never been a statistician, but I might fairly be called an early worker in the lonely fields of sabermetrics (in football, too, with Palmer and the lamentably departed Bob Carroll). My statistical writing is surely behind me, and I confess that I struggle to maintain interest amid the current swarm of digitally convergent information: batted ball and pitch locations, advanced defensive metrics, and endless video to assess bat speed, arm angles, and the like. However, thanks to SABR—at my first national convention, as a member of two weeks’ standing, the first two men I met were Carroll and Palmer—I have a place in sabermetric history.

Let me share with you some portions of the opening chapter of The Hidden Game of Baseball, published in 1984. I was asked at the conference whether I would wish to see it reprinted. “No,” I replied, “not in any revised or updated fashion. It is a historical marker, a period piece that reflects where we were then in our thinking.” The book is now prized among collectors and retains an honored place among today’s sports analysts. Here goes:

Before we assess where baseball statistics are headed, we ought first to see where they’ve been.

In the beginning, baseball knew numbers and was not ashamed. The game’s Eden dates ca. 1845, the year in which Alexander Cartwright and his Knickerbocker teammates codified the first set of rules and the year in which the New York Herald printed the primal box score. [I have recently, in Baseball in the Garden of Eden, upended some notions about the Knicks.] The infant game became quantified in part to ape the custom of its big brother, cricket; yet the larger explanation is that the numbers served to legitimize men’s concern with a boys’ pastime. The pioneers of baseball reporting—William Cauldwell of the Sunday Mercury, William Porter of Spirit of the Times, the unknown ink-stained wretch at the Herald, and later Father Chadwick—may indeed have reflected that if they did not cloak the game in the “importance” of statistics, it might not seem worthwhile for adults to read about, let alone play.

Americans of that somewhat grim period were blind to the virtue of play (much to the befuddlement of Europeans) and could take their amusements only with a chaser of purposefulness. Baseball, though simple in its essence (a ball game with antecedents in the Egypt of the pharaohs), was intricate in its detail and thus peculiarly suited to quantification; statistics elevated baseball from other boys’ field games of the 1840s and ’50s to make it somehow “serious” like business or the stock market. […]

[Henry] Chadwick’s cricket background was largely the impetus to his method of scoring a baseball game, the format of his early box scores, and the copious if primitive statistics that appeared in his year-end summaries in the New York Clipper, Beadle’s Dime Base-Ball Player, and other publications.

Actually, cricket had begun to shape baseball statistics even before Chadwick’s conversion. The first box score reported on two categories, outs and runs: Outs, or “hands out,” counted both unsuccessful times at bat and outs run into on the basepaths; “runs” were runs scored, not those driven in. The reason for not recording hits in the early years, when coverage of baseball matches appeared alongside that of cricket matches, was that, unlike baseball, cricket had no such category as the successful hit which did not produce a run. To reach “base” in cricket is to run to the opposite wicket, which tallies a run; if you hit the ball and do not score a run, you have been put out. […]

Cricket box scores were virtual play-by-plays, a fact made possible by the lesser number of possible events. This play-by-play aspect was applied to a baseball box score as early as 1858 in the New York Tribune; interestingly, despite the abundance of detail, hits were still not accounted. Nor did they appear in Chadwick’s own box scores until 1867, and his year-end averages to that time also reflected a cricket mind-set. The batting champion as declared by Chadwick, whose computations were immediately and universally accepted as “official,” was the man with the highest average of Runs Per Game.

An inverse though imprecise measure of batting quality was Outs Per Game. After 1863, when a fair ball caught on one bounce was no longer an out, fielding leaders were those with the greatest total of fly catches, assists, and “foul bounds” (fouls caught on one bounce). Pitching effectiveness was based purely on control, with the leader recognized as the one whose delivery offered the most opportunities for outs at first base and the fewest passed balls [by his brave catcher, behind the bat with neither mask nor mitt].

In a sense, Chadwick’s measuring of baseball as if it were cricket can be viewed as correct in that when you strip the game to its basic elements, those that determine victory or defeat, outs and runs are all that count in the end. No individual statistic is meaningful to the team unless it relates directly to the scoring of runs. Chadwick’s blind spot in his early years of baseball reporting lay in not recognizing the linear character of the game, the sequential nature whereby a string of base hits or men reaching base on error (there were no walks then) was necessary in most cases to produce a run. In cricket each successful hit must produce at least one run, while in baseball, more of a team game on offense, a successful hit may produce none. […]

Early player stats were of the most primitive kind, the counting kind. They’d tell you how many runs, or outs, or fly catches; later, how many hits or total bases. Counting is the most basic of all statistical processes; the next step up is averaging, and Chadwick was the first to put this into practice. […]

As professionalism infiltrated the game, teams began to bid for star-caliber players. Stars were known not by their stats but by their style: Every boy would emulate the flair of a George Wright at shortstop, the whip motion of a Jim Creighton pitching, the nonchalance of a John Chapman making the over-the-shoulder one-handed catches in the outfield (this in the days before the glove!). But Chadwick recognized the need for more individual accountability, the need to form objective credentials for those perceived as stars (or, in the parlance of the period, “aces”). The creation of popular heroes is a product of the post-Civil War period, with a few notable exceptions (Creighton, Joe Start, Dickey Pearce, J.B. Leggett).

So in 1865, in the Clipper, Chadwick began to record a form of batting average taken from the cricket pages—Runs Per Game. Two years later, in his newly founded baseball weekly, The Ball Players’ Chronicle, Chadwick began to record total bases, total bases per game, and hits per game. The averages were expressed not with decimal places but in the standard cricket format of the “average and over.” Thus a batter with 23 hits in six games would have an average expressed not as 3.83 but as “3-5”—an average of 3 with an overage, or remainder, of 5. Another innovation was to remove from the individual accounting all bases gained through errors. Runs scored by a team, beginning in 1867, were divided between those scored after a man reached base on a clean hit and those arising from a runner’s having reached base on an error.
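The “average and over” notation is simply integer division with the remainder kept; a small Python sketch, using the 23-hits-in-six-games example from the text:

```python
def average_and_over(total, games):
    # Cricket-style expression: whole-number average plus the remainder ("over").
    average, over = divmod(total, games)
    return f"{average}-{over}"

print(average_and_over(23, 6))  # "3-5": an average of 3 with an overage of 5
print(round(23 / 6, 2))         # 3.83, the decimal equivalent
```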

In 1868, despite Chadwick’s derision, the Clipper continued to award the prize for the batting championship to the player with the greatest average of Runs Per Game. Actually, the old yardstick had been less preposterous a measure of batmanship than one might imagine today, because team defenses were so much poorer and the pitcher, with severe restrictions on his method of delivery, was so much less important. If you reached first base, whether by a hit or by an error, your chances of scoring were excellent; indeed, teams of the mid-1860s registered more runs than hits! By the 1876 season, the first of National League play, the caliber of both pitching and defense had improved to the extent that the ratio of runs to hits was about 6.5 to 10; today [i.e., 1984] the ratio stands at roughly 4 to 10. […]

The Outs Per Game figure was tainted as a measure of batting skill because it might as easily reflect a strikeout as a double unsuccessfully stretched into a triple. Or, in a ridiculous but true example, a man might get on base with a single, then be forced out at second base on a ground ball. The runner who was forced out is debited with the out; not only does the man who hit the grounder fail to register a notch in the out column—if he comes around to score he’ll get a counter in the run column.

In the late 1860s Chadwick was recording total bases and home runs, but he placed little stock in either, as conscious attempts at slugging violated his cricket-bred image of “form.” Just as cricket aficionados watch the game for the many opportunities for fine fielding it affords, so was baseball from its inception perceived as a fielders’ sport. The original Cartwright rules of 1845, in fact, specified that a ball hit out of the field—in fair territory or foul—was a foul ball! “Long hits are showy,” Chadwick wrote in the Clipper in 1868, “but they do not pay in the long run. Sharp grounders insuring the first-base certain, and sometimes the second-base easily, are worth all the hits made for home-runs which players strive for.”

More than a century later, after dozens of new statistics had been created to apportion individual accomplishments, sabermetricians came around full circle to the view that outs and runs were what mattered in the end. I will continue this stats history tomorrow.
