Stats and History

In an odd rush of events, last weekend I attended the MIT Sloan Sports Analytics Conference in Boston, where I took part in a panel with Bill James, John Dewan, Dean Oliver, and John Walsh. After one day back at home, I departed for Phoenix to attend the NINE Spring Training Conference, a baseball history and culture affair I have addressed previously, but this time I am a noncombatant, with no obligation other than fun. I will stay out here to March 15, when I will attend the SABR Analytics Conference, a new and promising venture at which I’ll be part of a “how sabermetrics began” panel with Dick Cramer, Gary Gillette, and Sean Forman.

Bill James was not the first to think about baseball in sabermetric terms, though he coined the term and is the godfather of the burgeoning movement, now prevalent in all individual and team sports. When MIT held its first conference five years ago, two hundred people attended. Last year the attendance topped a thousand, barely. This year it exceeded 2200. Bill must have been unimaginably gratified to see the fruits of his labors. I was, too, a little. I tweeted from the conference, “Feeling like Rip Van Winkle to be here and see how huge sports analysis has become since Pete Palmer and I partnered 30 years ago.”

I am not and have never been a statistician, but I might fairly be called an early worker in the lonely fields of sabermetrics (in football, too, with Palmer and the lamentably departed Bob Carroll). My statistical writing is surely behind me, and I confess that I struggle to maintain interest amid the current swarm of digitally convergent information: batted ball and pitch locations, advanced defensive metrics, and endless video to assess bat speed, arm angles, and the like. However, thanks to SABR—at my first national convention, as a member of two weeks’ standing, the first two men I met were Carroll and Palmer—I have a place in sabermetric history.

Let me share with you some portions of the opening chapter of The Hidden Game of Baseball, published in 1984. I was asked at the conference whether I would wish to see it reprinted. “No,” I replied, “not in any revised or updated fashion. It is a historical marker, a period piece that reflects where we were then in our thinking.” The book is now prized among collectors and retains an honored place among today’s sports analysts. Here goes:

Before we assess where baseball statistics are headed, we ought first to see where they’ve been.

In the beginning, baseball knew numbers and was not ashamed. The game’s Eden dates ca. 1845, the year in which Alexander Cartwright and his Knickerbocker teammates codified the first set of rules and the year in which the New York Herald printed the primal box score. [I have recently, in Baseball in the Garden of Eden, upended some notions about the Knicks]. The infant game became quantified in part to ape the custom of its big brother, cricket; yet the larger explanation is that the numbers served to legitimize men’s concern with a boys’ pastime. The pioneers of baseball reporting—William Cauldwell of the Sunday Mercury, William Porter of Spirit of the Times, the unknown ink-stained wretch at the Herald, and later Father Chadwick—may indeed have reflected that if they did not cloak the game in the “importance” of statistics, it might not seem worthwhile for adults to read about, let alone play.

Americans of that somewhat grim period were blind to the virtue of play (much to the befuddlement of Europeans) and could take their amusements only with a chaser of purposefulness. Baseball, though simple in its essence (a ball game with antecedents in the Egypt of the pharaohs), was intricate in its detail and thus peculiarly suited to quantification; statistics elevated baseball from other boys’ field games of the 1840s and ’50s to make it somehow “serious” like business or the stock market. […]

[Henry] Chadwick’s cricket background was largely the impetus to his method of scoring a baseball game, the format of his early box scores, and the copious if primitive statistics that appeared in his year-end summaries in the New York Clipper, Beadle’s Dime Base-Ball Player, and other publications.

Actually, cricket had begun to shape baseball statistics even before Chadwick’s conversion. The first box score reported on two categories, outs and runs: Outs, or “hands out,” counted both unsuccessful times at bat and outs run into on the basepaths; “runs” were runs scored, not those driven in. The reason for not recording hits in the early years, when coverage of baseball matches appeared alongside that of cricket matches, was that, unlike baseball, cricket had no such category as the successful hit which did not produce a run. To reach “base” in cricket is to run to the opposite wicket, which tallies a run; if you hit the ball and do not score a run, you have been put out. […]

Cricket box scores were virtual play-by-plays, a fact made possible by the lesser number of possible events. This play-by-play aspect was applied to a baseball box score as early as 1858 in the New York Tribune; interestingly, despite the abundance of detail, hits were still not accounted. Nor did they appear in Chadwick’s own box scores until 1867, and his year-end averages to that time also reflected a cricket mind-set. The batting champion as declared by Chadwick, whose computations were immediately and universally accepted as “official,” was the man with the highest average of Runs Per Game.

An inverse though imprecise measure of batting quality was Outs Per Game. After 1863, when a fair ball caught on one bounce was no longer an out, fielding leaders were those with the greatest total of fly catches, assists, and “foul bounds” (fouls caught on one bounce). Pitching effectiveness was based purely on control, with the leader recognized as the one whose delivery offered the most opportunities for outs at first base and the fewest passed balls [by his brave catcher, behind the bat with neither mask nor mitt].

In a sense, Chadwick’s measuring of baseball as if it were cricket can be viewed as correct in that when you strip the game to its basic elements, those that determine victory or defeat, outs and runs are all that count in the end. No individual statistic is meaningful to the team unless it relates directly to the scoring of runs. Chadwick’s blind spot in his early years of baseball reporting lay in not recognizing the linear character of the game, the sequential nature whereby a string of base hits or men reaching base on error (there were no walks then) was necessary in most cases to produce a run. In cricket each successful hit must produce at least one run, while in baseball, more of a team game on offense, a successful hit may produce none. […]

Early player stats were of the most primitive kind, the counting kind. They’d tell you how many runs, or outs, or fly catches; later, how many hits or total bases. Counting is the most basic of all statistical processes; the next step up is averaging, and Chadwick was the first to put this into practice. […]

As professionalism infiltrated the game, teams began to bid for star-caliber players. Stars were known not by their stats but by their style: Every boy would emulate the flair of a George Wright at shortstop, the whip motion of a Jim Creighton pitching, the nonchalance of a John Chapman making the over-the-shoulder one-handed catches in the outfield (this in the days before the glove!). But Chadwick recognized the need for more individual accountability, the need to form objective credentials for those perceived as stars (or, in the parlance of the period, “aces”). The creation of popular heroes is a product of the post-Civil War period, with a few notable exceptions (Creighton, Joe Start, Dickey Pearce, J.B. Leggett).

So in 1865, in the Clipper, Chadwick began to record a form of batting average taken from the cricket pages—Runs Per Game. Two years later, in his newly founded baseball weekly, The Ball Players’ Chronicle, Chadwick began to record total bases, total bases per game, and hits per game. The averages were expressed not with decimal places but in the standard cricket format of the “average and over.” Thus a batter with 23 hits in six games would have an average expressed not as 3.83 but as “3-5”—an average of 3 with an overage, or remainder, of 5. Another innovation was to remove from the individual accounting all bases gained through errors. Runs scored by a team, beginning in 1867, were divided between those scored after a man reached base on a clean hit and those arising from a runner’s having reached base on an error.
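
For readers who want the arithmetic spelled out, the “average and over” notation amounts to integer division with a remainder. The short Python sketch below is only an illustration; the function name average_and_over is my own invention, not anything from Chadwick or The Hidden Game, and it assumes the totals are whole numbers.

```python
def average_and_over(total: int, games: int) -> str:
    """Format a per-game figure in the cricket-style "average and over" form.

    Hypothetical helper for illustration: the whole-number quotient is the
    "average" and the remainder is the "over".
    """
    if games <= 0:
        raise ValueError("games must be a positive whole number")
    average, over = divmod(total, games)
    return f"{average}-{over}"


# The example from the text: 23 hits in six games.
print(average_and_over(23, 6))   # 3-5
# The same figure as a modern decimal average.
print(round(23 / 6, 2))          # 3.83
```

Note that the “over” only has meaning alongside the number of games: 3-5 over six games is a higher average than 3-5 over ten.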

In 1868, despite Chadwick’s derision, the Clipper continued to award the prize for the batting championship to the player with the greatest average of Runs Per Game. Actually, the old yardstick had been less preposterous a measure of batmanship than one might imagine today, because team defenses were so much poorer and the pitcher, with severe restrictions on his method of delivery, was so much less important. If you reached first base, whether by a hit or by an error, your chances of scoring were excellent; indeed, teams of the mid-1860s registered more runs than hits! By the 1876 season, the first of National League play, the caliber of both pitching and defense had improved to the extent that the ratio of runs to hits was about 6.5 to 10; today [i.e., 1984] the ratio stands at roughly 4 to 10. […]

The Outs Per Game figure was tainted as a measure of batting skill because it may reflect as easily a strikeout or a double unsuccessfully stretched into a triple. Or, in a ridiculous but true example, a man might get on base with a single, then be forced out on second base on a ground ball. The runner who was forced out is debited with the out; not only does the man who hit the grounder fail to register a notch in the out column—if he comes around to score he’ll get a counter in the run column.

In the late 1860s Chadwick was recording total bases and home runs, but he placed little stock in either, as conscious attempts at slugging violated his cricket-bred image of “form.” Just as cricket aficionados watch the game for the many opportunities for fine fielding it affords, so was baseball from its inception perceived as a fielders’ sport. The original Cartwright rules of 1845, in fact, specified that a ball hit out of the field—in fair territory or foul—was a foul ball! “Long hits are showy,” Chadwick wrote in the Clipper in 1868, “but they do not pay in the long run. Sharp grounders insuring the first-base certain, and sometimes the second-base easily, are worth all the hits made for home-runs which players strive for.”

More than a century later, after dozens of new statistics had been created to apportion individual accomplishments, sabermetricians came around full circle to the view that outs and runs were what mattered in the end. I will continue this stats history tomorrow.


As usual, John’s fascinating history leaves me wowed.

I was lucky to find a copy of Hidden Game on eBay for relatively cheap and in good condition (it had once been part of a small public library’s collection). It is still worth reading 28 years later.
