The Baseball Analyst Issue 1: a review

A scanned copy of the newsletter is available from SABR’s website. The following comments are more a description than a proper review.

Bill James published 40 quarterly issues of a newsletter called The Baseball Analyst, beginning in June of 1982. His idea was to “provide a place where people who have research they want to do can find a place to print it.” The first issue contained five articles and was apparently edited by James:

  • Ballpark Effects on the Production of Infield Errors and Double Plays, by Paul Schwartzenbart.
  • The Distribution of Runs Scored, by Dallas Adams.
  • Nolan Ryan’s Fifth No-Hitter, by Tom Jones.
  • Wins and Losses for All Players, by Mark Pankin.
  • Home Runs — a matter of attitude, by Bob Kingsley.

The Schwartzenbart piece on errors and double plays provides statistics, extracted from hundreds of box scores, summarizing error rates and double plays for National League ballparks from 1972 to 1980 (the exact coverage is less clear than it seems at first read, but that’s probably insignificant). The author’s original interest was the impact of artificial turf, but his data are more generally useful. They show strong AstroTurf effects on infield errors, weaker turf effects on double plays, and no strong playing-surface effects on outfielder errors.
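
For readers who want to try this kind of tabulation themselves, here is a minimal sketch in Python. It is not Schwartzenbart's method; it only shows the sort of per-game rate comparison the article describes. The input layout (surface, games, infield errors, double plays) and the function name `rates_by_surface` are my own assumptions, and the sample numbers are invented purely to exercise the code.

```python
from collections import defaultdict


def rates_by_surface(park_totals):
    """Aggregate per-game infield error and double play rates by surface.

    park_totals: iterable of (surface, games, infield_errors, double_plays)
    tuples, one per ballpark-season. This layout is hypothetical.
    """
    sums = defaultdict(lambda: [0, 0, 0])  # games, errors, double plays
    for surface, games, errors, dps in park_totals:
        totals = sums[surface]
        totals[0] += games
        totals[1] += errors
        totals[2] += dps
    return {
        surface: {"errors_per_game": e / g, "dp_per_game": d / g}
        for surface, (g, e, d) in sums.items()
        if g > 0  # skip surfaces with no games to avoid dividing by zero
    }


# Invented totals, not real data and not meant to suggest any real effect.
sample = [
    ("turf", 81, 100, 150),
    ("turf", 81, 62, 174),
    ("grass", 80, 100, 140),
]
rates = rates_by_surface(sample)
print(rates)
```

Summing the raw counts before dividing weights each ballpark by the number of games played there, which seems the safer choice when seasons differ in length.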

Dallas Adams provides a table and a graph showing the distribution of runs scored per game, broken down by team offense. For instance, a team that averaged about 3.5 runs per game would score exactly 4 runs in 12.28% of its games, while a team that averaged 4.50 runs per game would score exactly 4 runs in 13.11% of its games. Adams goes on to explore some of the implications of his research, and to suggest ways the work might be applied to specific questions.
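
A table like this can be approximated empirically from game-by-game run totals. The sketch below is my own illustration, not Adams's procedure: it pools teams whose scoring averages round to the same half-run bucket, then reports how often each run total occurred. The function names, the `bucket_width` parameter, and the tiny sample are all hypothetical.

```python
from collections import Counter


def run_distribution(runs_per_game):
    """Return {runs: fraction of games} for a list of single-game run totals."""
    n = len(runs_per_game)
    if n == 0:
        return {}
    counts = Counter(runs_per_game)
    return {r: counts[r] / n for r in sorted(counts)}


def pooled_distribution(teams, bucket_width=0.5):
    """Pool teams by average runs per game, rounded to the nearest bucket.

    teams: dict mapping a team label to its list of single-game run totals.
    Returns {bucket_average: {runs: fraction of games}}.
    """
    pooled = {}
    for runs in teams.values():
        if not runs:
            continue  # a team with no games has no average
        avg = sum(runs) / len(runs)
        bucket = round(avg / bucket_width) * bucket_width
        pooled.setdefault(bucket, []).extend(runs)
    return {b: run_distribution(r) for b, r in sorted(pooled.items())}


# Invented game logs, far too small to mean anything.
teams = {
    "A": [4, 2, 4, 6],
    "B": [3, 5, 4, 4],
    "C": [1, 2, 3, 2],
}
dist = pooled_distribution(teams)
print(dist[4.0][4])  # share of games in which a 4.0 runs-per-game team scored 4
```

With real data, each bucket would need many team-seasons before the percentages settled down to anything like two-decimal precision.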

The Jones piece on Ryan’s no-hitter is likely of interest only to a limited number of readers. It is, all the same, nicely done.

Pankin’s Wins & Losses piece is odd and convoluted, and requires access to play-by-play data. It’s an effort to apportion wins and losses among all of a team’s players. The method is fairly arbitrary, though. I’m curious whether anyone has applied it to Retrosheet data, and whether the results make sense. Perhaps I’ll try that….

The Kingsley Home Runs piece attempts to explain why ballpark characteristics and weather don’t seem to fully account for home run rates in specific ballparks. While the essay is well reasoned and interesting, and I understand the author’s calculations, I have serious doubts about the assumptions behind the resulting numbers. An intriguing effort, nonetheless.

Summary: It’s worth noting that the first two pieces, which were pioneering research projects in their day, could now be easily reproduced from Retrosheet data. The Pankin piece needed something like Retrosheet to become practicable; it also foreshadows James’s Win Shares project and other efforts to better apportion responsibility for what happens on the field. The other pieces are interesting, and were worth sharing.