status bias in baseball umpiring

Jerry Kim and I have an op-ed in Sunday’s New York Times about our new paper on status bias in baseball umpiring. We analyzed over 700,000 non-swinging pitches from the 2008 and 2009 seasons and found that umpires made several types of mistakes in calling balls and strikes. In particular, we expected that umpires would be influenced by the status and reputation of the pitcher, and this is indeed what we found:

One of the sources of bias we identified was that umpires tended to favor All-Star pitchers. An umpire was about 16 percent more likely to erroneously call a pitch outside the zone a strike for a five-time All-Star than for a pitcher who had never appeared in an All-Star Game. An umpire was about 9 percent less likely to mistakenly call a real strike a ball for a five-time All-Star. The strike zone did actually seem to get bigger for All-Star pitchers and it tended to shrink for non-All-Stars.

An umpire’s bias toward All-Star pitchers was even stronger when the pitcher had a reputation for precise control, as measured by the career percentage of batters walked. We found that pitchers with a track record of not walking batters — like Greg Maddux — were much more likely to benefit from their All-Star status than similarly decorated but “wilder” pitchers like Randy Johnson.

Baseball insiders have long suspected what our research confirms: that umpires tend to make errors in ways that favor players who have established themselves at the top of the game’s status hierarchy. But our findings are also suggestive of the way that people in any sort of evaluative role — not just umpires — are unconsciously biased by simple “status characteristics.” Even constant monitoring and incentives can fail to train such biases out of us.

You can download the paper, which is forthcoming in Management Science, if you’re interested in learning more about the analyses and their implications for theories about status characteristics and the Matthew Effect.
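
For readers who want to see concretely what the two kinds of mistakes in the op-ed excerpt look like as quantities, here is a minimal sketch in Python. It is not the code or the model from the paper. The `Pitch` record, the function names, and the toy pitches are all hypothetical and chosen only for illustration. It assumes each non-swinging pitch is tagged with whether it actually crossed the strike zone and whether the umpire called it a strike.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Pitch:
    """One non-swinging pitch (hypothetical record layout)."""
    in_zone: bool        # True if the pitch actually crossed the strike zone
    called_strike: bool  # True if the umpire called it a strike


def false_strike_rate(pitches: Sequence[Pitch]) -> Optional[float]:
    """Share of pitches outside the zone that were called strikes."""
    outside = [p for p in pitches if not p.in_zone]
    if not outside:
        return None  # undefined when there are no out-of-zone pitches
    return sum(p.called_strike for p in outside) / len(outside)


def false_ball_rate(pitches: Sequence[Pitch]) -> Optional[float]:
    """Share of pitches inside the zone that were called balls."""
    inside = [p for p in pitches if p.in_zone]
    if not inside:
        return None  # undefined when there are no in-zone pitches
    return sum(not p.called_strike for p in inside) / len(inside)


# Toy, made-up pitches: 4 outside the zone (1 called a strike),
# 5 inside the zone (2 called balls).
toy_pitches = [
    Pitch(in_zone=False, called_strike=True),
    Pitch(in_zone=False, called_strike=False),
    Pitch(in_zone=False, called_strike=False),
    Pitch(in_zone=False, called_strike=False),
    Pitch(in_zone=True, called_strike=True),
    Pitch(in_zone=True, called_strike=True),
    Pitch(in_zone=True, called_strike=True),
    Pitch(in_zone=True, called_strike=False),
    Pitch(in_zone=True, called_strike=False),
]

print(false_strike_rate(toy_pitches))  # 1 of 4 out-of-zone pitches
print(false_ball_rate(toy_pitches))    # 2 of 5 in-zone pitches
```

Computing these two rates separately for groups of pitchers (say, by number of All-Star appearances) gives a rough descriptive version of the comparison in the excerpt. The paper's estimates come from regression models with controls, so raw rates like these would not reproduce them.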


Written by brayden king

March 29, 2014 at 10:17 pm

7 Responses


  1. Great work! Looking forward to reading it.



    March 29, 2014 at 10:45 pm

  2. Very interesting, thank you for posting.

    I’m not sure if it matters but I can’t shake wondering whether status characteristics of the umpire somehow moderate the effect? Some umpires enjoy the same kind of “all-star” status in certain circles as do all-star players. The outcomes reported are very focused on the characteristics of the pitcher with an assumption that umpires are basically the same. I wondered whether longer-serving or higher profile umpires (maybe based on the number of all-star games, or number of playoff games, they have officiated) differed significantly in terms of the effects you reported.



    March 31, 2014 at 2:31 pm

  3. Alex, the main results were based on models with umpire fixed effects, so we are comparing within a given umpire. At one point, we included umpire characteristics such as tenure in the league (which is highly correlated with playoff/All-Star appearances) and umpire-pitcher familiarity, but they didn’t do much, either as variables in their own right or as moderators of the pitcher status effect. We also ran umpire-specific regressions and didn’t find a lot of variation in the estimated coefficients. We think this is consistent with the view that we are looking at a cognitive bias (as opposed to a calculated/deliberate action), one that is therefore similar in magnitude across individuals.



    March 31, 2014 at 5:47 pm

  4. I was a (very enthusiastic) reviewer of this paper at one stage, and am unsurprised to see it receiving these accolades. Great use of great data.



    March 31, 2014 at 9:46 pm

  5. Thanks for your enthusiasm and support, anonymous!


    brayden king

    April 1, 2014 at 3:17 pm

  6. Cool stuff, except that Brian Mills already did this (in Managerial & Decision Economics).



    April 2, 2014 at 5:50 pm

  7. Thanks, anon, for highlighting Brian’s paper, which is really excellent. Our analysis differs from Brian’s in a number of ways, though, most notably in that we’re looking at the rate of errors in umpires’ calls, while Brian is looking at the rate of calling a strike. The difference in the dependent variable means that our paper is more focused on accuracy and the underlying biases that lead to inaccuracy.


    brayden king

    April 2, 2014 at 6:03 pm

