orgtheory.net

let’s get rid of the pseudo-r2

with 3 comments

Previous orgtheory posts on controversial statistical topics: interaction terms, Bayesian statistics, p-values, and survey response rates.

I’ve come to the conclusion that a common statistical practice – the reporting of an R-squared for a logit or other categorical data model – is lame and we should immediately stop. Why? Let’s count the ways:

  • Categorical data models estimate effects on an unobserved latent variable. You can’t compare the model’s predictions with this purely theoretical object, so every pseudo-R2 is a weird concoction.
  • There are at least seven different pseudo-R2 statistics, and they don’t always correlate well. Most people don’t know the difference between them. Even statisticians disagree on which one you should use.
  • Pseudo-R2 measures usually do not measure explained variance; they measure changes in likelihood and related quantities. For example, McFadden’s R2 measures the proportional change in the log-likelihood relative to an intercept-only model, which has no obvious interpretation. Unless the pseudo-R2 is either 0 or 1, the statistic is uninterpretable in relation to your data.
  • The pseudo-R2 is not an analog to the OLS R2 and it’s misleading to say so. But folks believe it is.
  • The pseudo-R2 for some models, like the Tobit, can actually take *negative* values, which confuses many readers.
  • Reporting a pseudo-R2 leads the reader to think that you are directly measuring goodness of fit. Wrong.
  • There is no decent way to decide what counts as good or bad fit. A pseudo-R2 of .02 tells you nothing about the match between predicted and observed quantities.
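
To make the "they don’t agree" point concrete, here is a rough sketch that computes five common pseudo-R2 measures for one and the same logit fit. Everything in it is my own illustration, not taken from the Hoetker article: the function name `pseudo_r2`, the made-up data (20 cases, one binary predictor), and the choice of which five measures to show. With a single binary predictor, the logit’s fitted probabilities are just the within-group proportions, so no optimizer is needed.

```python
import math

def pseudo_r2(y, p):
    """Several pseudo-R2 measures for binary outcomes y and fitted probabilities p.

    Assumes p comes from a maximum-likelihood fit of a model with an intercept.
    """
    n = len(y)
    ybar = sum(y) / n  # fitted probability of the intercept-only (null) model

    # Log-likelihood of the fitted model and of the null model
    ll_model = sum(math.log(pi if yi == 1 else 1 - pi) for yi, pi in zip(y, p))
    ll_null = sum(math.log(ybar if yi == 1 else 1 - ybar) for yi in y)

    cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n)
    return {
        # proportional reduction in log-likelihood
        "mcfadden": 1 - ll_model / ll_null,
        "cox_snell": cox_snell,
        # Cox-Snell rescaled so that its maximum is 1
        "nagelkerke": cox_snell / (1 - math.exp(2 * ll_null / n)),
        # squared-error analog of the OLS formula
        "efron": 1 - sum((yi - pi) ** 2 for yi, pi in zip(y, p))
                   / sum((yi - ybar) ** 2 for yi in y),
        # share of cases classified correctly at a 0.5 cutoff
        "count": sum((pi > 0.5) == (yi == 1) for yi, pi in zip(y, p)) / n,
    }

# Made-up data: x = 0 for the first 10 cases (3 successes),
# x = 1 for the last 10 cases (7 successes).
y = [1] * 3 + [0] * 7 + [1] * 7 + [0] * 3
# Fitted probabilities from a logit on x: the group proportions 0.3 and 0.7.
p = [0.3] * 10 + [0.7] * 10

results = pseudo_r2(y, p)
for name, value in results.items():
    print(f"{name:10s} {value:.3f}")
```

On this toy data the five numbers run from roughly .12 (McFadden) to .70 (count), all describing the same model and the same 20 cases. Which one is "the" R2?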

For the full rundown, check out this 2007 article by Illinois b-school’s Glenn Hoetker in the Strategic Management Journal. Thanks to my colleague Scott Long for bringing this article to my attention.

Written by fabiorojas

April 14, 2008 at 12:32 am

3 Responses


  1. A similar problem is the reporting of R2 with instrumental variable (IV) regressions. In this situation, R2 not only has no statistical meaning but also may not be bounded between 0 and 1. But time and again, you will find people providing R2 for IV estimation, probably thinking that it provides the same kind of information as R2 in OLS.

    lester

    April 14, 2008 at 1:39 am

  2. See also Ziliak and McCloskey (2004), “Size matters: The standard error of regressions in the American Economic Review,” The Journal of Socio-Economics 33, pp. 527–546.

    Isaac

    April 14, 2008 at 3:48 am

  3. It’s that kind of article that makes me dismayed. We can do a bit better when it comes to statistical practice.

    fabiorojas

    April 14, 2008 at 6:10 pm

