letters of recommendation: still garbage
Long time readers know that I am a skeptic when it comes to letters of recommendation. The last time I wrote about the topic, I relied on a well cited 1993 article by Aamodt, Bryan amd Whitcomb in Public Personnel Management that reviews the literature and shows that LoR’s have very little validity. I.e., they are poor predictors of future job performance. But what if the literature has changed in the meanwhile? Maybe these earlier studies were flawed, or based on limited samples, or better research methods provide more compelling answers. So I went back and read some more recent research on the validity of LoRs. The answer? With a few exceptions, still garbage.
For example, the journal Academic Medicine published a 2014 article that analyzed LoR for three cohorts of students at a medical school. From the abstract:
Results: Four hundred thirty-seven LORs were included. Of 76 LOR characteristics, 7 were associated with graduation status (P ≤ .05), and 3 remained significant in the regression model. Being rated as “the best” among peers and having an employer or supervisor as the LOR author were associated with induction into AOA, whereas having nonpositive comments was associated with bottom of the class students.
Conclusions: LORs have limited value to admission committees, as very few LOR characteristics predict how students perform during medical school.
Translation: Almost all information in letters is useless, except the occasional negative comment (which academics strive not to say). The other exception is explicit comparison with other candidates, which is not a standard feature of many (or most?) letters in academia.
Ok, maybe this finding is limited to med students. What about other contexts? Once again, LoRs do poorly unless you torture specific data out of them. From a 2014 meta-analysis of LoR recommendation research in education from the International Journal of Selection and Assessment:
… Second, letters of recommendation are not very reliable. Research suggests that the interrater reliability of letters of recommendation is only about .40 (Baxter, et al., 1981; Mosel & Goheen, 1952, 1959; Rim, 1976). Aamodt, Bryan & Whitcomb (1993) summarized this issue pointedly when they noted, ‘The reliability problem is so severe that Baxter et al. (1981) found that there is more agreement between two recommendations written by the same person for two different applicants than there is between two people writing recommendations for the same person’ (Aamodt et al., 1993, p. 82). Third, letter readers tend to favor letters written by people they know (Nicklin & Roch, 2009), despite any evidence that this leads to superior judgments.
Despite this troubling evidence, the letter of recommendation is not only frequently used; it is consistently evaluated as being nearly as important as test scores and prior grades (Bonifazi, Crespy, & Reiker, 1997; Hines, 1986). There is a clear and gross imbalance between the importance placed on letters and the research that has actually documented their efficacy. The scope of this problem is considerable when we consider that there is a very large literature, including a number of reviews and meta-analyses on standardized tests and no such research on letters. Put another way, if letters were a new psychological test they would not come close to meeting minimum professional criteria (i.e., Standards) for use in decision making (AERA, APA, & NCME, 1999). This study is a step toward addressing this need by evaluating what is known, identifying key gaps, and providing recommendations for use and research. [Note: bolded by me.]
As with other studies, there is a small amount of information in LoRs. The authors note that “… letters do appear to provide incremental information about degree attainment, a difficult and heavily motivationally determined outcome.” That’s something, I guess, for a tool that would fail standard tests of validity.