orgtheory.net

letters of recommendation: still garbage

Long-time readers know that I am a skeptic when it comes to letters of recommendation. The last time I wrote about the topic, I relied on a well-cited 1993 article by Aamodt, Bryan and Whitcomb in Public Personnel Management that reviews the literature and shows that LoRs have very little validity. That is, they are poor predictors of future job performance. But what if the literature has changed in the meantime? Maybe these earlier studies were flawed, or based on limited samples, or better research methods now provide more compelling answers. So I went back and read some more recent research on the validity of LoRs. The answer? With a few exceptions, still garbage.

For example, the journal Academic Medicine published a 2014 article that analyzed LoRs for three cohorts of students at a medical school. From the abstract:

Results: Four hundred thirty-seven LORs were included. Of 76 LOR characteristics, 7 were associated with graduation status (P ≤ .05), and 3 remained significant in the regression model. Being rated as “the best” among peers and having an employer or supervisor as the LOR author were associated with induction into AOA, whereas having nonpositive comments was associated with bottom of the class students.

Conclusions: LORs have limited value to admission committees, as very few LOR characteristics predict how students perform during medical school.

Translation: Almost all information in letters is useless, except the occasional negative comment (which academics strive to avoid). The other exception is explicit comparison with other candidates, which is not a standard feature of many (or most?) letters in academia.

Ok, maybe this finding is limited to med students. What about other contexts? Once again, LoRs do poorly unless you torture specific data out of them. From a 2014 meta-analysis of LoR research in education in the International Journal of Selection and Assessment:

… Second, letters of recommendation are not very reliable. Research suggests that the interrater reliability of letters of recommendation is only about .40 (Baxter, et al., 1981; Mosel & Goheen, 1952, 1959; Rim, 1976). Aamodt, Bryan & Whitcomb (1993) summarized this issue pointedly when they noted, ‘The reliability problem is so severe that Baxter et al. (1981) found that there is more agreement between two recommendations written by the same person for two different applicants than there is between two people writing recommendations for the same person’ (Aamodt et al., 1993, p. 82). Third, letter readers tend to favor letters written by people they know (Nicklin & Roch, 2009), despite any evidence that this leads to superior judgments.

Despite this troubling evidence, the letter of recommendation is not only frequently used; it is consistently evaluated as being nearly as important as test scores and prior grades (Bonifazi, Crespy, & Reiker, 1997; Hines, 1986). There is a clear and gross imbalance between the importance placed on letters and the research that has actually documented their efficacy. The scope of this problem is considerable when we consider that there is a very large literature, including a number of reviews and meta-analyses on standardized tests and no such research on letters. Put another way, if letters were a new psychological test they would not come close to meeting minimum professional criteria (i.e., Standards) for use in decision making (AERA, APA, & NCME, 1999). This study is a step toward addressing this need by evaluating what is known, identifying key gaps, and providing recommendations for use and research. [Note: bolded by me.]

As with other studies, there is a small amount of information in LoRs. The authors note that “… letters do appear to provide incremental information about degree attainment, a difficult and heavily motivationally determined outcome.” That’s something, I guess, for a tool that would fail standard tests of validity.
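To see why reliability this low matters, here is a minimal simulation (my own sketch, not drawn from any study cited above): two letter writers are modeled as noisy readings of the same underlying applicant quality, calibrated so their scores agree at about r = .40, the interrater reliability the meta-analysis reports. Classical measurement theory then caps any single letter's predictive validity at the square root of that reliability.

```python
# Sketch: why interrater reliability of ~.40 limits what a letter can predict.
# All numbers are illustrative, not estimates from the cited studies.
import math
import random

def pearson(xs, ys):
    """Pearson correlation, computed from scratch for transparency."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(0)
n = 100_000
reliability = 0.40  # the interrater reliability figure from the literature

quality = [random.gauss(0, 1) for _ in range(n)]
# Each rater's score = shared signal about the applicant + rater-specific noise,
# weighted so that corr(rater_a, rater_b) comes out to ~reliability.
rater_a = [math.sqrt(reliability) * q + math.sqrt(1 - reliability) * random.gauss(0, 1)
           for q in quality]
rater_b = [math.sqrt(reliability) * q + math.sqrt(1 - reliability) * random.gauss(0, 1)
           for q in quality]

r_ab = pearson(rater_a, rater_b)   # ~.40: two letters about the same person barely agree
# Even if true quality perfectly predicted the outcome, one rater's score can
# correlate with quality at most sqrt(reliability) ~ .63 (classical attenuation).
validity_cap = math.sqrt(reliability)
```

The point of the sketch: the disagreement between writers is itself a ceiling. Whatever the letters are measuring, no single letter can predict an outcome better than about .63 under these assumptions, before any of the other problems in the literature kick in.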


Written by fabiorojas

October 29, 2014 at 12:01 am

7 Responses


  1. Very interesting post. Just wondering: Do these studies look at LoRs for students from the entire applicant pool or only those who are admitted? If they don’t look at the entire pool of applicants (i.e., both admitted and non-admitted students), isn’t there a major selection-on-the-dependent-variable problem here? If we only look at people who have already met a high enough standard to be admitted to medical school, then sure, I wouldn’t expect LoRs to be a major predictor of variation within that already highly selected group. But if we looked at the entire pool, perhaps we would find that these letters do actually have some value for admission committees.


    A Loyal Reader

    October 29, 2014 at 1:16 am
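The selection concern raised in the comment above can be illustrated with a quick simulation (hypothetical numbers, my own, not from any cited study): suppose letter scores correlate about .40 with a later outcome across the full applicant pool, but we only ever observe the top fifth of applicants, selected on the letter score itself. The correlation measured inside that restricted group comes out roughly half as large.

```python
# Sketch of range restriction: a predictor that works in the full applicant
# pool looks much weaker when measured only among admitted students.
# Parameter values are illustrative.
import math
import random

def pearson(xs, ys):
    """Pearson correlation, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(1)
n = 100_000
ability = [random.gauss(0, 1) for _ in range(n)]
# Letter score: partly reflects ability, partly rater noise (unit variance).
letter = [0.5 * a + math.sqrt(0.75) * random.gauss(0, 1) for a in ability]
# Later outcome (say, a graduation index): driven by ability plus luck.
outcome = [0.8 * a + 0.6 * random.gauss(0, 1) for a in ability]

r_full = pearson(letter, outcome)  # ~.40 across ALL applicants

# Now "admit" only the top ~20% on the letter score and re-measure.
cutoff = sorted(letter)[int(0.8 * n)]
admits = [(l, o) for l, o in zip(letter, outcome) if l >= cutoff]
r_admits = pearson([l for l, _ in admits], [o for _, o in admits])  # ~.20
```

So studies that only observe admitted students can understate a letter's value for the full pool, which is exactly the commenter's worry; it does not rescue letters, but it does mean the near-zero correlations should be read carefully.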

  2. There’s another huge selection issue at play that this post also leaves unacknowledged – getting someone to agree to write a LOR for you in the first place. The assumption seems to be that the only potential value a letter can provide is via its substance. I think this is a pretty faulty assumption to start the question/analysis/critique from. Writers could have said no (and many frequently do), which is an important aspect of LORs. Some might suggest this argues for a move toward more form-letter formats. I’m not sure about that, as it may lower the bar enough to decrease the cost of (and thus the signal sent by) agreeing to write them.


    jimi adams

    October 29, 2014 at 1:44 am

  3. @Loyal: It depends on the study. In some cases, it only makes sense to look at admitted people. If I am predicting how you will do at job X, and X only makes sense at my company, then you are stuck with admitted folks. In a few other studies, they do compare all applicants. An unpublished study of econ grad school applicants by Levitt et al. tracked down graduation outcomes for all people who applied to grad schools. This is also done in studies that look at high school students and eventual graduation. But you are correct, it is a problem with the lit.

    @jimi: I do agree that getting the letters is probably a non-trivial signal. In one of the few studies to find positive effects (Levitt’s study), merely the status of the recommender matters. Also, in med school contexts (see original post) the type of recommender matters as well.


    fabiorojas

    October 29, 2014 at 8:37 pm

  4. Probably useful to give some thought to how recommendation letters are actually used in practice and also what types of outcomes interest us. I think that in most cases the letters don’t carry a whole bunch of weight when making decisions about admission. For the majority of applicants, their records really speak for themselves (e.g., some look really strong across all indicators whereas others clearly do not). I suspect that we tend to give more weight to the letters in those borderline cases when applicants are on the bubble. We may look for clues within the letter that help us to determine which candidates are worth betting on and which are not. Not sure how that plays out when trying to determine whether particular attributes of a letter predict graduation or some other outcome. Unfortunately, it doesn’t seem as if too much thought and effort has gone into many of the letters I have seen, but I have seen some gems that really helped with the selection process and also accurately anticipated strong student performance.


    Rory McVeigh

    October 30, 2014 at 3:46 pm

  5. @rory: That sounds about right and it reflects my own practice. Letters don’t carry much weight except in ambiguous cases and then, only when the letter has concrete information. There is one top soc program that has taken two of my recent undergrad students, I think, because I did an explicit ranking, which the literature says is a good signal.


    fabiorojas

    October 30, 2014 at 4:38 pm

  6. Most letters are hardly worth reading. “Good” letters arise from the conjunction of a faculty member actually knowing a student, which occurs primarily in smaller schools, and that faculty member either being recognized by the readers or knowing enough about the elite system of graduate schools to know how to qualify themselves as a writer. Students at large universities are systematically disadvantaged in getting letters; it is hard to get to know an actual tenure-track professor in our institution and in most other large schools that rely on TAs and adjuncts for the hands-on teaching in smaller sections. Students at lower-prestige institutions are systematically disadvantaged in being less likely to have access to faculty who either have recognized names in the field or know how to qualify themselves (by explaining that they do indeed have the right cultural capital). Only a tiny fraction of all LORs produced in grad school apps have even a chance of containing enough information to be useful. Add to it that the knowledgeable letter writers have a stake in placing their students, and the information quality goes down.

    There can easily be a large pool of quite talented people who simply never come to the attention of faculty in a large university.

    The kinds of letters that can actually be useful include the carefully restrained letter for a person with high test scores who seems very smart, often a sign of trouble. (I can think of a variety of cases in which we had coded warnings from the letters, if we’d paid attention.) And the thoughtful and detailed letter from a knowledgeable writer who is going to bat for someone from a disadvantaged background who had a rocky start, explaining why they are really a good bet.

    But I don’t see how even the useful exceptions could make the correlations move very far from zero.


    olderwoman

    October 30, 2014 at 11:03 pm

  7. I’m dubious about the notion that letters offering explicit comparisons are useful; they often tell you more about how savvy the recommender is than the candidate’s qualities. A good letter (and they are, as olderwoman notes, not readily available to all strong candidates) offers information about how to read a file. A good letter draws attention to facts that may be hard to find in the rest of the application (e.g., an undergraduate who works too many hours for money or changed majors early on), suggesting what the selection committee should consider carefully.


    David S. Meyer

    October 31, 2014 at 3:01 pm

