picking the right metric: from college ratings to the cold war

Two years ago, President Obama announced a plan to create government ratings for colleges—in his words, “on who’s offering the best value so students and taxpayers get a bigger bang for their buck.”

The Department of Education was charged with developing such ratings, but they were quickly mired in controversy. What outcomes should be measured? Initial reports suggested that completion rates and graduates’ earnings would be key. But critics pointed to a variety of problems—ranging from the different missions of different types of colleges, to the difficulties of measuring incomes along a variety of career paths (how do you count the person pursuing a PhD five years after graduation?), to the reductionism of valuing college only by graduates’ incomes.

Well, as of yesterday, it looks like the ratings plan is being dropped. Or rather, it’s become a “college-rating system minus the ratings,” as the Chronicle put it. The new plan is to produce a “consumer-facing tool” where students can compare colleges on a variety of criteria, which will likely include net price, completion rates, earnings outcomes, and the percentage of Pell Grant recipients, among other metrics. In other words, it will look more like U-Multirank, a big European initiative that was similarly a response to the political difficulty of producing a single official ranking of universities.

A lot of political forces aligned to kill this plan, including Republicans (on grounds of federal mission creep), the for-profit college lobby, and most colleges and universities, which don’t want to see more centralized control.

But I’d like to point to another difficulty it struggled with—one that has been around for a really long time, and that shows up in a lot of different contexts: the criterion problem.

While this has been going on in the news, I’ve been buried in the 1950s in Santa Monica—at the RAND Corporation (or at least in its documents). The RAND Corporation played a key role in bringing economics to public policy in the 1960s, as the source of the McNamara “whiz kids” who would bring rational, quantifiable social science to first the Defense Department and eventually the whole executive branch. It’s a fascinating story, but not generally the most bloggable.

Except that in the 1950s, as these ideas were being worked out at RAND, they were struggling with basically the exact same problem the Education Department has been having this week. Plus ça change… They didn’t solve it either.


There are a lot of books about RAND.

They called it the “criterion problem”. In the 1950s, RAND was conducting what they called systems analysis. During World War II, operations research had become really good at answering questions about how to optimize decisions at the human-technology interface. If you were going to drop mines in Japanese waters and wanted to minimize your losses along the way, at what times should your pilots fly? At what altitude? In what formation? Data on past expeditions could be used in conjunction with rapidly developing mathematical methods to identify the best solution to such problems. Methods like these, advanced especially in Britain and the U.S., contributed significantly to winning the war.

But by the 1950s, RAND analysts were trying to answer related, but broader, questions of national defense for the Air Force. What, for example, was the most efficient way for the U.S. to deliver nuclear weapons to Soviet territory? RAND used its cutting-edge mathematical techniques to identify the “best” answer: Don’t use fancy new jet bombers. Instead, buy a large number of cheap, slow, turboprop planes, and bomb the heck out of Soviet targets.

Well, this recommendation went down like a ton of bricks. RAND’s recommendation optimized based on a specific criterion: maximizing damage inflicted per dollar spent. But despite the quantity of its calculations and sophistication of its analysis, RAND had overlooked some really big things. One, it ignored the value of pilots’ lives—which did not go over so well with Air Force brass, most of them former pilots themselves. Two, it went against a deep organizational imperative of the Air Force: to develop exciting new planes.

RAND’s economists were really cool.

The failure of this massive project, the result of many man-hours (sic) of work, caused major handwringing at RAND. Much of it was centered around the criterion problem. What should they be optimizing for? What if the best thing to optimize for was hard, or impossible, to measure? What if conflicting goals seemed equally valuable? Should they “sub-optimize” (solve narrower problems, where the criterion problem was less acute) and let the big picture take care of itself? What was quantifiable, and what should they do with important, but unquantifiable, factors?

Some of the best minds of mid-century social science—Jack Hirshleifer, Armen Alchian, Charles Hitch, Charles Lindblom—published internal papers contributing to this debate. A number (including Hirshleifer, Alchian, and Lindblom) were quite skeptical that the problem was solvable, and thought that systems analysts were, more or less, barking up the wrong tree. But theirs was not the position that won out. Though RAND’s analysts were slightly chastened by this first big failure—Hitch pointed out that “calculating quantitative solutions using the wrong criteria is equivalent to answering the wrong questions” and “may prove worse than useless”—they soldiered on, believing that despite the influence of politics, organizational dynamics, and human psychology, better criteria could improve the decision-making process.

And, for the most part, they succeeded, at least at convincing others, first in the Defense Department and then elsewhere in Washington, that this was the case. There is, in fact, a long but direct line from RAND’s 1950s struggle with the criterion problem to the Education Department’s struggle with how to measure successful college outcomes today. (It is no coincidence that one of RAND’s first non-defense studies was an application of systems analysis to education by an economist who would go on to play a key role in the War on Poverty.)

The problems today are basically the same. We haven’t really made much progress toward solving the criterion problem—because, as a fundamentally human, not technical, problem, it is not actually solvable. But our hope or faith that we can, in fact, answer it continues to take us down all sorts of interesting roads. A little more understanding of the past might usefully inform these efforts.

(Note: There are a zillion sources on RAND. The list of books in the picture is a good start; I’ll also point to Will Thomas’s new book, Rational Action: The Sciences of Policy in Britain and America, 1940-1960, as well as David Jardini’s Thinking through the Cold War, which, until it was recently self-published, was the best never-published dissertation I’d ever read.)

Written by epopp

June 26, 2015 at 1:48 pm

3 Responses

  1. This is excellent, thanks a lot.

    Even generating rankings from ordinal criteria is very hard to model, because path independence is violated: the overall ranking might depend on the order in which you apply the decision criteria.

    Assuming that you have something stronger than ordinal information just adds more confusion.

    I think we should focus not on optimal solutions, but rather on showing which rankings of outcomes “are coherent” with which orderings of criteria.


    michael webster

    June 26, 2015 at 2:56 pm

  2. Great analogy. In the context of higher ed ratings, a couple of things that really jump out are that net price cannot possibly be a single indicator (for example, high-tuition high-aid institutions have a high net price, but poor students will never pay that price) and that different students see vastly different tradeoffs between price and outcomes depending on their career interests and expected future pay. But even there it is not so clear–take the example of arts graduates, who on average attend expensive schools and earn low incomes but yet are satisfied with the education they received.

    Prospective students clearly do, of course, need better information to guide their postsecondary decision-making. But from talking to my own students, I don’t think the problem is that the information is insufficiently available–I think the problem is that first-generation students don’t even know they should be looking for it. If the money being spent building–and arguing about–this data system instead went to providing high-quality college advising for disadvantaged high school students, I suspect it would produce more bang for the buck.



    June 26, 2015 at 5:22 pm

  3. This is a really late reply, but thanks for the comments. @michael, there was a definite camp at RAND that took just the position you suggest. @Mikaila, I agree about students not knowing what to look for. Public Agenda did a good report based on focus groups with adults who were considering returning to college. The study was motivated partly by interest in trying to provide better tools for college selection, but what the groups found was that students think about colleges based on criteria that seem quite irrational from the perspective of the well-educated tool designers but make a lot of sense in the context of their lives.



    June 30, 2015 at 4:48 pm

