Archive for the ‘rankings and reputation’ Category
Two years ago, President Obama announced a plan to create government ratings for colleges—in his words, “on who’s offering the best value so students and taxpayers get a bigger bang for their buck.”
The Department of Education was charged with developing such ratings, but they were quickly mired in controversy. What outcomes should be measured? Initial reports suggested that completion rates and graduates’ earnings would be key. But critics pointed to a variety of problems—ranging from the different missions of different types of colleges, to the difficulties of measuring incomes along a variety of career paths (how do you count the person pursuing a PhD five years after graduation?), to the reductionism of valuing college only by graduates’ incomes.
Well, as of yesterday, it looks like the ratings plan is being dropped. Or rather, it’s become a “college-rating system minus the ratings”, as the Chronicle put it. The new plan is to produce a “consumer-facing tool” where students can compare colleges on a variety of criteria, which will likely include data on net price, completion rates, earning outcomes, and percent Pell Grant recipients, among other metrics. In other words, it will look more like U-Multirank, a big European initiative that was similarly a response to the political difficulty of producing a single official ranking of universities.
A lot of political forces aligned to kill this plan, including Republicans (on grounds of federal mission creep), the for-profit college lobby, and most colleges and universities, which don’t want to see more centralized control.
But I’d like to point to another difficulty it struggled with—one that has been around for a really long time, and that shows up in a lot of different contexts: the criterion problem.
When people look at PhD programs, they usually base their judgment on the fame of a program’s scholars or the placement of its graduates. Fair enough, but any seasoned social scientist will tell you that is a very imperfect way to judge an institution. Why? Performance is often related to resources. In other words, you should expect the wealthiest universities to hire away the best scholars and provide the best environment for training.
Thus, we have a null model for judging PhD programs (nothing correlates with success) and a reasonable baseline model (success correlates with money). According to the baseline, PhD program ranks should roughly follow measures of financial resources, like endowments. Thus, the top Ivy League schools should all have elite (top 5) programs in any field in which they choose to compete; anything less is severe underperformance. Similarly, for a research school with a modest endowment to have a top program (say Rutgers in philosophy) is wild overperformance.
According to this wiki on university endowments, the top ten wealthiest institutions are Harvard, Texas (whole system), Yale, Stanford, MIT, Texas A&M (whole system), Northwestern, Michigan, and Penn. This matches roughly with what you’d expect, except that Texas and Texas A&M are top flight in engineering and medicine but much weaker in arts and sciences (compared to their endowment rank). This is why I remain impressed with my colleagues at Indiana sociology. Our system-wide endowment is ranked #46 but our soc program hovers in that 10-15 range. We’re pulling our weight.
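To make the baseline model concrete, here is a minimal sketch of the comparison. It assumes the simplest possible reading of the baseline, namely that a program's expected rank equals its school's endowment rank. The function name and the use of the midpoint of the 10-15 range are my own simplifications, not an established method.

```python
def performance_gap(endowment_rank: float, program_rank: float) -> float:
    """Return how many places a program sits above its baseline expectation.

    Assumption: under the baseline model, the expected program rank is simply
    the school's endowment rank. A positive gap means overperformance, a
    negative gap means underperformance.
    """
    return endowment_rank - program_rank


# Indiana sociology, using the figures in the post: endowment rank 46,
# program rank in the 10-15 range (taking the midpoint is a simplification).
indiana_program_rank = (10 + 15) / 2
indiana_gap = performance_gap(46, indiana_program_rank)
print(indiana_gap)  # 33.5
```

A real test of the baseline would need actual program ranks and endowment figures for many schools, and would probably use endowment per student rather than raw rank.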
“there’s no rankings problem that money can’t solve” – the tale of how northeastern gamed the college rankings
There’s a September 2014 Boston.com article on Northeastern University and how it broke into the top 100 of the US News & World Report ranking of colleges and universities. The summary goes something like this: Northeastern’s former president, Richard Freeland, inherited a poorly endowed commuter school. In the modern environment, that leads you to a death spiral. A low profile leads to low enrollments, which leads to low income, which leads to an even lower profile.
The solution? Crack the code to the US News college rankings. He hired statisticians to learn the correlations between inputs and rankings. He visited the US News office to see how they built their system and bug them about what he thought was unfair. Then, he “legally” (i.e., he didn’t cheat or lie) did things to boost the rank. For example, he moved Northeastern from commuter to residential school by building more dorms. He also admitted a different profile of student that wouldn’t depress the mean SAT score and shifted students to programs that were not counted in the US News ranking (e.g., some students are admitted in Spring admissions and do not count in the US News score).
Comments:
1. In a way, this is admirable. If the audience for higher education buys into the rankings and you do what the rankings demand, aren’t you giving people what they want?
2. The quote in the title of the post is from Michael Bastedo, a higher ed guru at Michigan, who is pointing out that rankings essentially reflect money. If you buy fancier professors and better facilities, you get better students. The rank improves.
3. Still, this shows how hard it is to move. A nearly billion dollar drive moves you from a so-so rank of about 150 to a so-so rank of about 100-ish. Enough to be “above” the fold, but not enough to challenge the traditional leaders of higher ed.
I’m kind of obsessed with the REF, considering that it has zero direct impact on my life. It’s sort of like watching a train wreck in progress, and every time there’s big REF news I thank my lucky stars I’m in the U.S. and not the U.K.
For those who might not have been paying attention, the REF is the Research Excellence Framework, Britain’s homegrown academic homage to governmentality. Apologies in advance for any incorrect details here; to an outsider, the system’s a bit complex.
Every six years or so, U.K. universities have to submit samples of faculty members’ work – four “research outputs” per person – to a panel of disciplinary experts for evaluation. The panel ranks the outputs from 4* (world leading) to 1* (nationally recognized), although work can also be given no stars. Universities submit the work of most, but not all, of their faculty members; not being submitted to the REF is not, shall we say, a good sign for your career. “Impact” and “environment,” as well as research outputs, are also evaluated at the department level. Oh, and there’s £2 billion of research funding riding on the thing.
The whole system is arcane, and every academic I’ve talked to seems to hate it. Of course, it’s not meant to make academics happy, but to “provide…accountability for public investment in research and produce…evidence of the benefits of this investment.” Well, I don’t know that it’s doing that, but it’s certainly changing the culture of academia. I’d actually be very interested to hear a solid defense of the REF from someone who’s sympathetic to universities, so if you have one, by all means share.
Anyway, 2014 REF results were announced on Friday, to the usual hoopla. (If you’re curious but haven’t been following this, here are the results by field, including Sociology and Business and Management Studies; here are a few pieces of commentary.)
In the REF’s current form, outputs are “reviewed” by a panel of scholars in one’s discipline. This was strongly fought for by academics on the grounds that only expert review could be a legitimate way to evaluate research. This peer review, however, has become something of a farce, as panelists are expected to “review” massive quantities of research. (I can’t now find the figure, but I think it’s on the order of 1000 articles per person.)
At the same time, the peer-review element of the process (along with the complex case-study measurement of “impact”) has helped to create an increasingly elaborate, expensive, and energy-consuming infrastructure within universities around the management of the REF process. For example, universities conduct their own large-scale internal review of outputs to try to guess how REF panels will assess them, and to determine which faculty will be included in the REF submission.
All this has led to a renewed conversation about using metrics to distribute the money instead. The LSE Impact of Social Sciences blog has been particularly articulate on this front. The general argument is, “Sure, metrics aren’t great, but neither is the current system, and metrics are a lot simpler and cheaper.”
If I had to place money on it, I would bet that this metrics approach, despite all its limitations, will actually win out in the not-too-distant future. Which is awful, but no more awful than the current version of the REF. Of course metrics can be valuable tools. But as folks who know a thing or two about metrics have pointed out, they’re useful for “facilitating deliberation,” not “substitut[ing] for judgment.” It seems unlikely that any conceivable version of the REF would use metrics as anything other than a substitute for judgment.
In the U.S., this kind of extreme disciplining of the research process does not appear to be just around the corner, although Australia has partially copied the British model. But it is worth paying attention to nonetheless. The current British system took nearly thirty years to evolve into its present shape. One is reminded of the old story about the frog placed in the pot of cool water who, not noticing until too late that it was heating up, inadvertently found himself boiled.
Melissa Wooten is an Assistant Professor of Sociology at the University of Massachusetts, Amherst. Her forthcoming book In the Face of Inequality: How Black Colleges Adapt (SUNY Press 2015) documents how the social structure of race and racism affects an organization’s ability to acquire the financial and political resources it needs to survive.
“Look…Come on…It’s $10 million dollars” is how the Saturday Night Live parody explains the Los Angeles chapter of the National Association for the Advancement of Colored People’s (NAACP) decision to accept donations from now disgraced, soon-to-be former, NBA franchise owner, Donald Sterling. This parody encapsulates the dilemma that many organizations working for black advancement face. Fighting for civil rights takes money. But this money often comes from strange quarters. While Sterling’s personal animus toward African Americans captivated the public this spring, his organizational strategy of discriminating against African Americans and Hispanic Americans had already made him infamous among those involved in civil rights years earlier. So why would the NAACP accept money from a man known to actively discriminate against the very people it seeks to help?
A similar question arose when news of the Koch brothers’ $25 million donation to the United Negro College Fund (UNCF) emerged in June. Not only did the UNCF’s willingness to accept this donation raise eyebrows, it also cost the organization the support of AFSCME, a union with which the UNCF had a long-standing relationship. The Koch brothers’ support of policies that would limit early voting, along with their opposition to minimum wage legislation, are but a few of the reasons that have made some skeptical of a UNCF-Koch partnership. So why would the UNCF accept a large donation from two philanthropists known to support policies that would have a disproportionately negative effect on African American communities?
Usually when someone starts throwing citation impact data at me, my eyelids get heavy and I want to crawl into a corner for a nap. Like Teppo wrote a couple of years ago, “A focus on impact factors and related metrics can quickly lead to tiresome discussions about which journal is best, is that one better than this, what are the “A” journals, etc. Boring.” I couldn’t agree more. Unfortunately, I’ve heard a lot about impact factors lately. The general weight of impact factors as a metric for assessing intellectual significance has seemed to skyrocket since the time I began training as a sociologist. Although my school is not one of them, I’ve heard of academic institutions using citation impact as a way to incentivize scholars to publish in certain journals and as a measure to assess quality in hiring and tenure cases. And yet it has never struck me as a very interesting or useful measure of scholarly worth. I can see the case for why it should be. Discussions about scholarly merit are inherently biased by people’s previous experiences, status, in-group solidarity, personal tastes, etc. It would be nice to have an objective indicator of a scholar’s or a journal’s intellectual significance, and impact factors pretend to be that. From a network perspective it makes sense. The more people who cite you, the more important your ideas should be.
My problem with impact factors is that I don’t trust the measure. I’m skeptical for a few reasons: gaming efforts by editors and authors have made it less reliable, it lacks face validity, and it is unstable. Let me touch on the gaming issue first.
Yesterday afternoon I ended up reading this Vox story about an effort to rank US Universities and Colleges carried out in 1911 by a man named Kendric Charles Babcock. On Twitter, Robert Kelchen remarks that the report was "squashed by Taft" (an unpleasant fate), and he links to the report itself, which is terrific. Babcock divided schools into four Classes, beginning with Class I:
And descending all the way to Class IV:
Babcock’s discussion of his methods is admirably brief (the snippet above hints at the one sampling problem that possibly troubled him), so I recommend you read the report yourself.
University reputations are extremely sticky, the conventional wisdom goes. I was interested to see whether Babcock’s report bore that out. I grabbed the US News and World Report National University Rankings and National Liberal Arts College Rankings and made a quick pass through them, coding their 1911 Babcock Class. The question is whether Mr Babcock, should he return to us from the grave, would be satisfied with how his rankings had held up—more than a century of massive educational expansion and alleged disruption notwithstanding.
It turns out that he would be quite pleased with himself.
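For anyone who wants to try the same quick pass, here is a rough sketch of the coding exercise. It assumes each school has been hand-coded into a row of name, Babcock class, and current rank. The function name is hypothetical, the rows below are placeholders rather than real schools or real ranks, and summarizing by median is just one reasonable choice, not necessarily what was done for this post.

```python
from collections import defaultdict
from statistics import median


def median_rank_by_class(records):
    """Group current ranks by 1911 Babcock class and take the median of each.

    `records` is an iterable of (school, babcock_class, current_rank) tuples.
    If reputations are sticky, the median current rank should get worse
    (numerically larger) as the Babcock class number goes up.
    """
    ranks = defaultdict(list)
    for _school, babcock_class, current_rank in records:
        ranks[babcock_class].append(current_rank)
    return {cls: median(vals) for cls, vals in sorted(ranks.items())}


# Placeholder rows only: NOT real schools, classes, or US News ranks.
records = [
    ("School A", 1, 3),
    ("School B", 1, 10),
    ("School C", 2, 40),
    ("School D", 2, 60),
    ("School E", 4, 150),
]
summary = median_rank_by_class(records)
print(summary)
```

Schools founded after 1911, or left out of Babcock's report, would need their own category rather than being dropped silently.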