orgtheory.net

is “public intellectual” oxymoronic?

A guest post by Jerry Davis. He is the Wilbur K. Pierpont Collegiate Professor of Management at the Ross School of Business at the University of Michigan.

By this point everyone in the academy is familiar with the arguments of Nicholas Kristof and his many, many critics regarding the value of academics writing for the broader public.  This weekend provided a crypto-quasi-experiment that illustrated why aiming to do research that is accessible to the public may not be a great use of our time.  It also showed how the “open access” model can create bad incentives for social scientists to write articles that are the nutritional equivalent of Cheetos.

Balazs Kovacs and Amanda Sharkey have a really nice article in the March issue of ASQ called “The Paradox of Publicity: How Awards Can Negatively Affect the Evaluation of Quality.”  (You can read it here: http://asq.sagepub.com/content/59/1/1.abstract)  The paper starts with the intriguing observation that when books win awards, their sales go up but their evaluations go down on average.  One can think of lots of reasons why this should not be true, and several reasons why it should, all implying different mechanisms at work.  The authors do an extremely sophisticated and meticulous job of figuring out which mechanism was ultimately responsible.  (Matched sample of winning and non-winning books on the short list; difference-in-difference regression; model predicting reviewers’ ratings based on their prior reviews; several smart robustness checks; and transparency about the sample to enhance replicability.)  As is traditional at ASQ, the authors faced smart and skeptical reviewers who put them through the wringer, and a harsh and generally negative editor (me).  This is a really good paper, and you should read it immediately to find out whodunit.

The paper has gotten a fair bit of press, including write-ups in the New York Times and The Guardian (http://www.theguardian.com/books/2014/feb/21/literary-prizes-make-books-less-popular-booker).  And what one discovers in the comments section of these write-ups is that (1) there is no reading comprehension test to get on the Internet, and (2) everyone is a methodologist.  Wrote one Guardian reader:

The methodology of this research sounds really flawed. Are people who post on Goodreads representative of the general reading public and/or book market? Did they control for other factors when ‘pairing’ books of winners with non-winners? Did they take into account conditioning factors such as cultural bias (UK readers are surely different from US, and so on). How big was their sample? Unless they can answer these questions convincingly, I would say this article is based on fluff.

Actually, answers to some of these questions are in The Guardian’s write-up:  the authors had “compared 38,817 reader reviews on GoodReads.com of 32 pairs of books. One book in each pair had won an award, such as the Man Booker prize, or America’s National Book Award. The other had been shortlisted for the same prize in the same year, but had not gone on to win.”  And the authors DID answer these questions convincingly, through multiple rounds of rigorous review; that’s why it was published in ASQ.  The Guardian included a link to the original study, where the budding methodologist-wannabe could read through tables of difference-in-difference regressions, robustness checks, data appendices, and more.  But that would require two clicks of a functioning mouse, and an attention span greater than that of a 12-year-old.

Another says:

This is a non story based on very iffy research. Like is not compared with like. A positive review in the New York Times is compared with a less complimentary reader review on GoodReads…I’ll wait to fully read the actual research in case it’s been badly reported or incorrectly written up

Evidently this person could not even be troubled to read The Guardian’s brief story, much less the original article, and I’m a bit skeptical that she will “wait to fully read the actual research” (where her detailed knowledge of Heckman selection models might come in handy).  After this kind of response, one can understand why academics might prefer to write for colleagues with training and a background in the literature.

Now, on to the “experimental” condition of our crypto-quasi-experiment.  The Times reported another study this weekend, this one published in PLoS One (of course), which found that people who walked down a hallway while texting on their phones walked more slowly, in a more stilted fashion, with shorter steps, and deviated more from a straight line than those who were not texting (http://well.blogs.nytimes.com/2014/02/20/the-difficult-balancing-act-of-texting-while-walking/).  Shockingly, this study did not attract wannabe methodologists, but a flood of comments about how pedestrians who text are stupid and deserve what they get.  Evidently the meticulousness of the research shone through the Times write-up.

One lesson from this weekend is that when it comes to research, the public prefers Cheetos to a healthy salad.  A simple bite-sized chunk of topical knowledge goes down easy with the general public.  (Recent findings that are frequently downloaded on PLoS One: racist white people love guns; time spent on Facebook makes young adults unhappy; personality and sex influence the words people use; and a tiny cabal of banks controls the global economy.)

A second lesson is that there are great potential downsides to the field embracing open access journals like PLoS One, no matter how enthusiastic Fabio is.  Students enjoy seeing their professors cited in the news media, and deans like to see happy students and faculty who “translate their research.”  This favors the simple over the meticulous, the insta-publication over work that emerges from engagement with skeptical experts in the field (a.k.a. reviewers).  It will not be a good thing if the field starts gravitating toward media-friendly Cheeto-style work.


Written by fabiorojas

February 26, 2014 at 12:05 am

40 Responses


  1. When clicking on the Guardian link, one can actually see that reasonable, balanced comments that offer views on why this might be the case (awards –> lower ratings) get many more “recommend” votes than the posts that Jerry refers to. But that was not relevant in his very selective reading of the mediocrity of the public, I guess?


    Anonymous

    February 26, 2014 at 7:32 am

  2. Jerry: I agree with you that Kristof’s article was insipid. And I agree with you that one of the challenges in writing social science for the public is that the public often seems to go for sizzle rather than steak. (Though are you sure that if you looked through those comments, you wouldn’t also see people who reacted to the PLOS One article with “Duh!”?) You will not be surprised, however, that I don’t agree that the sizzle vs. steak contrast is necessarily one that boils down to open-access vs. not.

    On the one hand, Science and Nature are not open-access but they are all about sizzle, often at the expense of steak. Great examples are the Barabasi papers that bedazzle us with massive amounts of cellphone data and very cool graphs to tell us the big news that modern Americans do not move around through the day in random fashion, but tend to go from… home to work/school and back again.

    On the other hand, Sociological Science went to press last week. What do you think of the first four articles? Sizzle at the expense of steak, or as good a level of quality as you’ll find anywhere? At the very least, perhaps let’s reserve judgment (the next articles in the queue are quite strong too, in my opinion) and see whether there is more than one model out there for making steak?


    ezrazuckerman

    February 26, 2014 at 11:11 am

  3. Ahh, the putrid masses. So easy to criticize. They should just shut up and go flip burgers while the smart people talk and tell them what is good…


    Hector

    February 26, 2014 at 11:29 am

  4. Jerry – I don’t really get your argument here. Because some people say dumb things in the comment section of the internet, then public scholarship is a waste of time? Why do these cherry-picked comments stand in for public reception? Doing more to make our work accessible is not necessarily about making our professional writing more accessible (think climate science models). It’s about doing the extra work of helping others make sense of that professional work. I write for both a public and a professional audience. These are different kinds of tasks… but they have a common element: a recognition of your audience. I think you overestimate the meticulous nature of peer review. And I’m with Hector; if we write in ways that have such smug condescension to “the public,” it’s hardly a surprise that they won’t be very interested in what we have to say.


    shakha

    February 26, 2014 at 12:46 pm

  5. To paraphrase the story of the paper: leaving research to the tastes of a small circle translates into making books available only to the prize committees.


    thewisdomofcrowds

    February 26, 2014 at 12:49 pm

  6. @Anonymous: Of the 27 comments posted on the Guardian as of this morning, 7 got more “up” votes than the “Iffy research” comment, and several of these raised further criticisms that had already been addressed in the published research. One said “Yes, of course. More people reading leads to a greater diversity of opinions, and some of the most disaffected complain on the internet. I hope the journal article contains something a little more surprising than this.” One said “I suspect it’s also because some people read the book with half an eye to finding fault just because someone else (e.g.: an awards panel) said it’s good.” One said “There’s no evidence offered as to how representative the complainants are of the expanded readership generated by the winning of a major prize, or how useful GoodReads is as a sampling frame for measuring overall popularity.” And a couple made smart points about the nature of literary prizes and the publishing industry, including my favorite, which opens “The authors’ reasoning seems sound; and it’s nice to have some hard evidence, for once…”

    @Ezra: Agreed about Science and Nature. Will reserve judgment on Soc Sci, but I am optimistic!

    @Hector: The point of the post is not about “the masses” (who should not be expected to be versed in quasi-experimental design or fixed effects models, any more than I should be able to explain how a microwave oven works). It is that bad things can happen if research is done to be “accessible.” If medical research were done so that it could be reported in People, we’d get a lot more “6 new dieting tricks for beach weather” and perhaps less “The Role of Clusterin in Amyloid-β–Associated Neurodegeneration.”


    jerrydavisumich

    February 26, 2014 at 1:04 pm

  7. The other “shocking” lesson from this weekend, not mentioned here, is that 120 articles were pulled from subscription journals because they had been generated by a computer program designed to churn out computer-science articles. (How … meta.) They were, in the language of one retraction, free of scientific content. These articles appeared in publications by Springer-Verlag, Elsevier, and other “well-respected” presses, in subscription journals purchased by your institution and mine. The Sokal Hoax, scaled up by technology.

    We can think about the publishing space as a two-by-two-by-two table, where the dimensions are:
    – open access or subscription
    – for profit or nonprofit
    – peer-reviewed or not peer-reviewed (actually, as Fabio pointed out in a prior post, there are levels of peer review, but let’s keep it simple)

    All 8 of the cells are populated.

    Open access journals can be nonprofit or for-profit. The latter, whether they are run by Sage or Elsevier or by the fly-by-night operation that spams my inbox regularly, are all about generating profits from the intellectual labor of academics. It’s what they do.

    Subscription journals can also be nonprofit or for profit.

    Both can be peer-reviewed, or not.

    “Duh,” right? But there’s a tendency for critics of open access to equate open access with lack of peer review, low quality, or, in this case, soundbite research generated for a public that’s incapable of differentiating good research from bad. Those are false equivalences.


    Kim Weeden

    February 26, 2014 at 1:10 pm

  8. Jerry: Thanks, though the larger point was about not tarring all OA journals with the same brush. See Kim’s post.
    And related to your response to Hector, are you so sure that this is not true of (closed-access) medical journals? See this great piece: http://www.nytimes.com/2014/02/09/opinion/sunday/why-nutrition-is-so-confusing.html.


    ezrazuckerman

    February 26, 2014 at 1:14 pm

  9. a) I assume you know that you did not address my point – the positive, well-balanced comments got significantly more recommend votes than the iffy ones you referred to.
    b) Furthermore, a bit ironic that you complain about people not realizing that their concerns were addressed in the published research – when the public doesn’t have access to the published research!!!
    c) And quite a strange inference to make, that PLoS ONE necessarily publishes stuff that is more accessible? Especially since the (awesome) ASQ paper in question is exceptionally accessible. I presented it to 500 students in a Philosophy of Science class, and it took me probably 90 seconds to explain.


    Anonymous

    February 26, 2014 at 2:14 pm

  10. @Kim and Ezra: this is an entirely fair and appropriate point. There are many flavors of open access with varied levels of quality, and many flavors of for-profit journal publishers (which also vary widely in how they do what they do), and it’s not fair to put all OA in one camp. I’m actually a big fan of open access and letting a thousand flowers bloom when it comes to journal formats (as Fabio might say). PLoS One’s editorial statement and philosophy are well-reasoned and smart. The idea of having the potential impact of an article emerge after publication, rather than being pre-judged by editors and reviewers, is intriguing, and could ultimately turn out to be a good idea, but judging by the “most downloaded” list each month (the most immediate measure of “impact”), there is reason to be skeptical.
    The number of journals has exploded since 2000, in all different formats. Web of Science indexes 174 “management” journals (triple the number from 2001), which between them published 34,000 articles between 2010 and 2013. PLoS One has published over 92,000 (peer-reviewed) articles. And then there are the various journals publishing computer-generated gibberish. It’s an open question how this has contributed to the net advancement of the social sciences, but there is reason for anxiety.

    @Anonymous (if that’s your real name): (a) some well-balanced comments got more recommends, but as I noted, so too did methods complaints. In any case, the median was 3 recommends, so it is not clear this is a great metric. (b) I guess you didn’t click the link in The Guardian either: it goes to the SSRN [free] pre-publication version of the article. But it’s also currently available free at the ASQ website [also linked above]. (c) Not all PLoS One articles are as easily digestible as the texting one, of course, but it does seem to be a preferred outlet for them. And the book award article can be conveyed in 90 seconds, until people ask questions [as peer reviewers do], and answering them requires fancy models and robustness checks. “People are descended from apes” takes only 5 words too, but the backup requires a few additional pages.


    jerrydavisumich

    February 26, 2014 at 3:05 pm

  11. My overall conclusion on the article, after giving it a 15-minute look-over (which, in typical blog commentary fashion, is less time than it took to write what is below!):

    My guess is that SocSci would have accepted this paper in its 30 day window but also would have made as a condition of acceptance that (a) more of the analysis be disclosed and (b) that the strength of the conclusion about the decline in the post-win rating be scaled back. This would lead to more steak with less sizzle. (Of course, this is a guess, since this final version has presumably gone through many rounds of revision in response to reviewer-induced pain.)

    One finding of the paper is that winners see their sales increase. This is uncontroversial and is used to set up the purported post-win decline in ratings as something unexpected and novel. SocSci would have accepted the sales boost finding but perhaps would have given the reaction: So what, can’t we just take that for granted, since no one would object to it? Awards confer visibility, and that drives sales, and so on.

    The novel finding of a post-win decline in ratings would have generated the following concerns, at least if I were the Deputy Editor:

    On context:

    Footnote 6 is violated at least for the Booker prize. Booksellers in the UK set up shelves for finalists, lots of people read all of them before and after the winner is announced, and everyone seems to enjoy attacking the committee’s eventual decision. Figure 1 implies that for books of similar quality, this dynamic is very much in play, with people then writing comments after the award is announced, etc., which are statements as much about the taste of the committee as about the books themselves. They may well have written their reviews before the winner was announced and then posted them as a type of commentary on the Booker decisions after the fact, revising to critique the Booker committee (and thereby attempting to enhance their own status as reviewers with “outsider” status). Also, the fact that Table 4 shows that this result is driven entirely by fiction awards makes the Booker prize especially important.

    On estimation:

    Diff-in-diff assumes that winning (i.e., selection into the treatment) is not a function of the pretreatment outcome (pre-win ratings). Essentially, the model requires that no regression-to-the-mean dynamic is present (as well as parallelism in the underlying trajectories, net of adjustment variables and the effect of interest). It is debatable whether this is all reasonable, since one can imagine the winning decision being a function of pre-win ratings by reviewers (although perhaps not on Goodreads, but one does wonder whether Oscars-style lobbying exists for Booker prize winners, and whether that is independent of pre-win Goodreads reviews, if the campaigns are public). Given this, it would be important to offer a lag variable alternative to the diff-in-diff coefficients.
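
    To make the regression-to-the-mean worry concrete, here is a minimal simulation sketch (all numbers hypothetical, nothing from the authors’ data): if “winning” is even partly a function of noisy pre-win ratings, a naive diff-in-diff can show a post-award decline even when the true award effect is zero.

    ```python
    # Toy illustration (hypothetical numbers, not the authors' data): regression to
    # the mean can masquerade as a post-award ratings decline when selection into
    # "winning" depends on noisy pre-win ratings and the true award effect is zero.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000                                  # hypothetical short-listed books
    quality = rng.normal(3.9, 0.3, n)           # latent quality on a 1-5 scale
    pre = quality + rng.normal(0, 0.3, n)       # noisy pre-announcement average rating
    post = quality + rng.normal(0, 0.3, n)      # post-announcement rating, no award effect

    # Selection on the pretest: the top decile of pre-win ratings "wins."
    won = pre > np.quantile(pre, 0.90)

    # Naive difference-in-differences: change for winners minus change for non-winners.
    did = (post[won] - pre[won]).mean() - (post[~won] - pre[~won]).mean()
    print(f"true award effect: 0.00   naive DiD estimate: {did:+.3f}")  # clearly negative
    ```

    A lagged-dependent-variable specification makes a different assumption about that selection, which is why offering it alongside the diff-in-diff coefficients would be informative.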

    On disclosure of results:

    I couldn’t really figure out from my quick read how the matches were formed, and there are no results that give the reader a clear sense of how much balance the matching achieved on whatever variables were matched on. Given the vastness of the matching literature, and how much is known now about what works and what doesn’t, SocSci would have asked for such details so that readers could fully evaluate the results.
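
    By disclosure I have in mind something as simple as a balance table on whatever was matched on; here is a rough sketch of the kind of summary I mean (column names are made up, since I do not have the authors’ file):

    ```python
    # Sketch of a simple balance table for a matched-pair design. Column names are
    # hypothetical; the authors' actual matching variables may differ.
    import numpy as np
    import pandas as pd

    def balance_table(pairs: pd.DataFrame, treat_col: str, covariates: list) -> pd.DataFrame:
        """Standardized mean differences between winners and their matched non-winners."""
        t = pairs[pairs[treat_col] == 1]
        c = pairs[pairs[treat_col] == 0]
        rows = []
        for x in covariates:
            pooled_sd = np.sqrt((t[x].var(ddof=1) + c[x].var(ddof=1)) / 2)
            rows.append({
                "covariate": x,
                "mean_winners": t[x].mean(),
                "mean_matched": c[x].mean(),
                "std_mean_diff": (t[x].mean() - c[x].mean()) / pooled_sd,
            })
        return pd.DataFrame(rows)

    # Hypothetical usage:
    # balance_table(pairs, "won", ["pre_win_rating", "pre_win_n_reviews"])
    ```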

    On the assumed causal model:

    Perhaps most fundamentally, the analysis is set up by conditioning on an outcome (only those nominated and of similar quality show up in the models) and then estimating the effect of winning the award within this subset. Both the “win” and “nominate” outcomes are structured by the quality of the books and the tastes of those making the decisions, which are presumably more “high brow” (or at least more “insider”) than the typical Goodreads review writer. To really evaluate the effects and conclusions, when you start off by conditioning on variables such as “being nominated” or “being a finalist” and then having high ratings among nominees (as determined by a different set of people), you need to acknowledge that you are quite likely inducing associations between underlying tastes and ratings, all while considering two different pools of people (those who award the prizes and those who write for Goodreads). I don’t see any discussion at all of the modern literature on these types of causal inference challenges, all of which fall under the heading of “be careful when conditioning on a collider.” Instead, the article goes straight from words into results, and an opportunity was missed here to dig deeply into the quite interesting causal inference challenges involved. At SocSci, I expect we would have asked the authors to write a bit more.
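
    The collider point is easy to see in a toy simulation (an invented data-generating process, not a claim about the actual prizes): two influences on being short-listed that are independent in the population become negatively associated once you condition on the short list.

    ```python
    # Toy collider illustration (invented data-generating process): "short-listed"
    # depends on both broad-reader appeal and committee/insider appeal. Independent
    # in the population, the two become negatively correlated within the short list.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 100_000
    reader_appeal = rng.normal(size=n)     # what Goodreads reviewers respond to
    insider_appeal = rng.normal(size=n)    # what prize committees respond to
    shortlisted = (reader_appeal + insider_appeal) > 2.0   # conditioning on the collider

    print("corr, all books:        ",
          round(np.corrcoef(reader_appeal, insider_appeal)[0, 1], 3))
    print("corr, short-listed only:",
          round(np.corrcoef(reader_appeal[shortlisted], insider_appeal[shortlisted])[0, 1], 3))
    # Roughly 0.0 overall, clearly negative within the short-listed subset.
    ```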

    Overall, I like the article quite a bit. But, if anything, it has too much sizzle!

    (Now the usual disclosure: I spent a lot less time with this than the reviewers and authors, and so what I write above may in fact be wrong and misleading. Evaluate the paper on your own. Most importantly, I am not trying to attack the authors, and I would have recommended to the EIC at SocSci that this one be accepted because it is inherently interesting and has quite deep methodological puzzles worth disentangling.)


    Steve Morgan

    February 26, 2014 at 3:53 pm

  12. You wrote: “It also showed how the “open access” model can create bad incentives for social science to write articles that are the nutritional equivalent of Cheetos.”

    This could be entirely true. However, it could also be true that people like submitting to PLoS One and Sociological Science, because when they’re put through the wringer by top-tier journals, that eats away at the time they want to spend on meaningful things like having a family life and mentoring students.


    Chris M

    February 26, 2014 at 7:48 pm

  13. In terms of open access, I think this is a matter of the timing of ‘success’: getting through the screen of an A-journal per se, or eventually making a lasting impact, perhaps after first being misunderstood by experts and novices alike. I am not sure the former ensures the latter. I am also not sure being misunderstood by novices prevents it. I think PLOS ONE’s idea that impact evaluation starts after publication, not before, makes sense in the long term.

    Let’s face it, the evaluation by expert peers is not perfect. The Lemons paper by Akerlof is a great example. I think the public (which includes experts!) would have discovered that one eventually.


    Ulrik

    February 27, 2014 at 10:30 am

  14. Steve, I wanted to respond to one specific part of your post:

    >(Of course, this is a guess, since this final version has presumably gone through many rounds of revision in response to reviewer-induced pain.)

    Actually, this paper only had 1 R&R at ASQ, and the process took slightly less than a year from the time the paper was first submitted to ASQ to its publication. So, thank you for your comments. I was almost feeling like I got shortchanged without multiple rounds of review and the addition of multiple new reviewers on each round! :)


    Amanda

    February 27, 2014 at 7:29 pm

  15. Apologies for over-interpreting “As is traditional at ASQ, the authors faced smart and skeptical reviewers who put them through the wringer, and a harsh and generally negative editor (me).” I assumed that the ASQ wringer is like other wringers I know, and I have never reviewed for ASQ. At Soc of Ed, we basically tried to move to a one R&R process during my four years as Deputy Editor, and we mostly accomplished it. I think it is the way to go for traditional “developmental” journals. If that is where ASQ already is, then good for them.


    Steve Morgan

    February 27, 2014 at 7:41 pm

  16. @Steve: at ASQ, our wringer is both developmental and fast. The editorial board is superb and responsive. We never add new reviewers after the first round (unless someone drops out), we almost never go beyond two rounds, and the turnaround times are pretty great: the mean time to first decision is 36 days [including manuscripts that don’t go out for full review], and the typical full review is about 60 days per round.


    jerrydavisumich

    February 27, 2014 at 8:08 pm

  17. Jerry: Your pride shines brightly.

    I am still awaiting your defense of why you and your tough reviewers missed so much on this one. Or maybe I am just wrong on the points above? Feel free to skewer me. I only gave it 15 minutes of reading. Surely your multiple readings, informed by harsh reviewers and superb editorial board, and your general negativity, have given you far more insight. Certainly when you went after those naive Guardian commenters, you had a great deal of confidence in the findings, based on your review process. So, go for it. Let me have it.


    Steve Morgan

    February 27, 2014 at 8:31 pm

  18. @Steve: since you started by saying you had only spent 15 minutes reading the paper, I spent 45 seconds reading your post, to maintain proportionality. So:

    Mentioning increased sales: this is a matter of taste; I felt it established the puzzle effectively.
    Reporting more results: still having a paper journal puts a practical limit on article length (and we asked for about a 20% reduction in text on the first round), but the reported tables address the main issues. The footnotes describe various alternative analyses not reported in the text, and I’m sure the authors would share their findings.
    Conditioning on a collider variable: see footnote 1.
    Selection for nomination: of course this came up in the review process. Given the nearly limitless number of possible nominees, modeling that selection directly isn’t practical compared with the matched-sample comparison.
    Matching: see “prior” models in table 2. Also read online appendix.
    Potential regression to mean: see robustness checks, esp. “Temporal effect of an award.”

    Of course, since I only spent 45 seconds reading your comments, this is just a scattered response, and I should not be held accountable for anything. But if you would care to read the paper thoroughly, I’m sure the authors would be delighted to respond to your queries as an impromptu fourth reviewer.

    Or, we could return to a discussion of the impact of different reviewing formats on the kinds of questions the field pursues.


    jerrydavisumich

    February 28, 2014 at 12:51 am

  19. @Jerry

    Let me first reiterate that I think it is a fine article, would have been published in SocSci, and that, on balance, there is more than enough suggestive evidence that their favored mechanism has support. The only questions for debate are whether the article is as exemplary as you claim, whether it rules out as many of the alternative explanations as the authors claim, and whether the ASQ development process should get as much credit relative to the authors as you imply.

    Your original post reads to me as: (a) ASQ has published a great paper, (b) open access journals publish crap papers, (c) people really should read the ASQ paper so that they can understand how great it is, (d) the masses will never do so, but (e) if they did so they would come to understand why journals like ASQ are a treasure that no one should challenge because they are uniquely able to develop such papers, and (f) open access journals will appeal to the masses and forsake serious scholarship. My take on this is (a) if ASQ is so great and helps people to produce such great work, and you have selected this paper as evidence of it, then why is this paper not as great as you say it is, (b) SocSci would have given feedback on a range of issues, but we still would have published the paper 9 months sooner. And without your wringer.

    If you or anyone else cares to read, here are the portions of the paper that seem underdeveloped to me, even after your ASQ process:

    It still feels to me that the authors, and very tough Editor, got the whole dynamic of the Booker prize wrong, unlike the Guardian readers you chastise. The authors note in the text that for these awards the shortlist is announced ahead of time (see page 10) but then in footnote 6 we get “A reviewer has raised the possible alternative explanation that award winners decrease more in ratings than the matched short-listed books because reviewers might feel sorry for the books that were short-listed but did not win. Though this is an interesting hypothesis, we do not believe it to be the case, as the award announcements in our sample do not include runners-up. This makes it unlikely that Goodreads reviewers are aware of who the runners-up were. Yet, to rule out this explanation, in additional analyses not shown here, we compared winners with all the short-listed books and found that the decrease in ratings for award winners was significantly more than that of the full set of short-listed books, rather than only the short-listed books that form our matched pair sample. Thus the contrast in the ratings trajectory of winners is evident not only in comparison to the book we chose as a matched counterpart but seems to occur in comparison with all short-listed books.”

    Basically, the authors claim that an analysis of the decline in the ratings of the award winner relative to all short-listed books somehow allows them to support the overall conclusion that winning causes people to think less of the winner and attracts others to the book who are less predisposed to like it. To me, this footnote 6 is a red herring. It is quite likely that many Goodreads reviewers are contrarian with respect to all short-listed books, not just the one that was selected by the authors for the matched pair based on pre-win ratings (perhaps precisely because they do not know who the runner-up is and so treat all short-listed books as the runner-up). The authors’ favored mechanism is probably part of the story, but the reviewer’s claim is not ruled out.

    Bottom line: the fact that the results the authors report for the “paradox” are entirely for the part of the sample that is for fiction awards (which is less than half of the overall sample of pairs) should have been a red flag to the Editor that something quite different was occurring for fiction books, almost half of which are Booker pairs. (I know less about PEN/Faulkner and when they release their shortlist and how it is received. The same dynamic could be at play, but it seems less likely relative to the excitement generated by the Booker shortlist and then the announcement of the winner a month later and then the customary criticism of the Booker committee.)

    Finally, the authors do offer an interesting direct analysis of backlash, but it is impossible to figure out how persuasive it is (because of the underreporting in your constrained, old-school print journal). Are those 2077 reviews restricted to the fiction books where the “paradox” exists? And, given that they are not a random sample of Goodreads reviews, the noneffect for them cannot necessarily be extrapolated to the rest of Goodreads reviews. What if they are the history-buff reviewers who select non-fiction books and tend not to care about awards and hence don’t participate in backlash? It would be nice to know.

    Footnote 1 shows that the authors in fact do know the literature on “conditioning on colliders” enough to discuss it in their theory setup and yet were not asked by the Editor to consider how consequential it was for all of their models and conclusions. You are correct I did not notice that footnote on my 15 minute read. Usually I expect relevant issues pertinent to design to be in the methods and results section, not in the long front end. Bottom line: causal assertions everywhere in this paper but not a single formal definition of a causal effect anywhere. And a synthetic collection of cases, with a clear fiction/non-fiction divide, that cannot be pinned to any underlying population or well-defined set of events. If pinned to anything, it is a collection of reviewers about which we also get no real information in the article.

    You also use the constraint of the print journal to excuse the authors from actually explaining what they seem to present as one of the most novel features of their analysis – how they select a matched sample of 32 pairs on which to build models. The supplementary appendix that you point me to gives nothing other than a list of books. It contains no explanation for how they arrived at that list of books. It would appear from what one can discern from the tables that they favored matches on prior ratings over matches on number of reviews. The result is that the pairs have quite similar rating levels, but winners had fewer reviews on Goodreads. So, while the level of ratings is matched, the gross number of people at those ratings is not. The result is that the nonwinners on average had a bit more aggregate support on Goodreads than winners in each pair. It is possible that this contributes to backlash through contagion, as there would be slightly more people who would be disappointed by the winner than people to rally behind the winner. If I were in their shoes, I would not have offered only a design with a near-exact match on ratings, since that excludes very many possible cases of relevance to the effect they care about. And imbalance in pre-win ratings can always be adjusted for in the parametric model. Instead, they picked only pairs that are finely matched on pre-win ratings, which are precisely the ones where backlash (the strongest alternative explanation) is likely to occur. It could have helped the results to show, if possible, that the ratings decline occurs even for the Booker winners that everyone assumed were going to win and then did win because there were no worthy competitors, as judged by pre-win ratings of winners and their feeble competitors. The favored mechanism should still play out for these types of winners.

    You offer no response to the diff-in-diff model specification issue, other than that I should have read the ‘Temporal effect of an award’ section. I confess that I did miss that one. It was indeed only a 15-minute read. Before that the authors write “Our identification strategy relies on comparing changes in ratings over time between books that won the award and the matched control books that were short-listed for the same award and are identical or very similar to the award-winning book in terms of the average rating and number of reviews received prior to the announcement of the winner.” That is the estimation strategy, followed by a rationale for confounder adjustment (which is the matching routine).

    The identification strategy rests crucially on the assumptions of the diff-in-diff model, which are parallel trajectories in the absence of the effect and no relationship between winning and pre-win ratings. Diff-in-diff can be very misleading if these assumptions are violated, and it is possible that they are. The authors have a pretreatment trajectory (not just the usual single pretest), and so they could actually test the appropriateness of the diff-in-diff model. I guess the Editor did not suggest this. Instead, they try to assess regression to the mean by looking at posttest trajectories, even though these are a function of the effects that they are analyzing. The required test is in the pretreatment trajectories, and they eliminated their ability to test for this by confining most of their analysis only to matched pairs rather than the full short-list sets. It is possible that the winner and matched partner are both on the same pre-win trajectory, and they could have shown this. If they are not, and either the winners or their matched partners had momentum going into the decision, then selection on the pretest is present, regardless of the post-win trajectories that may or may not reveal regression to the mean.
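
    Concretely, the kind of check I have in mind is a placebo version of the model run entirely on pre-announcement reviews; here is a sketch, with hypothetical variable names since I do not have their review-level file:

    ```python
    # Sketch of a pre-trend (placebo) check using only pre-announcement reviews.
    # Column names are hypothetical; this assumes one row per Goodreads review.
    import pandas as pd
    import statsmodels.formula.api as smf

    def pretrend_check(reviews: pd.DataFrame):
        """Does the eventual winner's rating already trend differently from its
        matched partner's before the award is announced?"""
        pre = reviews[reviews["months_to_award"] < 0].copy()  # pre-announcement only
        model = smf.ols(
            "rating ~ eventual_winner * months_to_award + C(pair_id)",
            data=pre,
        ).fit(cov_type="cluster", cov_kwds={"groups": pre["pair_id"]})
        # A significant interaction means the pre-win trajectories were not parallel,
        # which the diff-in-diff quietly assumes.
        return model.summary()

    # Hypothetical usage: print(pretrend_check(goodreads_reviews))
    ```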


    Steve Morgan

    February 28, 2014 at 5:12 am

  20. This interesting debate stimulated my reflections on the trajectories of the field. Scholars are supposed to be rigorous, but today “rigor” seems to be solely a matter of jumping on the train of the funkiest methodological tricks. Papers are often questioned in terms of methods and analyses. Fair enough: theories need to be subject to falsification, and if the Caudine Forks of the data are not passed, doubts on the quality of the research will be reasonably raised. However, when theories are really (and I mean REALLY) good, they do not need Caudine Forks. The best papers I’ve read are simple stories, which does not necessarily mean using others’ good theories to justify simple facts and figures, nor does it mean crazy empirical sections. If we expect the public to be familiar with DID, and if we reduce academic rigor to DID, then we can say goodbye to theory. Some might say that theories have often become a matter of cliques, but I would say that those are mostly ideologies, and no longer theories.
    Whether a journal is open or not, I would like to see theories back on the table, more effort to theorize about interesting facts, and honest effort in testing these theories. What I am tired of reading are papers that just reinforce others’ theories and randomly attach them to interesting patterns in the data, while testing them with methodological fads. This is becoming kind of frustrating.


    thewisdomofcrowds

    February 28, 2014 at 9:40 am

  21. In response to the above comment:

    Have a look at the SocSci submission guidelines on the website. We spent a good amount of time talking and debating this before going live, but it doesn’t seem like anyone has noticed them in the blogosphere. Maybe they are worth a blog post. We will be revising them at some point, and some feedback would be appreciated. (Fabio: Are you there? Please take the bait!)

    My personal view is that the support for one’s claim should be based on the simplest model that is sufficient to convey the claim, with the caveat that one ought to understand how you could support it with more complex models too. The essential point is that you know the implicit assumptions of your favored models and you’ve used the data to the maximum extent possible to consider them. You should be ready to disclose everything in response to an editor, but you should write your paper in as simple a way as you can manage. You should probably have an elaborate supplementary appendix that is ready to go up on your own website, and maybe the journal’s too. This all applies regardless of whether your claims are descriptive or causal.

    The paper in question is far, far better than most sociology articles on these dimensions, and my comments are mostly a response to the pridefulness of the ASQ Editor, and his disdain for open access and Guardian readers (the latter of whom know far more about the Booker prize than he does, I would bet). The fact that I will never submit to ASQ and will probably never review for ASQ, and generally like knife fighting, means that I have no problem going public against such pridefulness. He can be forgiven for being light on methodological standards (he trained before most of the modern literature on causal inference was known), but not for claiming he has done far better as Editor than he has.

    I don’t want to leave the impression that SocSci is only interested in papers that can justify causal effects. Not at all, and the submission guidelines make this clear. Different deputy editors will evaluate differently, and Jesper really does make the final call. In my case, I am most attracted to papers that fall cleanly on either side of a divide between novel descriptive analysis and convincing causal analysis. In sociology, we seem to have too many articles sitting on the fence, and the whole field suffers from it. In fact, we just rejected a paper two days ago by some very prominent people who can do great research, but gave us one of these middle-road stinkers.

    On good old-fashioned scholasticism of theory, I think this is worse in organizations research than in the areas where I work (where theory seems to recede into the background more and more every year … which is maybe a worse problem). But perhaps there is progress to be made. Better models, often simpler, and more disclosure could help.


    Steve Morgan

    February 28, 2014 at 12:37 pm

  22. Gee, wouldn’t it be great if our journals had comments sections, so that exchanges like the one between Professor Morgan and Professor Davis could be published with the article, and Professors Sharkey and Kovacs could respond if they wanted to? And all this could take place immediately, rather than 9 months after the initial article was published? And everyone with internet access could read the exchange?

    Oh, wait…


    anon

    February 28, 2014 at 12:58 pm

  23. This comment is posted on behalf of Jerry Davis, who had some technical problems posting.

    @Steve: It is possible that my original post was ever-so-slightly overstated, as your exegesis helpfully suggests.  Thinking as an editor, you probably noticed that the so-called experiment lacked experimental control, and that the comparison was just between two articles that happened to be covered on the same weekend in the same newspaper.  Some unobserved process assigned the articles to this situation; other unobserved processes prompted (unrepresentative) anonymous readers to write more or less insightful comments in response, from which I cherry-picked text to support a point.  Guilty: the affordances of blogs induce even more sizzle and less steak.  (As for the intemperate pridefulness, it’s not for my editorial skills, but for the journal.)

     

    But the comparison was attractive in other ways.  The texting article was a straightforward report of comparisons among three experimentally-induced conditions, and there was not much room for doubt in the review process (unless they just made up the numbers, or the paper was computer-generated gibberish, which is unlikely).  The book review article started with a puzzle, proposed some possible explanations, and tested them within the bounds of what is feasible given naturally-occurring data.  Any of us could think of a half-dozen ways the data could have gone one way or another; if you assemble diverse reviewers, you get a lot of options (and perhaps even more post-publication).  Through the review process, reviewers raised possible alternative explanations; authors responded; reviewers replied; authors responded more.  Papers improve through this back-and-forth in the review process, via a virtual conversation with people having varied expertise.  It’s nice to know that you would have accepted this paper too, and I think you agree that the conclusion of the paper is correct; I’d be intrigued as to where in the pre- or post-publication process the back-and-forth would go, particularly if it involved things like re-running analyses or changing the sample.

     

    And the larger point was not that traditional journals publish nothing but gold and that open access journals publish nothing but nonsense.  There are traditional journals that are nothing but nonsense; there are open access journals that are full of gold.  So far, Sociological Science is all gold, and given the people involved, I am confident that it will continue to be so in the future.  Creating a journal is hard work, and sustaining it is even harder, so you all deserve congratulations.  I do not expect any Cheetos in Sociological Science.

     

    The larger point was that the kinds of journals and the structure of review processes we have generate incentives for the field that can be more or less constructive.  Traditional journals can have lots of pathologies.  The review process can be slow and arbitrary.  In the name of making a paper as good as it can be, editors can drag authors through endless rounds of revisions, with new reviewers on every round, and then decide on round 6 to pull the plug (just as the author was submitting his/her tenure materials).  Meritorious papers fail to get published, sometimes not on grounds of the soundness of the work but of the magnitude of the contribution, which is arguably a matter of taste.  And on the face of it, the idea that knowledge should be conveyed via articles of a relatively fixed length and format, on a particular schedule (e.g., quarterly), seems like a vestige of a bygone era, when journals were printed and had subscribers.

     

    Steve points to a particular pathology of traditional journals: their lack of speed.  A breaking, topical finding that needs to be certified and gotten out into the world does not fit well with a system of multiple rounds of reviews, editing, and publication.  Better to publish first and ask questions later.  (Of course, there are steps that can be taken to speed up the review process and get papers into press quickly, as some journals have demonstrated, but there is an irreducible minimum with >1 round of review.)

     

    This was a premise behind PLoS One, as I understand it.  PLoS is an extremely well thought-out and structured experiment.  It is fast, open to the broad public, and provides a repository for work outside the mainstream that makes good use of contemporary technologies of communication.  It is non-discriminatory, in the sense that citizen scientists outside academia who do good work can get it published (if they have $1350), just as well as Stanford professors.  And the idea that post-publication tools will allow the significance of publications to emerge, rather than being filtered by editors and reviewers, is intriguing.  They have already published 92,000 papers (including 4224 in sociology), they list 5000 editors, and on any given day they might accept as many papers as some quarterly journals accept in a year.  Really, there is a lot to love about this.  In principle, we could abandon all journals and have nothing but one giant open access repository, with “quality” and “impact” judged ex post, thus avoiding the need to rely on the reputations of journals and the taste of editors.

     

    What’s not to love?  An endless sea of knowledge and insight!  But there is now a growing genre of research done specifically to be media-friendly that the intemperate might call “Cheetos.”  Tell-tale signs include op-eds by authors alluding to their research in the (inevitably) “peer-reviewed journal PLoS One.”  This is not work that has been helpfully translated from the original journalese to inform the public and policymakers (which of course can be valuable, as Shamus noted), or complicated puzzle-solving that rules out alternatives.  

     

    Of course, PLoS One was not created to enable Cheetos, any more than the Internet was created to enable porn.  It’s entirely possible that nearly all of those 92,000 articles are outstanding, cumulative science.  But if media friendliness and the network dynamic of downloads end up being the ex post processes that distinguish those papers perceived to have impact, that will be bad, as demonstrated by the kind of things that become hits.  Perhaps less Crick and Watson than “Gangnam style.”

     

    So, let a thousand publication flowers bloom.  Blog posts, op-eds, open access, traditional journals, books, interpretive dance, whatever – there is a suitable format for many forms of scholarly communication.  But let’s be cognizant of the incentives these formats create and their potential effect on the field.

     

    Finally: Steve, you are highly welcome to submit to ASQ, where you will encounter a speedy and developmental review process, and a time to publication that might be faster than you expect.  (Not to be prideful.  And knife optional.)


    fabiorojas

    February 28, 2014 at 2:27 pm

  24. So much has been made here of the 15-minute duration of SM’s read of the paper that I had to say this:

    If someone wants to understand every subtlety of one of my papers, particularly if they want to understand the motivation behind every analytical choice I made, they will need to spend more than 15 minutes reading it (and I certainly expect that of my reviewers).

    But, Jesus, I know that very few people are going to do that, and I would be *delighted* to have someone smart spend a serious 15 minutes reading a paper of mine and then share comments. (If a skilled, trained professional can’t understand the main issues at stake — including the motivation behind my *main* analytic choices — in a serious, focused 15-minute read, then I think I did something really wrong in the writing.) My biggest fear about my work is that *no one will read it*, or really think about it.

    I’m just sharing this perspective, which comes from a junior person, because there seemed to be a suggestion that there is something unseemly about sharing reactions to a paper that one has spent only 15 minutes reading. (Which, I guess, means we can only really comment on those few papers that are so central to how we develop our own ideas that we spend hours and hours poring over them.) Maybe that’s how senior people editing journals feel about commentary on their and their colleagues’ work. But it’s certainly not how it seems to me! Y’all are welcome to engage with my work in 15-minute increments any time you have the desire.


    Eliza

    February 28, 2014 at 2:52 pm

  25. As others have noted, the fact that we are having this discussion about Amanda and Balazs’ paper is, probably, one of the strongest endorsements for the Sociological Science model. Incidentally, the valuable discussion emerged only as a side conversation, which shows just how few opportunities we have to do this.

    Aside from that, there is an implicit assertion in Jerry’s original post that, while ‘the public’ is often careless, misinformed, and irresponsible in their comments, reviewers are not. But if we did a quick poll among those posting here and we all shared what some reviewers (in top journals) have written about our papers, I am sure those comments would compete with the worst offenders signaled by Jerry. With the aggravating factor that a careless reviewer writing in haste can block a paper from publication – or at the very least can force it to spend needless time in the review wringer. In an open access format, anonymous comments get less credit immediately (see Jerry’s “if that is your real name” above), which can limit careless behavior and especially its consequences. Moreover, both the authors and discerning, careful readers can weigh in. To be sure, the success of Sociological Science will depend on the care and dedication of its editors, but one thing we have learned from our recent experiences with sociology journals is that editors may matter even more in peer-reviewed journals.


    rodcanales

    February 28, 2014 at 8:00 pm

  26. I think Eliza’s point is worth emphasizing. It may take hundreds of hours to write, review and revise a paper. But the whole point of all that work is to make the first fifteen minutes of a properly trained peer’s reading worthwhile. It should be possible to quickly locate the answers to the questions the reader might have.

    In fact, I think a lot of reviewers (and editors) sometimes make their work too difficult for themselves by spending more than 15 minutes (desk) rejecting a paper. A reviewer should have a set of criteria for how much should be clear after 15 minutes of reading (conclusion, method, theory, say). If it’s not clear, that’s a reason to reject the paper (an editor who feels that way after 15 minutes should desk reject it).

    So, when the editors and reviewers have done their job, the reader can actually accomplish a lot in fifteen minutes. And one of the things s/he should be able to accomplish is to raise the kinds of questions Steve did. Now, given a world without blogs, and without a post that was trying to score the sorts of points Jerry was trying to score, Steve might spend another hour finding the answers to his questions in the paper. But as far as I can tell, he’s located stuff that may well remain significant weaknesses of the paper. Not reasons not to publish, as he points out, but perhaps bases for the critiques that the commenters made in the media coverage.


    Thomas

    February 28, 2014 at 8:25 pm

  27. This has been a very interesting discussion. I, personally, learned a lot, and this discussion has affected what I think about “public discussion” of articles post-publication.

    I have always been all for the transparency and truth and diversity of opinions that such an open discussion can bring out. I am still an advocate of that. But what I realized is that this can be really taxing on the authors. That is, if someone posts a comment or doubt about your paper, you as an author need to address it; otherwise the last public record will be an unanswered doubt. This obligation to reply to any future comments, however, means that the process never ends. And that is not something I look forward to. I am the kind of person who gets tired of a project during the publication process (I guess I’m not alone!). The main reason that I love getting a paper published is that then I can close the process and move on to other new and exciting projects. The key is “moving on.” The fact that in such public debates of my previously published papers I’d need to go back to old stuff essentially takes away the biggest satisfaction I derive from publishing a paper.

    So, to be constructive, I propose a compromise. I propose that in Sociological Science (or in any other journal that promotes post-publication discussion), there should be a time-window after the publication in which people can comment. Let’s make it, say, six months. A statute of limitations. Then the paper should be laid to rest and the author could move on.

    (Dear Reader: If you don’t agree with me, can you explain to me why, in Table 2 of your 1999 paper, the quadratic term is insignificant? And what would happen if you were to take the logarithm of your “age” variable?)


    Balazs Kovacs

    February 28, 2014 at 8:41 pm

  28. @Balazs: What satisfaction would you suggest your readers should derive from your paper? How long after reading it (how carefully) should they, too, “lay it to rest” and “move on”? At what point would you suggest it’s reasonable to stop thinking about the paradox of publicity in any great detail, and politely decline to discuss its basis in your research?

    I must say, the suggestion that the literature of the social sciences is a kind of graveyard full of ideas that their authors would prefer not to talk about explains a great deal!


    Thomas

    February 28, 2014 at 9:11 pm

  29. Balazs: Great reaction. If you read this page (http://www.sociologicalscience.com/for-authors/reactions-and-comments/), you will see that (a) all reactions to papers are moderated; and (b) they need to be timely. But we are still working on tweaking how to make this work, and your post is very very helpful food for thought.


    ezrazuckerman

    February 28, 2014 at 10:05 pm

  30. @thomas & @anonymous: This is getting a little nasty now, isn’t it? I’d encourage you both to check yourselves. You actually aren’t entitled to command my attention and energy just because you have a reaction to a paper that I have published. You can use it. Or not. I would guess that Balazs and his co-author feel more than a little jacked by Steve Morgan – whose “reaction” to their work serves no point other than “whose-is-bigger?” point-scoring against Jerry Davis – and it must be very hard to resist the urge to defend. Here’s hoping they can continue to do so!


    Lisa

    February 28, 2014 at 11:42 pm

  31. Also @thomas and @anonymous: I want to make clear that Balazs and Amanda did not volunteer for this duty. They were conscripts, dragged in by my post due to the random timing of the Times coverage on their piece. So, from their perspective, this was like walking into a room and being expected to deliver a lecture cold.

    It’s reasonable to ask “Am I eternally obligated to answer every question that ever arises on the Internet about every paper I have ever written?” If not, then let’s join Ezra’s discussion: what’s a productive way to bound conversation about articles? How do we have generative conversations that advance the field?


    jerrydavisumich

    March 1, 2014 at 12:02 am

  32. Comment quality is beginning to degrade, and if going forward people can’t abide by norms of basic online civility, we’ll shut down comments. I’m deleting anon’s comment. Balazs and Amanda responded to criticisms about their research in good faith and snarky anonymous wisecracks violate rule #6. In the future, I suggest that when engaging in a discussion about someone else’s work like this, people should reveal their true identities. I appreciate Amanda’s and Balazs’s willingness to engage.


    brayden king

    March 1, 2014 at 1:13 am

  33. Jerry and Lisa, you are not seriously stepping in to dress down one anonymous and one unknown commenter on behalf of two authors who have been published in the most prestigious journal in their field, and have received press coverage in the paper of record, are you?

    Everyone who publishes in ASQ “volunteers” to have their work scrutinized, because, after such scrutiny, that work has a good shot at being influential. Since scrutiny always leads to questions, it really is disappointing to hear an author who managed to publish serious work in a serious journal say that he really just did it so he could put his discovery behind him, and not answer queries about his work after the last copy editor has been satisfied. There is nothing “nasty” about expressing that disappointment, and worrying openly about how many people publish in the major journals with that attitude.

    Nothing about my comment suggests that anyone is “eternally obligated to answer every question”. My questions didn’t even go to the paper but to the attitude that after publication we should just let the authors rest on their well-earned laurels. We all agree that authors can answer whatever questions they think are relevant, leave unanswered those they assume others will also think are impertinent, and grant whatever points of criticism they themselves come to see are valid.

    Finally, and to move in the direction of Ezra’s discussion, it is interesting to put Jerry’s promise of “smart and skeptical reviewers who put [ASQ’s authors] through the wringer, and a harsh and generally negative editor” alongside the fact that Balazs apparently got “tired of [the] project during the publication process”. Isn’t that exactly what Steve’s remarks have been about? Let’s try to imagine a less painful publication process after which papers and their authors emerge ready to talk about their results and their basis.


    Thomas

    March 1, 2014 at 1:40 am

  34. (My comment was written while Brayden posted his. I really find all this very thin-skinned given the top notch certification of materials we’re discussing. And I think the thickness of the conversation’s skin is very much the issue, by the way.)


    Thomas

    March 1, 2014 at 1:42 am

  35. Thomas, this isn’t really about someone being thin-skinned. The problem is that you are hijacking the comment thread. Comments are off.


    brayden king

    March 1, 2014 at 2:11 am

  36. Comments are open again. Peace.


    brayden king

    March 3, 2014 at 10:31 pm

  37. Anonymous

    March 4, 2014 at 9:03 pm

  38. One thing that has bothered me in this discussion is that J.D. has repeatedly held up the ‘texting while walking’ study as an example of a dumb research question that gets public attention. But I think it’s helpful to know exactly what people do with their gait when they text. It doesn’t strike me as a dumb research question at all.

    Yeah, if your null is ‘People who are texting walk exactly like people who aren’t texting’ and all you conclude is ‘Untrue! They’re slower,’ then you haven’t learned much. But what the study actually identified is a series of very specific postural changes. This strikes me as potentially important for many reasons: it could tell us something about the body’s mechanisms for dealing with limited attention; it raises the potential for certain kinds of repetitive stress injuries; and, as the New York Times article notes, it suggests the interesting idea that texting grabs our attention in such a way that our bodies prioritize it over other physical needs in the moment.

    Physiology (and attention) aren’t my fields, so maybe these things really are trivial; I wouldn’t actually know. But I hate to see academics do to academics in other fields the thing we all hate when people do it to us, namely assuming that the work is useless because they’re not engaged in the conversation in which it raises new questions.


    Eliza

    March 8, 2014 at 11:14 pm

  39. @Eliza: you make an entirely fair point. The texting study was carefully done, using a 3D movement tracking system. (They’ve even added a post-publication correction.) I would recommend that people read it, but that does not always end as expected…
    Note, however, that I absolutely did not say it was “dumb” (which it is not), but straightforward and hard to doubt. An up-or-down review process favors such work, and disfavors work that requires colloquy with reviewers – you just can’t have the kind of back and forth about alternatives and doubts that is the strength (and bane) of traditional reviews. There is room for straightforward-and-hard-to-doubt studies as part of a balanced diet. Not every study needs to be a theoretical breakthrough, and a diversity of publication outlets enables many kinds of contributions. But a review process like PLoS One’s, premised on the idea that a submitted paper either meets a minimum quality threshold or not, is a trickier fit with papers that involve puzzle-solving, theoretical knottiness, or qualitative data. (From what I can tell, ethnography seems absent from PLoS One. Hard to generalize about 92,000 papers, but I don’t see a lot of non-quantitative stuff.)


    jerrydavisumich

    March 9, 2014 at 2:51 pm

