biernacki book forum, part 1: why we should think about coding very carefully

Read Andrew Perrin’s review at Scatterplot.

This spring, our book forum will address Richard Biernacki’s Reinventing Evidence in Social Inquiry: Decoding Facts and Variables. In this initial post, I’ll describe the book and give you my summary judgment. Reinventing Evidence, roughly speaking, claims that numerically coding extended texts is a very, very bad idea. How bad? It is soooo bad that sociologists should just stop coding text and abandon any hope of providing a quantitative or numerical coding of texts or speech. It’s all about interpretation. This is an argument that prevents a much-needed integration of the different approaches to sociology, and it deserves a serious hearing.

In support of this point, Biernacki does a few things. He makes an argument about how coding text lacks validity (i.e., associating a number with a text does not correctly measure what we want it to measure). Then he spends three chapters going back to well-known studies that use content analysis and argues, at varying points, that the coding is misleading or obviously incorrect, or that there was no consistent standard for handling the text or the data.

As a proponent of mixed methods, I was rather dismayed to read this argument. I do not agree that coding of text is a hopeless task and that we should retreat into the interpretive framework of the humanities. There seem to be regularities in speech, and other text, that make us want to group them together. If you accept that statement, then it follows that a code can be developed. So, on one level I don’t buy into the main argument of the book.

At a more surface level, I think the book does some things rather well. For example, the meat of the book is in replication, which many of us, like Jeremy Freese, have advocated. Biernacki goes back and examines a number of high-profile publications that rely on coding texts and finds that they leave a lot to be desired.

Next week, we’ll get into some details of the argument. Also, please check out our little buddy blog, Scatterplot. Andrew Perrin will be discussing the book and offering his own views.

Adverts: From Black Power/Grad Skool Rulz

Written by fabiorojas

April 2, 2013 at 12:45 am

6 Responses

  1. Fabio, I agree that, if we use common sense, it does not take a lot of mental gymnastics to conclude that we should be able to derive numeric values to capture and measure patterns in textual material or data. On the flip side, I am somewhat sympathetic to worries about the over-reduction of text into numbers and numbers only. What I would like to see is some exceptional work, supported by theory, about the transformation of text into numbers and then, on balance, numbers back into text again (and I think your comment about repetition is right on: have a content analysis repeated three times by three separate groups and I’d be more convinced by the findings). Surely, STS has been a home to some of this discussion about text-to-numeric translation work; however, I’d like to see it enter into the mainstream discussions in sociology …



    April 2, 2013 at 1:34 pm

  2. […] Variables by Richard Biernacki. With Fabio breaking up his contributions into digestible chunks, starting out with what is largely a summary of the aims of the book, i wasn’t sure where this comment would fit into the discussion, so thought i’d just […]


    my $0.02 | re-musing

    April 2, 2013 at 7:44 pm

  3. I think Biernacki’s underlying idea is that the meaning of a text is not a “fact” about that text and therefore cannot be discovered merely by analyzing its properties. What he calls the “ritual” of coding is a means of attributing (by a kind of magic) a meaning to a text that belongs to it by virtue of properties it possesses even when decontextualized (such as the frequency or proximity of particular words and phrases). But what a text means is entirely dependent on context, albeit not simply, as Andrew puts it, “the cultural milieux in which [it was] produced”. We should seek the meaning of a text in the context in which the text circulates, which will often be far from its source. So, for example, you don’t determine whether or not Tropic of Cancer is a misogynist work or a racist one by counting the occurrence of particular epithets, but by reading the text in the context of the critiques that have been leveled against it.

    I like to put it this way: you can’t hope to know what a text means, you can only try to understand its meaning. Now, I don’t think the argument will be won or lost on this basic epistemological premise. (I also think Andrew is wrong to suggest that the book is simply epistemologically “indefensible”, though I grant many of the other flaws he suggests.) What I like about Biernacki is that he shows how and where coding simply fails, in particular cases where it has been taken to be successful. I have my own examples of similar cases of methodical misreading. Actually, I don’t think coding is the root of the problem. I think it’s the very idea that texts can be analyzed methodically instead of reading them carefully. What Biernacki is exposing is a further manifestation of methodolatry in social science.



    April 2, 2013 at 7:46 pm

  4. […] blog. Next Monday, I will resume blogging with Part 2 of our discussion of Reinventing Evidence. (Part 1, book review at […]


  5. […] Part 1, Scatterplot review by Andrew Perrin. […]


  6. This is a worthwhile debate. I personally feel that the big data and auto coding tools hold promise, and we have seen interesting applications. Google translate is one way to see how these big sets of linguistic rules (i.e. codes) can get us toward gleaning a general idea about the meaning, albeit with considerable noise. So broad coverage, but of course, the results have to be interpreted carefully.

    Now is that the same as holistic meaning interpretation, as we have when we as people do the coding? No, not really. Auto coding trades nuanced interpretation of meaning for fast parsing of big data, and it is also a replicable technique. You could even save a script, as you might in R, and run it again with new sources, etc. For this reason, I’d say that it sits more closely in the quantitative camp, where these issues of generalization, i.e. “does the demographic category checked by the respondent actually fit the category they identify with?” have already been worked out.

    But let’s not mistake this for the painstaking interpretation of meaning that we do in ethnography. When I do ethnographic coding, my codes are my labels for a type of phenomenon or process, and this is my expert interpretation. Unfortunately, that sensemaking is hard to automate, and it sure as hell isn’t transparent, even if I can give you an account of what I did.

    As an aside, Gioia’s methods are also amazing and indeed quite rigorous. What I find interesting is that he is in some ways more in the holistic camp, for example, in suggesting that one of his cases can only map onto one single publication.



    April 16, 2013 at 4:25 am
