when is data collection (ever) over?

In a previous post’s discussion, Graham Peterson kindly shared a link to a great set of videos of Howie Becker’s thoughts about conducting research, specifically at the graduate level.  I found the last video clip of particular interest.  In “10. Savoir finir : Comment achever une thèse alors que les données de terrain ne cessent d’affluer ? [Knowing how to end: how to finish a thesis when field data keeps arising?],” Becker discusses the issue of knowing when to end data collection.  Some signals make this clear – the funding runs out, the return ticket’s date shows up, or less frequently, the field site closes.  (For historians, the re-closure of an archive is the equivalent.)

I would add another possibility – sometimes the data collection doesn’t end, and a researcher continues to analyze, write, and publish along the way, particularly if the phenomenon under study changes and sparks additional areas of inquiry.  New research questions may arise, or the researcher may add other field sites for comparison.  As I tell grad students who are deciding among research projects, it’s likely that a researcher will live with an ethnographic research project for years beyond the dissertation’s completion, particularly if s/he writes a book and needs to publicize it.  Of course, out of expedience or boredom, some researchers will quickly move on to another research project.  However, with ethnographic research, the researcher who can return to the same field site faces the dilemma of sunk costs – forming relations with informants and developing expertise and local knowledge all take time, and it may be difficult to give all that up, especially when the passage of time starts to reveal dynamics not readily apparent before.


Written by katherinechen

July 8, 2013 at 3:40 pm

Posted in research


10 Responses


  1. In anthropology, the model is often continual. If you really want to study social and cultural change, you really have to revisit the field site for decades.



    July 8, 2013 at 5:11 pm

  2. So is the recommendation for, say, a graduate student to a) take a break from fieldwork and try to publish something after a reasonable amount of time; b) continue fieldwork AS s/he tries to publish based on already collected data; or c) wait a decade or so and publish something–probably a book–based on the entirety of fieldwork?



    July 8, 2013 at 5:48 pm

  3. J, my impression is that each round of field work generates articles or monographs. So the grad student does about 1-2 years in the field, then writes up the diss/book, then applies for funding. Etc. In Ye Olden Days of anthropology, you might work on a big project, so you alternate writing/field work, and the PI worked on the funding part of it. I’d be interested if any anthropologists have thoughts on this.



    July 8, 2013 at 6:09 pm

  4. Fabio – yes, the same pattern is apparent in the sciences. For example, Peter and Rosemary Grant’s two-decade-long study of evolution among finches in the Galapagos.

    J, researchers travel different paths, but most will do a combination of what you describe. Each route will have trade-offs in terms of time spent and potential scholarly impact. I’ll address each of the options you listed:
    A) Deadlines for talks, conferences, and special issues in journals, for example, provide “natural” breaks as well as deadlines and orientations for presenting work in progress, getting feedback, and building relations with colleagues.
    B) Some will do follow-up research, but on a less intensive basis (just an occasional check-in), as they are writing up research. Others will exit the field completely during the write-up, only to return some time later.
    C) A few will elect to devote substantial time to crafting a book, including collecting additional research. This can be a high-risk strategy if the researcher isn’t working on other publications in parallel, since some academic institutions expect scholars to demonstrate yearly productivity.
    Others will decide to publish articles based on, say, 20 or even fewer days of field research – this is more apparent in fields such as marketing.
    Perusing the methods section across a variety of books vs. articles will help you understand some strategies that people use or, more likely, fall into.



    July 8, 2013 at 6:17 pm

  5. The facets discussed here: long-run research programs, revisiting primary data, letting data guide questions through “data mining,” updating, and heuristic search — should not in any way be limited to ethnographic research.

    In fact, these are precisely the qualities quantitative researchers need *more than anyone* to imbue in their methodological consciousness, because it is in Testing Specific and Precise Hypotheses on Big Official Already Collected and Settled Data Sets that quantitative researchers fool themselves into believing they’re discovering Settled Objective Truths.

    This is, as I understand him, part of one of Becker’s older (and correct) complaints about quantitative research, whence he’s unfortunately been misread as a mere anti-quant complainer.


    Graham Peterson

    July 9, 2013 at 5:22 am

  6. “…the funding runs out, the return ticket’s date shows up, or less frequently, the field site closes …” or, alternatively, “…never…”

    “…deadlines for talks, conferences, and special issues in journals…”

    “…some academic institutions expect scholars to demonstrate yearly productivity….”

    Then, suddenly, a reference to the “methods section”. It’s strange to me that no actual methodological principles are being suggested here. Surely, the question of when to stop collecting data depends on when you have collected enough data to be able (“in principle” if you will) to answer the question that the data was being collected to answer. Surely, a research project can fail to yield useful data precisely because the funding runs out, the return ticket’s date shows up (and can’t be extended), or the field site closes. And then (surely!) it should make no difference that a deadline for a special issue or a year-end review arrives and forces you to publish results based on inadequate data.

    Surely, that is, data is collected according to criteria of adequacy that are defined in advance of the study, and for which an adequate amount of funding (and travel) is then arranged. (Failing which, the study is not undertaken and the knowledge is left unclaimed pending sufficient data.) Surely, social scientists don’t publish whatever data they happen to have in order to demonstrate that they are “productive” to their employer, regardless of whether that data is sufficient to answer an interesting research question.

    Surely! Or what?

    On the view of data collection that is being presented here, it’s no wonder that so much social science is being retracted these days.



    July 9, 2013 at 12:13 pm

  7. Thomas, point taken, I should have made more explicit why I recommended reading a variety of methods sections. Such reading will reveal that the most commonly cited reason for the duration of research in ethnographic researchers’ methods sections is achieving “theoretical saturation.” However, talks such as Becker’s reveal that multiple factors can contribute to researchers’ decisions about how to pace their projects. External “cues” such as deadlines can help remind researchers as they attend to various responsibilities (e.g., teaching, service) along the way, whether they are doing a short or long project, as I discuss in an earlier post, “productively handling structured and unstructured time as a scholar.”



    July 9, 2013 at 12:56 pm

  8. Yes, that’s the sort of thing I’m thinking of. I think, if we’re honest, we’ll find that we talk much more often about the pragmatism, if you will, of research, and that this has often come to supplant methodological issues, crowding them out. Research is taken to reach “theoretical saturation” at a time when a variety of theoretical, methodological and practical conditions converge.

    This has in fact been my critique of Michele Lamont’s research on “how professors think”, which is essentially a pragmatist projecting her own pragmatist sensibilities on other researchers and, in the process, exposing a complete lack of principled reasoning … which she also, without quite realizing what a scandal it would be if it were true, attributes to her object (i.e., researchers determining who gets research grants).

    I think what I’m after here is that social scientists should feel obliged to wait until their research results reach a point at which the quality of the data allows them to safely “transcend” the practical contingencies. The way we’re discussing it these days, however, and the way it is presented in this post (and subsequent comments) tends towards a kind of shoulder-shrugging attitude about the adequacy of one’s empirical material to support one’s conclusions, and some hand-waving about “theoretical saturation”, which I think we’ll have to agree is not a very rigorous notion, especially when we’re talking about qualitative data, right? That is, I’m afraid we’re not teaching students well enough, or with enough seriousness, what their obligations are as far as data collection goes.



    July 9, 2013 at 3:02 pm

  9. Great question, interesting comments.

    So, “theoretical saturation” is the acceptably humble euphemism for “I think I know something”?

    I want to approach the question by making a distinction between two kinds of “work,” private-sector data analysis and scholarship, both of which use “data.” I think the distinction matters because more and more sociology is being done in the private sector; for example, see a series of recent orgtheory posts on brain drain, big data, and computational sociology. And the norms aren’t the same.

    Data analysis in the private sector (where it seems most computational sociology is being done) is much more deadline driven than academia (and certainly more than Thomas would seem to prefer). When you need to know something, you need to know something, and it is because you need to move on to something else. Data rarely work that easily around deadlines, so you make your best guess of the data: your best model.

    Something like Thomas’s ideal:

    … “social scientists should feel obliged to wait until their research results reach a point at which the quality of the data allows them to safely “transcend” the practical contingencies.” …

    is not proposed with the world of private data analysis in mind.

    In contrast, academic departments (the source of the world’s scholarship) provide workers time to mull over their subjects and allow the workers to know more than they have to — to, relatively speaking, reach Thomas’s vision. And in my experience they do: sociologists invariably know their subject more than they have to. In contrast, programmers rarely know much more than what is needed to build a model and move on.

    I am not making a value judgment on either set of expectations, just saying that the answer to the question atop this post depends on the client, the organization, the mentor, or the department for whom you are doing the data work.

    One last comment, off track: It amazed me to find out that the majority of data analysis in both academic sociology and programming is really fiddling with the model (or the code). A model is never more than a theory, which suggests that analysis is first and foremost theory work.



    July 9, 2013 at 10:39 pm

  10. Austen, thanks for the nuanced insight. I agree; in the academy, I see colleagues who have issues with too much rather than too little data. The temptation to keep collecting more and more data often sends researchers down the rabbit hole.



    July 11, 2013 at 11:01 pm

