Archive for the ‘mere empirics’ Category
A while back, I discussed a new technique for organizing and displaying information collected through qualitative methods like interviews and ethnography. The idea is simple: the rows are cases and the columns are themes. Then, you shade the matrix with color. More intense colors indicate that the case strongly matches the theme. Clusters of similar colors indicate clusters of similar cases.
Dan Dohan, who imported this technique from the biomedical sciences, has a new article out with Corey Abramson that describes this process in detail. From “Beyond Text: Using Arrays to Represent and Analyze Ethnographic Data” in Sociological Methodology:
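To make the idea concrete, here is a minimal sketch of such an array in Python. It is not the authors' software: the case names, themes, scores, and the greedy ordering rule are all invented for illustration, and text characters stand in for color shading.

```python
# Hypothetical data: rows are cases, columns are themes.
# Scores in [0, 1] say how strongly a case matches a theme (invented values).
THEMES = ["family support", "financial strain", "trust in doctors"]
ARRAY = {
    "Case A": [0.9, 0.1, 0.8],
    "Case C": [0.1, 0.9, 0.2],
    "Case B": [0.8, 0.2, 0.9],
    "Case D": [0.2, 0.8, 0.1],
}
SHADES = ".:+*#"  # lightest to darkest, standing in for color intensity


def shade(score):
    """Map a score in [0, 1] to one of the shading characters."""
    index = min(int(score * len(SHADES)), len(SHADES) - 1)
    return SHADES[index]


def distance(row_a, row_b):
    """Sum of absolute differences between two cases' theme scores."""
    return sum(abs(a - b) for a, b in zip(row_a, row_b))


def order_by_similarity(array):
    """Greedy ordering: start with the first case, then keep appending
    the remaining case that is closest to the one just placed."""
    remaining = list(array)
    ordered = [remaining.pop(0)]
    while remaining:
        last = array[ordered[-1]]
        nearest = min(remaining, key=lambda name: distance(array[name], last))
        remaining.remove(nearest)
        ordered.append(nearest)
    return ordered


def render(array):
    """Return the shaded array as text, one row per case."""
    lines = []
    for name in order_by_similarity(array):
        cells = " ".join(shade(score) for score in array[name])
        lines.append(f"{name:8} {cells}")
    return "\n".join(lines)


print(render(ARRAY))
```

Placing similar cases next to each other is what lets the blocks of dark and light cells show up as visible clusters. A real project would likely use a proper clustering routine and a color heatmap instead of this greedy rule.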
Recent methodological debates in sociology have focused on how data and analyses might be made more open and accessible, how the process of theorizing and knowledge production might be made more explicit, and how developing means of visualization can help address these issues. In ethnography, where scholars from various traditions do not necessarily share basic epistemological assumptions about the research enterprise with either their quantitative colleagues or one another, these issues are particularly complex. Nevertheless, ethnographers working within the field of sociology face a set of common pragmatic challenges related to managing, analyzing, and presenting the rich context-dependent data generated during fieldwork. Inspired by both ongoing discussions about how sociological research might be made more transparent, as well as innovations in other data-centered fields, the authors developed an interactive visual approach that provides tools for addressing these shared pragmatic challenges. They label the approach “ethnoarray” analysis. This article introduces this approach and explains how it can help scholars address widely shared logistical and technical complexities, while remaining sensitive to both ethnography’s epistemic diversity and its practitioners shared commitment to depth, context, and interpretation. The authors use data from an ethnographic study of serious illness to construct a model of an ethnoarray and explain how such an array might be linked to data repositories to facilitate new forms of analysis, interpretation, and sharing within scholarly and lay communities. They conclude by discussing some potential implications of the ethnoarray and related approaches for the scope, practice, and forms of ethnography.
Today, we’ll continue discussing George Vaillant’s The Triumphs of Experience, the 70-year-long life course study. One of the major findings of the study is the importance of early childhood family conditions. The initial phases of the study asked participants to describe their childhood environment. Were their parents open and warm? Cold and removed? Divorced or still married? Also, the Grant Study investigators had the opportunity to interview parents and other family members on occasion. Did the interviewer think the mother was involved or removed?
Using these data, the Grant Study investigators coded a number of variables reflecting family environment. They recorded stratification variables (employed v. unemployed, working class v. upper class), structure (divorced v. married) and emotional content (warm parents v. cold parents). Then, they looked at the associations with a number of key life course variables. Three findings:
- First, having a warm father was associated with almost every positive life course outcome – flourishing in old age, not getting divorced, income. In some cases, the association is striking. In retirement, having a warm parent is associated with tens of thousands of dollars in additional income. That is amazing once you consider that this is an insanely biased sample of male Harvard grads. To push your income even higher in a batch of doctors, executives, and attorneys is stunning.
- Second, stratification variables don’t matter much. In other words, in this sample, having wealthy parents isn’t much of an asset.
- Third, divorce of parents does not seem to matter either once you account for having warm parents and having positive coping strategies.
Bottom line: Social networks seem to be crucial for the life course, not for their direct instrumental features (aka social capital), but mainly because they allow people to maintain the emotional composure that lets them solve problems and thrive.
This week, I will spend quite a bit of time discussing a book called The Triumphs of Experience by George Vaillant. I’ve written briefly about the book before, but I didn’t appreciate the magnitude of the book until I assigned it for a class. Roughly speaking, the book follows a cohort of college men from the 1940s to the mid-2000s. Thus, the book tracks people from young adulthood to old age. It’s a powerful book in that it uses enormously rich data to analyze the life course and identify factors that contribute to our well-being. You won’t find many other books that have such deep data to address one of life’s most important questions – What makes us happy? What is the good life?
In this first post, I want to briefly summarize the book and then note a few drawbacks. Later this week, I want to delve into two topics in more detail: alcoholism and parental bonds. To start: the Grant Study of Human development randomly selected a few hundred male Harvard undergrads for a long-term study on health and the life course. It’s a biased sample, but it’s well suited for studying long life and work (remember, many women became homemakers in that era) while controlling for educational attainment. The strength of this book is its ability to mine rich qualitative data on the life course and then map the associations over decades. The data is rich enough that the authors can actually consider alternative hypotheses and build multi-cause explanations.
A few drawbacks: Rhetorically, I thought the book was a bit wordier and longer than it needed to be. Also, I wish that the book had a glossary or appendix where one can look up definitions. More importantly, this book will not be convincing to folks who are obsessed with identification. It is very “1960s” in that they collect a lot of data and then channel their energies into looking at cross-group differences. But still, considering the importance of the data and the fact that running an RCT on your family is not possible, I’m willing to forgive. Wednesday: The importance of your family.
Ray Fisman and Tim Sullivan, an emeritus guest blogger, have written an article in Slate about the clustering of LGBT workers into specific occupations. In other words, is there any truth to the view that LGBT people tend to go into specific professions like cosmetology? Fisman and Sullivan use an ASQ paper to discuss the issue. The idea is simple – LGBT people are probably attracted to jobs that either (a) require subtle interactional skills, which they have cultivated because they live in a hostile environment, or (b) allow them to work by themselves, so they don’t have to deal with hostility or constantly try to stay submerged. From Fisman and Sullivan’s analysis:
The central thesis of Tilcsik, Anteby, and Knight’s paper is that gays and lesbians will tend to be employed at high rates in occupations that require social perceptiveness, allow for task independence, or both. They test their theory using data from the American Community Survey—a gargantuan study of nearly 5 million Americans conducted annually by the U.S. Census Bureau—and the U.S. National Longitudinal Study of Adolescent Health (Add Health), an ongoing study that has followed the same group of Americans since 1994. All Add Health respondents were in middle or high school in the mid-1990s, so they were just beginning to settle into their careers around 2008, the year the study uses for its analyses. Both data sets include questions that can be used to infer sexual orientation, as well as information on respondents’ occupations.
The authors connected these data to assessments of the extent to which particular jobs require social perceptiveness and whether they allow for task independence, which come from ratings from the Occupational Information Network, a survey of employees on what they see as their job requirements and attributes. The survey seems particularly well-suited to the researchers’ task. One question asks the extent to which workers “depend on themselves rather than on coworkers and supervisors to get things done” (task independence), while another asks whether “being aware of others’ reactions and understanding why they react as they do is essential to the job” (social perceptiveness).
The link between these attributes and sexual orientation is immediately apparent from browsing the list of the top 15 occupations with the highest proportions of gay and lesbian workers. Every single one scores relatively high on either social perceptiveness or task independence, and most vocations score high on both. According to the authors’ calculations, the proportion of gays and lesbians in an occupation is more than 1.5 times higher when the job both has high task independence and requires social perceptiveness.
Clever paper! The paper is also an excellent contribution to studies of occupational segregation that go beyond stories of human capital. Recommended!
In 1994, The Social Organization of Sexuality was published. The authors, Ed Laumann, John Gagnon, Robert Michael and Stuart Michaels, conducted a large-N survey of a random sample of Americans. I use the book in my freshman class to discuss sexual behavior. In today’s post, I will discuss what sociologists should take away from the book.
1. Doing a well-crafted large-N survey on an important topic is a huge service to science. When we think of sociology, we often think of “high theory” as being the most important. But we often overlook the empirical studies that establish a baseline for excellence. American Occupational Structure is just as important as Bourdieu, in my book. Laumann et al. is one such study and, I think, has not been surpassed in the field of sex research.
2. The book is extremely important in that it shows how good empiricism can abruptly change our views of specific topics. Laumann et al. basically shattered the following beliefs: people stop having sex as they age; marriage means sex is less frequent; cultural change leads to massive changes in sexual behavior. Laumann et al. showed that older people do keep on having sex; married people have more sex; and cultural moments (like AIDS in the 80s) have modest effects on sexual behavior. Each of these findings has resulted in more research over the last 20 years.
3. An ambitious, but well-executed, research project can be the best defense against critics. The first section of Laumann et al. describes how federal funding was dropped due to pressure. Later, the data produced some papers that had politically incorrect results. In both cases, working from the high ground allowed the project to proceed. It’s a model for any researchers who will be working against the mainstream of their discipline or public opinion.
4. Quality empiricism can lead to good theory. Laumann et al.’s sections on homophily motivated later theory about the structure of sexual contact networks and prompted papers like Chains of Affection. Also, the discovery that network structure affects STDs led to the introduction of network theory into biomedical science about a decade before Fowler/Christakis.
When we think of “glory sociology,” we think of succinct theoretical “hits” like DiMaggio and Powell or Swidler. But sociology is also profoundly shaped by these massive empirical undertakings. The lesson is that well-crafted empirical research can set the agenda for decades just as much as the 25-page theory article.
Via Vox: A JAMA Internal Medicine article discusses how people systematically overestimate the benefits of medical treatment. This speaks to a broader issue – we undervalue things like exercise, diet, sanitation, and vaccination for health and overvalue “hero medicine” and fancy interventions.
A few weeks ago, I suggested that one can use techniques from computer science to assess, measure, and analyze the field notes and interviews that one collects during fieldwork. The reason is that computer scientists have made progress in writing algorithms that try to pick up the emotional tenor or meaning of texts. These algorithms are not perfect by any means, but they would be a valuable tool for helping qualitative researchers identify themes and patterns in the text.
In the last round, there were two comments that I want to address. First, Krippendorf wrote: “Why call it computational ethnography and not just text analysis?” Answer: There are two existing modes of analyzing text, and techniques like sentiment analysis and topic modelling do new things in new ways. Allow me to explain:
- The traditional way of reading qualitative texts is simply for the researcher to read the texts and develop a grounded understanding of the meaning that the text represents. This is the standard mode among historians, most anthropologists, and some sociologists. Richard Biernacki in Reinventing Evidence in Social Inquiry argued that this is the only valid mode of qualitative analysis.
- The other major way to deal with qualitative materials is to conduct a two-step operation of having people code the data (using key words or other instructions) and then performing an intercoder reliability analysis (i.e., assign codes to texts and compute Krippendorff’s alpha).
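For readers who have not run the second mode, here is a minimal sketch of the reliability step: Krippendorff’s alpha for nominal codes, computed from a coincidence matrix. The function name and the example codes are made up for illustration, and a real project would probably rely on an established statistics package instead.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal codes.

    `units` is a list with one entry per coded text; each entry lists the
    codes the different coders assigned to that text (None = not coded).
    """
    # Coincidence matrix: every ordered pair of codes given to the same
    # text by two different coders, weighted by 1 / (codes in text - 1).
    coincidences = Counter()
    for codes in units:
        values = [code for code in codes if code is not None]
        if len(values) < 2:
            continue  # a text coded by fewer than two people is not pairable
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (len(values) - 1)

    n = sum(coincidences.values())
    if n == 0:
        raise ValueError("no text was coded by at least two coders")

    # Marginal totals: how often each code appears among pairable values.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight

    observed = sum(w for (a, b), w in coincidences.items() if a != b) / n
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n * (n - 1))
    if expected == 0:
        raise ValueError("alpha is undefined when only one code is used")
    return 1.0 - observed / expected


# Hypothetical example: two coders label ten interview excerpts 0 or 1.
coder_a = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
coder_b = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
alpha = krippendorff_alpha_nominal(list(zip(coder_a, coder_b)))
print(round(alpha, 3))
```

An alpha of 1 means perfect agreement and values near 0 mean agreement no better than chance, so the two hypothetical coders above agree only slightly more often than chance would predict.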
So what is new? Techniques like topic models or sentiment analysis do not use people to code data. After you train the algorithms, it is all automated. This has advantages – speed, reproducibility, and so forth – for large data. Another novel aspect is that these algorithms are usually built with some sort of model of language in mind that gives you insight into how the text was coded. For example, the Stanford NLP package essentially breaks down sentences by grammar and then estimates the distribution of words with specific sentiment. Thus, there is an explanation for every output. In contrast, I can’t reproduce even my own codes over time. Give me the same set of texts next week, and it will be coded a little differently.
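To illustrate the reproducibility point, here is a toy sketch of automated coding. It is a plain word-list counter, far simpler than the grammar-based approach described above, and the word lists are invented for the example rather than taken from any published lexicon.

```python
import re

# Hypothetical word lists, invented for this example.
POSITIVE = {"warm", "happy", "supportive", "hopeful"}
NEGATIVE = {"cold", "angry", "hostile", "afraid"}


def sentiment(text):
    """Score a passage and report which words produced the score."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = [word for word in words if word in POSITIVE]
    negative = [word for word in words if word in NEGATIVE]
    return {
        "score": len(positive) - len(negative),
        "positive": positive,
        "negative": negative,
    }


note = "My father was warm and supportive, never cold."
print(sentiment(note))
```

Running it on the same note always gives the same score, and the returned word lists show why the score came out that way. The example also shows the limits of such a crude approach: it counts “cold” as negative even though the note says “never cold,” which is the kind of error that grammar-aware models are meant to reduce.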
Second, a number of commenters were concerned about the open-ended nature of notes, the volume of materials, and whether the sorts of things that might be extracted would be useful to sociologists. These comments are easily addressed. Lots of projects produce tons of notes. I recently collected 194 open-ended interviews. My antiwar project resulted in dozens and dozens of interviews. We have the volume. Sometimes the notes are standardized, sometimes not. That’s an empirical issue – how badly do these algorithms do with unstructured text? Maybe better than we expect. There is no reason for an a priori dismissal. Finally, I think a little induction is helpful. Yes, we can now pick up sentiment, which is an indicator of emotion, but why not let the data speak to us a little? In other words, there’s a whole new world around the corner. This is one step in that direction.