Archive for the ‘mere empirics’ Category
In North Carolina, this is called the “Vaisey Cart.”
I recently began working with a crew of computer scientists at Indiana after being recruited to help with a social media project. It’s been a highly informative experience that has reinforced my belief that sociologists and computer scientists should team up. Some observations:
- CS and sociology are complementary. We care about theory. They care about tools and applications. Natural fit.
- In contrast, sociology and the other social sciences are competing over the same theory space.
- CS people have a deep bucket of tools for solving all kinds of problems that commonly occur in cultural sociology, network analysis, and simulation studies.
- CS people believe in timely problem solving and efficient workflows. Rather than write over a period of years, they believe in “yes, we can do this next week.”
- Since their discipline runs on conferences, the work is fast and it is expected that it will be done soon.
- Another benefit of the peer reviewed conference system is that work is published “for real” quickly and there is much less emphasis on a few elite publication outlets. Little “development.” Either it works or it doesn’t.
- Quantitative sociologists are really good at applied stats and can help most CS teams articulate data analysis plans and execute them, assuming that the sociologist knows R.
- Perhaps most importantly, CS researchers may be confident in their abilities, but they are less likely to think that they know it all and need no help from others. CS is simply too messy a field; in that respect, it resembles sociology.
- Finally: cash. Unlike the rest of the arts and sciences, there is no sense that we are broke. You still have to work extra hard to get money, but it isn’t a lost cause the way it is in sociology, where the NSF hands out only a handful of grants. There is money out there for entrepreneurial scholars.
Of course, there are downsides. CS people think you are crazy for working on a 60-page article that takes 5 years to get published. Also, some folks in data science and CS care more about tools and nice visuals than about theory and understanding. As a corollary, some CS folks may not appreciate sampling, bias, non-response, and the other issues that normally inform sociological research design. But still, my experience has been excellent, the results exciting, and I think more sociologists should turn to computer science as an interdisciplinary research partner.
I was recently working on a paper and a co-author said, “Yo, let’s slow down and Bonferroni.” I had never done that statistical position before and I thought it might hurt. I was afraid of a new experience. So, I pulled out my 1,000-page Greene econometrics text… and Bonferroni is not in there. It’s actually missing from a lot of basic texts, but it is very easy to explain:
If you are worried that testing multiple hypotheses, or running multiple experiments, will allow you to cherry-pick the best results, then you should lower the alpha for your statistical significance tests. If you test N hypotheses, your new “adjusted” alpha should be alpha/N.
Simple, ya? No. What you are doing is trading Type 1 errors for Type 2 errors: you are increasing false negatives. So what should be done? There is no consensus alternative. Andrew Gelman suggests a multilevel Bayesian approach, which is more robust to false positives. There are other methods as well. It is probably something that should be built into more analyses. Applied stats mavens, use the comments to discuss your arguments for Bonferroni-style adjustments.
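For the curious, the adjustment really is a one-liner. Here is a minimal sketch (the function name and the example p-values are mine, invented for illustration): each p-value is compared against alpha/N instead of alpha.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return which hypotheses are rejected under a Bonferroni correction.

    Each p-value is compared against alpha / N, where N is the number of
    tests, which controls the family-wise error rate at alpha.
    """
    n = len(p_values)
    adjusted_alpha = alpha / n
    return [p < adjusted_alpha for p in p_values]

# Five tests at the conventional alpha = 0.05: each must now clear 0.01.
print(bonferroni_reject([0.003, 0.04, 0.008, 0.20, 0.012]))
# Only the p-values below 0.01 (0.003 and 0.008) survive.
```

Note how 0.04 and 0.012, which would pass an uncorrected test at 0.05, are no longer significant. That is exactly the Type 1 / Type 2 trade-off described above.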
A while back, Andrew and I got into an online discussion about the obesity/mortality correlation. He said it was true; I was a skeptic because I had read a number of studies that said otherwise. Also, the negative consequences of obesity can be mitigated via medical intervention. E.g., you may develop diabetes, but you can get treatment so you won’t die.
The other day, I wanted to follow up on this issue, and it turns out that the biomedical community has come up with a more definitive answer. Using standard definitions of obesity (BMI) and mortality, Katherine Flegal, Brian Kit, Heather Orpana, and Barry I. Graubard conducted a meta-analysis of 97 articles that used similar measures of obesity and mortality. Roughly speaking, many studies report a positive effect, many report no effect, and some even report a negative effect. When you add them all together, you get a correlation between high obesity and mortality, but the correlation does not hold at BMI ranges closer to normal weight. From the abstract of Association of All-Cause Mortality With Overweight and Obesity Using Standard Body Mass Index Categories: A Systematic Review and Meta-analysis, published in 2013 in the Journal of the American Medical Association:
Conclusions and Relevance Relative to normal weight, both obesity (all grades) and grades 2 and 3 obesity were associated with significantly higher all-cause mortality. Grade 1 obesity overall was not associated with higher mortality, and overweight was associated with significantly lower all-cause mortality. The use of predefined standard BMI groupings can facilitate between-study comparisons.
In other words, high obesity is definitely correlated with mortality (Andrew’s claim). Mild obesity and “overweight” are correlated with less mortality (a weaker version of my claim). The article does not settle the issue of causation. It may well be that less healthy people gain weight. E.g., people with low mobility may not exercise or may adopt bad diets. Or people who are very skinny may be ill as well. Still, I am changing my mind on the basic facts – high levels of obesity increase mortality.
This is the last post for now about The Triumphs of Experience. In today’s post, I’d like to focus on one of the book’s major findings: the extreme damage done by alcoholism. In the study, the researchers asked respondents to describe their drinking. Using the DSM criteria and respondents’ answers, people were classified as occasional social drinkers, alcoholics, or former alcoholics. Abstainers were very few, so they receive no attention in the book. People were classified as alcoholics if they indicated that drinking interrupted their lives in any significant way.
The big finding is that alcoholism is correlated with nearly every negative outcome in the life course: divorce, early death, bad relationships with people, and so forth. I was so taken aback by the relentless destruction that I named alcoholism the “nuclear bomb” of the life course. It destroys nearly everything and even former alcoholics suffered long term effects. The exception is employment. A colleague noted that drinking is socially ordered to occur at night, so that may be a reason people can be “functioning” alcoholics during the day.
The book also deserves praise for adding more evidence to the longstanding debate over the causes of alcoholism. This is possible because the Grant Study has very rare and detailed longitudinal data. The authors are able to test the hypotheses that the development of alcoholism is correlated with an addictive personality (“oral” personality in older jargon), depression, and sociopathy. The data do not support these hypotheses. By itself, this is an important contribution.
The two factors that do correlate with alcoholism are having an alcoholic family member and the culture of drinking in the family. The first is probably a marker of a genetic predisposition. The second is about education – people may not understand how to moderate if they come from families that hide alcohol or abuse it. In other words, families that let kids have a little alcohol here and there are probably doing them a favor by teaching moderation.
Finally, the book is to be commended for documenting the ubiquity of alcoholism. In their sample, alcoholism occurs in about 25% of the men at age 20. By the mid-40s, alcoholism reaches its peak, with about half of the men classified as alcoholics. After age 50, it declines – mainly due to death and to becoming a “former alcoholic.” If these findings generalize at all, they show that alcoholism has probably been wrecking the lives of millions and millions of people, somewhere between a quarter and half the population. That’s a profound, and shocking, finding.
A while back, I discussed a new technique for organizing and displaying information collected through qualitative methods like interviews and ethnography. The idea is simple: the rows are cases and the columns are themes. Then, you shade the matrix with color. More intense colors indicate that a case strongly matches a theme. Clustering of colors indicates clusters of similar cases.
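To make the idea concrete, here is a toy sketch of such a case-by-theme array. Everything in it is invented for illustration (case names, themes, and intensity scores), the “shading” is done with text characters rather than color, and the row ordering is a crude stand-in for real clustering: cases are grouped by their dominant theme.

```python
# A toy "ethnoarray": rows are cases, columns are themes, and each cell
# records how strongly a theme appears in that case's fieldnotes (0-4).
SHADES = " .:*#"  # five intensity levels, low to high

themes = ["stigma", "coping", "family"]
cases = {
    "Case A": [4, 1, 0],
    "Case B": [3, 2, 0],
    "Case C": [0, 1, 4],
}

def render_ethnoarray(cases, themes):
    """Return a text rendering of the case-by-theme array, with rows
    ordered so that cases sharing a dominant theme sit together (a
    crude stand-in for hierarchical clustering)."""
    ordered = sorted(cases.items(), key=lambda kv: kv[1].index(max(kv[1])))
    lines = ["        " + " ".join(f"{t[:6]:>6}" for t in themes)]
    for name, scores in ordered:
        row = " ".join(f"{SHADES[s]:>6}" for s in scores)
        lines.append(f"{name:8}{row}")
    return "\n".join(lines)

print(render_ethnoarray(cases, themes))
```

In a real analysis the scores would come from coded fieldnotes and the display would use a color scale, but the underlying data structure is just this: a matrix of cases by themes.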
Dan Dohan, who imported this technique from the biomedical sciences, has a new article out with Corey Abramson that describes the process in detail. From Beyond Text: Using Arrays to Represent and Analyze Ethnographic Data, in Sociological Methodology:
Recent methodological debates in sociology have focused on how data and analyses might be made more open and accessible, how the process of theorizing and knowledge production might be made more explicit, and how developing means of visualization can help address these issues. In ethnography, where scholars from various traditions do not necessarily share basic epistemological assumptions about the research enterprise with either their quantitative colleagues or one another, these issues are particularly complex. Nevertheless, ethnographers working within the field of sociology face a set of common pragmatic challenges related to managing, analyzing, and presenting the rich context-dependent data generated during fieldwork. Inspired by both ongoing discussions about how sociological research might be made more transparent, as well as innovations in other data-centered fields, the authors developed an interactive visual approach that provides tools for addressing these shared pragmatic challenges. They label the approach “ethnoarray” analysis. This article introduces this approach and explains how it can help scholars address widely shared logistical and technical complexities, while remaining sensitive to both ethnography’s epistemic diversity and its practitioners’ shared commitment to depth, context, and interpretation. The authors use data from an ethnographic study of serious illness to construct a model of an ethnoarray and explain how such an array might be linked to data repositories to facilitate new forms of analysis, interpretation, and sharing within scholarly and lay communities. They conclude by discussing some potential implications of the ethnoarray and related approaches for the scope, practice, and forms of ethnography.