Archive for the ‘mere empirics’ Category
While we’re running our Crowdsourced Sociology Rankings, people have been looking a little more closely at the U.S. News and World Report rankings. Over at Scatterplot, Neal Caren points out that U.S. News’s methods page has some details on the survey sample size and response rates. They’re bad:
Surveys were conducted in fall 2012 by Ipsos Public Affairs … Questionnaires were sent to department heads and directors of graduate studies (or, alternatively, a senior faculty member who teaches graduate students) at schools that had granted a total of five or more doctorates in each discipline during the five-year period from 2005 through 2009, as indicated by the 2010 "Survey of Earned Doctorates." … The surveys asked about Ph.D. programs in criminology (response rate: 90 percent), economics (25 percent), English (21 percent), history (19 percent), political science (30 percent), psychology (16 percent), and sociology (31 percent). … The number of schools surveyed in fall 2012 were: economics—132, English—156, history—151, political science—119, psychology—246, and sociology—117. In fall 2008, 36 schools were surveyed for criminology.
So, following Neal, this tells us the Sociology rankings are based on a survey of 117 Heads and Directors with a response rate of 31 percent, which is thirty-six people in total. For Economics you have 33 people, for History 29 people, for Political Science 36 people, for Psychology 40 people, and for English 33 people. The methods page also notes that they calculate the scores using a trimmed mean, so they throw out two observations each time (the highest and the lowest). The upshot is that the average score of a department is likely to have rather wide confidence intervals.
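For the curious, here's roughly what that looks like in practice. This is a minimal sketch (the scores are made up, since the raw survey data isn't public) of a U.S. News-style trimmed mean and a bootstrap confidence interval for a 36-respondent survey:

```python
import random
import statistics

random.seed(0)

# Hypothetical: 36 respondents rate a department on U.S. News's 1-5 scale.
scores = [random.choice([2, 3, 3, 4, 4, 5]) for _ in range(36)]

def trimmed_mean(xs):
    """U.S. News-style trimmed mean: drop the single highest and lowest score."""
    xs = sorted(xs)
    return statistics.mean(xs[1:-1])

# Bootstrap the trimmed mean to gauge its sampling variability.
boots = []
for _ in range(5000):
    resample = [random.choice(scores) for _ in scores]
    boots.append(trimmed_mean(resample))
boots.sort()
lo, hi = boots[int(0.025 * len(boots))], boots[int(0.975 * len(boots))]
print(f"trimmed mean = {trimmed_mean(scores):.2f}, 95% CI ~ ({lo:.2f}, {hi:.2f})")
```

With an N this small, the interval easily spans a few tenths of a point on a five-point scale, which is more than the gap separating many adjacently ranked departments.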
But, don’t let all that get in the way of contemplating the magic numbers. The press releases from strongly-ranked departments are already coming thick and fast.
Update: These numbers are too low. Read on.
I guess it’s possible that U.S. News *might* mean that the *effective* N of, e.g., the Sociology survey is 117, and that’s the result of a larger initial survey which yielded a 31 percent response rate. On that interpretation they initially contacted 378 departments (or thereabouts). That would be a non-standard way of describing what you did. Normally, if you give a raw number for the sample size and tell us the response rate, the raw number is the N you began with, not the N you ended up with. A quick check of the Survey of Earned Doctorates suggests that there were 167 Ph.D.-granting Sociology programs in the United States in 2010, which suggests that 117 is about right for the number who had awarded five or more in the past five years. Same goes for Economics, which has 179 Ph.D. programs in the 2010 SED. Then again, the wording in the methods can also be read as saying every department might have received two surveys (“Questionnaires were sent to department heads and directors of graduate studies … at schools that had granted a total of five or more doctorates … during the five-year period from 2005 through 2009”). Looking again at the available SED data for 2006 to 2010 (one year off the USNWR dates, unfortunately), I found that 115 Sociology Departments met the stated criteria of having awarded five or more doctorates in the previous five years. If both the Dept Head and DGS in all those departments got a survey, this makes for an initial maximum N of 230, which is still quite far from the 378 or so needed, if 117 is supposed to mean the 31 percent who responded rather than the total number initially surveyed.
It seems like the most plausible interpretation is that for Sociology the number of schools surveyed is in fact 117, that every school received two copies of the questionnaire (one to the Head, one to the DGS or equivalent), but that the 31 percent response rate means “schools from which at least one response was received”, and so the total N of surveys for Sociology is somewhere between 36 and 72 people, with a similar range of between 30 and 80 for the other disciplines.
Update: While I was offline dealing with other things, then looking at the SED data I’d downloaded, then writing the last few paragraphs above, I see others have come to the same conclusion as I do here by more direct and informed means.
As many of you are by now aware, U.S. News and World Report released the 2013 Edition of its Sociology Rankings this week. I find rankings fascinating, not least because of what you might call the “legitimacy ratchet” they implement. Winners insist rankings are absurd but point to their high placing on the list. Here’s a nice example of that from the University of Michigan. The message here is, “We’re not really playing, but of course if we were we’d be winning.” Losers, meanwhile, either remain silent (thus implicitly accepting their fate) or complain about the methods used, and leave themselves open to accusations of sour grapes or bad faith. They are constantly tempted to reject the enterprise and insist they should’ve been ranked higher, and so end up sounding like the apocryphal Borscht Belt couple complaining that the food here is terrible and the portions are tiny as well.
The best thing to do is to implement your own system, and do it better, if only to introduce confusion by way of additional measures. Omar Lizardo and Jessica Collett have already pointed out that U.S. News decided to cook the rankings by averaging the results from this year’s survey with the previous two rounds. They provide an estimate of what the de-averaged results probably looked like. Back in 2011, Steve Vaisey and I ran a poll using Matt Salganik’s excellent All Our Ideas website, which creates rankings from multiple pairwise comparisons. It’s easy to run and generates rankings with high face validity in a way that’s quicker, more fun, and much, much cheaper than the alternatives. So, we’re doing it again this year. Here is the OrgTheory/AOI 2013 Sociology Department Ranking Survey. Go and vote! Chicago people will be happy to hear that you can vote as often as you like. So, participate in your own quantitative domination and get voting.
A few weeks ago, I argued that the era of overt racism is over. One commenter felt that I needed to operationalize the idea. There is no simple way to measure such a complex idea, but we can offer measurements of very specific processes. For example, I could hypothesize that it is no longer legitimate to use in public words that have a clearly derogatory meaning, such as n—— or sp–.*
We can test that idea with word frequency data. Google has scanned over 4 million books from 1500 to the present and you can search that database. Above, I plotted the appearance of n—– and sp—, two words which are unambiguously slurs for two large American ethnic groups. I did not plot slurs like “bean,” which are homophones for other neutral non-racial words. Then, I plotted the appearance of the more neutral or positive words for those groups. The first graph shows the relative frequencies for African American and Latino slurs vs. other ethnic terms. Since the frequency for Asian American slurs and other words is much lower, they get a separate graph. Thus, we can now test hypotheses about printed text in the post-racial society:
- The elimination thesis: Slurs drop drastically in use.
- The eclipse thesis: Non-slur words now overwhelm racist slurs, but racist slurs remain.
- Co-evolution: The frequency of neutral and slur words move together. People talk about group X and the haters just use the slur.
- Escalation: Slurs are increasing.
This rough data indicates that #2 is correct. The dominant racial terms are neutral or positive. Most slurs that I looked up seem to maintain some base level of usage, even in the post-civil rights era. The slur use level is non-zero, but it is small in comparison to other words so it looks as if it is zero. Some slur use may be derogatory, while some of it may be artistic or “reclaiming the term.” I can’t prove it, but I think Quentin Tarantino accounts for 50% or more of post-civil rights use of the n-word.
Bottom line: Society has changed and we can measure the change. This doesn’t mean that racial status is no longer important, but it does mean that one very important aspect of pre-Civil Rights racist culture has receded in relative importance. Some people just love racial slurs, but that is likely not the modal way of talking about people. Is that progress? I think so.
* Geez, Fabio, must you censor? Well, it isn’t censoring if it’s voluntary. I just don’t want this blog to be picked up for slurs. Even my book on 1970s Black Power, when people used the n-word a bit, only uses it once, in a footnote when referring to the title of H. Rap Brown’s first book.
Like most of us in the world of organization studies, I was saddened to hear of Michael Cohen’s passing. I only met him once and he was very gracious. In the spirit of his work, let me draw your attention to his last research project – an analysis of “handoffs.” The issue is that doctors can’t continuously watch patients. Whenever a doctor leaves to go home, a new doctor comes in and there is a “handoff.” Cohen wrote a nice summary for the Robert Wood Johnson Foundation website:
1. To be effective, a handoff has to happen.
It may seem incredibly commonplace, but all too often preventable injuries or even deaths trace back to handoffs that were abbreviated, conducted in awkward conditions, or downright skipped. The easy cases to identify are things like leaving before handoff is done, or rushing the handoff in order to get out the door.
Unfortunately, many other causes are also in play. Some major examples derive from schedule or workload incompatibilities. If patients are sent from the PACU (post-anesthesia care unit) to a floor unit during its nursing report, the nurses accepting the patients will necessarily miss out on the handoff of existing patients. If a patient is moved from the Emergency Department (ED) before her doctor or nurse has time to complete phone calls to the destination unit, the patient endures some period of having been transferred without benefit of handoff. If there is a shift change in the ED just before a patient moves, the handoff is conducted by a doctor or nurse who has only second-hand familiarity with the events. To improve handoffs, we may need to teach participants to think about the organizational structures that make it hard to do them well.
I am currently working on a super cool project and I was thinking about the following distinction: modelling of data vs. prediction with data. If you give data to a physical science or engineering type, then they want prediction. They want to come up with an accurate prediction of some future state. You want tiny errors. In contrast, most social scientists are interested in modelling general trends. We understand that statistical models have error terms, so prediction is inherently hard. It’s even beside the point in some sense. If X perfectly predicts Y, you’ve probably just measured the same thing twice. Instead, you want an imperfect, but unexpected, relationship between variables. Neither approach is wrong, but they do represent different philosophies of data analysis.
Scatterplot has a discussion on one of my favorite topics, low response rates. The observation is that political polls have low response rates, but they produce decent answers, contrary to standard sociological advice. For years, I have argued that response rates do not logically entail biased data. It is simply a logical fallacy to deduce that survey data is biased only because of the response rate. Two examples show the logical fallacy of deducing bias from response rates alone:
- High response rate, very biased: Let’s say that I fielded a survey that everyone responded to, except for Jews. They didn’t respond at all because I printed a swastika on the envelope. Every single Jewish respondent just threw it in the trash. The result? A response rate of about 97%. High response rate? Yes – textbook perfect. Bias? Yes – any question regarding Judaism (e.g., is R Jewish?) will be biased.
- Low response rate, no bias: Let’s say that I fielded a survey on Oct 1, 2012 in New York City. Say all 1,000 people who got the survey responded. Great! On October 21, I decide to use research funds to draw an extra sample of 9,000 names and send them the same survey. Oh no! Hurricane Sandy hits and nobody responds. Response rate? 10%. Biased? No – because not responding was a random event. The people in wave 1 were randomly chosen.
The issue isn’t the response rate – it’s selection into the study. If selection is correlated with the data (a religion survey that alienates a religious group), then the data is biased. If selection is random, then you have no bias. Selection biases can occur or not occur over the range of response rates from 1% to 99%.
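To make the point concrete, here's a toy simulation (all numbers hypothetical) contrasting the two cases: random nonresponse at a low rate vs. group-correlated nonresponse at a high rate.

```python
import random
import statistics

random.seed(42)

# Hypothetical population: 10,000 people, about 3% belong to group G.
population = [1 if random.random() < 0.03 else 0 for _ in range(10000)]
true_share = statistics.mean(population)

# Scenario A: low response rate (10%), but nonresponse is random.
responders_a = [x for x in population if random.random() < 0.10]
est_a = statistics.mean(responders_a)

# Scenario B: high response rate (~97%), but group G never responds.
responders_b = [x for x in population if x == 0]
est_b = statistics.mean(responders_b)  # estimate of G's share is exactly zero

print(true_share, est_a, est_b)
```

Scenario A recovers the true share despite the 10% response rate; Scenario B gets it maximally wrong despite the 97% rate, because selection is correlated with the thing being measured.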
Ok, you say, but maybe it’s not a logic issue. Sure, logically a low response rate doesn’t *have* to lead to bias. But in practice, low response is empirically related to bias. Maybe low response rates mean that only really weird people answer the phone or send back the survey.
This is actually a fair point, but it’s wrong. You see, the bias-low response rate connection is an assumption that can be tested. And guess what? Public opinion researchers have actually tested the assumption through a number of studies. For example, Public Opinion Quarterly in 2000 published the results of an experiment where a survey was run twice. The first time, you just let people do whatever they want (response rate 30%). The second time, you really, really bug people (response rate 60%). The result? Same answers on both surveys. Follow up studies often find the same result.
In fact, in discussing this issue with John Kennedy, our recently retired director of survey research, I found out that this is an open secret among survey professionals. Response rates are a completely bogus measure of bias in survey data. It’s a shame that social scientists have held on to this erroneous belief, despite the work being done in public opinion research.
A graduate student asked me if the following sources for Congressional district voting data are reliable:
There’s a statistical (!) twitter fight this evening – Jennifer Rubin tweets “when do we break it to them that averaging polls is junk?” Hilarity ensues. There are actually some important subtle points about averaging poll data:
- Averaging bad data doesn’t make it better. On this broad point, Rubin is correct.
- Averaging good data does help. The purpose is to not be swayed by outliers that are produced by sampling. If you want to know the average family income in the US, you’d rather average across samples so you won’t be swayed by the one draw that happened to include Bill Gates. If you believe that the typical polling firm is doing a decent job, it’s actually intuitive to average multiple polls.
- There’s actually research showing that poll averages close to the election aren’t terribly far off from the actual final numbers. See Nate Silver’s review on the subject.
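The variance-reduction point is easy to check by simulation. A quick sketch, assuming unbiased polls of 800 respondents each and a true support level of 51% (both numbers invented for illustration):

```python
import random
import statistics

random.seed(1)

TRUE_SUPPORT = 0.51  # assumed "true" candidate support
N = 800              # respondents per poll

def one_poll():
    """Simulate one unbiased poll: share of N respondents backing the candidate."""
    return sum(random.random() < TRUE_SUPPORT for _ in range(N)) / N

# Average error of single polls vs. error of 10-poll averages, over many trials.
single_errors = [abs(one_poll() - TRUE_SUPPORT) for _ in range(200)]
avg_errors = [abs(statistics.mean(one_poll() for _ in range(10)) - TRUE_SUPPORT)
              for _ in range(200)]

print(statistics.mean(single_errors), statistics.mean(avg_errors))
```

The averaged polls miss by roughly a third as much as single polls (the 1/sqrt(10) shrinkage you'd expect), which is the whole case for averaging good data. If the polls shared a common bias, of course, averaging would do nothing about it, which is Rubin's half of the point.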
A few days ago, I noted that Obama is slightly behind in the polls mainly because of the South. If it weren’t for the South, Obama would easily have about 51% of the vote in rest of the country. Kieran went back and compared the October Gallup polls in 2008 and 2012 to produce this picture:
You’ll hear all kinds of post-hoc explanations of the election outcome in November. But they’re probably wrong unless they start with the fact that the South really, really, really hates Obama more than the rest of the country for some inexplicable reason.
A common problem in social research is selection bias – the people who choose to respond to your survey may be systematically different than the population. We have some methods, like the Heckman model, for adjusting your final model if you have some data that can be used to model study participation. If you don’t have a decent selection model, you can still make some assessment using the methods suggested by Stolzenberg and Relles, which have you decompose your models and study the properties of the different parts (e.g., look at the degree of mean regression under certain conditions).
Question for readers: What is the state of art on this issue? Is there something better than Heckman or playing games with Mills ratios?
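I don't have an answer on the state of the art, but for readers following along, the moving part in the classic two-step estimator is the inverse Mills ratio. A bare-bones sketch of just that correction term (the probit first stage is left out, and the setup comment describes the standard recipe, not any particular software's implementation):

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def inverse_mills(z):
    """lambda(z) = phi(z) / Phi(z): the selection-correction term that the
    Heckman second stage includes as an extra regressor."""
    return norm_pdf(z) / norm_cdf(z)

# Two-step recipe: (1) fit a probit of "responded? yes/no" on the selection
# covariates; (2) for each responder, compute inverse_mills(xb) from the
# fitted probit index xb and add it to the outcome regression.
for xb in (-1.0, 0.0, 1.0):
    print(xb, round(inverse_mills(xb), 3))
```

The ratio is large for observations the probit says were unlikely to be observed and shrinks toward zero for near-certain responders, which is how the second stage "prices in" the selection.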
What makes a study interesting? Is it the empirical phenomena that we study or is it the theoretical contribution? For those of you who are really paying attention (and I applaud you if you are), you’ll notice that I’ve asked this question before. It’s become a sort of obsession of mine. For the field of organizational theory, it’s an important discussion to have, although it’s not one that will likely yield any consensus. Scholars tend to have very strong opinions about this. Some people feel that as a field we’ve fetishized theory to the point of making our research inapplicable to the bigger world we live in. Others claim that by making “theoretical contribution” such a key component of any paper’s value, we ignore really important empirical problems. But in contrast, some scholars maintain that what makes our field lively and essential is that we are linked to one another (and across generations) via a stream of ideas that constitute theory. What makes an empirical problem worthy of study is that it can be boiled down to a crucial theoretical problem that makes it generalizable to a class of phenomena and puzzles.
At this year’s Academy of Management meetings, I was involved in a couple of panels where this issue came up. It was posed as a question, should we be interested in problems or theory? If we are interested in studying problems, we shouldn’t let theoretical trends bog us down. We should just study whatever real world problems are most compelling to us. If we’re interested primarily in theory, we need to let theory deductively guide us to those problems that help us solve a particular theoretical puzzle. Some very senior scholars in the field threw their weight behind the former view. I don’t want to name any names here, but one of the scholars who suggested we should be more interested in real-world problems is now the editor of a major journal of our field. He offered several examples of papers recently published in that journal that were primarily driven by interesting observations about empirical phenomena.
One of the new assistant professors in the crowd threw a pointed objection to the editor. And I paraphrase, “This all sounds great. I’d love to study empirical problems, but reviewers won’t let me! They keep asking me to identify the theoretical gap I’m addressing. They demand that I make a theoretical contribution.” Good point, young scholar. Reviewers do that a lot. We’ve had it drilled into us from our grad school days that this is what makes a study interesting. If the paper lacks a theoretical contribution, reject it (no matter how interesting the empirical contribution may be)! This is a major obstacle, and I don’t think the esteemed editor could offer a strong counter-argument to the objection. Editors, after all, are somewhat constrained by the reviews they get. I think what we need is a new way to think about what makes a study valuable. We need new language to talk about research quality.
A key empirical question in social network analysis is whether Americans have more or fewer friends over time. Famously, Robert Putnam argued that indeed, we were “bowling alone.” In contrast, critics contend that these are misinterpreted results. Some types of networks disappear, while others appear.
On the social network listserv, Claude Fischer provides the latest round in the debate. Fischer uses 2010 GSS data to claim that the decline in strong personal relationships reported by McPherson et al. (2006 in the ASR) is due to survey question construction. I’ll quote Fischer’s entire announcement:
Attention Stata people (esp. Sr. Rossman): Let’s say I have a database of articles. I have a variable with the author’s name. Then I want to match the author’s name with other data (e.g., Fabio Rojas is matched with height 5′ 8″).
Merge 1:m is the command, but there’s a problem. Let’s say that my author database doesn’t use the same spelling (e.g., Fabio G. Rojas or fabio rojas). Then the merged data set will have missing data.
Is there a way in Stata to offer the programmer a choice of possible matches to minimize missing data caused by variations in spelling? If not, what program or language has an easy to use tool box for this sort of stuff?
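On the Stata side, the usual suggestions are user-written modules like reclink or matchit (both third-party, so caveat emptor; I haven't vetted them closely). Outside Stata, even Python's standard library can do a serviceable job of fuzzy name matching. A toy sketch with a hypothetical author table:

```python
import difflib

# Hypothetical author table, plus the messy spellings found in the article file.
author_data = {"Fabio Rojas": {"height": "5'8\""}}
article_names = ["fabio rojas", "Fabio G. Rojas", "F. Rojas", "Gabriel Rossman"]

def best_match(name, candidates, cutoff=0.6):
    """Return the closest canonical name (or None) by string similarity."""
    hits = difflib.get_close_matches(name.lower(),
                                     [c.lower() for c in candidates],
                                     n=1, cutoff=cutoff)
    if not hits:
        return None
    # Map the lower-cased hit back to the canonical spelling.
    return {c.lower(): c for c in candidates}[hits[0]]

for name in article_names:
    print(name, "->", best_match(name, list(author_data)))
```

The cutoff parameter is the knob: lower it and you catch more variant spellings but also more false positives, so in practice you'd want to eyeball the proposed matches before merging.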
Last week, I argued that retractions are good for science. Thomas Basbøll correctly points out that retractions are hard. Nobody wants to retract. Good point, but my argument wasn’t about how easy it is to retract. Rather, it’s about the fact that science is exceptional in that it has a built-in error-correction mechanism.
In reviewing the debate, Andrew Gelman wrote:
One challenge, though, is that uncovering the problem and forcing the retraction is a near-thankless job…. OK, fine, but let’s talk incentives. If retractions are a good thing, and fraudsters and plagiarists are not generally going to retract on their own, then somebody’s going to have to do the hard work of discovering, exposing, and confronting scholarly misconduct. If these discoverers, exposers, and confronters are going to be attacked back by their targets (which would be natural enough) and they’re going to be attacked by the fraudsters’ friends and colleagues (also natural) and even have their work disparaged by outsiders who think they’re going too far, then, hey, they need some incentives in the other direction.
A few thoughts. First, fraud busting should be done by those who have some security – the tenured folks – or folks who don’t care so much (e.g., non-tenure track researchers in industry). Second, data with code should be made available on journal websites, with output files. Already, some journals are doing that. That reduces fraud. Third, we should revive the tradition of the research note. Our journals used to publish short notes. These can be used for replications, verifications, error reporting and so forth. Fourth, we should rely on journal models like PLoS. In other words, the editors will publish any competent piece of research and do so in a low cost and timely way. Fraud busting and error correction will never be easy, but we can make it easier and it’s not hard to do so.
A focus of network research since, say, 1999 has been to identify “laws” that generate large networks with certain properties.* For example, the small world network is built by rewiring a grid. Various processes generate power-law networks (i.e., networks whose degree distribution follows a power law).
I can see two justifications for this type of research. The first is diffusion theory. The speed at which something diffuses in a network is definitely governed by the structure. The second is a sort of physical science justification, where you think of a network as a “system” and you show that some micro-process (e.g., preferential attachment) creates that network.
Is there any other behavioral implication of studying power laws/small worlds or other specific large scale properties? In other words, why should I care about scale free or small world networks aside from diffusion theory?
* Let’s leave aside recent criticism of power-law centric research for the sake of the post.
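For concreteness, the preferential-attachment process mentioned above can be grown in a few lines. A toy sketch (pure Python, Barabási–Albert-flavored, with made-up sizes) showing the heavy-tailed degree distribution it produces:

```python
import random
from collections import Counter

random.seed(7)

def preferential_attachment(n, m=2):
    """Grow a network: each new node links to m existing nodes chosen with
    probability proportional to their current degree (Barabasi-Albert style)."""
    edges = [(0, 1)]          # seed edge between nodes 0 and 1
    stubs = [0, 1]            # endpoint list; sampling from it is degree-weighted
    for new in range(2, n):
        chosen = set()
        while len(chosen) < min(m, new):
            chosen.add(random.choice(stubs))
        for old in chosen:
            edges.append((new, old))
            stubs += [new, old]
    return edges

edges = preferential_attachment(2000)
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

# Heavy tail: a few hubs, many low-degree nodes.
top = max(degree.values())
median = sorted(degree.values())[len(degree) // 2]
print("max degree:", top, "median degree:", median)
```

The early nodes end up as hubs with degrees far above the median, which is the signature "rich get richer" outcome, and exactly the kind of structure that matters for diffusion speed.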
I’m still mulling over some of the issues raised at the Chicago ethnography and causal inference conference. For example, a lot of ethnographers say “sure, we can’t generalize but ….” The reason they say this is that they are making a conceptual mistake.
Ethnography is generalizable – just not within a single study. Think of it this way. Data is data, whether it is from a survey, experiment or field work. The reason that surveys are generalizable is in the sampling. The survey data is a representative sub-group of the larger group.
What’s the deal with ethnography? Usually, we want to say that what we observe in fieldwork is applicable in other cases. The problem is that we only have one (or a few) field sites. The solution? Increase the number of field sites. Of course, this can’t be done by one person. However, there can be teams. Maybe they aren’t officially related, but each ethnographer could contribute to the field of ethnography by randomly selecting their field site, or choosing a field site that hasn’t been covered yet.
Thus, over the years, each ethnographer would contribute to the validity of the entire enterprise. As time passes, you’d observe new phenomena, but by linking field site selection to prior questions you’d also be expanding the sample of field sites. This isn’t unheard of. The Manchester School of anthropology did exactly that – spread the ethnographers around – to great effect. Maybe it’s time that sociological ethnographers do the same.
a response to andrew gelman on the statistics discipline, but not scott because he thinks i’m a sad distraction in higher education and that like, totally, hurt my feelings
On Friday, I wrote a semi-humorous post about the interaction between statisticians and non-statisticians. The issue that brought it up was that sometimes statisticians like to work on asymptotic results. This, by itself, isn’t bad. It’s good to know what an estimator does when you have a nice big sample that behaves well. My beef is that sometimes small samples – the ones that most social scientists work with – are treated as an inconvenient afterthought. That rubs me the wrong way because mathematical elegance is accorded more importance than addressing the core problem of statistics – which is to accurately model, measure and study the relationships between variables.
Andrew Gelman wrote a simple response, which is that I am hanging around with the wrong people. There is some truth to that. The last time I had the “n → ∞” argument was with a visitor. Indiana has hired some exceptional applied statisticians, like Stanley Wasserman. The program has also hired people with PhDs in other fields, like sociology and economics. I have consulted with these folks and it is easier to get concrete guidance on statistical practice.
But still, as multiple comments noted at orgtheory and Gelman’s blog, there are a lot of people with the title “statistician” who do treat issues of model estimation with small samples as an afterthought. This does happen, though maybe not as much as it used to.
Let me conclude this post with a comment about the sociology of the statistics profession. Statistics is a discipline that is analogous to computer science. Computer science can be math, engineering, applied science, or even philosophy (think artificial intelligence). Statistics is the same way. It can be mathematical, applied, or even visual. Consequently, there is no standardized cultural template for what a statistics department is.
Sometimes, statistics lives inside a math department. Sometimes it is distinct. At Indiana, they are trying an interdisciplinary approach where you have stat, math, and social science PhDs in the same unit. Each organizational environment creates pressures for different research.
If you live in a math department, you almost certainly can’t get promoted unless you study functional analysis or numerical analysis as applied to statistical issues. That produces people who are probably incapable of interacting with others who aren’t interested in the mathematics. Once you have your own department, you diverge from this model. Some statisticians are highly applied and many PhD graduates get jobs in professional schools and social science programs. These multiple pressures mean that you probably get a wide range of people, some of whom think statistics is just a field of mathematics while others can actually help people with real world statistical problems.
Here’s a conversation I’ve had a few times with statisticians:
Statistician: ” … and these simulations show how my results work.”
Me: “What does your research tell us about a sample of, say, a few hundred cases?”
Statistician: “That’s not important. My result works as n → ∞.”
Me: “Sure, that’s a fine mathematical result, but I have to estimate the model with, like, totally finite data. I need inference, not limits. Maybe the estimate doesn’t work out so well for small n.”
Statistician: “Sure, but if you have a few million cases, it’ll work in the limit.”
Me: “Whoa. Have you ever collected, like, real world network data? A million cases is hard to get.”
Statistician: “The Internet is a network with millions of nodes.”
Me: “Sure, but the Internet is one specific network. Most real world networks have hundreds or thousands of nodes. Like a school, or firms that trade with each other. Network data is expensive to collect. Some famous social science papers analyze networks of dozens of people.”
Statistician: “Um… the Internet! Scaling! Big networks! The Internet is a network! Facebook! FACE. BOOK!”
Me (rolls eyes): “What-EVER!”
This illustrates a fundamental issue in statistics (and other sciences). Once you formalize a model and work mathematically, you are tempted to focus on what is mathematically interesting instead of the underlying problem motivating the science. An economist works on another equilibrium theorem rather than, say, taxes. The physicist works on the mathematics of superstring theory, even when the experimental evidence isn’t there.
We have the same issue in statistics. “Statistics” can mean “the mathematics of distributions and other functions arising in statistical models.” Or it can mean the traditional problems of statistics like inference, measurement, model estimation, sampling, data collection/management, forecasting, and description. The problem for a guy like me (a social scientist with real data) is that the label “statistician” often denotes someone who is actually a mathematician who happens to be interested in distributions. That’s why they are happy with limit theorems, because limits smooth out hard problems and produce elegant results.
What I really want is a nuts and bolts person to help me solve problems. I may tease economists for their bizarre obsession with identification at the expense of all else, but at least identification is a real issue that needs to be taken seriously.
Let’s say you are doing discrete-time logit event history analysis. You are simply pooling all cases and time periods and just estimating a logit, where Y = failure event. See Yamaguchi’s (1991) book, chapter 2.
Question: why don’t people do a fixed effects kind of model, or cluster by case? There may be person-level heterogeneity that you want to account for. One way to address this is to do a logit with fixed effects for each person in the population. Another way is to try to control for inter-person correlation (i.e., person X’s observations at time T and T+K are probably correlated).
This sort of adjustment is standard in panel data. Event history data has the same basic setup and the same issues with correlated errors within cases, but most event history papers (including my own) don’t deal with this. Why?
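For anyone who hasn't seen the setup, here's the person-period expansion the Yamaguchi chapter describes, sketched with hypothetical spells. Once the data look like this, clustering on id (or adding person fixed effects) is literally the same move as in any panel:

```python
# Person-period ("discrete-time") expansion: one row per person per period
# at risk, with failure = 1 only in the final period if the event occurred.
# Hypothetical spell data: (person id, duration observed, event occurred?)
spells = [("A", 3, True), ("B", 2, False), ("C", 5, True)]

rows = []
for pid, duration, event in spells:
    for t in range(1, duration + 1):
        failure = 1 if (event and t == duration) else 0
        rows.append({"id": pid, "t": t, "failure": failure})

# Pooling these rows into a logit of `failure` on covariates reproduces the
# standard discrete-time setup; the repeated `id` values are exactly why the
# observations within a person can't be treated as independent.
for r in rows:
    print(r)
```

Person B is right-censored, so all of B's rows are zeros; A and C contribute a single 1 in their final period. The pooled logit treats all ten rows as independent draws, which is the assumption the post is questioning.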
These questions came up during orgtheory training last week. I did not have good answers:
- A lot of performativity research focuses on stock options, less on futures. Why?
- Are there good studies of performativity of theory that aren’t about the economics profession?
My lame answers: 1. Everyone is taught Black-Scholes first, but there’s no reason performativity theory couldn’t be applied to other types of markets; 2. economics is the most influential intellectual group that has a theory of social behavior that is inaccurate (which makes performativity possible). Post your answers in the comments.
Michael Bishop, of the Permutations blog, has set up a web site to archive R code for Add Health. Rather than have every Add Health researcher reinvent the wheel, he wants to sponsor an open source community that will provide R code.
Over at Salon, Alex Pareene made fun of people who try to guess presidential politics. Fair enough, there are a lot of lame guesses. However, there are patterns. It’s not as hard as you think. Basically, in American politics the *only* people who ever make any headway are people who have, or recently had, the following positions:
- Governors
- Senators
- Vice Presidents
- Cabinet Secretaries
- Generals
- Representatives
All major party nominees come from this group. Of course, not everyone in the group has an equal chance of winning. Generals only seem to get nominated if they win a big war. In the post-war era, cabinet secretaries and representatives never get nominated, though a few may get a VP nod.
Even among governors and senators, there seems to be a rule that only recently elected leaders have the energy and resources to win. So the guy who’s been in Congress 30 years is unlikely to be the nominee. The governor with a term or two under their belt is in the position to win. So you could probably produce a list of people who have a decent chance of getting the nomination: 30 or 40 recently elected governors and senators, plus a few others, like sitting VPs or popular generals.
If you lower the bar and ask who is influential in presidential elections, you’ll find that the pool expands a little bit. Here you get the occasional rich dude who wins a state in the primary (Forbes ’96) or goes independent (Perot ’92), as well as the Representative who fights for a constituency (Chisholm ’72). Sometimes military figures step in. Wesley Clark actually won a state in the ’04 Democratic primary. But still, most of the action is in the recent governor/senator/VP pool.
The only person who has ever had any real impact in an election who wasn’t in this pool is Jesse Jackson, who won 3 primaries in ’84 and 10 in ’88. He did better than Al Gore, a well entrenched establishment figure. Jackson represents a political type that is fairly rare in American politics: the social movement leader with a mass following. But that’s truly unique – the Civil Rights movement was extremely successful and then transitioned into the Democratic party. Don’t expect a similar figure any time soon. Few other movement leaders would have such a strong base that it would trump traditional party politics.
Bottom line: You never know what will happen in presidential politics, but you can come up with a reliable list of eligible bachelors.
Our friend Kieran has a series of posts on his research at Leiter Reports, the leading academic philosophy blog. Aside from writing on economic sociology, Kieran has begun an ambitious project analyzing the way that philosophers evaluate each other. Three posts so far, each well worth reading:
- The overall pattern of department evaluations.
- Descriptive analysis of who does the ratings.
- Specialties and raters.
I’ve seen this project presented in workshops. There is much more and it is very good. Can’t wait to see more posts.
Over at Evil Twin, Nicolai Foss gently chides Bloom and Van Reenen for publishing a paper in the AER proceedings called “New Approaches to Surveying Organizations.” The issue is the validity of survey data versus other types of data:
As a rule register data are not available that can be used to address numerous interesting issues in organizational economics, labor economics, productivity research and so on. Scholars working on these issues have to resort to those softy surveys and interviews that have been the workhorses of business school faculty for decades. This is a new recognition in economics. Case in point: A recent paper by Nicholas Bloom and John Van Reenen, “New approaches to surveying organizations.” There is absolutely nothing, I submit, in this short, well-written paper that would surprise virtually any empirically oriented business school professor (i.e., virtually all bschool professors) to whom this would not be anything “new” at all, but rather old hat.
This is not a critique of Profs. Bloom and Van Reenen at all (on the contrary, it is excellent that they educate their economist colleagues in this way). It is just striking and a little bit amusing, however, that we have had to wait until 2010 until empirical approaches that have been mainstream in management research for decades reach the pages of the American Economic Review.
I agree. In the comments, Bloom argues that he didn’t find any papers addressing these issues. This is odd: a lot of the suggestions for surveys make sense, and many are well discussed in the literature on surveying individuals. For example, did they consult Dillman’s work? There are also handbooks on surveying organizations, and a huge industry of people who study survey bias.
A few additional comments: I have heard multiple economists express survey skepticism. The correct response is that the reliability of survey responses varies and some questions are better than others. For example, people seem to be pretty good at reporting health, while they outright lie about attending church. Surveys by themselves aren’t good or bad; individual questions can be high quality or low quality. Also, a lot of our most important data comes from self-reports – like the Census, CPS, and HRS. I don’t see people ditching the Census.
Second, the real problem in survey research in organizations isn’t bias. It’s response rate. There’s all kinds of tricks to boost response rates for people, but getting people to respond at work (or about work) is really, really hard. And it’s miserable for longitudinal work. If Bloom and Van Reenen can produce a solution to low response rates from orgs, I’ll be really impressed.
mass media, you so dumb, you can’t count delegates. don’t come to math class, i’ll come to your house and give you an f and save you the trip.
You’re seeing a lot of headlines about how Tuesday’s results somehow put Santorum back on track. Romney got spanked. It’s now a two man race. Right…
Let’s look at the box and check the rules of the game. You need 1,144 delegates to win the nomination. Ok, so let’s count the pledged delegates awarded on Tuesday.
- Romney: 11 (AL) + 6 (American Samoa) + 9 (HI) + 12 (MS) = 38.
- Santorum: 18 (AL) + 0 (American Samoa) + 5 (HI) + 13 (MS) = 36.
That’s right, Romney actually *expanded* his lead on Tuesday. By a small amount, but he was the actual winner Tuesday.
We’re now in a replay of the 2008 Democratic primary. The delegate math heavily favors the well organized candidate who won some early states, racked up delegates in ignored states, and avoided delegate blow outs in other states. Just like Ohio and Pennsylvania didn’t derail Obama after he got that healthy delegate lead in February 2008, losing Southern states isn’t going to sink Romney. Toss in winner-take-all states like California, and Romney has an obvious and likely path to victory. As long as Romney limits losses and keeps the delegate count close, he’ll slowly slog to the nomination and the primary fight will have no effect on the final outcome of the election.
At the Chicago ethnography conference, I saw an excellent presentation by Dan Dohan and Corey Abramson. The ethnography addresses how cancer patients are assigned to potentially life-saving clinical trials. Dan has collected an amazing amount of data on a crucial subject. I want to talk about how he illustrated this data. He used something called “microarrays.”
The concept is simple. The microarray is a two-dimensional matrix. Each row is a case and each column is a variable. Columns are clustered according to similarity. The color of each cell represents the value of some variable; red might mean “high on the scale.” The combined effect is to create a visual map of where the action is happening in the data.
This tool was invented by geneticists. In their case, rows are individuals. The columns are genes, clustered by similarity. Colors indicate whether the person has the gene. Intense colors indicate a cluster of people who have genes of importance. Here’s an example from the wiki.
Dan used this technique to illustrate his data. In his case, rows are people and columns are life course events. Colors indicate evaluations of the event, as reported by respondents. I have never seen ethnographic data displayed in this way before. It’s simple and intuitive. It can stand by itself, or be used as a guide for further qualitative or quantitative analysis.
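The idea can be sketched with invented data (plain Python/numpy, not Dan’s actual coding scheme): respondents as rows, life-course events as columns, coded evaluations as cell values, rendered here as a character heatmap rather than a color one. A real version would add column clustering and use a graphics library for color.

```python
import numpy as np

# Toy "microarray" view of qualitative codes (all data invented).
# Rows are respondents, columns are life-course events, and each cell is
# the coded evaluation of that event: -1 negative, 0 neutral, +1 positive.
events = ["diagnosis", "first treatment", "trial offer", "enrollment"]
codes = np.array([
    [-1,  0,  1,  1],   # respondent A
    [-1, -1,  0,  0],   # respondent B
    [ 0,  1,  1,  1],   # respondent C
])

# Render as a character heatmap: '-' negative, '.' neutral, '+' positive.
symbols = {-1: "-", 0: ".", 1: "+"}
print("events: " + " | ".join(events))
for i, row in enumerate(codes):
    print(f"respondent {i}: " + " ".join(symbols[v] for v in row))
```

Even this crude version makes the “where is the action” question visual: a column of minus signs is a life-course event that went badly for everyone.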
In the Q&A, I asked Dan if he thought that ethnography was limited by narrative and verbal description (like vignettes), and whether ethnographers should explore new ways to use all their data. The microarray shows the full power of data gathered through ethnography. Maybe there are other powerful ways to display qualitative data. In response, he focused on narrative and said that in the health world people need to hear narrative. I’m not sure. Regardless, I think Dan should run with this. It opens up a whole new world for qualitative research, and it needs to be explored.
“Aren’t we all Wittgensteinians here? Yes?” – Andreas Glaeser
1. Ethnography is watching everything that is the case.
2. Maybe not. It’s hard to see everything, but you don’t see nothing. That’s gotta count for something.
3. Something is better than nothing. That’s useful.
4. Pragmatism is the new black.
5. It gets better than pragmatism. We can be positivist.
6. Positivists like mechanisms, which grandma used to call process.
7. What about Uncle Mack? He doesn’t like abstract discussion, he likes cases and mid-level theory.
8. Lunch was good. I like the sandwiches with pretzel bread.
9. I feel weird when quantizoids show up. We can be friends.
10. Screw that. We’re talking about Bhaskar and critical realism. Fight club ensues.
10.1. I have to read a book on critical realism before someone can explain to me what critical realism is.
10.2. Critical realism means never having to choose a variable, level, or mechanism.
11. Chris Winship mentions orgtheory and scatter plot.
11.1. Margarita and Cassidy tweeted. #inferenceandethnography.
12. You can communicate ethnography with great diagrams.
12.1. If you can illustrate ethnographic data, does that mean ethnographers limit themselves with verbal and narrative forms of data presentation?
12.2. Dan doesn’t take the bait. Ethnography is about narrative. Period.
12.3. Dan is wrong. There is a world where Tufte meets Whyte.
13. Diane Vaughan, patron saint of orgtheory, speaks. Ethnography can generate ideas and mechanisms. I sweat with joy.
14. I am a stranger to a group of people who defined themselves as strangers to other groups. Yet, I am not in that group of strangers. This is called “Rojas’ Paradox.”
15. You make inferences by observing things.
15.1. Talk vs. action.
15.2. Chains vs. correlations.
15.3. Variance within the field site.
15.4. Counterfactuals are out there. Ethnographers can see them.
15.5. Simple modes of inference allow you to talk of these things.
16. Whereof one cannot observe, one must remain silent.
I will be in Chicago on Thursday and Friday for the causal inference & ethnography conference. Please email me if you want to hang out. Most of the time, I will be at the UoC. We can have heady discussions in the Sem Coop. Fri afternoon is flexible. Also, I will be live tweeting (@fabiorojas) the proceedings. Hashtag: #inferenceandethnography. Email/tweet your questions. Will see if I can ask them.