Deep within the recesses of anybody who has gone to grad school in a PhD program in social science, there exists a small neural net that stores information about interaction effects and interaction models. Interaction effects are cool because they allow us to make nice counter-intuitive hypotheses, or hypotheses that serve to bring together warring swaths of literature with the conciliatory proposition that “it depends.”
So the standard “interaction” hypothesis is:
hypothesis 1: X has an effect on Y, when condition Z is within a certain range of values (let’s say Z=0) and does not have an effect (or has the opposite effect) when condition Z is within a different range of values (let’s say Z=1).
Thus in contrast to the unconditional model, which is written:

Y = b0 + b1X + b2Z + e
The interaction hypothesis can be tested by specifying:

Y = b0 + b1X + b2Z + b3XZ + e
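The fully specified interaction model is easy to estimate by hand. Here's a minimal sketch in numpy (the coefficient values, sample size, and seed are all made up for illustration) that simulates data matching hypothesis 1 and recovers the coefficients by least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulate data matching hypothesis 1: X has a positive effect when Z=0
# and the opposite effect when Z=1 (true b1 = 1.0, b3 = -2.0).
X = rng.normal(size=n)
Z = rng.integers(0, 2, size=n)
Y = 0.5 + 1.0 * X + 0.3 * Z - 2.0 * X * Z + rng.normal(size=n)

# Fully specified model: intercept, both main effects, and the product term.
design = np.column_stack([np.ones(n), X, Z, X * Z])
b0, b1, b2, b3 = np.linalg.lstsq(design, Y, rcond=None)[0]

print(b1, b3)  # estimates land close to the true 1.0 and -2.0
```

Note that the effect of X when Z=1 isn't any single coefficient: it's b1 + b3 (here about 1.0 − 2.0 = −1.0).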
A while back I blogged about some really fascinating political science research on symbolic racism and the Republican Southern strategy. It showed that when Y=vote republican, X=negative racial attitudes toward blacks and Z=lives in the south, you find a positive interaction effect for the more recent period (1980s and 1990s) in contrast to its null effect in the 1960s and 1970s.
So we are agreed interaction models are awesome. However, as your stats 101 teacher told you, you have to be careful about two things: (1) never omit the main effects. Thus you don’t test hypothesis 1 using any of these specifications:

Y = b0 + b1X + b3XZ + e

Y = b0 + b2Z + b3XZ + e
or Alanis forbid:

Y = b0 + b3XZ + e
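To see why omitting the main effects misleads, here's a little simulation sketch (all numbers invented): the true model has main effects only and a zero interaction, yet the misspecified regression hands you a big "interaction" coefficient, because the product term soaks up the omitted main effect of X:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# True model has main effects only -- the real interaction is zero.
X = rng.normal(size=n)
Z = rng.integers(0, 2, size=n)
Y = 0.5 + 1.0 * X + 0.3 * Z + rng.normal(size=n)

# Misspecified: interaction term with no main effects.
bad = np.column_stack([np.ones(n), X * Z])
b3_bad = np.linalg.lstsq(bad, Y, rcond=None)[0][1]

# Fully specified model recovers the (null) interaction.
good = np.column_stack([np.ones(n), X, Z, X * Z])
b3_good = np.linalg.lstsq(good, Y, rcond=None)[0][3]

print(b3_bad, b3_good)  # b3_bad is far from zero; b3_good is near zero
```

With these particular made-up parameters the spurious coefficient comes out near 1.0, i.e., the omitted main effect of X in its entirety.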
And (2) when interpreting b1 and b2 in the fully specified model, remember that those effects are conditional on the value of the other variables. b1 is now the effect of X on Y when Z=0 and b2 is now the effect of Z on Y when X=0. If your variables don’t have a meaningful zero point (like a racial attitudes scale), center them at their mean so that you can say “b2 is the effect of being Southern on voting republican for those who have average levels of racial animus towards blacks.”
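Centering is a one-liner. A quick sketch (again with made-up data, where X is a scale averaging around 3 with no meaningful zero) showing how centering X changes what b2 means, turning "effect of Z when X=0," an extrapolation off the edge of the data, into "effect of Z at the average X":

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# X is a scale with no natural zero point (mean around 3), Z is binary.
X = 3.0 + rng.normal(size=n)
Z = rng.integers(0, 2, size=n)
Y = 0.5 + 1.0 * X + 0.3 * Z + 0.5 * X * Z + rng.normal(size=n)

def fit(x):
    # Fully specified model with whichever version of X we pass in.
    design = np.column_stack([np.ones(n), x, Z, x * Z])
    return np.linalg.lstsq(design, Y, rcond=None)[0]

b2_raw = fit(X)[2]               # effect of Z when X = 0 (an extrapolation)
b2_centered = fit(X - X.mean())[2]  # effect of Z at the average X

print(b2_raw, b2_centered)
```

The raw b2 lands near the true 0.3 (the effect of Z at X=0), while the centered b2 lands near 0.3 + 0.5 × mean(X) ≈ 1.8. The interaction coefficient itself is unchanged by centering; only the main-effect coefficients get re-anchored.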
Seems simple. Everybody knows this. Why am I even explaining this to you? Well, as noted by Brambor, Clark, and Golder (2006) in a recent article in Political Analysis, a survey of 156 articles published in the major political science journals shows that only about 10% of researchers specified their interaction models correctly. A large chunk of them omitted the main effects outright, which can lead to incorrect significance tests of the interaction term. In some of these articles the entire contribution was riding on the interaction term. So things are not so simple. Consider the horror:
In an award-winning article in the American Political Science Review, Boix (1999) examines the factors that determine electoral system choice in advanced democracies. He makes two main conclusions. First, ethnic or religious fragmentation encourages the adoption of proportional representation in small and medium-sized countries (621). He draws this conclusion based on a model that includes an interaction term between ethnoreligious fragmentation and country size. However, he does not include either of the constitutive terms. When these terms are included, there is no longer any evidence that ethno-religious fragmentation ever affects the adoption of proportional representation (italics added).
You should read the article to see other horror stories. The lesson: if your dissertation/paper is riding on an interaction effect, don’t be a fool. Estimate a fully specified model.