Archive for the ‘political science’ Category
The first “tweets/votes” paper established the basic correlation between tweet share and vote share in a large sample of elections. Now, we’re working on papers that try to get a sense of who is driving the correlation. A new paper in Information, Communication, and Society reports on some progress. Authored by Karissa McKelvey, Joe DiGrazia and myself, “Twitter publics: how online political communities signaled electoral outcomes in the 2010 US house election” argues that the tweet-votes correlation is strongest when people compose syntactically simple messages. In other words, the people online who use social media in a very quotidian way are a sort of “issue public,” to use a political science term. They tend to follow politics and the talk correlates with the voting, especially if it is simple talk. We call this online audience for politics a “twitter public.” Thus, one goal of sociological research on social media is to assess when online “publics” act as a barometer or leading indicator of collective behavior.
One of the more serious anti-immigration arguments is that immigration is correlated with welfare state expansion. The argument hinges on a normative evaluation of social services, but, at the least, it is a coherent argument. The issue then is empirical evidence – does immigration actually precede welfare state expansion? An op-ed in the Investor’s Business Daily summarizes research that claims that there simply isn’t any association. Written by Alex Nowrasteh and Zachary Gochenour:
.. we show that, historically, immigrants and their descendants have not increased the size of individual welfare benefits or welfare budgets and are unlikely to do so going forward. The amount of welfare benefits is unaffected by the foreign origin or diversity of the population.
Since 1970, no pattern can be seen between the size of benefits a family of three gets under welfare programs like Temporary Aid for Needy Families (TANF) and the level of immigration or ethnic and racial diversity.
We compared individual states because they largely decide the benefit levels for many welfare programs, and states’ levels of ethnic diversity vary tremendously along racial, ethnic and immigrant lines. For instance, in 2010 only 1.2% of West Virginia’s population was foreign-born while 27% of California’s was.
Furthermore, the amount of TANF benefits also varied by states with similar demographics. For instance, in 2010 a California family of three received $694 a month in TANF benefits. But in Texas, an identical family received only $260. The size of the Hispanic population in each state is the same: 39%.
For every California with many immigrants, considerably diverse, and a vast welfare state, there is a Florida or a Texas with similar demographics but a smaller welfare state.
In other words, there is no actual link between welfare state generosity and a state’s immigrant population. So, basically, economic research shows small or no effects on wages and this research shows no effect on political outcomes. The arguments against immigration are extremely flimsy.
university of chicago visit – everything you wanted to know about tweets and votes, but were afraid to ask
I will be a guest of the computational social science workshop at the University of Chicago this coming Friday. I will present a very detailed talk on the more tweets/more votes phenomenon called “Everything You Wanted to Know About the Tweets-Votes Correlation, but Were Afraid to Ask.” If you want to chat or hang out, please email me.
Refreshments will be served.
The ASA section on Political Sociology has published their Fall 2013 newsletter. They had a symposium on the topic of implications of social media for democracy and other good items. Articles include:
- Zeynep Tufekci on digital empowerment.
- Discussion of recently deceased political sociologist Juan Linz.
- Interview with Chris Bail on his recent research.
- My essay – “Digital Democracy is Here – Let’s measure it!”
36 pages of great stuff. Recommended!!
More Tweets, More Votes news:
- I thank Alex Hanna for mentioning this work in a new Foreign Policy piece that discusses how social media can be used to monitor elections in nations where polling is rare, a possibility that I mentioned in my Washington Post article on MTMV. Alex and co-author Kevin Harris use social media data to track Iranian public opinion, because quality polling is not common there. A must read for people who want to see how social media can be used to measure and evaluate democratic processes.
- The peer reviewed version of MTMV is now out in PLoS One. The paper presents the tweet share/vote share correlation for the 2010 and 2012 House elections and discusses possible mechanisms.
- The working paper version of MTMV at Social Science Research Network has had over 1,200 downloads in its short life, pushing it into the top 10 most downloaded papers on models of elections and political processes at SSRN. Congratulations to my co-authors Joe DiGrazia, Karissa McKelvey, and Johan Bollen. Outstanding work.
Insider tip: New results will be presented at the computational social science workshop at the University of Chicago in January 2014. Details forthcoming.
Control Point Group, a political consultancy firm, asked my opinion on a recent Pew study of public opinion and twitter. I’ll quote Politico reporter Dylan Byers, who summarized the Pew study:
Sixteen percent of U.S. adults use Twitter and just half that many use it as a news source, making it an unreliable proxy for public opinion, according to a new survey from the Pew Research Center and the John S. and James L. Knight Foundation.
Take last year’s Republican primary, for example: “During the 2012 presidential race, Republican candidate Ron Paul easily won the Twitter primary — 55% of the conversation about him was positive, with only 15% negative,” Pew writes. “Voters rendered a very different verdict.”
Or the Newtown school shooting: “After the Newtown tragedy, 64% of the Twitter conversation supported stricter gun controls, while 21% opposed them. A Pew Research Center survey in the same period produced a far more mixed verdict, with 49% saying it is more important to control gun ownership and 42% saying it is more important to protect gun rights.”
That’s worth keeping in mind next time you see the reaction-on-Twitter piece in the wake of any major national news event. However, Twitter may be a more reliable indicator of youth sentiment.
This is a subtle point. Pew is doing what computer scientists call a sentiment analysis. Roughly speaking, you write a program that guesses whether some text, in this case a tweet, reflects a positive or negative sentiment. The literature (including the Pew study cited) shows very mixed results. The takeaway for me is that either sentiment is tricky to measure properly or the emotional context of text doesn’t correlate well with behaviors that we care about.
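To make "a program that guesses" concrete, here is a minimal sketch of the simplest kind of sentiment guesser: count words from a positive list and a negative list. The word lists and the function name are made up for illustration; this is not the method Pew or anyone else cited here actually used.

```python
import re

# Hypothetical, tiny word lists for illustration only.
# Real sentiment lexicons contain thousands of entries.
POSITIVE = {"great", "love", "win", "support", "good"}
NEGATIVE = {"bad", "hate", "lose", "corrupt", "terrible"}


def guess_sentiment(text: str) -> str:
    """Guess 'positive', 'negative', or 'neutral' by counting lexicon words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A sarcastic tweet such as "Oh great, another tax hike" is scored as positive by this sketch, which is one small illustration of why sentiment is tricky to measure.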
In contrast, our research (and that done by others) shows that relative shares of mentions, regardless of sentiment, do show a positive correlation with some political behaviors, like voting. My hypothesis is that the relative volume of talk is simply a proxy for buzz, name recognition, popularity, or some other variable. Regardless, the correlation is there.
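For readers who want to see what "relative share of mentions" means in practice, here is a rough sketch. The function names and the district numbers are invented for illustration; they are not the paper's code or data.

```python
from math import sqrt


def tweet_share(mentions_a: int, mentions_b: int) -> float:
    """Share of two-candidate mentions that go to candidate A, ignoring sentiment."""
    total = mentions_a + mentions_b
    if total == 0:
        raise ValueError("no mentions for either candidate")
    return mentions_a / total


def pearson_r(xs, ys) -> float:
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up numbers, NOT real data:
# (mentions of candidate A, mentions of candidate B, A's vote share)
toy_districts = [(120, 80, 0.58), (40, 160, 0.35), (300, 300, 0.51), (90, 30, 0.66)]
shares = [tweet_share(a, b) for a, b, _ in toy_districts]
votes = [v for _, _, v in toy_districts]
r = pearson_r(shares, votes)
```

Note that nothing here looks at what the tweets say. The measure is just relative volume.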
At last week’s PLEAD conference on social media and political processes, Alex Hanna tweeted a summary of a talk by Mark Huberty of UC Berkeley political science, which raised some questions about using social media data to forecast electoral results. Alex suggested that we could have a good discussion about Mark’s talk. In these comments, I rely on Alex’s summary. If I mis-characterized a point, please email me or correct me in the comments.
1. Huberty noted, correctly, that incumbency highly correlates with electoral wins. The implication is that social media data is not valuable, or important, or accurate, because incumbency accounts for a lot of the variance in electoral outcomes.
Well, it depends on what your goals are. If you are making a claim that “A causes B”, then finding out that C accounts for much of the variance is extremely important. It shows that A isn’t causing B. However, if your claim is that “A is a decent measurement of B,” then finding out that C is a strong correlate of B is simply irrelevant. The claim isn’t about what is some fundamental cause of B, just what tracks with B.
Different claim, different standard of proof. That’s why we care about polls. Incumbency predicts elections better than polls, but as long as we don’t claim that polls cause election outcomes, we remain satisfied with the well documented correlation between voter surveys and final votes.
Also, incumbency is not a reasonable variable to benchmark against because incumbency is simply a word for “the person who won last time in the same election with a very similar group of voters.” As good social scientists know, a lot of human behavior is seriously auto-correlated. What I ate yesterday is the best predictor of what I’ll eat tomorrow. Politics is no different.
Thus, in a lot of social science, we aren’t interested in these sorts of time series because we know that answer already. X_t is almost certainly strongly correlated with X_{t-1}. The interesting question is why the time series is X_1, X_2, … and not Y_1, Y_2, … Similarly, we might be interested in “extracting a signal” from some new source of data to help us measure X_i or build a causal explanation that doesn’t fall back on trivial auto-correlated time series explanations. In other words, “The guy is an incumbent because there are a lot of Black voters” is a much more meaningful statement than “The guy won this time because he won last time.”
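The auto-correlation point can be sketched in a few lines: pair each value in a series with the one before it and correlate the pairs. The function name is mine and the approach (a plain Pearson correlation on lagged pairs) is one common way to do it, offered as an illustration rather than anyone's published method.

```python
from math import sqrt


def lag1_autocorrelation(series) -> float:
    """Pearson correlation between X_t and X_{t-1} across a series."""
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    prev, curr = series[:-1], series[1:]  # (X_{t-1}, X_t) pairs
    n = len(prev)
    mean_p, mean_c = sum(prev) / n, sum(curr) / n
    cov = sum((p - mean_p) * (c - mean_c) for p, c in zip(prev, curr))
    var_p = sum((p - mean_p) ** 2 for p in prev)
    var_c = sum((c - mean_c) ** 2 for c in curr)
    if var_p == 0 or var_c == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    return cov / sqrt(var_p * var_c)
```

A series that drifts steadily in one direction gives a value near 1, and one that flips back and forth gives a value near -1. The claim in the text is that electoral outcomes in a district look much more like the first case.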
That is ultimately why I remain interested in social media and electoral outcomes. Social media is a record of what people think that is different than polls and traditional print or broadcast media. It deserves a serious examination as a signal. And given the work by Huberty himself, Tumasjan, Jungherr, Beauchamp, the Indiana group, and others, the “social media as measurement of political sentiment” hypothesis is important and, as far as I can tell, supported to varying degrees by the Twitter data. Incumbency is a non-issue as long as researchers and political professionals avoid claims of causation.
2. Alex also indicated that Mark Huberty was concerned about how social media data is created. Here, I also agree. Transparency is important. All data is imperfect – people lie on polls, surveys have selection biases, etc. There is a discussion about the properties of the samples that Twitter produces for researchers that might lead one to think that there might be an issue. The more we know about the way social media samples are generated, the better.
Still, the issue is *how much* of a problem this is. On this point, I urge Mr. Huberty to be bluntly empirical. The blunt empiricist, I would argue, would just put it to the test. The empiricist would look for natural experiments in the data (transparent data vs. others) or well chosen comparisons to see how much it affects the social media-vote correlation. Rather than point to possible problems, research would actually identify them. It might not matter, or it might be a big deal. Let’s figure it out!
In my Washington Post column, I discussed the possibility that social media data might displace the traditional political poll. After writing the column, I thought that I might have gone overboard. But after reading some recent research, I realized that I am really onto something. Recent research shows that social media data, when modeled correctly, does provide very good measurements of public opinion trends.
Nick Beauchamp is a political scientist at Northeastern University. He has a new working paper called “Predicting and Interpolating State-level Polling using Twitter Textual Data.” This paper is the vital intermediate step between noticing that tweets correlate with votes and using social media data by itself to forecast elections. The abstract:
Presidential, gubernatorial, and senatorial elections all require state-level polling, but continuous real-time polling of every state during a campaign remains prohibitively expensive, and quite neglected for less competitive states. This paper employs a new dataset of over 500GB of politics-related Tweets from the final months of the 2012 presidential campaign to interpolate and predict state-level polling at the daily level. By modeling the correlations between existing state-level polls and the textual content of state-located Twitter data using a new combination of time-series cross-sectional methods plus bayesian shrinkage and model averaging, it is shown through forward-in-time out-of-sample testing that the textual content of Twitter data can predict changes in fully representative opinion polls with a precision currently unfeasible with existing polling data. This could potentially allow us to estimate polling not just in less-polled states, but in unpolled states, in sub-state regions, and even on time-scales shorter than a day, given the immense density of Twitter usage. Substantively, we can also examine the words most associated with changes in vote intention to discern the rich psychology and speech associated with a rapidly shifting national campaign.
In other words, if you do some sensible model fits and combine with content analysis, social media time series mimic the trends produced by polls. The next step is obvious: combine election results and social media data, model the error, and if the results are reasonable, you will no longer need big polls.
A question for my brothers and sisters in political theory: There are certain individuals who embody the role of the activist/intellectual. They are highly influential in movement politics and they write or speak about movements in a very theoretical way, offering justification for the movement’s goals and strategies. For socialist politics, this role is filled by Lenin, who provided an explanation of what the communist party is supposed to do. For non-violence, King and Gandhi fill this role.
My question: Is there an analog for mass political parties in democratic societies? In other words, who is the master politician who articulates the purpose and function of the party in modern democracies? Does this person talk about how the party should manage/exploit various constituencies, especially rowdy ones like protest movements?
In a recent essay in the NY Times, Ross Douthat explains the motivations behind conservative politics. This clip nicely summarizes the issue:
… For the American mainstream — moderate and apolitical as well as liberal — the Reagan era really was a kind of conservative answer to the New Deal era: A period when the right’s ideas were ascendant, its constituencies empowered, its favored policies pursued. But to many on the right, for the reasons the Frum of “Dead Right” suggested, it was something much more limited and fragmented and incomplete: A period when their side held power, yes, but one in which the framework and assumptions of politics remained essentially left-of-center, because the administrative state was curbed but barely rolled back, and the institutions and programs of New Deal and Great Society liberalism endured more or less intact.
I think that’s a good summary … for one small part of the conservative movement. And it is true. There is definitely an anti-statist element of the modern conservative coalition. There are people who genuinely think that more services should be shifted to the private sector and that the size of the tax obligation and the federal government should be shrunk.
However, the committed anti-statist part of the conservative coalition is only a small part of the story. When we take a broad look at policy, we see that conservatives routinely support all kinds of government services. For example, calls for shrinking government almost always exclude the military. Then, if we look at Medicare we find that conservative voters do not favor privatization. In other areas, conservatives have no problem expanding the size of government – building walls on the Mexican border, jailing millions of African Americans for drug possession, or creating more and more regulation of reproductive medical procedures such as abortion, stem cell research, and birth control. All of these require massive intrusions on the safety and privacy of millions of people who are doing no wrong to others.
So what’s the real story? I think it’s fairly simple. Committed anti-statists are the “beard” for other factions that really don’t care about the size of government. A theory of personal liberty is important and draws attention from what might be the ulterior goal. And these other factions have all kinds of goals. National security conservatives love war because it shows that they’re tough. Social conservatives simply want to roll back, or circumvent, the progress made by women, minorities, LGBT people, immigrants, and other groups that were openly repressed and discriminated against in previous eras. And there’s what I call the business conservative, who just wants tax breaks and couldn’t care less about anti-gay crusades, but has to tolerate the social conservatives in order to get these perks.
Whenever I hear a conservative claim they are for liberty or limited government, I’m always a little skeptical. The arguments for liberty, tolerance, and protection from government harassment apply to themselves, and others like them, but are rarely applied with the same vigor to people or social practices they find distasteful. The bottom line is that I’m willing to engage with writers like Ross Douthat, but not until they tell their fellow travelers that gays and Mexicans are really nothing to worry about.
I am a believer that public policy should be transparent. People should know the rules. Opaque rules favor the wealthy, the privileged, and the established. It is no wonder that Mitt Romney’s IRA account is $102 *million.* And I bet every single thing he did was completely legal.
Steve Teles takes this up in an article called “Kludgeocracy in America,” published at National Affairs. Teles notes that American governance is arcane and complex:
The price paid by ordinary citizens to comply with governmental complexity is the most obvious downside of kludgeocracy. For example, one of the often overlooked benefits of the Social Security program — which represents an earlier era’s approach to public policy — is that recipients automatically have taxes taken out of their paychecks, and, then without much effort on their part, checks begin to appear upon retirement. It’s simple and direct. By contrast, 401(k) retirement accounts, IRAs, state-run 529 plans to save for college costs, and the rest of our intricate maze of incentivized-savings programs require enormous investments of time, effort, and stress to manage responsibly. But behavioral economics — not to mention common sense — makes clear that few investors are willing to make these investments, and those who do are hampered by basic flaws in decision-making.
Kludgeocracy is also a significant threat to the quality of our democracy. The complexity that makes so much of American public policy vexing and wasteful for ordinary citizens and governments is also what makes it so easy for organized interests to profit from the state’s largesse. The power of such interests varies in direct proportion to the visibility of the issue in question. As Mark Smith argues in his book American Business and Political Power, corporations are most likely to get their way when political issues are out of the public gaze. It is when the “scope of conflict” expands that the power of organized interests is easiest to challenge. That is why business invests so much money in politics — to keep issues off the agenda.
Democratic success hasn’t just weakened the antiwar movement. Though the Obama administration has been criticized by environmentalists and civil libertarians for various failures, real and perceived, the energy behind these movements tends to wane under Democratic administrations, and not just because Democratic administrations are more likely to accept the legitimacy of environmentalist and civil libertarian claims. Similarly, conservative calls for fiscal consolidation and abortion restrictions have tended to be more muted under Republican administrations, though it is possible that this will change in the future.
Indeed. I’m glad that Salam framed it as a general political issue. The deeper point is that in a world where there is strong political polarization, where movements are strongly connected to one of the major parties, it is hard for movements to act independently of electoral cycles. The result is a paradoxical situation where the movement is strongest when it is least likely to have an impact.
In the Spring, I was teaching our first year graduate course. I start with rational choice theories and then move on. To illustrate the difference, I used gun legislation. After the Newtown shootings, did the class think that we’d have more gun control? The hypotheses:
- Median voter theorem – NO – the average voter is happy with current gun laws.
- Elite theory – YES – it was clear Obama and Biden wanted more gun control.
Now we know the answer: the median voter won. The rest of the class went with elite theory. Somebody owes me some money!
Russ Roberts interviews political scientist Mike Munger on the topic of rules and institutions, using sports as an example. One of the most interesting things about sports is that there are informal rules governing fighting. A few key ideas:
- To decrease overall fighting, you allow a little bit. It acts as a deterrent.
- In sports with little protection, like hockey or baseball, you get ritualized fighting.
- In sports with ritualized fighting, you get fight specialists. You don’t want skilled players getting injured.
- In low fighting sports, like football, you need to slow things down with heavy referee intervention.
- Once you protect athletes with equipment, fighting goes up because it is less damaging.
- If a sport becomes lucrative, then norms change to reduce fighting. You don’t want your money-generating stars missing the game.
A nice discussion of how norms, rules, and technology all affect each other.
If you are interested in reading the media coverage of More Tweets, More Votes, here are the links to selected coverage:
- My op-ed in the Washington Post
- the Wall Street Journal
- The Daily Rundown/MSNBC
- C-SPAN’s Washington Journal
- National Journal
- The Atlantic
Thanks for checking in.
This week, there has been substantial media coverage of the More Tweets, More Votes paper, which was presented on Monday at the ASA meeting in New York. Scholars and campaign professionals have been asking questions about the draft of the paper, which can be found here. Since we have received many requests and clarifications, I will address comments through this blog post.
1. Your tweets/votes R-squared is small. The correlation between tweets and votes is actually really small when compared with other factors (such as incumbency).
Commenters have asked about the size of the twitter correlation in comparison with other models. First, no claim was made about this issue and it is not relevant to the major point of the paper. The point of the paper is that social media has important information. This information may be correlated with other data. However, we can compare the twitter bivariate correlation with other correlations. The twitter correlation with Republican vote margin, for example, is .53. Incumbency has a correlation of .73 with vote margin. The proportion of people with a college education has a correlation of .15. Thus, the twitter measure is in the middle of the range of the variables we look at.
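Since the question was framed in terms of R-squared, it may help to recall that in a regression with a single predictor, R-squared is simply the square of the correlation. A quick sketch using the three correlations quoted above (the dictionary keys are my labels, not variable names from the paper):

```python
# Bivariate correlations with Republican vote margin, as quoted in the text above.
correlations = {
    "twitter measure": 0.53,
    "incumbency": 0.73,
    "college education": 0.15,
}

# With a single predictor, R-squared equals the squared Pearson correlation.
r_squared = {name: round(r ** 2, 2) for name, r in correlations.items()}

# Order the variables from most to least variance explained on their own.
ranked = sorted(r_squared, key=r_squared.get, reverse=True)
```

That works out to roughly .28 for the twitter measure, .53 for incumbency, and .02 for college education, each taken on its own.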
2. 404 out of 406?: In your SSRN draft, the analysis does not predict the winner in 404 out of 406 competitive races, which is what Fabio Rojas said in the WaPo op-ed. (http://www.washingtonpost.com/opinions/how-twitter-can-predict-an-election/2013/08/11/35ef885a-0108-11e3-96a8-d3b921c0924a_story.html?wpisrc=emailtoafriend)
A number of commenters have asked about the number of correctly predicted races. In the original paper, we do not perform this analysis. For the purposes of presenting the research to the public, we computed the rate of correct predictions (within the data), which was about 92.5%. I then multiplied this by all races (435). Therefore, the extrapolated number of correctly predicted races is 404 out of 435. If we use only the contested race subsample, we get 375 races out of 406 contested races. This is a correction of what I wrote in the op-ed, which accidentally combined these two estimates. The op-ed now contains the correction.
3. You don’t predict an election. “[...] just in case someone is paying attention: You, Have, To, Predict, In, Advance. If you don’t want to follow my advice follow that of Lewis-Beck (2005): “the forecast must be made before the event. The farther in advance [...] the better”. Gayo-Avello (http://di002.edv.uniovi.es/~dani/PFCblog/)
Professor Gayo-Avello and other commenters have raised the issue of prediction. He is correct in that we didn’t use contemporary data to predict elections in the future. Rather, we use “predict” in the statistical sense. We use social media data to estimate a dependent variable within the sample.
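To make "predict in the statistical sense" concrete: you fit a model to the sample and then ask what the model says about those same observations. Here is a minimal sketch with a one-variable least squares line. The function names are mine, and the paper's actual models include more variables than this.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def in_sample_predictions(xs, ys):
    """Fitted values for the same observations the line was estimated on."""
    slope, intercept = fit_line(xs, ys)
    return [intercept + slope * x for x in xs]
```

Here `xs` would be something like district tweet shares and `ys` the vote outcome. A forecast in Gayo-Avello's sense would instead fit the line on one election and apply it to data from a later one, which this sketch does not do.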
4. The Pollyanna effect is unsubstantiated. There is no support to say negative tweets are a good thing for a candidate.
The Pollyanna effect is merely a hypothesized explanation for what we find. It requires further research and study. We make no claim that it has been established.
5. Twitter user base is not representative of the population, self-selection bias, spam, propaganda, lack of geolocation of tweets.
A number of commenters have focused on the fact that we know little about the people who write tweets, nor do we estimate whether tweets are positive or negative. This is true, but the point of the paper is not to make an estimate of who people are, or to interpret what they say. Rather, it is simply to show that social media contains informative signals of what people might do. Remarkably, the data shows a correlation even though Twitter users are not a random sample of the population. We are simply measuring the relative attention given to a political candidate.
6. Vote share is a more natural way than vote margin to analyze and present the results, as well as consistent with prior Political Science research. (http://themonkeycage.org/2013/04/24/the-tweets-votes-curve/)
Some readers noted that traditional political science uses vote share rather than vote margin. Our updated paper corrects that. The original paper is a non-peer reviewed draft. It is in the process of being corrected, updated, and revised for publication. Many of these criticisms have already been incorporated into the current draft of the paper, which will be published within the next few months.
The shoe has dropped for the political scientists. The NSF has suspended funding, probably out of fear of Congress.
My take away? Don’t be so dependent on one customer. Sociology doesn’t get that much from NSF anyway, but we should think about alternate sources.
Here’s a simple idea. Why not take all that sweet ASR subscription money and funnel it into an ASA controlled foundation that supports sociological research? That way, we have independence.
My dear friend and collaborator Michael T. Heaney has some new work that will be of interest to many readers. In the journal Social Networks, he has an article called Multiplex networks and interest group influence reputation: An exponential random graph model:
Interest groups struggle to build reputations as influential actors in the policy process and to discern the influence exercised by others. This study conceptualizes influence reputation as a relational variable that varies locally throughout a network. Drawing upon interviews with 168 interest group representatives in the United States health policy domain, this research examines the effects of multiplex networks of communication, coalitions, and issues on influence reputation. Using an exponential random graph model (ERGM), the analysis demonstrates that multiple roles of confidant, collaborator, and issue advocate affect how group representatives understand the influence of those with whom they are tied, after accounting for homophily among interest groups.
In the journal Interest Groups and Advocacy, he has a forthcoming article: Coalition Portfolios and Interest Group Influence Over the Policy Process, with Geoff Lorenz.
In the More Tweets, More Votes paper, we established that Twitter share correlates with future Congressional election results (e.g., % of tweets that mention GOP in a district correlates with the GOP vote share in the district). The deeper question – why? We’ve got a working paper that suggests an answer: Twitter, in some respects, mimics conventional text, which means that it is close enough to the grass roots. In other words, people are more likely to use technology if it resembles what they know – an idea going back to a classic paper by Kwon and Zmud.
We can tease out testable implications. Specifically, technologies that are more sophisticated will be less likely to correlate with mass politics. In other words, social media that is easy to use and relies mainly on pre-existing language skills is more likely to correlate with social trends than social media that requires higher levels of functionality.
We test this with our tweets/votes data. We measured three types of candidate tweet share – “free text,” @mentions, and #hashtags. Free text is the “people’s” method of tweeting, while @mentions and #hashtags are syntaxes that require more knowledge. The grassroots hypothesis implies free text mentions of candidates will have a stronger correlation with election outcomes than @mentions or #hashtags. The results? Free texts correlate (as per the original paper) but the others are not significantly different from zero. The picture says it all.
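For anyone who wants to try something similar, here is a rough sketch of how one might sort a tweet's references to a candidate into the three types. The function, the matching rules, and the example candidate are all hypothetical; the working paper's actual coding procedure may differ.

```python
import re


def mention_types(tweet: str, name: str, handle: str, hashtag: str) -> set:
    """Return which of 'free_text', 'at_mention', 'hashtag' a tweet uses.

    `name` is the candidate's plain-text name; `handle` and `hashtag`
    are given without the leading @ or #. Matching is case-insensitive.
    """
    found = set()
    # Plain-text name that is not part of a longer word, @handle, or #hashtag.
    if re.search(rf"(?<![\w@#]){re.escape(name)}(?!\w)", tweet, re.IGNORECASE):
        found.add("free_text")
    # Twitter-specific syntax: @handle
    if re.search(rf"@{re.escape(handle)}(?!\w)", tweet, re.IGNORECASE):
        found.add("at_mention")
    # Twitter-specific syntax: #hashtag
    if re.search(rf"#{re.escape(hashtag)}(?!\w)", tweet, re.IGNORECASE):
        found.add("hashtag")
    return found
```

For a made-up candidate Jane Doe with handle `janedoe` and hashtag `doe2010`, the tweet "Voting for Jane Doe tomorrow" counts only as free text, while "@janedoe great town hall" counts only as an @mention. Tweet shares can then be computed separately for each type.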
Stark result. The implication is profound for social scientific studies of social media. If your data requires distinctly Internet based skills, it is less likely to speak to population level trends. Sophistication is probably the mark of the connoisseur. Indeed, additional analysis of our data shows that @mention and #hashtag users are “intense” Internet users. For example, they have higher median follower counts and are more likely to be “verified” by Twitter.
At Pacific Standard Time, an article about interesting state politics research. They list 10 cool findings. My favorite:
05. Members of the California Assembly from moderate districts tend to give moderate answers on political surveys. However, they still largely vote the same as the most extreme members of their parties. (Jim Battista, Josh Dyck, and Megan Gall).
Check it out.
Brendan Nyhan has a nice post on the sociology of scandal. He summarizes his research on presidential scandal in this way:
My research suggests that the structural conditions are strongly favorable for a major media scandal to emerge. First, I found that new scandals are likely to emerge when the president is unpopular among opposition party identifiers. Obama’s approval ratings are quite low among Republicans (10-18% in recent Gallup surveys), which creates pressure on GOP leaders to pursue scandal allegations as well as audience demand for scandal coverage. Along those lines, John Boehner is reportedly “obsessed” with Benghazi and working closely with Darrell Issa, the House committee chair leading the investigation. You can expect even stronger pressure from the GOP base to pursue the IRS investigations given the explosive nature of the allegations and the way that they reinforce previous suspicions about Obama politicizing the federal government.
In addition, I found that media scandals are less likely to emerge as pressure from other news stories increases. Now that the Boston Marathon bombings have faded from the headlines, there are few major stories in the news, especially with gun control and immigration legislation stalled in Congress. The press is therefore likely to devote more resources and airtime/print to covering the IRS and Benghazi stories than they would in a more cluttered news environment.
I’d also add that “events” have properties. It is easier to scandalize, say, the IRS investigation issue because it is simple. In contrast, the issue of whether the attack in Libya should have been labeled terrorism is probably too esoteric for most folks. If you buy that argument, you get a nice story about the “scandal triangle.” The likelihood of scandal increases when partisan opposition, bored media, and clearly norm-breaching events come together.
A number of people have asked me a very important question about the More Tweets, More Votes paper. Do relative tweet rates merely correlate with elections or is there a causal link?
The paper itself does not settle the issue. The purpose of the paper is merely to document this striking correlation. Given that qualification, let me explain the argument from both sides and my priors.
- Correlation: Twitter is a passive record of how excited people are. If a candidate somehow garners the attention of the public, they get excited and start talking about it, which translates into a higher twitter presence.
- Causal: The unusual attention that a candidate attracts in social media sways undecided or weakly committed voters. In a sense, highly active twitter users are the “opinion leaders” of modern society.
My prior: 75% correlation, 25% cause. How would one tease out these arguments? For example, what variable could instrument the district-level tweet counts? It would be interesting to find out.
When people read our More Tweets, More Votes paper, they often wonder – where is the “sentiment analysis?” In other words, why don’t we try to measure whether a tweet is positive or negative? Joe DiGrazia, the lead author, addressed this in a recent interview with techpresident.com:
DiGrazia said the researchers were “kind of surprised” that they saw a correlation without doing sentiment analysis of the Tweets. “We thought we were going to have to look at the sentiment,” he said. He speculated that one reason for the correlation could be a so-called Pollyanna Hypothesis, “that people are more likely to gravitate toward subjects that they are positive about and are more likely to talk about candidates that they support.”
The idea is simply this: the frequency of speech is often a relatively decent approximation of how important people think that topic is relative to salient alternatives. If people say “Obama” a little more often than the competition, then it’s not unreasonable to believe that he is more favored. And you don’t need content analysis to suss that out.
Unit of analysis: US House elections in 2010 and 2012. X-Axis: (# of tweets mentioning the GOP candidate)/(# of tweets mentioning either major party candidate). Y-axis: GOP margin of victory.
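For readers who want to see the X-axis measure spelled out, here is a minimal sketch in Python. It is not the code used in the paper. The function names and the district numbers are made up for illustration, and the plain Pearson correlation stands in for the paper's actual models, which include controls.

```python
from math import sqrt


def gop_tweet_share(gop_mentions, either_mentions):
    """(# tweets mentioning the GOP candidate) / (# tweets mentioning either candidate)."""
    if either_mentions == 0:
        return None  # share is undefined when nobody tweets about the race
    return gop_mentions / either_mentions


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical districts, NOT real data:
# (GOP-candidate tweets, tweets mentioning either candidate, GOP margin in points)
districts = [
    (120, 200, 6.0),
    (40, 200, -22.0),
    (90, 200, -3.0),
    (150, 200, 18.0),
]

shares = [gop_tweet_share(gop, either) for gop, either, _ in districts]
margins = [margin for _, _, margin in districts]
r = pearson(shares, margins)
print(f"correlation between GOP tweet share and GOP margin: {r:.2f}")
```

Note that there is no sentiment step anywhere in the sketch: the measure only counts how often each candidate is named.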
I have a new working paper with Joe DiGrazia*, Karissa McKelvey and Johan Bollen asking if social media data actually forecasts offline behavior. The abstract:
Is social media a valid indicator of political behavior? We answer this question using a random sample of 537,231,508 tweets from August 1 to November 1, 2010 and data from 406 competitive U.S. congressional elections provided by the Federal Election Commission. Our results show that the percentage of Republican-candidate name mentions correlates with the Republican vote margin in the subsequent election. This finding persists even when controlling for incumbency, district partisanship, media coverage of the race, time, and demographic variables such as the district’s racial and gender composition. With over 500 million active users in 2012, Twitter now represents a new frontier for the study of human behavior. This research provides a framework for incorporating this emerging medium into the computational social science toolkit.
The working paper (short!) is here. I’d appreciate your comments.
* Yes, he’ll be on the market in the Fall.
A few weeks ago, Senator Tom Coburn of Oklahoma tried to ban the NSF from supporting political science research. And of course, a lot of folks in the academy voiced their objection. But there’s a broader question for political science: Why is the political science profession so reliant on NSF funding? Repeatedly, people said that a majority of political science projects are funded with NSF funds. Is this true? If so, then it is a precarious state of affairs.
Academic disciplines should rely on a diverse group of supporters. If Congress deems social science a worthy effort, then great. But if they don’t, then we should still be ok. Relying on the NSF is analogous to a business having a single wealthy customer. That’s usually a bad business model. Instead, social scientists should actively court different sources of funding, ranging from the public sector and non-profits to individuals and the corporate world. If you look at sociology, you see many important projects funded by all kinds of folks. The General Social Survey is your typical big project funded by the NSF. Ron Burt obtained a lot of his data from private consulting gigs. Merton’s reference group research was done for the Dept of War during WWII. A lot of Columbia sociology in the 50s and 60s was sponsored by for-profit groups in New York.
It is up to each researcher to decide what kind of funding they are willing to pursue. But collectively, we should encourage funding from many sources, or we’ll be at the mercy of the Tom Coburns of the world.
Sullivan is shocked that the level is high. A few comments: First, the rejection of the war has been stable since around 2006. Second, my hypothesis is that there is a baseline level of support for the war. These are mostly strong conservatives who identify with the Republican party. Third, the wording of the question probably inflates the answer a little. Do you think the United States made a mistake? A lot of people don’t like admitting their country is ever wrong. Sullivan’s post even notes that the support for the war drops in a different wording – whether the war was worth the cost. A bit more impersonal. Bottom line: People know things went badly in Iraq, but nationalism suppresses the feeling.
Hypothesis: The House GOP likes sequestration because it allows them to cut defense but blame the other side. The GOP base loves guns and defense so you just can’t cut defense. But you have to cut defense if you actually believe in deficit reduction and cutting overall spending.
So you sell the base this deal where you say you’ll cut defense only if the other side cuts some welfare state programs. The base buys it because they think that the other side just loves welfare so much that they’ll never let sequestration happen. Thus, defense is never in real jeopardy. But the Straussian leadership knows the historical record – presidents usually win budget fights. Well, maybe not all the time, but they rarely just surrender. Presidents usually just dig in their heels while the public gets mad at Congress. This time, the House GOP is motivated by the base, not the average voter. So they won’t roll over like earlier Congresses. Neither side moves, and sequestration and budget cuts (however small) actually kick in for defense.