use twitter, make money fa$t

My colleague at Indiana University, Johan Bollen, has patented an algorithm that allows him to link Twitter traffic to stock price fluctuations. Click on the link for the TV news item. A clip from the report:

An IU professor and researcher just received a patent for software that crunches hundreds of millions of tweets, to predict where the stock market is headed…

Think of it this way: The thoughts of two or three million people probably don’t add up to much, but if you multiply that by tens or hundreds of millions of people, then you may have something.

“We find that when people get more anxious, then there is a great likelihood of the market dropping 3-4 days later and vice versa,” Bollen said.

Definitely check it out.


Written by fabiorojas

March 3, 2013 at 12:03 am

Posted in education, fabio, networks

9 Responses


  1. What’s the behavioral mechanism behind the 3-4 day lag? Paper?


    Graham Peterson

    March 3, 2013 at 1:32 am

  2. Found it:

    Most of the paper leans on significance tests (of Granger-caused lag variables) to claim mood is predictive of stock market fluctuations. Signal/noise ratios are not tests of social significance. I also don’t see the twitter variables competing with already-established consumer and producer sentiment indices (currently built from surveys), nor with material controls like supply shocks.

    Interesting stuff though. Go NLP!


    Graham Peterson

    March 3, 2013 at 1:48 am

  3. Depends on your goals. If you want explanation, you’ll need more. But if you merely need to establish that variable A really does correlate with B in a reliable way, it’s enough. Causation isn’t the only goal of quantitative analysis. Measurement and model fitting are other goals.



    March 3, 2013 at 6:11 am

  4. The bone I’m picking isn’t the old description v. prediction debate. Similarly on method I don’t have a preference for hypothesis testing v. data mining.

    I agree measurement and model fitting are premium goals. Statistical hypothesis testing accomplishes neither of these. A significance test is a measure of the quality of a sample, the same way a signal/noise ratio is a measure of the quality of a guitar cable: it says nothing about the music coming through the cable.

    That there’s a very good chance your Beta doesn’t equal zero doesn’t mean your Beta means anything at all, though this interpretation is unfortunately conventional. It’s not a good way to eliminate/keep variables. Say I have a coefficient on rubber ducky sales with 8,584,383 observations, determining the Dow average. I have a significant coefficient, and have said nothing.

    More importantly — measurement is precisely the spirit that significance tests lack — your metric is the *size* of your coefficient — not the likelihood that it’s not zero given your sample size.

    The regressions aren’t much better than correlation coefficients in a psychological experiment — in fact much worse because the psychologist has already controlled for random variation before observing the variable of interest. Ex ante regressions do not.


    Graham Peterson

    March 3, 2013 at 6:39 am
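The point in comment 4 about significance versus coefficient size can be illustrated with a small simulation. This is only a sketch on synthetic data: the "rubber ducky" and "Dow" series, the true slope of 0.02, and the sample size of 200,000 are all assumptions chosen for illustration, not anything taken from Bollen's paper.

```python
import math
import random

def simple_ols(x, y):
    """Return slope, its t-statistic, and R^2 for y = a + b*x + e."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares of the fitted line
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope under the usual OLS assumptions
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se, 1 - rss / syy

rng = random.Random(0)
n = 200_000  # assumed sample size, smaller than the 8,584,383 in the comment
# Hypothetical standardized series: duck sales barely move the index
ducks = [rng.gauss(0, 1) for _ in range(n)]
dow = [0.02 * d + rng.gauss(0, 1) for d in ducks]

slope, t_stat, r2 = simple_ols(ducks, dow)
print(f"slope={slope:.4f}  t={t_stat:.1f}  R^2={r2:.5f}")
```

With a sample this large the t-statistic should sit far above conventional thresholds while R^2 stays well under one percent, which is the sense in which a coefficient can be "significant" and still say almost nothing.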

  5. To be clear — I’m all for the work being done in the paper, and want to see much more momentum gathering around it. A much better approach than population-level twitter observations, I think, would be observing a corpus of financial news publications for variations in sentiment. This has the same Big Data and NLP zing, implies the same relevance of culture in markets, and likely allows the establishing of an actual causal mechanism, which Bollen’s paper does not. To his credit — he does not at any point claim to either — and in fact is very straightforward about what his data reveal. The wording in the news article is misleading, though this we should expect.

    Any of these studies should run a few regressions with lagged stock prices themselves on the RHS. I’ve read variously that stock prices follow Markov chains. This is, I think, actually one of the major criticisms of the Efficient Markets Hypothesis in finance, and of the application of static and dynamic optimization models to financial markets. Prices are endogenous. That’ll bake your noodle.


    Graham Peterson

    March 3, 2013 at 6:47 am
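The suggestion in comment 5 to put lagged prices on the right-hand side amounts to a Granger-style nested comparison: fit returns on their own lag, then ask whether lagged mood reduces the residual sum of squares any further. A minimal sketch follows, again on synthetic data. The AR(1) coefficient of 0.5, the mood effect of 0.3 at a three-day lag, and the series names are assumptions for illustration; the second regression uses the Frisch-Waugh-Lovell partialling-out trick rather than a full multiple-regression solver.

```python
import random

def fit_line(x, y):
    """OLS of y on x with an intercept; return (slope, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    return slope, [b - intercept - slope * a for a, b in zip(x, y)]

def rss(resid):
    """Residual sum of squares."""
    return sum(r * r for r in resid)

rng = random.Random(1)
T, LAG = 2000, 3  # assumed series length and mood lag (in days)
mood = [rng.gauss(0, 1) for _ in range(T)]
ret = [0.0] * T
for t in range(1, T):
    # Simulated returns depend on their own lag and, by construction, on lagged mood
    mood_term = 0.3 * mood[t - LAG] if t >= LAG else 0.0
    ret[t] = 0.5 * ret[t - 1] + mood_term + rng.gauss(0, 1)

# Align the series so that index i refers to the same day in all three lists
y = ret[LAG:]
own_lag = ret[LAG - 1:T - 1]
mood_lag = mood[:T - LAG]

# Restricted model: returns on their own lag only
_, resid_restricted = fit_line(own_lag, y)
# Full model via partialling out: strip the own-lag component from mood,
# then regress the restricted residuals on what is left
_, mood_resid = fit_line(own_lag, mood_lag)
mood_coef, resid_full = fit_line(mood_resid, resid_restricted)

n = len(y)
rss_r, rss_u = rss(resid_restricted), rss(resid_full)
# F-statistic for one added regressor; the full model has 3 parameters
f_stat = (rss_r - rss_u) / (rss_u / (n - 3))
print(f"mood coefficient={mood_coef:.3f}  F={f_stat:.1f}")
```

Because the mood effect is built into the simulated data, the F-statistic comes out large here. On real prices the interesting question is how much of that improvement survives once own lags and other sentiment indices are already in the model.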

  6. What I find interesting is that there is something important to be found in non-expert discourse. Financial news, of course, would be relevant. But it would be an important finding if general trends in culture, generated by people with little economic expertise, can be meaningfully linked to economic fluctuations.

    Also, twitter is more “real time” than financial news, which is often worked on way in advance. It’s akin to using google search data to track various trends. It’s telling you something much different than expert discourse. And it’s worth thinking about more.



    March 3, 2013 at 6:54 am

  7. Absolutely. This is in fact an extremely important question in content and discourse analysis. Assuming that sentiments do in fact cascade (or just migrate) through populations, we need to establish whether they descend from the top of hierarchies (this seems to be the major assumption among communications theorists, who favor Power theories), or whether they gather momentum from somewhat disparate adopters in the street. I think we might be surprised to find that generally they accrete in the street, though this would certainly be disappointing to the self-perceptions of academics. *kicks the dirt; pulls down hat*


    Graham Peterson

    March 3, 2013 at 7:02 am

  8. I saw this paper presented last year. What struck me then is that online data like tweets can be a useful resource for sociologists who want to measure changes in collective perceptions and emotions. They’re a bit like dynamic public opinion data.


    brayden king

    March 3, 2013 at 3:57 pm

