Saturday, October 15, 2011

The Twitter Hedge Fund Series: Using Twitter As a Sentiment Indicator - Part 1

I was reading a paper entitled "Constructing Consumer Sentiment Index for U.S. Using Internet Search Patterns" by Messrs Nicolas Della Penna and Haifang Huang, when it occurred to me that their methodology could be applied to Twitter.


In their paper, Messrs Della Penna and Huang assert the following:
  • The search term "Bankruptcy" is an indicator of adverse financial conditions. The higher its frequency, the more adverse the financial conditions.
  • The search phrase "Office Furniture" is an indicator of improving business conditions. The higher its frequency, the more vibrant the business conditions.
  • The search phrase "Luxury Goods" is an indicator of increasing willingness to spend on discretionary items. Generally, the higher its frequency, the more bullish the economic outlook.
  • The search phrases "Oil and Gas", "Electricity", "Alternative Energy" and "Hybrid vehicles" indicate attention to energy costs, which signals a bearish outlook. The higher their frequency, the more bearish the economic outlook.
As a rough test of the applicability of their methodology to Twitter, I picked the terms "Bankruptcy", "Luxury" and "Electricity", and extracted the frequency with which they appeared in tweets from the 24th of April to the 9th of October. Here is what I got:

Chart 1: The encircled points indicate the things that immediately stood out
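
For readers who want to reproduce counts like those in Chart 1, here is a minimal sketch of the counting step. It assumes the tweets are already available as (date, text) pairs; the function names and the sample tweets are made up for illustration, and the Sunday-to-Sunday weekly buckets are only inferred from the date ranges discussed in this post.

```python
from collections import Counter
from datetime import date, timedelta

# The three terms tracked in this post (matched case-insensitively).
TERMS = ("bankruptcy", "luxury", "electricity")


def week_start(day):
    """Return the Sunday that begins the week containing `day`."""
    # date.weekday() is 0 for Monday ... 6 for Sunday.
    return day - timedelta(days=(day.weekday() + 1) % 7)


def weekly_term_counts(tweets, terms=TERMS):
    """Count, per week, how many tweets mention each term."""
    counts = {}
    for day, text in tweets:
        lowered = text.lower()
        bucket = counts.setdefault(week_start(day), Counter())
        for term in terms:
            if term in lowered:
                bucket[term] += 1
    return counts


# Made-up sample tweets, for illustration only.
sample = [
    (date(2011, 5, 2), "Luxury watches are selling again"),
    (date(2011, 5, 3), "Booked a luxury cruise!"),
    (date(2011, 6, 14), "Filing for bankruptcy next week"),
    (date(2011, 6, 15), "My electricity bill doubled"),
]

result = weekly_term_counts(sample)
for start in sorted(result):
    print(start, dict(result[start]))
```

Substring matching is deliberately crude: it would also count "luxury" inside a longer word or a hashtag, which may or may not be what you want.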


Find below my commentary on each of the points:
  • The first encircled point, between 1-8 May: Here the term 'Luxury' is tweeted with the greatest frequency, and the gap between its frequency and that of the other terms is at its widest. This indicates that the outlook around the 5th of May was at its most bullish. Since the S&P 500 is an indicator of general market sentiment, this should be reflected in the index at some point between the 1st and the 8th. However, in Chart 2, the S&P 500 is actually on a downtrend during that period... Nonetheless, I would be interested in finding out how stocks of luxury-oriented products performed then.
  • The second encircled point, between 12-19 June: The frequency with which the term 'Electricity' appears in tweets hits a peak, 'Bankruptcy' also hits a peak, and 'Luxury' is in a dip. In aggregate, this indicates bearish sentiment. In Chart 2, between June 13 and June 14 the S&P 500 actually dips, i.e. the market trend is in sync with the Twitter trend! As a side note, I would be interested in finding out how energy stocks performed then.
  • The third encircled point, between June 26-July 3: 'Bankruptcy' hits its highest peak and the other two terms also peak. Two out of the three indicators ('Bankruptcy' and 'Electricity') point to bearish sentiment, so in aggregate this should be taken as a bearish sign. In Chart 2, the S&P 500 actually dips between the 26-28th of July, i.e. the market trend is in sync with the Twitter trend! As a side note, I would be interested in finding out how an index of luxury stocks performed then.
  • The fourth encircled point, between August 28 and September 4: 'Bankruptcy' hits a trough while 'Electricity' and 'Luxury' are on an uptrend. Two out of the three indicators are bullish, so this should be a sign of a bullish outlook. In Chart 2, on the 29th of August the S&P 500 opened at 1,177.91 and closed at 1,210.08, i.e. the market trend is in sync with the Twitter trend! As a side note, I would be interested in finding out how energy stocks performed during that period.
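
The "two out of three" reasoning in the commentary above amounts to a simple majority vote. Here is a rough sketch of that rule, assuming each term's trend is coded by hand as +1 (rising or at a peak) or -1 (falling or at a trough). The names `DIRECTION` and `aggregate_sentiment` are mine, and the coding of the trends is an assumption, not something taken from the paper.

```python
# Assumed direction of each term when its tweet frequency rises:
# +1 = bullish, -1 = bearish (following the reading of the paper above).
DIRECTION = {"luxury": +1, "bankruptcy": -1, "electricity": -1}


def aggregate_sentiment(trends, direction=DIRECTION):
    """Majority vote over term trends.

    `trends` maps a term to +1 (frequency rising or peaking)
    or -1 (frequency falling or at a trough).
    """
    score = sum(direction[term] * trend for term, trend in trends.items())
    if score > 0:
        return "bullish"
    if score < 0:
        return "bearish"
    return "neutral"


# Hand-coded trends for the second, third and fourth encircled points.
point_2 = {"electricity": +1, "bankruptcy": +1, "luxury": -1}
point_3 = {"electricity": +1, "bankruptcy": +1, "luxury": +1}
point_4 = {"electricity": +1, "bankruptcy": -1, "luxury": +1}

for name, trends in [("point 2", point_2), ("point 3", point_3), ("point 4", point_4)]:
    print(name, aggregate_sentiment(trends))
```

Every term gets an equal vote here. Weighting the terms differently, or using the size of each move rather than just its sign, would be a natural refinement.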
Chart 2 below shows points that correspond with the discussed sentiments:


Chart 2: S&P 500 movements that correspond to the sentiment terms

Although I used a very crude method, the sentiment indicators do appear to have some reflective power. So, I will dig deeper to see if they may have any predictive power.


I'll start by creating a content aggregator that pulls tweets containing each of the above-mentioned terms from Twitter. For this exercise I will use Yahoo Pipes, and my aggregator looks like this:


Chart 3: Content Aggregator


The data feed that I will mine and draw trends from can be accessed from here (click on the 'List' tab for the aggregated raw tweets).
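
For anyone without access to Yahoo Pipes, the merging idea can be approximated in a few lines of Python. This is only an analogue of the aggregator, not the pipe itself: it assumes the tweets for each term have already been fetched into lists of dictionaries, and the field names (`id`, `created`, `text`) and sample records are hypothetical.

```python
def merge_feeds(feeds):
    """Merge per-term tweet lists into one list, newest first.

    A tweet that matched several terms appears once, with every
    matching term recorded under the "terms" key.
    """
    merged = {}
    for term, tweets in feeds.items():
        for tweet in tweets:
            # Copy the tweet on first sight so the inputs are not modified.
            entry = merged.setdefault(tweet["id"], {**tweet, "terms": []})
            entry["terms"].append(term)
    # ISO 8601 timestamps sort correctly as plain strings.
    return sorted(merged.values(), key=lambda t: t["created"], reverse=True)


# Made-up feeds, for illustration only.
feeds = {
    "bankruptcy": [
        {"id": 1, "created": "2011-10-01T09:00:00", "text": "Bankruptcy sale on luxury cars"},
        {"id": 2, "created": "2011-10-02T12:00:00", "text": "Another bankruptcy filing"},
    ],
    "luxury": [
        {"id": 1, "created": "2011-10-01T09:00:00", "text": "Bankruptcy sale on luxury cars"},
        {"id": 3, "created": "2011-10-03T08:30:00", "text": "Luxury hotels are full"},
    ],
}

merged = merge_feeds(feeds)
for tweet in merged:
    print(tweet["created"], tweet["terms"], tweet["text"])
```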

In a subsequent post I will create a Twitter sentiment indicator, from the data I draw from the aggregator, and see if it has any predictive power.


Stay tuned, that is the interesting part!