UK Elections 2015

A Study for UK Elections 2015 based on Twitter and News Media

Exploring the right mix of polls and Twitter features

The methodology of this study rests on the idea that incorporating Twitter features into poll-based regression models yields significantly better predictions than conventional poll-only regression. Intuitively, the Twitter-based features in such a model act to “correct” the polls. But is beating poll-based regression good enough? What if there is a polling outlier on the very last day before the election? Would our Twitter features be able to correct it and, if so, to what extent? Would it be preferable to ignore the polls conducted in the final period before election day and rely strictly on our Twitter features during that period, to avoid such effects?

To answer these questions, we decided to test two different models:

Poll enhancer: We have been following polls until the very last day before the elections (6th of May), extracting daily features from them to serve as our target. In this approach, our final election prediction is heavily influenced by the polls, especially those conducted during the final days before the Election Day.

Reduced poll-influence: To reduce the influence of opinion polls on our final prediction model, and to test the impact of Twitter features, we removed the poll-based features that the Poll enhancer uses from the last week before the Election Day. Thus, using polls published up to one week before the Election Day and Twitter data tracked between the 21st of March and the 6th of May, we trained a model that shifts away from the latest polls and makes a more independent election prediction.
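The difference between the two models comes down to which feature groups are available on a given day. A minimal Python sketch of that windowing follows. It assumes that the Election Day is 7 May 2015, that "one week before" means exactly seven days, and that both models use the same Twitter tracking window (the text states it only for Reduced poll-influence). The names `poll_cutoff` and `available_features` are hypothetical and are not taken from the study's code.

```python
from datetime import date, timedelta

ELECTION_DAY = date(2015, 5, 7)
TWITTER_START = date(2015, 3, 21)              # start of Twitter tracking, from the text
LAST_DAY = ELECTION_DAY - timedelta(days=1)    # 6 May, last day before the election


def poll_cutoff(model: str) -> date:
    """Return the last day on which poll-based features are used."""
    if model == "poll_enhancer":
        return LAST_DAY
    if model == "reduced_poll_influence":
        # Assumption: "one week before" is exactly seven days.
        return ELECTION_DAY - timedelta(days=7)
    raise ValueError(f"unknown model: {model}")


def available_features(model: str, day: date) -> list[str]:
    """List the feature groups a model may use for a given day."""
    groups = []
    if TWITTER_START <= day <= LAST_DAY:
        groups.append("twitter")
    if day <= poll_cutoff(model):
        groups.append("polls")
    return groups


if __name__ == "__main__":
    for model in ("poll_enhancer", "reduced_poll_influence"):
        print(model, available_features(model, date(2015, 5, 5)))
```

Under these assumptions, on 5 May the Poll enhancer still sees both feature groups, while Reduced poll-influence relies on Twitter features alone.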

The results presented on the home page are the ones produced by the Poll enhancer, while the ones presented here are those of the Reduced poll-influence model. Interestingly, we find important differences between their outputs. The election results will tell which of the two models gave the better approximation of the final outcome. However, to understand how much the various parameters contribute across different days, we plan to investigate the two models further as part of our post-election analysis.

Conservative: 33.56
Labour: 33.10
UKIP: 13.05
Lib Dem: 8.34
Green: 5.66
SNP: 4.27
Other: 2.03

Is data from Twitter pointing towards a last-minute swing?

National opinion polls ahead of the 2015 UK general election have remained static for months. They’ve typically reported that Labour and the Conservatives are tied on 32-35% of the vote each, that UKIP and the Liberal Democrats have around 13% and 8% respectively, and that the Greens and the SNP are both hovering around 5%. This has prompted some analysts to ponder whether these polls are really telling the whole story. Lord Ashcroft – one of the most frequently cited pollsters – has labelled national polls mere “mood music”, and has been conducting additional polls at the constituency level in order to better predict how many actual seats each party can expect to win.

Our approach – which is based on using data from Twitter to make faster and more accurate predictions of the vote share – also allows us to paint a more detailed picture of the election than the national polls imply. It allows us to spot changes in the public mood early on, as well as the more subtle fluctuations that averaged national polls might paper over. Ultimately, we hope that properly understanding these shifts will help us to make better predictions – particularly if they occur just before polling day when much opinion polling will have stopped.

What’s our data telling us at the moment? Throughout the election period, the Conservatives have been more commonly referred to than any of the other parties, with around 40% of all tweets about the election referring to them in some way. Labour are in second place with around 30%. UKIP and the SNP are in third and fourth with 17%, and the Lib Dems and the Greens fifth and sixth with about 7%.
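Mention shares of this kind can be computed by counting, for each party, the fraction of election tweets that refer to it. The sketch below is illustrative only: the function name `mention_shares` and the toy tweets are invented, and how the study actually matched tweets to parties is not described here. Because a single tweet may mention several parties, the shares need not add up to 100%.

```python
from collections import Counter


def mention_shares(tweets):
    """Percentage of tweets that mention each party.

    `tweets` is a list with one set of party names per tweet.
    A tweet naming two parties counts once for each of them.
    """
    total = len(tweets)
    if total == 0:
        return {}
    counts = Counter()
    for parties in tweets:
        counts.update(set(parties))  # set() guards against duplicate names
    return {party: 100.0 * n / total for party, n in counts.items()}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    sample = [
        {"Conservative"},
        {"Conservative", "Labour"},
        {"Labour"},
        {"UKIP"},
    ]
    print(mention_shares(sample))
```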


However, the way in which these numbers have changed over time is likely to be more revealing, as this may indicate a change in the public mood. Though there were brief peaks when each of the parties launched their respective manifestos, this pattern has held surprisingly well. Until recently, that is, when Labour overtook the Conservatives for the first time. As our charts (and the image above) show, this shift coincided with the broadcast of the BBC’s Question Time: Election Leaders Special, and occurred shortly after Ed Miliband was interviewed by the comedian and activist Russell Brand.

Of course, parties can be referred to in both a positive and a negative sense, so just looking at the number of tweets may be misleading. Using an approach based on automated sentiment analysis, we are able to look at just those tweets that express positive sentiment. If we do this, we see that throughout the election period, positive tweets have most often referred to the Green party, with the SNP in second place, and the four other main parties closely bunched together behind them.
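One way to picture this filtering step: keep only the tweets labelled positive, then see which parties they refer to. The sketch below assumes each tweet already carries a sentiment label from some classifier. The labels, the function name `positive_mention_shares` and the toy data are all hypothetical, since the sentiment method used in the study is not specified here.

```python
from collections import Counter


def positive_mention_shares(tweets):
    """Percentage of positive tweets that mention each party.

    `tweets` is a list of (parties, label) pairs, where `parties` is a
    set of party names and `label` is an assumed classifier output such
    as "positive", "negative" or "neutral".
    """
    positive = [parties for parties, label in tweets if label == "positive"]
    if not positive:
        return {}
    counts = Counter()
    for parties in positive:
        counts.update(set(parties))
    return {party: 100.0 * n / len(positive) for party, n in counts.items()}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    sample = [
        ({"Green"}, "positive"),
        ({"Green", "SNP"}, "positive"),
        ({"Labour"}, "negative"),
        ({"SNP"}, "neutral"),
        ({"Conservative"}, "positive"),
        ({"Green"}, "positive"),
    ]
    print(positive_mention_shares(sample))
```

Tracking these shares day by day, rather than over the whole period, is what would reveal a late shift in positive sentiment towards one party.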

[Charts: positive tweets by party; positive tweets referring to Labour]

Interestingly, the chart showing just the positive tweets referring to the Labour party suggests that the recent rise in the total number of tweets about Labour may be driven by positive sentiment towards the party. Could this be an early sign of a shift that will only be reflected in the opinion polls later on? Time will tell.