The initiation of a nation-wide election is like the sunrise to statisticians and pollsters. Instead of the quintessential morning coffee, this most peculiar of creatures brews tea and, as a morning exercise, attempts to predict the ways in which the tea leaves will move.
Pollsters use statistics to model voting behaviour and predict election outcomes. Henry Durant, Britain’s first official pollster, once jested that his was ‘the stupidest of professions.’ The joke was on him when his own predictions matched electoral reality: Clement Attlee beating Winston Churchill in 1945. With that result, the stupidest of professions put on a suit and became serious; guessing what voters were going to do was all the rage, and every major party from then on hired its own internal pollsters.
In 2015, the guessing profession began to suffer an erosion of its hard-won reputation for seriousness. Pollsters had said the 2015 election would be a photo finish between Labour and the Conservatives; the prediction served only to cruelly raise Ed Miliband’s hopes before dashing them, like ketchup in a bacon sandwich. The profession then suffered an existential crisis when the Brexit referendum returned a Leave win – an outcome none of the guessers had accurately modelled. Merely tweaking their methods would not suffice, as much of the variance in voting behaviour simply could not be predicted with the tools at their disposal.
The pollsters’ models appeared to be underpowered. The models in use did not draw on a wide enough range of data to account for certain factors and their minuscule effects on voting behaviour. In 2015, Labour turnout was also overestimated because of sampling issues: most data was gathered online, which under-represented older voters and those who are less politically engaged. Pollsters, moreover, choose to put more credence in some responses than others. Responses judged more reliable – those that predict more of the outcome variance – attain a greater weight within the model. A good example is voter age: the older someone is, the more likely they are to vote, so voter age carries a greater weight in most models. Weighting is a mechanism needed simply because no sample is ever large enough for each response to reveal its importance on its own (only after an election can we know for sure). If the weights are assigned incorrectly, therefore, the model is liable to be mistaken. This could be a problem during the election.
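The weighting mechanism described above can be sketched in a few lines. This is a toy illustration only – every figure below (the sample, the population shares) is invented for the example, not drawn from any real poll:

```python
from collections import Counter

# Toy raw sample (invented): (age_group, intends_to_vote_labour).
# An online sample like this over-represents the young.
sample = [
    ("18-34", True), ("18-34", True), ("18-34", False),
    ("35-64", True), ("35-64", False),
    ("65+", False),
]

# Assumed population shares per age group (hypothetical).
population_share = {"18-34": 0.25, "35-64": 0.50, "65+": 0.25}

# Share of each group actually present in the raw sample.
counts = Counter(group for group, _ in sample)
sample_share = {g: c / len(sample) for g, c in counts.items()}

# A respondent's weight corrects their group's over- or
# under-representation: population share / sample share.
weights = {g: population_share[g] / sample_share[g] for g in counts}

# Weighted estimate of the Labour vote share.
num = sum(weights[g] for g, labour in sample if labour)
den = sum(weights[g] for g, _ in sample)
print(round(num / den, 3))  # weighted estimate
print(round(sum(labour for _, labour in sample) / len(sample), 3))  # raw
```

Because the over-sampled 18–34 group leans Labour in this toy data, the weighted estimate comes out lower than the raw one – exactly the correction the 2015 models needed, and exactly what goes wrong if the weights are misjudged.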
Most problems in the prediction business, however, arise from the dreaded task of predicting election turnout. Most models in 2017, for instance, based their turnout assumptions on the 2015 general election. This affected most polls’ ability (the Survation poll aside) to correctly guess Labour’s turnout: more students cast their votes for Labour than they had done in 2015. The polls had completely underestimated the likelihood of a big student vote, which torpedoed most models’ predictive power on vote share. Some pollsters have now opted to return to the traditional method of gauging potential turnout: asking people how likely they are to get out and vote.
The variety of polling methodologies on the market has led to widely scattered poll results. One poll gave the Tories a 17-point lead over Labour, whilst another gave the Conservatives only a modest 4-point lead. It is also tough for pollsters to translate predicted party vote shares into seats won at Westminster. The traditional method – applying the national change in vote share uniformly to each constituency – has been successful in the past. But because of volatile voting patterns and the likelihood of tactical voting, this method will probably not hold water.
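That traditional seat-projection method – uniform national swing – is simple enough to sketch. The constituencies, previous vote shares, and swing figures below are all hypothetical, purely to show the mechanism:

```python
# Uniform national swing, sketched with invented figures.
# Each constituency's previous vote shares are shifted by the same
# national change per party, then the local winner is re-read.

# Previous vote shares per hypothetical constituency.
constituencies = {
    "Seat A": {"Con": 0.45, "Lab": 0.40, "LD": 0.15},
    "Seat B": {"Con": 0.38, "Lab": 0.42, "LD": 0.20},
    "Seat C": {"Con": 0.50, "Lab": 0.30, "LD": 0.20},
}

# National swing implied by current polling (hypothetical):
# Con down 4 points, Lab up 3, LD up 1.
swing = {"Con": -0.04, "Lab": +0.03, "LD": +0.01}

projected = {}
for seat, shares in constituencies.items():
    new_shares = {party: s + swing[party] for party, s in shares.items()}
    projected[seat] = max(new_shares, key=new_shares.get)

print(projected)
```

The weakness the article points to is visible in the code: the same swing is applied everywhere, so any constituency that swings against the national trend – through tactical voting, say – is projected wrongly.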
Pollsters have been borrowing from the data scientists’ toolbox in an effort to better predict how the national vote translates into seats won at Westminster. One method proving popular goes by the kerfuffle of a name multilevel regression and post-stratification (MRP). It combines conventional polling with census data to ascertain how many people of each demographic category reside in each area. Armed with this data, pollsters can apply their predictor variables (e.g. age, educational attainment, socioeconomic status, Remainer or Leaver, personal voting history) at the constituency level. YouGov and Best for Britain (a pro-Remain group) are the latest to have built such MRP models and have already put them to good use. A week before the 2017 election, YouGov bravely forecast that the Tories would severely underperform. Many were sceptical, but its prediction of the Conservatives winning between 274 and 345 seats was finely tuned: the election returned only 317 seats for the Tories.
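The post-stratification half of MRP reduces to a count-weighted average, which a short sketch makes concrete. Everything here is invented – the cell predictions stand in for a fitted multilevel model, and the census counts are for an imaginary constituency:

```python
# Post-stratification, sketched with invented figures.
# A fitted model would supply a predicted Labour probability for each
# demographic "cell"; census data says how many people in each cell
# live in the constituency. The constituency estimate is the
# count-weighted average of the cell predictions.

# Predicted Labour support per (age, education) cell – stand-ins for
# the output of a real multilevel regression.
cell_prediction = {
    ("18-34", "degree"): 0.62,
    ("18-34", "no degree"): 0.48,
    ("35+", "degree"): 0.41,
    ("35+", "no degree"): 0.33,
}

# Census-style counts for one hypothetical constituency.
cell_count = {
    ("18-34", "degree"): 12_000,
    ("18-34", "no degree"): 18_000,
    ("35+", "degree"): 20_000,
    ("35+", "no degree"): 30_000,
}

total = sum(cell_count.values())
estimate = sum(cell_prediction[c] * n for c, n in cell_count.items()) / total
print(round(estimate, 3))
```

The appeal is that the same national model can be re-weighted by each constituency’s own census counts, giving a seat-by-seat forecast rather than a single national number.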
Although polling is reinventing itself and showing ingenuity in the face of voter volatility in an attempt to revive its reputation, politicians should ask themselves whether they want to lead or simply be sheep to the polls. MPs and leaders seem to think that the only way to win is to go to where the voter is, rather than where they would like the voter to be. Margaret Thatcher once quipped: “If you are guided by opinion polls, you are not practising leadership – you are practising followership.”