How we conducted our 2019 general election polling

2019 has been a testing year for pollsters, much as we love a challenge. The European Parliament elections in the summer, where the traditional parties looked like they might turn to dust, have been followed by a general election campaign in which the public seem to be returning to type. This means we are constantly working to ensure that our methodology is up to date and reflects the world around us.

This year we have made a range of changes and, because they have prompted so many questions, we thought it was best to summarize everything in one place.

The principles behind our 2019 methodology review

The Sturgis review into the polling inaccuracies of 2015 concluded, quite simply, that “the primary cause of the polling errors in 2015 was unrepresentative sampling”. Those two words, “unrepresentative sampling”, encompass such a large number of issues, from the way that panels are recruited all the way through to sample frames and weighting, that it’s simply impossible to go into all of them here.

The result of this unrepresentative sampling in 2015 was to underestimate the Conservative vote share and overestimate Labour’s. So the industry to one degree or another responded with some form of ‘political weighting’ or ‘correction’, more or less finding ways to boost the Conservative share either directly or indirectly. The 2017 general election showed that samples might not always be unrepresentative in the same way.

The lesson we took was twofold.

Firstly, we should seriously re-examine what we might not be getting right in terms of how our samples are composed. What do we currently weight by? What could we weight by? One sample of 2,000 UK adults can’t be exactly representative of the public on 20 demographic measures, but we can ensure it is representative on 10 (for example), as long as they are the ‘right’ ones.

Secondly, straightforward solutions such as weighting up the core group of one party’s supporters over another’s aren’t the answer. The answer to the problems of 2015 and 2017 should be the same for us; clearly there are groups who voted Tory in 2015 and not in 2017 that we underestimated. But finding out who these groups are is the challenge.

The biggest change – the resident public vs the electorate

The first change we implemented this year was to account accurately for the proportion of the public who are actually registered to vote.

This might sound like a simplistic answer, but actually it is crucial. Only those who are registered to vote can cast a ballot at any stage, and not all of them are actually entitled to participate in general elections.

For example, at the last general election the electorate totalled less than 47 million despite the population being over 52 million. Shockingly, this means that more than one in ten of the resident public could not participate.

This has endless implications. If the aim is to achieve a representative sample, then how do we get one? We use official statistics about the mix of age, gender, ethnicity, home ownership, etc. These all describe the resident public, not the voting public. If we get the mix of the former spot on but don’t think about the latter then we will have accepted that our sample is unrepresentative in a key way before we get onto any other issue.

In the raw data from our sample, we underestimate the proportion of those not entitled to vote in general elections by half. Weighting to correct for this means the demographic makeup of the electorate is more likely to be correct once we also weight to correct for any imbalances in the overall resident population.

A great example of this shift is London, where not correcting for it can make the overall national sample unrepresentative from a political perspective. London makes up 13.1% of the UK’s resident population, but only 11.5% of the electorate. With London’s skew towards one party and its unusual political and demographic situation, correcting for something like this is crucial.
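To make the London figures concrete, here is a minimal sketch in Python of the adjustment they imply. It assumes a sample that already matches the resident population and needs re-balancing to the electorate. The variable names are illustrative, and this is simple arithmetic on the two percentages quoted above rather than a description of the full weighting scheme.

```python
# Shares quoted in the text above.
london_share_of_residents = 0.131
london_share_of_electorate = 0.115

# If the sample mirrors the resident population, each London respondent
# needs to count for a little less, and everyone else for a little more,
# for the sample to mirror the electorate instead.
london_weight = london_share_of_electorate / london_share_of_residents
rest_of_uk_weight = (1 - london_share_of_electorate) / (1 - london_share_of_residents)

print(f"London weight: {london_weight:.3f}")
print(f"Rest of UK weight: {rest_of_uk_weight:.3f}")
```

In other words, on these figures a London respondent would be weighted down by roughly 12% relative to a sample that only mirrors the resident population.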

Moving onto other demographic factors

Once we had planned for this, we could move onto other demographic factors.

We wanted to start with the fewest preconceptions possible. The aim is to use the demographic criteria that are currently most relevant in determining how people vote.

To do this we went far and wide to collect the broadest range of demographic criteria we could possibly find – everything from car ownership to whether or not you hold a passport. We then asked about the same demographic criteria on a nationally representative sample alongside how participants in the European Parliament elections cast their vote. Then we simply tested an endless number of permutations. Which weighting combination got us organically closest to the actual European Parliament election result?

The combination we alighted upon was as follows: gender, age, region, employment status, car ownership, occupation and past vote from 2015, 2016 and 2017.

This wasn’t just a case of what we could add. We also found that our old social grade weight made the results less accurate, so we replaced it with a far more targeted question around occupation.
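For readers unfamiliar with how a sample is weighted to several criteria at once, one common technique is raking (iterative proportional fitting). The sketch below is a generic illustration of that technique in Python and is not necessarily the procedure used here. The tiny sample, the two variables and the target shares are all made up for the example.

```python
from collections import defaultdict


def weighted_shares(respondents, weights, variable):
    """Return the weighted share of each category of one variable."""
    totals = defaultdict(float)
    for person, weight in zip(respondents, weights):
        totals[person[variable]] += weight
    grand_total = sum(totals.values())
    return {category: total / grand_total for category, total in totals.items()}


def rake(respondents, targets, iterations=50):
    """Adjust weights one variable at a time until every variable's
    weighted shares are close to its target shares."""
    weights = [1.0] * len(respondents)
    for _ in range(iterations):
        for variable, target_shares in targets.items():
            current = weighted_shares(respondents, weights, variable)
            weights = [
                weight * target_shares[person[variable]] / current[person[variable]]
                for person, weight in zip(respondents, weights)
            ]
    return weights


# Hypothetical four-person sample and hypothetical population targets.
sample = [
    {"gender": "female", "car": "yes"},
    {"gender": "female", "car": "no"},
    {"gender": "male", "car": "yes"},
    {"gender": "male", "car": "yes"},
]
targets = {
    "gender": {"female": 0.5, "male": 0.5},
    "car": {"yes": 0.7, "no": 0.3},
}

weights = rake(sample, targets)
for variable in targets:
    print(variable, weighted_shares(sample, weights, variable))
```

Each pass corrects one variable and slightly disturbs the others, which is why the loop repeats. Adding more criteria makes every target harder to hit at once, which is one reason the choice of criteria matters so much.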

Accounting for false recall in past vote

We also noticed that past vote recall isn’t particularly brilliant. Roughly one in ten voters do not accurately remember who they voted for two years ago, and a higher proportion struggle with how they cast their ballot in 2015.

The only vote we tested where this wasn’t true was the 2016 EU Referendum, where survey participants remembered how they voted with a very high degree of accuracy. It is another sign of how the 2016 vote has split us into two clearly defined camps.

In short, we needed to find a way to take this phenomenon into account, especially for general elections. Despite how it might look on the face of it, shares of the vote for political parties from one election to the next change by incredibly small amounts. 2017 might have been drastic but the Conservative and Labour vote shares changed by a much smaller amount between 2005 and 2015.

Not accounting properly for those switching their vote but ‘conveniently’ forgetting they ever voted for another party has a big impact, even when we are still only talking about a small percentage of the overall sample.

To add to the challenge, we do not have the past vote for everyone on our panel from the time they actually cast their ballot. To solve this, we simply ask it in every survey and pull the data from the best source we can find. If they answered our ‘day of poll’ survey from 2017 and 2015, we use that data. If they answered a survey in the immediate aftermath of the election, we use that. If we have none of these things we rely, as a last resort, on what they tell us in each survey as it happens, but fortunately we need to do this for only a minority of participants.
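The order of preference described above can be sketched as a simple fallback. This is a minimal illustration in Python. The source labels and the record layout are hypothetical, and real panel data would be stored quite differently.

```python
from typing import Optional

# Sources in order of preference, most trusted first.
# These labels are hypothetical names for the three cases described above.
SOURCE_PRIORITY = ["day_of_poll", "post_election", "current_survey"]


def best_past_vote(answers: dict) -> Optional[str]:
    """Return the past-vote answer from the most trusted source available.

    `answers` maps a source label to the party the participant reported,
    with missing sources either absent or set to None.
    """
    for source in SOURCE_PRIORITY:
        answer = answers.get(source)
        if answer:
            return answer
    return None


# A participant with no 'day of poll' answer falls back to the
# survey taken in the immediate aftermath of the election.
participant = {"post_election": "Labour", "current_survey": "Conservative"}
print(best_past_vote(participant))
```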

Correcting for which constituency the parties are standing in

The only change that we made during the campaign – and by change, we really mean an additional stage – is to show only the parties that are standing in each constituency. Our system identifies which constituency a participant lives in and then shows the parties that we know are standing in that constituency. This change was brought in once nominations had closed and we had a firm list of where the Brexit Party were and weren’t standing.

To maintain a comparable dataset, we still ask the broader national question around which party you are planning to vote for in the general election. However, we follow this up with another question explicitly stating that these are the only parties standing in their constituency and asking, on that basis, who they would plan to vote for.
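As a rough illustration of the constituency-specific question, the Python sketch below filters a national list of parties down to those standing locally. The constituency names and candidate lists are placeholders. A real system would load the official nominations and would also need to map each participant to a constituency, which is not shown.

```python
# National answer options shown in the first, broader question.
NATIONAL_OPTIONS = [
    "Conservative",
    "Labour",
    "Liberal Democrat",
    "Brexit Party",
    "Green",
]

# Placeholder data: which parties are standing where.
STANDING = {
    "Constituency A": {"Conservative", "Labour", "Liberal Democrat"},
    "Constituency B": {"Conservative", "Labour", "Liberal Democrat", "Brexit Party"},
}


def options_for(constituency: str) -> list:
    """Return the parties to show in the follow-up question.

    Falls back to the full national list if the constituency is unknown.
    """
    standing = STANDING.get(constituency)
    if standing is None:
        return list(NATIONAL_OPTIONS)
    return [party for party in NATIONAL_OPTIONS if party in standing]


print(options_for("Constituency A"))
print(options_for("Constituency B"))
```

Keeping the national list as the single source of ordering means the two questions present parties consistently, so differences between them come from availability rather than presentation.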

Keeping both questions proved to be useful. When we first included the constituency-specific question it appeared as though the Conservative lead had shot up to 19 points. In reality no votes had actually shifted week-on-week, but for the first time in hundreds of Conservative seats potential Brexit Party voters were offered no option but to vote for one of the other main parties.

In 2017 we also suffered from larger vote shares being given to smaller parties, because we did not correct for the fact that in hundreds of seats those parties weren’t standing.

All in all these changes hopefully mean that:

  1. We have an accurate understanding of the makeup of the electorate as well as the resident population.
  2. We use the best possible demographic criteria for getting headline voting intention to accurately reflect previous voting patterns without simply weighting to some form of overtly political criteria outside of past vote.
  3. We can account for the fact that votes cast at UK general elections are an aggregate of 650 constituency battles, not just one national poll.