How we conducted our 2019 general election polling

Thu 12 Dec 2019

2019 has been a testing year for pollsters, much as we love a challenge. The European Parliament elections in the summer, where the traditional parties looked like they might turn to dust, have been followed by a general election campaign in which the public seems to be returning to type. This means we are constantly working to ensure that our methodology is up to date and reflects the world around us.

This year we have made a range of changes and, because they have prompted so many questions, we thought it best to summarise everything in one place.

The principles behind our 2019 methodology review

The Sturgis review into the polling inaccuracies of 2015 concluded, quite simply, that “the primary cause of the polling errors in 2015 was unrepresentative sampling”. Those two words encompass such a large number of issues, from the way that panels are recruited all the way through to sample frames and weighting, that it’s simply impossible to go into all of them here.

The result of this unrepresentative sampling in 2015 was to underestimate the Conservative vote share and overestimate Labour’s. So the industry to one degree or another responded with some form of ‘political weighting’ or ‘correction’, more or less finding ways to boost the Conservative share either directly or indirectly. The 2017 general election showed that samples might not always be unrepresentative in the same way.

The lesson we took was twofold.

Firstly, we should seriously re-examine what we might not be getting right in terms of how our samples are composed. What do we currently weight by? What could we weight by? One sample of 2,000 UK adults can’t be exactly representative of the public on 20 demographic measures, but we can ensure it is representative on 10 (for example), as long as they are the ‘right’ ones.

Secondly, straightforward solutions such as weighting up the core group of one party’s supporters over another’s aren’t the answer. The answer to the problems of 2015 and 2017 should be the same for us; clearly there are groups who voted Tory in 2015 and not in 2017 that we underestimated. But finding out who these groups are is the challenge.

The biggest change – the resident public vs the electorate

The first change we implemented this year was to account accurately for the proportion of the public who are actually registered to vote.

This might sound like a simplistic answer, but actually it is crucial. Only those who are registered to vote can cast a ballot at any stage, and not all of them are actually entitled to participate in general elections.

For example, at the last general election the electorate totalled less than 47 million despite the population being over 52 million. Shockingly, this means that at the last general election more than one in ten of the resident public could not participate.

This has endless implications. If the aim is to achieve a representative sample, then how do we get them? We get official statistics about the mix of age, gender, ethnicity, home ownership, etc. These all represent the resident public, not the voting public. If we get the mix of the former spot on but don’t think about the latter then we will have accepted that our sample is unrepresentative in a key way before we get onto any other issue.

In the raw data from our sample, we underestimate the proportion of those not entitled to vote in general elections by half. Weighting to correct for this, alongside weighting to correct for any imbalances in the overall resident population, gives a greater chance that the demographic makeup of the electorate in our sample is correct.

A great example of this shift is in London, and how not correcting for it can make the overall national sample unrepresentative from a political perspective. London makes up 13.1% of the UK’s resident population, but it makes up only 11.5% of the electorate. With London’s skew towards one party and its rather unique political and demographic situation, correcting for something like this is crucial.
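To make the London example concrete, here is a minimal sketch of the weight needed to move a sample that matches the resident population onto the electorate's regional split. It uses only the two shares quoted above. The function name and the two-region simplification ("London" versus "Rest of UK") are illustrative assumptions, not a description of the full weighting scheme.

```python
def region_weights(sample_share, target_share):
    """Return one weight per region: target share divided by sample share."""
    return {region: target_share[region] / sample_share[region]
            for region in sample_share}

# Shares quoted in the text; "Rest of UK" is simply the remainder.
resident = {"London": 0.131, "Rest of UK": 1 - 0.131}
electorate = {"London": 0.115, "Rest of UK": 1 - 0.115}

weights = region_weights(resident, electorate)
for region, weight in weights.items():
    print(f"{region}: {weight:.3f}")
```

Under these assumptions a London respondent is weighted down to roughly 0.88 and everyone else up to roughly 1.02, which is enough to matter when one region leans heavily towards one party.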

Moving onto other demographic factors

Once we had planned for this, we could move onto other demographic factors.

We wanted to start with the fewest preconceptions possible. The aim is to use the right demographic criteria currently relevant in determining how people vote.

To do this we went far and wide to collect the broadest range of demographic criteria we could possibly find: everything from car ownership to whether or not you hold a passport. We then asked about the same demographic criteria on a nationally representative sample alongside how participants in the European Parliament elections cast their vote. Then we simply tested an endless number of permutations. Which weighting combination got us organically closest to the actual European Parliament election result?

The combination we alighted upon was as follows: gender, age, region, employment status, car ownership, occupation and past vote from 2015, 2016 and 2017.
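The search described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the procedure actually used: it assumes raking (iterative proportional fitting) as the weighting method, uses an invented ten-person sample with two variables, and scores each combination by total absolute error against a made-up "actual" result. All function names, targets and data here are hypothetical.

```python
from itertools import combinations

def rake(rows, targets, variables, iterations=50):
    """Iterative proportional fitting over the chosen variables."""
    weights = [1.0] * len(rows)
    for _ in range(iterations):
        for var in variables:
            total = sum(weights)
            for category, target in targets[var].items():
                members = [i for i, row in enumerate(rows) if row[var] == category]
                current = sum(weights[i] for i in members) / total
                if current > 0:
                    factor = target / current
                    for i in members:
                        weights[i] *= factor
    return weights

def vote_shares(rows, weights):
    """Weighted share of each party in the sample."""
    total = sum(weights)
    shares = {}
    for row, w in zip(rows, weights):
        shares[row["vote"]] = shares.get(row["vote"], 0.0) + w / total
    return shares

def total_error(shares, actual):
    """Sum of absolute differences between weighted and actual shares."""
    return sum(abs(shares.get(party, 0.0) - share) for party, share in actual.items())

def best_combination(rows, targets, actual):
    """Try every non-empty set of variables; prefer fewer variables on ties."""
    scored = []
    for size in range(1, len(targets) + 1):
        for combo in combinations(sorted(targets), size):
            weights = rake(rows, targets, combo)
            error = round(total_error(vote_shares(rows, weights), actual), 6)
            scored.append((error, len(combo), combo))
    error, _, combo = min(scored)
    return combo, error

def people(age, car, vote, n):
    return [{"age": age, "car": car, "vote": vote} for _ in range(n)]

# Invented sample: too many young people, and vote follows age exactly.
rows = (people("young", "yes", "A", 2) + people("young", "no", "A", 4)
        + people("old", "yes", "B", 3) + people("old", "no", "B", 1))
targets = {
    "age": {"young": 0.4, "old": 0.6},
    "car": {"yes": 0.7, "no": 0.3},
}
actual = {"A": 0.4, "B": 0.6}

combo, error = best_combination(rows, targets, actual)
print(combo, error)
```

In this toy data, weighting by age alone recovers the result, while weighting by car ownership alone leaves party A on 52% against a true 40%. With real data the number of candidate variables is far larger, so the number of combinations to test grows very quickly.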

This wasn’t just a case of what we could add. We also found that our old social grade weight made the results less accurate, so we replaced it with a far more targeted question around occupation.

Accounting for false recall in past vote

We also noticed that past vote recall isn’t particularly brilliant. Roughly one in ten voters do not accurately remember who they voted for two years ago, and a higher proportion struggle with how they cast their ballot in 2015.

The only vote we tested that this wasn’t true of was the 2016 EU Referendum, where survey participants remembered how they had voted with a very high degree of accuracy. This is another sign of how the 2016 vote has split us into two clearly defined camps.

In short, we needed to find a way to take this phenomenon into account, especially for general elections. Despite how it might look on the face of it, shares of the vote for political parties from one election to the next change by incredibly small amounts. 2017 might have been drastic but the Conservative and Labour vote shares changed by a much smaller amount between 2005 and 2015.

Not accounting properly for those switching their vote but ‘conveniently’ forgetting they ever voted for another party has a big impact, even when we are still only talking about a small percentage of the overall sample.

To add to the challenge, we do not have the past vote for everyone on our panel from the time they actually cast their ballot. To solve this, we simply ask it in every survey and pull the data from the best source we can find. If they answered our ‘day of poll’ survey from 2017 and 2015, we use that data. If they answered a survey in the immediate aftermath of the election, we choose that. If we have none of these things we rely on what they told us in each survey as it happens as a last resort, but fortunately we need to rely on this for only a minority of participants.
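The fallback order described above can be sketched as a simple priority lookup. This is an illustration only: the source labels (`day_of_poll`, `post_election`, `current_survey`) and the record layout are hypothetical names chosen for the example.

```python
from typing import Dict, Optional, Tuple

# Ordered from most to least trusted, following the priority in the text.
SOURCE_PRIORITY = ("day_of_poll", "post_election", "current_survey")

def best_past_vote(records: Dict[str, Optional[str]]) -> Optional[Tuple[str, str]]:
    """Return (source, answer) from the most trusted source that has an answer."""
    for source in SOURCE_PRIORITY:
        answer = records.get(source)
        if answer is not None:
            return source, answer
    return None  # no past vote recorded anywhere

# Hypothetical panellist who missed the day-of-poll survey and now recalls
# their vote differently from what they said just after the election.
panellist = {
    "day_of_poll": None,
    "post_election": "Labour",
    "current_survey": "Conservative",
}
print(best_past_vote(panellist))
```

Here the answer given just after the election is used in preference to the present-day recollection, which is the point of the ordering: the closer the answer was recorded to the vote itself, the less room there is for false recall.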

Correcting for which constituency the parties are standing in

The only change that we made during the campaign – and by change, we really mean additional stage – is only showing the parties that are standing in each constituency. Our system identifies which constituency a participant lives in and then shows the parties that we know are standing in that constituency. This change was brought in once nominations had closed and we had a firm list of where the Brexit Party were and weren’t standing.

To maintain a comparable dataset, we still ask the broader national question around which party you are planning to vote for in the general election. However, we follow this up with another question explicitly stating that these are the only parties standing in their constituency and asking who, on that basis, they would plan to vote for.
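As a rough sketch of the follow-up question, the list of options shown can be built by filtering the national list against the candidates standing locally. The constituency names and candidate lists below are placeholders invented for illustration; in practice they would come from the published nominations.

```python
# National answer options, in the order they would be shown.
NATIONAL_OPTIONS = ["Conservative", "Labour", "Liberal Democrat", "Brexit Party", "Green"]

# Hypothetical nominations data keyed by constituency.
STANDING = {
    "Constituency A": ["Conservative", "Labour", "Liberal Democrat"],
    "Constituency B": ["Conservative", "Labour", "Liberal Democrat", "Brexit Party"],
}

def options_for(constituency):
    """Return only the parties standing locally, keeping the national order."""
    standing = set(STANDING[constituency])
    return [party for party in NATIONAL_OPTIONS if party in standing]

for name in STANDING:
    print(name, options_for(name))
```

A participant in "Constituency A" would see the Brexit Party in the national question but not in the constituency-specific one, which is the situation the follow-up question is designed to capture.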

Keeping both questions proved to be useful. When we first included the constituency-specific question it appeared as though the Conservative lead had shot up to 19 points. In reality no votes had actually shifted week-on-week, but for the first time in hundreds of Conservative seats potential Brexit Party voters were offered no option but to vote for one of the other main parties.

In 2017 we also suffered from larger vote shares being given to smaller parties, because we did not correct for the fact that in hundreds of seats those parties weren’t standing.

All in all these changes hopefully mean that:

We have an accurate understanding of the makeup of the electorate as well as the resident population.
We use the best possible demographic criteria for getting headline voting intention to accurately reflect previous voting patterns without simply weighting to some form of overtly political criteria outside of past vote.
We can account for the fact that votes cast at UK general elections are an aggregate of 650 constituency battles, not just one national poll.
