Polls in the US elections: what went wrong?
I wake up the day after election day knowing, as I grab my phone, that my morning will go one of two ways. I never appreciate the power of a poll until BBC News loads and my expectations are tested against the real result, not the predicted one.
The majority of articles I read following the US Presidential Election on 3 November 2020 label the polls carried out by well respected polling agencies as “wrong”. “Wrong” depends on your viewpoint. First, I’ll tell you how wrong they were in comparison to the national popular vote and in predicting the Electoral College.
I’ll consider the poll aggregator of the well-respected political science and statistics website FiveThirtyEight, developed by US statistician Nate Silver. Silver had been impressively accurate with his US election predictions in the past, predicting 49 out of 50 states correctly in 2008 and all 50 states in 2012. That is until 2016, when Silver got only 45 states right and allocated 70 Electoral College votes incorrectly.
In 2020, it looks like FiveThirtyEight has predicted 47 of the 50 states correctly, if we suppose the party currently leading in each of the remaining states wins its electoral votes. However, the polling models for 2020 are embarrassingly similar in one respect to those of 2016.
Donald Trump’s support was underestimated in both elections. Florida, along with Michigan, Ohio, and Pennsylvania, is one of the most Electoral College vote-heavy swing states. The colour Florida turns on the night is an important sign of how election night may go. In two of the last three Democratic defeats, if the Democrats had taken Florida’s 29 electoral votes they would have taken the White House. FiveThirtyEight gave Biden a 2.5-point advantage in Florida, but with 99% of votes counted, Trump took the state with a 3.3-point lead over Biden. The polls had been off by nearly 6 percentage points on the margin.
In other states, at first glance the polls were correct. Biden won both Wisconsin and Michigan, but the polls overestimated the Democrats’ lead by 7.7 and 5.5 percentage points, respectively. Nate Silver did say the election would be a “fine line between a landslide and a nail biter”, and that’s maybe one of the most accurate of Silver’s predictions.
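The size of these misses is easier to compare when each one is written as an error on the margin. The short Python sketch below does that for Florida from the two figures quoted above, and lists it next to the Wisconsin and Michigan errors as stated. The function name `margin_error` and the sign convention (positive means Biden ahead) are my own choices for illustration.

```python
# Margins in percentage points; positive means Biden ahead (an assumed convention).
florida_forecast = 2.5   # FiveThirtyEight: Biden +2.5
florida_result = -3.3    # result with 99% counted: Trump +3.3


def margin_error(forecast: float, result: float) -> float:
    """Points by which the forecast overstated Biden's margin."""
    return round(forecast - result, 1)


florida_error = margin_error(florida_forecast, florida_result)

# Wisconsin and Michigan errors are taken directly from the article text.
errors = {"Florida": florida_error, "Wisconsin": 7.7, "Michigan": 5.5}

for state, error in sorted(errors.items(), key=lambda item: item[1], reverse=True):
    print(f"{state}: Democratic lead overstated by {error} points")
```

Seen this way, Florida is not an outlier: the polls leaned the same direction in all three states, and only in Florida was the lead small enough for that lean to flip the winner.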
How do you think polling agencies make their predictions? Frustratingly for them, they can’t read minds. Let’s pretend to be a pollster in the US election. First, we need to collect some responses from the electorate. Suppose pre-election polling gave us 500 responses, of which 255 would vote for Biden and 243 would vote for Trump.
However, you need to consider how many members of the electorate did not respond. Statistician Andrew Gelman argues that response rates are as low as one in 100 people for some polls, so let me rephrase the data we’ve collected. At that rate, we would have had to contact 50,000 people to get our 500 responses. Now, out of 50,000 people, we know 255 would vote for Biden and 243 would vote for Trump, and we don’t know how the 49,500 who did not respond will vote.
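To see how little the 500 responses pin down even before non-response is considered, here is a minimal Python sketch of the textbook margin of error for a sample share. It assumes the respondents are a simple random sample and uses the usual 95% multiplier of 1.96; both are assumptions for illustration, not a description of how any real pollster weights its data. The helper `share_with_margin` is a name I have made up.

```python
import math

responses = 500
biden_votes, trump_votes = 255, 243


def share_with_margin(count, n, z=1.96):
    """Return (share, margin of error), assuming simple random sampling."""
    share = count / n
    margin = z * math.sqrt(share * (1 - share) / n)
    return share, margin


biden_share, biden_moe = share_with_margin(biden_votes, responses)
trump_share, trump_moe = share_with_margin(trump_votes, responses)
lead = biden_share - trump_share

print(f"Biden: {biden_share:.1%} +/- {biden_moe:.1%}")
print(f"Trump: {trump_share:.1%} +/- {trump_moe:.1%}")
print(f"Lead:  {lead:.1%}")
```

Under these assumptions the lead is smaller than the margin of error on either candidate's share, so the sample alone cannot separate the two. This margin only covers random sampling error; it says nothing about the people who never answered.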
Who picks up the phone to talk to a pollster is not random. Independent pollster Richard Baris finds that highly educated voters are much more likely to talk to pollsters than other members of the electorate. Republican pollster Frank Luntz also argues that Trump voters are much less keen to respond to polls as “that would help the pollster manipulate them”.
Extrapolating the results of the polling sample to the wider population relies on comparing the predictive variables and information pollsters collect with historical data. Presidential elections happen every four years, so over 20 years you have a maximum of five historical data sets to use to predict the next election. Additional variables, such as the success of Covid-era campaigning and the influence of technology on voters, are unquestionably important but nevertheless difficult to ground in historical data.
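A small gap in willingness to respond can go a long way. The Python sketch below works out the share a poll would be expected to show for a candidate when the two sides' supporters respond at different rates. The response rates are hypothetical numbers chosen for illustration, not estimates from Baris, Luntz, or any real poll, and `observed_share` is a made-up helper.

```python
def observed_share(true_share: float, rate_a: float, rate_b: float) -> float:
    """Expected poll share for candidate A among respondents.

    true_share: A's real share of the two-candidate electorate.
    rate_a, rate_b: response rates of A's and B's supporters.
    """
    a_respondents = true_share * rate_a
    b_respondents = (1 - true_share) * rate_b
    return a_respondents / (a_respondents + b_respondents)


# Hypothetical scenario: the electorate is split exactly 50/50, but
# A's supporters answer 1.2% of the time and B's only 1.0% of the time.
poll_share = observed_share(0.5, 0.012, 0.010)
print(f"Poll shows A on {poll_share:.1%} in a race that is really tied")
```

In this made-up scenario a tied race shows up as a comfortable lead for the side whose supporters are slightly more willing to answer, and collecting more responses does not shrink the gap.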
So the question is, are polls useful? I think it depends on how big a pinch of salt you take to season your poll viewing. As government professor Chris Edelson says, polls are “a probability, not a prediction and certainly not certainty”. I would argue that polling for this election had less relevant historical data to rely on than pollsters would have liked. An election during a global pandemic is unprecedented, so to what extent can you expect a poll to be “correct”?
My statistics lecturer mentions British statistician George Box’s words when we’re modelling data. He tells us to remember that “all models are wrong, but some are useful”. After 40,000 election simulations, FiveThirtyEight gave Biden an 89 in 100 chance of winning, and he did win. To me, they made the polling data pretty useful; only you can decide whether you agree with me.