You may have noticed that this was a rough year for polling and election forecasting.
As I pointed out in my previous post, polls significantly under-estimated Donald Trump's level of support, especially at the state level. As a consequence, the short-range projections by a number of forecasters missed the mark and indicated that Clinton would likely win, and quite handily at that.
Of course, we know at this point that's not what happened.
The question that inevitably comes up in hindsight is, "What did we miss?"
The polling community is already trying to figure that out, at least from the perspective of how the polls could have been so wrong. But for those of us in the forecasting community who are dependent on the polls, that is of little consolation. Unless the data going into the model is accurate, whatever comes out won't be. Figuring out why the data was flawed still doesn't change the fact that we were wrong in our forecast.
But what if we could have seen the error coming? What if we had been able to incorporate that into our models? Would that have made a difference?
Well, as it turns out, there's evidence that maybe we could have seen it coming. Not necessarily the structural failures in the polling itself, but the pro-Trump wave that the polling might have been missing.
I mentioned Helmut Norpoth in my previous post. Norpoth received a great deal of attention in the last weeks of the campaign for his forecast that showed Trump would win. He wasn't the only one. Alan Abramowitz is another forecaster whose model predicted a Trump victory. Many have pointed out that both Norpoth and Abramowitz were technically wrong in their forecasts because they predicted that Trump would win the popular vote, and Clinton is likely to be the popular vote winner. Even so, their forecasts pointed in a direction that very few others did.
So, what did their models do that others' didn't? All three (Norpoth has two: one that came out with a forecast not long after the 2012 election, and another that generated its forecast in February) have a very basic premise: It's just difficult for a party to hold on to the White House for a third consecutive term.
In my previous post, I mentioned another model that I created last year that also showed a potential Trump victory. Unfortunately, for reasons I explained in that post, I set it aside in favor of the one I've been using with Tom Holbrook for the last four elections. One key difference between the two is that the one I came up with also models what Abramowitz has called the "two-term penalty": support for the incumbent party tends to go down after two terms as people become more likely to desire change.
What that tells us, or should have, is that there was likely a core of increased support for the Republican candidate despite what the polls showed. Many of us refused to see it. Abramowitz doubted his own forecast because of what he called the "Trump effect." To be sure, there does appear to be some evidence that Trump under-performed what another Republican might have been able to do. That may help explain why he failed to win the popular vote despite Norpoth and Abramowitz's forecasts that he would.
It got me thinking: what if we tweaked our model? What if I added a variable for the two-term penalty, just like I'd done in my Long-Range model? Would it have made a difference?
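To make the idea concrete, here is a minimal sketch of how a two-term penalty could enter a linear forecast as a simple 0/1 indicator. This is not our actual model: the function names, the single polling input, and every coefficient below are placeholders I made up for illustration, not estimates from any real data.

```python
def two_term_penalty(consecutive_terms: int) -> int:
    """Return 1 if the incumbent party has already held the White House
    for two or more consecutive terms, otherwise 0."""
    return 1 if consecutive_terms >= 2 else 0


def forecast_incumbent_share(poll_share: float,
                             consecutive_terms: int,
                             intercept: float,
                             poll_coef: float,
                             penalty_coef: float) -> float:
    """Linear forecast of the incumbent party's vote share (in percent).

    The penalty coefficient only applies when the indicator is 1.
    """
    penalty = two_term_penalty(consecutive_terms)
    return intercept + poll_coef * poll_share + penalty_coef * penalty


# Hypothetical coefficients, chosen only so the arithmetic is easy to follow.
INTERCEPT = 0.0
POLL_COEF = 1.0
PENALTY_COEF = -2.0

# Same hypothetical polling number, different incumbency histories.
after_one_term = forecast_incumbent_share(52.0, 1, INTERCEPT, POLL_COEF, PENALTY_COEF)
after_two_terms = forecast_incumbent_share(52.0, 2, INTERCEPT, POLL_COEF, PENALTY_COEF)

print(after_one_term, after_two_terms)
```

With these made-up numbers, the indicator leaves the forecast untouched when the incumbent party is defending a single term and shifts it down by the penalty coefficient when the party is seeking a third. In a real model the coefficient would be estimated from past elections alongside everything else.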
If we had used this model to forecast the outcome in place of our usual September model, we still would have gotten the outcome wrong, but it would have been much closer to what actually happened. The table below shows the side-by-side comparison of the two projections, along with what will likely be the actual outcome (according to the unofficial results as they've been reported on November 13).
As you can see, this simple adaptation of the model would have generated a more accurate prediction of the national popular vote, coming within 1 percentage point of the actual result. As the official results get reported, we may very well find that the projection from this new model is even closer. We still would have gotten the Electoral College vote wrong, because the model shows Clinton winning more than 270 Electoral Votes, but it would have been closer to the actual result, and the associated win probability would have been a more accurate representation of how close the election actually turned out to be. [Edit: The table has been changed to show the official results, which show that both the original model and the revised model were very close to the actual popular vote result.]
This still shows that the model is quite vulnerable to the pervasive error that plagued the polls this year, but with a simple addition we could have at least moderated some of its effects. It suggests there was support for Trump out there that the polls simply didn't catch, and that support was at least somewhat predictable. Once the official results come in, we'll most likely regenerate the model with the new adaptation and use it to generate our forecast for 2020. With the Republicans defending just a single term that year, it won't have much of an effect. The first true test of this new model will come in 2024 at the earliest.
Come back then. I'll still be here, looking at numbers, because it's what I do.