Editor-in-Chief’s Note: The jolliest part of launching and editing HDSR? The sheer freedom to try the inconceivable—or perhaps even the impermissible. A Murder on the Nile or Murder by Decree theme for a data science journal? Why not! After all, the suspense and nail-biting tension stirred up by election polling numbers rival these fictional whodunits. But just how much trust can we place in those numbers? I tackled such questions in Meng (2018), uncovering the sobering fact that opinions from 2.3 million respondents fared no better at prediction than a simple random sample of 400—when free from the selection biases that blindsided us in 2016.
So, I’m thrilled and grateful to Michael Bailey for blending technical prowess with witty humor, crafting a mystery that transforms polling angst into intrigue. Here’s to better preparing HDSR’s broadest readership for another election roller coaster, while also channeling our anxieties into a curiosity about how polls are staffed—and stuffed.
Keywords: public opinion, polling, surveys, 2024 presidential election, mystery
Survey research is hard enough given low response rates and rapidly changing technology and methods. Add to this the stress of a highly contested election, and many in the survey field are exhausted. We thought it would be a nice change of pace to bring these challenges to life in the form of an Agatha Christie–style mystery. The characters are a bit over-the-top, and of course none of the characters are based on actual individuals. Scratch at the story a bit, though, and you will see glimmers of real challenges and ironies in polling today. For good qualitative overviews of these challenges, see Leonhardt (2020) and Morris (2022). For more specialized but still accessible discussions, see Bailey (2023) and Clinton et al. (2021). On data quality in nonprobability polls, see Mercer et al. (2024).
The setting is an English manor. Elegant, but not ostentatious. A library with a working fireplace and, of course, seating for 12. It is now a B & B, unusually full for the first week of November.
The show opens with our pollsters straggling in for breakfast.
The probability pollsters arrive first. They are an attractive middle-aged couple. Thin. Botox, but not too much. They are dressed smartly. You can’t buy their clothes in the airport—even at the nicer stores. You need to know someone to dress like them.
Everyone knows that the probability pollsters follow the Science of Polling. They named their firstborn ‘N.’ On Ash Wednesday they leave margins of error on their foreheads all day.
Their surveys aren’t based on talking to just anybody. They only talk to the right people—people they have randomly contacted from a list of all voters. Getting their samples this way is expensive, to be sure, but do it well or not at all—am I right?
The probability pollsters remember your name and ask about your children (‘finding themselves’). You don’t hear about their children until later in the conversation (Yale, since you asked).
Catch them at the right time, though, and you sense that they are haunted by dark secrets. Yes, they contact people randomly as the Book of Polling prescribes. But will anyone notice that hardly anyone responds? In May they told us Trump would romp to victory after they heard from 4,000 people in six battleground states. If they had a truly random sample—meaning that they randomly picked 4,000 voters, called them, and they all answered—their estimate would be remarkably accurate.
In fact, the probability pollsters called 400,000 people to get their 4,000 respondents. They know that the 1% who responded are weird in that they answer calls. What if this 1% is also weird in the way that they view politics? The Book of Polling is silent. Science does not answer their prayers.
In a panic, the probability pollsters calculated confidence intervals as if 100% of their random sample responded. Will anyone find out about the 396,000 nonrespondents? That’s a lot of skeletons, even for their massive his-and-hers walk-in closets.
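For readers keeping score at home, the margin of error the probability pollsters report is computed as if those 4,000 respondents were a clean random draw. A minimal Python sketch, using the story’s own numbers, shows the textbook formula—and what it is silent about:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion, assuming a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# The formula sees only the 4,000 respondents...
print(f"Reported margin of error: +/- {margin_of_error(4000):.1%}")  # about +/- 1.5%

# ...and says nothing about the 396,000 who never picked up. If nonrespondents
# differ systematically from respondents, the bias can dwarf this number.
```

The confidence interval is a statement about sampling variability only; the skeletons in the closet are a bias problem that no amount of n can shrink.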
Next to arrive is the nonprobability pollster. A morning person, he is dressed crisply and moves athletically. Short hair, sculpted with a bit of gel. He looks like he can get a cat out of a tree. It’s not that he thinks that the probability pollsters are wrong or bad. It’s just that who has that kind of money? He gets it done for a tenth of the price.
How does he do this? The first step is that he’s got a buddy who can get him some respondents. Instead of calling hordes of people who don’t want to be bothered, this guy knows where to find folks jonesing to answer polls. And where is that? This is a bit of a touchy subject. It’s not like his buddy gets his respondents from some guy in a Buick under the Williamsburg Bridge, OK?
No, this isn’t shady at all. His buddy gets his stuff from pop-up ads on the internet.
The nonprobability pollster likes that he can get data pretty much whenever he wants it. But, yeah, it’s not always clean. He’s seen some things. He’s seen samples where the respondents were two-thirds women or mostly college graduates. He even saw a sample with only one Black guy under 25—and a Trump-supporting Black guy under 25 no less.
There are other rumors about his supply. Some respondents in nonprobability polls fake their demographics to get the reward for responding. Other respondents answer surveys as a hobby. Like a lot. Sure, that’s weird, but a respondent is a respondent, and our nonprobability pollster has a business to run.
The nonprobability pollster’s métier is that he scrubs his data by running it through weighting algorithms. These algorithms can get tricky. Thankfully, his intern handles the code. If the survey sample runs low on young Hispanics, double-count young Hispanics in the final numbers. Too many old people in the survey sample? Cut them back a bit.
Does weighting make our nonprobability pollster a criminal? You can’t get sausage any other way. Even Mr. and Ms. Probability weight their data. Whether it’s charcuterie from The New York Times or a Sausage McMuffin from an online pollster, everyone in the business knows that if it’s ‘nationally representative,’ it has been weighted.
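Since weighting will recur throughout our mystery, here is a minimal sketch of the basic move, with invented shares: respondents from groups that are underrepresented in the sample get weights above one, and overrepresented groups get weights below one.

```python
# Made-up shares for illustration: the sample skews old, the weights push back.
sample_shares = {"18-29": 0.10, "30-64": 0.50, "65+": 0.40}  # who responded
target_shares = {"18-29": 0.20, "30-64": 0.55, "65+": 0.25}  # the electorate

# Weight = target share / sample share: underrepresented groups count extra.
weights = {g: target_shares[g] / sample_shares[g] for g in sample_shares}

for group, w in weights.items():
    print(f"{group}: each respondent counts as {w:.2f} people")
```

Real pollsters weight on many variables at once (often by raking), but the sausage-making is the same: some respondents count as more than one person, others as less.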
The din of light conversation is broken by the arrival of a heavily bearded partisan Republican pollster. He’s not tall, but he takes up a lot of space. ‘Thick,’ some would say. The partisan Republican pollster will be the first to tell you he is a data guy. But it’s not just data. You have to know the American people, as he has been known (by all who know him) to say. He predicted two out of the last zero Republican takeovers of the Senate. He predicted both of Trump’s victories.
The Republican pollster had some run-ins with the law. He did some time after his Red Wave failed to appear in 2022. He got rung up in 2018 with a bit of inflated numbers for the GOP. He unskewed some polls for Romney in 2012, although that was expunged. But looking—and, if we’re honest, smelling—guilty isn’t always the same as being guilty. In 2016, people said he was cooking his books when he predicted good things for Trump. In 2020, he was wrong to say Trump would win, but his numbers were closer—by a lot, actually—to the final results than the drivel that came from all the do-gooder types.
His secret sauce is simple: nerds respond to polls and nerds are Democrats. So, take whatever data you get and add 4 percentage points to Trump’s numbers. That’ll pretty much get you the right answer—especially in real America.
Moments later, the partisan Democratic pollster joins the group. Sandals and cargo pants. Tank top, with tattoos peeking out. Single and childless, she recently began hiding the fact that she has a cat. Two cats, really, if you count ‘Karen,’ the weirdly named cat her ex left her to babysit a few years ago. Her special skill is sussing out funny business in the cross-tabs; there is no way Black men like the big orange fraud. This helps her focus on serious polls. She laughed when her friends gave her a coffee mug that said, ‘My daily dose of Hopium.’
She doesn’t seem like a killer. But no one can deny there were bodies in the hotel room after she checked out in 2016 and 2020.
Finally, Mr. Holmes, Miss Marple, and M. Poirot join the group. As it happens, each came separately on holiday for some well-deserved R & R.
Before we cut the scene, it is a bit of a shock to realize that we missed someone. He’s been here for a while. (Maybe even all night?) During breakfast he has been on his laptop in the corner. He’s easy to miss because he mostly stays out of the conversation. He ate 6/7 of his eggs and 3/8 of his toast, as if this were some kind of ideal ratio. He’s on the smaller side, wearing a T-shirt and old shorts. They call him ‘The Modeler.’ If Nate Silver were human, this is what he would look like.
He doesn’t think much of the others. He’s unimpressed by the probability pollsters and their 1% response rate. He is unconvinced that the people in the nonrandom samples are typical in the ways they need to be. And, of course, he recognizes the squalor of the partisan pollsters.
The Modeler’s trick is to average all polls. It’s clever: maybe people who answer probability polls run liberal and people who click on pop-up ads run conservative. Average it out and it works great. Unless that’s not how the biases go, as happened in 2016 in the Midwest, in 2018 in Maine, in 2020 in the Midwest, in Brazil in 2022, and …
‘Stop!’ cries the Modeler. ‘If I was wrong, would I have this big-ass boat?’ He’ll keep listening to his computer, thank you very much.
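The Modeler’s logic can be put in a few lines. The numbers below are invented, but they show why averaging rescues you only when the polls’ biases offset—and quietly averages the error when they don’t:

```python
true_trump_share = 0.50  # invented ground truth

# Scenario A: one poll leans left, another leans right; the errors cancel.
offsetting = [0.47, 0.53]
print(sum(offsetting) / len(offsetting))   # about 0.50 -- averaging works

# Scenario B: every poll misses in the same direction (2016, 2020, ...).
correlated = [0.46, 0.47, 0.48]
print(sum(correlated) / len(correlated))   # about 0.47 -- averaging the error
```

Averaging reduces noise, not bias: when the errors are correlated across polls, the average inherits them.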
A good mystery requires a good crime. And here, dear reader, you may have noticed a glaring omission in our whodunit: there is no it for a who to have dun.
And that is the key to our story. We feel a crime is coming. Poirot, Marple, and Holmes don’t check in to any old B & B. We can be pretty sure that late on the night of November 5, there will be a scream in the dark.
And yet, what will be the crime?
At this point, we can only speculate. But we have looked into the heart of polling enough to know the darkness therein. One shudders to think what might happen.
In this scenario, the B & B guests awake Wednesday morning to find that Trump has exceeded expectations at the ballot box. We saw this in 2016. We saw it again in 2020, although Trump didn’t exceed expectations enough to keep the Diet Cokes flowing at 1600 Pennsylvania Avenue.
In this scenario, Trump does it again.
Dr. Watson—who has shown up late after some trouble at the airport with Holmes’s prescriptions—is quick to offer a theory. It’s a rather pedestrian theory, but it fits the facts: nonresponse killed the polls during the sampling stage.
‘Some people are disinclined to respond to polls,’ he says. ‘It’s not that they are embarrassed; it’s just that talking to strangers about politics is not their thing. Midwestern White people without college degrees in particular tend not to respond to polls. If the Midwestern noncollege White people that pollsters heard from were typical of all Midwestern noncollege White people, then a little weighting would fix the problem. But the problem in 2016 and 2020 seemed to be that the Midwestern noncollege White people who talk to pollsters are more barista and less bricklayer, meaning that they weighted up responses from an overly Democratic response pool.’
The Republican pollster jumps in, ‘That’s exactly what I said would happen…’
‘In 2022, mon ami,’ Poirot interrupts archly. Clearly, the partisan pollster’s theory cannot explain every election.
Holmes breaks his silence, suggesting a new weapon: weighting by party.
‘We know pollsters clean their data by weighting them to national benchmarks. Some of these benchmarks are straightforward: age, gender, race, education. They don’t want too many old or educated people in their data, so they up and down weight observations in the survey sample to make the sample look like the electorate.’
‘Doing so is harder than it sounds. We might know the percentage of the population that is over 60, but what will be the percentage of the electorate that is over 60? And even if they pull off simple demographic weights, the sample could be skewed in other ways. A sample with too many Democrats and too few Republicans will be biased toward Harris. A sample with too many Republicans and too few Democrats will be biased toward Trump.’
‘Hence, many pollsters weight by party. They set targets for the distribution of each party and up weight and down weight observations based on party so that the weighted sample has the specified partisan breakdown.’
Holmes pauses, adding credibility to the idea that weighting by party is hard. He does some math in his head (impressively flashed on screen while he’s talking) and notes that the percentage of people who are Democrats is neither set in stone nor known to the pollster.
Poirot chimes in to note that weighting by party is particularly tricky because so many people do not identify with a party—even though they may vote as if they do. This is particularly true for young people who might reliably vote for Democrats or Republicans but who tell pollsters they are independent. They get put in the independent bucket for weighting.
Feeding off Poirot, Holmes speaks, as if in a trance, ‘Suppose the Harris brat summer has energized Democrats so that they are more likely to respond such that a survey sample has “too many” Democrats compared to what the pollster has decided Democrats “should” get.’ [Note to director: have Holmes manically lean into air quotes here].
‘The overrepresentation of Democrats who say they are Democrats will be soaked up by the weights. The overrepresentation of Democrats who say they are independents will not be soaked up by the weights, skewing polls toward Harris.’
‘Old crime, but new weapon.’
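Holmes’s conjecture can be checked with a stylized calculation—every number below is invented for illustration. Weighting by stated party ID restores the labeled Democrats to their target, but the Democratic-leaning ‘independents’ ride along inside the independent bucket, untouched:

```python
# True electorate (by label): 40% D, 40% R, 20% "independent" -- and suppose
# half of those independents reliably vote Democratic (true race: 50-50).
targets = {"D": 0.40, "R": 0.40, "I": 0.20}

# An energized-Democrat response pool: labeled Democrats over-respond,
# and so do the Democratic-leaning "independents."
sample = {"D": 0.48, "R": 0.34, "I": 0.18}
dem_share_of_sample_inds = 0.65   # vs. 0.50 among all independents

# Party weights fix the labels...
w = {g: targets[g] / sample[g] for g in sample}

# ...but independents are up- or down-weighted as one bloc, lean and all.
harris = w["D"] * sample["D"] + w["I"] * sample["I"] * dem_share_of_sample_inds
trump = w["R"] * sample["R"] + w["I"] * sample["I"] * (1 - dem_share_of_sample_inds)
print(f"Weighted poll: Harris {harris:.0%}, Trump {trump:.0%} (truth: 50-50)")
```

The weighted poll still shows Harris ahead: the weights cured the visible overrepresentation and left the hidden one in place. Old crime, new weapon.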
As the credits roll, the detectives pass around scones in the garden while uniformed police officers guide the handcuffed nonprobability pollster’s intern into a squad car.
In this scenario, the B & B guests awake on Wednesday morning to find that Harris outperformed the polls. We haven’t seen an outcome like this before, so it is a bit of a shock.
The partisan Republican pollster immediately bellows, ‘Fraud! Stolen election!’
The Modeler pivots quickly, tweeting, ‘I knew that Harris had a 100% chance of winning when I said she had a 17% chance of winning. Sad that haters don’t understand probability.’
The other pollsters are puzzled. Miss Marple gently pulls them aside. ‘This is hard for you. I know you said that 2016 and 2020 had a pro-Democratic bias.’ She sips her tea. ‘Was it because you had more Democrats or because you had more people engaged with politics?’ The question lingers in the air for a moment.
She goes on to explain. ‘In 2020, Trump ran even with Biden among those who were not interested in politics but was 20 percentage points behind among those who said they were very interested in politics. If the survey respondents were—as most people suspect—more likely to be interested in politics than nonrespondents, that would explain why that survey showed Biden with a much larger lead than he actually had.’
‘What if the bias in polls is not that samples have too many Democrats, but that they have too many people who are eager to talk about politics? In 2020, the people who wanted to talk about politics were more liberal, but in 2024, the people who wanted to talk about politics were more conservative. When Biden was the nominee, Democrats would rather talk about bunion pus than politics. This likely changed when Harris took over, but perhaps bias remained.’
‘And,’ Marple continues, ‘there could even be bias within the bias. We know that polls tend to overrepresent politically engaged people. The extent of bias in topline presidential numbers reflects whether these biases balance or tip toward one party. In 2020, it appears that in some subgroups—noncollege White people, for example—the kind of people who wanted to respond ran hotter for Democrats than Republicans, making the polls more positive toward Democrats than reality. For much of 2024, the progressives have been underwhelmed by their party, viewing the Biden administration as too cautious on climate and too aggressive on Palestine. Whereas polls typically have too many strong Democratic partisans, perhaps they have had too few in 2024.’
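Marple’s arithmetic is worth writing out. With her illustrative figures (the engagement shares below are invented), the topline lead is just a weighted mix of the two engagement groups, so the poll’s answer depends entirely on which group over-responds:

```python
# Marple-flavored numbers: Biden +0 among the politically uninterested,
# Biden +20 among the very interested.
biden_lead = {"uninterested": 0.00, "very_interested": 0.20}

def topline(share_very_interested):
    """Overall Biden lead as a mix of the two engagement groups."""
    s = share_very_interested
    return (s * biden_lead["very_interested"]
            + (1 - s) * biden_lead["uninterested"])

# Say 40% of actual voters are very interested, but 70% of respondents are.
print(f"Electorate topline:  Biden +{topline(0.40):.0%}")
print(f"Respondent pool:     Biden +{topline(0.70):.0%}")
# Same voters, same opinions -- the gap comes purely from who answers.
```

And the sign of the bias flips whenever the eager-to-talk crowd flips parties, which is exactly Marple’s point about 2024.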
In short, nonresponse has struck again, but with a new weapon and a new victim.
The scene ends with Poirot and Marple pepper spraying the Republican pollster as he attempts to scale the wall of the manor.
We save the craziest scenario for last. What if pollsters spend months hearing from the 1% of the most politically engaged Americans plus the online polling hobbyists, weighting them to mythical national benchmarks, reporting fantastical levels of precision and then … and then, they get it right?
In this scenario, the crime is that there is no crime.
On Wednesday, Miss Marple, Mr. Holmes, and M. Poirot sleep in, trundle off to a rose garden, and end the day with a light dinner on the patio.
That would be the biggest twist of them all.
Michael A. Bailey has no financial or non-financial disclosures to share for this article.
Bailey, M. A. (2023). A new paradigm for polling. Harvard Data Science Review, 5(3). https://doi.org/10.1162/99608f92.9898eede
Clinton, J., Agiesta, J., Brenan, M., Burge, C., Connelly, M., Edwards-Levy, A., Fraga, B., Guskin, E., Hillygus, D. S., Jackson, C., Jones, J., Keeter, S., Khanna, K., Lapinski, J., Saad, L., Shaw, D., Smith, A., Wilson, D., & Wlezien, C. (2021). Task Force on 2020 Pre-election Polling: An evaluation of the 2020 general election polls. American Association for Public Opinion Research. https://aapor.org/wp-content/uploads/2022/11/AAPOR-Task-Force-on-2020-Pre-Election-Polling_Report-FNL.pdf
Leonhardt, D. (2020, November 12). ‘A black eye’: Why political polling missed the mark. Again. The New York Times. https://www.nytimes.com/2020/11/12/us/politics/election-polls-trump-biden.html
Mercer, A., Kennedy, C., & Keeter, S. (2024, March 5). Online opt-in polls can produce misleading results, especially for young people and Hispanic adults. Pew Research. https://www.pewresearch.org/short-reads/2024/03/05/online-opt-in-polls-can-produce-misleading-results-especially-for-young-people-and-hispanic-adults/
Morris, G. E. (2022). Strength in numbers: How polls work and why we need them. W. W. Norton and Company.
©2024 Michael A. Bailey. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.