Why & How Pollsters Weight, Part II

October 7, 2004 | October 1, 2019 | Mark Blumenthal

In my last post on party identification, I promised to turn next to pollsters who weight by party. I’d like to finish up this thread with a discussion today of those who weight on party using exit polls and another post to follow on some middle-way alternatives.

First, a bit of review: In the last post, I introduced what we might call the “purist” model of weighting. Most of the surveys done by national media outlets mathematically weight (or adjust) their samples of adults to match known demographic population estimates available from the U.S. Census. These organizations do not, however, adjust the number of Democrats and Republicans in their samples. The surveys that use this weighting philosophy include CBS/New York Times, ABC/Washington Post, Gallup/CNN/USA Today, Time, Newsweek, Pew, Annenberg, and some others I didn’t mention last time: SurveyUSA, the American Research Group (ARG), YouGov/Economist, and Fox News/Opinion Dynamics (although Fox weights by demographics after screening for registered or likely voters). [Note to lurking pollsters: if your organization is not on this list but should be, please email me.]

The “purists” never weight by party identification because, unlike true demographics, party ID is an attitude that can change at the individual level. A study by the Pew Foundation found that 16% of individual respondents changed their party leanings between two successive interviews, one just before the 1988 election and the second roughly a week later. The individual changes translated into an overall shift from a 33%-33% split to a 35%-30% Democratic advantage. As a result of this potential for change, the National Council on Public Polls – an association of polling organizations that includes most of those listed above – describes weighting by party as “little better than a guessing game where the pollster is substituting his or her judgment for scientific method.”

But some pollsters choose to weight by party. Why?

One reason is that party identification, like anything estimated with a survey, can vary quite a bit due to ordinary sampling error (and remember, if we focus on the margin between Democrats and Republicans, the typical +/- 3-4% margin of error doubles to +/- 6-8%). Since party correlates strongly with vote preference (the recent CBS survey shows 93% of Republicans support Bush and 87% of Democrats support Kerry), this random variation has obvious consequences. We see results that seem to shift wildly, when much of the variation is random. As Charlie Cook put it in a recent column, “Pollsters acknowledge variances from one poll to the next in gender, race, income and education, and they correct for it, but refuse to acknowledge that partisan numbers fluctuate just the same, and need to be corrected.”
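To see where the "doubling" rule of thumb comes from, here is a minimal sketch of the arithmetic. It assumes a simple random sample; the sample size and party shares are hypothetical numbers chosen for illustration, not figures from any poll discussed here. The doubling is exact when the two groups make up the whole sample, and somewhat smaller when a third group (such as independents) is present.

```python
import math

def moe_share(p, n, z=1.96):
    """95% margin of error for a single share p in a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

def moe_lead(p1, p2, n, z=1.96):
    """95% margin of error for the lead p1 - p2 when both shares come from the same sample."""
    # Two shares from one sample are negatively correlated (multinomial sampling),
    # so Var(p1 - p2) = [p1 + p2 - (p1 - p2)^2] / n.
    return z * math.sqrt((p1 + p2 - (p1 - p2) ** 2) / n)

n = 1000  # hypothetical sample size

# Hypothetical cases: a pure two-way split, and a three-way split with independents.
for dem, rep in [(0.50, 0.50), (0.35, 0.35)]:
    single = 100 * moe_share(dem, n)
    lead = 100 * moe_lead(dem, rep, n)
    print(f"Dem {dem:.0%} / Rep {rep:.0%}: share +/- {single:.1f}, lead +/- {lead:.1f}")
```

Either way, a party ID "lead" is far noisier than either party's share taken alone, which is the point of the parenthetical above.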

Others have argued recently that many of the recent reported shifts in party ID are implausibly big, even allowing for sampling error. Yes, they say, party ID can change, but not that much. Let us set aside (just for now) the issue of whether the population of “likely voters” ought to change from week to week (I’ll get there soon…promise). Even among registered voters, a group whose population should remain relatively stable from week to week, the Gallup poll showed a shift from a nine-point Republican advantage (40% to 31%) just before the first Bush-Kerry debate, to a two-point Democratic advantage (38% to 36%) after (these numbers were apparently obtained directly from Gallup by The Left Coaster).

Ruy Teixeira and his frequent contributor, Emory University Political Science Professor Alan Abramowitz, have argued that what is changing is not individual attitudes as much as the willingness of Democrats and Republicans to be interviewed at any given time. Abramowitz put it this way in an email to me a few days ago:

I would expect that interest in the campaign would correlate with willingness to participate in a political poll. Even if there is a small difference on this between Dems and Reps, it could have a substantial impact on estimates of proportions of Dems and Reps in the electorate due to the very low overall response rates in these polls…Do I know that this is what’s going on? No. Is it at least as plausible as a real 10 point GOP advantage in party id as the pre-debate Gallup poll showed? I think so
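The mechanism Abramowitz describes can be sketched with a few lines of arithmetic. Every number below is an assumption invented for illustration (an evenly split electorate and made-up response rates), not an estimate from Gallup or any other poll. The sketch only shows how a modest gap in willingness to be interviewed can turn into a sizable gap in the party ID of completed interviews, with no individual changing his or her mind.

```python
def observed_party_id(true_shares, response_rates):
    """Party ID shares among completed interviews, given true shares and per-group response rates."""
    completes = {g: true_shares[g] * response_rates[g] for g in true_shares}
    total = sum(completes.values())
    return {g: c / total for g, c in completes.items()}

# Hypothetical electorate: the two parties are exactly tied.
true_shares = {"Dem": 0.35, "Rep": 0.35, "Ind": 0.30}

# Hypothetical response rates: Republicans are somewhat more willing to be interviewed this week.
response_rates = {"Dem": 0.20, "Rep": 0.25, "Ind": 0.20}

observed = observed_party_id(true_shares, response_rates)
for group, share in observed.items():
    print(f"{group}: true {true_shares[group]:.0%}, observed {share:.1%}")

gap = observed["Rep"] - observed["Dem"]
print(f"Apparent Republican advantage: {100 * gap:.1f} points")
```

Under these assumed rates, a tied electorate shows up in the sample as a Republican advantage of about eight points.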

It certainly seems plausible that low response rates (another topic I need to get to soon) may be working to exaggerate short-term shifts in party identification. It is obviously desirable to try to eliminate any such bias as well as the purely random changes in party identification that occur by chance. The question is, how?

Pollster John Zogby has championed one simple answer. After screening for “likely voters,” Zogby weights every poll he conducts to match the characteristics reported by the exit polls conducted in the most recent comparable election. In a recent column on his website, Zogby said he weights every national survey this year to the Party Identification result from 2000: 39% Democrat, 35% Republican and 26% Independent. “I know that to some pollsters I am a heretic,” Zogby recently told Reuters, “but I have found that weighting for party ID is a proven way of ensuring you have a proper sample.”
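Mechanically, weighting to a fixed party target looks something like the sketch below. The 39/35/26 targets are the ones Zogby cites; the sample counts and Bush support figures are hypothetical, and the sketch adjusts on party alone. It is not a reconstruction of Zogby's actual procedure, which also involves a likely voter screen and other characteristics.

```python
# Party ID targets Zogby cites from the 2000 exit polls.
targets = {"Dem": 0.39, "Rep": 0.35, "Ind": 0.26}

# Hypothetical unweighted sample of 1,000 likely voters:
# respondents in each party group and how many of them support Bush.
sample = {
    "Dem": {"n": 330, "bush": 30},
    "Rep": {"n": 400, "bush": 372},
    "Ind": {"n": 270, "bush": 130},
}

total_n = sum(g["n"] for g in sample.values())

# A respondent's weight is the target share divided by the sample share for his or her party.
weights = {p: targets[p] / (sample[p]["n"] / total_n) for p in sample}

weighted_n = sum(weights[p] * sample[p]["n"] for p in sample)
unweighted_bush = sum(g["bush"] for g in sample.values()) / total_n
weighted_bush = sum(weights[p] * sample[p]["bush"] for p in sample) / weighted_n

for p in sample:
    print(f"{p}: sample {sample[p]['n'] / total_n:.0%}, target {targets[p]:.0%}, weight {weights[p]:.2f}")
print(f"Bush support: unweighted {unweighted_bush:.1%}, weighted {weighted_bush:.1%}")
```

In this made-up sample, Republicans are over-represented relative to the 2000 target, so the weights pull Bush's share down by several points. Whether that counts as a correction or a distortion depends entirely on whether the 2000 target is still right.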

Regular readers will anticipate my skepticism of Zogby’s approach. If the purist model allows for too much variation in party identification, Zogby assumes too little. He forces every survey conducted during 2004 to conform to the snapshot of party from Election Day 2000, allowing no room for the possibility that this year may be different or that some individuals may alter their views of the parties — moving from, say, Democrat to Independent and back again — in the wake of a political convention or a debate. The reliance on past exit polls also has at least two more methodological challenges:

* Telephone surveys use an interviewer to read each question. Exit polls involve a paper questionnaire that respondents fill out by themselves. These different methodologies can lead to different results on many questions, including past voting behavior (although I know of no experimental research looking specifically at the impact on party identification).

* Exit polls do not include absentee or early voters in many states. Since absentee votes have been historically more Republican, their absence could make a difference of a point or two in the national party ID estimate.

Do these various pitfalls have practical consequence? Abramowitz, who supports the idea of weighting by party (though in fairness, prefers a different approach than Zogby’s), says no. “It is just not obvious that weighting by party id,” he writes in his email, “or some compromise like using a rolling weight based on combined samples over time, produces less accurate results than not weighting by party id.”

I’d argue that there is evidence to the contrary, at least regarding Zogby. In 2002, he “called” over 29% of his Senate and gubernatorial races for the wrong candidate, despite polling more races than all but one other company. Over the same period, the missed call rate for all of the pollsters combined was only 13% (hat tip to DalyThoughts for these statistics culled from an NCPP report on polling in the 2002 elections).

But Abramowitz mentions “a compromise” like a “rolling weight.” Is there a third way between the purist model and Zogby’s rigid use of old exit polls? I’ll take that up in Part III.

Correction: Abramowitz is now spelled correctly; my original version was in error.

[Continue with Why & How Pollsters Weight, Part III]
