
October 07, 2004

Why & How Pollsters Weight, Part II

In my last post on party identification, I promised to turn next to pollsters who weight by party. I'd like to finish up this thread with a discussion today of those who weight on party using exit polls and another post to follow on some middle-way alternatives.

First, a bit of review: In the last post, I introduced what we might call the "purist" model of weighting. Most of the surveys done by national media outlets mathematically weight (or adjust) their samples of adults to match known demographic population estimates available from the U.S. Census. These organizations do not, however, adjust the number of Democrats and Republicans in their samples. The surveys that use this weighting philosophy include CBS/New York Times, ABC/Washington Post, Gallup/CNN/USA Today, Time, Newsweek, Pew, Annenberg, and some others I didn't mention last time: SurveyUSA, the American Research Group (ARG), the YouGov/Economist surveys, and Fox News/Opinion Dynamics (although Fox weights by demographics after screening for registered or likely voters). [Note to lurking pollsters: if your organization is not on this list but should be, please email me.]

The "purists" never weight by party identification because, unlike true demographics, party ID is an attitude that can change at the individual level. A study by the Pew Foundation found 16% of the individual respondents changed their party leanings in two successive interviews, one just before the 1988 election and the second roughly a week later. The individual changes translated into an overall shift from a 33%-33% split to a 35%-30% Democratic advantage. As a result of this potential for change, the National Council on Public Polls - an association of polling organizations that includes most of those listed above - describes weighting by party as "little better than a guessing game where the pollster is substituting his or her judgment for scientific method."

But some pollsters choose to weight by party. Why?

One reason is that party identification, like anything estimated with a survey, can vary quite a bit due to ordinary sampling error (and remember, if we focus on the margin between Democrats and Republicans, the typical +/- 3-4% margin of error doubles to +/- 6-8%). Since party correlates strongly with vote preference (the recent CBS survey shows 93% of Republicans support Bush and 87% of Democrats support Kerry), this random variation has obvious consequences. We see results that seem to shift wildly, when much of the variation is random. As Charlie Cook put it in a recent column, "Pollsters acknowledge variances from one poll to the next in gender, race, income and education, and they correct for it, but refuse to acknowledge that partisan numbers fluctuate just the same, and need to be corrected."
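To make the sampling-error arithmetic above concrete, here is a rough sketch in Python. It assumes a simple random sample and the textbook multinomial variance for the difference between two shares taken from the same sample. The sample size and the party splits are hypothetical, chosen only for illustration.

```python
import math

Z_95 = 1.96  # normal critical value for a 95% confidence level


def moe_share(p, n, z=Z_95):
    """Margin of error for one share p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


def moe_gap(p_dem, p_rep, n, z=Z_95):
    """Margin of error for the gap (p_dem - p_rep) measured in the same sample.

    The two shares come from one multinomial sample, so they are negatively
    correlated: Var(gap) = [p_dem + p_rep - (p_dem - p_rep)^2] / n.
    """
    variance = (p_dem + p_rep - (p_dem - p_rep) ** 2) / n
    return z * math.sqrt(variance)


n = 800  # hypothetical sample size

single = moe_share(0.50, n)             # worst-case margin for one share
gap_two_way = moe_gap(0.50, 0.50, n)    # everyone is a Democrat or a Republican
gap_three_way = moe_gap(0.35, 0.35, n)  # 30% independents (hypothetical split)

print(f"one share:           +/- {single:.1%}")
print(f"D-R gap, no indep.:  +/- {gap_two_way:.1%}")
print(f"D-R gap, 30% indep.: +/- {gap_three_way:.1%}")
```

Under these assumptions the doubling is exact when the two parties split the whole sample evenly. With a sizable block of independents, the margin on the gap comes out somewhat under double, though still far wider than the margin on any single number.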

Others have argued recently that many of the recent reported shifts in party ID are implausibly big, even allowing for sampling error. Yes, they say, party ID can change, but not that much. Let us set aside (just for now) the issue of whether the population of "likely voters" ought to change from week to week (I'll get there soon...promise). Even among registered voters, a group whose population should remain relatively stable from week to week, the Gallup poll showed a shift from a nine-point Republican advantage (40% to 31%) just before the first Bush-Kerry debate, to a two-point Democratic advantage (38% to 36%) after (these numbers were apparently obtained directly from Gallup by The Left Coaster).

Ruy Teixeira and his frequent contributor, Emory University Political Science Professor Alan Abramowitz, have argued that what is changing is not individual attitudes as much as the willingness of Democrats and Republicans to be interviewed at any given time. Abramowitz put it this way in an email to me a few days ago:

I would expect that interest in the campaign would correlate with willingness to participate in a political poll. Even if there is a small difference on this between Dems and Reps, it could have a substantial impact on estimates of proportions of Dems and Reps in the electorate due to the very low overall response rates in these polls...Do I know that this is what's going on? No. Is it at least as plausible as a real 10 point GOP advantage in party id as the pre-debate Gallup poll showed? I think so

It certainly seems plausible that low response rates (another topic I need to get to soon) may be working to exaggerate short-term shifts in party identification. It is obviously desirable to try to eliminate any such bias as well as the purely random changes in party identification that occur by chance. The question is, how?
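The differential-response argument can be illustrated with a small sketch. Every number below is hypothetical: the "true" party split is held fixed, and only the share of each group willing to be interviewed changes. This is a toy calculation under those assumptions, not an estimate of what actually happened in any poll.

```python
def observed_shares(true_shares, response_rates):
    """Party mix among completed interviews, given each group's response rate."""
    completes = {g: true_shares[g] * response_rates[g] for g in true_shares}
    total = sum(completes.values())
    return {g: c / total for g, c in completes.items()}


# Hypothetical electorate whose party identification never changes.
true_shares = {"Dem": 0.35, "Rep": 0.35, "Ind": 0.30}

# Week 1: every group is equally willing to take the survey.
week1 = observed_shares(true_shares, {"Dem": 0.20, "Rep": 0.20, "Ind": 0.20})

# Week 2: Republicans are slightly more willing, Democrats slightly less.
week2 = observed_shares(true_shares, {"Dem": 0.18, "Rep": 0.22, "Ind": 0.20})

for label, shares in (("week 1", week1), ("week 2", week2)):
    gap = shares["Rep"] - shares["Dem"]
    print(f"{label}: Rep-Dem gap among respondents = {gap:+.1%}")
```

In this toy setup a two-point move in willingness on each side turns an even split into a gap of several points among respondents, with no individual changing party at all. What matters is the ratio between the groups' response rates, so the lower the overall rate, the more a given absolute difference distorts the mix.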

Pollster John Zogby has championed one simple answer. After screening for "likely voters," Zogby weights every poll he conducts to match the characteristics reported by the exit polls conducted in the most recent comparable election. In a recent column on his website, Zogby said he weights every national survey this year to the Party Identification result from 2000: 39% Democrat, 35% Republican and 26% Independent. "I know that to some pollsters I am a heretic," Zogby recently told Reuters, "but I have found that weighting for party ID is a proven way of ensuring you have a proper sample."
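Mechanically, weighting a sample to a fixed party target looks something like the sketch below. The targets are the 2000 exit-poll figures quoted above. The respondents, their vote choices, and the function names are made up for illustration. This is generic weighting to fixed targets, not a reconstruction of Zogby's actual procedure, which presumably involves other variables as well.

```python
from collections import Counter

# Party targets quoted above from the 2000 exit poll.
TARGETS = {"Dem": 0.39, "Rep": 0.35, "Ind": 0.26}


def party_weights(respondents, targets):
    """Weight per party = target share / share observed in the sample."""
    counts = Counter(r["party"] for r in respondents)
    n = len(respondents)
    return {party: targets[party] / (count / n) for party, count in counts.items()}


def weighted_share(respondents, weights, candidate):
    """Weighted share of respondents supporting the given candidate."""
    total = sum(weights[r["party"]] for r in respondents)
    support = sum(weights[r["party"]] for r in respondents if r["vote"] == candidate)
    return support / total


# Hypothetical sample of 100 that came back heavy on Republicans.
cells = [
    ("Dem", "Kerry", 27), ("Dem", "Bush", 4),
    ("Rep", "Bush", 37), ("Rep", "Kerry", 3),
    ("Ind", "Bush", 15), ("Ind", "Kerry", 14),
]
sample = [{"party": p, "vote": v} for p, v, count in cells for _ in range(count)]

weights = party_weights(sample, TARGETS)
unweighted_bush = sum(r["vote"] == "Bush" for r in sample) / len(sample)
weighted_bush = weighted_share(sample, weights, "Bush")

print(f"Bush, unweighted:        {unweighted_bush:.1%}")
print(f"Bush, weighted to party: {weighted_bush:.1%}")
```

Whatever party mix the raw sample contains, the weighted sample always comes out 39-35-26. That is the whole point of the method, and also the source of the objection that follows.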

Regular readers will anticipate my skepticism of Zogby's approach. If the purist model allows for too much variation in party identification, Zogby assumes too little. He forces every survey conducted during 2004 to conform to the snapshot of party from Election Day 2000, allowing no room for the possibility that this year may be different or that some individuals may alter their views of the parties -- moving from, say, Democrat to Independent and back again -- in the wake of a political convention or a debate. The reliance on past exit polls also has at least two more methodological challenges:

* Telephone surveys use an interviewer to read each question. Exit polls involve a paper questionnaire that respondents fill out by themselves. These different methodologies can lead to different results on many questions, including past voting behavior (although I know of no experimental research looking specifically at the impact on party identification).

* Exit polls do not include absentee or early voters in many states. Since absentee votes have been historically more Republican, their absence could make a difference of a point or two in the national party ID estimate.

Do these various pitfalls have practical consequence? Abramowitz, who supports the idea of weighting by party (though in fairness, prefers a different approach than Zogby's), says no. "It is just not obvious that weighting by party id," he writes in his email, "or some compromise like using a rolling weight based on combined samples over time, produces less accurate results than not weighting by party id."

I'd argue that there is evidence to the contrary, at least regarding Zogby. In 2002, he "called" over 29% of his Senate and/or Gubernatorial races for the wrong candidate, despite polling more races than all but one other company. At the same time, the missed call rate for all of the pollsters combined was only 13% (hat tip to DalyThoughts for these statistics culled from an NCPP report on polling in the 2002 elections).

But Abramowitz mentions "a compromise" like a "rolling weight." Is there a third way between the purist model and Zogby's rigid use of old exit polls? I'll take that up in Part III.

Correction: Abramowitz is now spelled correctly; my original version was in error.

[Continue with Why & How Pollsters Weight, Part III]


Posted by Mark Blumenthal on October 7, 2004 at 07:20 AM in Weighting by Party | Permalink

Comments

I too have a website on sample weighting, which I invite people to visit:

http://www.hs.ttu.edu/hdfs3390/weighting.htm

Although Mark notes a couple of limitations of exit poll data, one undeniable benefit is that -- rather than pollsters' having to divine who is a "likely voter" as in a pre-election survey -- the people surveyed in exit polls are people we know have actually voted.

As for Zogby's relative inaccuracy in 2002, I certainly don't know what he used for his party weighting template, but I can see problems with using 2000 or even 1998 as a benchmark.

We know that off-year elections have lower overall turnout than do presidential elections and perhaps different D-R-I proportions, as well, thus suggesting that presidential years might not provide a good baseline.

But even if we look at previous midterm years, it's clear that some states with a "hot" U.S. Senate race in 2002 might not have had any Senate race in 1998 (i.e., if the state's other senator was up in 2000, not 1998). Obviously, the dynamics of a state's off-year turnout will be greatly affected by the existence and perceived closeness of a Senate race.

Posted by: Alan R. | Oct 7, 2004 1:17:28 PM

Mr. Blumenthal,

Nice post. Can I add a thought though (why am I asking, that's why the comments section is here, isn't it)?

Those who weight by exit poll data are basing all of their polling on a single poll. That poll, like any others, has a margin of error. Therefore, for every poll conducted based on the exit polls, you need to be mindful of the margin of error due to sampling in the current survey, as well as the margin of error due to sampling in the exit poll.

So you have the margin of error for the current poll, plus the margin of error for the baseline (exit) poll.

But there's more! Exit polls are not conducted like other polls. There is no attempt to ensure that every voter (or close to every voter) in a given state has the same chance of being polled. A few precincts are chosen as being representative of wide swaths of the state, and these precincts are sampled. But the process of choosing these precincts introduces another error point, particularly if the choice somehow overlaps with a particular get-out-the-vote effort that is localized.

So you have the margin of error of the current survey, the margin of error of the baseline (exit) survey, and the error introduced in the selection of sampling precincts.

And that's all assuming that nothing has changed. But the only thing that is constant in life is change.

By the way, Mr. Blumenthal, are you certain that ARG does not weight by party id?

Posted by: Gerry | Oct 7, 2004 1:46:42 PM

If weighting by party ID is a faulty model for sampling voter sentiment prior to an election, it is either astounding Zogby has been so accurate in the last two presidential election cycles, beating all of his more orthodox rivals, or he is incredibly lucky.

I think one makes his own luck and that Zogby is an early adopter of what other polls will likely mimic if Zogby repeats his accuracy this time.

Posted by: adaplant | Oct 7, 2004 2:32:33 PM

Good comments! First, you can call me Mark. No need for formality here. :-)

Second, yes Gerry, the comments *are* here for a reason, so no need to ask. Dissenting opinions are always welcome.

Third - Alan, regarding Zogby, as far as I know, his comment about using the 39D, 35R, 26I weighting applies to this year's polls only. I don't know what he did in previous years.

Fourth - Gerry, regarding ARG, you made me go back and check, and no, I'm now not absolutely certain that ARG does not weight by party.

I included ARG on the list because I had emailed them back in September to ask whether they weight by party. The reply from Dick Bennett said "no." But I now see that Bennett went on to describe the questions they ask about party registration, and one can read his answer as implying that they *might* weight by party registration in registration states. I've emailed for a clarification, and will post when/if I hear back.

Like you perhaps, I have long suspected that ARG weights by party, given the remarkable consistency in their results.

Thanks
Mark

Posted by: Mark Blumenthal | Oct 7, 2004 2:34:31 PM

Mark,

Zogby has weighted by party ID in the past. The amounts change, but the method does not.

Gerry

Posted by: Gerry | Oct 7, 2004 2:55:23 PM

Why couldn't Zogby have been lucky? He was right twice in a row.

That is not the kind of track record one normally associates with "expertise".

I'd hesitate to condition "expertise" on getting this election correct, given the possibility that there are enough polling firms with techniques that are only marginally different, in a polling environment that breaks down on a 50/50 basis with a +/- factor of a couple of points.

This could easily result in someone getting three in a row.

Posted by: Eric | Oct 7, 2004 2:59:19 PM

Exit polls have huge national samples (perhaps as many as 100,000 respondents, if I recall correctly), which makes sense if you want a meaningful poll within each state.

Thus, the margin of error (MoE) on the national exit poll numbers will be exceedingly small.

If the exit polls within a state are done only at select precincts, this would seem to be a form of cluster sampling, which indeed increases the MoE (i.e., sampling error gets introduced at two stages, determining which precincts to sample and determining which individuals to survey at a given precinct).

Posted by: Alan R. | Oct 7, 2004 3:10:59 PM

Off topic somewhat, but do you have an explanation for the recent reversal of all known precedent by Bush doing better among RVs than LVs? I can't think of that ever happening with a Republican before. Any ideas what's going on?

Posted by: Bob | Oct 7, 2004 4:11:45 PM

Just wanted to update: I emailed Dick Bennett and he confirmed that American Research Group does NOT weight by either party identification or party registration.
Mark

Posted by: Mark Blumenthal | Oct 7, 2004 4:48:41 PM

Is Zogby weighting all polls to '00's national results? If so, even if his theory is correct, wouldn't that almost necessarily increase error in local races while reducing it in the national ones (unless a locale's turnout perfectly mirrors the national results)? If so, I'd expect to see exactly what we see from Z: remarkably low national error and remarkably high error on congressional/senatorial races.

Posted by: Nick Simmonds | Oct 8, 2004 6:27:00 AM

Mark,

I apologize, but for some reason my trackback went to the wrong post; the one that is titled "The Hedgehog Report With The Scoop" has nothing to do with this post of yours; if you would delete that trackback it would probably be a good thing.

Gerry

Posted by: Gerry | Oct 8, 2004 7:16:16 AM

Be careful using Zogby as the icon of pid weighting. He uses a lot of other unusual techniques, like daytime calling, so there are more variables in play in comparing his record with that of other firms. You are on the right track in terms of seeking a middle ground. It is the only answer.

Strange how a bunch of professionals and academics default to acting as high priests positing and defending "beliefs" rather than always questioning like good scientists should.

Posted by: Chris | Oct 8, 2004 10:56:10 AM

There is another way to weight for party preference that in my opinion splits the difference between the two descriptions Mark gives above.

My firm polls only known voters from the Secretary of State's website. For example, if we are working on a Republican primary, we will survey only people who have previously voted in a Republican primary within a given timeframe. (Georgia does not have registration by party--a registered voter can vote in either party's primary, but cannot switch between parties between the Primary and the Primary Runoff).

When we're polling in a General Election, we can weight voters to reflect the actual party mix of a given election based on their past voting behavior.

For example, in the 2002 General Election, if 40% of voters from the General had voted in the 2002 Democrat Primary or 2002 Democrat Runoff, we would weight the 2004 sample to reflect 40% Democrat primary or runoff voters.

This allows us to show several different scenarios. For example, if 2002 had a higher proportion of Democrats than the 2000 General election, we could weight to reflect turnout like it was in 2000 or 2002 and compare the scenarios to see how it affects our 2004 results.

In my experience, polling only past voters has given us very accurate results in the past, and the ability to weight by known past primary voting behavior gives us a great tool to run "what if" scenarios.

Posted by: Todd Rehm | Oct 10, 2004 10:11:38 PM

THANKS

Posted by: RAY GARAFOLA | Nov 2, 2004 2:50:01 PM
