October 12, 2004
Why & How Pollsters Weight, Part III
I have so far discussed two different philosophies of weighting by party. The purist model -- still the choice of most media pollsters -- weights samples of adults by demographics but never weights by party identification. An alternative, advocated by pollster John Zogby, weights every 2004 survey so that it matches the estimate of party ID in the exit polls taken among voters in November 2000.
I promised last time to discuss those who pursue a "third way," somewhere between the Purist and Zogby models. Toward that end I did something bloggers are not supposed to do. I picked up a telephone and called some of my colleagues. In the process, I actually learned something: More media pollsters are now weighting by party than I had realized, demonstrating the conflict between art and science in pre-election polling.
Here is a partial sampling:
Investor's Business Daily/Christian Science Monitor/TIPP -- This survey weights by party identification using a method that Charlie Cook advocated in a recent column, dubbed "dynamic weighting" by either Cook or Ruy Teixeira. TIPP weights every survey it conducts during an election year by party ID, using a rolling six-month average of data pooled from previous surveys (the combined data, which include nearly 10,000 cases, are weighted by demographics only).
Raghavan Mayur, president of TIPP, says he weights this way because "I wanted to be consistent in what I do" during an election year. He told me he believes that party ID is "stable at the aggregate level" during any given three-month period and that any short-term changes, even if real, are "fleeting." (Mayur also commented on the IBD methodology in this article).
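To make the arithmetic of this kind of weighting concrete, here is a minimal sketch in Python. It is not TIPP's actual procedure: the function names and every count are hypothetical, and it skips the demographic weighting of the pooled data that TIPP applies first. It only shows the basic idea of pooling party ID from earlier surveys into targets and then scaling the current sample to match.

```python
from collections import Counter

def party_targets(past_surveys):
    """Pool party-ID counts from earlier surveys into target shares."""
    pooled = Counter()
    for counts in past_surveys:
        pooled.update(counts)
    total = sum(pooled.values())
    return {party: n / total for party, n in pooled.items()}

def party_weights(sample_counts, targets):
    """Weight for each party = target share / share in the current sample."""
    n = sum(sample_counts.values())
    return {party: targets[party] / (sample_counts[party] / n)
            for party in sample_counts}

# Hypothetical party-ID counts from earlier surveys in the rolling window.
past = [
    {"Dem": 350, "Rep": 340, "Ind": 310},
    {"Dem": 360, "Rep": 330, "Ind": 310},
]
# Hypothetical current sample that came up unusually Republican.
current = {"Dem": 320, "Rep": 400, "Ind": 280}

targets = party_targets(past)
weights = party_weights(current, targets)
for party in current:
    print(party, round(targets[party], 3), round(weights[party], 3))
```

In this made-up case Republicans are over-represented relative to the pooled average, so each Republican respondent counts for less than one interview and each Democrat and independent for slightly more. The weighted sample still adds up to the original number of interviews.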
NBC/Wall Street Journal/Hart-McInturff -- Two campaign pollsters administer this survey, Democrat Peter D. Hart and Republican Bill McInturff (who takes the place of the late Robert Teeter, Hart's longtime partner in this venture). In an interview, Hart confirmed that he sometimes weights the NBC/WSJ samples by party ID using an internal database of past survey results as a guide. Hart's technique might be dubbed "dynamic weighting in reserve," as his decision to weight is discretionary. Like other pollsters, Hart will correct any imbalances in race, gender or geography in his raw data. Then, if the party ID numbers seem "off" compared to previous surveys, he will look to see if any other demographic anomalies -- age, education, etc. -- can explain the difference. If weighting by these characteristics still leaves a "substantial difference" in party compared to the most recent NBC/WSJ poll, he will then weight by party to bring it back into balance.
What difference qualifies as "substantial?" Hart suggested that while he would not be concerned about changes of a few percentage points either way, he would weight to correct a shift from, hypothetically, an even distribution of Democrats and Republicans to a 10-point advantage either way. "I am not a believer," he said, "that party ID changes [that much] on a monthly basis."
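Hart's discretionary rule can be pictured as a simple threshold check. The sketch below is only an illustration, not his actual procedure: the function names are hypothetical, the 10-point default comes from his hypothetical example rather than a stated cutoff, and the party shares are assumed to be measured after the demographic corrections he describes.

```python
def party_margin(shares):
    """Democratic minus Republican share, in percentage points."""
    return shares["Dem"] - shares["Rep"]

def should_weight_by_party(current, previous, threshold=10.0):
    """True if the Dem-Rep margin moved by at least `threshold` points."""
    shift = abs(party_margin(current) - party_margin(previous))
    return shift >= threshold

# Hypothetical party-ID shares (percent) from the previous poll.
previous = {"Dem": 36, "Rep": 36, "Ind": 28}

# A swing of a couple of points is left alone.
small_shift = {"Dem": 37, "Rep": 35, "Ind": 28}
# A swing from even to a 10-point Democratic edge triggers party weighting.
large_shift = {"Dem": 41, "Rep": 31, "Ind": 28}

print(should_weight_by_party(small_shift, previous))
print(should_weight_by_party(large_shift, previous))
```

The point of the design is that party weighting stays in reserve: most samples pass through untouched, and only an implausibly large month-to-month swing is pulled back toward the earlier distribution.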
Fox/Opinion Dynamics -- I hesitate to include the Fox survey here because they DO NOT WEIGHT BY PARTY. However, their unique sampling methodology puts them somewhere between the purists and John Zogby in terms of how they control for random variation in the partisanship of their samples.
Virtually all of the other surveys (including IBD and NBC/WSJ) begin with a sample of telephone numbers that represents all households with a working phone, then they screen down for registered and likely voters. The Fox/Opinion Dynamics survey does something a bit more complicated. Although they also call a sample of all households with phones, they "stratify" their sample regionally, setting sample quotas for geographic regions that reflect likely turnout.
To be more specific, the Fox survey divides the country by state into 18 regions, then subdivides each into urban, suburban and rural counties. For surveys of likely voters, they use registration and past voting statistics to set quotas for each sub-region that reflect the national distribution of likely voters. They then draw a random sample of telephone numbers using a random digit dial (RDD) methodology that essentially fills the quota within each sub-region. This sampling methodology resembles what political pollsters like me use for internal campaign polls in statewide races.
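As a rough illustration of how such quotas might be set, the Python sketch below splits a fixed number of interviews across sub-regions in proportion to expected turnout. Everything in it is an assumption for illustration: the function name, the turnout figures, the use of a single region, and the largest-remainder rounding step. Opinion Dynamics has not published its procedure in this form.

```python
def allocate_quotas(turnout, sample_size):
    """Split sample_size across strata in proportion to expected turnout.

    Uses largest-remainder rounding so the quotas are whole interviews
    that add up exactly to sample_size.
    """
    total = sum(turnout.values())
    exact = {k: sample_size * v / total for k, v in turnout.items()}
    quotas = {k: int(x) for k, x in exact.items()}
    leftover = sample_size - sum(quotas.values())
    # Hand the remaining interviews to the strata with the largest fractions.
    by_remainder = sorted(exact, key=lambda k: exact[k] - quotas[k], reverse=True)
    for k in by_remainder[:leftover]:
        quotas[k] += 1
    return quotas

# Hypothetical expected turnout for the three sub-regions of one region.
turnout = {
    ("Northeast", "urban"): 4_200_000,
    ("Northeast", "suburban"): 5_100_000,
    ("Northeast", "rural"): 1_700_000,
}

quotas = allocate_quotas(turnout, sample_size=1000)
for stratum, n in quotas.items():
    print(stratum, n)
```

Interviewing would then continue within each sub-region until its quota is filled, so a sub-region where interviews are harder to complete still ends up with its full share of the sample.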
Why go to all this trouble? Because where we live is highly predictive of how we vote. Democrats are more likely to live in urban centers, Republicans in rural areas. Moreover, as John Gorman, president of Opinion Dynamics, pointed out in an email, "the demographics of non-response" make it "harder to get an interview in a Northeastern urban center than it is in a rural area...thus stratification is necessary to even out the response rates, and that's why we use it."
I believe this regional stratification is one reason why the Fox/Opinion Dynamics results have been nearly as stable lately as those of the pollsters who weight by party. For my money, this methodology reduces swings in the partisan composition of their polls using hard, defensible data (turnout statistics) rather than softer attitudinal data (party identification).
But, how defensible is this method given the flood of new registrants? Won't these skew Gorman's stratification model? Here is his answer:
We have decided at this time not to try to reflect the new registrations. Our reasoning is twofold. First it is hard to accumulate accurate data...we fear introducing error rather than improvement. Second, while registering may be easy, getting out to vote is harder and we are not confident that these new registrants will actually vote.
If I were polling in a state like Ohio or Florida, where registration activity has been intense, I would be very concerned about the new registrations throwing off the old models. On the national level, however, I tend to agree with Gorman. Remember, in this instance, the issue is not the level of turnout but the possibility that turnout will grow disproportionately in Democratic rather than Republican areas (or vice versa).
UPDATE (10/13): Democracy Corps, the polling entity of Democrats Stan Greenberg and James Carville, uses a sampling methodology comparable to that of Fox/Opinion Dynamics. Karl Agne, of DC, tells me that they stratify regionally based on past turnout statistics and do NOT weight by party.
There is one more national survey to discuss that now weights by party: The ABC News/Washington Post tracking survey. I'll take that up -- as well as telling you which approach I prefer -- in Part IV.
Raghavan Mayur... told me he believes that party ID is "stable at the aggregate level" during any given three month period and that any short term changes, even if real, are "fleeting."
So if the first debate increased Kerry's numbers by 5%, that would be reflected in his poll, but if those undecideds were so disgusted by Bush that they moved from Independent to Democrat, then the poll would show no change?
I'm an Independent, but there have certainly been a couple times I've identified more with the Republicans, probably enough to call myself one to a pollster. The first was after the 2000 election, starting when I heard that weasel Daley from Chicago whine for special consideration because Gore won the popular vote. The second was when it became clear that the Democrats' 2004 nominee would be a candidate with no principles, whose campaign would be disgraceful bordering on treasonous.
Posted by: akmdave | Oct 12, 2004 9:25:07 PM
Considering how Fox weights their sample, I wonder how they handle the Nader/ballot question? If they sample a region where Nader is unlikely to be on the ballot, do they ask about him? Do they then follow up with a "will you write in his name if he does not appear" question? If not, there's little point in counting them in the Nader column.
Posted by: Ann | Oct 13, 2004 1:10:08 AM
After reading your info on the methodologies of the pollsters, and after having spent most of the last 3 months observing and discussing this subject around the net (occasionally with the Votemaster), I am left with one impression: The only votes that will change are the Independents or Undecideds. About 88% of DEMs WILL vote for Kerry, the only question being how many of them will turn out (it seems like MANY). About 91% of the GOP will vote for Bush. Those two figures have remained steadier than the polls' overalls so far. The numbers suggest that the Undecideds within the GOP or DEMs are only 2% total, while the INDies represent 4% of the 6% Undecideds. So why not just assume 39% of the vote will be DEMs (based on Zogby's method), 35% will be GOP, subtract out the 11 or 9% switch-overs, then use those numbers as a base (which I believe will be SOLID)? Finally, why not then simply poll the Independents? THEY ARE THE ONLY REAL MOVEMENT FACTOR ANYWAY. It seems a waste of time to poll GOP or DEMS, since they're not going to change their minds anyway. 99% of them are locked in. It is a waste of resources. Poll the INDies, factor in their numbers based on the 26-27% they represent, add it to the GOP and DEM figures that are fixed and get the results.
BTW, the registrations and the get-out-the-vote efforts in the works for the DEMs appear to be off the charts, indicating that Kerry's upside is maybe as high as an increase of 10% of the total vote, while Bush's is maybe 2%. I am talking here of new totals due to massively high turnouts, yielding maybe 112% of the 2000 turnout. If Kerry gets even a 4% growth, Bush is in big trouble. Not one commentator has put the 2000 turnout into perspective, either: the populace was bored out of their minds and simply did not get to the polls due to disinterest. All indications are that 2004 WILL be the exact opposite: massive voter intensity. There is NO indication that turnout will be like 2000. High turnouts favor the DEMs anyway. With things appearing to be close, due to the use of LVs in most polls, when the non-LVs show up, will the pollsters be jumping out of skyscraper windows?
Posted by: Steve Garcia | Oct 13, 2004 1:21:18 AM
"Moreover, as John Gorman, president of Opinion Dynamics, pointed out in an email, 'the demographics of non-response' make it 'harder to get an interview in a Northeastern urban center than it is in a rural area...thus stratification is necessary to even out the response rates, and that's why we use it.'"
Isn't this fact by itself strong prima facie evidence that failing to take into account something resembling party ID will seriously skew polls?
Obviously, the dispositions of many typical Dems with respect to answering polls differ from the dispositions of typical Republicans, given how much Dems dominate in urban settings.
Why subscribe, as does Gallup, to the convenient fiction that sampling biases, very much along political lines, don't occur in today's polling by random digit calling? Isn't that simply an ungrounded leap of faith? Isn't it powerful evidence that Gallup's approach is actively anti-scientific?
The reality is, if there's no way of compensating for very likely sampling biases, then polling itself is largely doomed to something approaching junk science. The ONLY hope for scientific polling today and tomorrow is to introduce SOME weighting mechanism that resembles party ID, that takes into account the pre-existing political inclinations of the polled voter, and tries to match those inclinations up to the hardest data knowable about the voter population.
And, obviously, the six-month rolling average for party ID solves no basic problem IF the political inclinations of voters permanently affect their dispositions to respond to pollsters' questions.
Posted by: frankly0 | Oct 13, 2004 4:37:06 PM