October 13, 2004

Why & How Pollsters Weight, Part IV

In Part I of this thread, I listed pollsters that weight by demographics but not by party identification, including the ABC News/Washington Post poll. I recently learned that ABC does weight the likely voter numbers in their October tracking survey by party ID. Like the approaches described in the last post, the ABC model is something of a compromise and worthy of further discussion.

The reporting of the ABC/Washington Post daily tracking poll is a bit confusing. I have received several emails from readers asking about the following passage from the Washington Post's methodology page (a similar blurb appears at the bottom of most ABC poll stories):

The Post and ABC News collect data jointly but are responsible for developing their own methods to identify likely voters. This may produce slightly different estimates of candidate support.

What does this mean? Let's start at the beginning. The Post and ABC both share data collected and tabulated every night by the same interviewing facility. Although they start with the same data, they weight it differently and apply different likely voter selection models. I assume the weighting procedures are different, even before they select likely voters, because ABC and the Post have reported slightly different percentages among registered voters for 7 of the last 9 days (and the reported ranges of interview dates were identical). Tonight among registered voters, for example, the Post has Kerry ahead of Bush (48% to 46%), while ABC has Bush ahead by the opposite margin.

[As of this writing, I cannot say much more about the weighting procedures used by the Washington Post, except that their methodology page makes no reference to weighting by party. I have made an email inquiry of Richard Morin, the Post's polling director, and will report back if and when I hear more.]

Based on their methodology page and an email exchange with ABC Polling Director Gary Langer, I can explain the weighting procedures used by ABC. They report all results using a rolling three-day average, but weight each night's data separately. They weight the full sample of adults each night to match census demographic estimates (for gender, age, race and education). For the registered voter sample, ABC stops there, weighting only by demographics.
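To make the mechanics concrete, here is a minimal sketch of weighting each night separately and then pooling the last three nights. It is not ABC's actual procedure: it weights on a single variable (gender) rather than all four, the 52/48 population shares and the respondent records are made up for illustration, and the choice to pool the weighted respondents is my own assumption about how a rolling average might be combined.

```python
from collections import Counter

def poststratify(respondents, population_shares, key):
    """Weight one night's sample so its weighted shares of `key` match the targets."""
    counts = Counter(r[key] for r in respondents)
    n = len(respondents)
    # weight = population share / sample share for the respondent's category
    return [population_shares[r[key]] / (counts[r[key]] / n) for r in respondents]

def weighted_share(respondents, weights, candidate):
    """Weighted fraction of respondents choosing `candidate`."""
    total = sum(weights)
    chosen = sum(w for r, w in zip(respondents, weights) if r["vote"] == candidate)
    return chosen / total

def rolling_estimate(nights, population_shares, key, candidate):
    """Weight each of the last three nights on its own, then pool them."""
    pooled_r, pooled_w = [], []
    for night in nights[-3:]:
        pooled_r.extend(night)
        pooled_w.extend(poststratify(night, population_shares, key))
    return weighted_share(pooled_r, pooled_w, candidate)

# Hypothetical targets and respondents, for illustration only.
CENSUS_GENDER = {"F": 0.52, "M": 0.48}
nights = [
    [{"gender": "F", "vote": "Kerry"}, {"gender": "M", "vote": "Bush"},
     {"gender": "M", "vote": "Bush"}, {"gender": "M", "vote": "Kerry"}],
    [{"gender": "F", "vote": "Bush"}, {"gender": "F", "vote": "Kerry"},
     {"gender": "M", "vote": "Bush"}, {"gender": "M", "vote": "Kerry"}],
    [{"gender": "F", "vote": "Kerry"}, {"gender": "F", "vote": "Kerry"},
     {"gender": "F", "vote": "Bush"}, {"gender": "M", "vote": "Bush"}],
]

kerry = rolling_estimate(nights, CENSUS_GENDER, "gender", "Kerry")
print(f"Three-night weighted Kerry share: {kerry:.3f}")
```

Because each night is weighted on its own, a night that happens to draw too many men is corrected before it is averaged with its neighbors, rather than being offset by chance in the pooled sample.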

In the October tracking surveys, likely voters get another weight for party identification, using a method that literally splits the difference between the Zogby approach (weighting to previous exit poll results) and the purist approach (not weighting by party at all). On any given night, ABC's party ID target for likely voters is the average of that night's unweighted result for party ID and the average party identification result on exit polls for the last three presidential elections (roughly 39% Democrat, 35% Republican, 26% independent).
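The arithmetic of "splitting the difference" is easy to sketch. The example below is my own illustration, not ABC's code: the exit-poll figures are the rounded averages quoted above, the night's unweighted party ID numbers are hypothetical, and the function names are mine.

```python
# Approximate exit-poll averages quoted in the post (percent).
EXIT_POLL_AVG = {"Democrat": 39.0, "Republican": 35.0, "Independent": 26.0}

def party_id_target(unweighted_tonight, exit_poll_avg=EXIT_POLL_AVG):
    """Average tonight's unweighted party ID with the exit-poll average."""
    return {p: (unweighted_tonight[p] + exit_poll_avg[p]) / 2 for p in exit_poll_avg}

def party_weights(unweighted_tonight, target):
    """Per-party weight that moves tonight's sample to the target mix."""
    return {p: target[p] / unweighted_tonight[p] for p in target}

# Hypothetical night in which the sample leans Republican.
tonight = {"Democrat": 35.0, "Republican": 39.0, "Independent": 26.0}

target = party_id_target(tonight)
weights = party_weights(tonight, target)

for party in target:
    print(f"{party}: target {target[party]:.1f}%, weight {weights[party]:.3f}")
```

In this made-up case the target lands halfway between the sample and the exit polls, so Democrats are weighted up and Republicans down, but by only half as much as a pure exit-poll target would require.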

ABC's Langer and Merkle lay out the rationale for this compromise on page two of their methodology brief. They see the merits of both arguments. On the one hand, "party ID has been remarkably stable in exit polls conducted in presidential elections since 1984." On the other, "party ID can and does change, and that polls measuring the dynamics of the race - rather than simply attempting to predict its outcome - need to measure and report this change, not suppress it." So they adopt a weighting scheme that allows some variation, while also imposing a control tied to past exit poll results. Langer and Merkle hasten to add that having gone to all this trouble four years ago, their party ID weighting "had essentially no effect on our estimate of vote preferences - no more than a single point on any given day."

* * *
The issue of weighting by party is a prime example of the tension between science and art in political polling. When it comes to art, judgements are always subjective. Keeping that in mind, here is my take: I am most comfortable with a combination of the approaches of John Gorman and Peter Hart (of the Fox News and NBC/Wall Street Journal surveys respectively) described in the last post. Gorman's approach of stratifying his samples by actual turnout statistics, despite the risk that past turnout is not a perfect guide to the future, forces a defensible regional consistency across surveys that makes weighting by party less necessary. If weighting is ever necessary, I prefer Peter Hart's cautious, ad-hoc "dynamic weighting in reserve" approach.

One thing to keep in perspective: The debate over party identification is important, but those who weight by party have no magic "fix" to the sometimes random variation in surveys and those who avoid weighting are not overlooking some obvious methodological flaw.

Which leads to my last point: For the last month, Ruy Teixeira and his correspondent, Alan Abramowitz, have been loudly urging pollsters to weight by party identification to correct flaws they perceive in likely voter models. I am sympathetic to some of their critiques of likely voter screening. However, they are now attacking the "silly" ABC/Washington Post likely voter model and suggesting that their "registered voter results are probably a better indicator of the actual standing of the race." Perhaps. But, as we now know, ABC weights its likely voter numbers by party, but not its samples of registered voters. So is the problem about the lack of weighting or the result?

[Alan Abramowitz responds here]

I really like this blog. I was taught stats by a professor who viewed Statistics as an art rather than a science that helps the user make a decision in an environment of uncertainty.

Anyway, aren't Ruy and Alan talking about two slightly different ways to project the presidential race at this time? One they prefer (registered voter results), and one they don't prefer (likely voter results) but believe can be improved (by party ID weighting) if it is to be reported as representative of the voting population.

(Maybe I am splitting hairs in defence of Alan and Ruy, aka cocooning)


Posted by: John-Nicholas | Oct 13, 2004 8:08:02 AM


Somebody should look into these Strategic Vision polls. The company is based here in Georgia and they have no clients but claim to be getting something like 6,200 responses over a three day period (8 states, 801 respondents in each state). I talked to Gallup and they said it was possible for them to handle this type of load (though they usually don't) while the people at Hickman and Secrest both told me they never do that much calling.

Anyway, if you notice a pattern with Strategic Vision, they put out a "poll" usually 1 or 2 days after a slew of polls came out for a given state, and the StVi poll is always about 4 or 5 points better for Bush than the other polls.

Another neat thing: There is a pattern with their polls. They ask a head to head question and also a head to head including Nader. The results are always B-K for the head to head, and then for the Nader question, Bush's number B does not move while Kerry always drops 1 or 2 points to Nader. Interesting, considering that the Democracy Corps poll shows that Nader voters right now are splitting equally among the two candidates (a protest vote for Republicans who will probably vote for the libertarian?).

Another clue as to Strategic Vision's validity: they totally botched the black population sample in their first few Georgia polls. They released a poll in August showing Bush leading 53-42, but weighted the black percentage of the poll to 18%. Actually, in Georgia it is projected to be between 23% and 24%, which should mean that at that time Bush was leading 49-46. So what did they make up for their next poll? They said that "just to be safe" they increased the black makeup of the poll to 25%, and then inexplicably showed Bush jumping to 54-40.

Ever since, Bush has improved his showing in every poll they've released and Kerry has dropped. And they are definitely making up the internal questions: they have Bush's approval on the economy in Georgia at something ridiculous like 60-20. I saw the Global Strategy Group internals for the DSCC poll of the Georgia Senate race and Bush was net negative on handling of the economy.

Anyway, I digress, but something needs to be done about these guys. They are taking advantage of a media narrative that reports polls above all else, and are therefore trying to influence the election by making shit up.

Posted by: Chris | Oct 13, 2004 9:50:10 AM

I don't know anything about political polling, but I do know something about data analysis, and I think the fact that pollsters believe they must weight for party identification means that their samples are too small. If their samples were large enough, they would automatically correct for skewing caused by oversampling one party or the other.

Besides, isn't party identification a poor predictor of how people vote in Presidential elections? The entire Reagan Democrat phenomenon was comprised of Democrats who voted Republican for President and then Democratic for every other office. How does "correcting" for party identification predict the extent of this phenomenon?

It would seem to me that a large and varied sample, collected by asking a simple "who do you plan to vote for" question, would give the best results.

Posted by: Shannon Love | Oct 13, 2004 10:15:46 AM

Sounds as if simplicity must be forced upon polling firms - report the results and the demographic info with party id.

Now, I understand that each polling firm is in the business of making money and this suggestion would not allow sufficient differentiation on a national scale, but do we really need this many polling firms with so many different ways to tweak the margins?

Posted by: Eric | Oct 13, 2004 12:17:29 PM

I just wanted to issue one of my periodic invitations to visit my own website on sample weighting:


Alan Reifman
(not to be confused with Alan Abramowitz, a frequent contributor to Ruy Teixeira's blog)

Posted by: Alan R. | Oct 13, 2004 2:31:34 PM

You are improperly conflating two separate critiques by Ruy.

First, Ruy complains about the failure to weight by party affiliation for ALL poll results, be they registered voters or likely voters. Either way, if you have way more republicans than democrats, you'll get a result that is inaccurately skewed republican. (And the phenomenon of Reagan-Democrats does not undermine that argument. Even during Reagan's time, if you had a poll that was overweighted towards republicans, the results would be inaccurately skewed in Reagan's favor.)

Second, Ruy separately complains that it is too early to rely on likely voters because registered voters are still much more likely to actually vote. In other words, he doesn't really think "likely" voters are all that likely to vote.

Posted by: cramer | Oct 14, 2004 6:51:36 PM

