June 08, 2006
Is RFK, Jr. Right About Exit Polls? - Part II
This post resumes my look at the article by Robert F. Kennedy, Jr. in Rolling Stone, "Was the 2004 Election Stolen?" In Part I, I began a review of the exit poll section of the article, which continues below. Passages from the article are in bold italics.
Instead of treating the discrepancies as a story meriting investigation, the networks scrubbed the offending results from their Web sites and substituted them with ''corrected'' numbers that had been weighted, retroactively, to match the official vote count.
That sounds like quite a cover-up, doesn't it? Unfortunately, like so much of the discussion of exit polls in the RFK, Jr. article, that first sentence is wildly misleading. The practice of gradually adjusting the network exit poll tabulations to reflect the actual count has been standard for decades. As this procedure seems to confuse almost everyone (including, as of just yesterday, Robert Kennedy himself), a bit more explanation is in order.
I have attempted to explain the workings of this process in the past (especially here, here and here), but my understanding has grown considerably since Election Day 2004 and some of my initial explanations were, in retrospect, a bit over-simplified. So let me try again. Let's start with two important terms in the exit pollster lexicon: "projections" and "tabulations."
The exit pollsters use the term "projections" to refer to various estimates of the overall vote preference in a given state that appear on the computer screens of the "decision desk" analysts responsible for calling winners at the television networks and at the Associated Press. While the details of these proprietary "decision screens" are closely held, I am told that they include a large number of different estimates that change during the course of the day. Some of these were defined in the glossary of the Edison-Mitofsky (E/M) post-election report released in January 2005, and have names like Best Survey Estimate, Best Geo Estimate, Composite Estimate, County Model, etc. (see pp. 7-10).
The projection most relevant to this controversy has the charming name of "Best Geo Estimator," which the E/M report describes as "the best estimate" displayed on the "Decision Screens" of network analysts (p. 19). Early on Election Day, before the polls close, these estimates derive entirely from exit poll tallies. These estimates are not based on "raw" data. As part of the survey design, they are statistically adjusted (or weighted) so that the relative sizes of geographic regions within each state match their sizes in recent presidential elections (hence the "Geo" in Best Geo Estimator).
In some states, these adjustments also weight intentional "over-samples" of heavily minority precincts down to their actual contribution in past elections. The selection of additional precincts allows for adequate sub-samples of Latino or African-American voters in cross-tabulations. Finally, in the 13 states where the exit pollsters conducted telephone interviews to determine the preferences of early and absentee voters, those results must also be combined and weighted to the appropriate proportions.
The main point here is that every exit poll estimate, including those produced in the middle of the day, is "weighted" to some degree to reflect the design of the survey. Exit pollsters never look at "raw" or unweighted results, simply because the survey is not designed to work that way.
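To make the design-weighting idea concrete, here is a minimal sketch with invented numbers. It shows how a weight can scale each region's raw interview tallies so that the region counts in proportion to its share of the vote in past elections, rather than its share of completed interviews. The regions, tallies, and target shares are all hypothetical, not actual NEP values.

```python
# Hypothetical raw interview counts by region: (Kerry, Bush)
raw = {
    "urban":    (420, 280),   # suppose urban precincts are over-represented
    "suburban": (300, 340),
    "rural":    (110, 150),
}

# Hypothetical target shares of the statewide vote, from past elections
target_share = {"urban": 0.35, "suburban": 0.45, "rural": 0.20}

total_interviews = sum(k + b for k, b in raw.values())

kerry = bush = 0.0
for region, (k, b) in raw.items():
    region_interviews = k + b
    # The weight scales this region's interviews to its historical vote share
    weight = target_share[region] * total_interviews / region_interviews
    kerry += k * weight
    bush += b * weight

print(f"Weighted Kerry share: {kerry / (kerry + bush):.1%}")
```

In this toy example the unweighted tallies would put Kerry at 51.9 percent, while the design-weighted estimate is about 50.6 percent, because the over-represented urban precincts get weighted down.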
In the middle to end of the day, the weights used for the Best Geo Estimator (and the other estimates) may be altered to reflect any available precinct level hard-counts of actual turnout. In other words, interviewers attempt to obtain counts of the number of voters that have actually cast ballots at the selected precinct from polling place officials. When available, these counts are usually based on the number that signed into the precinct registration book (something I described in more detail here). Although this process is far from perfect, the goal is to determine if turnout patterns appear to be different than the assumptions used to design the survey sample. At some point during the day, the exit pollsters begin to make use of interviewer tallies of the gender, race and approximate age of the voters that refused to be interviewed. They may weight their estimates to correct any apparent bias in gender, race or age suggested by those tallies.
Just before the polls close, the exit pollsters do a final tabulation (referred to as the "Call 3" data) based on the complete exit poll sample. Again, the network analysts make use of a number of different estimates in projecting winners. However, the final "Call 3 Best Geo Estimator" is probably the most relevant to this discussion, because it is the projection based on the exit poll only, without any adjustments to match the official vote count.
Once the polls close, the pollsters work to obtain the official vote count for each precinct in the exit poll sample (as well as for a larger random sample of precincts in each state). They gradually swap out exit poll tallies within each precinct and substitute the actual vote count, on the theory that the net result will provide a gradually clearer picture of the eventual winner. At some point, their models will also incorporate the county vote data being obtained by the Associated Press.
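The swap-out process described above can be sketched very simply. In this illustration (all numbers invented), each sampled precinct contributes its exit poll tally until its official count arrives, at which point the official count replaces it. Real models are far more elaborate, weighting precincts by size and modeling the remaining uncertainty; this just shows the substitution idea.

```python
# Each precinct: (kerry_poll_tally, bush_poll_tally, official_count_or_None)
precincts = [
    (55, 45, None),          # no official count yet: keep the poll tally
    (60, 40, (470, 530)),    # official (Kerry, Bush) count has been reported
    (48, 52, (510, 490)),
]

kerry = bush = 0.0
for k_poll, b_poll, official in precincts:
    if official is None:
        # Use the exit poll tally, expressed as shares so precincts compare
        kerry += k_poll / (k_poll + b_poll)
        bush += b_poll / (k_poll + b_poll)
    else:
        # Official count available: it replaces the poll tally entirely
        k_off, b_off = official
        kerry += k_off / (k_off + b_off)
        bush += b_off / (k_off + b_off)

print(f"Blended Kerry share: {kerry / (kerry + bush):.1%}")
```

As more precincts report, the blended estimate converges on the official result, which is exactly why the late-night "exit poll" numbers track the count.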
The term "tabulations" refers to the cross-tabulations that show the complete exit poll results by demographic subgroups (gender, age, race, party, etc.). These tabulations appear on the network decision screens, but are also run to PDF files (like this one) and sent to the newspapers and other news organizations that pay for access to the exit poll data. On election night, the weighted tabulations that appeared on CNN.com lacked an overall "vote estimate." For months, extrapolations of the overall vote based on the sex-by-vote tabulations posted to CNN on Election Night were the closest available approximation of the "Call 3 Best Geo Estimator" projections.
[Although note, according to the E/M report the tabulations are weighted more often to the "Composite Estimate," than to the Best Geo Estimator, especially early in the day. The Composite Estimate combines the exit poll tallies with a summary of public pre-election polls].
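The "extrapolation" from a sex-by-vote tabulation works roughly like this: multiply each group's share of the electorate by its reported candidate preference and sum across groups. The numbers below are hypothetical stand-ins, not the actual CNN screen values.

```python
# Hypothetical sex-by-vote crosstab
electorate_share = {"men": 0.46, "women": 0.54}   # share of all respondents
kerry_pct = {"men": 0.47, "women": 0.51}          # Kerry share within group

# Overall estimate implied by the crosstab
kerry_overall = sum(electorate_share[g] * kerry_pct[g] for g in electorate_share)
print(f"Implied overall Kerry share: {kerry_overall:.1%}")
```

Because the published tabulations round each cell to the nearest whole percent, an extrapolation like this carries some extra imprecision on top of ordinary sampling error.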
Here is the key point: At any given time on Election Day or Night, the exit poll "tabulations" are weighted (or "forced," to use the exit pollster lingo) to match whatever estimator the analysts consider "best" for that purpose at any given time. Since the estimators gradually change to include more and more data based on the count, the exit poll tabulations -- including those posted to network web sites -- will ultimately get "retroactively" weighted to match the vote count. That procedure has always been part of the exit poll system.
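A bare-bones sketch of what "forcing" means, with invented numbers: rescale respondents' weights according to their reported vote so that the tabulation's implied overall result matches whatever estimate is currently considered best. The same factors then ripple through every crosstab cell.

```python
# The weighted tabulation currently implies Kerry 51 / Bush 49, but the
# "best" estimator (now incorporating actual counts) says Kerry 48.5 / 51.5.
current = {"kerry": 0.51, "bush": 0.49}
target = {"kerry": 0.485, "bush": 0.515}

# Ratio applied to every respondent's weight, based on reported vote
force_factor = {c: target[c] / current[c] for c in current}

# Effect on one hypothetical crosstab cell: the vote split among women
women = {"kerry": 0.54, "bush": 0.46}  # shares within the female subsample
forced = {c: women[c] * force_factor[c] for c in women}
total = sum(forced.values())
forced = {c: v / total for c, v in forced.items()}  # renormalize the cell

print({c: round(v, 3) for c, v in forced.items()})
```

In this toy case the women's Kerry share slips from 54.0 to about 51.5 percent once the tabulation is forced to the count-based estimate, which is why the posted crosstabs "moved" over election night.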
So the weighting procedure is not evidence of a cover-up. It is a feature intended to allow the exit poll cross-tab tabulations to provide as accurate a read on the actual electorate as possible.
Rather than finding fault with the election results, the mainstream media preferred to dismiss the polls as flawed.(21)
''The people who ran the exit polling, and all those of us who were their clients, recognized that it was deeply flawed,'' says Tom Brokaw, who served as anchor for NBC News during the 2004 election. ''They were really screwed up -- the old models just don't work anymore. I would not go on the air with them again.''
The mainstream media had good reason to be cynical about exit poll results, given their memories of many past snafus. As summarized at the end of Part I of this series, the national exit polls had shown a consistent discrepancy favoring the Democrats in every presidential election since 1988. The discrepancy in 1992 was almost as great as the one in 2004. There were also similar problems resulting in overstatements of the votes cast for Pat Buchanan in the Republican primaries in New Hampshire in 1992 and Arizona in 1996. Richard Morin, polling director for the Washington Post, recounted how a 1988 national exit poll sample showed the presidential race "to be a dead heat, even though Democrat Michael Dukakis lost the popular vote by seven percentage points to Dubya's father."
In fact, the exit poll created for the 2004 election was designed to be the most reliable voter survey in history. The six news organizations -- running the ideological gamut from CBS to Fox News -- retained Edison Media Research and Mitofsky International,(22) whose principal, Warren Mitofsky, pioneered the exit poll for CBS in 1967(23) and is widely credited with assuring the credibility of Mexico's elections in 1994.(24) For its nationwide poll, Edison/Mitofsky selected a random subsample of 12,219 voters(25) -- approximately six times larger than those normally used in national polls(26) -- driving the margin of error down to approximately plus or minus one percent.(27)
Here we have yet another technically true but highly misleading paragraph. It is true that in 2004, the news organizations hired Warren Mitofsky and Edison Research to replace the forerunner organization, Voter News Service (VNS). However, Mitofsky had been the founder and director of VNS (originally known as Voter Research & Surveys) in 1990, and his principal deputies continued to run it after he left in 1993. Not everyone was optimistic about the change. Writing in the Columbia Journalism Review, former NBC News president Larry Grossman voiced the concerns of those worried about "the very short time frame, new election day complications such as the growing trend toward mail balloting and people's increasing tendency to mislead pollsters or refuse to be polled, and the recent history of vote-tallying failures."
The National Election Pool (NEP) certainly touted improvements made in the 2004 exit polling and projection system. It is true, for example, that the 2004 exit polls expanded to thirteen the number of states where they supplemented polling place interviews with telephone polling of those who voted absentee or by mail. It is true that Mitofsky and his partner Joe Lenski updated the computer models used by the NEP to require more statistical confidence before making a projection.
But this paragraph creates the misleading impression that Mitofsky worked to "drive down" the margin of error on the national sample. While they may have juggled statewide allocations to boost samples in some battleground states (such as Florida), according to data archived at the University of Michigan, the sample size of the so-called national sample used to estimate the views of voters nationally was actually smaller in 2004 (n=12,219) than in 2000 (n=13,225) or 1996 (n=16,637). Some may also get the impression that this national sample plays a significant role in projecting the outcome of the election. It does not. Only the state exit poll estimates are used for that purpose. The national exit poll tabulations are done only to provide subgroup results for national news coverage.
Note that the "six times larger than normally used" comparison refers to the typical size of a national telephone poll, not an exit poll.
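For readers wondering where the "plus or minus one percent" figure comes from, here is the textbook arithmetic for a simple random sample of n respondents at a 50/50 split. Keep in mind that exit polls are cluster samples (many interviews per precinct), so the effective margin of error is larger by a "design effect" factor; the 2.0 multiplier below is purely illustrative, not the NEP's actual figure.

```python
import math

n = 12219                  # 2004 national exit poll sample size
p = 0.5                    # worst case for a proportion
z = 1.96                   # 95% confidence multiplier

# Margin of error if this were a simple random sample
moe_srs = z * math.sqrt(p * (1 - p) / n)
print(f"Simple-random-sample MOE: +/- {moe_srs:.2%}")   # about 0.89%

# Clustering inflates the variance; the design effect here is hypothetical
design_effect = 2.0
moe_cluster = moe_srs * math.sqrt(design_effect)
print(f"With a design effect of {design_effect}: +/- {moe_cluster:.2%}")
```

The simple-random-sample formula does yield roughly one percent at n=12,219, but that is the best case; clustering pushes the real sampling error noticeably higher.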
This post continues with Part III.
Related Entries - Exit Polls
Do you know if any comparison was made using the "complete" exit poll data in the pollster's prediction model (appropriately scaling the data for geography, sex, age, etc.) with the "actual" vote tallies?
Posted by: Bill R | Jun 9, 2006 11:11:15 AM
Bill, I for one am not sure what you are asking. Are you asking whether the state-level model predictions (prior to the inclusion of vote counts) have been compared with the official returns? The answer to that would be yes, for two Call 3 models (Best Geo and Composite); the results appear on pp. 21-22 of the E/M evaluation report. The average model error is smaller than the average Within Precinct Error (but not small), for whatever that is worth.
Posted by: Mark Lindeman | Jun 9, 2006 11:26:19 AM
The comments to this entry are closed.