December 06, 2004

Correcting the "Correction"

Having spent the weekend with my family, I have much to cover today and not enough time to cover it. Let's start with the mini-controversy over the estimate of Hispanic voters, both nationally and in the state of Texas.

I have obtained more information about these issues. It may not settle the debate, but it at least clears up what has and has not been "corrected:"

1) Texas - The correction by Edison/Mitofsky was just that, a correction of a tabulation error. Texas was one of 13 states where the exit pollsters conducted telephone interviews among those who voted early or by absentee ballot. In Texas, early voting accounted for roughly 51% of all the ballots cast (and 29% of all registered voters). I have written many times (here and here for example) about the way exit polls are weighted geographically to match actual turnout on election day. Apparently, the telephone interviews in Texas were not properly weighted geographically. A correct weighting altered the estimates of Hispanic voters.

Bottom line: In 2000, the Voter News Service (VNS) exit poll put George Bush's percentage of the Texas Hispanic/Latino vote at 43%. The corrected National Election Pool (NEP) exit poll for this year shows Bush's support among Texas Hispanic voters growing six points to 49% (not the 59% initially reported).

2) Last Thursday, NBC's Ana Maria Arumi offered new numbers from a larger sample of Hispanic precincts nationwide, but her report was not intended as a "correction." She emailed the listserv of the American Association for Public Opinion Research (AAPOR): "The data I presented was not a 'correction' of the national poll but rather derived from a different source - an aggregation of the 50 state polls rather than the single national exit poll. This data is still EMR[Edison/Mitofsky]/NEP data - not edited or modified in any way" (quoted with permission).

NBC's "First Read" has the best summary of what Arumi tried to say:

NBC's Ana Maria Arumi addressed the National Association of Hispanic Journalists in DC yesterday with some findings from the exit poll. "Out of the 250 precincts in the national survey there were 11 plurality Hispanic precincts," she writes for First Read. "Through the luck of the draw, four Hispanic precincts were in Florida and three of those were in Miami-Dade County. This demonstrates some of the clustering effects you can have in a national sample of 250 precincts when you are looking at breakouts of subgroups like Hispanics -- in this case an overrepresentation of Cuban opinion in the overall Hispanic numbers."

"To ameliorate this clustering problem I aggregated the 50 state polls which were collected from a total of 1,469 precincts and looked at the Hispanic data in this much larger sample, which yielded smaller, but still significant, Bush gains among Hispanics:" 40% for Bush to 58% for Kerry.

Regular readers will note that Arumi's description of error due to clustering is exactly the sort of problem I anticipated in this post.

For those who find all of this confusing, let me try to translate: The "national" exit poll posted by CNN, CBS and NBC was a stand-alone survey based on roughly 14,000 interviews conducted at 250 precincts. The Hispanic results from this survey have not been and will not be "corrected." Also, the Texas correction does not affect this study, because the Texas sample was not part of the stand-alone "national" exit poll. [Joe Lenski of Edison Research emailed to correct me on this point: “The 250 national precincts are a subsample of the 1469 state precincts and the data from the national precincts are included in both the state surveys and the national survey.”]

What Arumi reported was a bigger and better estimate of Hispanic voters derived by combining the separate exit polls for the 50 states (sampling 1,469 precincts) and weighting them appropriately by geography. This larger, better estimate did not include any of the precincts or interviews included in the "national" poll, but did include the corrected Texas results.  It's not a "correction," just a bigger, more accurate estimate.
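For readers who want to see what "weighting by geography" means mechanically, here is a minimal sketch. It is not NEP's actual procedure, which I have not seen; the two states, the interview counts, the turnout figures and the support levels are all made-up numbers, and the function names are mine. The idea is simply that each state's interviews are scaled so that the state's share of the combined sample matches its share of the actual vote.

```python
def state_weights(interviews, turnout):
    """Weight factor per state = (share of actual vote) / (share of interviews)."""
    n_total = sum(interviews.values())
    t_total = sum(turnout.values())
    return {
        s: (turnout[s] / t_total) / (interviews[s] / n_total)
        for s in interviews
    }


def weighted_percentage(interviews, support, weights):
    """Combined percent support, with each state's interviews scaled by its weight."""
    num = sum(interviews[s] * weights[s] * support[s] for s in interviews)
    den = sum(interviews[s] * weights[s] for s in interviews)
    return 100.0 * num / den


# Hypothetical inputs: two states with equal-sized samples but very unequal turnout.
interviews = {"A": 2000, "B": 2000}
turnout = {"A": 1_000_000, "B": 9_000_000}
support = {"A": 0.60, "B": 0.40}  # fraction supporting a candidate in each state's sample

weights = state_weights(interviews, turnout)
estimate = weighted_percentage(interviews, support, weights)
unweighted = 100.0 * sum(interviews[s] * support[s] for s in interviews) / sum(interviews.values())
```

In this toy case the small state is weighted down and the large state weighted up, which pulls the combined estimate away from the simple unweighted average and toward the large state's result.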

[Another clarification: The estimate that combines all interviews in all 50 states most likely produces a better estimate of Hispanic voters because it uses a larger number of precincts. However, extreme weights are necessary to bring the distribution of interviews across all states into line with the actual geographic distribution of voters: Some states are weighted down to as little as one quarter to one fifth of their original value, while others are weighted up by a factor of as much as 4-5. This extreme weighting makes the sampling error much higher than one would expect given the very large sample size. Thus, my use of “bigger” is somewhat misleading.]
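To put a rough number on how much unequal weights can cost, survey statisticians often use Kish's approximation for the effective sample size, (sum of weights)^2 / (sum of squared weights). The sketch below applies it to an invented set of weights (half the interviews weighted down to one quarter, half weighted up by a factor of four); these are not the actual NEP weights, and the approximation ignores the additional error from clustering within precincts.

```python
import math


def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


def design_effect(weights):
    """Ratio of the nominal sample size to the effective sample size."""
    return len(weights) / kish_effective_sample_size(weights)


# Hypothetical weights: 500 interviews weighted down to 0.25, 500 weighted up to 4.
weights = [0.25] * 500 + [4.0] * 500

n_eff = kish_effective_sample_size(weights)
deff = design_effect(weights)
# Sampling error scales with the square root of the design effect.
error_inflation = math.sqrt(deff)
```

With these made-up weights, 1,000 interviews behave roughly like 560 equally weighted ones, which is the sense in which a very large weighted sample can be less "big" than it looks.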

The bottom line: In 2000, George Bush got 35% of Hispanic voters nationwide according to the VNS exit poll. In 2004, Bush got 44% of Hispanics on the "national" NEP poll and 40% on the larger tabulation based on all of the state polls combined. The 40% number is presumed to be more accurate, as it was based on a larger number of precincts.

3) Contrary to my earlier speculation, NEP did not intentionally "over-sample" Hispanic precincts anywhere. I was thrown by the vague language of the Scripps-Howard story.  My apologies for adding to the confusion.

4) On Friday, I linked to a press release by the William C. Velasquez Institute (WVI). That release (also excerpted by Ruy Teixeira) reported on results from a separate study commissioned by WVI. Via email, Arumi made it clear that she and NBC "do NOT endorse the WVI poll nor their sampling methodology." So noted.

I assume that many readers will be frustrated with the way these clarifications and additional tabulations drip out slowly, if at all. I hope that they give NBC and Arumi some credit for at least putting more information into the public domain. Why does it take the networks so long to release simple corrections? Why are clear explanations so elusive? On those questions, I remain equally mystified.

Richard Nadler, on National Review.com, 12-8-04, reports that using the corrected state Hispanic percentages reduces Bush's Latino support to 38%. I would assume that this does not include a correction for Oklahoma, which was way too high. The magnitude of the shift here is large enough to be most of the adjustment that was made to bring down the overestimated Kerry support from all sectors.

Posted by: John S Bolton | Dec 10, 2004 2:50:54 AM

Great info as always; thanks so much. It would be great to see some of your news, findings, research, estimates and opinions bundled in a one-page summary. Somewhere all the why and when and by what person news can be cut out and just the numbers and source, along with any needed caveats, can stand. Hope that makes sense. Let me know if you need help with any graphic presentations of data. I might be able to do that. Thanks again!


Posted by: Doug Hess | Dec 15, 2004 2:08:15 PM

