
June 02, 2005

USCV vs. USCV

Back to exit polls for a moment.  Bruce O'Dell, the founding Vice President of U.S. Count Votes (USCV), the organization that has been arguing that the official explanations for the "exit poll discrepancy" are "implausible," has just released a paper that refutes, well...the most recent "working paper" by U.S. Count Votes. 

Some background: Back in April, USCV released a report titled "Analysis of Exit Poll Discrepancies" (discussed on MP here and here) that purported to show the implausibility of explanations for the discrepancies provided by the exit pollsters themselves.  Subsequently, Elizabeth Liddle, a self-described "fraudster" who had reviewed early drafts of that report, did her own analysis and showed that a statistical artifact undermined the conclusions of the US Count Votes report (discussed by MP here).  At the AAPOR conference last month, exit pollster Warren Mitofsky presented findings based on Liddle's work that confirmed her hypothesis.  At the conference, US Count Votes author Ron Baiman distributed another "working paper" that claimed to refute Liddle's work.  The new paper was signed by only four of the twelve authors of the original USCV report.  Baiman subsequently posted several very long comments on this site in the same vein.

Got that?  It's been quite a "debate."

Yesterday Bruce O'Dell, one of the original USCV authors, stepped forward with his own forceful evisceration of Baiman's arguments based on the computer simulations O'Dell did for USCV.  MP had considered doing a summary of O'Dell's paper, but could not find a way to improve on O'Dell's: 

The key argument of the USCV Working Paper is that Edison/Mitofsky's exit poll data cannot be explained without either (1) highly improbable patterns of exit poll participation between Kerry and Bush supporters that vary significantly depending on the partisanship of the precinct in a way that is impossible to explain, or (2) vote fraud. Since they rule out the first explanation, the authors of the Working Paper believe they have made the case that widespread vote fraud must have actually occurred.

However, a closer look at the data they cite in their report reveals that Kerry and Bush supporter exit poll response rates actually did not vary significantly by precinct partisanship. Systematic exit poll bias cannot be ruled out as an explanation of the 2004 Presidential exit poll discrepancy - nor can widespread vote count corruption. The case for fraud is still unproven, and I believe will never be able to be proven through exit poll analysis alone.

This paper should not be misinterpreted as an argument against the likelihood of vote fraud. Quite the opposite; I believe US voting equipment and vote counting processes are severely vulnerable to systematic insider manipulation and that is a clear and present danger to our democracy. I strongly endorse the Working Paper's call to implement Voter-Verifiable Paper Ballots and a secure audit protocol, and to compile and analyze a database of election results.

Judging only by the word count of the comments in my last post on this subject, there may appear to be some genuine question about whether the explanations provided by Edison-Mitofsky for the discrepancy between the exit poll results and the actual count are "plausible."  There is, to be sure, much sound, fury and name-calling in this debate, but on the substance and the evidence the jury is in.  Even US Count Votes' founding Vice President can see it.  As O'Dell says, "systematic exit poll bias cannot be ruled out as an explanation of the 2004 Presidential exit poll discrepancy."

UPDATE - DemFromCt over at DailyKos chimes in on O'Dell's paper and stresses a point I neglected.  Dem takes issue with:

The multiple attacks on Elizabeth Liddle's credentials, motivation, etc. (and those of anyone who agrees with her) that's become a cottage industry at DU [Democratic Underground]  and at times, here at Daily Kos by a minority of posters. Kudos to Bruce O'Dell to have the intellectual integrity to write this; my hat is doffed. I hope his paper (and post) is read in the spirit in which it was written. And we really need to move on to something else [link and emphasis added].

Agree on all counts.

* * *

Note: As always, MP welcomes dissenting opinions in the comments section.   However, the subject of exit polls and voter fraud seems to generate an unusual level of invective.  One comment in the last post on this topic inexplicably mocked the religious faith of another commenter.  I have deleted that comment -- the first time I have ever seen the need to do so on Mystery Pollster.

MP is generally libertarian when it comes to the comments section, but found the earlier comment to be repugnant and unacceptable, so please be advised:  There is no room for slurs against gender, race, ethnicity, religion or sexual orientation on Mystery Pollster.  In the future, I will not hesitate to delete comments I consider morally offensive.  My board, my rules.


Posted by Mark Blumenthal on June 2, 2005 at 04:44 PM in Exit Polls | Permalink

Comments

Putting aside the statistical arguments, I find it incredibly unethical for Ron Baiman et al. to publish a paper that includes Bruce O'Dell's work without Bruce O'Dell's signature, agreement or approval of the paper.

Posted by: Sawyer | Jun 2, 2005 5:04:33 PM

I believe Bruce allowed them to include his work in the paper, but thought that his refusal to sign the paper was indicative enough that he wasn't comfortable with the conclusions being drawn from it. Maybe he can visit and comment.

Why would anyone continue to read the USCV revisions when: 1) they haven't had the decency to put febble's critique on their website under their section on criticisms of their work; and 2) the flat regression line through the scatterplot presented by Mitofsky destroys the Bsvcc hypothesis of their 3/31 study.

My 6 year old son can look at that scatterplot and know that nothing "odd" is going on in the Bush strongholds compared to the other categories. Oh, Ron - the arbitrarily drawn categories are not "quintiles" as you keep referring to them. If E-M had used quintiles, we wouldn't be having this discussion (although it would undoubtedly be a different discussion, because the end seems to justify any mean for some at USCV).

At some point this entire discussion is worthless. But thanks MP for posting the link and a big hat tip to Bruce for taking a stand.

Posted by: Rick Brady | Jun 2, 2005 6:56:13 PM

Unusual level of invective for the MP blog, at least.

The discourse on your comments is unusually serious minded. IMHO your community is a model for fruitful discourse.

Posted by: Alex in Los Angeles | Jun 3, 2005 6:09:53 PM

Excellent: (1) variance is not constant but sensitive to partisanship and (2) we would know more if we could actually see the data. Finally we have a sensible conclusion.

The root problem all along seems to have been that naive intuitions are correct for horse races (p=0.5 approx.) but not every precinct is a dead heat, even if the national race is.

In the limiting case of a biased precinct, we get a case where the number of voters for the "out party" might be a "rare event"--in the limiting case, competing with the third-parties we seem to like to consider background noise. The expected number of outs in the sample might be, say, Poisson-distributed rather than Gaussian as we had back in the ol' F.D. (Freeman Days). Surely the distribution of "expected N(outs)" will be asymmetric!

Remember that outs and ins don't "compete" with each other--that's the root of our symmetry intuition. We really have two sets of people walking out--black balls from the urn who are a dime a dozen, and *independently* red ones who are rare. In the partisan case, the "clumpiness" of these rare events might show up quite significantly [rather like bursts or batch jobs vs. memoryless exponential arrivals in queuing theory]. There's a K-P correction factor for burstiness lurking there somewhere as well as a Fraud correction. :)
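A minimal sketch of the asymmetry point above, assuming a sample of 100 respondents per precinct and an "out party" share of 3% in a very partisan precinct (both numbers are hypothetical, chosen only for illustration). It compares the skewness of the count of "outs" in a dead-heat precinct with the lopsided one, and checks how closely a Poisson distribution tracks the exact binomial in the lopsided case.

```python
import math

def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact probability of k 'out party' respondents among n sampled voters."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of k events when the expected count is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def binom_skewness(n: int, p: float) -> float:
    """Skewness of a binomial count: 0 is symmetric, > 0 is a long right tail."""
    return (1 - 2 * p) / math.sqrt(n * p * (1 - p))

def max_pmf_gap(n: int, p: float) -> float:
    """Largest pointwise gap between the binomial pmf and its Poisson stand-in."""
    lam = n * p
    return max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1))

# Hypothetical values, not taken from any exit poll data.
N_SAMPLE = 100
P_DEAD_HEAT = 0.50
P_LOPSIDED = 0.03

print("skewness, dead heat:", binom_skewness(N_SAMPLE, P_DEAD_HEAT))
print("skewness, lopsided: ", round(binom_skewness(N_SAMPLE, P_LOPSIDED), 3))
print("max binomial-vs-Poisson gap, lopsided:", round(max_pmf_gap(N_SAMPLE, P_LOPSIDED), 4))
```

Under these assumptions the dead-heat count is symmetric while the lopsided count is clearly right-skewed, which is the sense in which horse-race intuitions stop applying in partisan precincts.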

It would be interesting to compare precincts that were partisan in 2004 vs. 2000 against those that changed from partisan to non-partisan or vice versa.

Posted by: John Goodwin | Jun 3, 2005 10:47:42 PM

Yeah, this place is so educational.

The potential money from participants probably isn't there yet, but I'd love to see this turn into a kind of open-source polling outfit. Come up with questions, finance a poll, publicly tear apart and examine the results, form new theories, finance another poll, etc... I mean, you've all already shown that this kind of participation leads to intelligence that is smarter than Mitofsky. ;-)

Posted by: tunesmith | Jun 4, 2005 4:24:43 PM

It seems that the discourse has now moved from "not ruling out" rBr to the plausibility of rBr, where it should have been all the time.

Truth-is-All (TIA) on DU has produced an "optimizer" model which he thinks proves the total implausibility of the rBr hypothesis.

http://www.democraticunderground.com/discuss/duboard.php?az=show_mesg&forum=203&topic_id=375366&mesg_id=375366

Please note that Ron Baiman has said that TIA has considerable expertise in quantitative analysis and has asked TIA to "join our list."

http://www.democraticunderground.com/discuss/duboard.php?az=show_mesg&forum=203&topic_id=375366&mesg_id=375497&page=

I guess Baiman means join the list of USCV sponsors.

TIA has called on MP to comment on his optimizer model:

http://www.democraticunderground.com/discuss/duboard.php?az=show_mesg&forum=203&topic_id=375402&mesg_id=375402


And if I understand TIA correctly, before he advocated the model he ran a 100,000 trial simulation with it.

http://www.democraticunderground.com/discuss/duboard.php?az=view_all&address=203x373725

Coincidentally, after being challenged by TIA, Elizabeth Liddle has now packed it in and will no longer post on DU.

http://www.democraticunderground.com/discuss/duboard.php?az=view_all&address=203x375576

Posted by: davidgmills | Jun 6, 2005 11:48:10 AM

Thanks for the play-by-play david. If I have kept score correctly, TIA has febble running scared (1-0), Ron B. giving high fives (2-0), and MP worried his reputation will be soiled if he doesn't respond to the public challenge for a duel (3-0). Is that correct?

Some questions:
1) Can you define the Edison-Mitofsky hypothesis and tell us how this fits with what you call "rBr"?
2) Can you explain how TIA's optimizer model works and how it differs from the USCV, O'Dell, or Liddle models? (i.e., what are the inputs and outputs?)
3) How exactly has TIA proven rBr to be implausible?;
4) At what point does rBr become implausible? (i.e., is rBr - how you define it - implausible under all circumstances, or only some circumstances? Please elaborate.);
5) What is Ron B.'s hypothesis now that he finally realizes that he's been arguing against a definition of "constant mean bias" that only he was using?; and
6) What ever happened to the Bsvcc hypothesis of the 3/31 study? Is it still viable given the data now in the public realm?

Posted by: Rick Brady | Jun 6, 2005 3:31:43 PM

Thanks Rick for baiting me. You know I am no mathematician. I don't pretend to be. I am just the messenger. Don't shoot me. Just reporting what I glean from what I read.

But the census report was interesting too don't you think? 53.6% of voters being women.

Especially given Pew's breakdown of party ID at the beginning of 2004.

http://people-press.org/reports/display.php3?PageID=750

This quote is interesting for an article that claims the Repubs and Dems have reached near parity:

"Women tilt Democratic by a margin of 36% to 29%, while men favor the Republican party by a margin of 32% to 27%.

Women in every age group are more Democratic than Republican, with the largest gaps occurring among those age 60 and older. But Democrats also have a big advantage among young women (ages 18-24) and Baby Boomers."

By my calculations, if Mitofsky is right and the census is right and between 53.6% and 54% of the voters were women, Kerry should have won about 52.9% of the vote.

I guess those rBr's were mostly women.

Posted by: davidgmills | Jun 6, 2005 4:00:16 PM


TIA is concerned with a different problem--not the explanation of WPE discussed here, but whether the proposed rBr mechanism collides with the global constraint that the vote totals (percentage of popular vote) be such and such. His reported simulation result purports to show that the rBr mechanism cannot lead to the observed election vote split. [Though his later comment, that in a model sensitivity test Kerry "wins" only 68% of the simulations to Bush 32%, suggests his results, if the calculations were done correctly, may be significant at the 32% level--i.e. not significant :) ]

Just as a general comment on methodology--proposing 8 models and showing that, when [mis]calculated, none of them satisfy some known constraints on the dataset--doesn't really prove anything much. You are entitled to conclude that either the model or the calculation is wrong, or both.

We've thrown out the election by hypothesis. So what we are left with is a projection from--not the poll--but some gross and much-rounded statistical summaries of the poll. In other words, we take a poll that was only good for 3% to begin with, throw away most of the information in it except for some contentious and disputed summary--and what? Make a prediction of who "really" (was going to later that day...) win? Same as the pollster himself did on the day of the election only with a fraction of his information? We are supposed to prove using mathematics alone not only who *certainly* won but prove fraud as well?

Let's apply a conservation of information law: whatever TIA knows, it can't be *more* than M/E on the day of the election! Now, I'm the first to admit that there are mathematical constraints violated in the "leaked" data--after all, the tabulation programs were broken and fixed in flight, per the post mortem. But really! These efforts just distract from the main point, which is that without the raw data, as a scientist you know nothing. Mitofsky is beneath contempt for not releasing the raw data (at least, he may be a patriot to some partisan political or business interest, but he isn't serving any scientific end).

TIA's claim, however, is that there is no tabulatable data set (that's what a proposition like "no possible or at least reasonable way to satisfy constraints" means when the rubber hits the road--no viable and reasonable data set). But of course then either E/M *knows* his data set doesn't really tabulate and is fraudulently concealing the fact, or else they are pathetically deluded that their tabulation programs work correctly. The answer either way is, of course, for independent reviewers to see the underlying data set.

[Now, back in January, I presented my own arguments that no such data set can really exist, and that the real "secret" about the "leaked" data sets is that all the TV projections were done with flawed tabulation programs, but those arguments were based on signatures of data subsets that appeared in the various cross-tabs, not on the final tabulations].

In any event, a quick look at your posted links shows TIA calculates the "weighted" response rate by using the *number of precincts* as a weighting factor for the response proportions. Hunh? The precincts all have the exact same population? I don't think we need pay too much attention to such simulations. If the code is posted somewhere, someone can take a look, but this one isn't going to pass peer review. Not even close.
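A rough sketch of why the choice of weights matters, assuming three made-up precinct categories whose precincts differ in size. None of these figures come from TIA's model or from the exit poll data; the category labels, precinct counts, voters per precinct and response rates are all hypothetical.

```python
def weighted_mean(values, weights):
    """Average of values where each value counts in proportion to its weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total

# Hypothetical categories: (label, precincts, average voters per precinct, response rate)
categories = [
    ("category A", 50, 400, 0.50),
    ("category B", 100, 800, 0.55),
    ("category C", 50, 1600, 0.60),
]

rates = [rate for _, _, _, rate in categories]
precinct_weights = [n for _, n, _, _ in categories]
voter_weights = [n * size for _, n, size, _ in categories]

# Weighting by precinct count treats every precinct as if it held the same number of voters.
by_precinct = weighted_mean(rates, precinct_weights)
# Weighting by voters lets larger precincts count for more.
by_voters = weighted_mean(rates, voter_weights)

print(f"weighted by number of precincts: {by_precinct:.4f}")
print(f"weighted by number of voters:    {by_voters:.4f}")
```

The two averages agree only when every precinct has the same population, which is the assumption being questioned above.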

Posted by: John Goodwin | Jun 6, 2005 4:07:20 PM

The funny thing is that Ron Baiman is so desperate that he offered TIA a place in USCV for his (get this) "considerable expertise". I really wish TIA would accept - that would be a death knell for USCV. Accept, TIA, accept.

davidgmills - it is disingenuous for you to come here with the TIA crap when you know that

1. When faced with questions TIA never answers them, claiming that they come from "disruptors" (see http://www.democraticunderground.com/discuss/duboard.php?az=show_mesg&forum=203&topic_id=375609&mesg_id=375609)

2. When someone persists in debunking TIA's amateurish incompetency, TIA runs to the moderators on DU and the debunker gets banned - every time.

Posted by: Sawyer | Jun 6, 2005 4:17:51 PM

The URL above got a ) attached to it so won't work. Here is a tinyurl for anyone interested:

http://tinyurl.com/76dm8

Posted by: Sawyer | Jun 6, 2005 4:20:41 PM

Lizzie was never banned. Brady was banned. So was I. An equal opportunity banner, it seems.

Posted by: davidgmills | Jun 6, 2005 4:28:44 PM

John:

In seven minutes, you managed to review all of TIA's work. That about says it all.

Posted by: davidgmills | Jun 6, 2005 4:31:01 PM

David - the world is round.

Posted by: John Goodwin | Jun 6, 2005 4:50:39 PM

Febble was not banned on DU for one reason - she was *very* careful not to contradict TIA directly, in any thread, on purpose, because she did not want to be banned.

Just see the thread above that I gave the link to: http://tinyurl.com/76dm8 Too bad I didn't save it before the moderator went through and deleted all the interesting bits. But the gist of it was: people asked TIA to back up his position, TIA said "I don't explain anything, I am God around here", and people commented on how pathetic that kind of attitude is.

Posted by: Sawyer | Jun 6, 2005 5:24:10 PM

Ya, I have a feeling there's some sort of connection between the moderators and TIA at DU. I am kobeisguilty on DU (I posted the comment that the tinyurl linked to) and I have tried to have a reasonable discussion with TIA on the subject, but within 2 posts of disagreement he lets the ad hominem attacks fly. But every time, I get my message deleted for criticizing TIA's personal attacks.

Anyways, who really expects to have a reasonable discourse on DU anyways?

And david: "In seven minutes, you managed to review all of TIA's work. That about says it all."

Let's avoid the quips on this board please... John was good enough to provide a detailed analysis with a logical argument behind why he doesn't support TIA's thesis. Either respond in kind or let someone else respond for you. We're all friends here, and the less sarcastic fluff there is to wade through, the more educated MP's audience will be :)

kizzle(kobeisguilty)

Posted by: Kizzle | Jun 6, 2005 6:42:55 PM

Funny thing is that TIA and Ron B don't agree on some basic things. It's Ron B's algebra that said Bush Strongholds have more Vote Count Corruption (Bsvcc). TIA's odds calculations show that the impossible one-tail p-values for design effect equals FRAUD EVERYWHERE!

In TIA's world, there is no "rBr" (or design effect for that matter). In Ron's world, the rBr is causing the "sausage to float" (y-intercept above zero), and therefore it is prevalent, but he doesn't much care because *something* funny is going on in the High Bush precincts that can't be explained by rBr (constant mean bias cannot explain the mean and median WPE pattern by precinct partisanship!).

John Goodwin - I completely disagree with you about the raw dataset. Perhaps a "blurred" dataset could be released, but unless a court orders the release, E-M and the NEP have a responsibility to protect exit pollsters and subjects. I'd love to see a dataset similar to the one released to ESI for Ohio released to everyone. However, preparation of the OH dataset took a while. Who is going to front the cash for all the other states?

David, I was banned from DU because I am an "evil" conservative. Why don't you tell everyone why you were banned?

If it wasn't for febble, OTOH, and Bruce, activity on the DU 2004 Election board would be nil. No one really reads it anymore. There are few "believers" left. Banning OTOH or Febble, when they clearly are abiding by the rules of the Board, would only chase off the remaining reasonable DUers (thankfully, that is not yet an oxymoron). TIA, autorank, SunshineKathy, TomMcintyre, MelissaG, RonB and crew can have fun talking to themselves. May they continue to forward their "analysis" to John Conyers and Jim Lampley.

Posted by: Rick Brady | Jun 6, 2005 6:58:29 PM

David-

Did you read my comments regarding the census poll data when you first posted them? The census data is inaccurate for two reasons:

1) The census folks say it is: they systematically overreport the number of voters. In 2000 it was a 5 million vote difference.

2) You can pick a state, do some simple math, and compare the number of registered voters reported by the Census Bureau to the actual number of registered voters reported by the Secretary of State. California is off by 2.4 million voters, for example. When easily verifiable facts are wrong, I tend to discount subsequent analyses.

Which is where I am with TIA. In his poll "Which of these facts convinced you that the election was stolen?" the top vote getter is: "97% of 40,000 documented voting anomalies favoring Bush"

Presumably he is referring to the "Election Incident Reporting System" (reachable through www.verfiedvoting.org) which currently has 42,696 incidents. Of those incidents, 1271 (less than 3%!) mention "Bush" and/or "Republican." Over 9K incidents are "Polling Place Inquiries" (Where do I vote?). The largest category (15K+) is Registration Related Problems (typically "Am I registered?").
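The arithmetic behind those shares can be checked in a few lines. This is only a sketch using the counts quoted above; the two inquiry categories are given as rounded lower bounds ("over 9K" and "15K+"), so their combined share is a floor rather than an exact figure.

```python
TOTAL_INCIDENTS = 42_696
MENTION_BUSH_OR_REPUBLICAN = 1_271

# Rounded lower bounds quoted in the comment, not exact counts.
POLLING_PLACE_INQUIRIES_MIN = 9_000
REGISTRATION_PROBLEMS_MIN = 15_000

def share(part: int, whole: int) -> float:
    """Fraction of the whole represented by part."""
    return part / whole

partisan_share = share(MENTION_BUSH_OR_REPUBLICAN, TOTAL_INCIDENTS)
routine_share_min = share(
    POLLING_PLACE_INQUIRIES_MIN + REGISTRATION_PROBLEMS_MIN, TOTAL_INCIDENTS
)

print(f"incidents mentioning Bush/Republican: {partisan_share:.2%}")
print(f"routine inquiries, at least:          {routine_share_min:.0%}")
```

On these counts the incidents that mention either term come to just under 3% of the database, while the two routine categories alone account for more than half of it.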

The most popular reason by far on TIA's poll for suspecting that the election is fraudulent is completely made up. My guess is that much of TIA's other work evidences a similar level of skill in quantitative analysis.

Marty H

Posted by: Marty H | Jun 7, 2005 2:30:28 AM

To David, a correction:

No coincidence is required to explain the timing of my departure at DU at all. I also departed on a day the sun rose in the East. Derisive comment directed at me on DU has been even more frequent and predictable.

I like debate. It was not a debate. And eventually a good friend persuaded me to take my hand out of the blender.

Posted by: Febble | Jun 7, 2005 5:55:15 AM

Rick, when I say Mitofsky is beneath contempt, I mean that literally and not as the slams that seem so common in these discussions. Unfortunately, he has put himself and his organization in the position that we must, by scientific canon, ignore what they have to say and if they can't cough up that's their problem.

You don't get to play scientific rational being and hide your data from scrutiny at the same time. At some point the scientific community has to stop listening to cranks, as a policy for protecting its integrity. The means of avoiding such a fate is simple: open your log book, show your calculations, spell out your logic, submit to peer review with reasonable conditions. If you say But... But... But... -- that's a red flag. You lose cred fast.

In a scientific discussion, there comes a point where people question your data--raise, politely but firmly, enough legitimate objections that you have to show your data (or submit it to peer review, or work out some blurring protocol that proves your point without compromising your ethics) ... to maintain even a pretence of "science".

If Mitofsky happens to have sold his scientific reputation to some mediaconglom too cheap and doesn't happen to have enough money to buy it back, and no one else wants to bail him out, then too bad. You takes your money and you pays your chances. There's no law that says there has to be a believable exit poll. There's no law that says we have to believe one TV channel or one polling organization. If they can't afford to play science because they have families to feed--well, they have lots of company.

Beneath contempt is not contempt. It means you don't satisfy the entrance criteria for honor, rationality, or science and can't be held up to that bar. We don't get to ask "how scientific are these experimental results for which we don't see the data".

O'Dell and the other scientific workers have done what can be done with what we know--and that's about it. If there's no raw (near-raw) data, then this is just Cold Fusion anyway. It is important to combat the illusion that we know this or that when we don't. We need to slap down reports that have a lot of math mumbo-jumbo but that any scientist can see are hogwash. When those things get out into the street, they scare the horses. It is negligent when someone says "Green cheese hence FIRE!" not to say "Nope. No Fire."

I would love to see data. But if we don't have it, it is dishonest to pretend to conclusions. They gots it, we don't. No science here folks... move along.

Posted by: John Goodwin | Jun 7, 2005 9:31:25 AM

John, I believe that you are right in many respects, but wrong in others. Perhaps there is a distinction to be drawn between social sciences and hard sciences? Social Scientists deal with sensitive data all the time (heck, I had to go through this insane human subjects process just to interview 10 people for my honor's thesis). Are their conclusions challenged as unscientific if they don't reveal the subjects of their study?

I believe release of "blurred" data similar to what was released to ESI is in order. Time and money are the issues now. I believe that ESI has committed to making the blurred Ohio dataset available.

Also, you mentioned above a problem you suspected with the "projection tabulation programs." Can you clarify? There is a difference between exit poll "projections" and "estimates."

Posted by: Rick Brady | Jun 7, 2005 10:46:50 AM

To answer your query, Rick. You can go back and read the posts if you like, but the gist of my comment was: if you recall, on election night some PDFs were leaked (see scoop.nz) that had regional distributions of answers to about 50 questions (some questions were 1/2 or 1/4 of the form).

I ignored the races and just looked at the number of respondents to each question, treating the N respondents as a 50-dimensional vector and looked at possible decompositions (cross-tabulations of the number of respondents). There was one field that should have been tabulated for all records, but for which exactly 2000 respondents were not tabulated. Each of its regional components was exactly divisible by 4, so a reasonable hypothesis is that 500 records with a weight of 4 were included in the data set released to the media.
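The divisibility argument can be sketched as follows. Only the total of 2000 untabulated respondents and the "each regional component divisible by 4" observation come from the comment above; the regional split, the region names and the helper `implied_records` are hypothetical and stand in for figures that are not reproduced here.

```python
def implied_records(regional_counts, weight):
    """Return per-region record counts if every weighted count is an exact
    multiple of weight, otherwise None (the hypothesis does not fit)."""
    if weight <= 0:
        raise ValueError("weight must be positive")
    if any(count % weight for count in regional_counts):
        return None
    return [count // weight for count in regional_counts]

# Hypothetical regional breakdown of the 2000 untabulated respondents.
missing_by_region = {"East": 120, "Midwest": 200, "South": 280, "West": 1400}

records = implied_records(list(missing_by_region.values()), weight=4)
if records is None:
    print("counts are not all multiples of 4; the weight-4 hypothesis fails")
else:
    for region, n in zip(missing_by_region, records):
        print(f"{region}: {n} records at weight 4")
    print("total implied records:", sum(records))
```

With a split like this one, the weight-4 reading yields 500 underlying records concentrated in the West, which matches the shape of the hypothesis described in the comment without confirming it.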

That 500 record subset was overwhelmingly in the West, suggesting that it was some sort of "seeding" meant to fill in until real data (or a different kind of data) arrived. Since it had a clear "signature" [missing set of data with same regional distribution] it was provably present in all three leaked data sets (and hence not part of an error).

What I proved is that between the "House" and the "President" data sets, if they were filters of the same relational data base, more data had to change in the West than was implied in the "signature". Feasible conclusions that follow from this fact: (1) the tabulation program used was broken [admitted in post mortem], (2) the data were generated dynamically by a simulation program or similar, not by filtering a database to which rows were being added, (3) the house and president data sets were for different databases. (3) is attractive but I disproved it -- not only did both data sets share the "signature" data, but by a sort of Kasiski cryptanalysis of the *number of respondents* to each question, you could prove that the data sets were improbably and undeniably related. They could be modelled as filters of a common database with almost complete overlap--i.e. differing only in obvious ways, such as "president" excluding non-respondents to "who did you vote for?"

By looking at just 5 or 6 rows, N respondents not B or K stuff, you could show the output of the tabulation program on election night was either broken, or tabulating two different data sets that shared identical statistical properties, or was pure synthesis. Skeptics take your pick but rational people don't get other choices. :)

Posted by: John Goodwin | Jun 7, 2005 11:42:01 AM

Thanks John. Got it.

Posted by: Rick Brady | Jun 7, 2005 12:03:18 PM

I’m a long time reader and fan of MP -- I sent my students here at MIT to the site last fall in the run-up to the election and they all found it very helpful. However, I’m a first-time poster.

My understanding is that the individual-level data is available through ICPSR and Roper. The data is free to members and can be purchased for $79 from Roper.

For a description see: http://webapp.icpsr.umich.edu/cocoon/ICPSR-STUDY/04181.xml

Is there other data folks are looking for?

Posted by: Adam Berinsky | Jun 7, 2005 12:28:55 PM

If that data set were to identify the precincts by name, I think they would have more sales. That of course is the issue--if you have a prima facie case for either fraud or incompetence, then checking the calibration of the measurement to find the systematic error is the thing to do. Supposedly, everyone polled also ended up in a vote tally, so the random walk each candidate does in each precinct has to start from the poll, and walk plausibly towards the target, or, or, or, ... ?

Posted by: John Goodwin | Jun 7, 2005 2:04:02 PM
