USCV vs. USCV

June 2, 2005 | November 14, 2019 | Mark Blumenthal | 98 Comments

Back to exit polls for a moment. Bruce O’Dell, the founding Vice President of U.S. Count Votes (USCV), the organization that has been arguing that the official explanations for the “exit poll discrepancy” are “implausible,” has just released a paper that refutes, well…the most recent “working paper” by U.S. Count Votes.

Some background: Back in April, USCV released a report titled “Analysis of Exit Poll Discrepancies” (discussed on MP here and here) that purported to show the implausibility of explanations for the discrepancies provided by the exit pollsters themselves. Subsequently, Elizabeth Liddle, a self-described “fraudster” who had reviewed early drafts of that report, did her own analysis and showed that a statistical artifact undermined the conclusions of the US Count Votes report (discussed by MP here). At the AAPOR conference last month, exit pollster Warren Mitofsky presented findings based on Liddle’s work that confirmed her hypothesis. At the conference, US Count Votes author Ron Baiman distributed another “working paper” that claimed to refute Liddle’s work. The new paper was signed by only four of the twelve authors of the original USCV report. Baiman subsequently posted several very long comments on this site in the same vein.

Got that? It’s been quite a “debate.”

Yesterday Bruce O’Dell, one of the original USCV authors, stepped forward with his own forceful evisceration of Baiman’s arguments based on the computer simulations O’Dell did for USCV. MP had considered doing a summary of O’Dell’s paper, but could not find a way to improve on O’Dell’s:

The key argument of the USCV Working Paper is that Edison/Mitofsky’s exit poll data cannot be explained without either (1) highly improbable patterns of exit poll participation between Kerry and Bush supporters that vary significantly depending on the partisanship of the precinct in a way that is impossible to explain, or (2) vote fraud. Since they rule out the first explanation, the authors of the Working Paper believe they have made the case that widespread vote fraud must have actually occurred.

However, a closer look at the data they cite in their report reveals that Kerry and Bush supporter exit poll response rates actually did not vary significantly by precinct partisanship. Systematic exit poll bias cannot be ruled out as an explanation of the 2004 Presidential exit poll discrepancy – nor can widespread vote count corruption. The case for fraud is still unproven, and I believe will never be able to be proven through exit poll analysis alone.

This paper should not be misinterpreted as an argument against the likelihood of vote fraud. Quite the opposite; I believe US voting equipment and vote counting processes are severely vulnerable to systematic insider manipulation and that is a clear and present danger to our democracy. I strongly endorse the Working Paper’s call to implement Voter-Verifiable Paper Ballots and a secure audit protocol, and to compile and analyze a database of election results.

Judging only by the word count of the comments in my last post on this subject, there may appear to be some genuine question about whether the explanations provided by Edison-Mitofsky for the discrepancy between the exit poll results and the actual count are “plausible.” There is, to be sure, much sound, fury and name-calling in this debate, but on the substance and the evidence the jury is in. Even US Count Votes’ founding Vice President can see it. As O’Dell says, “systematic exit poll bias cannot be ruled out as an explanation of the 2004 Presidential exit poll discrepancy.”

UPDATE – DemFromCt over at DailyKos chimes in on O’Dell’s paper and stresses a point I neglected. Dem takes issue with:

The multiple attacks on Elizabeth Liddle’s credentials, motivation, etc. (and
those of anyone who agrees with her) that’s become a cottage industry at DU [Democratic Underground] and
at times, here at Daily Kos by a minority of posters. Kudos to Bruce O’Dell to
have the intellectual integrity to write this; my hat is doffed. I hope his
paper (and post) is read in the spirit in which it was written. And we really
need to move on to something else [link and emphasis added].

Agree on all counts.

* * *

Note: As always, MP welcomes dissenting opinions in the comments section. However, the subject of exit polls and voter fraud seems to generate an unusual level of invective. One comment in the last post on this topic inexplicably mocked the religious faith of another commenter. I have deleted that comment — the first time I have ever seen the need to do so on Mystery Pollster.

MP is generally libertarian when it comes to the comments section, but found the earlier comment to be repugnant and unacceptable, so please be advised: There is no room for slurs against the gender, race, ethnicity, religion or sexual orientation on Mystery Pollster. In the future, I will not hesitate to delete comments I consider morally offensive. My board, my rules.

Mark Blumenthal

Mark Blumenthal is a political pollster with deep and varied experience across survey research, campaigns, and media. The original "Mystery Pollster" and co-creator of Pollster.com, he explains complex concepts, and how data informs politics and decision-making, to a multitude of audiences. A researcher and consultant who crafts effective questions and identifies innovative solutions to deliver results. An award-winning political journalist who brings insights and crafts compelling narratives from chaotic data.

98 thoughts on “USCV vs. USCV”

Sawyer says:

June 2, 2005 at 5:04 pm

Putting aside the statistical arguments, I find it incredibly unethical for Ron Baiman et al. to publish a paper that includes Bruce O’Dell’s work without Bruce O’Dell’s signature, agreement or approval of the paper.
Rick Brady says:

June 2, 2005 at 6:56 pm

I believe Bruce allowed them to include his work in the paper, but thought that his refusal to sign the paper was indicative enough that he wasn’t comfortable with the conclusions being drawn from it. Maybe he can visit and comment.
Why would anyone continue to read the USCV revisions when: 1) they haven’t had the decency to put febble’s critique on their website under their section on criticisms of their work; and 2) the flat regression line through the scatterplot presented by Mitofsky destroys the Bsvcc hypothesis of their 3/31 study.
My 6-year-old son can look at that scatterplot and know that nothing “odd” is going on in the Bush strongholds compared to the other categories. Oh, Ron – the arbitrarily drawn categories are not “quintiles” as you keep referring to them. If E-M had used quintiles, we wouldn’t be having this discussion (although it would undoubtedly be a different discussion, because the end seems to justify any mean for some at USCV).
At some point this entire discussion is worthless. But thanks MP for posting the link and a big hat tip to Bruce for taking a stand.
Alex in Los Angeles says:

June 3, 2005 at 6:09 pm

Unusual level of invective for the MP blog, at least.
The discourse on your comments is unusually serious minded. IMHO your community is a model for fruitful discourse.
John Goodwin says:

June 3, 2005 at 10:47 pm

Excellent: (1) variance is not constant but sensitive to partisanship and (2) we would know more if we could actually see the data. Finally we have a sensible conclusion.
The root problem all along seems to have been that naive intuitions are correct for horse races (p=0.5 approx.) but not every precinct is a dead heat, even if the national race is.
In the limiting case of a biased precinct, we get a case where the number of voters for the “out party” might be a “rare event”–in the limiting case, competing with the third-parties we seem to like to consider background noise. The expected number of outs in the sample might be, say, Poisson-distributed rather than Gaussian as we had back in the ol’ F.D. (Freeman Days). Surely the distribution of “expected N(outs)” will be asymmetric!
Remember that outs and ins don’t “compete” with each other–that’s the root of our symmetry intuition. We really have two sets of people walking out–black balls from the urn who are a dime a dozen, and *independently* red ones who are rare. In the partisan case, the “clumpiness” of these rare events might show up quite significantly [rather like bursts or batch jobs vs. memoryless exponential arrivals in queuing theory]. There’s a K-P correction factor for burstiness lurking there somewhere as well as a Fraud correction. 🙂
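The asymmetry point can be made concrete with a minimal sketch. It assumes each sampled voter is an independent draw with a fixed out-party probability p, which is an idealization; the sample size of 100 and the two p values are illustrative choices, not numbers from the exit poll data. The sketch compares the closed-form skewness of a binomial count at p = 0.5 and p = 0.03, then lines up the binomial and Poisson probabilities for the rare case.

```python
from math import comb, exp, factorial, sqrt

def binomial_pmf(k, n, p):
    """Probability of exactly k out-party voters among n sampled voters."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson approximation to the same count, with mean lam = n * p."""
    return exp(-lam) * lam**k / factorial(k)

def binomial_skewness(n, p):
    """Closed-form skewness of a Binomial(n, p) count; 0 means symmetric."""
    return (1 - 2 * p) / sqrt(n * p * (1 - p))

n = 100  # illustrative sample size (assumption, not from the exit poll data)

# A dead-heat precinct is symmetric; a lopsided one is clearly skewed.
for p in (0.5, 0.03):
    print(f"p={p}: skewness={binomial_skewness(n, p):.3f}")

# For a rare out-party (p = 0.03) the Poisson values track the binomial closely.
for k in range(6):
    print(k, round(binomial_pmf(k, n, 0.03), 4), round(poisson_pmf(k, n * 0.03), 4))
```

Under these assumptions the count of out-party respondents is symmetric only in the dead-heat case, which is the intuition the comment says breaks down in partisan precincts. The sketch does not model clumpiness or non-response.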
It would be interesting to compare precincts that were partisan in 2004 vs. 2000 against those that changed from partisan to non-partisan or vice versa.
tunesmith says:

June 4, 2005 at 4:24 pm

Yeah, this place is so educational.
The potential money from participants probably isn’t there yet, but I’d love to see this turn into a kind of open-source polling outfit. Come up with questions, finance a poll, publicly tear apart and examine the results, form new theories, finance another poll, etc… I mean, you’ve all already shown that this kind of participation leads to intelligence that is smarter than Mitofsky. 😉
davidgmills says:

June 6, 2005 at 11:48 am

It seems that the discourse has now moved from “not ruling out” rBr to the plausibility of rBr, where it should have been all the time.
Truth-is-All (TIA) on DU has produced an “optimizer” model which he thinks proves the total implausibility of the rBr hypothesis.
http://www.democraticunderground.com/discuss/duboard.php?az=show_mesg&forum=203&topic_id=375366&mesg_id=375366
Please note that Ron Baiman has said that TIA has considerable expertise in quantitative analysis and has asked TIA to “join our list.”
http://www.democraticunderground.com/discuss/duboard.php?az=show_mesg&forum=203&topic_id=375366&mesg_id=375497&page=
I guess Baiman means join the list of USCV sponsors.
TIA has called on MP to comment on his optimizer model:
http://www.democraticunderground.com/discuss/duboard.php?az=show_mesg&forum=203&topic_id=375402&mesg_id=375402
And if I understand TIA correctly, before he advocated the model he ran a 100,000 trial simulation with it.
http://www.democraticunderground.com/discuss/duboard.php?az=view_all&address=203x373725
Coincidentally, after being challenged by TIA, Elizabeth Liddle has now packed it in and will no longer post on DU.
http://www.democraticunderground.com/discuss/duboard.php?az=view_all&address=203x375576
Rick Brady says:

June 6, 2005 at 3:31 pm

Thanks for the play-by-play david. If I have kept score correctly, TIA has febble running scared (1-0), Ron B. giving high fives (2-0), and MP worried his reputation will be soiled if he doesn’t respond to the public challenge for a duel (3-0). Is that correct?
Some questions:
1) Can you define the Edison-Mitofsky hypothesis and tell us how this fits with what you call “rBr”?
2) Can you explain how TIA’s optimizer model works and how it differs from the USCV, O’Dell, or Liddle models? (i.e., what are the inputs and outputs?)
3) How exactly has TIA proven rBr to be implausible?;
4) At what point does rBr become implausible? (i.e., is rBr – how you define it – implausible under all circumstances, or only some circumstances? Please elaborate.);
5) What is Ron B.’s hypothesis now that he finally realizes that he’s been arguing against a definition of “constant mean bias” that only he was using?; and
6) What ever happened to the Bsvcc hypothesis of the 3/31 study? Is it still viable given the data now in the public realm?
davidgmills says:

June 6, 2005 at 4:00 pm

Thanks Rick for baiting me. You know I am no mathematician. I don’t pretend to be. I am just the messenger. Don’t shoot me. Just reporting what I glean from what I read.
But the census report was interesting too don’t you think? 53.6% of voters being women.
Especially given Pew’s breakdown of party ID at the beginning of 2004.
http://people-press.org/reports/display.php3?PageID=750
This quote is interesting for an article that claims the Repubs and Dems have reached near parity:
“Women tilt Democratic by a margin of 36% to 29%, while men favor the Republican party by a margin of 32% to 27%.
Women in every age group are more Democratic than Republican, with the largest gaps occurring among those age 60 and older. But Democrats also have a big advantage among young women (ages 18-24) and Baby Boomers.”
By my calculations, if Mitofsky is right and the census is right and between 53.6% and 54% of the voters were women, Kerry should have won about 52.9% of the vote.
I guess those rBr’s were mostly women.
John Goodwin says:

June 6, 2005 at 4:07 pm

TIA is concerned with a different problem–not the explanation of WPE discussed here, but whether the proposed rBr mechanism collides with the global constraint that the vote totals (percentage of popular vote) be such and such. His reported simulation result purports to show that the rBr mechanism cannot lead to the observed election vote split. [Though his later comment, that in a model sensitivity test Kerry “wins” only 68% of the simulations to Bush 32%, suggests his results, if the calculations were done correctly, may be significant at the 32% level–i.e. not significant 🙂 ]
Just as a general comment on methodology–proposing 8 models and showing that, when [mis]calculated, none of them satisfy some known constraints on the dataset–doesn’t really prove anything much.
You are entitled to conclude, either the model or the calculation are wrong, or both.
We’ve thrown out the election by hypothesis. So what we are left with is a projection from–not the poll–but some gross and much-rounded statistical summaries of the poll. In other words, we take a poll that was only good for 3% to begin with, throw away most of the information in it except for some contentious and disputed summary–and what? Make a prediction of who “really” (was going to later that day…) win? Same as the pollster himself did on the day of the election only with a fraction of his information? We are supposed to prove using mathematics alone not only who *certainly* won but prove fraud as well?
Let’s apply a conservation of information law: whatever TIA knows, it can’t be *more* than M/E on the day of the election! Now, I’m the first to admit that there are mathematical constraints violated in the “leaked” data–after all, the tabulation programs were broken and fixed in flight, per the post mortem. But really! These efforts just distract from the main point, which is that without the raw data, as a scientist you know nothing. Mitofsky is beneath contempt for not releasing the raw data (at least, he may be a patriot to some partisan political or business interest, but he isn’t serving any scientific end).
TIA’s claim, however, is that there is no tabulatable data set (that’s what a proposition like “no possible or at least reasonable way to satisfy constraints” means when the rubber hits the road–no viable and reasonable data set). But of course then either E/M *knows* his data set doesn’t really tabulate and is fraudulently concealing the fact, or else they are pathetically deluded that their tabulation programs work correctly. The answer either way is, of course, for independent reviewers to see the underlying data set.
[Now, back in January, I presented my own arguments that no such data set can really exist, and that the real “secret” about the “leaked” data sets is that all the TV projections were done with flawed tabulation programs, but those arguments were based on signatures of data subsets that appeared in the various cross-tabs, not on the final tabulations].
In any event, a quick look at your posted links shows TIA calculates the “weighted” response rate by using the *number of precincts* as a weighting factor for the response proportions. Hunh? The precincts all have the exact same population? I don’t think we need pay too much attention to such simulations. But if the code is posted somewhere someone can take a look see but this one isn’t going to pass peer review. Not even close.
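The weighting objection can be illustrated with a small sketch. All numbers below are hypothetical, invented only to show the mechanics; they are not TIA’s inputs or Edison-Mitofsky’s figures. The sketch computes an overall response rate twice: once weighting each precinct group by its number of precincts, and once weighting by the number of voters those precincts contain.

```python
# Hypothetical precinct groups: (number of precincts, voters per precinct, response rate).
# These values are made up for illustration only.
groups = [
    (40, 300, 0.56),
    (415, 900, 0.53),
    (165, 1500, 0.50),
]

def weighted_mean(values, weights):
    """Weighted average of values using the given non-negative weights."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total

rates = [rate for _, _, rate in groups]

# Weighting by precinct count treats every precinct as if it had the same population.
by_precinct_count = weighted_mean(rates, [n for n, _, _ in groups])

# Weighting by voters accounts for precincts of different sizes.
by_voters = weighted_mean(rates, [n * size for n, size, _ in groups])

print(f"weighted by precinct count: {by_precinct_count:.4f}")
print(f"weighted by voters:         {by_voters:.4f}")
```

The two figures agree only when every group has the same voters per precinct. How much they differ in practice depends on the real precinct sizes, which this sketch does not know.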
Sawyer says:

June 6, 2005 at 4:17 pm

The funny thing is that Ron Baiman is so desperate that he offered TIA a place in USCV for his (get this) “considerable expertise”. I really wish TIA would accept – that would be a death knell for USCV. Accept, TIA, accept.
davidgmills – it is disingenuous for you to come here with the TIA crap when you know that
1. When faced with questions TIA never answers them, claiming that they come from “disruptors” (see http://www.democraticunderground.com/discuss/duboard.php?az=show_mesg&forum=203&topic_id=375609&mesg_id=375609)
2. When someone persists in debunking TIA’s amateurish incompetency, TIA runs to the moderators on DU and the debunker gets banned – every time.
Sawyer says:

June 6, 2005 at 4:20 pm

The URL above got a ) attached to it so won’t work. Here is a tinyurl for anyone interested:
http://tinyurl.com/76dm8
davidgmills says:

June 6, 2005 at 4:28 pm

Lizzie was never banned. Brady was banned. So was I. An equal opportunity banner, it seems.
davidgmills says:

June 6, 2005 at 4:31 pm

John:
In seven minutes, you managed to review all of TIA’s work. That about says it all.
John Goodwin says:

June 6, 2005 at 4:50 pm

David – the world is round.
Sawyer says:

June 6, 2005 at 5:24 pm

Febble was not banned on DU for one reason – she was *very* careful not to contradict TIA directly, in any thread, on purpose, because she did not want to be banned.
Just see the thread above that I gave the link to: http://tinyurl.com/76dm8 Too bad I didn’t save it before the moderator went through and deleted all the interesting bits. But the gist of it was: people asked TIA to back up his position, TIA said “I don’t explain anything, I am God around here”, and people commented on how pathetic that kind of attitude is.
Kizzle says:

June 6, 2005 at 6:42 pm

Ya, I have a feeling there’s some sort of connection between the moderators and TIA at DU, I am kobeisguilty on DU (I posted the comment that the tinyurl linked to) and I have tried to have a reasonable discussion with TIA on the subject, but within 2 posts of disagreement he lets the ad hominem attacks fly. But every time I get my message deleted for criticizing TIA’s personal attacks.
Anyways, who really expects to have a reasonable discourse on DU anyways?
And david: “In seven minutes, you managed to review all of TIA’s work. That about says it all.”
Let’s avoid the quips on this board please… John was good enough to provide a detailed analysis with a logical argument behind why he doesn’t support TIA’s thesis, either respond in kind or let someone else respond for you, we’re all friends here and the less sarcastic fluff there is to wade through here, the more educated MP’s audience will be 🙂
kizzle(kobeisguilty)
Rick Brady says:

June 6, 2005 at 6:58 pm

Funny thing is that TIA and Ron B don’t agree on some basic things. It’s Ron B’s algebra that said Bush Strongholds have more Vote Count Corruption (Bsvcc). TIA’s odds calculations show that the impossible one-tail p-values for design effect equals FRAUD EVERYWHERE!
In TIA’s world, there is no “rBr” (or design effect for that matter). In Ron’s world, the rBr is causing the “sausage to float” (y-intercept above zero), and therefore it is prevalent, but he doesn’t much care because *something* funny is going on in the High Bush precincts that can’t be explained by rBr (constant mean bias cannot explain the mean and median WPE pattern by precinct partisanship!).
John Goodwin – I completely disagree with you about the raw dataset. Perhaps a “blurred” dataset could be released, but unless a court orders the release, E-M and the NEP have a responsibility to protect exit pollsters and subjects. I’d love to see a dataset similar to the one released to ESI for Ohio released to everyone. However, preparation of the OH dataset took a while. Who is going to front the cash for all the other states?
David, I was banned from DU because I am an “evil” conservative. Why don’t you tell everyone why you were banned?
If it wasn’t for febble, OTOH, and Bruce, activity on the DU 2004 Election board would be nil. No one really reads it anymore. There are few “believers” left. Banning OTOH or Febble, when they clearly are abiding by the rules of the Board, would only chase off the remaining reasonable DUers (thankfully, that is not yet an oxymoron). TIA, autorank, SunshineKathy, TomMcintyre, MelissaG, RonB and crew can have fun talking to themselves. May they continue to forward their “analysis” to John Conyers and Jim Lampley.
Marty H says:

June 7, 2005 at 2:30 am

David-
Did you read my comments regarding the census poll data when you first posted them? The census data is inaccurate for two reasons:
1) The census folks say it is-they systematically overreport the number of voters. In 2000 it was a 5 million vote difference.
2) You can pick a state, do some simple math, and compare the number of registered voters reported by the Census Bureau to the actual number of registered voters reported by the Secretary of State. California is off by 2.4 million voters, for example. When easily verifiable facts are wrong, I tend to discount subsequent analyses.
Which is where I am with TIA. In his poll “Which of these facts convinced you that the election was stolen?” the top vote getter is: “97% of 40,000 documented voting anomalies favoring Bush”
Presumably he is referring to The “Election Incident Reporting System” (reachable through http://www.verfiedvoting.org) which currently has 42,696 incidents. Of those incidents, 1271 (less than 3%!) mention “Bush” and/or “Republican.” Over 9K incidents are “Polling Place Inquiries” (Where do I vote?) The largest category (15K+) is Registration Related Problems (typically “Am I registered?”).
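The “less than 3%” figure follows directly from the two counts quoted above. A quick check, using only those two numbers:

```python
total_incidents = 42_696    # incidents in the reporting system, as quoted above
mentions_bush_or_gop = 1_271  # incidents mentioning "Bush" and/or "Republican", as quoted above

share = mentions_bush_or_gop / total_incidents
print(f"{share:.2%}")  # prints 2.98%
```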
The most popular reason by far on TIA’s poll for suspecting that the election is fraudulent is completely made up. My guess is that much of TIA’s other work evidences a similar level of skill in quantitative analysis.
Marty H
Febble says:

June 7, 2005 at 5:55 am

To David, a correction:
No coincidence is required to explain the timing of my departure at DU at all. I also departed on a day the sun rose in the East. Derisive comment directed at me on DU has been even more frequent and predictable.
I like debate. It was not a debate. And eventually a good friend persuaded me to take my hand out of the blender.
John Goodwin says:

June 7, 2005 at 9:31 am

Rick, when I say Mitofsky is beneath contempt, I mean that literally and not as the slams that seem so common in these discussions. Unfortunately, he has put himself and his organization in the position that we must, by scientific canon, ignore what they have to say and if they can’t cough up that’s their problem.
You don’t get to play scientific rational being and hide your data from scrutiny at the same time. At some point the scientific community has to stop listening to cranks, as a policy for protecting its integrity. The means of avoiding such a fate is simple: open your log book, show your calculations, spell out your logic, submit to peer review with reasonable conditions. If you say But… But… But… — that’s a red flag. You lose cred fast.
In a scientific discussion, there comes a point where people question your data–raise, politely but firmly, enough legitimate objections that you have to show your data (or submit it to peer review, or work out some blurring protocol that proves your point without compromising your ethics) … to maintain even a pretence of “science”.
If Mitofsky happens to have sold his scientific reputation to some mediaconglom too cheap and doesn’t happen to have enough money to buy it back, and no one else wants to bail him out, then too bad. You takes your money and you pays your chances. There’s no law that says there has to be a believable exit poll. There’s no law that says we have to believe one TV channel or one polling organization. If they can’t afford to play science because they have families to feed–well, they have lots of company.
Beneath contempt is not contempt. It means you don’t satisfy the entrance criteria for honor, rationality, or science and can’t be held up to that bar. We don’t get to ask “how scientific are these experimental results for which we don’t see the data”.
O’Dell and the other scientific workers have done what can be done with what we know–and that’s about it. If there’s no raw (near-raw) data, then this is just Cold Fusion anyway. It is important to combat the illusion that we know this or that when we don’t. We need to slap down reports that have a lot of math mumbo-jumbo but that any scientist can see are hogwash. When those things get out into the street, they scare the horses. It is negligent when someone says “Green cheese hence FIRE!” not to say “Nope. No Fire.”
I would love to see data. But if we don’t have it, it is dishonest to pretend to conclusions. They gots it, we don’t. No science here folks… move along.
Rick Brady says:

June 7, 2005 at 10:46 am

John, I believe that you are right in many respects, but wrong in others. Perhaps there is a distinction to be drawn between social sciences and hard sciences? Social Scientists deal with sensitive data all the time (heck, I had to go through this insane human subjects process just to interview 10 people for my honors thesis). Are their conclusions challenged as unscientific if they don’t reveal the subjects of their study?
I believe release of “blurred” data similar to what was released to ESI is in order. Time and money are the issues now. I believe that ESI has committed to making the blurred Ohio dataset available.
Also, you mentioned above a problem you suspected with the “projection tabulation programs.” Can you clarify? There is a difference between exit poll “projections” and “estimates.”
John Goodwin says:

June 7, 2005 at 11:42 am

To answer your query, Rick. You can go back and read the posts if you like, but the gist of my comment was: if you recall, on election night some PDFs were leaked (see scoop.nz) that had regional distributions of answers to about 50 questions (some questions were 1/2 or 1/4 of the form).
I ignored the races and just looked at the number of respondents to each question, treating the N respondents as a 50-dimensional vector and looked at possible decompositions (cross-tabulations of the number of respondents). There was one field that should have been tabulated for all records, but for which exactly 2000 respondents were not tabulated. Each of its regional components was exactly divisible by 4, so a reasonable hypothesis is that 500 records with a weight of 4 were included in the data set released to the media.
That 500 record subset was overwhelmingly in the West, suggesting that it was some sort of “seeding” meant to fill in until real data (or a different kind of data) arrived. Since it had a clear “signature” [missing set of data with same regional distribution] it was provably present in all three leaked data sets (and hence not part of an error).
What I proved is that between the “House” and the “President” data sets, if they were filters of the same relational data base, more data had to change in the West than was implied in the “signature”. Feasible conclusions that follow from this fact: (1) the tabulation program used was broken [admitted in post mortem], (2) the data were generated dynamically by a simulation program or similar, not by filtering a database to which rows were being added, (3) the house and president data sets were for different databases. (3) is attractive but I disproved it — not only did both data sets share the “signature” data, but by a sort of Kasiski cryptanalysis of the *number of respondents* to each question, you could prove that the data sets were improbably and undeniably related. They could be modelled as filters of a common database with almost complete overlap–i.e. differing only in obvious ways, such as “president” excluding non-respondents to “who did you vote for?”
By looking at just 5 or 6 rows, N respondents not B or K stuff, you could show the output of the tabulation program on election night was either broken, or tabulating two different data sets that shared identical statistical properties, or was pure synthesis. Skeptics take your pick but rational people don’t get other choices. 🙂
Rick Brady says:

June 7, 2005 at 12:03 pm

Thanks John. Got it.
Adam Berinsky says:

June 7, 2005 at 12:28 pm

I’m a long time reader and fan of MP — I sent my students here at MIT to the site last fall in the run-up to the election and they all found it very helpful. However, I’m a first-time poster.
My understanding is that the individual-level data is available through ICPSR and Roper. The data is free to members and can be purchased for $79 from Roper.
For a description see: http://webapp.icpsr.umich.edu/cocoon/ICPSR-STUDY/04181.xml
Is there other data folks are looking for?
John Goodwin says:

June 7, 2005 at 2:04 pm

If that data set were to identify the precincts by name, I think they would have more sales. That of course is the issue–if you have a prima facie case for either fraud or incompetence, then checking the calibration of the measurement to find the systematic error is the thing to do. Supposedly, everyone polled also ended up in a vote tally, so the random walk each candidate does in each precinct has to start from the poll, and walk plausibly towards the target, or, or, or, … ?
Rick Brady says:

June 7, 2005 at 2:14 pm

Hi Adam, thanks for stopping by!
The data set you link to is weighted to the election result and does not include the relative weights so it is impossible to reconstruct the exit poll estimates from election night.
I believe that the “raw” dataset that many keep referring to is the dataset used to construct the scatterplots that Warren Mitofsky presented at AAPOR and Mark posted in another post.
A similar “raw” dataset for Ohio was made available to ESI, although it was packaged so that it is not possible to identify the actual precinct sampled. I’ve heard it called “blurring” but don’t quite know the details on how it was done other than it took a long time to produce.
If E-M, the NEP, or ESI would release the Ohio dataset that ESI received that would be a start. It would really be nice to have a similar dataset for the nation, but I see little incentive for the NEP or E-M to spend the time and effort preparing such a file. I don’t think they care much about the fraudsters, but releasing such a dataset would likely score them points with the aapor/poq audience.
In my brief exposure to this field, I’ve noticed a difference of perspective (heh) between academics and practitioners regarding access to data.
davidgmills says:

June 8, 2005 at 9:57 am

Can anybody explain to me what they think these graphs were meant to show or purport to show?
http://www.democraticunderground.com/discuss/duboard.php?az=view_all&address=203x376145
davidgmills says:

June 8, 2005 at 10:49 am

On edit:
What is your best guess TIA wants a guy like me, or even much more sophisticated people, to think they show?
Graph by graph, please.
davidgmills says:

June 8, 2005 at 11:14 am

Febble:
What you got on DU, which you say was not debate, was a taste of cross-exam. This is why many experts won’t venture near a courthouse, or why when they do, they want to be paid handsomely for the grilling they take.
In the cross-exam of an expert, you try to show bias, sympathy, conflicts of interest, financial interest, lack of qualifications, lack of local knowledge, etc., — in short, anything that will give the jury pause to believe an expert’s point of view.
In my personal injury cases, many times the treating doctor, even when he is the only doctor who testifies, can’t convince a jury that an accident caused a certain injury. It drives me nuts, but it happens all the time.
Even as between experts, when we have a battle, the tragedy of “a battle of experts” at the courthouse is that most of the time, neither the judge nor the jury are capable of concluding which expert is “right.”
I think the cross-exam rattled you. It doesn’t mean you weren’t right. It just means you did not come across as credible to many people.
Rick Brady says:

June 8, 2005 at 11:54 am

David, your arguments are a moving target and you don’t argue square. You are like many of the folks at DU. I’m glad febble left and am amazed that OTOH even bothers.
Two words: TORT REFORM.
John Goodwin says:

June 8, 2005 at 12:01 pm

David-
The dispute is not all that hard to understand in layman’s terms. Suppose we are polling in a Deep Republican district where Kerry voters are rarer than hen’s teeth, and we interview 100 respondents out of 150 we “sampled”. If we find no K voters at all (no evidence observed), what are we entitled to conclude about the K population of voters? We haven’t found any K’s, but they are rare. Do we have some sort of bound on how probable they are or not? What do we say if the vote shows fully 10% K?
With that in mind, let’s consider 1, 2 or 3…. K-voters. As the number of K’s goes up, do we gain information about them, and if so, how fast? [I have a better handle on the ratio of men to women in downtown Seattle where I can observe lots of each than on the exact number of Sasquatch vs. the rural population in the deep forest, based on census responses. :)]
The discussion has to do with how to get that feature of the calculation right–as the percentage of voters changes, how does one compensate for the inherent uncertainty of having less and less evidence for the minority party, on the fringes.
The various participants in the discussion have offered slightly different mechanisms for the selection process, which result in them assigning different probabilities (different formulas), which of course changes the calculation. The way of resolving the question is to decide whose model is (1) a priori more believable and (2) sufficiently fits the data, relative to its explanatory power.
How you answer the evidential question weights your assessment of the substantive issues about non-response, possible frauds, biases and what not and consequently what credence you place in the proposed explanations.
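A minimal sketch of the arithmetic behind the zero-count example above, assuming every interview is an independent draw with the same chance p of reaching a K voter (a simplification of real exit polling; the function names are illustrative only). It gives the chance of finding no K voters in 100 interviews if the true K share were 10%, and the largest K share that is still consistent with finding none.

```python
def prob_zero_hits(n: int, p: float) -> float:
    """Chance of meeting no K voters in n independent interviews,
    when each interview reaches a K voter with probability p."""
    return (1.0 - p) ** n


def upper_bound_on_p(n: int, confidence: float = 0.95) -> float:
    """Largest p for which zero hits in n interviews still has
    probability of at least (1 - confidence)."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


if __name__ == "__main__":
    n = 100
    print(f"P(0 of {n} | p = 0.10) = {prob_zero_hits(n, 0.10):.2e}")
    print(f"95% upper bound on p after 0 of {n}: {upper_bound_on_p(n):.4f}")
```

Under these assumptions, zero K voters in 100 interviews bounds the K share at just under 3% (roughly 3/n), and a 10% K vote would make such a sample a few-in-100,000 event.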
Rick Brady says:

June 8, 2005 at 12:34 pm

John, you wrote: “The discussion has to do with how to get that feature of the calculation right–as the percentage of voters changes, how does one compensate for the inherent uncertainty of having less and less evidence for the minority party, on the fringes.”
Forget politics and public opinion for a moment. Does this “problem” occur in other arenas? (I suspect it does) If so, can you direct us to literature that may offer guidance on how best to compensate?
John Goodwin says:

June 8, 2005 at 12:48 pm

Rick – I mostly come at this from a physics standpoint, where there is a whole literature on “counting rare particles” and what error bars you get as you move towards zero events.
I was just trying to make intuitive a feature of the mathematics of Bernoulli trials (which seems to be what everyone tries to use for polling). It had better just “fall out” of the mathematics people use, or they are making unwarranted assumptions.
If, instead of K and B voters, we have red and black balls (molecules) diffusing out of the polling place, and we sample the two (independent) diffusing gases, then if there is a 50-50 gas mixture, the uncertainty is “root-N”. But as the red gas starts to get rare, if your statistic is a *difference* between red gas counts and black gas counts, you have two root-N’s, with the smaller one starting to dominate. You might precisely know the number of black balls (because they form almost all the sample), so *their* root-N is small, but the root-N for the red balls is what dominates. So “variance” has a bathtub shape that goes up for p=0 or p=1, because at that point the errors are dominated by N for *the decreasing* population.
This is all well-known. Any book on elementary probability theory will have it somewhere. There are several PDFs of such online.
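A rough sketch of the "root-N" point, assuming a plain binomial model (n respondents, each a red ball with probability p); the helper names are made up for illustration. It separates the absolute spread of the red count from its spread relative to the expected count, since the two behave differently as p shrinks.

```python
import math


def count_sd(n: int, p: float) -> float:
    """Standard deviation of the number of red balls among n draws."""
    return math.sqrt(n * p * (1.0 - p))


def relative_sd(n: int, p: float) -> float:
    """Same spread expressed as a fraction of the expected red count n*p."""
    return math.sqrt((1.0 - p) / (n * p))


if __name__ == "__main__":
    n = 100
    for p in (0.5, 0.2, 0.1, 0.05, 0.01):
        print(f"p={p:<5} sd of count={count_sd(n, p):5.2f} "
              f"relative sd={relative_sd(n, p):5.2f}")
```

In this model the absolute spread is largest at 50-50, while the relative spread of the minority count climbs steeply as p approaches zero. That climb is one way to read the bathtub picture: the fewer red balls you expect, the less each observed count tells you about their true share.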
Rick Brady says:

June 8, 2005 at 1:57 pm

Thanks John. Good stuff. As you know I’m pretty new to this. I figured it was “well known” to someone, but simply had to find it.
Someone has been working on this problem and has modeled the effect as it relates to the exit polls. I’m not sure if he wants to go public with his findings yet, but when I saw the charts it made my eyes bug out because, well, it was not “well known” to me and I certainly didn’t expect it.
Time to start exploring the “other” literature as there doesn’t seem to be anything in the survey methodology stacks (that is immediately recognizable at least).
davidgmills says:

June 8, 2005 at 2:09 pm

Rick:
Since when was cross-exam fair? It is almost never fair and often full of cheap shots. Cross-exam simply points out the things of which other people would be suspicious.
There are at least three good examples in Febble’s case. First of all, her conversion. MP, and she, and probably others like yourself, touted her conversion from sinner (fraudster) to saint (possible believer) as proof that she should be believed. Bad mistake. People are highly suspicious of publicised conversions, even if genuinely felt and believed by the converted person.
Secondly, taking Mitofsky’s money. People are highly suspicious of taking money even when it is up front and disclosed.
Third, being on Mitofsky’s payroll and not telling anyone for a time and leaving the impression for a time she had no financial interest in the matter. People are suspicious of that as well.
Nobody is beyond suspicion. But it is best to keep suspicion to a minimum if you want to retain your credibility. Unfortunately, she fell into several traps, perhaps naively and perhaps even innocently as often happens to newbies, but fell nonetheless.
RE: Tort Reform.
You know the interesting thing about tort reform is this. Just about everyone wants it until they feel like someone has done them wrong. Then all of a sudden they are highly in favor of all the tort law in their favor they can get.
Imagine my consternation at representing someone who spent his whole life trying to reform the tort system and then gets seriously wronged and injured. All the other people who brought lawsuits are frauds, but not him; his cause is just. Seen it way too many times to count. Makes me cynical as hell about the motives behind tort reform.
Then there is the double standard between individuals and corporations. Individuals who go to court claiming they have been wronged should have caps put on their recoveries. But not corporations — no caps there. Corporations love to be able to sue when they have been wronged, and believe me, they don’t want limits on what they can recover.
John Goodwin says:

June 8, 2005 at 2:29 pm

Rick – had time to look up a ref. for you.
A great site for the math of what we’re trying to do here is
http://www.dartmouth.edu/~chance/
They have http://www.dartmouth.edu/~chance/teaching_aids/articles.html
lots of good stuff including
sample programs and a
http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/book.html
Chapter 9 discusses probability theory as applied to polling, esp Figure 9.5 “Polling Simulation”. Of course, polls are given as an “example” and “exercises” throughout the book–some exercise for the reader, hunh? Discussing the ins and outs of chapter 4 is where most of the participants on DU are stuck.
Rick Brady says:

June 8, 2005 at 3:02 pm

Fantastic John!
I’m blown away by all of this really. How could E-M (or anyone else studying exit polls for more than 6 months) have assumed that taking the sum of the signed within precinct error would leave “bias” to be explained?
John Goodwin says:

June 8, 2005 at 3:26 pm

Well, Rick, the liberals are interested in calculating the conditional probability that Kerry really won the election, given that Bush did in fact win it. As you might guess, it comes down to a rather Clintonian construction of “really” and “won”. Outcome-based reasoning helps, especially in technical definitions of terms like “expected value”.
John Goodwin says:

June 8, 2005 at 5:23 pm

Anyway getting back to “layman’s math”: the root intuition of the fraudsters is that the polls and vote totals don’t jibe, precinct by precinct. If we add up the number of red balls and black balls in our sample, we get two “staircases” (and a third one that represents the total respondents and, lest we forget, another representing third party responses).
The sense that fraud has occurred has to be founded on looking at the slope of the staircase (actually, a line drawn from the origin to the top of the staircase) for the sample. We project that slope to the whole voting population, and look to see if the sample’s slope “hits the target”. Claims of fraud rest on a sense that it can’t possibly.
Now, the slope of the staircase is not constant (autocorrelation). The very worst autocorrelation (lag 1) is filtered out because interviewers are not supposed to take successive voters–like husband and wife walking out together. But *some* correlation exists on all scales–fractally for those into pop science. 🙂
In any event, one thing we could do with raw data is *see* the jitter in the slope. This jitter, which is lost by just telling us the aggregate count for the precinct, would give us an empirical “cone” around the ascending staircase, a sort of independent measure of “on target”.
Obviously, if we sample too small a fraction of the precinct, our cone-of-reasonableness will be quite large, both due to the inherent uncertainty of a short ascent and due to the large “leverage” of extrapolating into the distance.
But there’s another problem–all slopes are not equal. Slopes are really angles, and slopes near zero (minority in partisan precinct) “hit the wall” at a different angle from the 45 degree line (100%–the other guy). If we assign each slope a “delta theta” as a fair way of putting a cone around the slope, in flight to its target, we see the flat staircases are aiming for a target that is much “smaller”, whereas the lines with more slope are aiming for a “glancing blow”, and have a much larger landing zone–hence less sensitivity to error (size of the projectile cone).
In addition, the staircase (a random walk) is jittery, with “clusters”, “secular trends”, “hot spots”–it looks more like the stock market than a handicapped ramp.
So, naive intuitions like “that bumblebee can’t fly its way around a normal distribution” aside, the real math question is what is the staircase like–did it get some false slopes up front?–is it feasible?–what is the short scale and long scale autocorrelation like?
John Goodwin says:

June 8, 2005 at 6:37 pm

Anyway, attempting to *contribute* to a reasonable criterion for possible fraud: We have two random walks (staircases) that go from 0 to the announced vote total for each candidate (rather: two candidates and their instantaneous sum, the Total. Pick any two degrees of freedom.) The two walks are “correlated” in the microscopic scale, since if the next vote is for one, it is not for the other, so subtract the sequences.
Now we have the difference staircase that corresponds to “X’s lead over Y”. The gut intuition is that if this lead is yea-far in one direction after sample N.sample.size, that it is improbable to “return to the total”. Now, let’s subtract out the linear sequence (constant slope), and work with the “excess lead over a linear approach to the final total” (difference staircase vs. straight line).
I make a weak assumption: If the “vote lead sequence” ever re-crosses the linear line, the liberals shouldn’t be surprised, since if it gets back on track once it could without undue improbableness choose to stay there from then on. 🙂 Thus, we have the problem: given a random walk, which after N.sample.size steps is yea-far out of kilter from a straight line, what is the probability of a zero-crossing *no sooner than* N.vote.total? That is, what is the probability of a *first* zero-crossing at N.vote.total?
So…. if the overall vote sequence is ergodic (surely the sample can’t be–but let’s ignore that small detail!) we can run it *backwards* from the zero-crossing, and ask, for a vote sequence starting at bang on the announced total, what are the odds of ending up yea-far off way back at N.vote.total-N.sample.size.
THAT, is the problem the liberals *should* be solving. 🙂 The properties of the random walk are material and affect the calculation, but crude estimates and defense of simplifying assumptions used would be a good starting place.
Naive slope arguments with data so bumpy aren’t going to convince anyone who is capable of not being convinced.
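A crude sketch of the "excess lead" yardstick described above, under a strong assumption that is not established in this thread: the ballots arrive in a uniformly random order and the sample is simply the first n of them. It does not solve the first zero-crossing problem; it only estimates how far the lead can drift from the straight line after n ballots by chance alone, and compares a small Monte Carlo against the finite-population formula for that spread. All names and the example precinct are hypothetical.

```python
import math
import random


def excess_lead_after(ballots, n):
    """Lead (K minus B) among the first n ballots, minus the lead a
    straight-line approach to the final total would show at step n.
    Ballots are coded +1 for K and -1 for B."""
    final_lead = sum(ballots)
    return sum(ballots[:n]) - n * final_lead / len(ballots)


def simulate_excess_sd(k_votes, b_votes, n, trials=2000, seed=0):
    """Monte Carlo spread of the excess lead at step n under random ordering."""
    rng = random.Random(seed)
    ballots = [1] * k_votes + [-1] * b_votes
    draws = []
    for _ in range(trials):
        rng.shuffle(ballots)
        draws.append(excess_lead_after(ballots, n))
    mean = sum(draws) / trials
    return math.sqrt(sum((d - mean) ** 2 for d in draws) / (trials - 1))


def theoretical_excess_sd(k_votes, b_votes, n):
    """Closed-form spread: sampling without replacement, so the usual
    binomial variance is shrunk by the factor (N - n) / (N - 1)."""
    total = k_votes + b_votes
    p = k_votes / total
    return math.sqrt(4 * n * p * (1 - p) * (total - n) / (total - 1))


if __name__ == "__main__":
    # Hypothetical precinct: 400 K votes, 600 B votes, 100 sampled.
    print(simulate_excess_sd(400, 600, 100))
    print(theoretical_excess_sd(400, 600, 100))
```

If real ballot sequences are autocorrelated, as suggested earlier in the thread, the true spread would differ from this random-order baseline, which is the reason the raw sequences matter.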
Rick Brady says:

June 8, 2005 at 6:48 pm

John, assuming you had the “raw” data, what would you look for? What test(s) would you run? What is your dependent measure? Forgive me if you answered these questions above, but every time I think I’m following you, I realize that I’m not.
Assume that you are talking to someone who: 1) has the data; and 2) has a very solid grasp of statistics. That means, you aren’t talking to me, I just want to eavesdrop 😉
Kizzle says:

June 8, 2005 at 7:05 pm

Two quick questions:
1) Assuming a completely hypothetical election and all things being equal, what scale of fraud *can* exit polls detect? In Ukraine, the initial exit polls were almost a 20% lead for Yushchenko but Yanukovich ended up winning (fraudulently). It seems like in a clear landslide of the exit polls against a contrary vote count, they can, in fact, imply fraud. But in smaller cases, it is as MP calls it, “too blunt” of an instrument. Assuming I knew for a fact (let’s say I did it 😉 ) that I had switched 500,000 votes in Florida from Kerry to Bush *from the central tabulation center*, not from individual precincts, do you believe that it would be reflected in the exit polls to a clear extent?
2) Assuming fraud occurred at the tabulation center instead of individual precincts or counties, how would this affect the ability of exit poll WPE to determine/imply fraud?
Rick Brady says:

June 8, 2005 at 7:22 pm

Kizzle, in response:
1) The exit polls didn’t prove fraud in Ukraine. They were an indication of fraud, but the physical evidence was enough to convict. However, I think Dr. Fritz Scheuren of NORC (and ESI) suggested that an exit poll can be *designed* to detect fraud, but the US media exit polls are terrible post-hoc measures. Given recent work by febble (which continues, in part in this thread with the help of others), it should be easier to detect “suspect” individual precincts or a “suspect” pattern (or signature) across precincts.
2) What do you mean by “tabulation center”? Aren’t votes tabulated per County?
There are two types of exit poll error: a) unrepresentative sample of precincts; and b) unrepresentative sample of exiting voters within precincts.
In 2000, most of the problems seemed to have been with the precinct sample, most notably in Florida. I read somewhere that if they had used a different base year for turnout, they would never have called Florida for Gore.
In 2004, the error was determined to be within precincts, as the vote count of sampled precincts closely approximated the overall vote count (slight Bush bias). However, I’m not sure if this means that they checked all the states for precinct sample error (i.e., I don’t know if some states were more “wrong” than others, but in aggregate they gave a slight Bush bias).
Regardless, the error within precincts was tagged as the problem in 2004, so it’s what we are analyzing (right now anyway).
If the tabulators knew which precincts were being polled, I guess they could divvy up the votes in such a way that it wouldn’t show up in an analysis of WPE, but I’m not sure how they would do it so that it didn’t show up in the statewide estimates.
John Goodwin says:

June 8, 2005 at 7:44 pm

In summary argumentation (since David has put us all on trial), I think it might be easier to do what we often do in Mathematics, which is solve an *inverse* problem first. My inverse problem (which, in the grand style of Hilbert, I propose to statisticians….) is the *non-sampling* problem. Suppose I have an election, but remove from the voting all those persons included in the sample, possibly by choosing them randomly, but perhaps choosing them with some small systematic bias.
What should the vote count sequence I have defined (net lead minus linear slope line) look like? How far is it allowed to wander to one side, for a given sample size and given bias? How often are its first zero crossings allowed to happen? With what probabilities (means and variance or closed form please)?
Supposing we have two right triangles back to back (O = origin, A = N.sample.size, C=Votes.in.sample (vertex above A), D = N.votes.total).
Thus, triangle OAC, with right angle at A, represents the sample, and DAC, with right angle at A, represents the long-baseline non-sample: given this, how high above the line OAD is C allowed to be?
The non-sampling problem is more fun, since it has a long baseline (DA), and a small angle (CDA) “looking back”. Its only peculiarity is the sample has been removed from it, so height AC is due to a random component plus a bias contribution. On the null hypothesis, there is no bias. So calculate the height AC and tell me how probable each such height is.
Thus far, investigators have used a “small sample” assumption–they have treated OA as zero relative to DA, and AC as essentially the shadow of AC on a vertical line through the origin O. But I bet small precincts had a greater percentage sampled than large ones–so you will have to fold the “real” sampled percentages into any aggregate answer (not just one precinct). But that will work if you can give me a closed form with bias and sampled percentage as parameters.
The advantage of working on the “non-sampled” sequence is that if the removed sample is random and unbiased (null hypothesis), you should be able to assume an ergodic sequence and get the answer in closed form. Then you can get a sampling distribution for that answer and something like a p-score. The hard part is to extend that result to cover the non-ergodic case [biased removal of points], and parameterize your first answer. If you *do* then you win the sweepstakes–you have, in closed form, an expression for how far off a biased poll can be before it is improbable that the election happened that way. If you are a paranoid sort, you should believe the CIA already has done the math (since OR defined this problem back in the OSS days…. how to deceive the enemy on the margins of statistical credibility. Probably, they just asked Von Neumann to run the Monte Carlo. 🙂 )
Rick: there are different kinds of random walks (time series autocorrelation for lag N is one diagnostic). You need exploratory data analysis of the raw data before you are justified in saying “such and such a random walk” has such and such a distance with such and such a probability. If you don’t see the raw data, there is no reasonable way to detect fraud. Sorry–white noise is not brown noise is not black noise. Drop “shot noise” and “1/f noise” and “brownian motion” into Google. Liberals can’t know what they say they know unless they’ve seen the data and they don’t know the properties of the sequence–that’s how the world works. Is the stock market too high today or about right? How much of a “break” does it have to have before we *know* the Dow is up?
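A small sketch of the lag-N autocorrelation diagnostic mentioned above, using only the standard library. The two test sequences are synthetic stand-ins (independent +1/-1 steps for "white" noise, and their running sum for a "brown" random walk), not exit poll data, and the function names are illustrative.

```python
import itertools
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of a sequence at the given lag.
    Assumes the sequence is not constant (non-zero variance)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


def white_and_brown(n, seed=0):
    """Return (independent +1/-1 steps, their cumulative sum)."""
    rng = random.Random(seed)
    steps = [rng.choice((-1, 1)) for _ in range(n)]
    walk = list(itertools.accumulate(steps))
    return steps, walk


if __name__ == "__main__":
    steps, walk = white_and_brown(2000, seed=1)
    for lag in (1, 5, 20):
        print(lag, round(autocorrelation(steps, lag), 3),
              round(autocorrelation(walk, lag), 3))
```

Independent steps show autocorrelation near zero at every lag, while the running sum stays strongly correlated with its own recent past. Running the same function on a real ballot-order sequence, if one were ever available, would show which of these it resembles more.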
Kizzle says:

June 8, 2005 at 7:47 pm

Well, what happens to the votes after they are counted by county? Where do they get fed into from there and who is responsible for that step?
And I was very careful to use “imply” rather than “prove” in my Ukraine example 🙂
John Goodwin says:

June 8, 2005 at 7:55 pm

Kizzle: fraud prevention is easy: revise the constitution to give each county an electoral vote proportional to its population in the state, and use a two-level electoral college. Since all the votes in the county will be worthless if the county cheats, everyone watches local elections like a hawk.
The electoral college is a great design–any second grader can do the addition, and anything the states certify without challenge is guaranteed legal. In order to subvert American democracy, you have to subvert some significant number of 50 independent Republics. If you manage that, no paper constitution [or computer fraud detection system for that matter] will help you anyway.
Unfortunately, in 1865 1/2 the independent Republics were in fact so subverted by armed intervention. Since then, the system has stopped functioning. But it was a great design. Hey, hey, ho, ho Rutherford B. Hayes has got to go.
Rick Brady says:

June 8, 2005 at 8:12 pm

John, I agree that without the raw data there is no way to reasonably detect fraud.
Question about something you wrote: “The hard part is to extend that result to cover the non-ergodic case [biased removal of points], and parameterize your first answer. If you *do* then you win the sweepstakes–you have, in closed form, an expression for how far off a biased poll can be before it is improbable that the election happened that way.”
I keep mixing up my units of analysis. When you say “how far off a biased poll can be” are you referring to the poll within the precinct (as if each precinct is an independent poll)?
Kizzle says:

June 8, 2005 at 8:13 pm

I like the idea John 🙂
I can just imagine it now… swing counties… it will be a nightmare for politicians to campaign 🙂 Only thing I disagree with, is using an electoral system, it only takes changing 538 votes to subvert an election, not a majority, minority, or a couple of states.
Rick Brady says:

June 8, 2005 at 8:14 pm

Kizzle, not sure… I’m not even clear where E-M got their vote counts for the WPE calculation.
John Goodwin says:

June 8, 2005 at 8:45 pm

Rick — I was thinking within precinct. After you know the distribution for one precinct, you probably want to sum them and get the aggregate–convolving over d(S), distribution function for percentage of precinct sampled in the actual sample. 🙂
BTW, speaking of wall street, economists already know a lot about ergodic sequences, with bias and volatility (“portfolio beta”).
here’s a random link:
http://scholar.lib.vt.edu/theses/available/etd-5398-184344/unrestricted/etd.pdf
(top search result for “beta volatility homoscedastic”) — last term thrown in to eliminate popular articles in newspapers and magazines.
You are right about electors, but they are only obeying state laws. The constitution can’t help it if states make state laws.
In 1848, there was nothing wrong with a state deciding to join Mr. Marx and establish a dictatorship of the proletariat if they liked. If the Party picked the electors and liquidated the faithless ones and certified what they liked, that was hunky-dory. Just as it should be. Robust design requires partitioning the problem.
John Goodwin says:

June 8, 2005 at 9:17 pm

For those who are intrigued by the “physics” of voting:
Wikipedia on the “colors” of noise:
http://en.wikipedia.org/wiki/Colors_of_noise#.22Less_official.22
Does anyone know what “color” exit polls are?
Does it vary from precinct to precinct, or state to state? Do we need a new color in the spectrum, “liberal”?
Anyone want to hazard a guess (to the nearest quarter integer, say) as to what “alpha and beta” for voting sequences (net leads), as polled, or as counted? Yes, I know we now have *two* definitions of alpha.
Should we name the fraud free cone around our random walk the “cone of silence”? 🙂
Kizzle says:

June 8, 2005 at 9:34 pm

Interesting, if anyone knows or can point me to information on how the votes are counted after county tabulation, I’d be much obliged. Ideally, I’d like to know every step between my showing up to the voter booth to my vote being displayed on CNN (or FOX if that’s your thing).
John Goodwin says:

June 8, 2005 at 9:40 pm

I don’t get cable so I wouldn’t know. 🙂 But that raises an interesting idea: if we know the “count” order of the sampled precincts, we would have an interesting time series base for plotting the “bias”. Everyone knows that urban precincts “come in last”. Of course, bias in the *poll* by response shouldn’t correlate with the order the precinct was tabulated in the election–or could it?
Kizzle says:

June 8, 2005 at 9:41 pm

Sorry, one more question because I think it got lost in the fray earlier, and sorry to add lay-man’s fluff to the hard-core statistics discussion that preceded this 🙂
1) Assuming all things being equal, what would fraud look like in a hypothetical election as reflected by the exit polls, and what would be the estimated minimum threshold of votes switched?
2) Rick, you mentioned before that exit polls *can* be designed to look for fraud… how would that change the current system we’re using, and would it change only ex post facto analysis methods, or the polling methodology itself, or both?
Rick Brady says:

June 8, 2005 at 9:45 pm

John, I think the distribution varies for each precinct depending on sample size and partisanship. Some precinct distributions are more normal than others. I’m struggling to conceptualize a way to distinguish between bias and what can reasonably occur due to chance alone. My only thought at this point is to apply individual error bars on each precinct alpha. That may help distinguish the “true” outliers, but it doesn’t explain the relationship between outliers (what color is the noise?). You’ve probably supplied enough information in your comments here for the truly smart people to figure out what should be done.
Look – let’s make this simple. If you were going to build a multiple regression model to explain the variance in within precinct error, what would you use as the dependent measure? It seems that ln(alpha), arctan(alpha), and WPE are confounded, but I’m not sure by how much.
John Goodwin says:

June 8, 2005 at 9:52 pm

Rick you’ve invented a new field — GDA (Gedanken Data Analysis) = Exploratory Data Analysis of data we wish we had. I don’t know. Surely how to transform the dependent variable is the least of our worries. Plot them all.
If you know a bit about electrical circuits, imagine our “vote counter” ramping up its voltage (one counter for each candidate). Suppose we have a comparator that compares the “real” voters walking in the door with the theoretical ramp, which says how many *should* have appeared by voter N, if the counter is unbiased. The real counts lag or lead this ramp, and the difference between the two candidates, minus the slope difference, should just be “random noise” of some kind. The question is what are its statistical properties.
Naive slope arguments say “hey, those integrators are ramping too fast to hit the final target”. But that sort of argument doesn’t take into account the noise spectrum.
John Goodwin says:

June 8, 2005 at 10:34 pm

Kizzle – since we’ve been talking signal processing and target recognition, suppose Darth Vader has two detection systems on board the Death Star. One is for detecting intruders, and the other is the targeting system he uses to shoot people. He wants those systems to work differently, since when he presses the fire button he wants to make damn sure he’s hitting Luke’s ship and not Palpatine’s, but he’s happy to be sending his minions investigating those false intruder alarms all day.
Using exit polls for intruder detection is different from using them to obtain court convictions. In other words, there’s no good answer without a threat model and risk analysis. Also, you didn’t tell me if I get to see all the raw data. The models get a lot better and the thresholds tighter if I do. 🙂 Personally, I like the legal notion of a prima facie case and I think we’ve met the burden of proof long ago to go in with subpoenas blazing, but hey that’s politics. For some strange reason the politicos don’t like raking up that mud. Wonder why.
Anyway the fraudsters underestimate the power of the Dark Side. It knows all about target recognition systems and signal processing and the math we’re doing here is the same. Criminals get caught because they’re stupid. The smart ones grow up to be governments.
Rick Brady says:

June 8, 2005 at 10:50 pm

John, there are very smart people who do have the raw data. I have a hunch that if we could improve the dependent measure [arctan(alpha) seems to be a smidge better than ln(alpha)], then we could get the plots.
You said: “Surely how to transform the dependent variable is the least of our worries. Plot them all.” I’m not quite worried about plotting them (although plots are nice); I (and others) want to explain the variance in them. (I assume by “them” you mean the various transforms?)
I wish Lizzie or Mark L. would jump in here and ask all the right questions that I am not. I’m just poking around.
John Goodwin says:

June 8, 2005 at 11:10 pm

Not so gedanken after all then. Arctan converts to “polar coordinates” which is what this problem wants for starters. As I recall, if you look at the lagged autocorrelations, there’s a formula for the “optimal” variance stabilizing transform for some family of transforms–hazy memory from 20 years ago. For sure get the stats people on this. Box, Hunter, and Cox are the guys you want–“Experimental Design” and their book on Time Series Analysis. But there is so much in recent history on these subjects.
Some of the key problems are around the area of “censorship”–if you don’t sample parts of your alleged sample space, what happens. People who make up mortality tables seem to care, and there is a whole field. I’m surprised all this (econometrics, random walks, survivor analysis, censorship issues, markov processes, signal processing…) hasn’t cross-fertilized with Pollsters yet. I wouldn’t think poll prediction would be all that different from designing signal filters with feedback (add in a sample of the election results) and feed forward as part of your tool kit. Optimal cheating strategies being the converse problem.
Rick Brady says:

June 9, 2005 at 12:53 am

John, I’m not saying that all of this hasn’t cross-fertilized with pollsters yet; only that it hasn’t with exit pollsters. In fact, the literature is fairly sparse on exit polls and the statistical analysis is even sparser. There are several regression studies of WPE and one other from the 1960s that used Mean Square Error as the dependent, but these were conference papers and I haven’t convinced the authors to dig deep enough for them to send them to me. Wouldn’t matter though, because I know these papers didn’t look at what we’re talking about.
I tried to get the Box, Hunter, and Cox text back when you recommended it before. Our library has 3 copies. All three were missing. I reported them missing and they apparently found them about a month ago. I’ll check it out tomorrow morning.
Thanks for the tips. I’ll make sure the stats people read them 😉
Rick Brady says:

June 9, 2005 at 12:57 am

Scratch that… it was the Box, Hunter and Hunter text you recommended before, was it not?
Rick Brady says:

June 9, 2005 at 1:01 am

I found “Time series analysis : forecasting and control / George E.P. Box, Gwilym M. Jenkins, Gregory C. Reinsel” But nothing by Box, Hunter and Cox.
Febble says:

June 9, 2005 at 6:04 am

Hi, John and Rick.
I’m a statistician, not a mathematician (hey, I’m not really a statistician, I just use statistics). I understand a bit about autocorrelations and random walks, as I deal with experimental designs in which these are either potentially interesting, or, more often, simply a nuisance.
As far as the exit poll data is concerned, it is noise, and I am not sure the order of respondents is actually recorded and known. I think sheets were dropped in a box – were the sheets numbered?
What a statistician wants is a reasonably well-behaved dependent variable to regress on the predictors of interest. Unfortunately one of these is the vote-count itself, and several others may be correlated with it. So we want the cleanest variable we can.
Arctan(alpha) would seem to work reasonably well, and for technical reasons, arctan(alpha)-arctan(1) works even better as it gives a variable where 0=no bias, and bias is what we are interested in.
Critically, what we are interested in is what might have made polling Kerry voters more likely than polling Bush voters (or vice versa), for a given proportion of each lot’s votes. The WPE doesn’t hack it. Arctan(alpha) isn’t bad, but it still leaves us with your “bathtub” (I think of it more like a saggy tent, with not-quite vertical tentpoles slanted sideways in opposite directions). And what the saggy tent produces is greater variance in arctan(alpha) at the extremes; greater variance where N is small; and greater skew where N is small. So for a given N, the means will form a kind of ogive, more or less flat in the middle but diving up or down at the extremes – and result in a possibly spurious linear correlation that will be more problematic at low Ns.
The larger the N, it seems to me, the flatter (and more extensive) the flat bit will be, but the function will always take a leap (or a plummet) at the extremes.
This, from a very practical, visually oriented, statistician who just needs to get the data into a form where the General Linear Model can be used to test a hypothesis, as it is one of the most powerful statistical tools we have.
So what we need is some kind of transform of alpha that allows for the skew, which ought to be predictable given the N and the vote proportion. But I can’t figure one out.
Lizzie
Rick Brady says:

June 9, 2005 at 8:43 am

Whew. The cavalry arrives. Thanks Lizzie. Uh…John…what she said.
John Goodwin says:

June 9, 2005 at 9:12 am

Rick – it was Box, Hunter, and Hunter for the one book. They mention Box-Cox transformations in that book. Box and Cox wrote an article together. Box-Jenkins was indeed the other (time series) book I was thinking of. That was a list of authors who are known for their work on transformations and also on time series, not authors of one book. BHH is by far the most readable for a workaday scientist. The Jenkins book is interesting but less so. 🙂 I’m sure the economists have taken that stuff since the 70s and run with it and there are probably some nice expositions out there.
BTW, most of my comments above were about modelling, not EDA, since I didn’t think you could get the data. So the fact that no one knows the order is not relevant. Besides, if it’s random, getting mixed up in the box shouldn’t matter. 🙂
If you have to prove the worth of your analysis before the analysis will be done, you’ll have to do a simulation anyway, to justify it.
I was trying to intuitively justify the importance of ergodicity (which I’m not really sure of…). The basic idea is that if you *did* know the time series [and what you know is the order the central station adds in precincts to the sampled+unsampled time series, not down to the individual level like physicists would like ;)].
Ergodicity means that if you “play the song backwards” it sounds like noise–the same color noise in fact. The intuition is that pathologies–biases, fraud, what not–show up as non-ergodicity, which puts them on an analytical footing. It makes precise the somewhat vague notion “this really should be more random than it is”. The idea is to run simulations [time order known], assume ergodicity, calculate statistics of interest, then compare those statistics to real world ones.
In other words my comments are a hint to mathematicians. I think you really could treat this, analytically, as a bunch of transition probabilities [branching ratios for non-response, inclusion in sample, etc.]. This might give you a firmer grasp on how your tuning parameter (your alpha among them) fits in a priori. Your arguments so far–and I haven’t seen a revised paper–are based on conditional probability formulas at heart. That’s great but really it is probably invalidated by assumptions of independence. Voters arrive in clumps and the bigger the clump the lower the instantaneous sample rate right? Urban areas are clumpier than rural because of commuting, right?
As to your problems at the ends I really don’t know. Theory says there are singularities at the end points [like taking a ratio with no data in the denominator, natch], but that you shouldn’t be seeing them so soon. Central Limit turned out to be not so central after all. What to say? Theory breaks down faster with clumpy non-random weirdly correlated clusters of non-ideal respondents, when you don’t happen to have enough of them?
John Goodwin says:

June 9, 2005 at 9:26 am

“So what we need is some kind of transform of alpha that allows for the skew, which ought to be predictable given the N and the vote proportion. But I can’t figure one out.”
BTW, how small are your “Ns”? Some people get better fits if they add 1/2 to the Ns. I don’t remember why, but I’ve seen that several places (one of them being ch. 9 in the reference above and the other a discussion of chi squared in modelling where the responses are rounded to integers). What happens if you bump up 60 K votes to 60.5 everywhere and so on? 🙂
Also, if you have any precincts where someone got zero, there’s an endpoint correction as well.
Febble says:

June 9, 2005 at 9:36 am

Gee, John, I wish I was smart enough to get half of that, but let’s see if I get a quarter.
I think one thing you may be saying is that if it is random it doesn’t matter. Quite. That’s how we deal with it in my field (psychophysics). We present randomised stimuli, and we know that there will be autocorrelations in the data especially when people are performing at chance, and may alternate response strategies when they don’t know the answer. Usually we just present enough trials that we collapse across whole blocks of trials and use the aggregates. Sometimes we use signal detection theory to calculate D’ (“D prime”), which helps a bit, and sometimes we use a function that defines response bias (“c” or “beta”). I have been trying to figure out how this would apply here, but to the extent it does, I don’t think it is adequate.
What you say about clumping is interesting. The plot seems to have more random noise than you would expect from sampling error if sampling error was defined on the assumption that no clustering occurred. If voters were clumped, and sampling rate varied with some other variable (weather?) or even the same variable (rate of exit of voters from the precinct?) this might explain the noise. But the noise doesn’t matter. What we want is the signal. And being a bear of very little brain, my gut feeling was – if there’s only one Bush voter, you get him or you don’t. If you don’t, alpha goes to infinity. If you do, alpha is pretty low, as the denominator will be 100%. But say there are five Bush voters and 200 Kerry voters. You may have as much chance of getting too many as too few – but alpha will be less responsive to too few (unless it disappears completely) than too many – so the sampling distribution will be skewed in the direction of low alphas, especially as your infinite alphas will simply disappear from the dataset. And of course the inverse is true at the other end of the scale. My gut sense is that for sample sizes of around 80, as in the exit polls, arctan(alpha) will be pretty flat, all other things being equal, apart from the very extremes, which actually are only present at the Kerry end of the plot. But where N drops, the skew will start to affect data points increasingly distant from the ends. So, again, all other things being equal, the linear slope of arctan(alpha) against vote-count margin will be a function of N. Not that a linear slope is the correct fit, but the GLM is our best tool, and I want to know what is likely to mislead it.
I still think that arctan(alpha) is the best we’ve got so far, but it would be nice to get it better, if necessary by incorporating N. But as I say, I can’t see how.
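[Editor's sketch of the skew described above. It rests on assumptions that are not spelled out in the thread: alpha is taken to be the Kerry:Bush odds among sampled respondents divided by the Kerry:Bush odds in the precinct's vote, respondents are drawn binomially with no true bias, and a sample with no Bush respondents is kept at arctan(alpha) = pi/2 rather than dropped from the dataset. The function name is illustrative only.]

```python
import math

def mean_arctan_alpha_offset(n, p):
    """Exact mean of arctan(alpha) - arctan(1) for an unbiased sample.

    n: number of respondents; p: Kerry's true vote share, 0 < p < 1.
    Assumes alpha = (x / (n - x)) / (p / (1 - p)), x = sampled Kerry voters.
    """
    total = 0.0
    for x in range(n + 1):
        prob = math.comb(n, x) * p**x * (1 - p) ** (n - x)
        # atan2 returns pi/2 when n - x == 0 (alpha infinite), 0 when x == 0.
        angle = math.atan2(x * (1 - p), (n - x) * p)
        total += prob * (angle - math.pi / 4)
    return total

for n in (20, 80):
    for p in (0.5, 0.8, 0.95):
        print(n, p, round(mean_arctan_alpha_offset(n, p), 3))
```

Under these assumptions the mean offset is zero at p = 0.5 by symmetry, and it moves away from zero in lopsided precincts even though no bias was put in. The drift is larger for the smaller N. Dropping the undefined alphas instead of keeping them would change the numbers, so treat this as an illustration of the artifact, not a model of the actual dataset.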
Rick Brady says:

June 9, 2005 at 11:11 am

Re: “clumping”
Aren’t we assuming a perfectly drawn sample at the established interval? Also, aren’t we assuming that non-response and misses are random? If so, then clumping shouldn’t matter because every “kth” was approached.
Re: “My gut sense is that for sample sizes of around 80, as in the exit polls, arctan(alpha) will be pretty flat…”
Mark L. modeled N=20, 30, and 40 at partisanship (P)=90% and the alpha (ln or arctan) observed didn’t get all that much closer to expected. Perhaps when he finishes his grading he can try a number of other scenarios until he finds the point where the two converge. Also, from what I gather from the sims, the two depart immediately beyond P=+/-50% at N=20. I’d really like to see the parameters (when it comes into play and how severely for each N and P).
John Goodwin says:

June 9, 2005 at 12:52 pm

While I’m reading over your post a friend sent me this: –
A wicked king amuses himself by putting 3 prisoners to a test. He takes 3 hats from a box containing 5 hats – 3 red hats and 2 white hats. He puts one hat on each prisoner, leaving the remaining hats in the box. He informs the men of the total number of hats of each color, then says, “I want you men to try to determine the color of the hat on your own head. The first man who does so correctly and can explain his reasoning, will immediately be set free. But if any of you answers incorrectly, you will be executed on the spot.”
The first man looks at the other two and says “ I don’t know”
The second man looks at the hats on the first and third man and finally says “ I don’t know the color of my hat, either.”
The third man is at something of a disadvantage. He is blind. But he is also clever. He thinks for a few seconds and then announces, correctly, the color of his hat.
What color hat is the blind man wearing? How did he know?
My take: Let A and B be not blind and C blind:- (solution left as exercise) —
“Actually, C has to assume that B told the truth. Since there is no penalty for not knowing C took a terrible risk, since B, not confident of his mental abilities, might have preferred not to answer at all, instead of risking death for freedom. Luckily, C’s bet paid off for him, since we are told he happened to answer correctly (but not using valid reasoning). So the blind man was not only clever, but too clever by half.”
It’s the real world assumptions that get you every time.
John Goodwin says:

June 9, 2005 at 12:53 pm

Anyway the moral is: don’t assume I know what I’m talking about. 🙂
John Goodwin says:

June 9, 2005 at 3:06 pm

As far as clumping, censorship, and non-response is concerned, think of a simple queuing model. There are arrival rates lambda-1 and lambda-2, which vary over the course of the day, and obviously relate to total number of voters and partisan factor p.
Suppose the interviewer can walk people through the process at rate mu. Now, if the traffic intensity gets too high, no one tolerates a “line” waiting to do a survey (or differential non-response means service queue K and queue B have different max lengths). But I bet length 0 is realistic–if the guy is busy I just move on. Demographics matter–I bet not only Dem vs. Repub but also the other shades of classification in the cluster analysis discussed here. Lots of children? Forget it. Marginally disaffected? Maybe a plus (I want to vent); maybe a minus (I’m passive aggressive about it–that’ll show them.)
But in terms of behavior, this model doesn’t treat the case where traffic intensity is high but events for the minority party are rare. Think of a Geiger counter that “goes dead” for some time after it gets an event. If you are swamped with lots of Bottom quark events (B) and you really need Kaons (K) because they are rare in these parts, the high background of B-events makes your detector dead, and stops you from seeing even the few K’s you’re looking for. It drives the variance of K’s up more than it helps pin down the rate of B’s. Basically, high backgrounds of things you already know well really suck.
Now imagine what happens if particles arrive in “bursts”. Self-censoring magnifies what is already a bad situation.
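[Editor's sketch of the "queue length 0" case above. The assumptions are mine, not the commenter's: voters of both parties leave as Poisson streams at rates lambda-1 and lambda-2, there is one interviewer who finishes interviews at rate mu, anyone who arrives while the interviewer is busy is simply missed, and everyone approached agrees. Under those assumptions the single-server Erlang loss formula gives the share of voters interviewed. Function and parameter names are illustrative.]

```python
def interviewed_fraction(lam_k, lam_b, mu):
    """Share of exiting voters interviewed when nobody waits for a busy interviewer.

    lam_k, lam_b: arrival rates of the two groups (voters per hour).
    mu: interviews the interviewer can complete per hour.
    Single-server loss system: accepted share = 1 / (1 + offered load).
    """
    offered_load = (lam_k + lam_b) / mu
    return 1.0 / (1.0 + offered_load)

def minority_interviews_per_hour(lam_k, lam_b, mu):
    """Expected interviews per hour with the group arriving at rate lam_k."""
    return lam_k * interviewed_fraction(lam_k, lam_b, mu)

# Same trickle of minority voters, light versus heavy majority traffic.
print(minority_interviews_per_hour(5, 20, 20))
print(minority_interviews_per_hour(5, 100, 20))
```

With the minority arrival rate held fixed, heavier majority traffic cuts the number of minority interviews, which is the dead-detector effect in its simplest form. The sketch ignores interval sampling, differential refusal and bursts.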
John Goodwin says:

June 9, 2005 at 5:20 pm

Febble and Rick — let’s make sure we’ve internalized the “+-1/2” correction: 0.5/sqrt(Npq) is 0.186 for N=80 and p=0.1.
That shifts the confidence interval *asymmetrically* and away from “0” by about 0.2 Z-units. In other words, the error bars on “expected N” are not symmetric. They have “more variance” upwards than downwards, even for N=80.
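[Editor's sketch: the half-unit correction quoted above, expressed in standard-deviation units of a binomial count. The function name is illustrative.]

```python
import math

def half_unit_shift_in_z(n, p):
    """Size of a +-1/2 count correction in units of sqrt(N*p*q)."""
    q = 1.0 - p
    return 0.5 / math.sqrt(n * p * q)

print(round(half_unit_shift_in_z(80, 0.1), 3))   # 0.186
print(round(half_unit_shift_in_z(80, 0.5), 3))   # smaller in an evenly split precinct
```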
Rick says:

June 10, 2005 at 12:01 pm

John: I’m still internalizing what I read in Box, Hunter and Hunter last night. If I read it right, it seems that different transforms might be best across the partisan range. Need to read that again…
Rick Brady says:

June 10, 2005 at 12:47 pm

“Now the proportion of dying (larvae) varies greatly from laboratory to laboratory. Indeed, it looks from the data as if p might easily vary from .05 to .95. If this were so, the variances npq could differ by a factor of 5. Fisher suggested that, before treating binomial proportions y/n=p^ by standard normal theory techniques using t statistics (and the analysis of variance and regression methods discussed later), a transformation should be made to a different “metric” x, given by sin x = SQRT(p^).
“The quantity x may simply be thought of as a score derived from p but having more convenient properties. [Technically, as it is used here, it is an angle measured in grads (100 grads=90degrees).]
…
“The purpose of this transformation is to stabilize the variance. Thus, whereas the variance of p is very different for different values of p, the variance of x is approximately constant. It will be seen from the graph (in text) that this stabilization is achieved by stretching out the scale at the ends of the range, at the expense of the center. Not only does the transformation achieve a standardized variance, but also it turns out that x is more nearly normally distributed over the whole range than is p^.” Pg. 133-134.
That sounds like a clue to me 😉
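[Editor's sketch of the variance-stabilizing effect described in the quoted passage. It enumerates the binomial distribution exactly for N=80 (a sample size mentioned earlier in the thread) and works in radians rather than the grads used in the book. The function name is illustrative, and 1/(4n) is the usual large-sample approximation for the transformed variance.]

```python
import math

def variances(n, p):
    """Exact variance of p_hat and of arcsin(sqrt(p_hat)) for Binomial(n, p)."""
    probs = [math.comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    raw = [x / n for x in range(n + 1)]
    angle = [math.asin(math.sqrt(v)) for v in raw]

    def var(values):
        mean = sum(w * v for w, v in zip(probs, values))
        return sum(w * (v - mean) ** 2 for w, v in zip(probs, values))

    return var(raw), var(angle)

n = 80
print("approx 1/(4n) =", 1 / (4 * n))
for p in (0.1, 0.3, 0.5):
    var_raw, var_angle = variances(n, p)
    print(p, var_raw, var_angle)
```

The raw variance pq/n changes by a factor of nearly three between p=0.1 and p=0.5, while the transformed variance stays close to 1/(4n) at both. The transform only addresses unequal variance. It is not a bias measure like alpha.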
Kizzle says:

June 10, 2005 at 1:56 pm

solution (don’t read if you want to figure out the riddle, and you guys probably figured this out already but I thought i’d post anyways 🙂 ):
When 1 looks at 2 and 3, the only way he would know his color is if both 2 and 3 are white, thus making him red. But he replies
that he doesn’t know. Thus from 1’s answer, we (and the other two prisoners, i assume) know that both 2 and 3 can’t be white
When 2 looks at 1 and 3, he knows that if 3 is white, he in fact is red, from 1’s answer that both 2 and 3 can’t be white. But he replied that he does not know. Meaning that 3 can’t be white.
Thus, 3 must be red.
Or:
1. (1 knows his color) iff (2&3 are white)
2. (1 knows his color) is false
3. (2&3 are white) is false
4. If (3 is white) then (2 can’t be red/2 is white) (from 3)
5. (2 knows his color) iff (3 is white)
6. However, (2 knows his color) is false
7. (3 is white) is false (from 5)
———-
8. 3 is red
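[Editor's sketch: a brute-force check of the deduction above. It assumes, as the riddle intends and as John's comment questions, that each prisoner answers truthfully and reasons perfectly. Names are illustrative.]

```python
from itertools import product

# Every way to put hats on prisoners 1, 2, 3 from 3 red and 2 white hats.
worlds = [w for w in product("RW", repeat=3) if w.count("W") <= 2]

def knows(worlds, i, actual):
    """Can prisoner i (0-based) deduce his own hat in world `actual`?"""
    seen = [actual[j] for j in range(3) if j != i]
    options = {w[i] for w in worlds
               if [w[j] for j in range(3) if j != i] == seen}
    return len(options) == 1

# Prisoner 1 says "I don't know": discard worlds in which he would have known.
worlds = [w for w in worlds if not knows(worlds, 0, w)]
# Prisoner 2, having heard that, also says "I don't know".
worlds = [w for w in worlds if not knows(worlds, 1, w)]

third_hat = {w[2] for w in worlds}
print(third_hat)   # {'R'}
```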
Kizzle says:

June 10, 2005 at 2:02 pm

yikes.
4. If (3 is white) then (2 can’t be red/2 is white) (from 3)
should be:
4. If (3 is white) then (2 can’t be white/2 is red) (from 3)
Kizzle says:

June 10, 2005 at 2:09 pm

And as for John’s comment about the real world assumptions get you every time, I answered very similarly to his version in some of my logic class exams, but for some reason the teacher didn’t come to the same conclusion as I did and marked me wrong every time 🙂
John Goodwin says:

June 10, 2005 at 2:41 pm

Yes, Rick, the point of the variance stabilizing transformation is to make the distribution more normal. But even more important it is to stabilize the variance. The variance needs to be the same because, you see, Independent and *Identically* distributed implies, among other things *identical* variance. The question is how to get the Central Limit to work with 80 people in the precinct and not require 800 or 8000. A “rule of thumb” is 30 is ok (but no guarantees–see ch. 9 on the PDF I sent you).
The basic problems, in approaching a gaussian, are (1) skew, (2) non constant variance, and (3) non-normality. Central limit doesn’t care nearly so much about normality [after all, binomial isn’t normal near zero–normal means finite probability of -83 voters, say. Binomial stops at zero ], as it cares about mixing things with different widths and skewness.
That is what was so funny about Freeman and the first figure of the USCV paper. They seemed to imply that all *50 states* should somehow obey Central Limit. Lesseee, I take one sample from each of 50 different distributions, and I get a normal distribution with unit stdev–because we all know everything in the world is random, all distributions are identical, and the answer is always Z.
John Goodwin says:

June 10, 2005 at 2:48 pm

Rick – Here is a link
http://darkwing.uoregon.edu/~robinh/arcsin.txt
Reposted from this (Dec 14th) thread–
http://www.mysterypollster.com/main/2004/12/exits_were_the_.html
Rick Brady says:

June 10, 2005 at 2:51 pm

John, thanks. Working on it. I haven’t had time to look at the PDF you sent in detail. I’m on “vacation” now in Seattle. My sister is graduating UW this evening (Department) and tomorrow (campus).
I’m hearing that the Fisher transform didn’t quite work. Not sure on the details. I’ll try to skim the PDF soon.
Regarding your Freeman/USCV comment, to be fair John, you criticized him on different grounds before, so I see that your (like our) approach to the analysis is evolving. That’s a good thing!
I just got a walk through of my gramps’ boat that he’s been building from scratch for 30 years. He lives in Bremerton. It’s a big boat and he designed a jet propulsion system from scratch. Fancy. Lots of sanding when you make a boat. Really dusty. I need a shower.
John Goodwin says:

June 10, 2005 at 3:00 pm

Rick – I live in Seattle if you want to meet in person somewhere this weekend.
http://www.mysterypollster.com/main/2005/05/aapor_exit_poll.html
where I say, re USCV
[[In the first graph, please note and comment on the funnel shape. I am glad the discussion has finally moved into the importance of p very different from 0.5 as related to nested errors. This is always where the discussion should have been. Read my lips: “Arcsin stabilizing transformation”.]]
I don’t expect the transformation to do any more than Febble has already gotten out of it–it’s the same one if done correctly.
Rick Brady says:

June 10, 2005 at 3:14 pm

Hey John: I’d love to meet up, but I have no car and a lot of family to hang out with before I leave early Sunday morning. In a few months my gramps plans to launch his boat in Port Townsend so I’ll be up here again with less on my plate.
Awaiting results from the sims… Thanks for the help, but I’m not sure we’re there yet. If there is no “perfect” transform, any suggestions on where to go from here? We simply want the best dependent measure possible for the regression analysis.
Jack Neefus says:

June 10, 2005 at 3:52 pm

On the scatterplot Mitofsky released after the AAPOR:
http://inside.bard.edu/~lindeman/aapor1b.jpg
The more I look at the Y axis, the more I’m stunned by the huge variance. There are at least a dozen points which have an absolute value of 50% or more.
(1) Does a 60% variance mean that an exit poll worker reported an 80-20 Kerry win in an 80-20 Bush precinct? (That would be equivalent to reporting a Bush landslide in Harlem or a Kerry landslide in North Dakota.) And how is that even mathematically possible in a 50-50 precinct?
Or does the 60% WPE mean: The official results showed Kerry winning by 10% and the exit polls showed him winning by 16%? That would be a little more reasonable. But even so, it’s still surprising with the constant denial rate.
(2) What do the outliers have to say about the exit poll overall? If the reason for those outliers could be understood, it could shed new light on why the entire poll was off. I’m assuming the geographical location of those dots is unknown. But it could be crucial.
Would appreciate any input. I appreciate the grubbiness of data collected in the field, but this is surprising even to me.
(Note: I post on DU as “ribofunk.”)
Rick Brady says:

June 10, 2005 at 4:37 pm

Hey Jack! I would recommend not looking at the WPE chart when trying to distinguish the outliers. I don’t have the link, but I believe that Mark L. posted the one with ln(alpha). That is a better measure for error by precinct partisanship, although we’re learning now it’s not perfect. But you are right, the variance is huge and those outliers do appear to be absolutely unexplainable by chance.
But the outliers are on both sides, so if it’s fraud, it’s likely on both sides. If it’s something else, it’s likely on both sides. However, the regression line has a y-intercept above zero. That means (because of the sign convention) that there is a net error (bias) to Kerry.
You wrote: “And how is that even mathematically possible in a 50-50 precinct?” It’s not. But that doesn’t mean it’s fraud. It might be. It might not be. But recall, there are many outliers on both sides.
It does not appear that the outliers are affecting the y-intercept or the slope. That tells me there are two things worthy of exploring: 1) the outliers (what gives?); and 2) the y-intercept (as Lizzie says, “floating sausage”).
John Goodwin says:

June 10, 2005 at 8:28 pm

People seem to want some sort of benchmark to answer “is this WPE reasonable”. I would suggest one from Random Walk theory. On a level plane (p=q=0.5), a particle doing a random walk–net vote count with each net vote shifting +1/-1 depending on who gets the next one–will “spread out” as a “gaussian wave packet” with width sqrt(N). For N=number of votes and small sample (estimated mean) split 50/50, so mu=0 to start with, the “likelihood that election could happen” is just what you think it is.
Use sigma=sqrt(Npq) and calculate critical values of the normal distribution–that’s what everyone’s first intuition is.
So, if mu is not zero in your sample, you let it "diffuse" towards the election. Since n.sample << N.vote, if p=q=0.5 it certainly *can* get there.
Now, let's consider a random walk on a *tilted* plane. p-q is not zero, but there's a "drift". So our expanding gaussian wave packet, growing outwards like ripples in a pond, but also slipping to one side. What is the "point of no return"?
It is the critical slope/variance combination for which the wave's growth is exactly compensated by its drift. At that point, we have a definition of "a lead you can't catch":
mu = 4*p*q/abs(p-q) [in one simple theory]
except if p=q=0.5, in which case it is infinite, or even for mu incredibly large, there is *some* election that can counteract it.
Of course, if our sample is big enough, the real election doesn't have "N enough to catch up", even for p=0.5=q.
In any event, this simple consideration lets us see "how reasonable" WPE is:
p=0.1 => mu.crit = 0.45
p=0.2 => mu.crit = 1.07 [that’s your 80/20 case]
p=0.4 => mu.crit = 4.5!
p=0.5 exactly => mu.crit = infinite
… and symmetric on the other side.
Show me the math:
http://home.eng.iastate.edu/~mayao/EE523/EE523_w12.pdf
John Goodwin says:

June 10, 2005 at 8:31 pm

Repost without less than and greater than signs:
People seem to want some sort of benchmark to answer “is this WPE reasonable”. I would suggest one from Random Walk theory. On a level plane (p=q=0.5), a particle doing a random walk–net vote count with each net vote shifting +1/-1 depending on who gets the next one–will “spread out” as a “gaussian wave packet” with width sqrt(N). For N=number of votes and small sample (estimated mean) split 50/50, so mu=0 to start with, the “likelihood that election could happen” is just what you think it is.
Use sigma=sqrt(Npq) and calculate critical values of the normal distribution–that’s what everyone’s first intuition is.
So, if mu is not zero in your sample, you let it “diffuse” towards the election. Since n.sample is very much less than N.vote, if p=q=0.5 it certainly *can* get there.
Now, let’s consider a random walk on a *tilted* plane. p-q is not zero, but there’s a “drift”. So our expanding gaussian wave packet, growing outwards like ripples in a pond, but also slipping to one side. What is the “point of no return”?
It is the critical slope/variance combination for which the wave’s growth is exactly compensated by its drift. At that point, we have a definition of “a lead you can’t catch”:
mu = 4*p*q/abs(p-q) [in one simple theory]
except if p=q=0.5, in which case it is infinite, or even for mu incredibly large, there is *some* election that can counteract it.
Of course, if our sample is big enough, the real election doesn’t have “N enough to catch up”, even for p=0.5=q.
In any event, this simple consideration lets us see “how reasonable” WPE is:
p=0.1 … mu.crit = 0.45
p=0.2 … mu.crit = 1.07 [that’s your 80/20 case]
p=0.4 … mu.crit = 4.5!
p=0.5 exactly … mu.crit = infinite
… and symmetric on the other side.
Show me the math:
http://home.eng.iastate.edu/~mayao/EE523/EE523_w12.pdf
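[Editor's sketch: the "lead you can't catch" formula above, evaluated as written. The function name is illustrative. Note that the formula gives 4.8 at p=0.4, not the 4.5 quoted in the list. The other listed values match.]

```python
import math

def mu_crit(p):
    """Critical lead mu = 4*p*q/abs(p-q); infinite when p == q == 0.5."""
    q = 1.0 - p
    if p == q:
        return math.inf
    return 4.0 * p * q / abs(p - q)

for p in (0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9):
    print(p, round(mu_crit(p), 2))
```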
John Goodwin says:

June 10, 2005 at 8:40 pm

So, what it means:
This is backwards from most discussion so far–in the non-partisan precincts, a poll that is “way off” is no evidence of fraud,
because there is some size of election that will erase that lead. It is in the *partisan* precincts where the strongest evidence of fraud would lie, if the poll confirmed a large lead, and the results were too close to even!
The missing sensitivity analysis has to do with “uncertainty in p”. p is in the denominator, so small changes to it affect criticality a lot [taking partials, I get approx. but directly a 4*dp shift to mu.crit for uncertainty dp!].
Mark Lindeman says:

June 11, 2005 at 12:01 pm

Jack,
It turns out to be really helpful to keep
http://www.exit-poll.net/
bookmarked, or most crucially, to download the January evaluation report and be ready to fire up the PDF at a moment’s notice.
Maybe not so helpful in this particular case because I think the glossary definition of WPE is actually misleading! But I do use the report a lot.
(1) WPE doesn’t quite mean either of those things. It is the difference in net _margin_ between the exit polls and the official results. So, a -60% WPE favoring Kerry (i.e., Kerry does better in the exit poll) could be, for instance, if the official result was 40% Kerry/60% Bush (-20), but the exit poll had it 70% Kerry/30% Bush (+40). The average WPE of -6.5% implies (crudely) that the exits gave Kerry 3+% more than the official returns, and Bush 3+% less. (That isn’t quite right, but it gives a flavor.)
(2) I think you’ve stated the problem well. I doubt that the outliers are the proverbial “smoking gun,” but they could at least be a contrail pointing to innocent error, fraud, or both.
Mark (“OTOH”)
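[Editor's sketch of the WPE arithmetic in point (1) above. The sign convention is inferred from Mark's example (official Kerry-minus-Bush margin minus exit-poll Kerry-minus-Bush margin, so a negative value means the exit poll was more favorable to Kerry). The function name and argument order are illustrative.]

```python
def wpe(official_kerry, official_bush, poll_kerry, poll_bush):
    """Within-precinct error in percentage points of margin."""
    official_margin = official_kerry - official_bush
    poll_margin = poll_kerry - poll_bush
    return official_margin - poll_margin

# Mark's example: official 40/60, exit poll 70/30.
print(wpe(40, 60, 70, 30))   # -60
```

A margin error is split across both candidates, which is why an average WPE of -6.5 corresponds, crudely, to a little over 3 points too many for Kerry and a little over 3 too few for Bush.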
Marty H says:

June 14, 2005 at 3:33 am

Mark L-
Do you know if USCV discussed the implications of the New Hampshire recount at all? NH had the second highest exit poll discrepancy, yet a partial hand recount organized by Nader verified the official tally. That’s proof enough for me that the New Hampshire exit polls were wrong-and there’s no reason to expect the rest of the states’ exit polls to be any more accurate.
Thanks,
Marty
Rick Brady says:

June 14, 2005 at 9:03 am

Marty: Another odd fact. Dr. Steven Freeman of USCV pointed to the BYU poll of Utah as evidence of general exit poll accuracy and then concluded that the NEP poll, which was off by more than 2 points in Utah, was accurate for Ohio, Pennsylvania, and Florida. Evidence inconvenient to their points is conveniently ignored.
Marty H says:

June 15, 2005 at 3:29 pm

Rick-
My take at this point is that there are three groups in this debate:
There’s USCV who believes that the exit poll data is strong enough for a “guilty” verdict on fraud (the first paper)-or at least an indictment (the current working paper.) Of course, USCV is not alone in this belief.
There are other people (including some former USCV members) who believe that sufficient fraud to tip the election may have taken place, but do not believe the exit poll data we have is enough evidence to convict, and so they render a “Not Guilty” verdict. I believe that Elizabeth, Bruce, and Mark L. fall into this category.
The third group (to which I belong) believe that the official election results reflect the will of the people on Nov 2. That’s the “Innocent” group. That does not mean that the process overall is perfect, but it was good enough.
I just wanted to see if anyone from the “Not Guilty” camp had any comments on NH, pro or con.
Regarding inconvenient data, it can be ignored, discounted, or disproven (from weakest case to strongest.) It appears that USCV ignores it, but they may have a rational reason to discount these data.
If my expertise were in polling, Utah would make a great case study. What was different between the BYU and the NEP polls that accounted for the difference in accuracy? I’d look at completion rates first-was there a significant difference? Why? Mitofsky had problems with younger interviewers, but my guess is that the BYU poll was done by BYU students, most likely under the age of thirty. Perhaps BYU has a positive association with Utah voters, while the media does not. Was the BYU poll conducted by enthusiastic, well trained volunteers as opposed to people “just collecting a paycheck”? Were dress codes enforced for either group? NEP pollsters worked individually, BYU pollsters were paired (I believe). It’d seem like there would be enough data there to keep an army of statisticians busy until the next election.
Marty
Cate Marshall says:

June 16, 2005 at 1:54 pm

I would like to suggest that there is a fourth group that Marty H has not identified: those who are concerned about election fidelity. If you focus on election fraud, the key issue becomes whether there is evidence in the last election of bias large enough to change the outcome. A focus on election fidelity leads to a different set of questions. The first of these is: How well does our voting system capture the intent of the voters? It seems to me that the limited exit poll data that has been made available for public analysis indicates either (1) serious problems with election fidelity, or (2) serious problems with exit poll methodology. Given the serious consequences that low election fidelity has for the proper functioning of democracy, does not this merit further serious analysis?
Rick Brady says:

June 16, 2005 at 2:31 pm

Marty:
Info on the BYU exit poll is here: http://exitpoll.byu.edu/
Students. The KBYU/Utah Colleges Exit Poll is the only existing statewide student-run exit poll in the nation. BYU undergraduate students from three disciplines — statistics, political science and communications — design the project.
On Election Day, more than 1,000 student volunteers from campuses across the state carry out the exit poll. The direct involvement by students facilitates first-hand learning of the political process and is an invaluable educational opportunity for all who participate.
* 800 volunteer interviewers fan out to 90 polling places
* 200 students volunteer for data entry assignments
* 100 students work with logistical coordination and roving crisis teams
* 75 students produce the KBYU Election Night news broadcast
* 30 gallons of hot chocolate and hundreds of donuts help the students cover an estimated 4,000 miles on the road on Election Day
The students are not all from BYU. I spoke to J. Quin Monson, Assistant Director of the Center for the Study of Elections and Democracy at BYU in Miami. He said that they don’t identify BYU as the sponsor of the survey. I mentioned the BYU v. Utah Utes rivalry as being a possible source of bias and he laughed.
The fact is that they have much larger sample sizes, a better training program, they poll in shifts, and they’re sampling a fairly homogeneous population. That’s why I think they “nailed” it and seem to have a good track record.
I’ll send Quin an e-mail and ask about completion rates. It was probably on their board at AAPOR, but I don’t recall.
I don’t have much of an opinion about New Hampshire. I’m sure that ballot stuffing is a “plausible” explanation for why the recount confirmed the election result.
Fran says:

June 16, 2005 at 2:43 pm

Cate, we need to work both sides – election fidelity and exit poll methodology. My gut tells me that the election was not fraudulent; however, the lack of a paper trail from the touch screens invites fraud and, if we leave things as they are, fraud will occur sooner or later. With respect to exit polls, we either need to do a more wide-ranging and careful job (much more expensive) or we need to convince the American public that exit polls are not so much a check on the vote count as a check on the characteristics of who voted for whom. One more thing – brought to mind by those early results showing a few red states going heavily for Kerry – the exit pollsters need to have a methodology to prevent or flag the “slamming” of exit polls.
Rick Brady says:

June 16, 2005 at 3:21 pm

Marty:
Quick response from Quin. He said the poll had a ~65% response rate. Prior retail sales experience was the only interviewer characteristic correlated with response rate (presumably, the greater the experience, the easier the “sale”).
Other factors like time of day and urban/rural location were more strongly associated with response rate.
I’m wondering if they have a dependent measure for within precinct error, or if they bothered to look at potential sources of bias at all.
Cate Marshall says:

June 16, 2005 at 4:01 pm

Fran, as you note with respect to exit polls, money is a finite resource. The same is true with respect to voting systems. There are now many proposals for voting system changes, some of them mutually exclusive, most of them requiring the expenditure of public funds. While I do not dismiss the need to address hypothetical scenarios for election fraud for the future, I think that begs the question of what happened in the last national election. Fidelity problems impact far more than the outcome of the national race. If we had fidelity problems in the last election, I think it’s important to understand and address those problems.
Thus I am suggesting that examination of the exit poll data needs to go beyond a debate about whether there is evidence of “fraud.” Is there evidence of problems with election fidelity? If so, how can the exit poll data help us understand what is causing those problems?
Fran says:

June 17, 2005 at 3:51 pm

While I appreciate all of the work that has been done with the limited information that is available to us, UNTIL we can match precinct polling data with precinct results, we can’t say or prove much (even statistically). Were that data available, then I suspect we could find likely reasons for many of the WPE and I suspect we would find more than one reason.
Marty H says:

June 20, 2005 at 7:45 pm

Rick-
Sorry for the delayed “thank you” in addressing the questions I put out there. This internet thingy might catch on: you ask a rhetorical question and get back real answers.
So response rates in Utah correlated with prior retail sales experience, meaning you have to “sell” interviewees on taking the survey. Again, the “art” of polling rears its head.
Marty
