
April 29, 2005

The Liddle Model That Could

Regular readers of this blog may know her as "Febble," the diarist from DailyKos.  Her real name is Elizabeth Liddle, a 50-something mother and native of Scotland who originally trained as a classical musician and spent most of her life performing, composing and recording renaissance and baroque music.  She also studied architecture and urban design before enrolling in a PhD program in Cognitive Psychology at the University of Nottingham, where she is currently at work on a dissertation on the neural correlates of dyslexia.  In her spare time, she wrote a children's book and developed an interest in American politics while posting about the British Labour Party on DailyKos.  A self-proclaimed "fraudster," she came to believe our election "may well have been rigged," that "real and massive vote suppression" occurred in Ohio, where the recount was "a sham" and the Secretary of State "should be jailed."

She is, in short, perhaps the least likely person imaginable to have developed a computational model that both suggests a way to resolve the debate about whether the exit polls present evidence of fraud and undermines the central thesis of a paper by a team of PhD fraud theorists. 

But that's exactly what she's done.

Assignment editors, if you are out there (and I know you are), while it is a bit complex and geeky, this is an amazing story.  Read on...

Let's start at the beginning. 

As nearly everyone seems to know, the "early" exit polls conducted on November 2, 2004 (the ones tabulated just before the polls closed in each state) had John Kerry leading George Bush nationally and had him running stronger in most states than he ultimately did in the final count.  On the Internet, an argument has raged ever since about whether those exit polls present evidence of vote fraud, about the possibility that the exit polls were right and the election result was wrong. 

In late January, Edison Research and Mitofsky International, the two companies that conducted the exit polls on behalf of the consortium of television network news organizations known as the National Election Pool (NEP), issued a public report that provided an analysis of what happened, accompanied by an unprecedented release of data and information about the workings of the exit polls.  The tabulations provided in that report are the basis of the ongoing debate. 

On March 31, an organization known as U.S. Count Votes (USCV) released a new "scientific paper" and executive summary that "found serious flaws" with the Edison-Mitofsky (E-M) report (according to the USCV press release).  They found that the E-M explanation "defies empirical experience and common sense" and called for a "thorough investigation" since "the absence of any statistically-plausible explanation for the discrepancy between Edison/Mitofsky's exit poll data and the official presidential vote tally is an unanswered question of vital national importance" (p. 22).  [MP also discussed this paper and its conclusions in a two-part series earlier this month].

In an email, Elizabeth Liddle explained to me that she discovered the USCV web site while "poking about the web" for data on the U.S. elections.  After doing her own statistical tests on Florida election returns, she became part of the discussion among the USCV statisticians that ultimately led to their paper.  While she reviewed early drafts, she ultimately came to disagree with the conclusions of the final report. 

[Full disclosure:  For the last two weeks, I have had the unique opportunity to watch the development of Elizabeth's work through a running email conversation between Elizabeth, Rick Brady of StonesCryOut and "DemFromCT" from DailyKos. My use of the familiar "Elizabeth" henceforth results from that remarkable experience.  This post benefits greatly from their input, although as always, the views expressed here are my own].

To understand Elizabeth's contribution to this debate, we need to consider the various possible reasons why the exit polls might differ from the count. 

Random Sampling Error? - All polls have some built-in error (or variability) that results from interviewing a sample of voters rather than the entire population.  Although there have been spirited debates about the degree of significance within individual states (here, here, here and here), all agree that there was a consistent discrepancy at the national level that had Kerry doing better in the poll than the count.  The "national sample" of precincts (a subsample of 250 precincts) showed Kerry winning by three points (51% to 48%), but he ultimately lost the race by 2.5 points (48.26% to 50.73%).  The E-M report quantifies that error (on p. 20) by subtracting the Bush margin in the election (+2.5) from the Bush margin in the poll (-3.0) for a total error on the national sample of -5.5 (a negative number means an error that favored Kerry in the poll, a positive number means an error favoring Bush).
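For readers who want to check the arithmetic, here is the computation in miniature (a Python sketch using the figures just quoted):

```python
def error_on_margin(bush_poll, kerry_poll, bush_count, kerry_count):
    """Error on the Bush-Kerry margin: the poll's Bush margin minus the
    count's Bush margin.  Negative values mean the poll favored Kerry."""
    return (bush_poll - kerry_poll) - (bush_count - kerry_count)

# National sample: poll 48%-51%, count 50.73%-48.26% (percentage points)
print(error_on_margin(48.0, 51.0, 50.73, 48.26))   # about -5.5
```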

At the state level, the E-M report showed errors in the Bush-Kerry poll margin (the Bush vote minus the Kerry vote) favoring Kerry in 41 of 50 states and averaging -5.0.  At the precinct level, the discrepancy favoring Kerry in the poll averaged -6.5.  Both E-M and USCV agree that random sampling error alone does not explain these discrepancies.

Biased Sample of Precincts? - Exit pollsters use a two-step process to sample voters.  They first draw a random sample of precincts and then have interviewers approach a random sample of voters at each precinct.  The exit pollsters can check for any sort of systematic bias in the first stage by simply replacing the interviews in each selected precinct with the count of all votes cast in that precinct.  As explained on pp. 28-30 of the E-M report, they did so and found a slight error in Bush's favor (+0.43).  Both E-M and USCV agree that the selection of precincts did not cause the larger discrepancy in Kerry's favor.  The remaining discrepancy occurred at the precinct level, something the E-M report calls "within precinct error" (or WPE) (p. 31). 

Response Bias?  - The E-M report summarized its main conclusion (p. 3):

Our investigation of the differences between the exit poll estimates and the actual vote count point to one primary reason: in a number of precincts a higher than average Within Precinct Error most likely due to Kerry voters participating in the exit polls at a higher rate than Bush voters (p. 3)

Although the E-M report made no attempt to prove that Kerry voters were more likely to want to participate in exit polls than Bush voters (they noted that the underlying "motivational factors" were "impossible to quantify" - p. 4), they based their conclusion on two findings:  (1) a similar but smaller pattern of Democratic overstatement had occurred in previous exit polls and (2) errors were greater when interviewers were less experienced or faced greater challenges following the prescribed random selection procedures.  [Why would imperfect performance by the interviewers create a pro-Kerry bias?  See Note 1]   

Bias in the Official Count?  - The USCV report takes strong exception to the E-M conclusion about response bias, which they termed the "reluctant Bush responder (rBr)" hypothesis.  The USCV authors begin by arguing that "no data in the E/M report supports the hypothesis that Kerry voters were more likely than Bush voters to cooperate with pollsters (p. 8)."   But they go further, claiming "the required pattern of exit poll participation by Kerry and Bush voters to satisfy the E/M exit poll data defies empirical experience and common sense" (p. 12).   This is the crux of the USCV argument.  Refute the "reluctant Bush responder" theory, and the only remaining explanation is bias in the official count. 

To make this case, the USCV authors scrutinize two tables in the E-M report that tabulate the rate of "within precinct error" (WPE) and the survey completion rates by the "partisanship" of the precinct (in this case, partisanship refers to the percentage of the vote received by Kerry).   I combined the data into one table that appears below: 

If the Kerry voters had been more likely to participate in the poll, the USCV authors argue, we would "expect a higher non-response rate where there are many more Bush voters" (p. 9).  Yet, if anything, the completion rates are "slightly higher [0.56] in precincts where Bush drew >=80% of the vote (High Rep) than in those where Kerry drew >=80% of the vote (High Dem)" [0.53 - although the E-M report says these differences are not significant, p. 37].

Yet the USCV report concedes that this pattern is "not conclusive proof that the E/M hypothesis is wrong" because of the possibility that the response patterns were not uniform across all types of precincts (p. 10).  I made a similar point back in January.  They then use a series of algebraic equations (explained in their Appendix A) to derive the response rates for Kerry and Bush voters in each type of precinct that would be consistent with the overall error and response rates in the above table. 

Think of their algebraic equations as a "black box."  Into the box go the average error rates (mean WPE) and overall response rates from the above table, plus three different assumptions of the Kerry and Bush vote in each category of precinct.  Out come the differential response rates for Kerry and Bush voters that would be consistent with the values that went in.  They get values like those in the following chart (from p. 12):

[Chart: USCV's derived Kerry and Bush exit poll participation rates by precinct partisanship, from p. 12 of the USCV report]


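Readers who want to see the mechanics can reconstruct the basic idea.  The sketch below is my own rough reconstruction in Python - not USCV's code, and it ignores third-party votes - but it shows how a pair of implied response rates falls out of a mean WPE, an overall completion rate and an assumed vote split:

```python
def implied_response_rates(wpe, completion, kerry_share):
    """Solve for the Kerry and Bush completion rates that would reproduce
    a given mean WPE (in points; negative = poll overstates Kerry) and an
    overall completion rate, in a precinct where Kerry's true share of the
    two-candidate vote is kerry_share."""
    e = wpe / 100.0                       # WPE as a proportion
    bush_share = 1.0 - kerry_share
    # From: completion = kerry_share*r_kerry + bush_share*r_bush
    # and:  e = (bush_share*r_bush - kerry_share*r_kerry)/completion
    #           - (bush_share - kerry_share)
    r_bush = completion * (2 * bush_share + e) / (2 * bush_share)
    r_kerry = completion * (2 * kerry_share - e) / (2 * kerry_share)
    return r_kerry, r_bush

# Sanity check: a 56%/50% response split in an evenly divided precinct
# produces a WPE of about -5.7 and a completion rate of 53%; inverting
# those numbers should recover the 56% and 50%.
print(implied_response_rates(-5.66, 0.53, 0.5))    # roughly (0.56, 0.50)
```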
The USCV authors examine the values in this chart, note the large differences in required response rates for Kerry and Bush supporters in the stronghold precincts at the extreme right and left of the chart, and conclude:

The required pattern of exit poll participation by Kerry and Bush voters to satisfy the E/M exit poll data defies empirical experience and common sense under any assumed scenario [p. 11 - emphasis in original].

Thus, they find that "the 'Reluctant Bush Responder' hypothesis is inconsistent with the data," leaving the "only remaining explanation - that the official vote count was corrupted" (p. 18).  Their algebra works as advertised (or so I am told by those with the appropriate expertise).  What we should consider is the reliability of the inputs to their model.  Is this a "GIGO" situation?  In other words, before we put complete faith in the values and conclusions coming out of their algebraic black box, we need to carefully consider the reliability of the data and assumptions that go in.

First, consider the questions about the completion rates (which I discussed in an earlier post).  Those rates are based on hand tallies of refusals and misses kept by interviewers on Election Day.  The E-M report tells us that 77% had never worked as exit poll interviewers before and virtually all worked alone without supervision.  The report also shows that rates of error (WPE) were significantly higher among interviewers without prior experience or when the challenges they faced were greater.  At the very least, these findings suggest some fuzziness in the completion rates.  At most, they suggest that the reported completion rates may not capture all of the "differential" response that could have created the overall discrepancy [How can that be?  See Note 2]. 

The second input into the USCV model is the rate of error in each category of precinct, more specifically the mean "within precinct error" (WPE) in the above table.  This, finally, brings us to the focus of Elizabeth Liddle's paper.  Her critical insight, one essentially missed by the Edison-Mitofsky report and dismissed by the USCV authors, is that "WPE as a measure is itself confounded by precinct partisanship."  That "confound" creates an artifact that produces a phantom pattern in the tabulation of WPE by partisanship.  The phantom values going into the USCV model are another reason to question the "implausible" values that come out. 

Elizabeth's Model

Elizabeth's paper explains all of this in far greater detail than I will attempt here, and is obviously worth reading in full by those with technical questions (also see her most recent DailyKos blog post).  The good news is she tells the story with pictures.  Here is the gist.

[Update: here is an alternate link for the paper]

Her first insight, explained in an earlier DailyKos post on 4/6/05, is that the value of WPE "is a function of the actual proportion of votes cast."  Go back to the E-M explanation for the exit poll discrepancy:  Kerry voters participated in the exit poll at a slightly higher rate (hypothetically 56%) than Bush voters (50%).  If there were 100 Kerry voters and 100 Bush voters in a precinct, an accurate count would show a 50-50% tie in that precinct, but the exit poll would sample 56 Kerry voters and 50 Bush voters, showing Kerry ahead 53% to 47%.  This would yield a WPE of -6.0.  But consider another hypothetical precinct with 200 Bush voters and 0 Kerry voters.  Bush will get 100% in the poll regardless of the response rate.  Thus response error is impossible and WPE will be zero.  Do the same math at different levels of Bush and Kerry support in between, and you will see that if you assume constant response rates for Bush and Kerry voters across the board, the WPE values get smaller (closer to zero) as the vote for the leading candidate gets larger, as illustrated in the following chart (you can click on any chart to see a full-size version):

[Figure: WPE by precinct partisanship when the Kerry and Bush response rates are held constant]


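For those who want to reproduce that pattern themselves, here is a small sketch (in Python, using the hypothetical 56% and 50% response rates; the -5.7 in the evenly split case is simply the -6.0 above before the poll shares are rounded to 53% and 47%):

```python
def wpe(kerry_share, r_kerry=0.56, r_bush=0.50):
    """WPE in points (negative = poll overstates Kerry), assuming constant
    response rates in a two-candidate precinct."""
    poll_kerry = kerry_share * r_kerry
    poll_bush = (1 - kerry_share) * r_bush
    poll_bush_share = poll_bush / (poll_kerry + poll_bush)
    poll_margin = 100 * (2 * poll_bush_share - 1)          # Bush minus Kerry
    count_margin = 100 * (1 - 2 * kerry_share)
    return poll_margin - count_margin

for k in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(f"Kerry share {k:.0%}: WPE = {wpe(k):+.1f}")
# The error is largest near 50/50 and shrinks to zero in the pure
# Bush and pure Kerry precincts.
```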
Although this pattern is an artifact in the data, it does not undermine the USCV conclusions.  As they explained in Appendix B (added on April 12 after Elizabeth's explanation of the artifact appeared in her 4/6/05 DailyKos diary), the artifact might explain why WPE was larger (-8.5) in "even" precincts than in Kerry strongholds (+0.3).  However, it could not explain the larger WPE (-10.0) in Bush strongholds.  In effect, they wrote, it made their case stronger by making the WPE pattern seem even more improbable:  "These results would appear to lend further support to the 'Bush Strongholds have More Vote-Corruption' (Bsvcc) hypothesis" (p. 27). 

However, Elizabeth had a second and more critical insight.  Even if the average differential response (the differing response rates of Kerry and Bush voters) were constant across categories of precincts, those differences would show random variation at the precinct level. 

Consider this hypothetical example.  Suppose Bush voters tend to be more likely to co-operate with a female interviewer, Kerry voters with a male interviewer.  And suppose the staff included equal numbers of each.  There would be no overall bias, but some precincts would have a Bush bias and some a Kerry bias.   If more of the interviewers were men, you'd get an overall Kerry bias.  Since the distribution of male and female interviewers would be random, a similar random pattern would follow in the distribution of differences in the response rates. 

No real world poll is ever perfect, but ideally the various minor errors are random and cancel each other out.  Elizabeth's key insight was to see that this random variation would create another artifact, a phantom skew in the average WPE when tabulated by precinct partisanship. 

Here's how it works:  Again, a poll with no overall bias would still show random variation in both the response and selection rates.  As a result, Bush voters might participate in the poll in some precincts at a greater rate than Kerry voters resulting in errors favoring Bush.  In other precincts, the opposite pattern would produce errors favoring Kerry.  With a large enough sample of precincts, those random errors would cancel out to an average of zero. The same thing would happen if we calculated the error in precincts where the vote was even.  However, if we started to look at more partisan precincts, we would see a skew in the WPE calculation:  As the true vote for the leading candidate approaches the ceiling of 100%, there would be more room to underestimate the leader's margin than to overestimate it. 

If that description is confusing, the good news is that Elizabeth drew a picture.  Actually, she did far better.  She created a computational model to run a series of simulations of randomly generated precinct data - something financial analysts refer to as a Monte Carlo simulation.   The pictures tell the whole story. 

The chart that follows illustrates the skew in the error calculation that results under the assumption described above -- a poll with no net bias toward either candidate. 

[Figure: Mean and median WPE by precinct partisanship in the simulation with no net bias]


The black line shows the average "within precinct error" (mean WPE) for each of nine categories of partisanship.  The line has a distinctive S-shape, which takes into account both of the artifacts that Elizabeth had observed.  For precincts where Kerry and Bush were tied at 50%, the WPE averaged zero.  In precincts where Kerry leads, the WPE calculation skewed positive (indicating an understatement of Kerry's vote), while in the Bush precincts, it skewed negative (an understatement of the Bush vote). 

In the most extreme partisan precincts, the model shows the mean WPE line turning slightly back toward zero.  Here, the first artifact that Elizabeth noticed (that mean WPE gets smaller as precinct partisanship increases) essentially overwhelms the opposite pull of the second (the effect of random variation in response rates). 

Again, if these concepts are confusing, just focus on the chart.  The main point is that even if the poll had no net bias, the calculation of WPE would appear to show big errors that are just an artifact of the tabulation. They would not indicate any problem with the count. 

Now consider what happens when Elizabeth introduces bias into the model.  The following chart assumes a net response rate of 56% for Kerry voters and 50% for Bush voters (the same values that E-M believes could have created the errors in the exit poll) along with the same pattern of random variation within individual precincts. 

[Figure: Mean and median WPE by precinct partisanship in the simulation with a net Kerry bias (56% vs. 50% response)]


Under her model of a net Kerry bias in the poll, both the mean and median WPE tend to vary with partisanship. The error rates tend to be higher in the middle precincts, but as Elizabeth observes in her paper, "WPEs are greater for high Republican precincts than for high Democrat precincts."

Again, remember that the "net Kerry bias" chart assumes that Kerry voters are more likely to participate in the poll, on average, regardless of the level of partisanship of the precinct.  The error in the model should be identical - and in Kerry's favor - everywhere.  Yet the tabulation of mean WPE shows the error growing more negative toward the Republican end of the scale. 
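I have not seen the code behind Elizabeth's simulations, but the basic idea is easy to re-create.  The sketch below (Python, with my own assumed parameters: uniformly distributed precinct partisanship, a normally distributed log of the response-rate ratio, and arbitrary ten-point bins) runs both scenarios in rough form - no net bias, and then a net Kerry bias of 56% versus 50%:

```python
import math
import random
from collections import defaultdict

def wpe_points(kerry_share, ratio):
    """WPE in points for a two-candidate precinct, where ratio is the
    Kerry:Bush response-rate ratio in that precinct.  Negative values
    mean the poll overstates Kerry relative to the count."""
    poll_kerry = kerry_share * ratio / (kerry_share * ratio + (1 - kerry_share))
    poll_margin = 100 * ((1 - poll_kerry) - poll_kerry)      # Bush minus Kerry
    count_margin = 100 * ((1 - kerry_share) - kerry_share)
    return poll_margin - count_margin

def simulate(n=50000, mean_log_ratio=0.0, sd_log_ratio=0.15, seed=1):
    random.seed(seed)
    bins = defaultdict(list)
    for _ in range(n):
        kerry = random.uniform(0.02, 0.98)                   # true Kerry share
        ratio = math.exp(random.gauss(mean_log_ratio, sd_log_ratio))
        bins[int(kerry * 10)].append(wpe_points(kerry, ratio))
    for b in sorted(bins):
        vals = bins[b]
        print(f"Kerry {10*b:>2}-{10*b+10}%: mean WPE {sum(vals)/len(vals):+6.2f}")

simulate()                                      # no net bias: S-shaped mean WPE
simulate(mean_log_ratio=math.log(0.56 / 0.50))  # net Kerry bias scenario
```

Even the first run, with no bias at all on average, produces binned means that are positive on the Kerry side and negative on the Bush side - the phantom skew described above - while the second run pushes the whole curve in Kerry's favor, with larger errors in the Republican strongholds than in the Democratic ones.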

The Model vs. the Real Data - Implausible?

Now consider one more picture.  This shows the mean WPE and median WPE as reported by Edison-Mitofsky, the input into the USCV "black box" that produced those highly "implausible" results. 

[Figure: Mean and median WPE by precinct partisanship in the actual Edison-Mitofsky data]


Elizabeth used a regression function to plot a trend line for mean and median WPE.  Note the similarity to the Kerry net bias chart above. The match is not perfect (more on that below) but the mean WPE is greater in the Republican precincts in both charts, and both show a divergence between the median and mean WPE in the heavy Republican precincts.  Thus, the money quote from Elizabeth's paper (p. 21): 

To the extent that the pattern in [the actual E-M data] shares a family likeness with the pattern of the modeled data... the conclusion drawn in the USCV report, that the pattern observed requires "implausible" patterns of non-response and thus leaves the "Bush strongholds have more vote-count corruption" hypothesis as "more consistent with the data", would seem to be unjustified. The pattern instead is consistent with the E-M hypothesis of "reluctant Bush responders," provided we postulate a large degree of variance in the degree and direction of bias across precinct types.

In other words, the USCV authors looked at the WPE values by partisanship and concluded they were inconsistent with Edison-Mitofsky's explanation for the exit poll discrepancy.  Elizabeth's analysis shows just the opposite:  The patterns of WPE are consistent with what we would expect had Kerry voters been more likely to participate in the exit polls across the board.  Of course, this pattern does not prove that differential response occurred, but it cuts the legs out from under the effort to portray the Edison-Mitofsky explanation as inherently "implausible." 

It is worth saying that nothing here "disproves" the possibility that some fraud occurred somewhere.  Once again - and I cannot repeat this often enough - the question we are considering is not whether fraud existed but whether the exit polls are evidence of fraud.   

The Promise of the Fancy Function

Her paper also raises some fundamental questions about "within-precinct error," the statistic used by Edison-Mitofsky to analyze what went wrong with the exit polls (p. 19):

[These computations] indicate that the WPE is a confounded dependent measure, at best introducing noise into any analysis, but at worst creating artefacts that suggest that bias is concentrated in precincts with particular degrees of partisanship where no such concentration may exist. It is also possible that other more subtle confounds occur where a predictor variable of interest such as precinct size, may be correlated with partisanship.

In other words, the artifact may create some misleading results for other cross tabulations in the original Edison-Mitofsky report.  But this conclusion leads to the most promising aspect of Elizabeth Liddle's contribution:  She has done more than suggest some theoretical problems.  She has actually proposed a solution that will not only help improve the analysis of the exit poll discrepancy but may even help resolve the debate over whether the exit polls present evidence of fraud.

In her paper, Elizabeth proposed what she called an "unconfounded index of bias," an algebraic function that "RonK" (another DailyKos blogger who also occasionally comments on MP) termed a "fancy function."  The function, derived in excruciating algebraic detail in her paper, applies a logarithmic transformation to WPE. 
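Her paper derives the index formally; as best I can read it, it boils down to a log odds ratio - the log of the ratio between a candidate's odds in the poll and his odds in the official count.  Here is that reading as a Python sketch (my paraphrase, not her code):

```python
import math

def bias_index(poll_kerry_share, count_kerry_share):
    """Log of the ratio of Kerry:Bush odds in the poll to Kerry:Bush odds
    in the count.  Zero means no bias; positive values mean the poll
    overstates Kerry."""
    poll_odds = poll_kerry_share / (1 - poll_kerry_share)
    count_odds = count_kerry_share / (1 - count_kerry_share)
    return math.log(poll_odds / count_odds)

# With constant 56%/50% response rates the index is ln(0.56/0.50), about
# 0.113, whether the precinct is evenly split or a 90% stronghold:
for kerry in (0.5, 0.1, 0.9):
    poll = kerry * 0.56 / (kerry * 0.56 + (1 - kerry) * 0.50)
    print(round(bias_index(poll, kerry), 3))    # 0.113 each time
```

Because the index depends only on the ratio of the two response rates and not on the precinct's partisanship, a constant differential response plots as a flat line - which is exactly the behavior in the charts below.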

Don't worry if you don't follow what that means.  Again, consider the pictures.  When applied to her no-net-bias scenario (the model of the perfect exit poll with no bias toward either candidate), her "BiasIndex" plots as perfectly straight lines, with both the mean and median at 0 (no error) at every level of partisanship (the error bars represent the standard deviation). 

[Figure: Mean and median bias index by precinct partisanship in the simulation with no net bias]


When applied to the "net Kerry bias" scenario, it again shows two straight lines, only this time both lines show a consistent bias in Kerry's favor across the board.  When applied to the model data, the "fancy function" eliminates the artifact as intended. 

[Figure: Mean and median bias index by precinct partisanship in the simulation with a net Kerry bias]


Which brings us to what ultimately may be the most important contribution of Elizabeth's work. It is buried, without fanfare, near the end of her paper (pp. 20-21): 

One way, therefore, of resolving the question as to whether "reluctant Bush responders" were indeed more prevalent in one category of precinct rather than another would be to compute a pure "bias index" for each precinct, rather than applying the formula to either the means or medians given, and then to regress the "bias index" values on categories or levels of precinct partisanship.

In other words, the USCV claim about the "plausibility" of derived response rates need not remain an issue of theory and conjecture.  Edison-Mitofsky and the NEP could choose to apply the "fancy function" at the precinct level.  The results, applied to the USCV models and considered in light of the appropriate sampling error for both the index and the reported completion rates, could help us determine, once and for all, just how "plausible" the "reluctant Bush responder" theory is. 
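For what it is worth, the mechanics of that regression are simple once precinct-level poll and count shares are in hand (they are not public; the data below are purely synthetic, generated only to illustrate the step):

```python
import numpy as np

# Purely synthetic illustration: 2,000 fake precincts with a constant
# underlying Kerry-leaning bias plus precinct-to-precinct noise.
rng = np.random.default_rng(0)
count_share = rng.uniform(0.05, 0.95, 2000)            # true Kerry share
log_ratio = rng.normal(np.log(1.12), 0.15, 2000)       # per-precinct bias
poll_share = count_share * np.exp(log_ratio) / (
    count_share * np.exp(log_ratio) + 1 - count_share)

# Per-precinct bias index: poll log-odds minus count log-odds.
bias = (np.log(poll_share / (1 - poll_share))
        - np.log(count_share / (1 - count_share)))

# Regress the index on partisanship: a slope near zero says the bias is
# uniform across precinct types; a significant slope would say it is
# concentrated in one kind of precinct.
slope, intercept = np.polyfit(count_share, bias, 1)
print(f"intercept {intercept:.3f} (overall bias), slope {slope:.3f}")
```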

The upcoming conference of the American Association for Public Opinion Research (AAPOR) presents Warren Mitofsky and NEP with a unique opportunity to conduct and release just such an analysis.  The final program for that conference, just posted online, indicates that a previously announced session on exit polls - featuring Mitofsky, Kathy Frankovic of CBS News and Fritz Scheuren of the National Opinion Research Center (NORC) at the University of Chicago -- has been moved to a special lunch session before the full AAPOR membership.  I'll be there and am looking forward to hearing what they have to say. 


Epilogue: The Reporting Error Artifact

There is one aspect of the pattern of the actual data that diverges slightly from Elizabeth's model.  Although her model predicts that the biggest WPE values will occur in the moderately Republican precincts (as opposed to Bush strongholds), the WPE in the actual data is greatest (most negative) in the precincts that gave 80% or more of their vote to George Bush. 

A different artifact in the tabulation of WPE by partisanship may explain this minor variation.  In January, an AAPOR member familiar with the NEP data suggested such an artifact on AAPOR's members-only electronic mailing list.  It was this artifact (not the one explained in Elizabeth's paper) that I attempted to explain (and mangled) in my post of April 8.

The second artifact involves human errors in the actual vote count as reported by election officials or gathered by Edison-Mitofsky staff (my mistake in the April 8 post was to confuse these with random sampling error).  Reporting errors might result from a mislabeled precinct, a missed or extra digit, a mistaken digit (a 2 for a 7), or a transposition of two sets of numbers or precincts.  Errors of this sort, while presumably rare, could produce very large values of WPE.  Imagine a precinct with a true vote of 79 to 5 (89%).  Transpose the two numbers and (if the poll produced a perfect estimate) you would get a WPE of 156.  Swap a "2" for the "7" in the winner's tally and the result would be a WPE of 30. 
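A quick sketch shows how little it takes (the numbers below are my own hypotheticals rather than the ones above, and the exact figure depends on how the margins are computed):

```python
def wpe_points(poll_winner_pct, recorded_winner_pct):
    """WPE in points for a two-candidate precinct: the winner's margin in
    the poll minus the winner's margin in the recorded count."""
    return (2 * poll_winner_pct - 100) - (2 * recorded_winner_pct - 100)

# A precinct that truly split 90%-10%, polled perfectly, but had its two
# tallies transposed when the count was recorded:
print(abs(wpe_points(90, 10)))    # 160 points of apparent "error"
# The same precinct with the winner's recorded share garbled to 70%:
print(abs(wpe_points(90, 70)))    # 40 points
```

Compare those values with the single-digit WPEs produced by plausible differential response.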

Since such human errors should be essentially random, they would be offsetting (have a mean of zero) in a large sample of precincts or in closely divided precincts (which allow for large errors in both directions).  However, in the extreme partisan precincts they create an artifact in the tabulation of WPE, because there is no room for extreme overestimation of the winner's margin.  Unlike the differential response errors at the heart of Elizabeth's paper, however, these errors will not tend to get smaller in the most extreme partisan precincts.  In fact, they would tend to have the opposite pattern, creating a stronger artifact effect in the most partisan precincts. 

We can assume that such errors exist in the data because the Edison-Mitofsky report tells us that they removed three precincts from the analysis "with large absolute WPE (112, -111, -80) indicating that the precincts or candidate vote were recorded incorrectly" (p. 34).  Presumably, similar errors may remain in the data that are big enough (absolute WPE values greater than 20 but less than 80) to produce an artifact in the extreme partisan precincts.

I will leave it to wiser minds to disentangle the various potential artifacts in the actual data, but it seems to me that a simple scatterplot showing the distribution of WPE by precinct partisanship would tell us quite a bit about whether this artifact might cause bigger mean WPE in the heavily Republican precincts.  Edison-Mitofsky and NEP could release such data without any compromise of respondent or interviewer confidentiality. 

The USCV authors hypothesize greater "vote corruption" in Bush strongholds.  As evidence, they point to the difference between the mean and median WPE in these precincts:  "Clearly there were some highly skewed precincts in the Bush strongholds, although the 20 precincts (in a sample of 1250) represent only about 1.6% of the total" (p. 14, fn).   A simple scatterplot would show whether that skew resulted from a handful of extreme outliers or a more general pattern.  If a few outliers are to blame, and if similar outliers in both directions are present in less partisan precincts, it would seem to me to implicate random human error rather than something systematic.  Even if the math is neutral on this point, it would be reasonable to ask the USCV authors to explain how outliers in a half dozen or so precincts out of 1,250 (roughly 0.5% of the total) somehow leave "vote corruption" as the only plausible explanation for an average overall exit poll discrepancy of 6.5 percentage points on the Bush-Kerry margin.

Endnotes [added 5/2/2005]:

Note 1) Why would imperfect performance by the interviewers create a pro-Kerry Bias?    Back to text

The assumption here is not that the interviewers lied or deliberately biased their interviews.  Rather, the Edison-Mitofsky data suggest that a breakdown in random sampling procedures exacerbated a slightly greater hesitance by Bush voters to participate. 

Let's consider the "reluctance" side of the equation first.  It may have been about Bush voters having less trust in the news organizations that interviewers named in their solicitations and whose logos appeared prominently on questionnaires and other materials.  It might have been about a slightly greater enthusiasm among some Kerry voters to participate, a result of apparently successful efforts by the DNC to get supporters to participate in online polls. 

Now consider the data presented in the Edison-Mitofsky report (pp. 35-46).  They reported a greater discrepancy between the poll and the vote where:

  • The "interviewing rate" (the number of voters the interviewer counts in order to select a voter to approach) was greatest
  • The interviewer had no prior experience as an exit poll interviewer
  • The interviewer had been hired a week or less prior to the election
  • The interviewer said they had been trained "somewhat or not very well" (as opposed to "very well")
  • Interviewers had to stand far from the exits
  • Interviewers could not approach every voter
  • Polling place officials were not cooperative
  • Voters were not cooperative
  • Poll-watchers or lawyers interfered with interviewing
  • Weather affected interviewing

What all these factors have in common is that they indicate either less interviewer experience or a greater degree of difficulty facing the interviewer.  That these factors all correlate with higher errors suggests that the random selection procedure broke down in such situations.  As interviewers had a harder time keeping track of the nth voter to approach, they may have been more likely to consciously or unconsciously skip the nth voter and substitute someone else who "looked" more cooperative, or to allow an occasional volunteer who had not been selected to slip through.  Challenges like distance from the polling place or even poor weather would also make it easier for reluctant voters to avoid the interviewer altogether.

Another theory suggests that the reluctance may have had less to do with how voters felt about the survey sponsors, or with shyness about expressing a preference for Bush or Kerry, than with a reluctance to respond to an approach from a particular kind of interviewer.  For example, the E-M report showed that errors were greater (in Kerry's favor) when interviewers were younger or had advanced degrees.  Bush voters may have been more likely than Kerry voters to brush off approaching interviewers based on their age or appearance. 

The problem with this discussion is that proof is elusive.  It is relatively easy to conceive of experiments to test these hypotheses on future exit polls, but the data from 2004 - even if we could see every scrap of data available to Edison-Mitofsky - probably cannot provide conclusive proof.  As I wrote back in January, the challenge in studying non-respondents is that without an interview we know little about them.   Back to text

Note 2) Why Are the Reported Completion Rates Suspect? (Back to text)

The refusal, miss and completion rates upon which USCV places such great confidence were based on hand tallies.  Interviewers were supposed to count each exiting voter and approach the "nth" voter (such as every 4th or 6th) to request that they complete an interview (a specific "interviewing" rate was assigned to each precinct).  Whenever a voter refused or whenever an interviewer missed a passing "nth" voter, the interviewer was to record the gender, race and approximate age of each on a hand tally sheet.  This process has considerable room for human error. 

For comparison, consider the way pollsters track refusals in a telephone study.  For the sake of simplicity, let's imagine a sample based on a registration list of voters in which every selected voter has a working telephone and every name on the list will qualify for the study.  Thus, every call will result in either a completion, a refusal or some sort of "miss" (a no answer, a busy signal, an answering machine, etc.).  The telephone numbers are randomly selected by a computer beforehand, so the interviewer has no role in selecting the random name from the list.  Another computer dials each number, so once the call is complete it is a fairly simple matter to ask the interviewer to enter a code with the "disposition" of each call (refusal, no-answer, etc.).  It is always possible for an interviewer to type the wrong key, but the process is straightforward and any such mistakes should be random and rare.

Now consider the exit poll.  The interviewer - and they almost always worked alone - is responsible for counting each exiting voter, approaching the nth voter (while continuing to count those walking by), making sure respondents deposit their completed questionnaires in a "ballot box," and also keeping a tally of misses and refusals. 

What happens during busy periods when the interviewer cannot keep up with the stream of exiting voters?  What if they are busy trying to persuade one selected voter to participate while another 10 exit the polling place?  If they lose track, they will need to record their tally of refusals (including gender, race and age) from memory.  Consider the interviewer facing a particularly high level of refusals, especially in a busy period.  The potential for error is great.  Moreover, the potential exists for inexperienced and overburdened interviewers to systematically underreport their refusals and misses compared to interviewers with more experience or facing less of a challenge.  Such a phenomenon would artificially inflate the completion rates where we would expect to see lower values. 

Consider also what happens under the following circumstances:  What happens when an interviewer - for whatever reason - approaches the 4th or the 6th voter when they were supposed to approach the 5th?  What happens when an interviewer allows a non-selected voter to "volunteer"?  What happens when a reluctant voter exits from a back door to avoid being solicited?  The answer in each case, with respect to the non-response tally, is nothing.  Not one of these deviations from random selection results in a greater refusal rate, even though all could exacerbate differential response error.  So the completion rates reported by Edison-Mitofsky probably omit a lot of the "differential non-response" that created the overall exit poll discrepancy. 

It is foolish, under these circumstances, to put blind faith in the reported completion rates.  We should be suspicious of any analysis that comes to conclusions about what is "plausible" without taking into account the possibility of the sorts of human error discussed above.  Back to text

 

Posted by Mark Blumenthal on April 29, 2005 at 01:58 PM in Exit Polls | Permalink | Comments (34)

April 27, 2005

ABC/Washington Post on Judicial Nominees

The conservative wing of the blogosphere took great exception yesterday to the latest survey from the Washington Post and ABC News that gave front page play to the assertion that "a strong majority of Americans oppose changing the rules to make it easier for Republican leaders to win confirmation of President Bush's court nominees."   The complaints fell into two categories: (1) that the sample was unrepresentative and (2) that the question about "changing the rules" was biased.  MP's quick take is that the former complaints are largely unfounded, the latter debatable.   Let's take a closer look.

1) Biased sample?  Our friend Gerry Daly (of Dalythoughts) nicely summarized the first grievance [though as he points out in the comments section, he did not endorse it]:

The Ankle Biting Pundits, Erick at Red State and Powerline have all noted (hat tip to Michelle Malkin) that while the 2004 exit polls showed that the parties were at parity among voters, the sample in this poll is not; it includes 35% Democrats and 28% Republicans - a 7-point advantage for Democrats.

The problem with this complaint is that ABC News and the Washington Post -- like most polling organizations -- survey all American adults, not just registered or likely voters.  The voting population is slightly more Republican than the population of all adults.  Screening for voters is appropriate in a pre-election survey intended to track the campaign or forecast the outcome, but a survey of what "Americans" think ought to survey, well, all Americans.   Even if you disagree, the issue is not one of "bias" or "over-representation" but of a difference in the population surveyed. 

Among all adults, as opposed to registered or likely voters, most survey organizations have shown a slight Democratic advantage in party ID over the last year, consistent with the ABC/Post results.  I put together the following table that averaged data from 2004 and year-to-date 2005 when available:  The surveys from CBS/New York Times, Harris, the Pew Research Center and Time/SRBI all show a Democratic advantage of two to six points.  Gallup (subscription required) is the exception, showing party ID at parity. 

[Table: Average party identification among adults, by survey organization, 2004 and year-to-date 2005]

More to the point is this sentence in the ABC analysis:

Thirty-five percent of respondents in this survey identify themselves as Democrats, 28 percent as Republicans, about the same as the 2004 and 2005 averages in ABC/Post polls. It was even on average, 31 percent-31 percent, in 2003 [emphasis added].

If anyone from ABC or the Post is reading, it would be helpful to see those averages from 2004 and 2005.  Nonetheless, considering the ABC/Post poll's +/-3% sampling error, the party ID results are within range of the results for the other surveys from 2005 presented above (with the exception of Gallup's 35% GOP number), though they do look a point or two more Democratic than the average of the other surveys. 
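As a rough check, assume a sample of about 1,000 adults (typical for an ABC/Post poll; the actual sample size is not quoted here) and set aside design effects from weighting; the sampling error on the party ID figures alone is about the size of the gap in question:

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error, in percentage points, for a proportion p
    estimated from a simple random sample of size n."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

print(round(margin_of_error(0.35, 1000), 1))   # about 3.0 points on 35%
print(round(margin_of_error(0.28, 1000), 1))   # about 2.8 points on 28%
```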

If we were confident that this small difference resulted from random chance or some sort of sample bias alone, we would want the ABC/Post pollsters to weight their data to correct it.  The problem is that the difference could be the result of slight variations in question wording, of the content of earlier questions that might affect responses to the party ID question at the end of the survey, or perhaps of a small but real momentary change in party identification.  If the difference is just about sampling error or sample bias, weighting could make the survey more representative.  If the difference is about any of the other issues, weighting would make it worse. 

The irony of all this -- one likely not lost on other pollsters -- is that the Washington Post enabled this criticism by breaking with past practice and putting out a PDF summary that included complete results not only for party identification and ideology, but also for the full list of demographics.  MP commends Richard Morin and the Washington Post for taking this step, even though it seems to be bringing them only grief. 

Yes, consumers of polls deserve this level of transparency.  Yes, it is appropriate to ask tough questions about how well any poll represents the nation.  But leaping to the conclusion that the sample composition is "ridiculously bad" (Ankle Biting Pundits) or that it shows "egregious" bias (Powerline) is just flat wrong. 

2) Biased question? - The second category of complaint took issue with the wording and context of the question that was the focus of the coverage:  "Would you support or oppose changing Senate rules to make it easier for the Republicans to confirm President Bush's judicial nominees?" 

The judicial filibuster is an example of the type of issue that makes pollsters' lives miserable.  The underlying issue is both complex and remote.  Few Americans are well informed about the procedures and rules of the Senate, and few have been following the issue closely (only 31% tell robo-pollster Scott Rasmussen they are following stories on the judicial nominees "very closely").  So true "public opinion" with respect to judicial filibusters is largely unformed.   When we present questions about judicial nominees in the context of a survey interview, many respondents will form an opinion on the spot.  Results will thus be very sensitive to question wording.  No single question will capture the whole story, yet every question inevitably "frames" the issue to some degree. 

To MP, the most frustrating bias in media coverage of polling -- be it mainstream or blog -- is the pressure to settle on a single question as the ultimate measure of "public opinion" on any issue.  In a sense, public opinion about issues like the judicial filibuster is inherently hypothetical.  Many Americans, perhaps most, lack a pre-existing opinion.  If we want to know how Americans will react to some future development (or whether they will react at all), no single question can tell us what we need to know. 

The best approach in situations like these is to follow the advice of our old friend, Professor M:

The answer is NOT to find a single poll with the "best" wording and point to its results as the final word on the subject. Instead, we should look at ALL of the polls conducted on the issue by various different polling organizations. Each scientifically fielded poll presents us with useful information. By comparing the different responses to multiple polls -- each with different wording -- we end up with a far more nuanced picture of where public opinion stands on a particular issue. If we can see through such comparisons that stressing different arguments or pieces of information produces shifts in responses, then we have perhaps learned something.

So what can we learn from different polls on this issue?  The PollingReport has a one-page summary that includes the most recent polling on the issue (including survey dates and sample sizes): 

In mid-March, Newsweek found 32% approved and 57% disapproved of changing the rules regarding filibusters when it asked the following question:

U.S. Senate rules allow 41 senators to mount a filibuster -- refusing to end debate and agree to vote -- to block judicial nominees. In the past, this tactic has been used by both Democrats and Republicans to prevent certain judicial nominees from being confirmed. Senate Republican leaders -- whose party is now in the majority -- want to take away this tactic by changing the rules to require only 51 votes, instead of 60, to break a filibuster. Would you approve or disapprove of changing Senate rules to take away the filibuster and allow all of George W. Bush's judicial nominees to get voted on by the Senate?

At the beginning of April, the NBC News/Wall Street Journal poll found 40% who wanted to eliminate the filibuster and 50% who wanted to maintain it, when they asked this question:

As you may know, the president of the United States is a Republican and Republicans are the majority party in both houses of Congress. Do you think that the Republicans have acted responsibly or do you think that they have NOT acted responsibly when it comes to handling their position and allowing full and fair debate with the Democrats?

Then there is this week's ABC/Washington Post survey that found 26% supporting a rule change to make it easier for Bush to win confirmation of his judicial appointees and 66% opposed.  The ABC question had two parts:

"The Senate has confirmed 35 federal appeals court judges nominated by Bush, while Senate Democrats have blocked 10 others. Do you think the Senate Democrats are right or wrong to block these nominations?"  48% said "right," 36% said wrong, 3% both and 13% were unsure

"Would you support or oppose changing Senate rules to make it easier for the Republicans to confirm Bush's judicial nominees?" - 26% support, 66% oppose, 8% unsure.

In an online survey, Rasmussen Reports asked a national sample several different questions.  Unfortunately, they did not release the verbatim language.  The following comes from language in their online release: 

"Forty-five percent (45%) of Americans believe that every Presidential nominee should receive an up or down vote on the floor of the Senate. That's down from 50% a month ago."

"When asked if Senate rules should be changed to give every nominee a vote, 56% say yes and 26% say no. A month ago, those numbers were 59% and 22% respectively"

The Republican polling firm Strategic Vision asked the following questions this past week of a sample of registered voters in Florida:

Do you approve or disapprove of a Republican plan in the United States Senate to limit Democratic filibustering of judicial nominations and allow a vote on the nominations?  Florida registered voters: Approve 44%, Disapprove 33%, Undecided 23%.

Do you approve or disapprove of Democratic filibusters of President Bush's judicial nominations in the United States Senate? Florida registered voters: Approve 28%, Disapprove 57%, Undecided 15%

One thing largely missing in the questions asked by public pollsters is a better sense of how informed and engaged Americans are on this issue.   So far, only Rasmussen has asked, "how closely have you been following the issue?"  Unless I've missed it, no one has asked for a rating of the importance of the issue as compared to issues like health care, Social Security, terrorism, Iraq, etc. 

In the same vein, MP wishes that a public poll would ask Americans an open-ended question about this issue.  It would first ask, "have you heard anything about a controversy involving President Bush's judicial nominations?"  Those who answer yes would then get an open-ended follow-up: "What specifically have you heard?"  The answers would help show how many have pre-existing opinions, whether those reflect worry about conservative nominees or about the trouble President Bush has had getting his nominees confirmed. 

MP does not agree that the question asked by the ABC/Washington Post poll is inherently biased: "Would you support or oppose changing Senate rules to make it easier for the Republicans to confirm Bush's judicial nominees?"  There is much I like about this question:  It is clear, concise and easy to understand and interpret because it avoids the use of often unfamiliar terms like "filibuster." 

The problem is -- and here the conservative critics have a point -- it is just one question and it does reflect one particular framing of the issue.   As Ramesh Ponnuru points out, there is another question they could have asked that is equally concise and clear:  "Would you support or oppose changing Senate rules so that judges can be confirmed by majority vote?"  We might take Ponnuru's suggestion a step further and ask about rules "that make it easy for a minority of Senators to block a nomination even when a majority of the Senate supports it." 

Different questions may produce greater support for the Republican position, as the various results presented above imply.  Understanding public opinion with respect to judicial nominees is not about deciding which question is best, or whether any one question alone is biased.  It is about measuring all attitudes, even the ones that conflict, and coming to a greater understanding of what it all means.  The answers may be contradictory, but sometimes, so is public opinion.

[minor typos corrected]

Posted by Mark Blumenthal on April 27, 2005 at 01:54 PM in Polls in the News, Weighting by Party | Permalink | Comments (20)

April 25, 2005

Disclosing Party ID: LA Times

Today I have another response from a public pollster regarding the disclosure of party identification.  I have been asking public pollsters that do not typically disclose results for party identification to explain their policy.  Today we hear from Susan Pinkus, director of the Los Angeles Times poll: 

My predecessors usually did not release this information in the press releases unless it was requested and I just followed the precedent.  However, at times, the Times Poll has published party ID figures in poll stories if it was part of the overall analysis.   If someone requests it, of course, we are more than happy to give it to them.  Having said that, the party ID results were so much in the headlines last year, and so contested by the campaigns (depending if they liked the results of that particular poll or not) that I will probably start putting those figures in the poll's press release -- when it is called for (i.e., during election years or is relevant to the survey's analysis).  Party ID is asked in national polls because of the obvious -- not every state is registered by party.  In California, however, voters have to register by party or declined-to-state. In state and local Ca. races, we usually only ask registered voter question and not party ID.

Thank you, LA Times.

To confirm the willingness of the Times to release party numbers on request:  Those who follow this topic closely will recall that in June 2004 an LA Times poll that showed John Kerry leading George Bush came under attack by Matthew Dowd of the Bush campaign.  He told ABC's The Note that the poll was "a mess" because it "is too Democratic by 10 to 12 points" (proving that complaints about party identification do not always come from Democrats).  Pinkus responded with a statement and a release of party identification for all polls going back to September 2001. 

There is one more public pollster that I asked for a statement that has not yet responded.  I'll post that if and when I receive it.

Party Disclosure Archive (on the jump)

Posted by Mark Blumenthal on April 25, 2005 at 06:10 PM in Weighting by Party | Permalink | Comments (0)