October 31, 2004
The MP Reader Survey
As I mentioned in the last post, I created a short online survey about this site that you can fill out by clicking this link. If you have read this blog, I would appreciate it if you would take 3-5 minutes and complete the survey. It will not track respondent identities, and as such, your participation is completely anonymous and confidential.
Why a reader survey now?
After the election, I need to do some thinking about the future of this blog. I am a pollster, after all, and it is hard for me to think about anything without survey data.
Since some of you have already asked via email, however, let me be clear: I am certainly not planning to use responses to this survey to try to predict how many of you will return in the future. I have no illusions: Most of you are obsessed with polls right now because of the election. The traffic to this site will fall off big time right after the election, and I will have all sorts of hard numbers on that.
However, I am very curious about the snapshot of who is reading this blog right now, why, how you got here and what you think about it. I also want to use the survey to take a closer look at those who think they might want to return every now and then, even at times when an election is months or years away.
One last point for those of you who miss no irony: This survey is certainly NOT based on a random sample. It is what some call a "convenience" sample: a survey that will reflect nothing more than the views of those who choose to complete it. I expect that the respondents will be hugely biased toward regular readers. In this case, however, regular readers are the universe I care about most.
So, again, if you have found this site useful, please take a moment and complete the survey. If you have problems with the survey, please email me. Feel free to post comments or email them (but I will not be able to answer much mail until after the election).
The last big post on likely voter models coming soon...
And Now for Something Completely Different...
You may have wondered why I disappeared for the last 24 hours or so. Well, another bit of personal business intruded. So, without further ado...
Mother and baby are doing well and resting comfortably. His 22 month old sister is a little perplexed as to why we keep calling him Sam and not "Deena's baby brother."
As for Daddy, well....Let's just say that when we learned the due date would be right around the elections, I was pleased. Since campaign pollsters like me are usually finished with our campaign work by now (public polls keep going, internal polls are mostly done), I thought, "great, perfect timing!"
Today...ah...not so much.
P.S. I do have an important yet still unfinished post on all the likely voter models other than Gallup & CBS in the works. Also, the conclusion of this presidential campaign is looking to be every bit as close and interesting as the last time around. There are a few topics I have been dying to get to. So I'll be surfing and posting for the next 72 hours. Don't worry. My son and family will be nearby and not ignored. After Tuesday they will get 100% of my attention for awhile.
P.P.S: You can do Sam's father a big favor by taking a few minutes to complete this brief (3-5 minute) survey he created about Mystery Pollster. Respondent identities will not be tracked and answers are completely confidential. If you are a first time visitor to the site, please take a moment and look around first. If you have any problems with the survey please email me. Thank you!!
October 30, 2004
Are They Breaking Yet?
Looking for validation of the incumbent rule, more than a few readers (including Mickey Kaus and Noam Scheiber) have asked when we can expect to see undecideds start "breaking" toward Kerry. My answer all along has been that we typically see the phenomenon between the last survey and when the ballots are counted.
The theory behind the rule is that those who tell pollsters they are undecided are conflicted: ready to fire the incumbent but still possessing strong doubts about the challenger. In the end, their feelings about the incumbent typically win out, because the incumbent's performance in office is much more central to their ultimate decision. So my hunch (not informed by empirical data) is that the break either occurs at the very last moment or is simply something a voter would rather not admit to a stranger on the phone.
As such, I think Kaus is on to something when he wonders about an "embarrassment" factor that might limit Kerry on telephone surveys but not on automated, recorded interviews like those done by SurveyUSA and Rasmussen. I think I see evidence of this in the polls by SurveyUSA in Florida, Ohio, Pennsylvania and Michigan. In each of those states, SUSA has Bush matching the RealClearPolitics average but has Kerry running a few points higher. Their surveys always show a lower undecided than most other surveys, and Jay Leve, SurveyUSA's director, has always speculated it is because their recorded interview better simulates the solitary experience of the voting booth. At the same time, I see an opposite pattern in Iowa, Missouri and Colorado - so perhaps I'm just data mining. I want to watch this closely over the weekend.
Nonetheless, a hedge: The best empirical evidence for the incumbent rule lies in the surveys gathered by Nick Panagakis, Chris Bowers and Guy Molyneux. I do not have access to their spreadsheets, but I am assuming that most of the surveys they reviewed were fielded during the last calendar week before the election rather than over the final weekend. Also, campaign pollsters like me believe in the incumbent rule because of our own experience with internal surveys that we almost always complete before the final weekend. So it is possible we may see some signs of a break over the weekend. Then again, I wouldn't be surprised if we didn't.
When we all started talking about the incumbent rule three weeks ago, there were two key counterarguments. One was that an examination of older Gallup polls showed a number of elections in which incumbents gained during September or October. As Kaus noted, Pat Caddell has made a similar argument. Even if you do not see my point about the incumbent rule working at the end of the campaign, I think we can put that argument to rest. John Kerry gained significant support after the first debate and, once you factor in sampling error, the overall preferences have barely budged since.
Consider an update to my poll of polls approach to the four nightly tracking surveys (by ABC, Zogby, TIPP and Rasmussen). Individually, they have shown small, insignificant movement. But average all four and they look remarkably flat. If anything, Kerry may have gained a point in the last few days.
If that finding does not persuade (the tracking polls are all weighted by party, after all), consider the six organizations that polled both last week and this week. Average all six and the results for each week look nearly identical: Bush led by an average of four points last week (49% to 45%) and an average of three points this week (49% to 46%). Bush has not gained. Once again, if anything, Kerry picked up a point. If the averages seem inappropriate given the usual slight differences between organizations (sample sizes, dates, question language, etc.), consider this: Three surveys showed Bush doing slightly better this week, three showed him doing slightly worse. That's exactly what you'd expect if you flipped a coin six times (Note: Democracy Corps actually conducted six standalone surveys over this period. For the table, I simply calculated separate averages for the first three and the second three).
The second argument is that 9/11 and the Iraq War renders the incumbent rule moot. The theory is that conflicted voters will opt to stick with the incumbent rather than "changing horses midstream." Those who make this argument typically point to a number of races in 2002 where undecided voters appeared to break for incumbents. I remain skeptical -- this election looks nothing like 2002 to me -- but we will not know for sure until Tuesday night. Given that Osama bin Laden has reared his evil head once again, 9/11 will certainly be on a lot of minds over the weekend. If nothing else, the counter-argument will get a fair test.
One last thought, as we ponder the final 72 hours of the campaign. Four years ago, pollsters like me looked at the polls released the Friday before Election Day and concluded that the race was over. George Bush looked to be on his way to a comfortable victory. As the table below shows, the polls that day had Bush ahead of Al Gore by an average of five points (47% to 42%). Of course, the Bush DUI story broke that same day. By Monday, seven of the eight surveys that continued to track over the weekend showed Gore closing the margin to an average of one point (Bush led 46% to 45%).
I include this data not because I expect a repeat of Gore's late surge but to remind everyone that, as Yogi Berra says, "it ain't over till it's over."
[Appropriate table inserted for 2000 polls - 10/31]
A Special Request: Have you found this site useful? Please help me keep it alive by taking a few minutes to complete this brief (3-5 minute) survey about Mystery Pollster. Respondent identities will not be tracked and, as such, your participation will remain completely anonymous and confidential. If you have any problems with the survey please email me. Click here if you're asking, "why a reader survey now?" Thank you!!
October 29, 2004
Likely Voters VII: CBS/NYT
Virtually all of the national surveys use some form of cut-off procedure to define likely voters. Respondents are either classified as likely or unlikely voters. There is one notable exception: The CBS/New York Times poll, whose likely voter model involves weighting respondents by their probability of voting.
Warren Mitofsky, then CBS polling director (now director of the network exit polls), developed the CBS/NYT model using validation studies conducted by the University of Michigan's National Election Studies (NES). The NES regularly checked registration records to see if respondents had actually voted. Mitofsky used questions identical to those on the NES to ask about registration, intent to vote, history of past voting, and when respondents moved to their current address. All of these questions had been shown to correlate with actual turnout. Mitofsky used the survey results to classify voters into several groups, ranging from low to high turnout, and to weight each group by the probability of voting derived from the NES studies (CBS has posted a more detailed description of the current procedure here).
While CBS does not release its actual probability data, a 1984 article in Public Opinion Quarterly by political scientists Michael Traugott and Clyde Tucker (then an assistant survey director at CBS) included similar data that helps show how the model works. The table below shows data from the 1980 NES. Those who said they were not registered to vote are in the first row, followed by four groups of voters ranked on their reports of past voting and interest in the campaign:
The middle column shows the percentage of each group of respondents that actually voted in the 1980 election for President. The CBS procedure is to weight non-registrants to a probability of zero, then weight each group of registered voters by its probability of voting (the percentage that actually voted). Thus, using the Traugott/Tucker data as a hypothetical example, they would multiply each respondent in the "high" turnout group by a weight of 0.746, each respondent in the "medium" group by 0.619, and so on. Since 2000, CBS has also given a weight of 1.000 to any respondent who says in the survey that they have already voted absentee.
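For the code-inclined, here is a minimal sketch of how that probability weighting plays out in practice. Only the 0.746 and 0.619 figures come from the Traugott/Tucker table; the group labels, the 0.35 value for the "low" group, and the tiny sample below are my own inventions for illustration.

```python
# Hypothetical sketch of probability weighting, loosely following the
# Traugott/Tucker 1980 NES illustration (not CBS's actual probabilities).

# Turnout probability assigned to each respondent group
TURNOUT_WEIGHTS = {
    "not_registered": 0.0,   # non-registrants are weighted to zero
    "low": 0.35,             # illustrative value, not from the article
    "medium": 0.619,         # from the Traugott/Tucker table
    "high": 0.746,           # from the Traugott/Tucker table
    "already_voted": 1.0,    # absentee voters counted with certainty
}

def weighted_shares(respondents):
    """Compute candidate shares with each respondent counted at
    his or her group's probability of voting."""
    totals = {}
    total_weight = 0.0
    for group, candidate in respondents:
        w = TURNOUT_WEIGHTS[group]
        totals[candidate] = totals.get(candidate, 0.0) + w
        total_weight += w
    return {c: t / total_weight for c, t in totals.items()}

# A toy sample of (turnout group, stated preference) pairs
sample = [
    ("high", "Bush"), ("high", "Kerry"), ("medium", "Kerry"),
    ("low", "Kerry"), ("not_registered", "Kerry"), ("already_voted", "Bush"),
]
shares = weighted_shares(sample)
```

Note how the unweighted sample splits 4-2 for Kerry, but once each respondent is discounted by turnout probability the race tightens considerably: that is the entire point of the method.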
The main advantage of the CBS model is that it uses the entire sample of registered voters, unlike the cut-off models that throw out respondents not classified as likely voters.
The main disadvantage is that it relies on data from validation studies. The model is only as good as the probabilities it applies and cannot be applied in statewide races where such data is unavailable. Also, the National Election Studies stopped conducting validation studies in the late 1980s. As a result, CBS did their own validation study following the 2000 elections. They called back respondents to pre-election surveys in November 2000 "to ask whether they actually voted to refine what by then were outdated probabilities." Since the 2000 study depended on self-reports of voting behavior, CBS adjusted the probabilities to account for the usual over-reporting.
Does the CBS model forecast elections more accurately than the Gallup model? Perhaps not; the two have performed similarly in presidential elections. According to the error rates calculated by the National Council on Public Polls, the final presidential election polls conducted by CBS since 1976 have produced slightly more error than Gallup's on the margin between candidates (4.1% vs. 3.7%) and on the leading candidate's share (2.1% vs. 1.9%; error calculations explained here). The differences appear to be random: CBS did slightly better three times (especially 1992), Gallup did better four times (especially 1996). It is also worth noting that CBS had the second lowest rate of error of all polls in the 2000 elections, showing Al Gore one point ahead of George Bush (45% to 44%) on their last poll.
One big apparent advantage of the CBS method is that it has shown more stable results among likely voters over the course of the fall campaign without weighting by party. In October of 2000, while the Gallup poll swung wildly, showing Bush variously leading by as much as 13% and trailing by as much as 11%, four of the five CBS surveys showed the race even or Bush leading by a single percentage point. The widest lead CBS gave George Bush was 4% (46% to 42%) on their next-to-last survey. This year CBS has been similarly consistent: The three surveys of likely voters it conducted in October have shown Bush ahead among likely voters by one to three percentage points.
I have always been a fan of the CBS model, if only because it provides such an elegant solution to the likely voter problem. When it comes to turnout, the best a poll can do is tell us the relative probabilities of different kinds of voters. Some kinds of people tend to vote more often, some less. If we must try to forecast the probable electorate, why throw out respondents if we don't have to? If standard "cut-off" models produce more volatile results, the answer seems even more obvious.
Note: While I have certainly not attempted a thorough search, the only application of probability weighting at the state level I am aware of is the Minnesota Poll conducted by Rob Daves at the Star Tribune in Minneapolis.
October 28, 2004
The hardest part about blogging the last few weeks is that I get less time to simply read and surf than I used to, so you may already know about the following.
First, I discovered that Philip Meyer, a Knight Professor of Journalism at the University of North Carolina at Chapel Hill, wrote a column in USAToday on Erikson's POQ article and the Gallup likely voter model. Money quote:
A likely-voter poll is the right thing to do if all you want is to predict the outcome of the election - but that's a nonsensical goal weeks before the event. Campaigns change things. That's why we have them.
It would be far more useful to democracy if polls were used to see how the candidates' messages were playing to a constant group, such as registered voters or the voting-age population. Whoever is elected will, after all, represent all of us.
Read it all.
Second, another must-read for the polling obsessed that those outside the Washington Beltway may have missed: Washington Post Polling Director Richard Morin's lengthy exposition on how tough it is to be a pollster these days. It covers all the topics we've been chewing over and some we haven't gotten to yet: cell phones, response rates, Internet polling, etc. If you enjoy this blog, you'll find much of interest.
First, a blanket apology to all of you who have emailed over the last week or so. You have sent a ton of great questions. I wish I had time to answer them all, but with the election rapidly approaching, I'm starting to realize I won't get to them all. I am hoping to get through a bunch over the weekend.
Second, you may have noticed a few small changes. I cleaned up the way the Frequently Asked Questions list works. It should be easier to find things I’ve written over the last month. I have received many emails about topics I've already covered. Also, I added some blog roll lists. I still have a recommended book list and list of methodology pages in the works. Some of you have sent links to your own blogs. If I've forgotten a blog, gently remind me again. Apologies for forgetfulness.
Finally, the problems some of you have experienced over the last few days with unusually large fonts and Mac browser hangs should be cleared up. The problem was my habit of pasting html tables saved with a program called "Excel" (perhaps you’ve heard of it?) that included all sorts of extraneous stuff that makes the blog software very unhappy. Very bad idea. The tables are now jpeg files, which are also easier to read. At some point, I’ll learn to do non-graphical tables…but not this week.
Likely Voters VI: Still More on Gallup
Of the two complaints about the Gallup likely voter (LV) model I covered in the last post, the first (that the selection procedure does not perfectly predict individual turnout) applies equally to all LV models. The second (that it is too volatile) appears mostly directed at Gallup, but many of the similarly structured surveys (Pew, Newsweek, Time) have been more volatile than surveys that either weight by party (Zogby, TIPP, Rasmussen, ABC/Washington Post tracking, sometimes WSJ/NBC) or stratify by turnout (FOX/Opinion Dynamics, Democracy Corps).
A third complaint is aimed mostly at Gallup: A chorus of voices on the Kerry wing of the blogosphere and elsewhere, especially Ruy Teixeira (DonkeyRising), Steve Soto (TheLeftCoaster) and Chris Bowers (MyDD), has argued that Gallup's surveys consistently show Bush doing better than other polls do.
First, are they right? I put the numbers posted on RealClearPolitics into my spreadsheet and found:
- This week, Gallup has Bush ahead by five points (51% to 46%), the other surveys have Bush ahead by an average of two points (48% to 46%).
- This week, Gallup released surveys in six states. The averages across all six states are similar to the national result: Gallup has Bush ahead by an average of 4 points (50% to 46%), others have Bush up by an average of one point (48% to 47%). Gallup gives Bush a bigger lead in 4 of 6 states.
- Since the Democratic convention, Gallup has released 10 national polls. Although the differences were sometimes small, 7 of 10 showed a bigger margin for Bush than the average of other likely voter surveys conducted over the same period. If you average the averages: Gallup had Bush ahead by an average of five points (50% to 45%), while others had Bush ahead by two points (47% to 45%).
- If the pattern seems weak, consider this: In 11 of 16 cases cited above, Bush did better on Gallup surveys, something that should have been a 50/50 proposition each time. The probability of flipping a coin 16 times and having it come up heads 11 times is 6.7%.
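For those who want to check my math, that 6.7% figure is just the binomial probability of getting exactly 11 heads in 16 fair coin flips:

```python
from math import comb  # binomial coefficient (Python 3.8+)

def prob_exactly(heads, flips):
    """Probability of exactly `heads` heads in `flips` fair coin flips:
    C(flips, heads) out of 2**flips equally likely outcomes."""
    return comb(flips, heads) / 2 ** flips

p = prob_exactly(11, 16)  # about 0.067, i.e. 6.7%
```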
I want to be clear. I do not believe that anyone at Gallup, CNN or USAToday has intentionally skewed its numbers. But they have been showing Bush doing a bit better than other surveys. This raises two questions: (1) Why are they different and (2) who is right?
Steve Soto has been arguing for months that Gallup's samples are biased toward Republican identifiers. He obtained and posted party identification results from Gallup for nearly every survey it releases, and has argued that Gallup's party mix seems implausible in comparison to party registration statistics or past exit poll results. Ruy Teixeira and Alan Abramowitz have made similar arguments on Donkey Rising. Soto and Teixeira have also recently written about the minority representation in Gallup's most recent sample: 8% of likely voters were black, compared with 10% of voters on the 2000 exit polls; 15% of Gallup's likely voter sample was non-white compared to 19% of voters on the 2000 exit polls. They show similar differences for income, as well.
I do not want to revisit the debate about weighting by party identification. I have misgivings (you can read all about those starting here). However, in this case I agree with Teixeira that the demographic differences are less the disease than a symptom. They are telling us to look carefully at the effect of the likely voter model.
Let me suggest another discrepancy. I called Gallup seeking answers to a few questions I had about how first time voters could possibly qualify as likely voters. Since they say a likely voter must score at least a 6 out of 7 on their index, and since three of the items require some past voting, it would be impossible for a first time voter to qualify. I had asked this question via email about a month ago, and was told that Gallup gives 18-21 year olds an extra point. But the extra point would still leave a first time voter with a maximum of five points.
The answer, I learned from Gallup's Jeff Jones yesterday, is that younger voters get more than one bonus point. Apparently, Gallup gives 18 and 19 year olds up to three extra points, depending on how they score on the other likelihood questions. They give 20 and 21 year olds up to two points (since they could have voted in 2002 and answered that they "always" or "almost always" voted in prior elections).
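To make the scoring concrete, here is one plausible sketch of how such an index might be computed. To be clear, this is my own guess at the mechanics, not Gallup's actual formula: the bonus logic, which Jones says depends on answers to the other likelihood questions, is simplified here to a flat age-based bonus capped at the 7-point maximum.

```python
def likely_voter_score(favorable_answers, age):
    """Hypothetical sketch of a Gallup-style 7-point likelihood index.
    `favorable_answers` is the number of the seven likelihood questions
    (0-7) answered favorably. Young voters, who cannot earn the points
    tied to past voting, get bonus points; the total is capped at 7."""
    bonus = 0
    if 18 <= age <= 19:
        bonus = 3  # could not have voted in 2000 or 2002
    elif 20 <= age <= 21:
        bonus = 2  # could have voted in 2002 but not 2000
    return min(favorable_answers + bonus, 7)
```

Under this sketch, an 18-year-old answering four questions favorably reaches the 6-of-7 threshold, while a 30-year-old with the same four answers does not.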
Still with me?
Here's the bottom line. On a self-reported question, 6% of those who qualified as likely voters said they will cast their first presidential vote in the 2004 election. Among all registered voters, 12% say they are first-time voters. Though a very small subgroup (roughly 72 weighted interviews), first-time likely voters support John Kerry by a 56% to 42% margin, while past voters prefer Bush 52% to 45%.
The difference in the Gallup survey looks to me to be right on the edge of statistical significance. However, it is consistent with preference for Kerry among first time voters on two other recent surveys: 57% to 36% for Newsweek and 54% to 43% for ABC. Also on the 2000 exit polls, Al Gore won first time voters by a 52% to 43% margin. The real distinction for Gallup was the percentage of likely voters that qualified.
Again, self-reported first-time voters were:
- 6% of likely voters on Gallup's most recent survey
- 9% of likely voters on Newsweek's recent survey
- 10% of likely voters on the recent ABC survey
- 9% of voters in 2000, according to the national exit poll
Jones points out that even if they had doubled the number of first time voters in the sample, it would have cut Bush's overall margin by only a single percentage point. True. But the lower number raises a larger question. Is the Gallup model simply screening out too many voters who do not typically vote in presidential elections? All mechanical issues aside, the demographic differences and the higher than average support for Bush (given the consistent finding that non-2000 voters tend to prefer Kerry) suggest that Gallup is effectively modeling a lower turnout than the other surveys.
Every measure of intent to vote and interest in the election is significantly higher this year on every survey I have seen. According to Jones, 84% of adults now say their probability of voting rates a 10 on Gallup's 1-10 scale, 16 percentage points higher than this time four years ago (68%). Not all of those adults will vote, to be sure, but the finding certainly suggests a higher than usual turnout. Shouldn't categories like first-time voters and self-described non-voters from 2000 be a bit higher than usual rather than lower?
Perhaps Gallup agrees. According to Jones, they will raise their cutoff for likely voters on their last survey coming up this weekend from 55% of adults to 60%. Jones and others with access to the Gallup data tell me such a change would not have altered the results much on the last few Gallup surveys. Perhaps. But if that's the case, why change now?
Likely Voters V: More Gallup
In the last post I covered how the Gallup likely voter model works. In this post, I want to review criticisms of the model.
An Imperfect Predictor? One complaint about the Gallup model and its progeny is that they do not perfectly predict likely turnout. Some real voters get classified as "unlikely" - some non-voters are deemed "likely." The creators of the original Gallup model did not promise that their model could make a 100% accurate classification, only that selecting a subgroup of likely voters sized to match the likely turnout level provides the most accurate read on the outcome. Keep in mind that although the mechanics of the model have been in use for more than 40 years, the methodologists who apply it review how well the model worked after every election. Since they typically ask vote questions of all registered voters, they can look back after each election and check whether alternative models would have predicted the outcome more accurately. If Gallup continues to stick with the model, it is because they believe it continues to work as well as or better than any alternative.
As described in the last post, the original Gallup models were based on validation studies that obtained the actual vote history for respondents. This process was relatively easy when pollsters interviewed respondents in person, as their names and addresses were easily obtained. Conducting a validation study on a random digit dial (RDD) telephone survey requires that the respondent provide their name and address to the pollster, and voting records are dispersed across thousands of clerks' offices and databases around the country. So such studies are now rare and difficult.
In 1999, the Pew Research Center conducted such a validation study of the Gallup likely voter model, using polls taken during the Philadelphia mayor's race. They were able to obtain actual voting records for 70% of their respondents. The Pew report is worth reading in full, but here are the two key findings. On the one hand, the likely voter model was far from perfect in predicting individual-level voting:
Using [the Gallup likely voter] index, the Center correctly predicted the voting behavior of 73% of registered voters…. The 73% accuracy rate means that 27% of respondents were wrongly classified -- those who were determined as unlikely to vote but cast ballots (17%), or non-voters who were misclassified as likely to vote (10%).
On the other hand, the Pew report showed that the results of the likely voter sample came closer to predicting the very close outcome, as well as the preferences of those in the sample who actually voted, than the broader sample of registered voters.
A longer academic paper based on the study summed up the conventional wisdom accepted by most public pollsters: "Though it is impossible to accurately predict the behavior of all survey respondents, it is possible to accurately estimate the preferences of voters by identifying those most likely to vote.”
One important limitation: The Pew study involved a low-turnout, off-year mayoral election, where the difference between the size of the electorate and the pool of registered voters was large. In a high-turnout presidential election in which 80% or more of registered voters cast ballots, there is typically less difference between registered and likely voters.
Too Much Volatility? The most common complaint directed at Gallup’s likely voter model is that it seems to yield more volatile results than other polls. In 2000, Gallup’s daily tracking surveys showed dramatic swings. On October 2, for example, they reported a dead heat between George Bush and Al Gore among likely voters (45% to 45%). Four days later following the first debate, they had Gore suddenly ahead by 11 points (51% to 40%). Four days after that, Bush was ahead by eight (50% to 42% -- see the chart prepared by Gallup). Other polls taken over the same period showed nowhere near as much change, and Gallup’s own registered voter samples were more stable.
While Gallup dropped the daily tracking program this year, they have continued to show more volatility than other surveys. For example, they had Bush ahead by fourteen points (54% to 40%) in mid-September, had Kerry ahead by a single point after the debates (49% to 48%) and now have Bush leading again by six (51% to 46%).
An article in the current issue of Public Opinion Quarterly presents evidence that the volatility resulted mostly from changes in the composition of the Gallup likely electorate. In other words, the volatility resulted less from changing opinions than from changes in the people that Gallup defined as likely voters. Authors Robert Erikson, Costas Panagopoulos and Christopher Wlezien analyzed the raw Gallup data from 2000 available in the Roper Center Archives. They compared likely voters to non-likely voters and found that the trend lines moved in opposite directions over the course of the campaign. They concluded that "most of the change (certainly not all) recorded in the 2000 CNN/USA Today poll is an artifact of classification," and that the shifts resulted from:
The frequent short-term changes in relative partisan excitement…At one time, Democratic voters may be excited and therefore appear more likely to vote than usual. The next period the Republicans may appear more excited and eager to vote. As Gallup’s likely voter screen absorbs these signals of partisan energy, the party with the surging interest gains in the likely voter vote. As compensation, the party with sagging interest must decline in the likely voter totals. [The full text is available here]
Although Gallup has not formally responded to the Erikson, et. al. study, the methodologists at Gallup do not quarrel with the basic finding. Gallup’s Jeff Jones recently told the New York Times:
"We're basically trying to get a read on the electorate as of the day that we're polling," said Jeffrey Jones, managing editor of the Gallup Poll, "not necessarily trying to predict what's going to happen on Election Day itself."

Jones frames the key question perfectly. Most pollsters agree that a pre-election survey is no more than a snapshot of opinions at the moment, but what about the people in the sample? As Erikson put it in an email to me last week, do we want surveys to identify those who are likely to vote on Election Day or those who are likely to vote "if the election were held today?"
Gallup’s answer is to let the composition vary. My view, and the view of most of my colleagues who poll for political candidates, is that we need to impose controls to keep the composition of the likely voters as constant as possible. However, those controls require making subjective decisions about what the likely electorate will look like on Election Day. Some weight by party (like Zogby and others). Others stratify their sample regionally to match past vote returns (like Greenberg/Democracy Corps and Fox/Opinion Dynamics) – an approach I prefer. However, supporters of the Gallup model argue that both alternatives pose a greater risk of imposing past assumptions on an unknown future.
I think those compromises are worthy, but I am a producer. You are consumers. How would you answer Erikson's question? If you have a strong feeling, enter a comment below.

I'll take up one last complaint about the Gallup model in the next post.
[Misspelling of Erikson corrected]
October 27, 2004
Likely Voters IV - The Gallup Model
So how do pollsters select likely voters?
The best place to start is the Gallup likely voter model, the granddaddy of them all. Gallup is also worthy of special scrutiny for other reasons: It is easily the best-known brand name in survey research. Its campaign polls, conducted in partnership with CNN and USA Today, receive more attention and arguably have greater influence than other polls over campaign coverage. Finally, Gallup's methodology has also been the object of far more criticism this year than any of the others.
Before reviewing the Gallup model and its shortcomings, I want to strongly emphasize one point: We are able to nitpick their model largely because Gallup has been extraordinarily open about their internal procedures, more so than other pollsters. They have patiently answered questions from the most critical of outsiders. They routinely turn their raw data over to the Roper Center after each election, where academics can scrutinize their methods and search for flaws. That Gallup has been punished, in effect, for its openness has not been lost on competitors who remain considerably less forthcoming. So while it is appropriate to question Gallup's model, we ought to give them credit for their transparency. By opening themselves up to criticism this way, they are advancing the art and science of survey research.
Gallup has been open about its methods from the start. In 1960, Paul Perry, Gallup's president and research director, published an article in Public Opinion Quarterly detailing their election poll methodology ("Election Survey Procedures of the Gallup Poll," vol. 24, pp. 531-542). Then as now, respondents tended to over-report their true voting intentions, so selecting likely voters was not a matter of simply asking, "will you vote?" To identify the true "proportion of the population old enough to vote who will vote," Perry used internal validation studies that compared respondents' answers to their actual vote history. During the 1950s, Gallup sent its interviewers to vote registrar offices after each election to check whether their respondents had actually voted.
While no single question perfectly predicted whether a respondent would vote, Perry combined a series of questions "related to voting participation" into a 1-7 point scale that was highly predictive of actual turnout: "The system is such that the greater their likelihood of voting, the higher their score. Respondents are then ranked on the basis of their scores." Perry first set aside those who said they were not registered because their studies had shown that only "a negligible percentage of them vote, something on the order of between 1 and 5 percent." Then he used the index to select a subgroup of the highest scoring respondents whose size matched the proportion of adults that typically voted in each election. In presidential and congressional elections from 1950 to 1958, the model reduced the average "deviation" from reality on Gallup's polls from 2.8 percentage points among registered voters to 1.1 among likely voters.
Although Gallup has made minor modifications, the questions and procedures that Perry described 44 years ago remain in use by the Gallup Poll today. Among those who say they are registered to vote (or who plan to do so before the election), Gallup uses the following questions to create a scale that varies from 0 to 7:
1) How much have you thought about the upcoming elections for president, quite a lot or only a little? (Quite a lot = 1 point)
2) Do you happen to know where people who live in your neighborhood go to vote? (Yes = 1 point)
3) Have you ever voted in your precinct or election district? (Yes = 1 point)
4) How often would you say you vote: always, nearly always, part of the time or seldom? (Always or nearly always = 1 point)
5) Do you plan to vote in the presidential election this November? (Yes = 1 point)
6) In the last presidential election, did you vote for Al Gore or George Bush, or did things come up to keep you from voting? (Voted = 1 point)
7) If "1" represents someone who will definitely not vote and "10" represents someone who definitely will vote, where on this scale would you place yourself? (Currently 7-10 = 1, according to this "quiz" on USA Today)
A few additional notes: They automatically exclude from the likely voter pool anyone who says they do not plan to vote (on #5). They also give anyone 18-24 an extra point, to help make up for having said they did not vote in the last election (perceptive readers will immediately sense a problem here -- I'll take that up in the next post).
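For the programmers in the audience, the scoring rules above can be sketched in a few lines of Python. The question wording and point values come from Gallup as described above; the function names, the respondent data structure, and its field names are my own invention:

```python
# A sketch of the Gallup 7-point likely voter index described above.
# Field names and function names are hypothetical; the scoring rules
# follow the post.

def gallup_score(r):
    """Score one respondent (a dict) on the 0-7 likelihood scale."""
    score = 0
    if r["thought_about_election"] == "quite a lot":
        score += 1
    if r["knows_polling_place"]:
        score += 1
    if r["voted_in_precinct_before"]:
        score += 1
    if r["vote_frequency"] in ("always", "nearly always"):
        score += 1
    if r["plans_to_vote"]:
        score += 1
    if r["voted_in_2000"]:
        score += 1
    if r["self_rated_likelihood"] >= 7:  # currently 7-10 on the 1-10 scale
        score += 1
    # Extra point for 18-24 year olds, who may not have been old enough
    # to vote in the last election (capped at 7)
    if 18 <= r["age"] <= 24:
        score = min(score + 1, 7)
    return score

def is_likely_voter(r):
    # Anyone who says they will not vote is excluded outright
    if not r["plans_to_vote"]:
        return False
    return gallup_score(r) >= 6  # this year, some mix of sixes and sevens
```

This is only an illustration of the index arithmetic, not Gallup's actual code, and it ignores the initial screen for registration described above.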
According to Gallup's David Moore, they aim this year to select a pool of likely voters equal to 55% of their adult sample - their estimate of the appropriate "turnout ratio" likely in this election. In practice, the percentage that scores a perfect 7 out of 7 typically comes very close to 55%. If it ever goes over, they will tighten the scoring of the last question about likelihood to vote (giving a point to those who answer 8-10, for example, instead of 7-10), so that likely voters will always be some combination of sixes and sevens this year.
The one hitch is that they usually have more than enough sixes to bring the total size of the likely voter pool to 55%. So Gallup weights down the sixes to make the weighted value of the likely voters equal 55%. An example (with totally hypothetical numbers I made up) makes this easier to follow: Suppose those scoring 7 out of 7 are 50% of the sample, and the sixes are 10%. Gallup would then weight down the value of the sixes by half (multiply by 0.5): 50% + (10% * 0.5) = 55%.
What if the sevens are 50% and the sixes are 15%? They would weight the sixes by 0.33: 50% + (15%*0.33)=55%. Make any sense?
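To make the arithmetic concrete, here is the down-weighting rule as a tiny Python sketch. The 55% target is Gallup's figure from above; the helper name is mine:

```python
# Down-weighting the "sixes" so the weighted likely voter pool hits the
# target turnout ratio, as in the two hypothetical examples above.

def six_weight(pct_sevens, pct_sixes, target=0.55):
    """Weight applied to respondents scoring 6 so that
    pct_sevens + pct_sixes * weight == target."""
    if pct_sevens >= target:
        return 0.0  # the sevens alone already fill the pool
    return (target - pct_sevens) / pct_sixes

# The two hypothetical cases from the post:
w1 = six_weight(0.50, 0.10)  # 0.5:  50% + (10% * 0.5)  = 55%
w2 = six_weight(0.50, 0.15)  # 1/3:  50% + (15% * 0.33) = 55%
```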
Important concept: Gallup does not claim that this model perfectly predicts who will vote, only that the pool of likely voters consists of those most likely to vote. They also classify each respondent as either likely or not likely. In these two respects, their model is consistent with virtually every other pollster's. From there, however, the way pollsters pick likely voters diverges in a big way.
In the next post, the shortcomings and critiques of the Gallup model...
October 26, 2004
About Those Tracking Surveys
So admit it. At least once a day, possibly more often, you've been checking in on the various rolling-average tracking surveys. Most of us are.
If you have been looking at more than one, the results over the last few days may have seemed a bit confusing. Consider:
- As of last night, the ABC News/Washington Post poll agreed with itself and reported John Kerry moving a point ahead of George Bush (49% to 48%), a three-point increase for Kerry since Friday.
- Meanwhile The Reuters/Zogby poll had President Bush "expanding his lead to three points nationwide," growing his support from one point to three points (48% to 45%) in four days.
- But wait! The Rasmussen poll reported yesterday that John Kerry now leads by two points (48% to 46%), "the first time Senator Kerry has held the lead since August 23."
- But there's more! The TIPP poll had Bush moving ahead of Kerry by eight points (50% to 42%) after having led by only a single point (47% to 46%) four days earlier.
What in the world is going on here?
Two words: Sampling Error.
Try this experiment. Copy all of the results from any one of these surveys for the last two weeks into a spreadsheet. Calculate the average overall result for each candidate. Now check to see if any result for any candidate on any day falls outside of the reported margin of error for that survey for either candidate. I have -- for all four surveys -- and I see no result beyond the margin of error.
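Here is roughly what that spreadsheet experiment looks like in Python, with made-up daily numbers standing in for the real published results. The standard 95% margin-of-error formula for a proportion is used, and the sample size of 1,200 interviews per release is purely my assumption:

```python
# The spreadsheet experiment above, in code. The daily shares below are
# hypothetical stand-ins, NOT actual tracking poll results.
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a proportion p from a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical: 14 daily shares for one candidate from one tracker
daily = [0.48, 0.47, 0.49, 0.48, 0.46, 0.48, 0.49,
         0.47, 0.48, 0.50, 0.47, 0.48, 0.46, 0.49]
avg = sum(daily) / len(daily)
moe = margin_of_error(avg, 1200)
outside = [p for p in daily if abs(p - avg) > moe]
# With a roughly +/- 2.8 point margin of error, none of these readings
# falls outside the two-week average -- exactly the pattern described.
```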
What about the differences between two surveys? Two results at the extreme ends of the error range can still be significant. A different statistical test (the Z-Test of Independence) checks for such a difference.
So what about the difference between the 46% that the TIPP survey reported four days ago and the 42% they reported yesterday? It is close, but still not significant at a 95% confidence level.
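For the curious, here is that test in Python. TIPP's nightly sample size isn't given here, so n = 900 interviews per release is purely my assumption for illustration:

```python
# Z-test for the difference between two independent proportions, applied
# to the TIPP example above. The sample sizes are assumed, not reported.
import math

def z_two_proportions(p1, n1, p2, n2):
    """Z statistic for the difference between two independent proportions."""
    p = (p1 * n1 + p2 * n2) / (n1 + n2)  # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

z = z_two_proportions(0.46, 900, 0.42, 900)
# z comes out around 1.7 -- short of the 1.96 needed for significance at
# the 95% confidence level, though it would clear the 1.645 bar at 90%.
```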
What about the three-point increase for John Kerry (from 46% to 49%) over the last three days on the ABC/Washington Post survey? Nope, not significant either. And keep in mind, they had Kerry at 48% just seven days earlier.
Might some of these differences look "significant" if we relaxed the level of confidence to 90% or lower? Yes, but that means we would likely find more differences by chance alone. If you hunt through all the daily results for all the tracking surveys, you will probably find a significant difference in there somewhere (some call that "data mining"). However, consider the number of potential pairs of differences we could check given 4 surveys and 13 releases each (2 fewer for ABC/Washington Post) over the last 14 days (I won't calculate it, but it's a big number). At a 95% confidence level, you should find one apparently "significant" difference for every 20 pairs you test. Relax the significance level to 90%, and you will see "significant" yet meaningless differences in one comparison in ten.
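That "big number" is easy enough to compute under some assumptions of my own (within-survey comparisons only, one candidate at a time):

```python
# Counting the potential pairwise comparisons described above, under my
# own assumptions: each release compared against every other release of
# the same survey, for each of the two major candidates separately.
from math import comb

releases = [13, 13, 13, 11]  # 11 for ABC/Post (two days without data)
pairs_per_candidate = sum(comb(n, 2) for n in releases)
pairs = pairs_per_candidate * 2  # Bush and Kerry separately

# pairs_per_candidate = 78 + 78 + 78 + 55 = 289, so 578 pairs in all.
# At a 95% confidence level, roughly 578 / 20, or about 29 of them,
# would look "significant" by chance alone; at 90%, about 58 would.
```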
If all of this is just too confusing, consider the following. I averaged the daily results released by all four surveys for the last two weeks (excluding October 15 and 22, when ABC/Washington Post did not release data). Here is the "trend" I get:
If all four surveys -- or even three of the four -- were showing small, non-significant changes in the same direction, I would be less skeptical. We could be more confident that the effectively larger sample of four combined surveys might make a small change significant. However, when the individual surveys show non-significant changes that zig and zag (or Zog?) in opposite directions, the odds are high that all of this is just random variation.
Disclaimer (since someone always asks): Yes, these are four different surveys involving different questions, sample sizes and sampling methodologies (although their methods are constant from night to night). One (Rasmussen) asks questions using an automated recording rather than interviewers. All four weight by party identification, though to different targets and some more consistently than others. So, by the rule book, we should not simply combine them. Also, averaging the four does not yield a more accurate model of the likely electorate than any one alone. My point is simply that averaging the four surveys demonstrates that much of the apparent variation over the last two weeks is random noise.
Also, as with investing, past performance is no guarantee of future gain (or loss). Just because things look stable over the last two weeks doesn't mean they won't change tomorrow.
So we'll just have to keep checking obsessively...and remembering sampling error.
UPDATE: Prof. Alan Abramowitz makes a very similar point over at Ruy Teixeira's Donkey Rising.
Correction: The original version of this post wrongly identified the TIPP survey as "IBD/TIPP." While TIPP did surveys in partnership with Investor's Business Daily earlier in the year, the current tracking poll is just done by TIPP. My bad - sorry for the error.