February 09, 2005
The Hotline's SurveyUSA Interview
Yesterday, the National Journal's Hotline (subscription required) took up a topic of great interest to MP: "Whether the polling community will admit it publicly or not," The Hotline's editors wrote on their front page, "there's a crisis in their industry. From media pollsters to partisan pollsters, more and more consumers of these polls are expressing skepticism over the results, no matter how scientifically they are designed."
They went on to debut a new series of debates they are "hoping to spark" in the political community, the first on the topic of "Interactive Voice Response" (IVR) polling. They kicked it off with a long interview with Jay Leve, editor of SurveyUSA.
Now for those who are not familiar with it, The Hotline is a daily news summary that provides comprehensive coverage of politics at the national, state and congressional district levels. Unfortunately, it is only available through a pricey subscription that is out of reach to most individual readers, so I cannot link to it directly.
However...the folks at the National Journal have kindly granted MP permission to reproduce the interview in full, as long as I cite it properly and include their copyright (and did I mention that this interview came from The Hotline, National Journal's Daily Briefing on Politics, a bargain at any price? Good...didn't want to forget that).
Seriously, thanks to the Hotline for providing a forum for this important debate. Whether you believe that political survey research is "in crisis" or not, there is no question that the challenges already facing random sample telephone surveys will increase significantly over the next decade. Those challenges and the responses to them are worthy of debate, and not just among those who produce surveys but among our consumers as well.
I'll chime in with some thoughts on Leve's interview later today or tomorrow. Until then, here is the full interview. The comments section is open as always.
The following is an interview with SurveyUSA Editor Jay Leve. SurveyUSA has come under a great deal of criticism over the years for not using real people to conduct their polls, and using instead the recorded voice of a professional announcer. However, in the '04 election cycle, SurveyUSA had an impressive track record which can be viewed on their web site. In the wake of these results, we decided to ask the questions everyone wants to know the answers to and allow Leve to address his critics head on. We plan to invite many pollsters to respond to this interview; but any pollsters that would like to respond before we ask, email here.
You have enjoyed a great deal of success in this past election cycle. How can you explain your accuracy to your critics?
There are two ways to measure an election pollster's performance: "absolute" accuracy and "relative" accuracy. SurveyUSA keeps track of both, maintains up-to-date scorecards and, alone among pollsters, publishes the scorecards to our website for public inspection. By any measure, SurveyUSA is at the top or near the top of election pollsters - not just for 2004, but ever since SurveyUSA started polling in 1992.
Pollsters talk and write a lot about reducing "Total Survey Error," but most obsess over the mathematical sources of error. I focus more on how the questions get written, who asks them, how the questions sound to the respondent and how the questions get answered. SurveyUSA has re-thought from scratch exactly how polls can best be conducted, given what professional voices make possible. If the cost of each additional interview is expensive, which it is for others, you think about TSE one way. If the cost of each additional interview is relatively inexpensive, which it is for SurveyUSA, you can make other choices. The amount of intellectual horsepower that gets applied to exactly how a question is asked, and exactly what the respondent hears, as a percentage of total expended intellectual energy, is greater at SurveyUSA than at other firms.
With this accuracy, what prevents any "Homer Simpson" from purchasing an auto-dialer and conducting polls from home?
SurveyUSA has spent a fortune writing software and building hardware. But even if I gave our technology away for free, to Homer or Pythagoras, they would not know what to do with it. SurveyUSA's technology is neutral. It's just a tool, neither good nor evil.
On your web site, you state that "Many media polls are ordered and completed same day. Many market research projects are ordered one day and delivered the next." How can this leave time to accurately develop the questionnaire, ensuring that the questions are unbiased?
SurveyUSA researchers do not start from scratch when a poll is commissioned. Like most pollsters, SurveyUSA asks the same questions over and again. SurveyUSA's library has thousands of poll questions. Every so often, something truly new comes up, and our writers must wrestle with constructs, language, phrasing and the range of possible answer choices. In such cases we may test multiple ways of asking the same question. Ultimately a "keeper" goes into our library. Your question implies that questionnaires must be long and complex. Not true. For others, who go into the field infrequently, questionnaires take weeks to prepare, because both pollster and client know they may not get another chance for 3 months. SurveyUSA goes into the field every day. Our questionnaires are short, by design. Some see this as a limitation. We see it as an advantage. The more questions you put in a questionnaire, the more those questions interact with each other, and the more the early questions color the answers to later questions. Others ask (ballpark) 100 questions, which take 20 minutes to answer. SurveyUSA asks (ballpark) 10 questions, which take 2 minutes to answer.
Do you feel the increased turnaround time has any negative effects (if not the above mentioned)?
Every piece of research has a proper field period. Some SurveyUSA polls are in the field for minutes, some for weeks. SurveyUSA election polls are typically conducted over 3 consecutive days. Minutes after the first presidential debate last fall, ABC News completed one poll of 531 debate watchers. CNN completed one poll of 615 debate watchers. CBS News completed one poll of 655 debate watchers. NBC News did nothing. SurveyUSA completed 35 separate polls in 35 separate geographies, of 14,872 debate watchers. NBC affiliates in Seattle, Salt Lake and Denver had scientific SurveyUSA reaction in-hand minutes after the debate, while Tim Russert and Chris Matthews pondered how many DailyKos bloggers had stuffed the ballot box at the MSNBC website.
On the day before DIA opened in Denver in 1995, SurveyUSA took a poll for KUSA-TV. We asked whether building the new airport was good or bad. The next day, after the airport opened, we re-took the same poll. Approval for the airport went up 20 points overnight. Should we have held the story for 3 days so we could do more callbacks? We had news. Our client led with it. The other stations had nothing. We owned the story.
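The debate-night sample sizes quoted in this answer can be put on a common footing with a little arithmetic. The following is only a sketch using the figures from the interview; the variable and function names are illustrative, and it assumes nothing about how the 14,872 interviews were actually distributed across the 35 geographies beyond a simple average.

```python
# Sample sizes quoted in the interview for the first 2004 presidential debate.
network_polls = {"ABC News": 531, "CNN": 615, "CBS News": 655}
surveyusa_total_interviews = 14872
surveyusa_poll_count = 35


def average_sample_size(total_interviews, poll_count):
    """Return the mean number of interviews per poll."""
    return total_interviews / poll_count


# Simple average only; the interview does not give per-geography counts.
surveyusa_average = average_sample_size(
    surveyusa_total_interviews, surveyusa_poll_count
)

for name, interviews in network_polls.items():
    print(f"{name}: {interviews} debate watchers in one poll")

print(
    f"SurveyUSA: about {surveyusa_average:.0f} debate watchers per poll, "
    f"averaged over {surveyusa_poll_count} polls"
)
```

On these figures, each individual SurveyUSA poll was on average somewhat smaller than any one of the network polls; the difference lies in the number of separate geographies covered.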
How do you formulate a representative sample? Do you use random digit dialing or voter lists? Which do you feel has the more accurate results? Why?
SurveyUSA purchases RDD sample from Survey Sampling of Fairfield CT. We have conducted side-by-side testing using RBS (Registration Based Sample). In the testing we have done, RBS did not outperform RDD.
On your website, you state that in order to end up with an accurate sample you use demographic breakdown to ensure you are portraying the population. However, since the questionnaire is being asked of the first person who answers the phone, how can you accurately establish a sample that is appropriate for a poll? Even with screening questions to establish the likelihood of a voter, how can you assure that a caller is actually over the age of 18? Or for that matter, how can you assure that they are citizens, or registered to vote? Is there any systematic way you can verify the accuracy of this once a poll has been completed?
You've asked 5 questions here, the first of which contains a false premise. SurveyUSA can choose to talk to the person who answers the phone, or we can ask to speak to someone else. There is nothing hard about that. By your question, you create the impression that, a) SurveyUSA doesn't understand the importance of selecting a respondent from within a household and, b) even if we did understand it, our technology prevents us from doing it. Both are false. SurveyUSA has read all of the literature on intra-household selection, and SurveyUSA has done side-by-side testing on the different ways that one might do intra-household selection. We have tested the methods that are mathematically defensible in theory, such as asking for the respondent with the most recent birthday (which has problems in practice), and methods that are mathematically indefensible, such as asking for the youngest male over the age of 18. Intra-household selection, in practice, does not make the kind of polls that SurveyUSA conducts more accurate.
2.4 percent of those who take a SurveyUSA poll tell us they are under the age of 18. We exclude them. There is no evidence that people lie to us more often than they lie to a headset operator. There is evidence to the contrary.
Some SurveyUSA competitors want you to think SurveyUSA gets an occasional election right, the way Miss Cleo occasionally gets a psychic prediction right. The facts are published and available for inspection. The odds that chance alone can explain SurveyUSA's success relative to other pollsters are 1,000,000,000:1, by many measures. To those who would like me at this point to disclose that SurveyUSA got the Newark mayor's election wrong in 2002, the San Francisco mayor's runoff wrong in 2003, and that SurveyUSA overstated Dean in the 2004 Iowa caucuses, we did. When you have as many at-bats as SurveyUSA, you are going to strike out from time to time. The question is: how does our entire body of work stand up? By multiple objective Mosteller measures, SurveyUSA's data need take a back seat to no one's.
In 1999, a subsidiary of the research firm IPSOS wanted to see if interactive voice was a viable alternative to CATI. Senior IPSOS scientists put together a side-by-side test with 93,000 interviews. The test was deliberately designed to isolate and identify biases in interactive voice. As such, respondents were asked as diverse a collection of questions as possible. The testing was designed, carried out and paid for by the IPSOS subsidiary. After the 93,000 parallel interviews were conducted, IPSOS wrote a white paper, summarizing the research-on-research. Findings:
- "IVR produces samples that more closely mirror US demographics than does CATI ... Three demographics stand out as being the reason for these differences: education, income and ethnicity. In all three cases, IVR was much closer to the census than CATI."
- "IVR interviewing generally succeeds on all three fronts: sample projectability, accuracy and production rates. These findings suggest that IVR is a valid method for administering short questionnaires to RDD samples."
- "In the few cases where differences are noted in the data, some can be resolved by the way we ask questions and some, we believe, are already more accurate in IVR."
After this white paper was written, this IPSOS subsidiary began using SurveyUSA for data collection.
Due to the manner in which you obtain your sample, is there a differential in accuracy in general vs. primary elections?
SurveyUSA has polled on 310 general candidate elections. Our average error on the candidate is 2.33 points. SurveyUSA has polled on 167 primary elections. Our average error is 4.13 points (1.8 times greater). We do not believe we are less accurate on primary elections because of the way we obtain sample. Because no pollster has ever been asked for, nor publicly made, this kind of disclosure before, I don't know whether a 1.8 factor deterioration on primary polls is above average or below average.
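The "1.8 times greater" figure in this answer can be checked directly from the two averages. The sketch below uses only the numbers quoted in the interview; the combined average at the end is a derived figure and assumes that each quoted average is a simple per-election mean, which the interview does not confirm.

```python
# Figures quoted in the interview.
general_elections = 310
general_avg_error = 2.33  # points

primary_elections = 167
primary_avg_error = 4.13  # points


def error_ratio(primary_error, general_error):
    """How many times larger the primary error is than the general error."""
    return primary_error / general_error


def combined_average(count_a, avg_a, count_b, avg_b):
    """Weighted mean of two averages (assumes simple per-election means)."""
    return (count_a * avg_a + count_b * avg_b) / (count_a + count_b)


ratio = error_ratio(primary_avg_error, general_avg_error)
overall = combined_average(
    general_elections, general_avg_error, primary_elections, primary_avg_error
)

print(f"Primary error is {ratio:.2f} times the general-election error")
print(f"Combined average error over {general_elections + primary_elections} "
      f"elections: {overall:.2f} points (derived, not quoted)")
```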
Do you include "traps" in your screening process? If so, such as? Do they prove to be effective?
We have experimented with as few as 3 and as many as 8 screens for likely voters over the years. In addition to asking the obvious question, "Are you registered?", we have experimented with many different variations on the direct, "How likely are you to vote" question, including running side-by-side testing for many of our 2004 polls comparing a 4-point likely scale to a 5-point scale. We have, in past years, but not in 2004, asked people where they vote. In 2004 we asked respondents whether and how they voted in 2000. We ask people their interest on a 1-to-10 scale. In 2004, we used fewer screening questions than in past years. Our results were superior. We find no simple relationship between the number of screening questions and the accuracy of our results. When SurveyUSA consistently produces a candidate error of 0.0 on pre-election polls, we'll assume we have solved this riddle, and will stop experimenting. Until then, it's a work in progress.
Under what circumstances are your polls more beneficial than traditional telephone polls as conducted by Gallup? What makes automated polls more accurate?
Have you been to Gallup's website lately? Have you watched Frank Newport deliver the Daily Briefing? Have you been to the Gallup Brain? Have you read Gallup's blog? Do you receive the occasional introspective from David Moore? What a tour de force. No other pollster is a close second to Gallup in these areas. I aspire to run my company as openly and transparently as does Gallup, and to provide interactive real-time access to our library of questions and answers. In this regard, I have the highest respect for Gallup. Further, Gallup has a 70-year track on many important questions, which gives Gallup a 60 year head-start on SurveyUSA. That said, I would not trade data with Gallup: 42% of Gallup's final statewide polls in 2004 produced a wrong winner (5 wrong winners out of 12 state polls), compared to 3.4% of SurveyUSA's final statewide polls (2 wrong winners out of 58 state polls).
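The two wrong-winner percentages quoted above follow from the counts given in parentheses. A minimal sketch of that calculation, using only the interview's figures (the function name is illustrative):

```python
def wrong_winner_rate(wrong_winners, total_polls):
    """Share of final polls that pointed to the wrong winner, as a percentage."""
    return 100 * wrong_winners / total_polls


# Counts quoted in the interview for final statewide polls in 2004.
gallup_rate = wrong_winner_rate(5, 12)
surveyusa_rate = wrong_winner_rate(2, 58)

print(f"Gallup: {gallup_rate:.0f}% wrong winners (5 of 12)")
print(f"SurveyUSA: {surveyusa_rate:.1f}% wrong winners (2 of 58)")
```

Note that the two rates rest on very different numbers of polls (12 versus 58), so the comparison is only as strong as the smaller sample allows.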
Professionally-voiced polls are not inherently superior to headset-operator polls, and I do not make that claim. I just rebut the assertion that professionally-voiced polls are inherently inferior. Used properly, SurveyUSA methodology can have advantages. In 1994, SurveyUSA polled California on Proposition 187 for TV stations in Los Angeles, San Francisco and Sacramento. Prop 187 was a plan to deny benefits to illegal immigrants. When others polled, some respondents heard the 187 question this way, "Are you a bigot?" They answered in the politically correct way. "No, I would never vote to deny benefits to illegal immigrants" (before going out and doing just that). It did not matter how much confidentiality Field or LA Times interviewers promised the respondent, or how well trained those interviewers were. Both pollsters understated support for this measure. When SurveyUSA polled 187, respondents did not have to confess anything, but rather, had only to press a button on their phone, paralleling the experience the respondent would later have in the voting booth, where no one speaks his/her choice aloud. SurveyUSA said Prop 187 would pass 60% to 40%. It passed 59% to 41%.
If your only access to polling data is Hotline, you may think Arnold Schwarzenegger scored a remarkable come-from-behind win in the 2003 Gray Davis recall. The only polls Hotline reported showed Cruz Bustamante ahead early in the campaign. What SurveyUSA knows is that Cruz Bustamante never led in California. Californians may have been reluctant at first to tell other pollsters that they planned to vote for the body builder, but they had no problem telling KABC's Marc Brown this every time SurveyUSA was in the field, which was on 38 of the 59 nights of that campaign. Publications, such as Hotline, which abide by the Gentleman's Agreement not to publish SurveyUSA polls, do a terrible disservice to their subscribers on occasions such as this.
In 1998, I received a call at my house from a well known Washington DC polling firm. The interviewer eventually zeroed-in on questions about Bill Parcells, then the coach of the New York Jets, and a Cadillac spokesman. I listened carefully. Why would the interviewer want to know if I thought Bill Parcells was honest? Then I connected the dots. This was not a poll about Bill Parcells, this was a poll about Bill Pascrell, who is my Congressman, and who was running for re-election in New Jersey's 8th District. The interviewer was reading the name wrong. I said to the interviewer, "Ma'am, excuse me. Stop. You are mispronouncing the gentleman's last name. It is Pas-crell. Not Par-cells." "No," she said. "It says right here, 'Bill Parcells'." How many times a day do you think something like that happens with headset operators? How many different ways can you think of for an $8/hour employee doing monotonous work to make a mistake? Does it matter how many PhDs worked to draw the sample for that survey? Does it matter how many PhDs pored over the data to write the analysis that the candidate ultimately was handed? It doesn't. The data was worthless. And this - importantly - was one of the best outfits, an outfit that actually runs its own call center. Imagine how much worse it gets at firms that just outsource their calls to a 3rd party, and who have no direct control over who asks the questions.
Now, about the word "automated." Almost all polling firms use purchased auto-dialers. The dialer automatically dials the phone, detects a connection and, once the dialer believes a human is on the line, automatically passes the call to an interviewer. In some cases, that interviewer is well-trained and articulate, sensitive without being intrusive, and in all things neutral. Perfect. But in other cases, that interviewer is an unpaid, untrained college student hoping to get a credit, or the interviewer is a convicted criminal, calling from a call center located within a Canadian prison. The people who staff call centers know the dirty little secrets, and they know the kind of people they can attract to do this work. They can tell you about interviewers who come to work drunk, stoned, or hacking phlegm. They can tell you about interviewers who flirt with the respondents, deliberately, to coax answers, interviewers who coach respondents, leading them to the "right" answers, and interviewers who don't ask the questions at all, but who just make up the answers to save time. Not every headset operator is horrible, to be sure, and the majority are well-meaning, but every call center has horror stories.
In SurveyUSA's case, when our proprietary dialer detects a human, the respondent immediately hears the voice of a TV news anchor. News anchors are not paid $8/hour. In some cases they are paid $800 an hour. No one is more acutely aware of the limitations of SurveyUSA methodology than I. But the choice is not between SurveyUSA and perfection. The choice is between a news anchor, who has been on the air 30 years in some cases, and a headset operator, who, if he/she lasts a year in the job, is exceptional. I'll take the news anchor. Were Winston Churchill alive, he might say: "Many forms of data collection have been tried, and will be tried in this world of sin and woe. No one pretends that using TV news anchors to ask the questions is perfect or all-wise. Indeed, it has been said that using TV news anchors is the worst form of data collection ... except all those others that have been tried from time to time."
Because of the nature of your polling system, the types of questions that can be asked are limited. Without giving the respondent an opportunity to choose "other" and then specify what that is on more than one question, doesn't this prevent the client from being privy to the wants/needs of the sample?
"Other" can be included in any question we ask. Structured probing can be done to whatever level is appropriate. Unstructured, open-ended, iterative probing cannot be done, but if you want unstructured, open-ended iterative probing, you need a focus group. In 1992, SurveyUSA identified an opportunity to serve TV newsrooms that were not being served by Gallup and Harris. We built a better mousetrap; the world beat a path to our door. Just as the TVA brought water to small-town America, and the REA brought electricity to small-town America, SurveyUSA brought true, random-sample, extrapolatable opinion research to Wichita, Roanoke and Spokane. Our clients are delighted with the work we do. Some have been customers for 12 years now. A number are under contract through 2008.
How do you compare to Rasmussen? Do you feel you are more/less accurate when it comes to competitive races?
SurveyUSA has competed with Scott William Rasmussen on 68 occasions. We have outperformed Rasmussen using any of 8 academic measures. Our mean error and standard deviation on those 68 contests, and Rasmussen's, are posted to SurveyUSA's website.
What is your response to critics who state that while automated polls are fast, rendering them headline worthy for TV stations, they are not accurate enough to use within a campaign to determine strategy based on the reaction of the electorate to issues or events? In addition to The Hotline, a number of other news organizations have a policy of not running automated dialing polls, stating that it would be a disservice to readers to portray the results as accurate -- Roll Call and the AP to name a couple. How would you convince us otherwise?
Campaign managers scour SurveyUSA's data, then make media-buy decisions and change strategy. I know because campaign managers call me. They tell me how eerily similar SurveyUSA's data is to their own internal polling. By any objective criteria or honest measure, SurveyUSA years ago earned the right to be included in Hotline's "Poll Track." Yet we're still blacklisted. Evil triumphs when good men do nothing. Here's a chance to do something.
© 2004 by National Journal Group Inc., 600 New Hampshire Avenue, NW, Washington DC 20037. Any reproduction or retransmission, in whole or in part, is a violation of federal law and is strictly prohibited without the consent of National Journal. All rights reserved.
Good post. While I never trust samples of one, especially when that one is me, I have been polled both by headset people and computer. I am much more comfortable giving my answers to a computer.
Posted by: Fran | Feb 10, 2005 2:46:05 PM
Fran, I've never been polled - but then again, I'm only 28 and in my 18-22 year old years I imbibed a tad too much and didn't answer the phone (hey, that's what an answering machine is for right?).
Dang it Mark, you made me spend 15 minutes of homework time absorbing every word of this post. Thanks for linking it for the poor slobs who can't feed their kids (me of course, although my kids eat, I just go into debt).
One question though about the new questions that they develop. One of my professors has a small local polling company. He said that they always pre-test their survey instruments with focus groups before ever conducting the actual survey. Is that too cautious? Mark, does bpbresearch do that?
Posted by: Rick Brady | Feb 10, 2005 10:25:00 PM
Excellent post. Thanks for doing it. After 2004, I have more confidence in these guys than anyone else except Mason Dixon.
Posted by: Cableguy | Feb 12, 2005 2:20:22 AM
Jay Leve is at best a snake oil salesman, and in the least a quack.
Posted by: One who kNOws | Jan 6, 2008 12:24:30 AM