« February 2006 | Main | April 2006 »

March 31, 2006

Focus Groups - What They're Not

Yesterday, the Hotline On Call blog reported on focus groups conducted recently by Republican pollster Frank Luntz among Democrats in Iowa and New Hampshire.  Since MP assumes those observations will get noticed in the blogosphere (and since...well... they asked), now is a good time to talk a bit more about the pros and cons of focus groups. 

The Wikipedia has a good explanation of focus groups that includes this basic description: 

In traditional focus groups, a pre-screened (pre-qualified) group of respondents gathers in the same room. They are pre-screened to ensure that group members are part of the relevant target market and that the group is a representative subgroup of this market segment. There are usually 8 to 12 members in the group, and the session usually lasts for 1 to 2 hours. A moderator guides the group through a discussion that probes attitudes about a client's proposed products or services. The discussion is unstructured (or loosely structured), and the moderator encourages the free flow of ideas. Although the moderator is seldom given specific questions to ask, he/she is often given a list of objectives or an anticipated outline.

Client representatives observe the discussion from behind a one-way mirror. Participants cannot see out, but the researchers and their clients can see in. Usually, a video camera records the meeting so that it can be seen by others who were not able to travel to the focus group site. Researchers are examining more than the spoken words. They also try to interpret facial expressions, body language, and group dynamics. Transcripts are also created from the video tape.

MP, like most pollsters, considers the focus group an invaluable tool in the measurement of public opinion.  The great advantage of the focus group is its wide-open and unstructured format.  While a survey must follow a standardized structure and fit within tight time constraints, a focus group can be free-wheeling and spontaneous.  Participants can answer in their own words.  If the initial questions are confusing, the moderator can immediately explain or revise them, or take the conversation in new and unforeseen directions.  Also, the in-person format allows the moderator to play "show and tell" with video clips, advertisements or new products - something that would be impossible over the phone.   

Traditionally, focus groups have been used most often as a "pilot test" of language, concepts and theories before conducting a formal survey.   Political consultants often use focus groups after conducting formal survey research to pilot test television spots and other forms of advertising.

Another reason for their great popularity, especially in the corporate world, is that research consumers find opinions expressed in a focus group easier to understand and relate to than numbers in a table or chart.   As Wikipedia puts it, the results are "believable" and have "high apparent validity."

However, that ease of understanding can often lead to misuse.  Thus, it is important to remember the limitations of focus groups, especially the idea that a focus group is not a survey.  To use the research lingo, focus groups are qualitative, not quantitative.  That is, they do not allow for projective, quantitative estimates for some larger population.  Put another way, you cannot count answers to a question posed in focus groups (hypothetically, 10 of 20 in an Iowa focus group like chocolate ice cream) and use them to make estimates about the views of a larger population (50% of Iowans like chocolate ice cream).  The reasons are that a sample of 20 is tiny and, more importantly, that given the time and travel required, many in the population of interest will lack either the time or the inclination to participate.   
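To put some rough numbers on that point, here is MP's own illustrative arithmetic, using the hypothetical chocolate ice cream example above.  It deliberately sets aside the bigger problem that focus group participants are recruited, paid volunteers rather than any kind of random sample.

```python
import math

# Hypothetical result from the example above: 10 of 20 participants
# in an Iowa focus group say they like chocolate ice cream.
n, likes = 20, 10
p = likes / n  # 0.50

# Even if these 20 people were a true random sample of Iowans
# (they are not), the 95% margin of error on a proportion this size is:
moe = 1.96 * math.sqrt(p * (1 - p) / n)
print(f"estimate: {p:.0%}, margin of error: +/-{moe:.0%}")
# -> estimate: 50%, margin of error: +/-22%
# That is, anywhere from roughly 28% to 72% of Iowans -- and because the
# participants are self-selected volunteers, even that wide interval
# has no statistical validity.
```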

Focus group recruiters typically offer a cash incentive (typically $50 to $75 but sometimes much more), a necessary practice that can create its own challenge.  Focus group researchers must deal with the "professional respondents" who would be happy to participate in focus groups several times a week.  (To hear how this process can break down, MP highly recommends the story that aired in March 2002 on the NPR program Marketplace -- the focus group story begins at about 4:10).

Another limitation is the challenge of what researchers call "group dynamics" and everyone else thinks of as "peer pressure."  If one highly opinionated participant makes a compelling or emotional argument, others in the group may have a hard time expressing contradictory opinions.  To get around this reluctance, researchers try to keep the demographic composition of groups as homogeneous as possible (all female or all white, for example).  They will try to weed out those who might have an "expert" opinion or who typically speak with authority (such as teachers in a group about education issues).  They will also use written exercises to get respondents rooted in their opinions before the discussion starts. 

Unfortunately, the same format that makes focus groups easy to understand also makes them challenging to interpret objectively.  Focus group analysis can be a lot like interpreting a Rorschach inkblot.  It is all too easy to see what one wants to see in a focus group or to make too much out of too little.  That is why most serious researchers recommend following up focus groups with a quantitative survey to try to confirm their apparent findings. 

In the political context, focus groups can also be very sensitive to the kinds of people recruited and the nature of information presented.  Recruit only undecided voters and those unsure about their vote preference, and opinions are likely to change during the group.  Include strong partisans or those strongly committed to a given candidate, and opinions about candidates will be more resistant to change.  Moreover, since a focus group will often expose voters to more political information and discussion in a single evening than they typically experience all year, their reactions can sometimes be highly artificial and deceptive. 

All of which brings me to Frank Luntz's latest focus group project.  It is hard to evaluate the findings since the Hotline report is second-hand and includes no information about the "size/demographic balance, etc." of the groups.   However, in general terms, what the Hotline report describes is more or less what an internal campaign pollster would do in Iowa or New Hampshire on behalf of a client running for president.  The focus group moderator would first discuss all the candidates, probing for awareness and existing attitudes.  At some point they would probably ask participants to "vote" for their favorite.  Then they would present more information or videotape of some or all of the candidates, followed by yet more discussion and possibly a second vote. 

In this case, it appears that Luntz played just one video clip of each candidate (it would be hard to do more given the time constraint).  As such, the reactions he reports depend greatly on the clips he chose to play.  Those clips may or may not provide a decent simulation of the sort of exposure those candidates will get over the next two years.  So while Luntz's observations are interesting, it is hard to know what to make of them.  But I know that won't stop political junkies from speculating, so...enjoy!

Posted by Mark Blumenthal on March 31, 2006 at 01:42 PM in Focus Groups | Permalink | Comments (7)

March 30, 2006

Newport's Answer to Kurtz

A belated link:  Two weeks ago, the Washington Post's Howard Kurtz asked (and Mickey Kaus amplified) a question in his blog about why pollsters "only compar[e] figures to their own past surveys, when they're fully aware of the others."  I posted some thoughts, which led to more posts on whether a trend is evident in the measurements of the Bush job rating taken since mid-February.  Gallup's Frank Newport, looking only at surveys conducted by Gallup, argues that the rating has been stable since mid-February; Professor Franklin argues convincingly that the downward trend continues. 

Somewhere in the midst of all this, I failed to notice that Frank Newport had posted his own answer to the original Howard Kurtz question on Newport's Pulse of the Nation page (which is free to non-subscribers).  The gist, not surprisingly, is that they do not compare their results to those from other polls because "it's not always a direct apples-to-apples comparison."  But Newport does endorse the notion of more polling that "integrate[s] poll results into the ongoing literature rather than treating any one result in isolation."  He writes:

Where possible, I think that news accounts can point out the trend lines for other organizations that regularly track Bush approval. In the present situation, almost all these trend lines have been drifting downward since the slight uptick measured late last year. And several organizations have reported that their values for Bush job approval have reached the low points for the Bush administration as measured by their particular organizations.

Posted by Mark Blumenthal on March 30, 2006 at 03:25 PM in Interpreting Polls, Polls in the News, President Bush | Permalink | Comments (0)

March 29, 2006

The AMA Spring Break Survey - Part II

Picking up where we left off yesterday, the AMA's Spring Break survey has problems other than the disclosure of its methodology.  We must also consider the misleading reporting of results from the survey that were based on less than the full sample. 

I should make it clear that in discussing this survey, I in no way mean to minimize the public health threat arising from the reckless behavior often in evidence at the popular spring break destinations.  Of course, one need not look to spring break trips to find an alarming rate of both binge drinking and unprotected sex among college age adults (and click both links to see examples of studies that meet the very highest standards of survey research).

MP also has no doubt that spring break trips tend to increase such behavior. Academic research on the phenomenon is rare, but eleven years ago, researchers from the University of Wisconsin-Stout conducted what they explicitly labeled a "convenience sample" of students found, literally, on the beach at Panama City, Florida during spring break.  They found, among other things, that 92% of the men and 78% of the women they interviewed reported participating in binge drinking episodes the previous day (although they were also careful to note that students at other locations, or involved in activities other than sitting on the beach, may have been different from those sampled).

In this case, however, the AMA was not looking to break new "academic" ground but to produce a "media advocacy tool."  The apparent purpose, given the AMA's longstanding work on this subject, was to raise alarm bells about the health risks of spring break to young women.  The question is whether these "media advocacy" efforts went a bit too far in pursuing an arguably worthy goal. 

Also as noted yesterday, the survey got a lot of exposure in both print and broadcast news and the television accounts tended to focus on the "girls gone wild" theme.  For example, on March 9, the CBS Early Show's Hannah Storm cited "amazing statistics" showing that "83% of college women and graduates admit heavier than usual drinking and 74% increased sexual activity on spring break."  On the NBC Today Show the same day, Katie Couric observed:

57% say they are promiscuous to fit in; 59 percent know friends with multiple sex partners during spring break. So obviously, this is sort of an everybody's doing it mentality and I need to do it if I want to be accepted. 

And most of the local television news references I scanned via Nexis, as well as the Jon Stewart Daily Show graphic reproduced below, focused on the most titillating of the findings in the AP article:  13% reported having sex with more than one partner and 10% said they regretted engaging in public or group sexual activity. 

[Image: The Daily Show graphic citing the AMA survey]

But if one reads the AP article carefully, it is clear that the most sensational of the percentages were based on just the 27% of women in the sample who reported having "attended a college spring break trip:"

Of the 27 percent who said they had attended a college spring break trip:

  • More than half said they regretted getting sick from drinking on the trip.
  • About 40 percent said they regretted passing out or not remembering what they did.
  • 13 percent said they had sexual activity with more than one partner.
  • 10 percent said they regretted engaging in public or group sexual activity.
  • More than half were underage when they first drank alcohol on a spring break trip.

The fact that only about a quarter of the respondents actually went on a spring break trip -- information missing from every broadcast and op-ed reference I encountered -- raises several concerns.  First, does the study place too much faith in second-hand reports from the nearly three quarters of the women in the sample who never went on a spring break trip?  Second, how many of those who reported or heard these numbers got the misleading impression that the percentages involved described the experiences of all 18-34 year old women?  See the Hannah Storm quotation above. She appears to be among the misled. 

One might think that the press release from the AMA would have gone out of its way to distinguish between questions asked of the full sample and those asked of the smaller subgroup that had actually been on a spring break trip.  Unfortunately, not only did they fail to specify that certain percentages were based on a subgroup, they also failed to mention that only 27% of their sample had ever taken a spring break trip.  Worse, the bullet-point summary of results in their press release mixes results for the whole sample with results based on just 174 respondents, a practice that could easily confuse a casual reader. 
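Some back-of-the-envelope arithmetic (MP's own, and purely illustrative, since random-sampling math does not strictly apply to an opt-in panel in the first place) shows why the distinction between the full sample and the subgroup matters:

```python
import math

full_sample = 644
subgroup = round(0.27 * full_sample)  # the ~174 women who reported a spring break trip

def moe(n, p=0.5):
    """Conventional 95% margin of error for a proportion, assuming simple random sampling."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

print(subgroup)                                           # 174
print(f"full sample (n=644): +/-{moe(full_sample):.1%}")  # roughly +/-3.9%
print(f"subgroup    (n=174): +/-{moe(subgroup):.1%}")     # roughly +/-7.4%
# Even granting the (unwarranted) random-sample assumption, percentages based
# on the subgroup are roughly twice as noisy as full-sample figures, and they
# describe a different, much smaller group of women.
```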

[Image: bullet-point summary from the AMA press release]


[Highlighting added]

So what can we make of all this?

Consider first the relatively straightforward issues of disclosure and data reporting.  In this case, the AMA failed to indicate in their press release which results were based on the full sample and which on a subgroup.  Their press release also failed to indicate the size of the subgroup.  Both practices are contrary to the principles of disclosure of the National Council on Public Polls.  Also, as described yesterday, their methodology statement at first erroneously described the survey as a "random sample" complete with a "margin of error."  It was actually based on a non-random, volunteer Internet panel.  In correcting their error -- two weeks after the data appeared in media reports across the country -- they expunged from the record all traces of their original error.  In the future, anyone encountering the apparent contradiction between the AP article and the AMA release might wrongly conclude that AP's reporter introduced the notion of "random sampling" into the story.  For all of this, at the very least, the AMA owes an apology to both the news media and the general public. 

The issues regarding non-random Internet panel studies are less easy to resolve and worthy of further debate.  To be sure, pollsters and reporters need to disclose when a survey relies on something less than a random sample.  But aside from the disclosure issue, difficult questions remain:  At what point do low rates of coverage and response so degrade a random sample as to render it less than "scientific?"  And is there any yardstick by which non-random Internet panel studies can ever claim to "scientifically" project the attitudes of some larger population?  In the coming years, the survey research profession and the news media will need to grapple with these questions. 

For now, MP agrees with those troubled by the distinction made by the AMA official (as quoted in yesterday's post) between "academic research" and "a public opinion poll": 

[T]his was not academic research -- it was a public opinion poll that is standard for policy development and used by politicians and nonprofits.

Apparently I need to reiterate that this is not an academic study and will be published in any peer reviewed journal; this is a standard media advocacy tool

I agree that the release of data into the public domain demands a higher standard than what some campaigns, businesses and other organizations consider acceptable for strictly internal research.  With an internal poll, the degree of separation between the pollster and the data consumer is small, and the pollster is in a better position to warn clients about the limitations of the data.  Numbers released into the public domain, on the other hand, can easily take on a life of their own, and data consumers are more apt to reach their own conclusions absent the pollster's caveats.  Consider the excellent point made by MP reader and Political Science Professor Adam Berinsky in a comment earlier today: 

Why should anyone accept a lower standard for a poll just because the results are not sent to a peer-reviewed journal? If anything, a higher-standard needs to be enforced for publicly disseminated polls. Reviewers for journals have the technical expertise to know when something is awry. Readers of newspapers don't and they should not be expected to have such expertise. It is incumbent on the providers of the information to make sure that their data are collected and analyzed using appropriate methods.

Finally, I put a question to Rutgers University Professor and AAPOR President Cliff Zukin that is similar to one left earlier this afternoon by an anonymous MP commenter.  I noted that telephone surveys have long had trouble reaching students, a problem worsening as adults under 30 are more likely to live in cell-phone-only households that are out of reach of random-digit-dial telephone samples.  Absent a multi-million dollar in-person study, bullet-proof "scientific" data on college students and the spring break phenomenon may be unattainable.  If the AMA had correctly disclosed their methodology and made no claims of "random sampling," would it be better for the media to report on flawed information than none at all?   

Zukin's response was emphatic: 

Clearly here flawed information is worse than none.    And wouldn't the AMA agree? What is the basic tenet of the Hippocratic Oath for a physician:  First, do no harm. 

Would I rather have no story put out than a potentially misleading one suggesting that college students on spring break are largely drunken sluts?  Absolutely.  As a college professor for 29 years, not a question about it.  This piece is extremely unfair to college-aged women. 

And the other question here is one of proper disclosure.  Even if one had but limited resources to study the problem, the claims made in reporting the findings have to be considerate of the methodology used.  I call your attention to the statement they make in the email that the main function of the study was to be useful in advocacy.   Even if I were to approve of their goals, I believe that almost all good research is empirical in nature.  We don't start with propositions we would like to prove.   And the goal of research is to be an establisher of facts, not as a means of advocacy. 

I'm not naive or simplistic.  I don't say this about research that is either done by partisans or about research that is never entered into the public arena.  But the case here is a non-profit organization entering data into the public debate.  As I said to them in one of my emails, AAPOR can no more condone bad science than the AMA would knowingly condone bad medicine.  It's really that simple

As always, contrary opinions are welcome in the comments section below.  Again, I emailed the AMA and their pollster yesterday offering the opportunity to comment on this story, and neither has responded. 

Posted by Mark Blumenthal on March 29, 2006 at 05:52 PM in Internet Polls, Interpreting Polls, Polls in the News, Sampling Issues | Permalink | Comments (12)

March 28, 2006

The AMA Spring Break Survey - Part I

Last week, MP discussed a not-quite projective study of evacuees from Hurricane Katrina published by the New York Times.  I noted the effort made by the Times to differentiate their study from a "scientific poll" and to make clear that the results "cannot be projected to a definable population."  This week, we have the story of results from a widely published study sponsored by the American Medical Association (AMA) that was not nearly so careful.  They took a deceptive approach to disclosure that is becoming more common, inaccurately describing a non-random Internet panel survey as a "random sample" complete with a "margin of error." 

The study that the AMA billed as a poll of college women and graduates certainly made a lot of news.  A story on the poll by the AP's Lindsey Tanner appeared in thousands of newspapers and websites.  A search of the Nexis database shows mentions on the NBC Today Show and the CBS Early Show, and hundreds of mentions on local television and radio news broadcasts across the country.  Results from the survey also appeared in the New York Times ($), in Ana Marie Cox's new column for Time Magazine and even on Jon Stewart's Daily Show.

Cliff Zukin, the current president of the American Association for Public Opinion Research (AAPOR), saw the survey results printed in the Times, and wondered about how the survey had been conducted.  He contacted the AMA and was referred to the methodology section of their online release.  He saw the following description (which has since been scrubbed):

[Image: original AMA methodology statement (since revised)]


The American Medical Association commissioned the survey.  Fako & Associates, Inc., of Lemont, Illinois, a national public opinion research firm, conducted the survey online.   A nationwide random sample of 644 women age 17 - 35 who currently attend college, graduated from college or attended, but did not graduate from college within the United States were surveyed.  The survey has a margin of error of +/- 4.00 percent at the 95 percent level of confidence [emphasis added].

Zukin sent an email to Janet Williams, deputy director of the AMA's Office of Alcohol, Tobacco and Other Drug Abuse, to ask "how the random sample of 644 women was selected?" (Zukin's complete email correspondence with the AMA appears in full after the jump).  He asked about the "mode of interviewing, sampling frame, eligibility and selection criteria, and the response rate," as called for in the AAPOR professional code of disclosure. 

Williams responded: 

The poll was conducted in the industry standard for internet polls -- this was not academic research -- it was a public opinion poll that is standard for policy development and used by politicians and nonprofits.

The internet poll methodology used by the AMA's vendor, Fako & Associates, made use of the Survey Spot volunteer Internet panel maintained by Survey Sampling, Inc. (SSI).  According to the SSI website, panel members "come from many sources, including banner ads, online recruitment methods, and RDD telephone recruitment."  Anyone can opt in to Survey Spot at their recruitment website.  A poll conducted with the Survey Spot panel may yield interesting and potentially useful data, but that data will not add up to a "random sample" of anything other than the individuals who choose to participate in the panel. 

Zukin replied: 

I'm very troubled by this methodology.  As an op-in non-probability sample, it lacks scientific validity in that your respondents are not generalizable to the population you purport to make inferences about.  As such the report of the findings may be seriously misleading.  I do not accept the distinction you make between academic research and a "public opinion" survey. 

The next day, Williams shot back:

I have been involved in the development of public policy research for more than 15 years using this company and several others.  We do not make any claims that this is a scientific study and again I ask why did you not have a problem with the other two public opinion surveys I have conducted.  I also am afraid that you are looking at the media coverage and not what we issued...

As far as the methodology, it is the standard in the industry and does generalize for the population.  Apparently I need to reiterate that this is not an academic study and will be published in any peer reviewed journal; this is a standard media advocacy tool that is regularly used by the American Lung Association, American Heart Association, American Cancer Society and others. 

On that score, Williams was in error.  Yes, the article by AP's Tanner did refer to the study as "a nationwide random sample of 644 college women or graduates ages 17 to 35," but then so did the original AMA release put out by Williams' office as noted above.  The original release also provided a margin of error, something that is only statistically appropriate for a truly "scientific" random sample. 

MP and Williams also differ in their perceptions of what constitutes an "industry standard." While companies that conduct market research using opt-in volunteer panels are certainly proliferating, the field remains in its Wild West stage.  Every company seems to have a different methodology for collecting and "sampling" volunteer respondents and then weighting the results to make them appear representative.  Few disclose much if anything about the demographics or attitudes of the volunteers in their sample pool. 
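For readers unfamiliar with what "weighting the results to make them appear representative" involves, here is a minimal, generic sketch of cell weighting.  The sample split uses the 62/38 age breakdown from the pollster's revised methodology statement; the population targets are invented purely for illustration, and real vendors use more elaborate (and mostly undisclosed) procedures.

```python
# Minimal cell-weighting sketch: each respondent receives a weight equal to
# the population share of her demographic cell divided by that cell's share
# of the (self-selected) sample.  Population targets here are hypothetical.
population_share = {"age 17-23": 0.45, "age 24-35": 0.55}   # invented targets
sample_share     = {"age 17-23": 0.62, "age 24-35": 0.38}   # panel's reported mix

weights = {cell: population_share[cell] / sample_share[cell]
           for cell in population_share}
print(weights)
# roughly {'age 17-23': 0.73, 'age 24-35': 1.45}
# Weighting can fix the demographic mix on paper, but it cannot repair the
# deeper problem that people who volunteer for Internet panels may differ
# from everyone else in ways no demographic adjustment captures.
```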

The one issue on which Williams has a point, unfortunately, involves methodological disclosure.  The AMA poll is one of many I have seen that calculate a margin of error for a non-random panel sample.  This sort of misleading "disclosure" should not be an "industry standard," but sadly, it is fast becoming one. 

Zukin noted much of this in a subsequent reply:

Simply put, statistically, you are wrong.  The methodology is not standard, it is not generalizable to the population.  And, the reporting of a sampling error figure, as you have done in your methods statement is fanciful.  Because of the way you sampled people, with a non-probability sample, there is no way can know about the accuracy of your sampling and error margin.  This is simply without basis in mathematical fact.  100 out of 100 statisticians would tell you that there is no sampling error on a non-probability sample.  It is beyond question that your methodological statement is factually inaccurate and misleading. 

A little over an hour later, Zukin received another message, this time from Dave Fako, the pollster whose company conducted the survey:

Janet at the AMA made some incorrect assessments of the methodology that was used for the survey.  I'd like to clarify some of your questions.

This survey was an online panel study, conducted in accordance with professional standards for this type of study.  We do not, and never intended to, represent it as a probability study and in all of our disclosures very clearly identified it as a study using an online panel.  We reviewed our methodology statement and noticed an inadvertent declaration of sampling error.  We have updated our methodology statement on this survey to emphasize that this was a panel study to represent how the survey was conducted.

The AMA release, last updated on March 23, has been revised as noted in Fako's email.  It no longer describes the survey as a "random sample," nor does it claim the survey has a "margin of error."  At the same time, the release includes nothing to indicate that a correction has been made or that it has been changed from its original version.  Anyone visiting that website today might wrongly conclude that the claim of "random sampling" was the invention of AP reporter Lindsey Tanner.

[Image: AMA methodology statement as revised March 23]


Does the AMA still consider the survey "generalizable" to the population of 17-35 year old women?   Neither Fako's email nor the corrected methodology statement above makes that clear.

 

[Update: As noted by reader TM Lutas in the comments, despite the correction, the news page on the Fako &  Associates web site continues to point to a PDF of a USA Today article on the survey that includes reference to the survey's "margin of error"].

Frustrated with the lack of response from the AMA, which subsequently stopped returning his calls, Zukin shared his correspondence with me.  When I asked why he wanted to go public with the dispute, Zukin replied that he "first tried to respond quietly by calling the AMA Communications Director, hoping they would voluntarily issue a clarification of their earlier release."  He also sent the New York Times a Letter to the Editor, which so far has not been published.  He continued: 

I did not want to let the issue die, however without an attempt at a wider discussion/circulation of the problem...There just is a strong sentiment within our profession, at least as represented on [AAPOR's Executive] Council, that we need to address these rogue opt-in surveys that masquerade as probability samplings and report their results with a margin of sampling error.  SO, I turned to MP.

I have emailed both Janet Williams and Dave Fako asking for comment, and they have not responded as of this posting. 

I hate to end on a bit of a cliff-hanger (no pun intended), but there is more to this story.  I will continue with Part II tomorrow. The full text of the exchange between AAPOR President Cliff Zukin and the AMA follows on the jump.

Interest Disclosed:  I am an AAPOR member and currently a nominee to chair AAPOR's Publications and Information committee.

UPDATE: Continues with Part II.

From: Cliff Zukin
Sent: Tuesday, March 14, 2006 4:20 PM
To: Janet Williams 
Cc: 'Nancy Mathiowetz'
Subject: spring break AMA survey

Janet,

Thank you for leaving a message at my house. 

I did follow up as Mary Kaiser suggested and looked at the methodology statement on the web site.  I really would like to get some additional information.

I am most interested in how the random sample of 644 women was selected. This would include the mode of interviewing, sampling frame, eligibility and selection criteria, and the response rate.  I guess I would basically like the basic information called for in the professional code of disclosure. You can find this at aapor.org, but I'll email the relevant section under separate cover.

Thanks in advance.

Cliff Zukin

---------------------------------------------------------------------

From: Janet Williams
Sent: Tuesday, March 14, 2006 4:44 PM
To: Cliff Zukin
Cc: Nancy Mathiowetz
Subject: RE: spring break AMA survey

The poll was conducted in the industry standard for internet polls - this was not academic research - it was a public opinion poll that is standard for policy development and used by politicians and nonprofits.  I guess I am curious as to why you are interested in the methodology for this poll and not for the other two polls I conducted in the exact same way:  alcopops (which surveyed girls and women) (Dec. 2004) and social source suppliers of alcohol (June 2005).

Janet Williams
Deputy Director
Office of Alcohol, Tobacco and Other Drug Abuse
American Medical Association

---------------------------------------------------------------------

From: Cliff Zukin
Sent: Tuesday, March 14, 2006 9:14 PM
To: Janet Williams
Cc: 'Nancy Mathiowetz'
Subject: RE: spring break AMA survey

Janet,

I'm very troubled by this methodology.  As an op-in non-probability sample, it lacks scientific validity in that your respondents are not generalizable to the population you purport to make inferences about.  As such the report of the findings may be seriously misleading.  I do not accept the distinction you make between academic research and a "public opinion" survey.  Moreover, this is not the standard used in policy research, by non-profits or even by politicians.   Surveys are either conducted according to sound methodological practices, or not.  Just as the AMA has standards for sound medical practice, AAPOR has standards for sound opinion research.

I believe this to be true generally, and I think there is an even greater responsibility when research findings are put into the public domain.  I will discuss this with AAPOR's standards chair and committee; we may ask the AMA to issue a clarifying statement.

Cliff Zukin

---------------------------------------------------------------------

From: Janet Williams
Sent: Wednesday, March 15, 2006 10:15 AM
To: Cliff Zukin
Cc: Nancy Mathiowetz
Subject: RE: spring break AMA survey

I have been involved in the development of public policy research for more than 15 years using this company and several others.  We do not make any claims that this is a scientific study and again I ask why did you not have a problem with the other two public opinion surveys I have conducted.  I also am afraid that you are looking at the media coverage and not what we issued.  The purpose was to get some experience info and opinions on how women are portrayed in ads and support for policies.  We are not using this poll to castigate anyone's science or work on alcohol use.  I am very confused by your outrage and have never received any such criticism for the clean indoor air polls I conducted in Illinois and municipalities in my work at the American Lung Association prior to my joining the AMA.

As far as the methodology, it is the standard in the industry and does generalize for the population.  Apparently I need to reiterate that this is not an academic study and will be published in any peer reviewed journal; this is a standard media advocacy tool that is regularly used by the American Lung Association, American Heart Association, American Cancer Society and others. 

I have forwarded your email to our pollster and, if warranted, he will respond. 

Janet Williams

---------------------------------------------------------------------

From: Cliff Zukin
Sent: Wednesday, March 15, 2006 11:26 AM
To: 'Janet Williams'
Cc: 'Nancy Mathiowetz'
Subject: RE: spring break AMA survey

Janet:

I did not respond to any previous research because I was unaware of it. 

I think AMA needs to review what it has done here, including your assertions.  Simply put, statistically, you are wrong.  The methodology is not standard, it is not generalizable to the population.  And, the reporting of a sampling error figure, as you have done in your methods statement is fanciful.  Because of the way you sampled people, with a non-probability sample, there is no way can know about the accuracy of your sampling and error margin.  This is simply without basis in mathematical fact.  100 out of 100 statisticians would tell you that there is no sampling error on a non-probability sample.  It is beyond question that your methodological statement is factually inaccurate and misleading. 

I am also troubled by the fact you actually call this study a "media advocacy tool."  It is unconscionable to put something in the public domain under the guise of a scientific survey when it has such a high potential to be inaccurate and mislead.  Scientific surveys should be done to measure public opinion, not to influence collective opinion.

Giving the benefit of the doubt here, I assume that AMA does not knowingly wish to mislead the public and press.  Now that this might have happened inadvertently, I encourage you to take about what steps the AMA could take on its own to correct the information you have put out.  AAPOR will be discussing this matter later in the week.   

Cliff Zukin

---------------------------------------------------------------------

From: Dave Fako
Sent: Wednesday, March 15, 2006 5:46 PM
To: Cliff Zukin
Subject: Answers to Your Question About AMA Online Survey

Dear Cliff Zukin:

Thank you for your questions about the AMA Spring Break survey.  Like you, I am committed to the credibility of all public opinion research and am dedicated to utilizing rigid standards in all of our research projects.

Janet at the AMA made some incorrect assessments of the methodology that was used for the survey.  I'd like to clarify some of your questions.

This survey was an online panel study, conducted in accordance with professional standards for this type of study.  We do not, and never intended to, represent it as a probability study and in all of our disclosures very clearly identified it as a study using an online panel.  We reviewed our methodology statement and noticed an inadvertent declaration of sampling error.  We have updated our methodology statement on this survey to emphasize that this was a panel study to represent how the survey was conducted.  That updated statement is listed below.  Additional details of how the panel was assembled, maintained and utilized along with disclosure of incentives, etc. have been included in all of our material related to this study. 

This is our updated summary of methodology:

"The American Medical Association commissioned the survey.  Fako & Associates, Inc., of Lemont, Illinois, a national public opinion research firm, conducted the survey online February 27 - March 1, 2006.  A nationwide sample of 644 women age 17 - 35 who are part of an online survey panel who currently attend college, graduated from college or attended, but did not graduate from college, who reside within the United States were surveyed.

The source of the panel is Survey Sampling International's (SSI) Survey Spot Panel.  A strict multi-step screening process was used to ensure that only qualified individuals participated in the survey.  The survey makeup was: 62% women age 17 - 23 and 38% women age 24 - 35.  The survey was conducted in proportion to regional shares of the population based on current census data."

We apologize for any misunderstanding about the survey.  We are committed to conducting legitimate public opinion research that provides our clients with the most accurate data and in-depth strategic analysis of the findings.  We never craft polls to give our clients the answers they want; in fact, we regularly decline to take on clients who ask us to design polls to give them the answer they want; and, we refuse to craft/include questions in our surveys that are not designed to elicit true opinions. 

Fako & Associates has conducted over 500 public opinion/strategic research surveys for political candidates, public policy organizations, corporations and units of government since 1999.  Our record of success and accuracy and repeated use by numerous clients speaks to our commitment to quality and accuracy.

Again, I apologize for any misunderstanding and would be glad to work with you to promote quality and legitimate public opinion research.

I hope this addresses your concerns.

Feel free to ask additional questions.   

Dave Fako

Posted by Mark Blumenthal on March 28, 2006 at 04:33 PM in Internet Polls, Polls in the News, Sampling Error, Sampling Issues | Permalink | Comments (10)

March 27, 2006

Link Roundup

Here is a round-up of some quick links on an otherwise busy day:

  • Prof. Franklin unpacks more analysis on why the Bush job rating seems to indicate week-to-week stability even in the midst of a long-term decline.  Franklin provides another impressive array of charts and data that underscore the comment Robert Chung made here last week:  Given the sampling error associated with a single poll of 1,000 adults (usually ±3%), the odds of seeing a statistically significant change between individual surveys conducted a week or two apart are quite long.   

Of course, keep in mind that Franklin's conclusions make the most sense for a mostly stable attitude like a president's job approval rating.  By the time a president takes office, most voters have formed an impression that tends to change slowly.  Impressions of heretofore unknown political candidates can change much more rapidly, especially in the midst of a multi-million dollar advertising blitz.  As a campaign pollster, I have seen very large (double-digit) changes in candidate favorability and vote choice in those sorts of races within the span of a few weeks. 

  • For those following the Israeli elections, Franklin also has the most complete graphical summary of Israeli poll results I have seen anywhere.  Franklin's tracking indicates that as Israelis go to the polls, the wide lead held by the Kadima party has narrowed over the last month.  Kadima's support has fallen roughly five percentage points, while the new right-wing party Yisrael Beiteinu has seen its support increase by roughly the same amount (although it is still running a few points behind the Labor and Likud parties). 
  • Finally, last week I neglected to link to the latest Diageo-Hotline poll (press release, results).  This month's survey sampled only registered Republican voters and took an in-depth look at attitudes about the potential Republican candidates for President.  Next month, they will do the same for Democrats.   By "over-sampling" partisans, they allow for a sub-group analysis that gets us closer to the views of the much smaller population of likely primary voters and caucus participants who will eventually select each party's nominee. 

This general approach is something MP wishes other pollsters would emulate, as it provides helpful insights into the coming 2008 campaign.  For example, as noted by the Hotline's editors, the poll shows John McCain doing best among the few Republicans who disapprove of George Bush's performance as president (getting 37% to Rudy Giuliani's 21%).  Yet among Republicans who strongly approve of Bush's performance, McCain gets only 17% to 24% for Giuliani. As Hotline editor Chuck Todd speculates, perhaps the most "adamant Bush supporters are picking Giuliani in polls because they think he's the candidate most supportive of Bush right now" (hat tip: Sullivan). 

At some point in the not too distant future, MP hopes to explore the big challenge involved in selecting "likely voters" for the presidential primaries.  For now, Hotline's polling editor Aoife McCarthy provides helpful details on the questions used to define "registered Republicans." 

PS:  While MP greatly appreciates the focus of the Diageo-Hotline poll on the 2008 primary battles, he remains a skeptic about the results of the question (Q19) purporting to show that the most popular blog among Republican voters is Anderson Cooper's 360 Blog.  It leads the next most mentioned blogs -- those perennial conservative favorites Daily Kos and AmericaBlog - by a 7 to 1 margin.  McCarthy labors to offer possible explanations for these improbable results, but from my perspective the blog question still looks to be suffering from some sort of programming glitch. 

Posted by Mark Blumenthal on March 27, 2006 at 03:21 PM in Likely Voters, Miscellanous, President Bush | Permalink | Comments (1)

March 24, 2006

New Low or Steady? Franklin's Answer

Last week, in looking at the graph of the Bush approval rating data as plotted by our friend Prof. Charles "Political Arithmetik" Franklin, I asked what seemed like a straightforward question:  Did the recent changes represent an abrupt downward shift in late February that has since leveled off, or a slow, gradual decline, as indicated by Franklin's trend line?  Gallup's Frank Newport, examining only Gallup's own data, seemed to agree that the decline had leveled off.  Yesterday Franklin posted a long and incredibly detailed answer that indicates that the decline has continued into March.  Of course, as always, what the trend will look like going forward is anyone's guess.

Franklin's high-powered analysis is not for the statistically faint of heart, and his approach to attacking my question is as wonky as this sort of thing gets.  But the bottom line is that no matter how Franklin crunches and plots the data, he still sees "continued downward movement and not stabilization."  He also finds that the "house effects" of the polls conducted in recent weeks ("the tendency of different polling organizations to find approval ratings higher or lower than the average") may be contributing to my perception of stability in recent weeks.  "With house effects removed," Franklin sees "no evidence that the decline in approval has hit a steady state."
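For readers curious what "removing house effects" means in practice, here is a deliberately simplified sketch.  It is not Franklin's actual model (his is far more sophisticated and also accounts for the trend over time); the poll numbers below are invented purely to show the mechanics.

```python
# Toy house-effect adjustment: estimate each pollster's average deviation
# from the all-poll average over the same period, then subtract it.
# (A real analysis must also control for when each poll was taken.)
polls = [
    ("Gallup", 37), ("Gallup", 36), ("Gallup", 38),   # invented readings
    ("Fox",    39), ("Fox",    40),
    ("CBS",    34), ("CBS",    33),
]

overall = sum(v for _, v in polls) / len(polls)
house_effect = {
    h: sum(v for p, v in polls if p == h) / sum(1 for p, _ in polls if p == h) - overall
    for h in {p for p, _ in polls}
}
adjusted = [(h, v - house_effect[h]) for h, v in polls]

print(house_effect)   # e.g. Gallup ~ +0.3, Fox ~ +2.8, CBS ~ -3.2
print(adjusted)       # the same readings with each house's typical offset removed
```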

Of the nine different graphics in Franklin's post, my favorite is this "low tech" box plot, which shows the range, the median (the heavy black line in the middle of most boxes), and the 25th and 75th percentiles (the box) of all the polls conducted each week since the beginning of the year, with house effects removed.  While the boxes for the last three weeks overlap, the medians show a small continuing decline over the last three weeks. 

[Image: Franklin's box plot of weekly Bush approval polls, house effects removed]


Two final thoughts:  First, as Franklin stresses, the recent trend tells us nothing about what the future holds.

[M]y models here are not intended to predict future events. My goal is to clarify what the data show has happened. I'm pretty sure approval has continued down.

Second, in a comment on one of my posts on this subject earlier in the week, Robert Chung makes an important point about the limited statistical power of individual surveys:

I'd expect that two polls taken by the same firm only a few days apart with a reporting resolution of only 1 percent *ought* to look pretty much the same...Basically, with sample sizes of roughly 1000, it would be unusual (but not impossible) for any two closely adjacent polls to be significantly different. Because Gallup tends to poll more frequently than others (see http://anonymous.coward.free.fr/polls/gross-vs-net.html ), its time interval between polls tends to be shorter. That's the kind of situation where *even if approval were changing smoothly* you'd expect to see step-like behavior.
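Chung's point is easy to confirm with a quick calculation.  The sketch below is MP's own back-of-the-envelope version, using round numbers rather than any particular pair of Gallup polls:

```python
import math

# Two hypothetical polls of 1,000 adults each, taken about a week apart.
n1 = n2 = 1000
p1, p2 = 0.38, 0.36   # say, 38% approval followed by 36%

# Standard error of the difference between two independent proportions
se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
print(f"SE of the difference: {se_diff:.3f}")               # ~0.022
print(f"95% significance threshold: {1.96 * se_diff:.3f}")  # ~0.042, i.e. about 4 points

# A 2-point drop falls well inside the ~4-point threshold, so two adjacent
# polls will rarely register a "significant" change even while a real,
# gradual decline is under way -- exactly the step-like pattern Chung describes.
```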

Again, Franklin's discussion gets pretty wonky, but for all of us who consider ourselves students of data analysis, it's worth reading in full.

PS:  While visiting Franklin's blog, don't miss his similarly powerhouse look at changes in party identification over the last year, the second in a series on party ID.  He shows a year-long trend toward decreased Republican identification mirrored by an increase in independents that is consistent across the surveys conducted by 11 different pollsters. 

Posted by Mark Blumenthal on March 24, 2006 at 07:05 AM in President Bush | Permalink | Comments (2)

March 23, 2006

On Random Sampling and NYT Katrina Evacuee Study

An article in yesterday's New York Times on interviews conducted among Katrina evacuees raises some very important questions about how to use a survey-like study that is not quite a projective survey.  In this case, in MP's view, the Times study was an appropriate effort to provide the best information available about a population essentially out of reach of true random sampling.  But the example raises some useful general questions:  When do we cross the line between the appropriate and inappropriate use of a study that cannot project the views of the population of interest with statistical precision?  When do unavoidable compromises so degrade the sampling process that, as one thoughtful MP reader RS put it, "the concept of 'best available' ceases to be meaningful?"  Unfortunately, these are not questions with easy answers.

A bit of background:  A few weeks ago, in discussing the Zogby survey of US troops in Iraq, I wrote the following:

In the business world, commercial market researchers sometimes use non-random sampling (including many Internet based "panel" surveys) when rigorous probability samples are impractical or prohibitively expensive.  However, the most ethical of these market researchers do not attempt to dress up such "convenience" samples as more than they are.  Their clients pay for such projects on the assumption that the information obtained, while imperfect, is the best available.

Although obviously not market research, the Times Katrina evacuee study is just such a project.  As described in the article and a "How the Study Was Conducted" sidebar, the Times researchers had access to the Red Cross database of 160,000 evacuees.  They drew a random sample of that database and "conducted interviews by telephone" and used "standard survey methods in asking the questions and recording the answers."

Yet the Times researchers made it clear that they did not consider this "interview project" to be a "scientific poll:"

The study differs from a scientific survey in that the total population of evacuees is unknown and therefore could not be sampled precisely or randomly.

Three of every five of those interviewed lived in New Orleans before the hurricane; the rest are originally from the area surrounding the city. Almost two-thirds of the study's participants were black and about a third were white - similar to Census Bureau figures for the city's population. But the participants were older and slightly more likely to be women than the city's population. (The precise racial, age and sex breakdown of all evacuees is not known.)

Because the answers from the 337 respondents in the study cannot be projected to a definable population, no margin of sampling error can be calculated. For that reason, the accompanying article avoids giving specific figures in most cases.

One important point:  When a survey can be projected to a larger population, it is appropriate to use the results to describe that population ("48% of Americans believe," "53% of voters approve," etc.)  But in this case, the Times authors are mostly careful to characterize the results as describing only those interviewed, not the larger population of those interviewed.  For example:

Fewer than a quarter of the participants in the study have returned...

[M]ost of those interviewed favor...

The blacks interviewed were more likely...

[W]hile a majority of whites and blacks reported...

4 in 10 of those interviewed said...

Another quarter of those interviewed...

Yet oddly, for all the caution exhibited in the body of the article, the lead of the Times report does tend to go ahead and characterize all evacuees:

Nearly seven months after Hurricane Katrina flooded New Orleans and forced out hundreds of thousands of residents, most evacuees say they have not found a permanent place to live, have depleted their savings and consider their life worse than before the hurricane, according to interviews with more than 300 evacuees conducted by The New York Times.

Is the Times article an appropriate effort to provide the best information available in a situation where a true random sampling was impossible or, as reader PZ wondered, did they go ahead and use a "non-survey" inappropriately to "[draw] lots of conclusions on Katrina victims as if it was a survey?" 

In this case, MP thinks the Times made the right call.  They tell us that "fewer than a quarter" of their respondents "have returned to the same house they were living in before the hurricane."  Based on these and other findings, it seems reasonable to conclude that "most evacuees" have not found permanent housing, especially since the study's methodology probably underrepresented the poorest evacuees.  As the Times article points out:

Some of the poorest people in New Orleans were not included in the project, in part because they did not have access to the database or posted the telephone numbers of emergency shelters where they left no forwarding information.

Although I think the Times handled this issue well, the general questions about the use of non-random sampling are much harder to resolve.  Because the Times did not include "specific numbers" in their report, their study is less likely to produce a percentage that takes on a life of its own (as often happens when a dramatic number enters the public discourse).  Yet even in this case, those who read the Katrina study and cite its findings are not likely to make the same fine distinctions about its ability to "project" the views of all evacuees as the Times authors.

Earlier in the week, before the Times released its Katrina study, I received an email from reader RS who asked a very perceptive question about when the concept of the "best information available" ceases to be meaningful: 

I encounter this [problem] all the time because my research often involves C-suite executives, ultra high net worth individuals, corporate directors and the like.  Sampling is a bitch with populations like these and I almost always recommend that my clients use qualitative research [focus groups] instead.  As you can imagine, however, I often come under significant pressure to field a quantitative study - with my client promising to take the results "only for what they are worth" - only to find myself discussing cross-tabs the size of which wouldn't fill a closet.  So, I am not a real believer in "best available" data, because I don't think clients (let alone journos) really can be expected to appropriately limit its use.

I wish I had an easy answer for this one.  The question gets to the heart of an issue that the survey research profession -- and consumers of its data -- will face more and more as declining coverage and response rates further degrade our ability to interview random samples by telephone.  In many ways, this question underlies many of the controversies we discuss here at MP.  The best general advice I can offer data consumers is to get into the habit of reading the methodology section first and asking:  Who was interviewed, how were they selected, and how well did the selection process represent the population of interest?

[Typos corrected]

Posted by Mark Blumenthal on March 23, 2006 at 08:09 AM in Polls in the News, Sampling Issues | Permalink | Comments (0)

March 21, 2006

Gallup CNN Break-Up

And speaking of Gallup...

Here at MP, we are not prone to posting gossipy items about the polling industry, largely because what little gossip MP hears is, well... not worth posting.  Today, however, thanks to TVNewser (via Hotline On Call), we have news that Gallup and CNN are "breaking up."  And it's not pretty:

In a memo dated Wednesday, March 15, CEO Jim Clifton wrote: "We have chosen not to renew our contract with CNN. We have had a great relationship with CNN, but it is not the right alignment for our future."

"CNN has far fewer viewers than it did in the past, and we feel that our brand was getting lost and diluted," Clifton continued. "...We have only about 200,000 viewers during our CNN segments."

Gallup no longer wants a broadcast partner, according to the memo. "We are creating our own program and we don't want to be married" to one network, Clifton wrote. Analysts like Frank Newport will be seen as more independent under the new arrangement, he added.

"We have offered to help CNN find a new polling partner and to be as helpful as we can during this transition," Clifton concluded. Gallup IS renewing its deal with USA Today. The newspaper has about 10 million readers per day, the memo noted.

As the Hotline's Aoife McCarthy put it, "ouch." 

In response, CNN issued a statement to TVNewser and the Hotline:

Jim Clifton's statements are not only unprofessional but in every respect untrue.

Jim Walton actually spoke with Jim Clifton, CEO of The Gallup Poll, and was told by Mr. Clifton that the reason that Gallup wanted to end their partnership was that the CNN brand was so dominant that Gallup wasn't getting the attention for the polls that they wanted.

We want to make it clear that the decision to not renew our polling arrangement had to do with Gallup's desire to produce their own broadcasts and not about CNN viewership figures. In fact, Gallup had negotiated with us for four months in an effort to extend the partnership.

While we appreciate that Gallup does not wish to have any broadcasting partner for the future, I must note that CEO Jim Clifton's excuse to his employees for ending the relationship has no basis in fact. It shows ignorance of not only our viewership figures but of the reach and value of the CNN brand.

Domestically, our viewership was grossly misstated in his comments. CNN's average monthly reach in 2005 was 66.7 million, far and away the No. 1 source for cable news.

There's more - see the two posts on TVNewswer for the details.

Ouch indeed.

PS:  Wonkette's new editors have their own unique take here

Posted by Mark Blumenthal on March 21, 2006 at 07:23 PM in Polls in the News, Pollsters | Permalink | Comments (3)

The Perils of Double Negatives

Gallup's David Moore has posted an analysis this morning (free for today only) with an important lesson on how to write - or perhaps, how not to write - survey questions.  The lesson:  Double negatives confuse respondents.  Put another way, when a question asks respondents if they favor a negative or oppose a positive, confusion (and thus "measurement error") is an inevitable result. 

Two weeks ago, on a survey conducted March 10-11, Gallup asked a question that produced an odd result.  Nearly two-thirds (64%) of their sample of adults expressed opposition to "a bill that would prevent any foreign-owned company from owning cargo operations at seaports in the United States" (emphasis added).  This result was at odds with a number of other polls showing overwhelming opposition to the Dubai ports deal.  Suspecting that the double negative in the question may have confused some respondents, Gallup followed up with an experiment on a subsequent survey (conducted March 13-16) that split the sample into two random halves and asked two slightly different versions of the question.

The first version repeated the question as originally asked (n=502):

Would you favor or oppose a bill that would prevent any foreign-owned company from owning cargo operations at seaports in the United States?
38% favor
58% oppose
4% no opinion

The second question changed the negative phrasing to a positive:   

Would you favor or oppose a bill that would allow only U.S. companies to own cargo operations at seaports in the United States?
68% favor
25% oppose
7% no opinion

Thus, changing the words "prevent any foreign-owned company" to "allow only U.S. companies" produced a very different result even though both questions asked the same thing.  Moore's analysis explained the likely reason:

[T]he results suggested that as people were listening [on the phone], they may have missed that the [original] question was referring to a bill to prevent foreign-owned companies from owning cargo operations in U.S. seaports. If people were against foreign-owned companies, they would have had to say they were in favor of the bill. That seemed like a double negative from a linguistic perspective, and thus likely to confuse. Also, it was possible that some people might have heard the "foreign-owned companies" part of the question, and not the "bill to prevent" part. So, when respondents said they were "opposed," they may have meant they were opposed to the foreign-owned companies, not the bill itself.

Moore's analysis speculates that doubt remains about "the exact percentage of Americans who would support the proposed bill," largely because on other questions an overwhelming majority of Americans would support allowing Great Britain to own seaport cargo operations. 

It is worth reading in full for that reason, but more importantly for helping to reinforce a classic lesson:  Double negatives confuse respondents.  Pollsters should avoid them.
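For the statistically inclined, a quick check (MP's own arithmetic, using the n of 502 reported for the first half-sample and assuming the second half was about the same size) confirms that the 30-point gap between the two wordings is far too large to be sampling noise:

```python
import math

# Percent favoring the restriction, by question wording (Gallup split-sample experiment)
n1, p1 = 502, 0.38   # "prevent any foreign-owned company" wording
n2, p2 = 502, 0.68   # "allow only U.S. companies" wording (n assumed roughly equal)

# Standard two-proportion z-test on the difference in "favor"
p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p2 - p1) / se
print(f"z = {z:.1f}")
# -> z = 9.5 or so; a difference this large on half-samples of ~500 could not
# plausibly arise from chance, so the wording itself drove the gap.
```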

Posted by Mark Blumenthal on March 21, 2006 at 08:20 AM in Measurement Issues | Permalink | Comments (1)

March 20, 2006

Gallup: "Bush Approval Steady at 37%"

Helping to underscore the question I asked at the end of Friday's post, Gallup has released another national survey this morning that shows no change in the Bush job rating over the last week:

The March 13-16 Gallup Poll pegs his approval at 37%. This rating is virtually unchanged from the 36% measured over the March 10-12 weekend, and not statistically different from two February polls prior to that.

In general, it's clear that the public's assessment of Bush's job performance has not undergone a dramatic free fall, as some may think. Gallup's regular assessment of president job approval ratings allows us to determine that Bush's ratings have remained remarkably constant for significant periods. The pattern of late has been marked by periods with virtually identical ratings, followed by small shifts and then another period of stability.

The Gallup report is free to all for today.  Read it all.

Posted by Mark Blumenthal on March 20, 2006 at 09:26 AM in President Bush | Permalink | Comments (3)