More Divergent Than They Should Be?


Picking up where I left off in the last post, let’s start with the basic theory of random sampling. If we draw a series of perfect random samples, the results for any question will vary from survey to survey in a very predictable way. The pattern of variation for any given result should resemble a “normal” or bell-shaped curve: some percentages will be higher, some lower, but most will cluster near the true value.

The “margin of error” translates the normal curve into probabilities. The shape of the curve means there are many different margins of error, depending on how certain we want to be. If we drew repeated samples of 1,000 interviews, for example, 95% of them (19 of 20) would produce a result within +/- 3.1% of the value for the entire population; 80% (16 of 20) would fall within +/- 2%; and half (10 of 20) would fall within +/- 1.1%.
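The margins quoted above can be reproduced from the normal approximation. What follows is only a minimal sketch: it assumes a simple random sample and a true value near 50% (the worst case for sampling error), and the helper name `margin_of_error` is hypothetical, not something from the original post.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, confidence, p=0.5):
    """Half-width of the sampling-error interval for a proportion.

    Assumes a simple random sample of size n and the normal
    approximation; p=0.5 is the worst case (largest margin).
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


for confidence in (0.95, 0.80, 0.50):
    moe = margin_of_error(1000, confidence)
    print(f"{confidence:.0%} of samples fall within +/- {moe:.1%}")
```

For 1,000 interviews this prints margins of 3.1%, 2.0% and 1.1%, in line with the figures above.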

Apples to Apples

Now let’s look at some real data. The table below shows the results of 11 national surveys of self-described “registered voters” conducted since the Republican convention. For the sake of argument, let’s assume that voter preferences have not changed one iota over the last two weeks (unlikely), that all of the surveys used identical methodologies and question wordings (far from it), that each is a perfect random sample of registered voters (hardly) and that each poll surveyed 1,000 registered voters (some surveyed fewer). Let’s also make the leap that the average of all the surveys represents the “true” preferences of all registered voters.

 

 

 

 

 

| Poll | Date | Registered voters (N) | +/- | Bush | Kerry | Nader | Bush margin |
|---|---|---|---|---|---|---|---|
| IBD/TIPP | 9/14-18 | 894 | 3.3% | **43%** | 42% | 2% | 1% |
| CBS/NYT | 9/12-16 | 1,088 | 3.0% | 50% | 41% | 3% | 9% |
| Gallup/CNN/USAT | 9/12-15 | 935 | 4.0% | 50% | 42% | 4% | 8% |
| Pew | 9/11-14 | 1,002 | 3.5% | 46% | **46%** | 1% | 0% |
| ICR | 9/8-12 | 868 | 3.3% | 48% | 44% | 3% | 4% |
| Pew | 9/8-10 | 970 | 3.5% | 52% | 40% | 1% | 12% |
| Newsweek | 9/9-10 | 1,003 | 4.0% | 49% | 43% | 2% | 6% |
| Time | 9/7-9 | 1,013 | 3.0% | 50% | 39% | 4% | 11% |
| AP-IPSOS | 9/7-9 | 1,286 | 2.5% | 51% | 43% | 2% | 8% |
| ABC/WashPost | 9/6-8 | 952 | 3.0% | 50% | 44% | 2% | 6% |
| CBS | 9/6-8 | 909 | 3.0% | 49% | 42% | 1% | 7% |
| Average (all) | | 993 | | 49% | 42% | 2% | 7% |
| Average (Sept 11-18) | | 957 | | 47% | 43% | 3% | 5% |

Results from PollingReport.com and Rasmussenreports.com

We have 11 polls, and therefore 22 estimates of support for either Bush or Kerry. Based on chance alone, we would expect 95% of the estimates (roughly 21 of 22) to fall within a range of +/-3%; that’s a range of 46% to 52% for Bush and 39% to 45% for Kerry. As the table shows, two estimates (in bold) fall outside that range, where by chance alone we should have seen only one.
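The 95% tally can be checked directly against the registered-voter table above. The sketch below is illustrative only: it uses the rounded percentages as printed, treats the rounded averages (49% and 42%) as the “true” values in the same spirit as the assumptions above, and the helper name `count_outside` is hypothetical.

```python
# Bush and Kerry percentages from the registered-voter table above,
# in table order (IBD/TIPP first, CBS 9/6-8 last).
bush = [43, 50, 50, 46, 48, 52, 49, 50, 51, 50, 49]
kerry = [42, 41, 42, 46, 44, 40, 43, 39, 43, 44, 42]


def count_outside(values, center, half_width):
    """Count estimates more than half_width points away from center."""
    return sum(1 for v in values if abs(v - center) > half_width)


# Treat the rounded averages (49% and 42%) as the "true" values.
outside = count_outside(bush, 49, 3) + count_outside(kerry, 42, 3)
total = len(bush) + len(kerry)
expected = 0.05 * total  # 5% of estimates should miss by chance

print(f"{outside} of {total} outside +/-3 (about {expected:.1f} expected)")
```

This counts 2 of 22 estimates outside the range, against roughly 1.1 expected by chance.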

Further, we would expect 80% of these estimates (18 of 22) to fall within a range of +/-2%; that’s between 47% and 51% for Bush and 40% and 44% for Kerry. On the table, 4 estimates (highlighted) fall outside that range, exactly what we would expect by chance alone.

It’s also worth noting that the key outliers in this exercise, the latest Pew and IBD studies, were among the most recently conducted, and that the more recent polls narrow Bush’s average lead slightly (to 5%). If the race has really gotten a few points tighter, then both surveys would fall within the expected range for the narrower result.

Thus, given all the various differences in timing, methodology, question wording, and so on, the variance of survey results falls remarkably close to what we would expect by chance alone. For registered voters at least, a population that is essentially comparable across surveys, the disparity has been mostly about sampling error.

Likely Voters – Apples to Bananas

Of course, the polls of “likely voters” are the ones getting the blame for showing divergent results. Pollsters have good reason for trying to identify likely voters. In the year 2000, the U.S. Census estimated 203 million Americans of voting age, 130 million of whom were registered to vote. Of these, 105 million (80% of registered voters and 52% of adults) cast a ballot. Thus, if we want an accurate forecast, we theoretically want to interview only those 50-60% of adults who will actually vote in November.

The problem, as noted by much of the recent coverage, is that we lack an obvious way to identify truly likely voters, especially since respondents tend to exaggerate their likelihood to vote. Here, pollsters use widely varying methods to identify likely voters. No two likely voter screens are created equal.

Consider the data. The table below, which includes results from 16 recent surveys, shows that while polls of likely voters do show a bit more of a spread than we would expect by chance alone, they do not deviate wildly. Again, if we assume hypothetically that all of these surveys are comparable, that they involve perfect random samples, and that the “true” result is equal to the overall average of all of these surveys, then we would expect 95% of the estimates of Bush’s or Kerry’s vote (30 of 32) to fall within a margin of +/-4%. That amounts to a range of 45% to 53% for Bush and 39% to 47% for Kerry. The actual polls do slightly worse, with 4 of 32 (indicated in bold) falling outside these limits.
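The same kind of tally can be run on the likely-voter figures. This is a rough sketch rather than the method used for the post: it takes the rounded percentages from the likely-voter table in this section, uses the rounded averages as the assumed “true” values, and applies the +/-4% margin stated above. The variable names are illustrative.

```python
# (poll, Bush %, Kerry %) from the likely-voter table in this section.
polls = [
    ("Zogby 9/17-19", 46, 43),
    ("Rasmussen 9/17-19", 49, 45),
    ("IBD/Tipp 9/14-18", 45, 42),
    ("Gallup/CNN/USAT 9/12-15", 54, 40),
    ("Pew 9/11-14", 47, 46),
    ("Democracy Corps 9/12-14", 47, 45),
    ("Harris 9/9-13", 47, 48),
    ("NDN/Penn Schoen 9/9-12", 49, 44),
    ("ICR 9/8-12", 51, 44),
    ("Pew 9/8-10", 54, 38),
    ("Time 9/7-9", 52, 41),
    ("AP-IPSOS 9/7-9", 51, 46),
    ("Zogby 9/8-9", 46, 42),
    ("Democracy Corps 9/6-9", 48, 45),
    ("FOX 9/7-8", 47, 43),
    ("ABC/WashPost 9/6-8", 52, 43),
]

# Rounded averages stand in for the assumed "true" values.
bush_avg = round(sum(b for _, b, _ in polls) / len(polls))
kerry_avg = round(sum(k for _, _, k in polls) / len(polls))

MARGIN = 4  # the +/-4 point range used for the 95% check

# Collect every estimate that misses its average by more than MARGIN.
outliers = [
    (name, candidate, pct)
    for name, b, k in polls
    for candidate, pct, avg in (("Bush", b, bush_avg), ("Kerry", k, kerry_avg))
    if abs(pct - avg) > MARGIN
]

print(f"Averages: Bush {bush_avg}%, Kerry {kerry_avg}%")
print(f"{len(outliers)} of {2 * len(polls)} estimates outside +/-{MARGIN}:")
for name, candidate, pct in outliers:
    print(f"  {name}: {candidate} {pct}%")
```

On these rounded figures the averages come out to 49% and 43%, and four estimates fall outside the range: Bush’s 54% in the Gallup and 9/8-10 Pew polls, Kerry’s 48% in the Harris poll, and Kerry’s 38% in the 9/8-10 Pew poll.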

 

 

 

 

 

| Poll | Date | Likely voters (N) | +/- | Bush | Kerry | Nader | Bush margin |
|---|---|---|---|---|---|---|---|
| Zogby | 9/17-19 | 1,066 | 3.1% | 46% | 43% | 1% | 3% |
| Rasmussen | 9/17-19 | 3,000 | 2.0% | 49% | 45% | 2% | 4% |
| IBD/Tipp | 9/14-18 | 650 | 4.0% | 45% | 42% | 2% | 3% |
| Gallup/CNN/USAT | 9/12-15 | 767 | 4.0% | **54%** | 40% | 3% | 14% |
| Pew | 9/11-14 | 725 | | 47% | 46% | 1% | 1% |
| Democracy Corps | 9/12-14 | 1,003 | 3.1% | 47% | 45% | 3% | 2% |
| Harris | 9/9-13 | 803 | 4.0% | 47% | **48%** | 2% | -1% |
| NDN/Penn Schoen | 9/9-12 | 800 | 3.5% | 49% | 44% | 3% | 5% |
| ICR | 9/8-12 | 758 | 3.6% | 51% | 44% | 3% | 7% |
| Pew | 9/8-10 | 745 | | **54%** | **38%** | 2% | 16% |
| Time | 9/7-9 | 857 | 4.0% | 52% | 41% | 3% | 11% |
| AP-IPSOS | 9/7-9 | 899 | 3.5% | 51% | 46% | 1% | 5% |
| Zogby | 9/8-9 | 1,018 | 3.1% | 46% | 42% | 2% | 4% |
| Democracy Corps | 9/6-9 | 1,004 | 3.1% | 48% | 45% | 4% | 3% |
| FOX | 9/7-8 | 1,000 | 3.0% | 47% | 43% | 3% | 4% |
| ABC/WashPost | 9/6-8 | 700 | 3.5% | 52% | 43% | 2% | 9% |
| Average (all) | | 1,007 | 3.4% | 49% | 43% | 2% | 6% |
| Average (Sept 11-18) | | 1,032 | 3.4% | 48% | 44% | 2% | 4% |

Results from PollingReport.com and Rasmussenreports.com

Among those falling within the 95% range, the spread is still wider than expected by chance. At an 80% confidence level, for example, we would expect roughly 6 of 32 estimates to fall outside the range of +/-2%; that’s between 47% and 51% for Bush and 41% and 45% for Kerry. On the table, 11 estimates (highlighted) fall outside that range.
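One rough way to gauge “wider than expected by chance” is a binomial tail probability: if each estimate independently had a 20% chance of landing outside the 80% range, how often would 11 or more of 32 do so? The sketch below makes that independence assumption explicitly. It is only an approximation, since the Bush and Kerry figures from the same poll are not independent of each other, and the helper name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 32 estimates; each assumed to have a 20% chance of falling
# outside the 80% range, independently of the others.
p_80 = prob_at_least(11, 32, 0.20)

# Same idea for the 95% range: 4 of 32 outside, 5% chance each.
p_95 = prob_at_least(4, 32, 0.05)

print(f"P(11 or more of 32 outside the 80% range) = {p_80:.3f}")
print(f"P(4 or more of 32 outside the 95% range)  = {p_95:.3f}")
```

Under these simplified assumptions both probabilities come out below 10%, so the observed spread is somewhat unusual for pure sampling error, though not dramatically so.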

So the main point: The differences between polls, especially the methods of identifying likely voters, do create differences beyond sampling error, but those differences are small, perhaps just a few percentage points. Much of the variation still comes from random chance.

Note one more telling point: Contrary to the conventional wisdom, the overall averages since early September show Bush with essentially the same margin among the polls of likely voters (+6) as among the polls of registered voters (+7). The same pattern holds, with slightly narrowed margins, over the last 10 days.

[Continue this series with, “So What Should a Junkie Do?”]

Mark Blumenthal
