Sean Oxendine's blog

A Quick Note About Zogby Showing McCain Ahead

I wouldn't read too much into it.  It's a three-day track of 1200 likely voters, which means we're 95% confident the "true" value for McCain that night is somewhere between 53% and 43%.  This isn't inconsistent with what we're seeing in other polling.  And even then, one out of every twenty polls is going to be an outlier.
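For a rough sense of where a ten-point window like that comes from, here is a minimal sketch of the standard margin-of-error formula. It assumes simple random sampling, a share near 50%, and that one night of a three-day, 1,200-voter track is about 400 interviews; the function name is mine, and none of this is Zogby's actual method.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the ~95% interval for a share p measured on n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# Assumption: each night contributes an equal third of the three-day sample.
one_night = 1200 // 3
moe = margin_of_error(one_night)
print(f"{one_night} interviews: +/- {moe * 100:.1f} points")
```

Under those assumptions a single night carries a margin of roughly five points either way, which is why one night's number can swing without the race itself moving.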

If you're looking at good news, I'd take a gander at the trends in Pennsylvania, and the fact that both campaigns are visiting Iowa this week.

Baby Yoda Says

Happy Halloween!

I know this has absolutely nothing to do with politics or elections or movement-building, which is why I'm leaving it in the diaries and not putting it on the front page (if one of the "Big 3" wants to front-page it, they're welcome to do so).  But I think in this time of high emotion it's important to step back and remember that no matter what happens Tuesday, the vast majority of us will still get up in the morning on Wednesday, go about our business, and come home to our usual lives.  And I think it's likewise important to remember that the vast majority of us, whether paleo-con, neo-con, libertarian-con (or our visitors from the various dimensions of the other side of the political spectrum), do what we do politically out of a genuinely held belief that what we are doing is for the good of our families, present or future.  I also think that in any online community, such as the one we're trying to build here, it is important to take some time to talk about something other than politics.

So take a moment to talk something other than politics.  Big plans for the night? Children got fun costumes?  You got a fun costume?  Unfortunately Baby J is too sick to trick-or-treat, but at least we got the following picture from pre-school yesterday!

 

[Photo]

A Tale Of Two Elections

Jay Cost has recently had two good posts up about the recent polls, here and here.  They’re both well worth the read, but the gist is that the polls are showing variance that can’t be explained just by sampling error.   

Instead, pollsters seem to have two different views of what the electorate will look like.  Indeed, most of the movement that has occurred in the polling average can be explained by different pollsters entering and exiting the average.  Last week the average was at roughly a six-point Obama advantage, but that quickly changed with the addition of the NBC/WSJ poll and the CBS/ poll.  The polls themselves didn’t change much (the seven tracking polls that were in the average at this point have barely moved in their average), but the pollsters in the average have.

So what we’ve seen is that some pollsters, such as IBD/TIPP and Battleground, have fairly consistently shown a 3-point race or so.  Other pollsters, such as CBS/NYTimes, have consistently been at the high end of the spectrum.  Rasmussen has more consistently been in the middle, with an Obama lead of 6-8 points.

We see a similar effect in some states. Florida, Minnesota, New Hampshire, and Ohio all have had a number of polls in recent weeks outside of each other’s error margins. Rasmussen and Quinnipiac have consistently been on opposite ends of the Ohio spectrum.  This is complicated by the fact that individual states tend to be polled less frequently than the nation as a whole, tend to draw less reputable or less experienced pollsters (Big 10 Battleground anyone?), and tend to have smaller samples, making it more difficult to sniff out outliers.

The upshot of this is that I’m not sure polling averages will be all that useful this year.  Poll aggregation assumes that the pollsters are working off of basically the same script, and that you can thereby treat them as a single giant sample.  This has the advantage of cancelling (or at least reducing the effect of) the outliers. 
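To make the "single giant sample" idea concrete, here is a minimal sketch of pooling, assuming every pollster is measuring the same electorate with the same method. The poll numbers are made up for illustration, and weighting by sample size is one simple choice, not how any particular average is actually computed.

```python
def pool_polls(polls):
    """Combine (share, sample_size) pairs into one pooled share and total n.

    Each poll is weighted by its sample size, as if all respondents
    had been interviewed by a single pollster.
    """
    total_n = sum(n for _, n in polls)
    pooled_share = sum(share * n for share, n in polls) / total_n
    return pooled_share, total_n

# Hypothetical polls: (candidate share, sample size)
example_polls = [(0.50, 1000), (0.44, 500), (0.47, 800)]
share, total = pool_polls(example_polls)
print(f"pooled share {share * 100:.1f}% from {total} respondents")
```

The pooled number is only meaningful if the inputs are draws from the same bag; the sketch has no way to notice when they are not.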

But when the pollsters have fundamentally different views of the electorate for their models, you can’t do this anymore.  If one pollster thinks there will be a massive upsurge in the youth vote, and one pollster doesn’t, and they weight their polls accordingly, then they aren’t working off the same script.  This is always somewhat the case, but I think it’s more pronounced this time, given substantial uncertainty as to the eventual makeup of the electorate.  As I said, given the number of apparently outlying polls that Cost identifies, I think that’s the only conclusion we can draw.  Pollsters just aren’t polling the same election.  You may as well try to aggregate the North Carolina and Virginia polls in order to predict Virginia.

Back in August, I said that the million dollar question this election was the makeup of the electorate.  I think that’s still the million dollar question.  If IBD/TIPP or Battleground have the correct model for the electorate, then we can guess that the state polls showing the closer race are the correct ones.  Given that IBD/TIPP and Battleground don’t poll states, we might even see several results to the right of what all the most pro-McCain polls are showing.  And if that’s the case, a comeback and/or a win for McCain is still possible.

If, on the other hand, Pew has it right, McCain is probably down ten in Ohio, down seven in Florida, down eleven in Virginia, and down five in North Carolina.  In which case, it really is all over but the shouting.  And we can be certain there will be plenty of that. 

But the bottom line is that this isn’t likely to be an election where all the pollsters can write off their error as being the error margin.  Someone is going to be really, really wrong.

2006 And The Bradley Effect

The debate about the Bradley Effect is flaring up again, as witnessed in this back-and-forth between Bill Greener and Nate Silver.  For those just tuning in, Silver and I have had our own back-and-forth about whether the effect occurred in the 2008 primaries.  See here, here, and here.  In short, I believe there is evidence to support something that looks an awful lot like the Bradley effect in the Democratic primaries. Silver does not.

To be honest, as of right now the whole debate about the Bradley Effect is kind of silly, given that Obama is above 50% in enough swing states to win the election even if McCain wins every undecided.  And there is something surreal about liberal commentators protesting that America has substantially moved on past race in the last 20 years, while conservatives argue that we've barely evolved racially.

Nonetheless, this is of some academic interest, and if polling models are overstating turnout among the youth and Democrats (or understating Republicans), and the electorate looks more like what IBD and/or Battleground have been seeing, it may well be relevant. 

Overall Numbers

I think Silver’s criticism that Greener cherry-picked polls to make his best argument is a fair one.  Nonetheless, overall his argument about 2006 is much weaker than he admits.  The best we can say is that there is no conclusive evidence for a Bradley effect in 2006.  But there is still some evidence, and I think it is substantial.  Indeed, this is shown in his own table, which shows that in the five major races pitting African American candidates against white candidates, the white candidate overperformed his polling numbers on average by 3.6 percent, while African Americans overperformed by only 1.6 percent.   

Silver’s rejoinder is that this is not statistically significant.  But when you are averaging multiple polls (as we are with the RCP average) it isn’t clear how the error margin should be calculated.  The theory behind poll aggregation is that we can treat each poll the same as an  individual pollster treats the people reading the scripts within the larger polling company; just as pollsters aggregate employees’ results to get a large polling group, so too we can aggregate the pollsters to get a "super sample." 

There are obviously some problems with this – different pollsters use different methodologies – but I think if we get to the point of talking about statistical significance of aggregated polling, we’re already accepting that aggregating polling is an acceptable methodology.  With Tennessee, we are therefore looking at a sample of about 2700 respondents, which would be an error margin of +/- 1.89% for that race alone.  For all five races combined our sample size would be about 10,000, for an error margin of +/- 0.98%, meaning that we’re 95% certain that in 2006 the white candidate performed better than the polls said he should vis-à-vis the black candidate.
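The arithmetic behind those two margins can be checked with a short sketch. It treats the pooled respondents as one simple random sample with a share near 50%, which is exactly the "super sample" assumption discussed above; the function name is mine.

```python
import math

def pooled_margin(total_n, p=0.5, z=1.96):
    """~95% error margin when the pooled respondents are treated as one sample."""
    return z * math.sqrt(p * (1 - p) / total_n)

# Approximate pooled sample sizes taken from the text.
for label, n in [("Tennessee", 2700), ("All five races", 10000)]:
    print(f"{label}: n={n}, +/- {pooled_margin(n) * 100:.2f} points")
```

Because the margin shrinks with the square root of the sample size, going from 2,700 to 10,000 respondents cuts it roughly in half rather than by a factor of four.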

 

Weird Polls

Most people have probably noticed that the polls in the RCP average are a little, um, conflicted.  Obama is pulling between 44 and 53 percent of the vote, while McCain is pulling in between 39 and 46 percent of the vote.  McCain is either up 2 in Florida, or Obama is up 7.   Given the distribution of the results and the small sampling errors at 95% certainty, this can't be explained entirely through sampling error.

But it's not the only place where polling is weird.  Witness the battle for Congress.  Let's start with the stipulation that if the NRCC and House Republicans think this isn't going to be a pretty year for House Republicans, it's probably not going to be a pretty year for House Republicans.  My argument isn't that a GOP House renaissance is on the horizon.

Let's also stipulate that the Generic Congressional Ballot is not a perfect depiction of how people will actually vote.  Some of this imperfection comes from Republicans' historic tendency to overperform, and part of it comes from the fact that you don't actually vote for "generic Republican" or "generic Democrat."

With those two stipulations, let's also note that this year, unlike 2006, there is not a huge amount of public polling on House races.  This isn't necessarily a bad thing, since a lot of the polling from 2006 came from mediocre one-off outfits like "Majority Watch."   And district-by-district polling is pretty difficult without the full resources of a campaign; after all, it is pretty difficult to poll accurately a district that looks like this (which is probably why polling in the last month ranged from Shaw +5 to Klein +9; Klein won by 4 points).  So what we are left with are a bunch of campaign polls, which also aren't particularly useful (witness the near-simultaneous Kennedy poll showing Landrieu up 5 and the Landrieu poll showing her up 20).  The generic ballot is all we got.

Let's remember that in 2006, Democrats won control of the Congress with about a 7.2% nationwide margin (53.6%-46.4%).  That marked about a 10.5-point swing from the 2004 elections, which the Democrats lost 49.2% to 46.6%. 

So if a 10.5-point swing resulted in a 30-seat gain in 2006, shouldn't Democrats have to overperform their 2006 win of 7.2% in order to gain the 30 or so seats people are speculating they will get?  Indeed, given that Republicans aren't contesting many seats that they didn't also contest in 2006 (i.e., their playing field is pretty small), and given that many Democrats who won GOP seats in close races in 2006 like, say, Dave Loebsack, are going to win walking away this time, benefitting from incumbency in Democratic districts, we would expect the Democrats' edge to increase even if no seats changed hands.

So let's first remember what the final polls said in 2006.  The final RCP average was around a 10-point win for Democrats, which overstated the result by about 3 points.  Polling in the last week or so of the campaign ranged from a +4 advantage for Democrats (Pew) to a whopping +20 advantage (CNN).  This makes sense, since Republicans traditionally overperform the generic ballot by a few points (for perspective, in 2004, only seven polls all year long had Republicans up at all).

What about this year?  The RCP average is actually closer than in 2006, at 8.1%.  The current range of poll results is narrower as well, from D+4 to D+14 (that D+14 poll is from the CBS/NYTimes, which consistently skews Democratic 3 or 4 points, and which is actually an improvement from the earlier D+20 poll).

Maybe what we're missing is a longer view.  So I looked at the 2006 polling for all of September through October 23 (all polls that were taken on dates including October 23).  The average results were 51.1% for Democrats and 38.6% for Republicans, a spread of 12.5%.  In 2008, the average results have been 48.46% for Democrats and 39.82% for Republicans, a spread of about 8.6%.

Well, maybe I'm taking too long of a view there.  Maybe I should just look at October?

The October 2008 polling has Democrats at 48.6% and Republicans at 38.7%, a spread of about 10%.  The October 2006 polling had Democrats at 52.5% and Republicans at 38%, a spread of about 14.5%.

Anyway, let's say that, for whatever reason, to pick up another 30 seats the Democrats need to add only 4 points to their spread -- about half of the swing required in 2006 to bring about a 30-seat swing.  However you slice the generic ballot, Democrats aren't pulling in 12-point leads right now.  In fact, since July, only seven (of thirty-six) polls have shown leads of 12 points or more. And the GOP historically overperforms (I understand the reasons it might not overperform this year, although even adding in a 2-3 point Obama bounce counteracting the typical GOP bounce, we should expect a "break-even" effect).  Moreover, no matter how you slice it, Democrats are performing worse in the generic balloting this year than they did in 2006.

So what's going on?  Discuss amongst yourselves, because I don't really have a good answer.

Like Rain On Your Wedding Day

Well now that we're apparently past what can only be explained as a DOS attack from the Lamont campaign, I thought I would riff quickly off of Jason Bonham from Race42008.  While I disagree with most of his post supporting the gay marriage amendment in California, I saw this interesting nugget:

A few days ago I posted on Proposition 8 gaining traction in California. The newest SUSA polling on this confirms this, with Prop 8 advocates leading [by] 3 percent thanks in a large measure to black voters who are in favor 58%-30% and to a small degree Hispanics who are in favor 47%-41%.

This is consistent with an often unmentioned effect seen nationwide:  Racial minorities are consistently one of the most conservative demographic groups when it comes to gay rights.  Consider the exit polls from the 2006 referendum on gay marriage in Arizona.  There, whites actually had the lowest level of support for the failed referendum; Hispanics were slightly more favorably inclined toward the ban than whites.  Blacks overwhelmingly supported the ban.  Approximately thirty-two people in Tennessee opposed the ban there, but they were disproportionately white.  In Virginia, the numbers were roughly even.  In 2004, there was greater diversity, but blacks, Hispanics, and whites all tended to oppose allowing same sex marriage in those states by supermajorities.

This is consistent with the recent Pew poll, which showed that African Americans were about ten points less likely than whites to support same sex marriage, sixteen points less likely to support civil unions, and twenty points less likely than mainline protestants and Catholics to support gay and lesbian adoptions (Hispanic Catholics were fifteen points less likely than white Catholics to support such adoptions).

What does this have to do with the price of tea in China?  There are two points, one short-term and one long-term.  In the short term, the polling on the gay marriage amendment in California is extremely close.  But everyone seems to be expecting a massive surge in African American turnout everywhere this year, which I do not dispute.  If the same happens in California (where blacks make up a fairly small portion of the population), Barack Obama may, somewhat ironically, end up dooming gay marriage in that state (of course civil unions, which are the functional equivalent, will remain intact).

In the long term, there remains an interesting question about the makeup of the parties.  Hispanic Democrats in Congress already tend to be more conservative than their brethren, especially on cultural issues, and especially when they hail from rural areas like South Texas.  As time progresses, we also begin to see increasing diversity in the African American congressional delegation, with more and more Congressmen like Sanford Bishop, Harold Ford, and David Scott sounding conservative themes when it suits their constituents.

I don't expect to see African Americans and Hispanics vote Republican anytime in my lifetime, although if the party goes the Huckabee route (the opposite of what I'd like to see) of increased economic liberalism combined with cultural conservatism, this may become more likely.  Of more interest is the effect this has within the Democratic caucus as time goes on.  Will we see more ideological primaries in minority-majority districts, as we saw with Majette-McKinney or Cuellar-Rodriguez?  Ironically, this is what we saw in the South in the 1930s-1960s, where race held together a Southern Democratic party split by serious ideological divisions.

A Quick Reminder Regarding Error Margins

As we draw ever closer to election day, we see more and more people referring to polls being "within the error margin" or "outside the error margin" or even worse, "a statistical tie."  Even Tom Sowell made the statement today.

So let's remind ourselves quickly:  When you are dealing with polls, you are always within the error margin.  If a poll shows one candidate at 60% and another at 40%, the poll is still within the error margin.

Polling and error margins can be thought of this way:  There is a giant bag of 1,000,000 red, blue, and green marbles.  You want to know what the bag looks like.  If you draw three marbles, there's a chance that you will draw one blue, one green, and one red marble, and know the "true" value.  But there's also a good chance that you'll draw all blue, all green, or all red marbles.

The more marbles you draw out, the less likely it is that you have drawn a sample of marbles that does not resemble the bag.  But it is almost always a possibility that, if you draw, say, 10,000 marbles, you could draw 10,000 marbles of the same color, or 10,000 marbles made up only of red and blue marbles.  Even doing everything possible correctly, you can whiff.  When you are sampling, there are always error margins, and they are determined in part by the sample size.
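To put numbers on the marble example, here is a minimal sketch that enumerates every possible sequence of draws. It assumes the bag holds equal shares of the three colors and is large enough that each draw is effectively independent; neither assumption is stated above, so treat the output as illustrative.

```python
from fractions import Fraction
from itertools import product

COLORS = ("red", "blue", "green")

def draw_outcomes(k):
    """Return (P(all one color), P(every color appears)) for k draws.

    Assumes equal color shares, so every sequence of draws is equally likely.
    """
    outcomes = list(product(COLORS, repeat=k))
    all_same = sum(1 for o in outcomes if len(set(o)) == 1)
    every_color = sum(1 for o in outcomes if len(set(o)) == len(COLORS))
    total = len(outcomes)
    return Fraction(all_same, total), Fraction(every_color, total)

same, every = draw_outcomes(3)
print(f"3 draws: all one color {same}, one of each color {every}")

same, _ = draw_outcomes(10)
print(f"10 draws: all one color {same} (about {float(same):.6f})")
```

The chance of a single-color sample never reaches zero, but it collapses quickly as the number of draws grows, which is the whole point of larger samples.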

I say "in part" because -- and this is important -- with a given sample size, there are an infinite number of error margins.  The question is, how confident do we want to be.  If a poll of 750 people is conducted and the result is 50%, we can be 95% confident that the "true" result is located somewhere between 53.6% and 46.4%.  HOWEVER, we can also be 90% confident that the "true" result is located somewhere between 53% and 47%.  We can be 75% confident that the "true" result is located somewhere between 52.1% and 47.9%.  And we can be 51% confident (i.e., it is more likely than not the case) that the "true" result is somewhere between 51.25% and 48.75%.  Almost all polls use 95% as their level of confidence, but it is important to remember that this is what they are saying, and if something is barely within the error margin at 95% confidence, we're still pretty darned sure there is a difference.
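Here is a minimal sketch of how one sample size yields a different error margin at each confidence level. It uses the normal approximation for a simple random sample, which I am assuming is what sits behind the figures above; the function name is mine.

```python
import math
from statistics import NormalDist

def interval_half_width(n, confidence, p=0.5):
    """Error margin for a share p from n respondents at the given confidence level."""
    # Two-sided interval: put half of the leftover probability in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p * (1 - p) / n)

for confidence in (0.95, 0.90, 0.75, 0.51):
    hw = interval_half_width(750, confidence)
    print(f"{confidence:.0%} confident: 50% +/- {hw * 100:.2f} points")
```

Any confidence level between 0 and 1 can be plugged in, which is the sense in which the number of error margins is infinite.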

You should also remember that most polls have two error margins, not one.  Even very smart commentators mix this up.  Since you are sampling two candidates, each has its own number.  This is somewhat complicated by the fact that the two numbers are semi-binary -- to oversimplify, Obama's vote share can only go up so much while McCain's also goes up, and the "undecideds" category, plus the shape of the bell curve, complicates things.  The rule of thumb is that the error margin for the spread is about 1.7 times the actual error margin, but you aren't going to kill yourself if you simplify and use 2x the error margin as your guide (to 95% certainty).
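For the margin on the spread itself, here is a sketch using the textbook variance of the difference between two shares measured in the same sample. That formula is my addition, not something stated above, and the candidate shares are hypothetical.

```python
import math

def share_margin(n, p=0.5, z=1.96):
    """The usual reported ~95% error margin for a single share."""
    return z * math.sqrt(p * (1 - p) / n)

def spread_margin(p1, p2, n, z=1.96):
    """~95% error margin for the lead p1 - p2 when both shares come from one poll.

    Uses Var(p1 - p2) = (p1 + p2 - (p1 - p2)**2) / n for multinomial sampling.
    """
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return z * math.sqrt(variance)

n = 1000
# Hypothetical shares; whatever is left over is undecided or third-party.
for p1, p2 in [(0.50, 0.50), (0.48, 0.44), (0.42, 0.38)]:
    ratio = spread_margin(p1, p2, n) / share_margin(n)
    print(f"{p1:.0%} vs {p2:.0%}: spread margin is {ratio:.2f}x the reported margin")
```

Under this formula the multiplier is exactly 2 when the two candidates split the whole sample evenly and drifts lower as the undecided share grows, so doubling the reported margin is the cautious simplification.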

Because there are an infinite number of error margins, it's impossible to isolate methodological error from sampling error.  So just remember that both are almost always present, and you'll be fine.

Play around with this for a while, if you want to see better how this works.

And remember, this is just sampling error.  There is also what can be thought of as "methodological error."  In reality, society is not distributed like a bag of marbles (more or less evenly).  It is distributed like a Snickers bar.  If you grab into a shaken bag of marbles, you'll probably get a good distribution, and it shouldn't vary much regardless of where you grab into the bag of marbles.

A Snickers bar is different.  If you slice even slightly off-vertical, you'll get too much caramel/peanut mix, or too much nougat.  Heck, you might even end up with only chocolate.

In the real world, you end up dealing with methodological problems that prevent pollsters from getting a good "slice" of society, such as people lying about whether they will vote (solved somewhat by likely voter screens), trying to reach cellphone-only households, or the biggest problem for modern pollsters:  Increased rates of declined response.  You also have pollsters who inadvertently "prime the pump," by asking, say, a question about Bush job approval right before a question about who people will vote for.  Even slight changes in questionnaire wording can affect the response elicited by the poll question.  So even when you see two pollsters outside of the 95% error margin, there might not be a difference between what they are reporting; they might be effectively conducting two completely different examinations of society!

Fifteen Days Out

A week ago, I opined that one of two things was occurring in the Presidential race.  Either this year was becoming like 1980, where the country decided that it had accepted the candidate of change and was breaking heavily toward him, or it was like 2000 and 1996, where the country was skeptical of the change agent, would not let him above 50% consistently in the polls, and would decide at the last minute that it preferred the old, boring, steady hand (yes, Dole was the old, boring, status quo agent.  Remember his "Bridge to the past" theme?). 

At the time, I actually believed that what we were seeing was the former, and I expected the bounce we were seeing for Obama to continue, and that he would end up winning by 10 points or so.  As I said at the time, I thought Obama had about a 90% chance of winning.

That may still be the case, but for now it is looking as if the race is returning to the old dynamic where Obama maintained a lead, but found it awfully difficult to break above 50%.  That continues to be the only thing holding me back from declaring the race over.

How We Got Here

But before getting into thoughts on the present state of the race,  I thought it would be useful to explore how we got to where we find ourselves today.

Below is a chart taken of the RealClearPolitics.com averages from September 1 through October 14.  It has various important dates marked prominently, which should allow us to ascertain which events were "game changers," and which were not.

 

[Chart: McCain Trend]

 

As you can see, when the RNC began, Obama found himself up about five points, and peaked at about a 6.5-point lead after the first day of the RNC was effectively cancelled.  But after a surprisingly successful RNC, McCain shot up to a 2.5% lead.

Shameless Self-Promotion

I'll be on XM Radio's POTUS '08 show tonight around 7:20 or so to discuss the election (well what did you think I'd discuss there, competitive underwater b-b-stacking?).
