Here is an excellent point made by commenter Glen Tomkins at Matthew Yglesias’ blog:
Predictive polling is forced to rely on biased sampling, because only a small fraction of the likely voters whose names are pulled from the hat can be contacted and/or agree to participate. If this small percentage were a random sample, it is true that all we would have to worry about would be the random variance. But it isn’t a random sample: people are filtered out or in based on their attitudes towards answering polls, when they’re home to answer the phone, etc.
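Tomkins's point about filtered samples can be made concrete with a small simulation. What follows is only a sketch under made-up assumptions: the function name simulate_poll, the 50/50 split in true support, and the response rates of 12% and 8% are all hypothetical numbers chosen for illustration, not figures from any real poll. It compares the estimate you would get if everyone contacted answered with the estimate you get when willingness to answer depends on which candidate a voter prefers.

```python
import random


def simulate_poll(n_contacted, true_support, respond_if_a, respond_if_b, seed=0):
    """Contact n_contacted randomly drawn voters and compare two estimates.

    All parameters are hypothetical illustration values:
      true_support  - true share of voters backing candidate A
      respond_if_a  - chance that an A supporter agrees to be polled
      respond_if_b  - chance that a B supporter agrees to be polled
    Returns (share of A among everyone contacted,
             share of A among those who responded,
             number of respondents).
    """
    rng = random.Random(seed)
    contacted_a = 0
    responded_a = 0
    responded_total = 0
    for _ in range(n_contacted):
        supports_a = rng.random() < true_support
        contacted_a += supports_a
        # The filter: willingness to answer depends on the voter's preference.
        p_respond = respond_if_a if supports_a else respond_if_b
        if rng.random() < p_respond:
            responded_total += 1
            responded_a += supports_a
    if responded_total == 0:
        raise ValueError("no respondents; raise the response rates or n_contacted")
    return contacted_a / n_contacted, responded_a / responded_total, responded_total


if __name__ == "__main__":
    everyone, respondents, n = simulate_poll(200_000, 0.50, 0.12, 0.08)
    print(f"share for A among everyone contacted: {everyone:.3f}")
    print(f"share for A among {n} respondents:    {respondents:.3f}")
```

Under these assumed rates the respondents' share for candidate A settles near 0.5 × 0.12 / (0.5 × 0.12 + 0.5 × 0.08) = 0.60 even though true support is 0.50, and contacting more people does not shrink that gap. A larger sample only reduces the random variance, not the filtering bias.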
Random sampling only works reliably on events that are periodic or that occur according to a random probability distribution!
If every voter determined their choice by rolling a die, or by picking a candidate round-robin according to the order in which they entered the voting building, the polling would work perfectly!
Polling deals with psychological events, and the problem is the idea that you can somehow magically use random sampling for events that are neither random nor periodic.
Combine this notion with record primary turnouts and a very committed but angry Democratic electorate looking at the first African American and the first woman to have a legitimate shot at being nominated to a major party ticket, and you can see why the primary polls seem to swing so wildly. The constituencies are not mutually exclusive, so you have some competing loyalties, and that makes for difficult sampling.
Gary Langer, director of polling for ABC News, wants to look at the data before making any pronouncements:
But we need to know it through careful, empirically based analysis. There will be a lot of claims about what happened – about respondents who reputedly lied, about alleged difficulties polling in biracial contests. That may be so. It also may be a smokescreen – a convenient foil for pollsters who’d rather fault their respondents than own up to other possibilities – such as their own failings in sampling and likely voter modeling.