Nate Silver created a remarkably accurate computer system that projects stats for baseball players and teams. Now he's turned his attention to polling data for the presidential election with his website, FiveThirtyEight. Silver explains how his site can outperform the polling firms whose data he relies on.
BOB GARFIELD: So, if the polls gave an inaccurate picture of both public opinion and elections, what are we to do? Nate Silver runs the blog FiveThirtyEight.com, and his solution is to compile the data from all polls, give weight to those with a proven track record, project data for areas where none is available and combine everything into a computer algorithm that runs thousands of simulated elections. This way, Silver generates a projected popular vote and, more importantly, a projected tally of the electoral votes, of which there are 538.
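
The approach described above (weight the polls, then run thousands of simulated elections) can be sketched roughly as follows. This is a minimal illustration and not FiveThirtyEight's actual model: the state names, electoral vote counts, poll margins, pollster weights and error sizes are all invented, and the names `weighted_margin`, `simulate_once` and `run_simulations` are hypothetical.

```python
import random
import statistics

# Hypothetical inputs: state -> (electoral votes, list of (poll margin, weight)).
# Margins are in points for candidate A; a weight of 2.0 stands for a pollster
# with a better track record. All numbers are invented for illustration.
STATES = {
    "State A": (20, [(4.0, 2.0), (2.0, 1.0)]),
    "State B": (17, [(-1.0, 1.0), (-2.0, 1.0)]),
    "State C": (15, [(1.0, 2.0), (-0.5, 1.0)]),
    "State D": (10, [(-4.0, 1.0)]),
}

def weighted_margin(polls):
    """Average the poll margins, counting each poll in proportion to its weight."""
    total_weight = sum(weight for _, weight in polls)
    return sum(margin * weight for margin, weight in polls) / total_weight

def simulate_once(states, rng, national_sd=3.0, state_sd=4.0):
    """Simulate one election and return candidate A's electoral votes.

    Assumption: polling error is a shared national shift plus an independent
    state-level error, both normally distributed.
    """
    national_shift = rng.gauss(0.0, national_sd)
    votes = 0
    for electoral_votes, polls in states.values():
        margin = weighted_margin(polls) + national_shift + rng.gauss(0.0, state_sd)
        if margin > 0:
            votes += electoral_votes
    return votes

def run_simulations(states, n=10_000, seed=538):
    """Run n simulated elections with a fixed seed so results are repeatable."""
    rng = random.Random(seed)
    return [simulate_once(states, rng) for _ in range(n)]

results = run_simulations(STATES)
total_votes = sum(ev for ev, _ in STATES.values())
median_votes = statistics.median(results)
win_probability = sum(r > total_votes / 2 for r in results) / len(results)

print("median electoral votes for A:", median_votes)
print("share of simulations A wins:", win_probability)
```

The output is a spread of outcomes rather than a single prediction: the median of the simulated tallies serves as the projected electoral vote count, and the share of simulations a candidate wins gives a rough win probability.
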
It might seem like voodoo to take a bunch of wrong answers to create a right one, but in the primaries, Silver’s system outperformed other polling organizations.
NATE SILVER: In North Carolina, which is a state where Obama had been only ahead by six or seven in the polls, if you looked at other Southern states that have the same demographics, in all those states he had beaten his polls. It’s not that much of a trick to figure out what’s going to happen in North Carolina if it’s so demographically similar to South Carolina, to Virginia.
So we didn't look at the polls at all in that case, and we turned out to be, you know, close to the mark. I think we had him at 16 points; he won by – by 14 or so.
BOB GARFIELD: Your reputation is built substantially on your experience with PECOTA, which data-mined baseball statistics in order to make predictions about a current year’s performance by a player or a team. Tell me about PECOTA and how that informed what you’re doing with FiveThirtyEight.com.
NATE SILVER: Yeah, what PECOTA is, it’s a projection system, so we tell you, you know, how many home runs Derek Jeter’s going to hit every year. And the way it works is to look throughout history for what we call comparable players, and they’d give you a distribution of outcomes. So maybe one player continued to be very good until he was into his forties, maybe one player, you know, started to stink and had to retire. Take an average of that, that gives you the projection.
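
The comparable-players idea can be sketched as a nearest-neighbor average. This is only an illustration under loud assumptions, not PECOTA itself: the historical records are invented, similarity is reduced to plain distance on age and home runs, and the function name `project` is hypothetical.

```python
import math

# Hypothetical historical records, invented for illustration:
# (age, home runs that season, home runs the following season).
HISTORY = [
    (30, 20, 18),
    (31, 22, 15),
    (29, 19, 21),
    (35, 10, 6),
    (24, 30, 33),
]

def project(age, home_runs, history, k=3):
    """Average the next-season outcomes of the k most similar past players.

    Assumption: similarity is straight-line distance on (age, home runs).
    Returns the projection and the list of comparable outcomes behind it.
    """
    def distance(record):
        return math.hypot(record[0] - age, record[1] - home_runs)

    comparables = sorted(history, key=distance)[:k]
    outcomes = [record[2] for record in comparables]
    return sum(outcomes) / len(outcomes), outcomes

projection, outcomes = project(30, 21, HISTORY)
print("comparable outcomes:", outcomes)  # [18, 15, 21]
print("projection:", projection)         # 18.0
```

The list of comparable outcomes is the distribution Silver mentions; the projection is simply its average.
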
We do do some similar things at FiveThirtyEight. For example, which states are similar to one another? Ohio and Michigan, both very important states, they'll probably move in the same direction from here on out.
BOB GARFIELD: Okay, now what about the idea of weighting some poll organizations more heavily than others? How does that work?
NATE SILVER: Well, we've gone back to 2000 and looked at essentially every election, every poll since then, but that includes general elections for senate and governor, as well as for president, as well as, you know, primaries in 2004 and this year in 2008.
Some polls do a better job than others. They're more thorough. They're more scientific. They might have larger sample sizes. So they might be weighted twice as much as a poll with a marginal track record.
BOB GARFIELD: As I look at how the polls that you weight are weighted, there are a couple that are just consistently kind of off the charts in inaccuracy. Who’s in your hall of shame?
NATE SILVER: Yeah, there are two. One is Zogby Interactive. Zogby’s a pollster who’s been around a long time. He’s kind of an average pollster. But they have a poll they do purely online where they distribute surveys to people. They can fill them out if they want. You’re only going to respond to it if you’re motivated by politics, so it’s kind of a self-selected sample.
The other poll that’s really bad is one from The Columbus Dispatch. It’s kind of the same thing, really, which is that they mail, get a ballot out to every voter in Ohio. People only send them back if they're interested in it.
BOB GARFIELD: Now, your methodology has taken some criticism in The American Spectator, which is a right-leaning publication. The criticism is that you’re just extrapolating current numbers. That assumes nothing happens between now and November 4th, two months from today.
NATE SILVER: We don't assume that we know everything. We think there’s a pretty big margin of error, literally. We're just trying to say, here’s the relative likelihood of different kinds of things happening. If you average it all together, yeah, that middle point, the median’s at about 300 electoral votes for Obama right now, but could go way in another direction.
In 1980, for instance, Jimmy Carter had been ahead of Ronald Reagan. It moved into a tie. Then they had a last debate really late and people kind of flocked to Reagan. The polls didn't even pick it up. He wound up winning by 10 points.
BOB GARFIELD: Now, you are a declared supporter of Barack Obama. Why shouldn't we just take everything that you have just said and dismiss it as some sort of spin, you know, under the guise of statistical analysis?
NATE SILVER: You can, you’re welcome to, but we do have a 6,000-word-long explanation of how we do everything. So, in theory, it should all be reverse-engineerable, I think. We've made adjustments at times that have benefited McCain, for instance. You know, there are rules we establish. If I make changes, we'll kind of discuss them.
It’s a good community. You get a lot of smart people contributing. So we try and be as open and transparent as possible, and that includes actually saying who I'm most likely to cast my vote for in November.
BOB GARFIELD: Nate Silver runs the blog FiveThirtyEight.com. Thank you so much.
NATE SILVER: Thank you.