The immense amounts of data collected by local, state and federal government agencies can be an incredibly valuable trove for enterprising journalists. Sifting through them can also be a pointless slog. Matt Stiles, database reporting coordinator for NPR's StateImpact project, and Sarah Cohen, a professor of computational journalism at Duke, explain how they find good stories in a sea of government data.
BROOKE GLADSTONE: This is On the Media. I'm Brooke Gladstone. And we proceed now from data tailored to the individual to data as a tool of public policy. The sad reality that data are often subverted by politics is summed up in that old cliché about lies, damned lies and statistics, but that’s certainly not always the case. There are wellsprings of reliable data that lead to good policy, and one of them is the nonprofit, nonpartisan newsgathering website called The Texas Tribune, which bases its clear, cogent reporting on information gleaned from the state’s databases. It also engages readers as citizens by offering them interactive access to raw data. Texas Tribune reporter Matt Stiles says that the site ranges far and wide across fertile fields of data.
MATT STILES: We had a story that came out the other day, a traditional story, that said one of the agencies that was going to be cut severely was the Texas Forest Service, which backs up local and rural fire departments when they're dealing with these massive fires. So I went back and pulled data for several years showing which fires they had worked and how many homes had been destroyed and just did a thematic map with symbols on it that showed the size of the fire and the number of fires. You can bet that lawmakers looked at that map and thought about their districts and what they might mean. Almost weekly, we push out a new map or a new interactive of some sort. We have a database of 160,000 prisoners so that the public can come and if they happen to know a prisoner, they can search. But they also see all kinds of views and visualization of the data so that they can understand where prisoners are, what the most common crimes are. We have a database of 700,000 public employees from all types of government agencies in Texas, a database of all lobbying activity over the last couple of years. We're developing a census app now.
BROOKE GLADSTONE: Is there any information that you've wanted but you just haven't been able to get your hands on?
MATT STILES: We've had a lot of luck getting what we want. There are some hurdles left, though. State officials in Texas don't have to file their personal financial disclosure forms in an electronic format, so if we request them, we get a bunch of PDF files or essentially paper files that can't be searched, necessarily. We'd love to be able to pull that data down and slice it and dice it and make some connections about lawmakers and their finances.
BROOKE GLADSTONE: Do you think that Texas is better than many states when it comes to releasing data?
MATT STILES: I do. I think there’s something in the political culture in Texas. It’s a conservative state, and conservatives like government transparency. And so, when you go to an elected official and ask them for information, it’s just more difficult, I think, for them to deny your request. In Texas you can get an applause line from a crowd by saying you want to increase government transparency.
[BROOKE LAUGHS] I don't know if it’s like that everywhere else.
BROOKE GLADSTONE: Matt, thank you very much.
MATT STILES: You’re welcome. Thanks for having us.
BROOKE GLADSTONE: Texas Tribune reporter Matt Stiles. Sarah Cohen is a Knight Professor of the Practice of Journalism and Public Policy at Duke University and an expert in computer-assisted investigative reporting. She wishes that the federal government were as keen as Texas to share relevant data, but she says President Obama’s promises of transparency, starting with his Open Government Initiative, haven't delivered the goods.
SARAH COHEN: The Open Government Initiative was really never much about accountability and transparency. It was about getting information that the government wanted you to have in entrepreneurs’ hands so they can build iPhone apps and things like that. That doesn't really help us hold government accountable. It doesn't help us report stories.
BROOKE GLADSTONE: They're more responsive to FOIA requests, aren't they?
SARAH COHEN: Eh, sometimes.
BROOKE GLADSTONE: [LAUGHS] And they've launched a website called Data.gov.
SARAH COHEN: Yeah, that’s a case where there’s not really much on there that was never available if you went looking for it. And that’s what I kind of mean is this idea of putting what the government wants you to already have in one centralized place. It’s got car seat ratings and things like that on it, but you won't find many spending records there and you’re not going to find very many inspection records there, and you’re probably not going to find records of who’s been prosecuted on that site.
BROOKE GLADSTONE: So, Sarah, what would make you happy?
SARAH COHEN: What would make me happy would be to get access to the actual records that are used in government. We're still not getting access to things like secretaries’ actual calendars rather than the public calendars. We're not getting access to congressional correspondence, the things that go on within an agency that are done with our money and in our name. And I know a lot of other people are really interested in this idea of an information floor, that every agency should have certain things that they give out to everybody, how they spend their money, for instance. Spending records, those are a little bit better, but you can't get behind any of them to get to the actual contracts and grants.
BROOKE GLADSTONE: What do you think of the success of an outlet like the nonprofit website The Texas Tribune? Do you see this as the beginning of a trend of making available information meaningful?
SARAH COHEN: Yeah, I mean, I think The Texas Tribune is probably among the best examples of really good work being done through data journalism, really putting together records from government that were difficult to get and now are easy for the public to get. But I think there are a lot of stories that go undone because these records aren't available. As an example, there’s a reporter in Texas who’s been trying to look into payments by the Federal Marshal Service there and has never been able to get his FOIAs answered. There’s a lot of information that is still being kept secret. So I think there is a flood of data, but I'm not entirely sure that there’s a flood of records and information behind it.
BROOKE GLADSTONE: Sarah, thank you very much.
SARAH COHEN: Thank you.
BROOKE GLADSTONE: Sarah Cohen teaches at Duke University.