The NYPD's Crash Data Is Bad and There's Not Enough of It

A New York City Council committee hearing that was convened to discuss bills about street safety highlighted the NYPD's reluctance to make more traffic crash data publicly available.

On Thursday, the Transportation Committee was discussing Intro 1163, a bill requiring the city's Department of Information Technology and Telecommunications (DOITT) to maintain an interactive traffic crash map.

This is a tool that street safety advocates, community board members, and some city council members have sought for years, because it would let people visualize crash data. Users could see, at a glance, a 'heat map' of which neighborhoods have the highest and lowest crash numbers. That data could also be sorted by different characteristics and used to make graphs or compare statistics. At face value, this would seem to be a straightforward request, given how data-driven the NYPD is.

But the NYPD, represented at the hearing by assistant commissioner of intergovernmental affairs Susan Petito, is opposed to the bill.

Petito mounted a number of objections: the bill doesn't define the term 'traffic crash' or distinguish between 'reported' and 'unreported.' Also worrisome, she said, the bill requires collisions to be mapped — and the NYPD doesn't compile information in that manner. So a 'pin' on a map could potentially represent a collision as being at an intersection, when in reality the crash might be down the block, making address data "misleading." A map would require the NYPD to collect information not contained in its standard DMV collision report. The bill would add "new and potentially complicated elements" to the in-the-works crime map.

But the bill, Petito said, is ultimately unnecessary. "Anyone who wishes to map the vehicle collision data already posted by the Police Department may do so," she said, "without requiring the city to expend the police and technological resources necessary to design and implement such a map."

More on that last statement in a moment.

Council Member Dan Garodnick gamely tried to address Petito's concerns. Could the NYPD change its reporting procedures to include addresses of traffic crashes? (Answer: no. "The utility of a street address," she said, "I can't sit here and tell you that would add anything.") Could the Council concede the map would only contain 'reported' collisions? Finally, he asked: "Is there actually a philosophical objection by the Police Department to posting this sort of...crash data on a site, or is this just a matter of you feel it needs to fit into the appropriate check boxes?"

It was the former.

"We would have to expend more resources, effort, energy, collecting something that we, as the Police Department, don't really need," Petito said. "And it would be collected for the purpose of reporting it to the public, and that's something that we generally would rather have to do if there's a police department need to collect that type of information."

"We would like to view the Police Department as a partner with the public here," said Garodnick. "We'd like to ask the NYPD's openness to these concepts because they are important."

After Petito stepped down, members of the public were able to speak on Intro 1163. Among those in favor: Code for America's Noel Hidalgo and John Krauss, the volunteer coder behind Crashmapper and Crash Data BandAid.

Hidalgo called himself a "civic technologist" who is part of a Code for America group that builds interfaces for public data.

"We are concerned about safe streets for everyone — pedestrians, cyclists, and vehicles," Hidalgo said. "Because of poor, inconsistent, and non-existent crime and crash data, creating tools to make safer streets is next to impossible."

(Hidalgo's view was underscored by a Transportation Alternatives report released Thursday. The advocacy group says the NYPD "ignores its own traffic safety data and chooses not to enforce the traffic violations that are the most harmful to New Yorkers." TA is calling on the next mayor to close what it calls "The Enforcement Gap.")

Krauss, a freelance developer, works with the crash data the NYPD already releases, and he described what that involves. "I wrote the code to process the existing Excel and PDF releases because the data that is currently published is virtually unusable," he said. "The massive PDFs are difficult to search by intersection...the Excel sheets have significant formatting errors that seem to be an artifact at being copy/pasted from the PDFs." (Translation: Hidalgo tweeted that the Excel data is copied from a PDF full of errors.)

Krauss said he must totally reorganize and reprocess the NYPD data — which doesn't include GPS coordinates. Krauss said he tries to attach GPS data after the fact, but that process introduces more errors. "We have to establish a high-quality original source," he said.
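
To make the after-the-fact step concrete, here is a minimal Python sketch of matching intersection names to coordinates. It is not Krauss's code: the lookup table, its placeholder coordinates, the abbreviation list, and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical lookup table: normalized intersection -> (latitude, longitude).
# The coordinates are rough placeholders for illustration, not surveyed values.
INTERSECTION_COORDS = {
    ("3 AVENUE", "EAST 14 STREET"): (40.7330, -73.9870),
    ("ATLANTIC AVENUE", "FLATBUSH AVENUE"): (40.6840, -73.9770),
}

# Assumed abbreviation list; a real one would be much longer.
ABBREVIATIONS = {"AVE": "AVENUE", "ST": "STREET", "E": "EAST", "W": "WEST"}


def normalize_street(name):
    """Uppercase, drop punctuation and ordinal suffixes, expand abbreviations."""
    words = re.sub(r"[^\w\s]", "", name.upper()).split()
    words = [re.sub(r"^(\d+)(ST|ND|RD|TH)$", r"\1", w) for w in words]
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)


def intersection_key(street_a, street_b):
    """Order-independent key, so 'A and B' matches 'B and A'."""
    return tuple(sorted((normalize_street(street_a), normalize_street(street_b))))


def geocode(rows):
    """Split (street_a, street_b) rows into matched and unmatched lists."""
    matched, unmatched = [], []
    for street_a, street_b in rows:
        coords = INTERSECTION_COORDS.get(intersection_key(street_a, street_b))
        if coords is None:
            unmatched.append((street_a, street_b))
        else:
            matched.append((street_a, street_b, coords))
    return matched, unmatched
```

Even this toy version shows how guessing can go wrong: the abbreviation rule that turns "St" into "STREET" would also mangle "St. Marks Place," and any intersection missing from the table ends up with no coordinates at all.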

Council Member Vincent Ignizio's reaction: "John, your testimony can be defined as 'garbage in, garbage out.'"

"Yeah, essentially," Krauss said.

Ignizio pressed the developers to talk about other data issues.

Hidalgo said agencies often want data to be released, but they're worried it will be used "judgmentally," and so they don't release it in a timely fashion. Case in point: bike share. "We want to understand the availability of (Citi Bike), we want the usability of the system," he said. "But because of the current franchise agreement that Citi Bike has, we can't get all of that data because it's currently routed through DOT. And so DOT says 'We'll eventually release it, you'll get it when you get it.'"

He said he wanted to talk about ways to make the city's open data law more workable. "We still have deficiencies within different agencies, where they're not living up to the letter of the law. NYPD and DOT are two agencies that are very much in that position...the Department of Education isn't releasing any more of their data sets until December 31, 2018. NYPD was very restrictive of the types of data sets they're going to be releasing...Department of Transportation is not going to be releasing the data sets that they already have on their website until sometime between now and 2018."

"This is just unheard of in the 21st century," Hidalgo added. "There are smaller cities than New York City that are doing a better job than this."

We've reached out to both the NYPD and the DOT for a response. We'll update if we hear anything.

Want more NYC data info? Michael Flowers, Mayor Bloomberg's director of analytics, will be on the Brian Lehrer Show Friday morning.