A judge in New York City heard arguments Wednesday over whether individual teachers' ratings can be released to the public. The judge did not issue a ruling immediately. The city has rated 12,000 fourth through eighth grade elementary and middle school teachers on how well their students perform on standardized math and reading tests. So far, the ratings have only been shared with the teachers and their principals. But now the city wants to release them to the media, and the teachers union considers that a breach of privacy.
There are lots of ways to assess a teacher. Principals can observe classroom lessons, or examine students' papers and projects. Academics call that "soft data," because it's subjective. But in the ongoing quest to find a more objective measurement, they've been using mathematical formulas that seek to isolate an individual teacher's effect on student achievement. This is called "value-added data" because it seeks to quantify how much value the teacher adds based on students' standardized test scores.
"We take the individual predicted score for each child and then we average those scores," explains Deputy Chancellor John White. "At the end of the year, we look at each individual child’s results. And then we average those scores. And when you look at the difference between the average of the prediction and the average of the result that’s how much impact the teacher had."
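The arithmetic White describes is simple enough to sketch. The following is a minimal illustration with made-up scores, not the city's actual model (which also adjusts for student and teacher characteristics): average the predicted scores, average the actual results, and call the difference the teacher's measured impact.

```python
# Hypothetical predicted and actual scale scores for one teacher's class.
predicted = [652, 661, 648, 670, 655]  # each child's predicted score
actual = [660, 665, 650, 668, 662]     # each child's end-of-year result

# The teacher's measured impact is the difference between the
# average actual result and the average predicted score.
avg_predicted = sum(predicted) / len(predicted)
avg_actual = sum(actual) / len(actual)
value_added = avg_actual - avg_predicted

print(round(value_added, 1))  # this class beat its predictions by 3.8 points
```

In practice the city's model is far more elaborate, but the final step is this comparison of the two averages.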
Picture the top of a bell curve. All teachers are ranked on a curve -- meaning most will cluster around the average. But a teacher whose students score far better than expected, when compared to similar teachers, will wind up in a higher percentile at the right of the curve. Those whose students perform worse will be at the bottom, or left, of the curve and rated less effective. White says the city is primarily interested in teachers who consistently score in the top and bottom 10th to 20th percentiles. He adds that the city makes sure to compare similar groups of students, such as those still learning English, and teachers with similar levels of experience.
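Converting a value-added score into a rank on that curve amounts to asking what share of comparable teachers scored at or below it. A small sketch, with hypothetical scores for eight comparable teachers:

```python
# Hypothetical value-added scores for eight comparable teachers.
scores = [-4.0, -1.5, 0.0, 0.5, 1.0, 2.5, 3.8, 6.0]

def percentile_rank(score, peers):
    """Percent of peers whose score is at or below the given score."""
    at_or_below = sum(1 for s in peers if s <= score)
    return 100 * at_or_below / len(peers)

# A teacher whose class beat predictions by 3.8 points outranks most peers
# and lands at the right of the curve:
print(percentile_rank(3.8, scores))  # 87.5
```

Most teachers, by construction, land near the middle; only the tails of the curve separate clearly from the pack.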
The city has been quietly compiling this data for three years on the 12,000 teachers. The teachers union says it agreed to allow these "teacher data reports" to be used for evaluation purposes. They were shared with some researchers and a few media outlets in 2009, but without the names of the teachers or their schools.
However, these data reports became the subject of controversy this year. Over the summer and fall, several media outlets requested them in full, under the state's Freedom of Information Law, and the Department of Education said it would release the names of the teachers. That triggered a lawsuit by the teachers union, which sought an injunction preventing the city from releasing the reports unless the names were redacted. A hearing was scheduled for November 24 and then postponed until December 8.
In its brief, the United Federation of Teachers refers to an agreement with a former deputy chancellor not to turn over the names of the teachers to media organizations. If that memorandum is breached, it says, other public employees would be reluctant to participate in experiments such as the teacher data reports. It also warns of parents who might demand that their children be transferred to another teacher's class if they think their current teacher's ranking is too low.
The union also argues that the reports are "fundamentally flawed," and that they're likely to "mislead rather than inform the public." As evidence, it includes several teacher data reports with errors. For example, it says many teachers were reported as having taught a class with a co-teacher when they actually did not have one. And some teachers who taught subjects other than math and English Language Arts were rated when they shouldn't have been. The union claims these inaccuracies "are legion."
But the union saves much of its ammunition for the methodology behind these teacher data reports. It claims the average margin of error is plus or minus 27 points, or a spread of 54 points. That means a teacher who shows up in the 85th percentile might actually be in the 58th percentile, making him or her closer to average than above average. Similarly, the union argues, a "bad teacher" could "just as easily fall in the 60th percentile and not the 33rd."
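The union's arithmetic is easy to check: with a margin of error of plus or minus 27 percentile points, a reported rank only pins a teacher down to a 54-point window. A quick sketch (percentiles are simply clamped to the 0-99 range):

```python
MARGIN = 27  # the average margin of error the union cites, in percentile points

def plausible_range(reported_percentile):
    """The window of ranks consistent with a reported percentile."""
    low = max(0, reported_percentile - MARGIN)
    high = min(99, reported_percentile + MARGIN)
    return low, high

print(plausible_range(85))  # (58, 99) -- an "85th percentile" teacher may be near average
print(plausible_range(33))  # (6, 60)  -- a "33rd percentile" teacher may be above it
```

Both of the union's examples fall straight out of this window.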
Academics who have studied teacher ratings with value-added data say these problems are typical. But they disagree on whether they render the reports meaningless to schools and to parents. Eric Hanushek, a senior fellow at the Hoover Institution at Stanford University, has been studying teacher evaluation systems since the 1970s. "The argument being used with measuring value-added is that we shouldn't do this until it's perfect," he says, adding that the current methods for evaluating teachers are far from perfect.
Hanushek agrees the evaluations are less reliable with small numbers of students. In some cases, a teacher's rating can go up -- or down -- dramatically in just a year. But he says they are much more stable with larger groups of students and over multiple years.
"What we want to do is put people into categories of 'highly effective,' 'pretty effective,' 'not so effective,' 'ineffective,' or something like that," he explains. "So we divide people up into ranges and for the most part that's possible." Hanushek notes that studies have found the bottom five percent of teachers are the ones who do the most harm, and that moving just those teachers out of the classroom and replacing them with average teachers would have a dramatic impact on student achievement. He says value-added measurements could help principals and districts identify those teachers who are consistently in that low range.
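The banding Hanushek describes is a way of acknowledging the noise: instead of reporting an exact (and unreliable) percentile, collapse ranks into a few broad categories. A sketch with illustrative cutoffs, which are assumptions here, not the city's actual thresholds:

```python
def category(percentile):
    """Collapse a noisy percentile rank into a broad effectiveness band.
    The cutoffs below are illustrative only."""
    if percentile >= 80:
        return "highly effective"
    if percentile >= 50:
        return "pretty effective"
    if percentile >= 20:
        return "not so effective"
    return "ineffective"

print(category(85))  # highly effective
print(category(12))  # ineffective
```

Even with a wide margin of error, a teacher who lands in the bottom band year after year is unlikely to be there by chance alone, which is the case Hanushek makes for using the data.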
But others say the reports aren't yet ready for prime time. Sean Corcoran, an education economist at New York University, found problems with the New York City data reports for 2007-08. And Daniel Koretz, a professor at Harvard's Graduate School of Education, says problems with New York State exams make the reports especially unreliable. Koretz notes that the state acknowledged this year that its annual math and reading tests had become too predictable.
"What is going to happen is it’s going to be very hard to distinguish between teachers who appear effective because they taught well, and teachers who appear effective because they found ways to cut corners in coaching for the test," Koretz explains. "And conversely you could have very effective teachers who rank more poorly than some of their peers simply because they have not been engaging in inappropriate test preparation."
But the city stands by its ratings and says it has continually worked to improve the reports in response to feedback from principals, teachers and researchers. It raised the minimum number of students a teacher must have to generate a report (from 6 to 10 for all grades in math and elementary English Language Arts, and 20 for middle grades ELA). It also says principals were given time to report any errors during a verification process.
The Department of Education also says it is required to turn over the names of individual teachers this year because, unlike in previous years, five media outlets specifically asked for them (The New York Times, Daily News, Wall Street Journal, New York Post and NY1). Those organizations filed their own court briefs arguing against the union's request for an injunction. The one filed by The New York Times calls the teachers union "fundamentally mistaken in advancing the paternalistic notion that the public should not have access to teacher data reports because 'untrained laypersons' will not appreciate the imprecision in the methodology" of the reports, or that they're intended for professional development. And it says the state's Freedom of Information Law is meant to be interpreted as broadly as possible, and that teachers are public employees.
Earlier this year, The Los Angeles Times caused a controversy when it hired a consultant to rate the effectiveness of thousands of teachers based on student test scores. That was the first time teacher ratings were shared with the public. Other cities use value-added measurements, though, for evaluation purposes and for awarding bonuses (see Teacher Ratings: What Are Other Cities Doing?)
Hanushek, of Stanford, says the release of teacher ratings in Los Angeles means parents will forever insist upon having more information about their schools and teachers. If New York City is allowed to share its own ratings, he predicts "this will add an impetus to a movement to have better evaluations of teachers, and ultimately to use evaluations of performance to reward or sanction teachers."
Regardless of what happens, he says, the genie’s already out of the bottle. Because parents know these ratings exist, he says, they’re sure to keep pushing for more information about their children’s teachers.
What the Experts Say
Several think tanks and universities have been studying teacher ratings and the use of value-added data. All agree teachers should be evaluated by multiple factors, not just how their students score on standardized tests. But some are more supportive than others about the usefulness of value-added data systems.
These groups are more supportive of using the data to evaluate teachers:
Hoover Institution at Stanford University
Center for Greater Philadelphia, University of Pennsylvania
These groups tend to be more cautious about the limitations of teacher ratings:
Annenberg Institute for School Reform at Brown University
Economic Policy Institute
National Education Policy Center
The Rand Corporation
UPDATED at 5:30 p.m.