An Evaluation Architect Says Teaching Is Hard, but Assessing It Shouldn't Be

Theodoric Meyer

SchoolBook | Feb 15, 2012

Sixteen years ago, Charlotte Danielson, an Oxford-trained economist, developed a description of good teaching that became the foundation for attempts by federal and state officials and school districts to quantify teacher performance.

The Danielson method -- articulated in her book, “Enhancing Professional Practice: A Framework for Teaching” (ASCD, 1996) -- describes good teaching using numerous criteria within four broad areas of performance: the quality of questions and discussion techniques; a knowledge of students’ special needs; the expectations set for learning and achievement; and the teacher’s involvement in professional development activities.

“If all you do is judge teachers by test results,” Ms. Danielson told Ginia Bellafante in an interview for a Big City column in the Metropolitan section of The New York Times last month, “it doesn’t tell you what you should do differently.”

As the standoff over adoption of a teacher evaluation system moves toward an Albany-style showdown this week, both the teachers' unions and state and district officials in New York largely agree on the Danielson method, but are struggling over its implementation.

Ms. Danielson, a West Virginia native, studied history at Cornell before doing graduate work at Oxford. She spent five years working as an economist with Chatham House in London, the Council of Economic Advisers in Washington and other organizations before making the leap to teaching.

SchoolBook visited Ms. Danielson in Princeton, N.J., where she has a private education consulting practice, and followed up with a telephone call. Here are highlights from the interviews; the answers have been edited and condensed.

You’re an Oxford-trained economist. How did you transition from that to creating teacher evaluation systems?

I was working as an economist in Washington, D.C., in the inner city, in the Adams-Morgan neighborhood. I don’t know if you know Washington. It’s become upscale now, but at the time it was not. It was quite poor, but stable, and I got to know the people on the block, of course. I got interested in the children -- I got to know them and liked them and enjoyed spending time with them. They said things like, ‘I wonder if I’m going to pass this year in school.’ I knew nothing about education, but I thought, ‘That seems peculiar that they would be wondering that.’

At the time, the District of Columbia was turning many of its schools over to be community controlled. We were getting these notices saying, ‘Come up to a meeting at the school. We’re taking it over.’ So I started going along to those. And I had never been in an inner-city school before. It was a beautiful old structure, but very sort of barren. And I just thought, ‘Oh, this doesn’t feel right.’ I just decided I wanted to learn about teaching and kids and education.

I went to school at night, got my teaching credential. I taught in that local school for a couple of years. And then I married and moved up here and things became more normal. I had a job in a local school here, and gradually transitioned into professional development and curriculum development.

Then I was offered a position at ETS (the nonprofit testing and assessment company responsible for the SAT and A.P. exams, among many others) to help them develop a program on teaching thinking -- teaching kids to really think -- through their curriculum work.

Right at that time, ETS was also redesigning its teacher assessment system for licensing. I learned that when you have clear standards of practice, then everybody involved in that enterprise — the teachers themselves, the people assessing them, everybody — learns a lot. They become sort of a professional community, which they aren’t before that.

You can’t begin to help teaching get better if you’ve not even defined it. If you think it’s some sort of magic, you’re never going to help it improve. That whole exercise convinced me that offering a clear statement of ‘What is good teaching?’ would be of service to the profession.

How did you develop the framework outlined in the book?

It’s an outgrowth of the work done at ETS. The basic architecture was essentially the same. But the other thing that I did, which was an innovation, was produce what are called rubrics or levels of performance. It’s the different aspects of teaching — what it looks like in words. That was not part of the ETS program, and it’s actually turned out to be the major contribution.

Those levels of performance turned out to be what everybody else started copying. And that’s what gives people guidance. If you’re an evaluator, it tells you what you should be looking for in the classroom or in a teacher’s work plan. And if you’re the teacher, it tells you what a trained observer’s going to be looking for.

The first rule of assessment is you don’t assess people on things they don’t know what they are. It shouldn’t be a mystery. But it always has been.

It seems sort of amazing to me that something like this hadn’t already been written.

No kidding! I know. It’s not like rocket science, you know? Teaching is rocket science. Teaching is really hard work. But doing that isn’t that big a deal. Honestly, it’s not. But nobody had done it.

In the ETS project, we had something we called scoring rules, which were little guidelines for observers, for evaluators. But they weren’t clean, and they were very general.

The work I’ve done since then has been to tighten them even more. That is actually quite a challenge. If you observe a class, there are some things that it’s easy to train people to recognize the same way, and to therefore have high levels of what they call inter-rater agreement. Did the class start on time? Is the teacher wearing a tie? Et cetera. We agree on that. Nobody would argue that’s important.

The important stuff, like the teacher’s use of questioning strategies to help students develop concepts — that’s much harder to assess. But that’s the kind of thing I was trying to capture. It is hard.

How do you get from the idea of assessing first-year teachers to the idea that all teachers should be assessed continuously?

That’s a policy decision, and I’m not sure I actually support that or agree with that — that all teachers should be assessed continuously. Public schools take public money, and the public has a right to expect good teaching, period. It’s a public trust. Actually, in private schools it’s worse — they take people’s money directly. (Laughs.) I don’t think ensuring quality — I don’t think that’s optional. That’s necessary.

Now, how you ensure quality is then the question. Does that mean continuous assessment of teacher practice? I don’t think so. But what does it mean? Periodic? And what counts as evidence?

There’s observations of practice, but there’s other things that you can do for evidence, like look at the work that teachers are asking kids to do. Is it intellectually engaging, for example? Is it high-level work, or is it busy work? That can all be part of an evaluation system.

What I do with states and with districts is help them design their systems, so that they capture evidence of those things.

What was the first district or state to implement your ideas?

The first one anywhere happened to be in England, actually. It was an international school. But the first school district in this country happened to be in Coventry, R.I. The book was published in 1996, and I met with the superintendent, John Deasy, in the spring of 1997. He got it right away. He said, ‘This is what we should be using.’ He has had several jobs since then, and he’s now superintendent of Los Angeles public schools. He was quite visionary in that sense.

After that first school district in Rhode Island, how did this sort of ramp up?

It’s very much driven by word of mouth. Like the superintendent in Rhode Island -- he’s a good example. He left there and went to Santa Monica, Calif. Well, he implemented it there. And then he came to a district in Maryland; he implemented it there. And back to L.A. When these school administrators move around, which they do quite a bit, they take their ideas with them.

Meanwhile the policy landscape has changed dramatically in the last 18 months, two years, for sure. That was largely fueled by the Race to the Top competition and the feds putting a tremendous amount of money on the table for states to compete for.

And so in order to make themselves attractive to the reviewers, many states passed laws that they probably would have never done without that carrot. Some of them passed the laws and then didn’t get the money, but they still got the laws! (Laughs.) So they’re having to implement them anyway.

Up until last May, I had a small team of maybe 15, and then we had this enormous amount of interest. I asked every one of the 15 if they could recommend somebody who they worked with in their work and school district that they think could be brought up to speed pretty quickly to become another layer of trainers, and many of them did.

We’ve been developing some standardized materials, which we’ve never had to have before. It’s been an interesting evolution. We’ve had to get up to speed really quickly.

Tell me about the work you’re doing now. Any states or districts that stand out?

I can tell you some things we’ve learned that make the implementation of an evaluation system go better. One is to give it enough time. Start on a small scale and learn from that.

Arkansas is a good example. I worked with the State of Arkansas over about an 18-month period to design a new statewide system, and then they offered it up on a small-scale basis to, I think, 10 districts, and the districts had to apply to do it. They were selected based on various criteria. They wanted them to be representative of the state regionally, different sizes of districts and so on.

That’s been a pretty good process there, and Arkansas has not had a lot of backlash the way some states and big districts have, where they’ve been in a hurry to do it and people have panicked.

Michigan passed a law last July mandating that every district in the state implement a new evaluation system by that September. Well, the ink wasn’t even dry. I don’t think anybody thought that was a good idea, except maybe the legislators who passed the law.

Many of them don’t understand the complexity of it. I mean, they’re not doing it because they’re bad people. They just don’t know. They don’t have a clue what’s really involved.

What’s really a good thing to do, although not everybody has the luxury of this, is to introduce the framework, and the language of teaching that is embedded in it, before you use it for evaluation. Just introduce it. Just introduce it as, ‘This is the language of teaching, and let’s get used to talking this language.’

If you can afford a year of that before it becomes the basis of an evaluation system, it is more likely to be successful, because the minute you say evaluation, people get nervous, as is natural. ‘I’m going to be judged.’

But if they get used to it first and they say, ‘Oh, yeah, this is what I believe good teaching is,’ that becomes a very different conversation.

Why would teachers embrace this?

When a school district seriously invests in the quality of the training and the assessment, that is very reassuring to the teachers, as you can imagine. So the unions really like this stuff a lot, because they realize then that it’s not just sort of arbitrary.

The problem that teachers and teachers' unions have had all along with teachers’ evaluations is that they basically don’t trust the administrators to know what they’re doing. And there’s been just enough of this in the past so that it’s not an unrealistic fear. They’ll say, ‘Oh, well, my principal just doesn’t like me. He’ll take it out on me by giving me a bad evaluation.’

Now, I don’t think that has been widespread, but I’m sure it’s not unknown. Being able to ensure the accuracy of the evaluators and to have backup systems (her system includes videotaping of teachers' performance, as well as online training and assessment of observers). They have all this video capability, so a backup system for a school district is to let teachers videotape the same class that an observer is observing and evaluating, and then send it via the cloud to an external expert, and for somebody else to take a look. It’s panoramic, 360 video, so you can really see.

It’s not quite the same as being there, but it’s a lot better than just one camera pointing at the teacher. So you can get a second opinion, in other words.

Are there any parts of the framework that have been misapplied or misconstrued?

One of the things they’ve done in New York, and this has happened in a few other places as well, is they’ve initially focused their attention on a few subsets of the framework and not the whole thing. If you use a subset of the framework and you leave out certain parts, evaluators may see certain aspects of teaching and not know where to mark them down.

When you’re watching a class in which the teacher is interacting with students, say, you see that, but you don’t know where to put it. Let’s say you put it in 2B instead, which is not exactly the right place to put it. Now your assessment of that component is going to be inaccurate.

When states design evaluation systems, like New York is doing at the moment, do you think standardized test scores should factor into evaluations?

My expertise is on defining good teaching. It’s not on how to use test scores. I do think that it’s reasonable for teachers to demonstrate that their kids have learned. I think that does make sense. Beyond that, though, I’m not at all convinced that it can be done fairly for teachers based on what we know now, particularly in a high-stakes environment.

What’s changed in the last two years or so, since Race to the Top began?

Everything. It’s not so much Race to the Top -- it’s the state policy landscape, partly in response to Race to the Top. Teacher evaluation has become important, and high stakes. It never was either of those things before.

Two years ago, five years ago -- nobody cared.
