About Those New State Exams: Questions for a Testing Expert

Friday, August 16, 2013 - 04:00 AM

Daniel Koretz teaches educational measurement at Harvard University's Graduate School of Education, which means he knows a thing or two about standardized tests. He was also a member of the technical committee whose work led New York State to decide its old exams were too easy and that the state needed to raise its standards. In an interview, excerpted here, he offered suggestions for how to interpret the results.

Education historian Diane Ravitch wrote a column in which she said it wasn't fair to make these new exams similar to the national tests (the National Assessment of Educational Progress), because being proficient on NAEP is really like getting an A; its proficiency standard is that high. Did you need to get the equivalent of an A to be proficient on this test, instead of a B or a C?

That brings up the first thing that I think is confused in some of the press, and this is something I actually specifically alerted [State Education Commissioner] John King to be worried about - which is that the performance levels we see now are different from those in the past for two entirely different reasons.

One is that the test is different. Kids are asked to do things they were not asked to do before. And it's quite common, when a testing program is changed, for performance to be low the first year or two.

The second is completely different, which is that the standards have been set again. And one of these things - this is not a New York problem, this is a national problem - is that everybody now reports performance in terms of what we technically call percent above cut: the proportion proficient, or percent proficient or above. It is, in my opinion, an absolutely dreadful way to report performance, for a lot of reasons. One is that it confounds actual changes in performance with how a group of judges sitting in a room sets standards. So the standards were set from scratch this time. And the definitions of performance that the judges were given were necessarily different from what they were in the past, because they're supposed to be thinking about college and career readiness - a different definition of proficiency.

So even if the tests were identical, the percent proficient could be quite different. And we don't have enough data to separate the impact of the change in standard-setting from the change in the test itself.

You are distinguishing two different things: it's not that the tests were that much harder, it's that the bar students had to clear to be deemed proficient was set much higher?

It's even more complicated than that. Because let's say you had two versions of this new test, one written by company A and one written by company B, and they differed only in that the mix of item difficulties was different. The items, if you looked at them, covered the same content, but my test happened to be easier than your test.

If the standard-setting method that was used in New York - which is the most commonly used nationally - works properly, and you and I are both marginally proficient kids, we should both be identified as marginally proficient regardless of which of those two tests was given.

The critical point is the difficulty of the test. If you mean by that what proportion of kids get items right (which is how most people think of difficulty), it doesn't tell you where the standards are going to be set. The same kid should be identified as proficient regardless of the mix of item difficulties. This is really hard for people who are not technically immersed in this stuff to understand. But the bottom line is that when I say the tests may have been harder, I don't mean the typical item was answered correctly by fewer people. I mean more of the content was novel for kids.

There are three different things going into the performance level of kids. This is a nightmare for laypeople who are trying to make sense of the change, and it's not a New York problem; it's the way it happens everywhere. It's exactly what happened in Kentucky, for example.

So back to Diane Ravitch's point, was this comparable to saying that in order to be proficient you needed to get an A instead of a B or a C?

What's an A?

What do we really mean by college and career readiness? Is the level of preparation in mathematics that will allow someone to take a job in a high-tech machining plant in Saratoga Springs at all similar to the math a kid needs to be able to apply to MIT or RPI? That's one of the discussions I would like to see more of.

I think there's a real argument about which skills are important for college and career readiness, and there's also a question of keeping kids' options open. So you may say, this kid is going to go into a job in such and such a field, he doesn't need the following mathematics - but do you want that door closed when he's in grade 8? Or do you want to keep the door open? So New York, like every other state in the country, is in a sense stuck with a phrase [college and career readiness] that has a very ambiguous meaning.

So how do you think we should interpret these results, when we see only 31 percent of students in New York State were proficient in math and English Language Arts, and even fewer in New York City?

What I suggest is that as we get more results - more than percent proficient - and as we begin to learn more about how kids did on specific content in the test, and how different kinds of kids did on specific content, then people ought to start debating that: what do we want our kids to be able to do, and what does this suggest about what we need to change? Because some of the changes in the Common Core are clearly for the better. Some of the changes that New York wants to make are clearly for the better. So to the extent that the changes are for the better, what do we need to do to make sure the level of performance is higher in the future than it is now? What do we need to change about teaching and other aspects of the educational system?

You're saying we still have a big debate ahead of us. So I'm wondering: was it too soon for New York and Kentucky to give these new state exams? Are we still in the middle of a conversation that needs to be settled a bit more before we start testing our kids on this?

Well, researchers always ask for more time!

If you go back even more than half a century in the measurement literature, you find people bemoaning the fact that we are always stuck with assessing the curriculum that people devise, and we rarely have the ability or the leisure to track kids for 10 or 15 years and see what they're actually doing with the material later on. So I think, in general, the push is in the right direction. The further details, I think, we should have more debate about, mostly at the high school level.


Patricia Willens



Comments [8]

ann from NYC

I applaud Ms. Fertig for interviewing this key figure in our students' education. While more in-depth questioning of his "methods" could have occurred, his complete failure to utilize the data/results from this year's testing, and to basically take it on faith that we are "going in the right direction," is the same old story from education researchers. I would have more faith in his methods if he could explain himself in clearer language to teachers and parents, or provide an actual curriculum and lesson plan examples to enable the kids to reach proficiency.

Aug. 26 2013 11:01 AM

Again, Beth Fertig failed to ask hard-hitting questions. In this case she wasn't even able to help clarify Koretz's points. We all know the testing culture is out of control in this country. When will Fertig begin asking challenging questions?

Aug. 17 2013 10:36 PM
Barbara Deutsch

The education system is missing the boat. There is way too much emphasis put on standardized testing and not enough placed on what is going to help students become better readers, writers, researchers and mathematicians. Standardized tests do not prepare students for college and life - better teaching methods and more instruction time do. Precious time is being wasted on preparing for tests, and teachers' talents are falling by the wayside. This past year, teachers should have been able to skip final exams and instead have students look at a subject they studied during the year in greater depth. They might actually learn something by devising a project, researching it, and engaging in critical thinking. These are the skills necessary for college success, not the ability to pass a standardized test. Common Core standards could be reflected in each student's project as a way to gauge whether or not they understand the concepts. We might even end up developing a country of independent thinkers.

Aug. 17 2013 10:06 PM

Everyone's an "expert". This one advises New York on state exams? No wonder he spoke in general terms but didn't discuss the exams themselves. Anyone else see the conflict of interest?

Here's an independent look at some of the actual questions students saw:

Aug. 16 2013 10:30 PM
Mary Adams

"What I suggest is as we get more results, more than percent proficient, as we begin to learn more about how kids did on specific content in the test, and how different kinds of kids did on specific content, then I think people ought to start debating that, and what we want our kids to do and what does this suggest about what we need to change?...." Who is the "WE" Dr. Koretz says will be getting results, learning how kids did and debating test "improvements"? Certainly not students, parents, teachers, administrators or local policy makers -- none of us are allowed to see the tests or itemized results, even after tests are completed, even under "secured" conditions! Question for Dr. Koretz: What are the distribution assumptions about raw test scores in NYS student growth percentile model? What happens if there is relatively little variability in raw results?

Aug. 16 2013 06:27 PM
Leonie Haimson from NYC

Beyond the critiques expressed above, which point out that the exams themselves as well as the cut scores may be invalid, I'd like to know what evidence there is that setting standards higher actually leads to more learning -- or simply more pressure on kids and teachers, and more failure.

Aug. 16 2013 06:08 PM
Katie Lapham from Brooklyn

I may not be a mathematician, but I do consider myself an expert on assessments, having administered countless - both standardized and my own - over the past seven years of teaching ELLs (English-language learners) in the NYC public school system.

The 2013 Common Core New York State tests were way too long and developmentally inappropriate. The sheer length of them is tantamount to child abuse, in my opinion. I don't bother trying to make sense of Dan Koretz's analysis because, in my opinion, these tests are invalid and meaningless. They tell me nothing about what my individual ELLs have learned and what skills they still need to work on. Likewise, they do not reflect the invaluable real life knowledge my students have acquired through engaging curricula, which is now being compromised by NYC/Pearson's CCSS ReadyGEN ELA program, but that's a separate issue I shall address in a different conversation.

Standardized testing has taken over my teaching program. As an out-of-classroom teacher, I lost over 40 teaching days in the 2012-2013 school year to the preparation, administration and scoring of NYS assessments. We are TOO busy testing to teach meaningful content in a consistent manner.

I urge policymakers, think-tankers, foundations and corporate parties to read my blog, which details the ways in which NYC public school students are being robbed of a meaningful education as a result of their corporate ed reform policies, including high-stakes testing.

Katie Lapham, elementary ESL teacher
Brooklyn, NY

Aug. 16 2013 12:18 PM
Carol Burris

Dan Koretz helped NYS create its flawed "college and career readiness" scores on Regents exams. While he claimed they should not be used for high-stakes purposes, they have become the new proficiency cuts for high schools. Read about the problems with their creation here:

Aug. 16 2013 10:52 AM
