I have enormous respect for Matt. He commands a great deal of information about a complex topic, he has a strong grasp of research methods, and he has the ability to distill the thorny language of academic research into writing that lay people can not only understand, but use to inform themselves about this critical debate.
Which is why I am so very, very disappointed in his latest post:
Using value-added and other types of growth model estimates in teacher evaluations is probably the most controversial and oft-discussed issue in education policy over the past few years.
Many people (including a large proportion of teachers) are opposed to using student test scores in their evaluations, as they feel that the measures are not valid or reliable, and that they will incentivize perverse behavior, such as cheating or competition between teachers. Advocates, on the other hand, argue that student performance is a vital part of teachers’ performance evaluations, and that the growth model estimates, while imperfect, represent the best available option.
As Atrios said the other day: when you're confronted with a "Clowns to the left of me, Jokers to the right..." column, watch out.

I am sympathetic to both views. In fact, in my opinion, there are only two unsupportable positions in this debate: Certainty that using these measures in evaluations will work; and certainty that it won’t. Unfortunately, that’s often how the debate has proceeded – two deeply-entrenched sides convinced of their absolutist positions, and resolved that any nuance in or compromise of their views will only preclude the success of their efforts. You’re with them or against them. The problem is that it’s the nuance – the details – that determine policy effects.
The issue has never been "certainty" - everybody understands that no measure is perfect, and that there will be some inevitable flaws in any system of evaluating teachers. The issue is "appropriateness." It is not appropriate to use test scores in high-stakes decision making when everyone - especially Matt - knows the error rates are far too high.
Even if you create an evaluation system that mitigates the huge margins of error (60% spreads?! Seriously?!), you're still left with the question of what you're going to do with the teacher's score once you have it. Fire them? Deny seniority? Pay them less or more? How can anyone possibly be for making these high-stakes decisions when they know the error rates are so high?
Matt, buddy - that's what we're talking about, isn't it? That's the entire issue. Nobody is against "well-designed teacher evaluations"; we're against poorly-designed ones. Do you think the evaluations people like Michelle Rhee and Chris Christie and Arne Duncan are selling are any good?

Let’s be clear about something: I’m not aware of a shred of evidence – not a shred – that the use of growth model estimates in teacher evaluations improves the performance of either teachers or students.

Now, don’t get me wrong – there’s no direct evidence that using VA measures has a positive effect because there’s really no evidence at all. This stuff is all very new, and it will take time before researchers get some idea of the effects. There is some newer evidence that well-designed teacher evaluations can have positive effects on teacher performance (see here, for example), but these systems did not include test-based measures. [emphasis mine]
Given this, is it so extreme to say "don't use test scores to make high-stakes decisions about teachers"? Is that a position that is just as far out of the rational center as saying "fire and pay teachers based on test scores"?

This situation would seem to call for not simple “yes/no” answers, but rather proceeding carefully, using established methods of policy evaluation and design. That is not what is happening. Thanks in large part to Race to the Top, almost half of public school students in the U.S. are now enrolled in states/districts that already have or will soon have incorporated growth estimates into their evaluations. Most (but not all) of these states and districts are mandating that test-based productivity measures comprise incredibly high proportions of evaluation scores, and most have failed to address key issues such as random error and the accuracy of their data collection systems. Many refused to allow for a year or two of piloting these new systems, while few have commissioned independent evaluations of these systems’ effects on achievement and other outcomes, which means that, in most places, we’ll have no rigorous means of assessing the impact of these systems. [emphasis mine]
Dude, my side isn't implementing ANYTHING! The corporate "reformers" are doing all the implementing! They want to radically change the way teachers are employed, paid, and fired on the basis of this stuff - not us teachers! And they want to do so without any of the caveats you're suggesting.

In my view, this failure to address basic issues reflects extreme polarization between the “sides” in this debate. When positions are black and white, details and implementation get the short end of the stick.
And yet, you seem to think it's incumbent on us teachers to give a little here:
Matt, if you were here, I'd make you look me in the eye while I say this:

On the other “side” of the divide, any admission that growth measures might play even a small, responsible role in evaluations risks the dreaded slippery slope, while a cautious acknowledgment that standardized testing data do provide “actionable” information somehow represents a foot in the door for an evil technocratic regime that will sap public education of all its humanity. [emphasis mine]
You admit that their side is pushing for a test score-driven method of evaluating teachers that is full of error. You admit that their side is going to make all sorts of high-stakes decisions based on this system, even though we all know that is completely inappropriate and will certainly cause great harm to both the teaching corps and the schools in the coming years.
And yet - even though you admit these people are doing something very, very wrong - you want me to give them the benefit of the doubt, concede to piloting their methods, and not assume this is a "slippery slope"?
Matt, you have got to be kidding me.
What do you think will be the outcome of their "studies"? How "independent" do you think the "researchers" who come up with the conclusions will be? We may as well let BP study the damage from the Gulf oil spill; we may as well let Goldman Sachs determine whether the markets are rigged (actually, I think we may be letting Wall Street do just that...).
These people have already shown their hand, Matt. There's no doubt what these "studies" will conclude. They have made up their minds and are going to cherry-pick whatever they can to conform to their worldview.
How do I know this? Simple: they are doing it right now. If you want me to give them the benefit of the doubt, they're going to have to stop their march to implement a program you and I and everyone else knows has not been studied nearly enough and should not be implemented.
What do you think the odds of that happening are, Matt?