For now, here's today's testimony.
* * *
AchieveNJ, as currently constituted, is fundamentally flawed.
It is critical for this board and the Department of Education to understand
that AchieveNJ violates the most basic laws of measurement and statistical
practice; consequently, it is simply not viable.
AchieveNJ consists of three basic parts: a score based on
teacher practice, a score based on Student Growth Objectives (SGOs), and, for
teachers in tested grades and areas, a score based on Student Growth
Percentiles (SGPs). These scores are weighted and combined to create a
summative rating, which determines the final effectiveness rating of a teacher.
No doubt you are aware of the work of Dr. Bruce Baker, who has
shown that the SGPs have inherent biases at the school level against schools with larger proportions of at-risk students.[1]
If these measures are biased at the school level, there is every reason to
believe they are biased at the teacher level as well. No teacher should be
punished in his or her ratings simply because he or she chooses to work with the
neediest students.
You may also be aware that there is no evidence that SGOs are
either valid or reliable, particularly in the many untested subjects in which
they were used this past year.[2]
The NJDOE has released research about SGOs on its website; my review of that
literature, however, confirms that we have next to no evidence of any
predictive validity for SGOs that would lead us to conclude that they are
viable measures of student achievement, let alone teacher effectiveness.
These are serious concerns. But as my time is limited, I’d
like to focus on one particular part of AchieveNJ that has received little
attention, yet is probably its greatest flaw: the illusion of precision.
In both the creation of scores for each of its three
components, and in the combination of those scores to create a summative rating,
AchieveNJ creates scores that are more precise than they should be. Even though
it is impossible under the laws of mathematics to create finely-grained scores
for teacher evaluations, AchieveNJ violates those laws and does just that;
consequently, the ratings teachers receive under this system are ultimately
arbitrary and capricious.
To understand this phenomenon, it is necessary to understand
the idea of “significant figures.” As any high school student will tell you,
you can’t average measurements without rounding the result to the precision
of the original measurements. The reason is that an average reported with
extra digits implies that the measuring instrument can make distinctions
that fine.
Unfortunately, AchieveNJ’s teacher practice scores violate
this most basic of mathematical concepts. Take, for example, the Danielson
instrument: in every component of this model, teachers receive a score of 1, 2,
3 or 4. Yet in the Teachscape system used by many districts, and in materials
published by the NJDOE, the scores of each Danielson component are averaged to
numbers in between these integers.
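The arithmetic can be sketched in a few lines of Python. This is only an illustration: the number of components and the individual ratings below are hypothetical, chosen so that the raw average comes out to the 3.15 figure discussed in this testimony. It is not a reproduction of how Teachscape or the NJDOE actually computes scores.

```python
# Hypothetical ratings for 20 components, each an integer from 1 to 4.
# The component count and the individual ratings are made up for illustration.
component_scores = [3] * 17 + [4] * 3

# A plain average produces digits the instrument never measured.
raw_average = sum(component_scores) / len(component_scores)

# The instrument only distinguishes whole-number levels, so the average is
# rounded back to that resolution before it is reported.
reported_score = round(raw_average)

print(f"raw average:    {raw_average:.2f}")
print(f"reported score: {reported_score}")
```

Under these assumed ratings, the raw average prints as 3.15, while the score the instrument can actually support is 3.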
In an example on the NJDOE’s website[3],
a hypothetical teacher is given a rating of 3.15 in their teacher practice
score. This is, to put it bluntly, innumerate. Any student in a high school
science class who averaged this way would fail. I know this because,
ironically, the Common Core State Standards in Mathematics specifically require students to demonstrate the ability to “Choose a level of accuracy appropriate to
limitations on measurement when reporting quantities.”[4]
(CCSS.MATH.CONTENT.HSN.Q.A.3)
Even Charlotte Danielson herself would tell you her
instrument is incapable of distinguishing between a teacher who gets a score of
3.15 and a teacher who gets a score of 3.25. AchieveNJ is perpetuating an illusion of precision.
This is critically important because the cut scores set by
the NJDOE are based on this illusion. AchieveNJ actually pretends that the
teacher practice instruments can distinguish between a teacher who gets either
below or above a 2.65, the cut score that determines “effectiveness.” Not only
is there absolutely no research base to support this arbitrary cut score: the
cut score itself is in violation of a
mathematical concept we expect our high school students to comprehend.
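A small sketch shows how fragile a cut score at this precision is. The two teachers below are hypothetical, as is the 20-component layout, and the rule that an average at or above 2.65 counts as "effective" is an assumption made only for this example. The teachers differ by one point on a single component.

```python
from fractions import Fraction

CUT_SCORE = Fraction("2.65")  # the cut score cited in this testimony


def practice_average(scores):
    """Exact average of integer component ratings."""
    return Fraction(sum(scores), len(scores))


def is_effective(scores):
    # Assumption: an average at or above the cut counts as "effective".
    return practice_average(scores) >= CUT_SCORE


# Hypothetical teachers: identical except for one component rating.
teacher_a = [3] * 13 + [2] * 7  # average 53/20 = 2.65
teacher_b = [3] * 12 + [2] * 8  # average 52/20 = 2.60

for name, scores in (("A", teacher_a), ("B", teacher_b)):
    avg = practice_average(scores)
    print(name, float(avg), is_effective(scores), round(avg))
```

Both averages round to the same whole-number level, 3, yet under the assumed rule the two teachers land on opposite sides of the cut.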
Why did NJDOE decide to perpetuate this illusion? I can only
guess, but I suspect it is because they had a problem with combining SGPs,
which are on a 1-to-99 scale, with teacher practice scores, which should be on a 1-to-4 integer scale. But
adding phony precision to teacher observations is not a solution to this
problem.
An evaluation system that ignores the basics of measurement
cannot and should not be trusted. I am afraid, looking at both the membership
of and the list of witnesses brought before the Educator Effectiveness Task Force, that this basic concept slipped the grasp of those who were charged with
developing AchieveNJ.
I would urge this board to avail itself of the many
excellent scholars and researchers in New Jersey who have expertise in this
area, and give them extended time to explain to you these and the many other
flaws found in AchieveNJ. And I would urge you to put a moratorium on any
high-stakes decision based on AchieveNJ until such time as its many problems
can be corrected.
Thank you for your time.
AchieveNJ, aka: Operation Hindenburg
SADLY THESE FLAWED EVALUATION AVERAGES ARE PLACING STIGMAS ON GREAT TEACHERS. THE ENTIRE PROCESS HAS WASTED MORE NEEDED MONEY THAT SHOULD BE USED FOR THE NEEDIEST SCHOOLS AND STUDENTS. EVERY YEAR PROGRAMS GET TESTED BECAUSE THE STATE IS LOOKING FOR SOME WAY TO EVALUATE AND IMPROVE EDUCATIONAL OUTCOMES. I REMEMBER TOSSING OUT WASTED PAGES OF RESEARCH AFTER SEVERAL CONFERENCES AT HOTELS TO DETERMINE A BEST TEACHING METHOD FOR EACH SCHOOL, ONLY TO REALIZE THAT THIS COMING TOGETHER WASTED SO MUCH TIME AND ENERGY BECAUSE NONE OF IT WAS GIVEN A FAIR CHANCE TO WORK. THE STATE CONTINUES TO CHANGE POLICY WITH EACH NEW GOVERNING BODY. THIS PROCESS GOES ON AND ON AND ALWAYS HAS FLAWS WITH EACH CHANGE OF GOVERNING BODIES. GREAT TEACHERS ARE LET GO BECAUSE THEIR EQUATIONS FOR EVALUATIONS ARE FLAWED AND THE STUDENTS ARE NOT REALLY THE MAIN OBJECT OF CONCERN. IT WILL ALWAYS BE ABOUT MORE MONEY FOR CORPORATIONS AND POLITICIANS AND THEIR CRONIES. IF YOU PAID STUDENTS TO LEARN IN THE POOR DISTRICTS I THINK THIS WOULD BE THE BEGINNING OF CHANGE. THE MAIN CAUSE IS POVERTY NOT THE TEACHERS.