I will protect your pensions. Nothing about your pension is going to change when I am governor. - Chris Christie, "An Open Letter to the Teachers of NJ" October, 2009

Thursday, March 21, 2013

NJ Ed Commish Cerf: Wrong On Poverty, Wrong On Teacher Evaluation

If my email is any indication, teachers across New Jersey are just now being introduced to our new, top-down, state-mandated evaluation system, AchieveNJ. There's a lot of concern about this system - as well there should be. AchieveNJ has many serious flaws that I've documented over the last two weeks: see here, here, here, here, here, here, and here.

But none of these seem to concern Commissioner of Education Chris Cerf; if anything, he seems more confident than ever in AchieveNJ's infallibility:

Students in New Jersey are going to be placed in peer groups with students who scored similarly on the previous year’s test. They will get a score based on how much better or worse they do than others in their peer group.
“You are looking at the progress students make and that fully takes into account socio-economic status,” Cerf said. “By focusing on the starting point, it equalizes for things like special education and poverty and so on.” [emphasis mine]
According to Cerf, AchieveNJ doesn't just take socio-economic status into account: it takes SES "fully" into account. Notice the certainty in Cerf's tone; he is completely confident that the research backs up his position.

Alas - once again - the facts are not on Cerf's side:

There is substantial evidence, based on previous uses of test-based teacher evaluations, that teachers who educate students in poverty will pay a penalty when judged by AchieveNJ.

Everyone who cares about education in New Jersey - arguably, the highest-performing state in the nation when accounting for student characteristics - should be gravely concerned that Commissioner Cerf is selling his teacher evaluation plan under demonstrably false pretenses:

The Education Commissioner argues students who come from low-income areas will likely score similarly on the previous year's test, so he says they will likely end up in the same peer group where they are only compared to others like them. Same with students with disabilities, or students who come from high-income areas.
“The assumption he’s making is that the initial score embeds all of the background disadvantage of the student, therefore there’s no reason to use other measures for accounting for that,” said Bruce Baker, a professor at the Graduate School of Education at Rutgers University.
Baker studies teacher evaluation models and argues the state’s system doesn’t accurately isolate the role a teacher has on learning. [emphasis mine]
Well, who's right? Is there a bias in a teacher's evaluation if that teacher has a disproportionate number of disadvantaged students? Has there been a previous use of SGPs that can be studied to determine if they are biased against students - and, therefore, their teachers - who are in economic distress?

Luckily for us - and apparently unknown to Commissioner Cerf - we know that there is.

AchieveNJ uses a method called Student Growth Percentiles (SGPs) to determine a teacher's "effectiveness" as evinced in state test scores. The method is fatally flawed to begin with: even its "inventor" says it can't determine why a student has a particular level of "growth," which makes it inappropriate for use in teacher evaluation. But let's put this enormous flaw aside; have any other states used SGPs to evaluate teachers, and did they find a bias against students in need and, consequently, their teachers?
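For readers who want to see the mechanics, here's a minimal sketch in Python of the peer-group idea as the news article quoted above describes it. Big caveats: this is a simplification, not the state's actual computation (as I understand it, real SGPs are estimated with quantile regression over prior scores rather than literal bins), and the function name, the `bin_width` parameter, and the student records are all made up for illustration.

```python
from bisect import bisect_left
from collections import defaultdict


def growth_percentiles(students, bin_width=10):
    """Rank each student against peers with a similar prior-year score.

    students: list of (student_id, prior_score, current_score) tuples.
    bin_width: hypothetical width of a "peer group" on the prior-score scale.
    Returns {student_id: percentile rank (0-100) within the peer group}.
    """
    # Peer groups are formed from the prior score ONLY (an assumption that
    # mirrors the quoted description; no poverty or disability data is used).
    groups = defaultdict(list)
    for student_id, prior, current in students:
        groups[prior // bin_width].append((student_id, current))

    percentiles = {}
    for members in groups.values():
        scores = sorted(current for _, current in members)
        n = len(scores)
        for student_id, current in members:
            below = bisect_left(scores, current)  # peers scoring strictly lower
            tied = scores.count(current)          # includes the student
            # Mid-rank percentile: ties share the middle of their range.
            percentiles[student_id] = 100 * (below + 0.5 * tied) / n
    return percentiles


# Hypothetical records: (id, last year's score, this year's score).
toy_students = [
    ("a", 200, 210), ("b", 203, 220), ("c", 207, 230),
    ("d", 250, 255), ("e", 252, 250),
]
print(growth_percentiles(toy_students))
```

Notice what the function takes as input: a prior score and a current score. Nothing else. Any disadvantage that the prior score doesn't capture never enters the calculation.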

It turns out that just last year, New York collected data for millions of students on the results of state-level standardized tests. The state then calculated evaluations for thousands of teachers across the state, based on the SGPs of their students. The NY "growth model" was then analyzed by the American Institutes for Research and presented in a final Technical Report that was released last fall. Their conclusions (p.33)?
Table 11 indicates that there is a moderate correlation with the percent SWD [Students With Disabilities] and percent economically disadvantaged in a class. This correlation is negative, indicating that as the percent SWD or percent economically disadvantaged in a class increases, the MGP [median growth percentile] tends to decrease. However, teachers with high and low concentrations of SWD and economically disadvantaged students are still able to achieve MGPs across the distribution. [emphasis mine]
Translation: if a teacher has more children with disabilities or who are economically disadvantaged in his or her class, that class's average Student Growth Percentile will tend to go down. This stands in direct contradiction to Chris Cerf's assertion that SGPs "fully take into account socio-economic status."
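To make "this correlation is negative" concrete, here's a small Python sketch of a Pearson correlation between the percent of economically disadvantaged students in a class and that class's growth percentile. The five data points are invented purely to exercise the function; they are not taken from the AIR report or any real classroom, and `pearson_r` is just an illustrative helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariation / sqrt(var_x * var_y)


# Made-up classes: percent economically disadvantaged vs. class growth percentile.
pct_disadvantaged = [10, 30, 50, 70, 90]
class_growth = [55, 52, 50, 47, 49]

r = pearson_r(pct_disadvantaged, class_growth)
print(f"r = {r:.2f}")  # a negative r means growth tends to fall as poverty rises
```

A negative value doesn't mean every high-poverty class scores low; as the report says, teachers can still land anywhere in the distribution. It means the odds are tilted, which is exactly the problem when a score feeds a forced decision.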

But hold on! Some of you might say: "Hey, Jazzman, it says there's a 'moderate' correlation! What's the big deal?" Well, when you're dealing with forced decisions based on test scores, it is a big deal: again, some of the evaluation, all of the decision.

But aside from that, there's good reason to believe that AchieveNJ has a worse bias against economic disadvantage than the NY model. Because the New York growth model at least tries to account for the disadvantage that students in economic distress suffer on state tests (p. 14).
NYSED’s regulations permit three specific control variables at the individual student level for inclusion in the model designed to produce adjusted scores through 2011–2012, without additional classroom- or school-level variables. These variables, which are listed below, were selected after consultation with the Regents Task Force. Additional variables may be included in a value-added model in future years, including classroom- or school-level variables that may reflect the context in which learning occurs. These are the student-level predictor variables:
ELL status: A Y/N variable was provided to indicate ELL status. 
Students with disabilities (SWD) status: A Y/N variable was provided to indicate SWD status. 
Poverty or economic disadvantage (ED): A Y/N variable was provided reflecting New York State’s rules related to family income levels and participation in economic support programs. A description is provided in Appendix E.
So the New York growth model at least attempts to account for the differences in test score outcomes, based on whether a child speaks English at home, whether they have a disability, and whether they are in poverty. Let's be very clear about this: the NY growth model fails - but at least they tried. It's reasonable to assume there is some mitigation of the effects of poverty in the teacher evaluation scores as reported by MGPs.

But - and this is critical to understand - AchieveNJ doesn't even make the attempt to correct SGPs for poverty! 

In other words: the NY model attempts to account for students who are in economic distress, and tries to correct for the disadvantage they have on state tests, so that teachers who have these children in their classes are not unfairly penalized. However, even after this attempt, AIR found there is still a bias against these teachers. But AchieveNJ doesn't even try to account for students' socio-economic status.

It is, therefore, reasonable to assume that the bias against New Jersey's teachers who take on the arduous task of educating kids who are in poverty will be even greater than the bias that has already been demonstrated in New York!

But hold on - it gets even worse! Because the New York model only disaggregates children in "economic disadvantage" by a "yes/no" variable. There's no accounting for different levels of poverty. Why does this matter?

As Bruce Baker has demonstrated, there is (at the school level at least) a substantial difference in the test score outcomes of children who are "less poor among the poor," and those who are in "deep" poverty. The New York model doesn't account for this; it treats all children below a certain threshold as equally disadvantaged. It is, therefore, reasonable to assume that the teachers of children who are in "deep" poverty will pay an even higher price on their evaluations for taking on this task.

So AIR's evaluation of New York State's growth model stands in direct contradiction to Commissioner Cerf's claim: SGPs are biased against teachers who educate our poorest children. And the penalty is most likely worse for New Jersey's teachers than it was for New York's. The evidence is clear and unambiguous.

And yet here stands Commissioner of Education Chris Cerf, absolutely certain in his righteousness, undeterred by the facts and past practices. He states, with the utmost confidence, that AchieveNJ "fully takes into account socio-economic status."

Teachers and parents of New Jersey: don't believe the hype. Your Commissioner of Education is flat out wrong: the teachers who educate poor children will be penalized under AchieveNJ.

Does everyone understand how incredibly irresponsible this is? Do Chris Cerf and the NJDOE even care?

Accountability begins at home.


Tupper Cooks! said...

You know what the problem with politicians is? They get bought off too easily by the modern day snake oil salesman.
Once they start drinking it they can't stop, even when faced with data that clearly refutes their lies and mistruths. That said, (I am Capt. Obvious this morning) great post Jazzman. I'm going to a forum tonight where Dick Iannuzzi, president of NYSUT, will be giving us his take on the state of things in NY, APPR and the Common Core. Can't wait to see if he can handle hard questions or if our Union is in bed with State Ed.

Unknown said...

Eric Hoffer had this guy pegged: "Those in possession of absolute power can not only prophesy and make their prophecies come true, but they can also lie and make their lies come true."

I'm sure everything will turn out fine if we just trust Commissioner Cerf. He MUST know what he's doing, or he wouldn't be in charge, right? Why worry?

(Sorry, bureaucrats bring out the sarcasm in me...)

technokat said...

This blog entry needs to be sent to all state BOE members. They read our letters and they do respond to the feedback of educators in the state. This also needs to be sent to the legislative education committees.

The following is from the NJEA.org site:

"The N.J. Department of Education (NJDOE) has released its proposed regulations for teacher evaluation. The State Board of Education (SBOE) got its first look at the proposal at its meeting on Wednesday, March 6. Starting on March 13, the NJDOE will hold eight presentations across the state to communicate the details of these regulations. NJEA members are encouraged to attend one of these free events. The department has stated that these presentations 'will provide an opportunity for educators to not only learn more about the system, but also to provide feedback to the department on how best to support local districts as they implement changes.' "

I have not been able to attend any of the sessions so far, but a few from my district were scheduled to attend last night's session. Providing feedback is important, although it appears that the DOE is only interested in feedback on how to support the implementation of these regulations--not how to improve or amend them. At least the DOE will be able to hear the arguments--whether or not these regulations will be amended or scrapped is another thing. We can and should take this fight further. Every NJ educator who can attend should make every effort to do so.

ad77 said...

If the NJ DOE is trying to win friends with their road show presentations, they need to have more respect for the audience.

At the Toms River event, questions were written down and handed up front for the presenter to answer. The first question read, "What is the definition of multiple?" The question obviously related to the "multiple" people that would be observing teachers for evaluation purposes. Since the system itself is overly prescriptive, one would think that the DOE would be dictating at the very least the minimum number of people involved in this assessment. A good question.

The presenter read the question, and quite arrogantly & in a very condescending tone said, "MUL-TI-PLE, MEAN-ING MORE THAN ONE -- question answered". Then slammed the card down.

Vidoqo said...

Again, we see such a simplistic model of how SES relates to a student's human capital, or their total abilities to be successful. SES is a powerful general predictor, but it is just that - general. Too often it is assumed to represent more than it should. As it is mostly used in education, it refers only to parent income. But that is only one measure of what is a much more complex picture of SES.

A proper measure of SES would include not only family income but things like parent education, whether the family is intact, are there health issues both physical and psychological, neighborhood safety, substance abuse, etc. The number of factors is almost endless, and there are limits to what can reasonably be measured when designing policy. But that doesn't mean that confining ourselves to a simplistic model frees us from being limited by our data. Just the opposite, it means our data is superficial, and any conclusions we derive from it will be limited.

So, by looking only at parent income, two families could look similarly disadvantaged on paper, while in reality having wildly different levels of disadvantage.

james boutin said...

I was talking to a group of college students in New Jersey about urban education reform yesterday, and I told them that on Twitter, at conferences, and among the blogs, it's the New Jersey teachers who are often most angry/animated regarding the direction education reform is moving.

This is a nice piece of evidence I'll be able to provide to them as to why.

Ultimately, as far as I can tell, virtually every major policy that educators often fight against has the tremendously flawed understanding of standardized test scores this blog post points out at the heart of its rhetoric.

I noted just that in a message I sent my Washington State legislator today regarding WA Senate Bill 5242, which would force districts and labor associations to accept the state's version of mutual consent in the placement of "displaced" teachers.

Thanks for the continued detailed analysis of NJ.