
Sunday, January 26, 2014

What Our "One Newark" Report Means

Late Thursday, Bruce Baker and I released our analysis of "One Newark," the plan to restructure schools in the state-run district. I'll admit the report is a bit heavy on technical language, but that's as it should be: we wanted to be clear about how we approached the task of evaluating One Newark, and why we reached the conclusions that we did.

But I think it will be helpful here to put our findings into more vernacular language. Understand: what follows are my words alone, and not Bruce's.

Let's start by stating what An Empirical Critique of "One Newark" doesn't conclude:
  • We're not saying that there aren't good reasons to close or redesign schools, or that Newark shouldn't pursue a restructuring plan. Frankly, NPS hasn't released enough information for us to determine whether a One Newark-type plan is necessary. The state requires Newark to have a current Long Range Facilities Plan (LRFP), but the Education Law Center tells us they have asked for it repeatedly and have never received the document. Maybe school closings and redesigns are warranted -- but it's hard to say without all of the facts.
  • We're not saying Newark's schools can't and shouldn't improve. Of course all schools and all districts should strive continuously to get better, and everyone agrees that Newark's students can and should improve their academic outcomes. I would never make the case that schools don't matter or aren't an important part of a class mobility strategy. Newark's children deserve great schools (but that is not all they deserve, nor will it be the only way they get out of poverty).
  • We're not saying we've found a definitive way to measure school performance. Anyone who thinks they have is fooling themselves: school "effectiveness" is a nebulous concept to begin with, and it's very hard to disentangle school outcomes from student outcomes, a point Matt DiCarlo has made many times. That said, we should use the data we have -- assuming it's of a high enough quality -- to inform policy decisions.
Speaking for myself (again, Bruce may have a different take), these are the important takeaways from our report:

- The consequences of One Newark fall disproportionately on some types of students: specifically, black students and economically disadvantaged students are more likely to experience disruption in their schools than other Newark students. The "patterned" bars in the chart below represent differences that are statistically significantly different from the "No Major Change" group.

[Figure from the report: percentages of black and economically disadvantaged students by One Newark classification.]

On its face, this ought to concern everyone, for four reasons: first, any sanction that affects students disproportionately by race or socio-economic status has got to be questioned purely on civil rights grounds. When the charter takeover schools have comparatively high numbers of black students, it suggests a racial bias that may not be deliberate but is nonetheless quite real. Same with income bias in the Renew schools.

Second, this is a state-run district: the administration can implement these changes without the consent of locally elected officials, a huge difference from the vast majority of districts in the rest of the state. The biases are all the more reason why Newark's citizens ought to be allowed to accept or reject this plan on its merits.

Third: as Bruce points out in an upcoming paper, there is a serious question as to whether the rights of students attending charter schools and their families are similar to the rights public school enrollees enjoy. Charter schools are not public schools, even if they are publicly funded. They are private entities acting as government contractors or agents: to make an analogy (admittedly, a bit of a stretch), private security forces working on behalf of the US Government in Iraq are not part of the military. And if the last snow day in Newark doesn't convince you of this truth, I guess nothing will.

Finally: the evidence that "turning around" these schools, or closing them, or handing them over to charter operators will lead to better results is, at best, dubious. The "turnaround" model is not particularly promising; nor is the closure model. Why, then, subject a group of Newark students who are statistically significantly different demographically to these interventions when there's very little evidence, if any, that this strategy will work?

- NPS says it made One Newark decisions based on student outcomes. But, on the whole, many of these differences are not significant.

[Figure from the report: average proficiency rates and median growth percentiles (MGP) by One Newark classification.]
We have to be careful here, because the sample sizes for each category of One Newark sanction are quite different, and that can influence statistical tests. But the only statistically significant difference we found between the groups was in the Renew schools, and that was on proficiency rates -- the proportion of students who "cleared the bar" on state tests. The MGP scores -- a measure of "growth" on state tests that supposedly accounts for where students started, so schools aren't penalized when their students grow but don't all reach proficiency -- are not significantly different. In fact, the closure schools show higher average growth than the schools that aren't subject to One Newark sanctions.
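For readers who want to see what this kind of group comparison looks like mechanically, here's a minimal sketch in Python. Every number here is made up, and Welch's t-test is just one reasonable choice when group sizes differ this much -- I'm not claiming it's the precise procedure from the report.

```python
# Hypothetical illustration only: none of these figures are from the report.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Made-up proficiency rates (% proficient) for two groups of schools.
no_major_change = rng.normal(45, 10, size=40)  # a large untouched group
renew = rng.normal(33, 10, size=9)             # a much smaller Renew group

# Welch's t-test doesn't assume equal variances, which matters when the
# groups differ in size as sharply as the One Newark categories do.
t_stat, p_value = stats.ttest_ind(renew, no_major_change, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# "Statistically significant" in the discussion above means a p-value small
# enough (conventionally below 0.05) to make chance an unlikely explanation.
```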

Again: this isn't a comprehensive breakdown of each school's effectiveness. What we're showing here is that if NPS is making One Newark decisions based on student outcomes that don't penalize schools for where students start, we can't find the pattern. And NPS has not published a comprehensive account of how they classified schools; until they do, their system remains in doubt.

- There is no evidence that NPS took student characteristics into account when judging schools on academic performance. This is an enormous mistake, because those characteristics have a huge impact on student achievement.

Remember junior high algebra? Remember how you could have an equation, where "y" equals something you do to "x," and you could plug in different numbers, and for every "x" there'd be one unique "y"? And you could plot that out on a graph?

Suppose we could come up with an equation to do that with student proficiency rates for a school. Suppose that if you gave me some statistics describing a school -- percentage of students who are economically disadvantaged, percentage of students who don't speak English at home, percentage of students who have special education needs, percentage of girls vs. boys, etc. -- I could plug them into an equation and give you a pretty good prediction of what that school's proficiency rate would be. Not perfect, but pretty good.

That's basically linear regression: we can look at all of the schools in Newark, charter and NPS, and come up with an equation that allows us to predict each school's proficiency rate. Again, it's not a perfect equation, because there are other things that contribute to student outcomes aside from the four variables above: testing error, other student characteristics, and, yes, school effectiveness. It's also incorrect to say we always know that the variables cause the differences in outcomes; we don't. What statisticians often say is that the independent variables (here: free lunch eligibility, special education status, Limited English Proficient status, and gender*) "explain" the dependent variable (here, proficiency rates).

In our regression models, about 70% of the differences in school proficiency rates can be "explained" by these four variables**. Yet NPS apparently never took these student characteristics into account when classifying schools. And, in this case, we most certainly do know that these variables can and do "cause" a school's test-based outcomes to rise or fall. Poverty matters; so do, obviously, special education needs.  
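To make the regression idea concrete, here's a minimal sketch using fabricated data. The variable names and coefficients are mine, invented for illustration; the report's actual models are specified in the paper itself.

```python
# A toy version of the school-level regression described above.
# All data here is fabricated so the example runs; nothing is from the report.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_schools = 70

# Hypothetical school characteristics, in percentages.
X = np.column_stack([
    rng.uniform(40, 100, n_schools),  # % free lunch eligible
    rng.uniform(0, 30, n_schools),    # % Limited English Proficient
    rng.uniform(5, 25, n_schools),    # % special education
    rng.uniform(40, 60, n_schools),   # % girls
])

# Fabricated "true" relationship plus noise, standing in for the real world.
proficiency = 85 + X @ np.array([-0.5, -0.2, -0.8, 0.1]) \
              + rng.normal(0, 6, n_schools)

model = sm.OLS(proficiency, sm.add_constant(X)).fit()
print(model.params)              # the fitted "equation"
print(round(model.rsquared, 2))  # share of variation "explained"

# R-squared is the statistic behind the ~70% figure in the report: the share
# of school-to-school differences in proficiency that the four demographic
# variables account for.
```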

So how could NPS possibly make good judgements about the effectiveness of a school without taking these student characteristics into account? It turns out they didn't...

- The One Newark classifications are largely arbitrary and capricious. We used a relatively advanced statistical tool -- multinomial logistic regression -- to check whether there was any pattern in the way schools were assigned for renewal, closure, or charter takeover, based on 8th grade test-based outcomes, school growth percentiles, and student characteristics. Mostly, we couldn't find a pattern; there were a few purely test-based correlations, but those disappeared once we added in student characteristics.
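Here's a bare-bones sketch of what a multinomial logistic regression looks like in code, with entirely made-up schools and randomly assigned categories (so, by construction, there is no pattern to find). It illustrates the tool; it is not a re-run of our analysis.

```python
# Fabricated data: predictors and One Newark-style categories assigned at random.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n_schools = 70

# Hypothetical predictors: proficiency rate, growth percentile, % free lunch.
X = sm.add_constant(np.column_stack([
    rng.uniform(20, 80, n_schools),
    rng.uniform(20, 80, n_schools),
    rng.uniform(40, 100, n_schools),
]))

# 0 = no major change, 1 = renew, 2 = close, 3 = charter takeover.
category = rng.integers(0, 4, n_schools)

# MNLogit models the odds of landing in each category relative to a baseline.
fit = sm.MNLogit(category, X).fit(disp=0)
print(fit.summary())

# If assignments tracked the predictors, the coefficients would be large and
# significant; with random assignment, as here, they shouldn't be.
```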

Maybe NPS has additional data they are using to make their decisions. Maybe they have a methodology that they believe makes sense. Fine -- NPS needs to release that data and their complete analysis before going ahead with a huge and potentially damaging restructuring like One Newark. Anything less is simply not transparent, and in today's New Jersey, transparency is desperately needed.

- There's no evidence that the charter operators that will take over several of the NPS schools will do any better with the same types of students. Let's look at one of the scatterplots from the report:
[Scatterplot from the report: proficiency rate vs. percentage of free lunch-eligible students, for Newark charter schools and the NPS schools slated for takeover.]

The filled-in diamonds are charter schools in Newark. The unfilled diamonds are the schools they are slated to take over. Yes, the charters do better -- a higher proportion of their students show proficiency than at the takeover candidates. But look at how many fewer free lunch-eligible kids they teach. And the charters that do better have fewer free lunch-eligible kids.

We don't know if the charters will do better than the NPS schools at teaching larger numbers of economically disadvantaged children, because they don't teach similar numbers of those children now. But there is a way to gain some insight into this:

Remember that equation we had, where we plugged in FL percentage, LEP percentage, special education percentage, and percentage of girls? We said that equation "explained" about 70% of the differences in schools' proficiency rates -- but it doesn't explain all of the differences. The rest of those differences are called "residuals": the difference between a school's actual outcome and its predicted outcome. Again, we can't say for sure what causes those differences, although it's safe to assume at least some of the difference is due to statistical noise. But some may also be due to the effectiveness of the school; some schools may "beat the odds" because they are better at getting more kids over proficiency.
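Here's a self-contained sketch of the residual idea, again with fabricated data and, to keep it short, just one predictor instead of the four we actually used.

```python
# Residual = a school's actual proficiency minus what the model predicts
# from its demographics. All numbers are fabricated for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
pct_free_lunch = rng.uniform(40, 100, 70)  # 70 hypothetical schools
proficiency = 90 - 0.6 * pct_free_lunch + rng.normal(0, 6, 70)

fit = sm.OLS(proficiency, sm.add_constant(pct_free_lunch)).fit()

# Large positive residuals = "beating the odds": doing better than schools
# with similar student populations are predicted to do.
residuals = fit.resid
top = residuals.argsort()[::-1][:5]
print("Hypothetical schools most above prediction:", top)
print("Their residuals (percentage points):", residuals[top].round(1))
```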

I'll be the first to say that getting more kids over the proficiency bar is a limited goal; there's a good argument to be made that some tactics for raising proficiency, like drill-and-kill, aren't beneficial for students. Still, whether a school "beats the odds" with its current students is an indicator we might pay attention to as a way of ascertaining whether a charter is well-poised to "beat the odds" with different students. What do we find?

[Scatterplot from the report: schools' differences from predicted proficiency ("residuals"), with charters and takeover candidates marked.]

There are Bragaw and Alexander, about one-quarter of the way in from the right, "beating the odds" and doing better than prediction with their proficiency rates. Now, according to a draft of the One Newark plan published at NJ Spotlight, TEAM Academy, the local branch of the national KIPP charter chain, was being considered as the CMO to take over these two schools (the Star-Ledger reports TEAM is still working out the plans). But TEAM performs below prediction. So how does this make any sense? Where is the evidence that TEAM will do a better job with the kids who go to Bragaw and Alexander than the NPS schools they attend now? Especially since TEAM doesn't teach the same types of kids now?



So those are what I consider the main points of our report. Again, I think the best thing NPS can do right now is release all of its data and methodology. Let's get everything out into the open. Then the people of Newark can decide for themselves if they want One Newark.

Of course, that would mean giving them the freedom to choose how to govern their own schools...



It's way past time to have an honest discussion of why they don't have that freedom right now.



* There are a good number of education researchers who question the idea of "gender" being dichotomous: either "male" or "female." They see gender more as a continuous variable; I'm sympathetic to that view, but, for our purposes, we're using the state's data as is.

** The regressions we ran were interesting as far as Limited English Proficient status is concerned; the outcomes were not what you'd necessarily expect. I'll try to post about that some other time.

1 comment:

  1. Awesome report! Were you as annoyed as I am that facilities utilization had no relationship to assigned classifications? In CPS, they at least attempted to use a bogus utilization stat (that used 30 kids per classroom and failed to account for lower class sizes for Sp Ed and ELLs along with ancillary rooms). NPS appears to have not even bothered to pretend.

    Also, I am currently in the process of applying to the PhD program with Bruce Baker at Rutgers. Would it be ok to email you with questions about the program? Thanks again!

