I will protect your pensions. Nothing about your pension is going to change when I am governor. - Chris Christie, "An Open Letter to the Teachers of NJ" October, 2009

Sunday, September 23, 2018

Charter Schools Do Not Promote Diversity

Peter Greene had a useful post the other day about how to spot bad education research. One sure sign is cherry-picking: focusing on a few observations – or even just one – and then suggesting these few are representative of the whole. This tactic is a favorite among charter school cheerleaders, who will extol School X's high test scores and School Y's high special education rates – without mentioning School X's special education rates or School Y's test scores.

Here's a recent example from New Jersey:

Earlier this month, the New Jersey Charter School Association (NJCSA) filed a motion to intervene in a lawsuit: Latino Action Network v. State of New Jersey. The lawsuit contends New Jersey has some of the most segregated public schools in the nation (it does), and proposes a series of remedies. One notable feature of the lawsuit is that it is critical of charter schools:
Because charter schools are thus required to give priority in enrollment to students who reside in their respective districts, and because they tend to be located predominantly in intensely segregated urban school districts, New Jersey’s charter schools exhibit a degree of intense racial and socioeconomic segregation comparable to or even worse than that of the most intensely segregated urban public schools. Indeed, 73% of the state’s 88 charter schools have less than 10% White students and 81.5% of charter school students attend schools characterized by extreme levels of segregation, mostly because almost all the students are Black and Latino. [emphasis mine]
As you can imagine, this didn't sit well with the NJCSA:
On Thursday, September 6, the New Jersey Charter Schools Association asked a state court judge for permission to intervene into the historic school desegregation case [Latino Action Network v. State of New Jersey] on behalf of its member schools. Charter schools are part of the desegregation solution—they are not the problem. In fact, an important tool to combat school segregation is empowering parents with meaningful public school choice. While we share the values and goals of diverse, high-performing schools that serve a broad range of students, we are intervening to address baseless attacks on charter schools and ensure that our students and families have a seat at the table. [emphasis mine]
Now that is a provocative claim: NJCSA is stating not just that New Jersey charters aren't making school segregation worse, but that they are actually contributing to the desegregation of the state's schools. On what do they base this claim?

In the motion*, NJCSA references this data point to make their case:
Three of the most “diverse” schools in New Jersey are charter schools when measured by the probability that any two students selected at random will belong to the same ethnic group (Learning Community Charter School, The Ethical Community Charter School and Beloved Community Charter School). In the 2017-2018 school year, about 49,100 children in New Jersey were charter school students. A true and correct copy of NJCSA fact sheets are attached hereto as Exhibit B. [emphasis mine]
Attached to the motion is a document found here, published by NJCSA. Here's the relevant factoid:

I checked the claim and it is, indeed, factually correct. It's also a brazen example of cherry-picking.

I'll go through all the data below, but even if I didn't, it should be obvious that this is an absurdly narrow way to judge the entire New Jersey charter sector. Yes, three charter schools in Jersey City are diverse by this measure -- but what about the others? How can we assess the entire sector based on three schools from one city?

NJCSA is apparently using a measure known as Simpson's Diversity Index (SDI) to calculate school-level racial diversity. I'll leave aside whether this is the best available measure and instead note that the SDIs I calculated, using data from the NJ Department of Education, show that these three charter schools do, in fact, rank 2nd, 9th, and 10th in the state. In other words, relative to other New Jersey schools, two students selected at random from these schools are more likely to be of different races.
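For readers who want the mechanics: a school's SDI is one minus the sum of the squared enrollment shares of each racial/ethnic group – the probability that two students drawn at random (with replacement) belong to different groups. Here's a minimal sketch in Python, using made-up enrollment counts for hypothetical schools, not actual NJ data:

```python
# Simpson's Diversity Index: 1 - sum(p_i^2), where p_i is the share of
# enrollment in group i. Higher values mean a more diverse school.
def simpson_diversity(counts):
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in counts)

# Hypothetical school: 120 White, 90 Black, 60 Hispanic, 30 Asian students.
print(round(simpson_diversity([120, 90, 60, 30]), 3))

# A school where nearly all students belong to one group scores near 0.
print(round(simpson_diversity([290, 5, 3, 2]), 3))
```

Note that a school where every student belongs to a single group scores exactly 0 on this index, which is why the state's most intensely segregated schools cluster at the bottom of the SDI ranking.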

The obvious question, however, is whether these schools are typical of the entire NJ charter sector. There are several ways to approach this; I'm going to present three.

First, let's look at all NJ charters, keeping in mind that they vary in the size of their enrollments. Let's rank all NJ schools by their SDI, then divide them into 10 "bins," weighting those bins by student enrollments. How would charter schools be distributed?

34 percent of New Jersey's charter students are in the least diverse schools by rank. The bottom diversity decile has, by far, the most charter students.
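The rank-and-bin procedure above can be sketched in code. This is my reading of the method, run on hypothetical `(sdi, enrollment, is_charter)` records rather than the actual NJDOE files, and it ignores schools that straddle a decile boundary:

```python
# Sort schools from least to most diverse, cut the cumulative statewide
# enrollment into ten equal-enrollment bins along that ranking, then see
# what share of charter students lands in each bin.
def weighted_deciles(schools):
    """schools: list of (sdi, enrollment, is_charter) tuples.
    Returns ten values: the share of all charter students in each decile.
    Assumes at least one charter student; assigns each school wholly to
    the decile where its enrollment begins (a simplification)."""
    ordered = sorted(schools, key=lambda s: s[0])  # least diverse first
    total = sum(s[1] for s in ordered)
    charter_total = sum(s[1] for s in ordered if s[2])
    shares = [0.0] * 10
    cum = 0
    for sdi, enroll, is_charter in ordered:
        decile = min(int(10 * cum / total), 9)  # decile by cumulative enrollment
        if is_charter:
            shares[decile] += enroll / charter_total
        cum += enroll
    return shares

# Toy example: ten equal-sized schools, the least diverse one a charter.
demo = [(i / 10, 100, i == 0) for i in range(10)]
print(weighted_deciles(demo))  # all charter students land in the bottom decile
```

Under an even distribution, each decile would hold about 10 percent of charter students; the actual figure of 34 percent in the bottom decile is what makes the disparity visible.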

I am using rank here because NJCSA used it; however, there are (at least) two problems with this analysis. First, using rank can spread out measures that are clustered, making the distribution look more "flat" than it really is. Second, we can't see how charters compare with public district schools in diversity.

So here's a histogram that compares how charter students and public district students are distributed across schools of differing diversity:

This takes a little explaining, so hang with me. The SDI in New Jersey varies from 0 (the least diverse school) to 0.76 (the most diverse). I've again divided all the students in New Jersey into 10 bins; then I marked whether they attend charter or public district schools. The green bars represent students in public district schools; the clear bars represent charter students.

The bar on the far left represents the least diverse schools. About 1 percent of the students in public district schools are in the least diverse schools. But 11 percent of charter students are in the least diverse schools by SDI. You can clearly see similar disparities for the next two bars.

This switches around at the other end of the graph, where the most diverse schools are. A greater proportion of public district school students are in the most diverse schools; a greater proportion of charter school students are in the least diverse schools.

The graph above is admittedly tough to wrap your head around. Let's make it simple: we'll divide all students into those who attend schools that are above average in diversity, and those who attend schools that are below average in diversity. How does that play out?

On average, New Jersey's charter school students attend schools that are less diverse than those attended by public district school students.
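This simpler comparison is easy to sketch as well: compute the enrollment-weighted statewide average SDI, then ask what share of each sector's students sit above it. Again, the records below are hypothetical, not the state data:

```python
# What share of a sector's students attend schools that are above the
# enrollment-weighted statewide average in diversity?
def share_above_average(schools, charter):
    """schools: list of (sdi, enrollment, is_charter) tuples.
    charter: True for the charter sector, False for district schools."""
    total = sum(e for _, e, _ in schools)
    avg_sdi = sum(sdi * e for sdi, e, _ in schools) / total
    sector = [(sdi, e) for sdi, e, c in schools if c == charter]
    sector_total = sum(e for _, e in sector)
    return sum(e for sdi, e in sector if sdi > avg_sdi) / sector_total

# Toy example: two diverse district schools, two low-diversity charters.
demo = [(0.7, 100, False), (0.6, 100, False), (0.2, 100, True), (0.1, 100, True)]
print(share_above_average(demo, False))  # district students above average
print(share_above_average(demo, True))   # charter students above average
```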

Look, I'll be the first to say that using Simpson's Diversity Index as a measure of school diversity has its limitations. But NJCSA chose the metric – and then cherry-picked its results.

If you want a seat at the table when it comes to addressing the serious problems New Jersey has with school segregation, you should be prepared to contribute positively and meaningfully. Stuff like this doesn't help.

* I was sent the motion by one of the parties involved. I can't find a copy on the internet, though, including on the NJCSA website. If someone can direct me to a link, I'll add it.

Tuesday, September 4, 2018

An Open Letter to NJ Sen. Ruiz, re: Teacher Evaluation and Test Scores

Tuesday, September 4, 2018

The Honorable M. Teresa Ruiz
The New Jersey Senate
Trenton, NJ

Dear Senator Ruiz,

As thousands of New Jersey teachers are heading back to school this week, this is an excellent time to address your recent comments about changes Governor Murphy's administration has made to state rules regarding the use of test scores in teacher evaluations.

As you know, the Murphy administration has announced that test score growth, as measured in median Student Growth Percentiles (mSGPs), will now count for 5 percent of a relevant teacher's evaluation, down from 30 percent during the Christie administration.

Here is your complete statement on Facebook:
State Sen. President Steve Sweeney and I are deeply disappointed that the administration is walking away from New Jersey's students by reducing the PARCC assessment to count for only five percent of a teacher’s evaluation. These tests are about education, not politics. We know teacher quality is the most impactful in-school factor affecting student achievement. That is why we were clear when developing TEACHNJ and working with all education stakeholders that student growth would have a meaningful place within evaluations. Reducing the use of Student Growth Percentile to five percent essentially eliminates its impact. It abandons the mission of TEACHNJ without replacing it with a substantive alternative. In fact, a 2018 RAND study concluded that, ‘Teaching is a complex activity that should be measured with multiple methods.’ These include: student test scores, classroom observation, surveys and other methods. This is the second announcement in a series concerning the lowering of standards for our education professionals and students. We look forward to the department providing data as to why these decisions are being made and how they will benefit our children. Every child deserves a teacher who advances their academic progress and prepares them for college and career readiness. We must provide the data and resources for all our teachers to excel and ensure every student has the opportunity to realize their fullest potential. No one should see this move as a ‘Win.’ This is a victory for special interests and a huge step backward towards a better public education in New Jersey.
Senator, as both a teacher and an education researcher, I share your commitment to providing New Jersey's children with the best possible public education system. I certainly agree that teachers are important, although, as Dr. Matt DiCarlo of the Shanker Institute has noted, the claim that teacher quality is the most important in-school factor affecting student outcomes is highly problematic.

I'll leave aside a discussion of this for now, however, to focus instead on the idea that reducing the weight of SGPs in a teacher's evaluation is somehow "a huge step backwards." To the contrary: when we consider the evidence, it is clear that the way New Jersey has been using SGPs in teacher evaluations until now has been wholly inappropriate. Governor Murphy's policy, therefore, can only be described as an improvement.

Allow me to articulate why:

- SGPs are descriptive measures of student growth; they do not show how teachers, principals, schools, or many other factors influence that growth. If anyone doubts this, they need only read the words of Dr. Damian Betebenner, the creator of SGPs:
Borrowing concepts from pediatrics used to describe infant/child weight and height, this paper introduces student growth percentiles (Betebenner, 2008). These individual reference percentiles sidestep many of the thorny questions of causal attribution and instead provide descriptions of student growth that have the ability to inform discussions about assessment outcomes and their relation to education quality.(1)
- You can't hold a teacher accountable for things she can't control. Senator, in your statement, you imply that student growth should be a part of a teacher's evaluation. But a teacher's effectiveness is obviously not the only factor that contributes to student outcomes. As the American Statistical Association states: "...teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions."(2)

Simply put: a teacher's effectiveness is a part, but only a part, of a child's learning outcomes. We should not attribute all of the changes in a student's test scores from year to year solely to the teacher they had from September to May; too many other factors influence that student's "growth."

- SGPs do not fully control for differences in student characteristics. In 2013, then Education Commissioner Chris Cerf claimed that an SGP "... fully takes into account socio-economic status." (3) Repeated analyses (4), however, show he was incorrect: SGPs do, in fact, penalize teachers and schools that serve more students who qualify for free lunch, a marker of socio-economic disadvantage.

For example:

This scatterplot shows a clear and statistically significant downward trend in schoolwide math SGPs as the percentage of free lunch-eligible students grows. A school where all of the students are eligible for free lunch will have, on average, a math SGP 14 points lower than a school where no students qualify for free lunch.

I have many more examples of this bias, using recent state data, here.
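For readers unfamiliar with how a figure like that is read: the "14 points lower" claim is the slope of a least-squares line fit to the school-level points. Here is a sketch of that interpretation on fabricated data with a 14-point gap deliberately built in – this is not the state data, just an illustration of how the slope recovers the gap:

```python
import random

# Fabricated school-level data: free-lunch share from 0 to 1, and a median
# math SGP that declines 14 points across that range, plus noise.
random.seed(1)
free_lunch = [random.random() for _ in range(2000)]
sgp = [57 - 14 * x + random.gauss(0, 5) for x in free_lunch]

# Ordinary least-squares slope: covariance(x, y) / variance(x).
n = len(free_lunch)
mean_x = sum(free_lunch) / n
mean_y = sum(sgp) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(free_lunch, sgp))
         / sum((x - mean_x) ** 2 for x in free_lunch))

# The slope estimates the SGP gap between a 100% free-lunch school and a
# 0% free-lunch school; with this fabricated data it comes out near -14.
print(round(slope, 1))
```

The point of the exercise: if SGPs "fully took into account socio-economic status," that slope would be statistically indistinguishable from zero.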

- The bias in SGPs is due to a statistical property acknowledged by its inventor; there is no evidence it is due to schools or teachers serving disadvantaged children being less effective. In a paper by Betebenner and his colleagues (5), the authors acknowledge SGPs have statistical properties that cause them to be biased against students (and, therefore, their teachers) with lower initial test scores. The authors propose a solution, but acknowledge it cannot fully correct for all the biases inherent in SGPs.

Further: there has been, to my knowledge, no indication that NJDOE is aware of this bias or has taken any steps to correct it. To be blunt: New Jersey should not be forcing districts to make decisions based on SGPs when they have inherent statistical properties that make them biased – especially when there is no indication that the state has ever understood what those properties are.

- SGPs are calculated through a highly complex process; it is impossible for any layperson to understand how their SGP was determined. SGPs are derived from a quantile regression model, a complicated statistical method. As researchers at the University of Massachusetts, Amherst (6) note:
Clauser et al. (2016) surveyed over 300 principals in Massachusetts to discover how they used SGPs and to test their interpretations of SGP results. They found over 80% of the principals used SGPs for evaluating the school, over 70% used SGPs to identify students in need of remediation, and almost 60% used SGPs to identify students who achieved exceptional gains. These results suggest SGPs are being used for important purposes, even though they are full of error. The study also found that 70% of the principals misinterpreted what an average SGP referred to, and 70% incorrectly identified students for remediation based on low SGPs, when they actually performed very well on the most recent year’s test. Extrapolating from this Massachusetts study, it is likely SGPs are leading to incorrect decisions and actions in schools across the nation. (emphasis mine)
It is worth noting the authors could not find any empirical studies to support the use of SGPs in teacher evaluation.
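To give a sense of what an SGP is supposed to capture: the intuition is a conditional percentile rank – among students with the same prior-year score, where does this student's current score fall? The real calculation uses quantile regression to pool information across score histories; the toy version below uses exact-match peer groups instead, so it is an illustration of the concept, not the actual algorithm:

```python
# Toy "growth percentile": percentile rank of a student's current score
# among students with the same prior-year score. The real SGP model uses
# quantile regression rather than exact peer matching.
def toy_sgp(students, prior, current):
    """students: list of (prior_score, current_score) for the whole cohort.
    Returns the percentile (0-99) of `current` among students whose prior
    score equals `prior`."""
    peers = sorted(c for p, c in students if p == prior)
    below = sum(1 for c in peers if c < current)
    return int(100 * below / len(peers))

# Hypothetical cohort: 100 students who all scored 500 last year.
cohort = [(500, s) for s in range(400, 500)]
print(toy_sgp(cohort, 500, 450))  # this student outscored half the peer group
```

Even in this stripped-down form you can see why interpretation goes wrong: the number says nothing about *why* a student landed where they did, which is exactly the causal-attribution question Betebenner says SGPs sidestep.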

Senator, in my opinion, one of the problems with TEACHNJ is that it mandates that school districts make high-stakes personnel decisions on the basis of SGPs, which are biased, prone to error, and unvalidated as teacher evaluation tools. SGPs could, in fact, be useful for teacher evaluation if they informed decisions, rather than forced them.

Principals might use the information from SGPs to select teachers for heightened scrutiny when conducting observations. Superintendents might use school-level SGPs to check whether their district's schools vary in their growth outcomes. The state might use SGPs as a marker to determine whether a school district's effectiveness needs to be looked at more carefully.

But when the state forces a district to make a high-stakes decision by substantially weighting SGPs in a teacher's evaluation, the state is also forcing that district to ignore the many complexities inherent in using SGPs. For that reason, minimizing the weight of SGPs was, in fact, a "win" for New Jersey public schools, and for the state's students.

As always, Senator, I am happy to discuss these and any other issues regarding teacher evaluation with you at any time.


Mark Weber
New Jersey Public School Teacher
Doctoral Candidate in Education Policy, Rutgers University


1) Betebenner, D. (2009). Norm- and Criterion-Referenced Student Growth. Educational Measurement: Issues and Practice, 28(4), 42–51. https://doi.org/10.1111/j.1745-3992.2009.00161.x (emphasis is mine)

2) American Statistical Association. (2014). ASA Statement on Using Value-Added Models for Educational Assessment. Retrieved from http://www.amstat.org/asa/files/pdfs/POL-ASAVAM-Statement.pdf

3) https://www.wnyc.org/story/276664-everything-you-need-know-about-students-baked-their-test-scores-new-jersy-education-officials-say/

4) See:

- Baker, B.D. & Oluwole, J (2013) Deconstructing Disinformation on Student Growth Percentiles & Teacher Evaluation in New Jersey. Retrieved from:

- Baker, B.D. (2014) An Update on New Jersey’s SGPs: Year 2 – Still not valid! Retrieved from: https://schoolfinance101.wordpress.com/2014/01/31/an-update-on-new-jerseys-sgps-year-2-still-not-valid/

- Weber, M.A. (2018) SGPs: Still Biased, Still Inappropriate To Use For Teacher Evaluation. Retrieved from: http://jerseyjazzman.blogspot.com/2018/07/sgps-still-biased-still-inappropriate.html

5) Shang, Y., VanIwaarden, A., & Betebenner, D. W. (2015). Covariate Measurement Error Correction for Student Growth Percentiles Using the SIMEX Method. Educational Measurement: Issues and Practice, 34(1), 4–14. https://doi.org/10.1111/emip.12058

6) Sireci, S. G., Wells, C. S., & Keller, L. A. (2016). Why We Should Abandon Student Growth Percentiles (Research Brief No. 16–1). Center for Educational Assessment, University of Massachusetts Amherst. Retrieved from https://www.umass.edu/remp/pdf/CEAResearchBrief-16-1_WhyWeShouldAbandonSGPs.pdf