I will protect your pensions. Nothing about your pension is going to change when I am governor. - Chris Christie, "An Open Letter to the Teachers of NJ" October, 2009

Sunday, July 22, 2018

SGPs: Still Biased, Still Inappropriate To Use For Teacher Evaluation

Let's suppose you and I get jobs digging holes. Let's suppose we get to keep our jobs based on how well we dig relative to each other.

It should be simple to determine who digs more: all our boss has to do is measure how far down our holes go. It turns out I'm much better at digging than you are: my holes are deeper, I dig more of them, and you can't keep up.

The boss, after threatening you with dismissal, sends you over to my job site so you can get professional development on hole digging. That's where you learn that, while you've been using a hand shovel, I've been using a 10-ton backhoe. And while you've been digging into bedrock, I've been digging into soft clay.

You go back to the boss and complain about two things: first, it's wrong for you and me to be compared when the circumstances of our jobs are so different. Second, why is the boss wasting your time having me train you when there's nothing I can teach you about how to do your job?

The boss has an answer: he is using a statistical method that "fully takes into account" the differences in our jobs. He claims there's no bias against you because you're using a shovel and digging into rock. But you point out that your fellow shovelers consistently get lower ratings than the workers like me manning backhoes.

The boss argues back that this just proves the shovelers are worse workers than the backhoe operators. Which is why you need to "learn" from me, because "all workers can dig holes."

Everyone see where I'm going with this?

* * *

There is a debate right now in New Jersey about how Student Growth Percentiles (SGPs) are going to be used in teacher evaluations. I've written about SGPs here many times (one of my latest is here), so I'll keep this recap brief:

An SGP is a way to measure the "growth" in a student's test scores from one year to the next, relative to similar students. While the actual calculation of an SGP is complicated, here's the basic idea behind it:

A student's prior test scores will predict their future performance: if a student got low test scores in Grades 3 through 5, he will probably get a low score in Grade 6. If we gather together all the students with similar test score histories and compare their scores on the latest test, we'll see that those scores vary: a few will do significantly better than the group, a few will do worse, and most will be clustered together in the middle.

We can rank and order these students' scores and assign them a place within the distribution; this is, essentially, an SGP. But we can go a step further: we can compare the SGPs from one group of students with the SGPs from another. In other words: a student with an SGP of 50 (SGPs go from 1 to 99) might be in the middle of a group of previously high-scoring students, or she might be in the middle of a group of previously low-scoring students. Simply looking at her SGP will not tell us which group she was placed into.

To make an analogy to my little story above: you and I might each have an SGP of 50. But there's no way to tell, solely based on that, whether we are digging into clay or bedrock. And there's no way to tell from a student's SGP whether they score high, low, or in the middle on standardized tests.
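Here's a toy sketch of that idea in Python. To be clear about the assumptions: real SGPs are estimated with quantile regression over a student's score history, not by the crude banding below, and the function name `toy_sgps`, the 10-point bands, and the six students are all made up for illustration.

```python
from collections import defaultdict

def toy_sgps(students, band_width=10):
    """Toy stand-in for an SGP: percentile rank of the current score
    among students whose prior score falls in the same band.
    students: list of (name, prior_score, current_score)."""
    groups = defaultdict(list)
    for name, prior, current in students:
        # Hypothetical grouping rule: same 10-point band = "similar history".
        groups[prior // band_width].append((name, current))

    sgps = {}
    for peers in groups.values():
        scores = [score for _, score in peers]
        for name, current in peers:
            below = sum(1 for s in scores if s < current)
            ties = sum(1 for s in scores if s == current)
            # Mid-rank percentile within the peer group, clamped to 1..99.
            pct = 100 * (below + 0.5 * ties) / len(scores)
            sgps[name] = min(99, max(1, round(pct)))
    return sgps

# Made-up students, not real data.
students = [
    ("A", 91, 95), ("B", 93, 90), ("C", 95, 85),  # previously high-scoring
    ("D", 41, 55), ("E", 43, 50), ("F", 45, 45),  # previously low-scoring
]
sgps = toy_sgps(students)
print(sgps)
```

In this made-up example, students B and E both land at an SGP of 50, even though B scored 90 on the latest test and E scored 50. The SGP alone can't tell you which group either student came from.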

And this is where we run into some very serious problems:

The father of SGPs is Damian Betebenner, a widely respected psychometrician. Betebenner has written several papers on SGPs; they are highly technical and well beyond the understanding of practicing teachers or education policymakers (not being a psychometrician, I'll admit I have had to work hard to gain an understanding of the issues involved).

Let's start by acknowledging, as Bruce Baker pointed out years ago, that Betebenner himself believes that SGPs do not measure a teacher's contribution to a student's test score growth. SGPs, according to Betebenner, are descriptive; they do not provide the information needed to say why a student's scores are lower or higher than predicted:
Borrowing concepts from pediatrics used to describe infant/child weight and height, this paper introduces student growth percentiles (Betebenner, 2008). These individual reference percentiles sidestep many of the thorny questions of causal attribution and instead provide descriptions of student growth that have the ability to inform discussions about assessment outcomes and their relation to education quality. A purpose in doing so is to provide an alternative to punitive accountability systems geared toward assigning blame for success/failure (i.e., establishing the cause) toward descriptive (Linn, 2008) or regulatory (Edley, 2006) approaches to accountability. (Betebenner, 2009) [emphasis mine]
This statement alone is reason enough why New Jersey should not compel employment decisions on the basis of SGPs: You can't fire a teacher for cause on the basis of a measure its inventor says does not show cause.

It's also important to note that SGPs are relative measures. "Growth" as measured by an SGP is not an absolute measure; it's measured in relationship to other, similar students. All students could be "growing," but an SGP, by definition, will always show some students growing below average.

But let's put all this aside and dig a little deeper into a particular matter:

One of the issues Betebenner admits is a problem with using SGPs in teacher evaluation is a highly technical issue known as measurement endogeneity; he outlines this problem in a paper he coauthored in 2015 (2) -- well after New Jersey adopted SGPs as its official "growth" measure.

The problem occurs because test scores are error-prone measures. This is just another way of saying something we all know: test scores change based on things other than what we want to measure.

If a kid gets a lower test score than he is capable of because he didn't have a good night's sleep, or because he's hungry, or because the room is too cold, or because he gets nervous when he's tested, or because some of the test items were written using jargon he doesn't understand, his score is not going to be an accurate representation of his actual ability.

It's a well-known statistical precept that variables measured with error tend to bias positive estimates in a regression model downward, thanks to something called attenuation bias. (3) Plain English translation: Because test scores are prone to error, the SGPs of higher-scoring students tend to be higher than they should be, and the SGPs of lower-scoring students tend to be lower than they should be.

Again: I'm not saying this; Betebenner -- the guy who invented SGPs -- and his coauthors are:
It follows that the SGPs derived from linear QR will also be biased, and the bias is positively correlated with students’ prior achievement, which raises serious fairness concerns.... 
The positive correlation between SGP error and latent prior score means that students with higher X [prior score] tend to have an overestimated SGP, while those with lower X [prior score] tend to have an underestimated SGP. (Shang et al., 2015)
Again, this means we've got a problem at the student level with SGPs: they tend to be larger than they should be for high-scoring students, and lower than they should be for low-scoring students. Let me also point out that Betebenner and his colleagues are the ones who, unprompted, bring up the issue of "fairness."
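If you want to see the mechanism for yourself, here's a small simulation. It's a sketch under loud assumptions: I'm using ordinary least squares residuals as a stand-in for SGPs (the paper works with quantile regression), and every number below (the sample size, the error sizes, the random seed) is invented. Every simulated student has identical true growth; the only thing that varies is noise.

```python
import random
import statistics

def ols(x, y):
    """Slope and intercept of a simple least-squares line of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson(x, y):
    """Pearson correlation coefficient."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical setup: true prior achievement, plus noisy test scores.
true_prior = [rng.gauss(0, 1) for _ in range(n)]
observed_prior = [t + rng.gauss(0, 0.5) for t in true_prior]  # error-prone prior score
current = [t + rng.gauss(0, 0.5) for t in true_prior]         # same true growth for all

# Predict the current score from the error-prone prior score.
slope, intercept = ols(observed_prior, current)

# "Growth" relative to prediction: the stand-in for an SGP here.
residuals = [y - (intercept + slope * x) for x, y in zip(observed_prior, current)]

# If the measure were fair, this would hover near zero.
bias_corr = pearson(true_prior, residuals)

print(f"estimated slope: {slope:.2f} (it would be 1.0 with no error in the prior score)")
print(f"correlation of 'growth' with true prior achievement: {bias_corr:.2f}")
```

With these settings the slope is pulled down toward 0.8 instead of 1.0, and the "growth" residuals come out positively correlated with true prior achievement, even though no simulated student actually grew more than any other.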

Let's show how this plays out with New Jersey data. I don't have student-level SGPs, but I do have school-level ones, which should be fine for our purposes. If SGPs are biased, we would expect to see high-scoring schools show higher "growth," and low-scoring schools show lower "growth." Is that the case?

New Jersey school-level SGPs are biased exactly the way their inventor predicts they would be -- "which raises serious fairness concerns."

I can't overemphasize how important this is. New Jersey's "growth" measures are biased against lower-scoring students, not because their "growth" is low, but likely because of inherent statistical properties of SGPs that make them biased. Which means they are almost certainly going to be biased against the teachers and schools that enroll lower-scoring students.

Shang et al. propose a way to deal with some of this bias; it's highly complex and there are tradeoffs. But we don't know if this method has been applied to New Jersey SGPs in this or any other year (I've looked around the NJDOE website for any indication of this, but have come up empty).

In addition: according to Betebenner himself, there's another problem when we look at the SGPs for a group of students in a classroom and attribute it to a teacher.

You see, New Jersey and other states have proposed using SGPs as a way to evaluate teachers. In its latest federal waiver application, New Jersey stated it would use median SGPs (mSGPs) as a way to assess teacher effectiveness. This means the state looks at all the SGPs in a classroom, picks the SGP of the student who is right in the middle of the distribution, and attributes it to the teacher.
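As a minimal sketch of what "picking the middle" means, here's the median taken over two hypothetical classrooms. The SGP values are invented, and this is not the state's actual procedure, which may include rules (minimum class sizes, attribution across teachers) that I'm not modeling.

```python
import statistics

# Invented student SGPs for two hypothetical classrooms.
classroom_1 = [12, 35, 41, 58, 77]
classroom_2 = [1, 2, 41, 98, 99]

# The median SGP (mSGP) is the middle value once the SGPs are sorted.
msgp_1 = statistics.median(classroom_1)
msgp_2 = statistics.median(classroom_2)

print(msgp_1, msgp_2)
```

Both classrooms get the same mSGP of 41, even though the spread of student results in each is very different. The median only reports the student in the middle.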

The problem is that students and teachers are NOT randomly assigned to classrooms or schools. So a teacher might pay a price for teaching students with a history of getting lower test scores. Betebenner et al. freely admit that their proposed correction -- and again, we don't even know if it's currently being implemented -- can't entirely get rid of this bias.

As we all know, there is a clear correlation between test scores and student economic status. Which brings us to our ultimate problem with SGPs: Are teachers who teach more students in poverty unfairly penalized when SGPs are used to evaluate educator effectiveness?

I don't have the individual teacher data to answer this question. I do, however, have school-level data, which is more than adequate to at least address the question initially. What we want to know is whether SGPs are correlated with student characteristics. If they are, there is plenty of reason to believe these measures are biased and, therefore, unfair.

So let's look at last year's school-level SGPs and see how they compare to the percentage of free lunch-eligible students in the school, a proxy measure for student economic disadvantage. The technique I'm using, by the way, follows Bruce Baker's work year after year, so it's not like anything I show below is going to be a surprise.

SGPs in math are on the vertical or y-axis; percentage free lunch (FL%) is on the horizontal or x-axis. There is obviously a lot of variation, but the general trend is that as FL% rises, SGPs drop. On average, a school that has no free lunch students will have a math SGP almost 14 points higher than a school where all students qualify for free lunch. The correlation is highly statistically significant as shown in the p-value for the regression estimate.
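For anyone who wants to see how a regression slope turns into a "points higher" statement, here's a sketch with six made-up schools. These are not New Jersey's numbers and will not reproduce the estimate above; the point is only the arithmetic: the slope is the predicted change in SGP per percentage point of FL%, so multiplying by 100 gives the predicted gap between a 0% school and a 100% school.

```python
import statistics

# Invented school-level values for illustration only.
free_lunch_pct = [0, 20, 40, 60, 80, 100]
math_sgp = [56, 55, 50, 49, 45, 45]

mean_x = statistics.fmean(free_lunch_pct)
mean_y = statistics.fmean(math_sgp)

# Ordinary least-squares slope: covariance of x and y over variance of x.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(free_lunch_pct, math_sgp))
sxx = sum((x - mean_x) ** 2 for x in free_lunch_pct)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Predicted SGP difference between a 100% FL school and a 0% FL school.
predicted_gap = slope * 100

print(f"slope: {slope:.3f} SGP points per percentage point of FL")
print(f"predicted gap, 100% FL vs. 0% FL: {predicted_gap:.1f} points")
```

In this toy data the line drops a bit over 12 SGP points from one end of the FL% axis to the other.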

Again: we know that, because of measurement error, SGPs are biased against low-scoring students/schools. We know that students in schools with higher levels of economic disadvantage tend to have lower scores. We don't know if any attempt has been made to correct for this bias in New Jersey's SGPs.

But we do know that even if that correction was made, the inventor of SGPs says: "We notice the fact that covariate ME correction, specifically in the form of SIMEX, can usually mitigate, but will almost never eliminate aggregate endogeneity entirely." (Shang et al., p.7)

There is more than enough evidence to suggest that SGPs are biased and, therefore, unfair to teachers who educate students who are disadvantaged. Below, I've got some more graphs that show biases based on English language arts (ELA) SGPs, and correlations with other student population characteristics.

I don't see how anyone who cares about education in New Jersey -- or any other state using SGPs -- can allow this state of affairs to continue. Despite the assurances of previous NJDOE officials, there is more than enough reason for all stakeholders to doubt the validity of SGPs as measures of teacher effectiveness.

The best thing the Murphy administration and the Legislature could do right now is to tightly cap the weighting of SGPs in teacher evaluations. This issue must be studied further; we can't force school districts to make personnel decisions on the basis of measures that raise "...serious fairness concerns..."

Minimizing the use of SGPs is the only appropriate action the state can take at this time. I can only hope the Legislature, the State BOE, and the Murphy administration listen.

Years ago, a snarky teacher-blogger warned New Jersey that test-based teacher evaluation was a disaster waiting to happen.


Here's the correlation between ELA-SGPs and FL%. A school with all FL students will, on average, see a drop of more than 9 points on its SGP compared to a school with no FL students.

Here are correlations between SGPs and the percentage of Limited English Proficient (LEP) students. I took out a handful of influential outliers that were likely the result of data error. The ELA SGP bias is not statistically significant; the math SGP bias is.

There are also positive correlations between SGPs and the percentage of white students.

Here are correlations between students with disabilities (SWD) percentage and SGPs. Neither is statistically significant at the traditional level.

Finally, here are the correlations between some grade-level SGPs and grade-level test scores. I already showed Grade 5 math above; here's Grade 5 ELA.

And correlations for Grade 7.


1) Betebenner, D. (2009). Norm- and Criterion-Referenced Student Growth. Educational Measurement: Issues and Practice, 28(4), 42–51. https://doi.org/10.1111/j.1745-3992.2009.00161.x

2) Shang, Y., VanIwaarden, A., & Betebenner, D. W. (2015). Covariate Measurement Error Correction for Student Growth Percentiles Using the SIMEX Method. Educational Measurement: Issues and Practice, 34(1), 4–14. https://doi.org/10.1111/emip.12058

3) Wooldridge, J. (2010). Econometric Analysis of Cross Section and Panel Data (Second Edition). Cambridge, MA: The MIT Press. p. 81.

Monday, July 16, 2018

The PARCC, Phil Murphy, and Some Common Sense

Miss me?

I'll tell you what I've been up to soon, I promise. I'm actually still in the middle of it... but I've been reading and hearing a lot of stuff about education policy lately, and I've decided I can't just sit back -- even if my time is really at a premium these days -- and let some of it pass.

For example:
Gov. Phil Murphy just announced that he will start phasing out the PARCC test, our state's most powerful diagnostic tool for student achievement.

Like an MRI scan, it can detect hidden problems, pinpointing a child's weaknesses, and identifying where a particular teacher's strategy isn't working. This made it both invaluable, and a political lightning rod.
That's from our old friends at the Star-Ledger op-ed page. And, of course, the NY Post never misses a chance to take down both a Democrat and the teachers unions:
New Jersey Gov. Phil Murphy is already making good on his promises to the teachers unions. Too bad it’s at the kids’ expense.
Officially, he wants the state to transition to a new testing system — one that’s less “high stakes and high stress.” It’s a safe bet that the future won’t hold anything like the PARCC exams, which are written by a multi-state consortium. Instead, they’ll be Jersey-only tests — far easier to water down into meaninglessness.

The sickest thing about this: A couple of years down the line, Murphy will be boasting about improved high-school graduation rates — without mentioning the fact that his “reforms” have made many of those diplomas worthless.
First of all -- and as I have pointed out in great detail -- it's the Chris Christie-appointed former superintendents of Camden and Newark, two districts under state control, who have done the most bragging about improved graduation rates. These "improvements" have taken place under PARCC; however, it's likely they are being driven by things like credit recovery programs, which have nothing to do with high school testing.

The Post wants us to believe that the worth of a high school diploma is somehow enhanced by implementing high school testing above and beyond what is required by federal law. But there's no evidence that's true.

In 2016-17, only 12 states required students to pass a test to graduate; the only other state requiring passing the PARCC is New Mexico. Further, as Stan Karp at ELC has pointed out, the PARCC passing rate on the Grade 10 English Language Test in 2017 was 46%; the passing rate on the Algebra I exam was 42%. That's three years after the test was first introduced into New Jersey.

Does the Post really want to withhold diplomas from more than half of New Jersey's students?

The PARCC was never designed to be a graduation exit exam. The proficiency rates -- which I'll talk about more below -- were explicitly set up to measure college readiness. It's no surprise that around 40 percent of students cleared the proficiency bar for the PARCC, and around 40 percent of adults in New Jersey have a bachelor's degree.

I don't know when we decided everyone should go to a four-year college. If we really believe that, we'll have a lot of over-educated people doing necessary work, and we'll have to more than double the number of college seats available. Anyone think that's a good idea? NY Post, should New Jersey jack up taxes by an insane amount to open up its state colleges to more than twice as many students as they have now?

Let's move on to the S-L's editorial. The idea that the PARCC is somehow the "most powerful diagnostic tool" for identifying an individual child's weaknesses, and therefore the flaws in an individual teacher's practice, is simply wrong. The most obvious reason why the PARCC is not used for diagnosing individual students' learning progress is that by the time the school gets the score back, the student has already moved on to the next grade and another teacher.

There are, in fact, many other assessment tools available to teachers -- including plenty of tests that are not designed by the student's teacher -- that can give actionable feedback on a student's learning progress. This is the day-to-day business of teaching, taught to those of us in the field at the very beginning of our training: set objectives, instruct, assess, adjust objectives and/or instruction, assess, etc.

The PARCC, like any statewide test, might have some information useful to school staff as a child moves from grade-to-grade. But the notion that it is "invaluable" for its MRI-like qualities is just not accurate. How do I know?

Because the very officials at NJDOE during the Christie administration who pushed the PARCC so hard admitted it was not designed to inform instruction:

ERLICHSON: In terms of testing the full breadth and depth of the standards in every grade level, yes, these are going to be tests that in fact are reliable and valid at multiple cluster scores, which is not true today in our NJASK. But there’s absolutely a… the word "diagnostic" here is also very important. As Jean sort of spoke to earlier: these are not intended to be the kind of through-course — what we’re talking about here, the PARCC end-of-year/end-of-course assessments — are not intended to be sort of the through-course diagnostic form of assessments, the benchmark assessments, that most of us are used to, that would diagnose and be able to inform instruction in the middle of the year.
These are in fact summative test scores that have a different purpose than the one that we’re talking about here in terms of diagnosis.
That purpose is accountability. That's something I, and every other professional educator I know, am all for -- provided the tests are used correctly.

As I've written before, I am generally agnostic about the PARCC. From what I saw, the NJASK didn't seem to be a particularly great test... but I'll be the first to admit I am not a test designer, nor a content specialist in math or English language arts.

The sample questions I've seen from the PARCC look to me to be subject to something called construct-irrelevant variance, a fancy way of saying test scores can vary based on stuff you're trying not to measure. If a kid can't answer a math question because the question uses vocabulary the kid doesn't know, that question isn't a good assessor of the kid's mathematical ability; the scores on that item are going to vary based on something other than the things we really want to measure.

As I said, I'm not the best authority on the alleged merits of the PARCC over the NJASK (ask folks like this guy instead, who really knows what he's talking about when it comes to teaching kids how to read). I only wish the writers at the Star-Ledger had a similar understanding of their own limitations:
If this were truly for the sake of over-tested students, we wouldn't be starting with the PARCC. Unlike its predecessors, this test can tell educators exactly where kids struggle and how to better tailor their lessons. It's crucial for helping to close the achievement gap between black and white students; not just between cities and suburbs, but within racially mixed districts.
Again: the PARCC is a lousy tool for informing instruction, because that's not its job. The PARCC is an accountability measure -- and as such, there is very little reason to believe it is markedly better at identifying schools or teachers in need of remediation than any other standardized test.

Think about it this way: if the PARCC was really that much better than the NJASK, we'd expect the two tests to yield very different results. A school that was "lying" to its parents about its scores on the NJASK would instead show how it was struggling on the PARCC. There would be little correlation between the two tests if one was so much better than the other, right?

Guess what?

These are the Grade 7 English Language Arts (ELA) test scores on the 2014 NJASK and the 2015 PARCC; 2015 was the year the PARCC was first used in New Jersey. Each dot is a school around the state. Look at the strong relationship: if a school had a low score on the NJASK in 2014, it had a low score on the PARCC in 2015. Similarly, if it was high in 2014 on the NJASK, it was high on the 2015 PARCC. 80 percent of the variation on the PARCC can be explained by the prior year's score on the NJASK; that is a very strong relationship.

I'll put some more of these below, but let me point out one more thing: the students who took the Grade 7 NJASK in 2014 were not the same students who took the Grade 7 PARCC in 2015, because most students moved up a grade. How did the test scores of the same cohort compare when they moved from Grade 7, when they took the NJASK, to Grade 8, when they took the PARCC?

Still an extremely strong relationship.

No one who knows anything about testing is going to be surprised by this. Standardized tests, by design, yield normal, bell-curve distributions of scores: a few kids score low, a few score high, and most score in the middle. There's just no evidence to think the NJASK was "lying" back then any more than the PARCC "lies" now.

And let me anticipate the argument about "proficiency":

Again, I've been over this more than a few times: "proficiency" rates are largely arbitrary. When you have a normal distribution of scores, you can set the rate pretty much wherever you want, depending on how you define "proficient." I know that makes some of you crazy, but it's true: there is no absolute definition of "proficient," any more than there's an absolute definition of "smart."
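Here's a quick sketch of why the rate depends on where you put the bar. The scale is hypothetical (a normal distribution with a mean of 100 and a standard deviation of 15; these are not PARCC or NJASK scale scores), and the three cut scores are ones I picked for illustration.

```python
from statistics import NormalDist

# Hypothetical score scale: normal, mean 100, standard deviation 15.
scores = NormalDist(mu=100, sigma=15)

def proficiency_rate(cut_score):
    """Share of students scoring at or above the cut score."""
    return 1 - scores.cdf(cut_score)

# Same distribution of scores, three different definitions of "proficient".
rates = {cut: proficiency_rate(cut) for cut in (85, 100, 115)}

for cut, rate in rates.items():
    print(f"cut score {cut}: {rate:.0%} proficient")
```

Nothing about the students changes between the three lines of output. Put the cut one standard deviation below the mean and about 84 percent are "proficient"; put it at the mean and it's 50 percent; one standard deviation above and it's about 16 percent.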

So, no, the NJASK wasn't "lying" about NJ students' proficiency; the state could have used the same distribution of scores from the older test* and set a different proficiency level. And no, the PARCC is not in any way important as a diagnostic tool, nor is there any evidence it is a much "better" test than the old NJASK.

Look, I know this bothers some of you, but I am for accountability testing. The S-L is correct in noting that these tests have played an important role in pointing out inequities within the education system. I am part of a team that works on these issues, and we've relied on standardized tests to show that there are serious problems with our nation's current school funding system.

But if that's the true purpose of these tests -- and it's clear that it is -- then we don't need to spend as much time or money on testing as we do now. If we choose to use test outcomes appropriately, we can cut back on testing and remove some of the corrupting pressures they can impose on the system.

ADDING: This is not the first time I've written about the PARCC fetishism.

ADDING MORE: Does it strike any of you as odd that both the NY Post and the Star-Ledger came out with similar editorials beating up Governor Murphy and the teachers unions over his new PARCC policy -- on the very same day?

As I've documented here: when it comes to education (and many other topics), editorial writers often rely on the professional "reformers" in their Rolodexes to feed them ideas. If there is a structural advantage these "reformers" have over folks like me, it's that they get paid to make the time to influence op-ed writers and other policy influencers. They are subsidized, usually by very wealthy interests, to cultivate relationships with the media, which in turn bends the media toward their point of view.

One would hope editorial boards could see past this state of affairs. Alas...

ADDING MORE: From the NJDOE website:
a) What if my child is doing well in the classroom and on his or her report card, but it is not reflected in the test score?
  • PARCC is only one of several measures that illustrate a child’s progress in math and ELA. Report card grades can include multiple sources of information like participation, work habits, group projects, homework, etc., that are not reflected in the PARCC score, so there may be a discrepancy.
Report cards can also reflect outcomes on tests made by teachers, districts, or other vendors, administered multiple times. The PARCC, like any test, is subject to noise and bias. It is quite possible a report card grade is the better measure of an individual student's learning than a PARCC score.

If there is a disconnect between the PARCC and a report card, OK, parents and teachers and administrators should look into that. But I take the above statement from NJDOE as an acknowledgment that the PARCC, or any other test, is a sample of learning at a particular time, and its outcomes are subject to error and bias like any other assessment.

Again: by all means, let's have accountability testing. But PARCC fetishism in the service of teachers union bashing is totally unwarranted. Stop the madness.

SCATTERPLOT FUN! Here are some other correlations between NJASK and PARCC scores at the school level. You'll see the same pattern in all grades and both exams (ELA and math) with the exception of Grade 8 math. Why? Because the PARCC introduced the Algebra 1 exam; Grade 8 students who take algebra take that exam, while those who don't take algebra take the Grade 8 Math exam.

The Algebra 1 results are some of the most interesting ones available, for a whole variety of reasons. I'll get into that in a bit...

* OK, I need to make this clear: there was an issue with the NJASK having a bit of a ceiling effect. I've always found it kind of funny when people got overly worried about this: like the worst thing for the state was that so many kids were finding the old test so easy, too many were getting perfect scores!

Whether the PARCC broke through the ceiling with construct-relevant variance is an open question. My guess is a lot of the "higher-level" items are really measuring something aside from mathematical ability. In any case, the NJASK wasn't "lying" just because more kids aced it than the PARCC.

Tuesday, May 1, 2018

What Do We Teach In America's Schools? "Hey, Honey, Sit Down and Shut Up!"

America, it's time to play Spot The Pattern!™

First, Chicago (all emphases mine):
Earlier this month, we posted a story about discipline practices inside Noble Network of Charter Schools, which educates approximately one out of 10 high school students in Chicago. One former teacher quoted in the piece described some of the schools’ policies as “dehumanizing.” 
Through the teacher, several students also agreed to communicate by text message. 
One described an issue raised by others at some Noble campuses, regarding girls not having time to use the bathroom when they get their menstrual periods. 
“We have (bathroom) escorts, and they rarely come so we end up walking out (of class) and that gets us in trouble,” she texted. “But who wants to walk around knowing there’s blood on them? It can still stain the seats. They just need to be more understanding.” 
At certain campuses, teachers said administrators offer an accommodation: They allow girls to tie a Noble sweater around their waist, to hide the blood stains. The administrator then sends an email to staff announcing the name of the girl who has permission to wear her sweater tied around her waist, so that she doesn’t receive demerits for violating dress code. 
Last year, two teachers at Noble’s Pritzker College Prep helped female students persuade administrators to change the dress code from khaki bottoms to black dress pants. Although their initiative was based in part on a survey showing that 58 percent of Pritzker students lack in-home laundry facilities, it remains a pilot program available only at the Pritzker campus.
Next, New York City:
A veteran city educator who said officials botched her sexual harassment case is calling out Mayor de Blasio for shaming victims — and omitting dozens of sexual harassment complaints from recently published city statistics.

The educator, who asked to remain anonymous because she fears retaliation, said she was sickened to hear de Blasio say this week that the Education Department substantiated less than 2% of complaints because of a "hyper-complaint dynamic" in the city agency.

"I'm certainly offended that Mayor de Blasio would say that," said the educator, who sued the city over her harassment by a supervisor and won a settlement.

"With a wife and daughter of his own, I was in shock," she added.

She called the city Education Department's investigation into her claims "a long, complicated, ugly process," that ultimately failed to bring her justice.

"No one would go through this if it were not true," she said. "It is a horrific experience. It upends your entire life."

City officials are scrambling to contain a growing sex harassment scandal in the city schools.

A tally of sex harassment complaints published by the city Friday omitted 119 Education Department complaints erased from the record because officials deemed them "non-jurisdictional."  
Figures published by the de Blasio administration on April 20 showed 471 cases of sexual harassment complaints in city schools from 2013 to 2017. But internal records kept by Education Department officials showed 590 complaints during the same period — a figure 25% higher than the number reported by de Blasio. 
Observers said it looks like the Education Department is trying to hide the facts about sex harassment cases. 
"That's exactly what's happening here," said New York City Parents Union President Mona Davids. "They covered things up and they squashed the complaints."
NYC teacher Arthur Goldstein has more on this.

Let's go to Washington:
At a roundtable with the nation’s top educators on Monday afternoon, at least one teacher told Education Secretary Betsy DeVos that her favored policies are having a negative effect on public schools, HuffPost has learned. HuffPost has also obtained video of DeVos expressing disapproval of the teachers strikes currently roiling Arizona.

DeVos met privately with more than 50 teachers who had been named 2018 teachers of the year in their states. As part of the discussion, teachers were asked to describe some of the obstacles they face at their jobs and were given the opportunity to ask the education secretary questions. 
DeVos also expressed opposition to teachers going on strike for more education funding, per video of the meeting obtained by HuffPost. DeVos made her comments after Josh Meibos, Arizona’s teacher of the year, asked her about when striking teachers will be listened to. In response, DeVos told Meibos that she “cannot comment specifically to the Arizona situation,” but that she hopes “adults would take their disagreements and solve them not at the expense of kids and their opportunity to go to school and learn.”

“I’m very hopeful there will be a prompt resolution there,” DeVos can be heard saying in the video. “I hope that we can collectively stay focused on doing what’s right for individual students and supporting parents in that decision-making process as well. And there are many parents that want to have a say in how and where their kids pursue their education, too.”

She continued, “I just hope we’re going to be able to take a step back and look at what’s ultimately right for the kids in the long term.”
When reading this, keep in mind that about three-quarters of America's teachers are women. So when DeVos tells teachers they shouldn't protest against receiving low wages, she's very much telling women to stop complaining that their pay is low compared to other professions for college-educated workers -- professions more likely to employ men.

It's also worth noting that DeVos is sticking to a set of talking points about the teachers strikes that she paid for.

Back to Washington:
We all know that black girls are disciplined more harshly for the same infractions as their white peers in schools (and life), but a new study shows that part of this disparity is linked to school-uniform policies.
The National Women’s Law Center recently looked at school dress codes in Washington, D.C., and found that black girls are unnecessarily and predominantly penalized under uniform rules.  
In fact, because humans in their unconscious and implicit biases are the ones who enforce rules around dress codes, it goes without saying that sexism, racism and traditional gender roles play a part.
According to the study, black girls were found to often be in violation of dress codes for so-called infractions like being “unladylike,” “inappropriate” or “distracting to the boys around them.”
Of course, no one should expect DeVos's Department of Education to investigate racial bias in school discipline anytime soon: her crew is too busy suppressing investigations. But while the intersection of sexism and racism makes these dress codes especially pernicious for girls of color, girls of all races are regularly made to feel ashamed of their bodies while in school.

Like in Florida:
Lizzy Martinez, 17, a junior at Braden River High School in Bradenton, Fla., had been swimming and tanning all weekend at a water park in Orlando. But when Monday morning came and she had to get dressed for school, Lizzy’s bra felt painfully constricting on her burned skin. 
So she ditched the bra and purposely chose to wear something dark and loose — a long sleeve, oversize, crew neck gray T-shirt — so she wouldn’t draw attention to her chest.
But around 10 a.m., about 15 minutes into her veterinary assistance class, Lizzy was called out of the classroom for a meeting with two school officials, Dean Violeta Velazquez and Principal Sharon Scarbrough. They asked her why she wasn’t wearing a bra.
She said she told her school administrators about the sunburn. They insisted that she was violating the school dress code. (The 2017-2018 Code of Student Conduct does not say bras must be worn by female students.) They told her to put on an undershirt because boys were “looking and laughing” at her, a detail she later challenged. “No one said a thing to me until I got to the dean’s office,” Lizzy said. 
She was crying and wanted to go home, so Lizzy’s mother, Kari Knop, a registered nurse, was called at work. “I said, ‘Lizzy, I’m working,’” Ms. Knop said in a phone interview. “I told her, ‘Can you just put the undershirt on and call it a day?’” 
Lizzy was embarrassed and angry but she relented. When she returned wearing the undershirt, the school principal had left. The dean, according to Lizzy, instructed her to “stand up and move around for her.” 
“I looked at her and said, ‘What do you mean?’” Lizzy said. “I was a little creeped out by that.” The school has a strict disciplinary policy and she didn’t want to appear defiant. (School officials refused to comment, except in a statement.) 
The dean told her that her nipples were still showing through her T-shirt and she should use bandages to cover them up. “She told me, ‘I’m thinking of ways I could fix this for you.’ She said, ‘I was a heavier girl and I have all the tricks up my sleeve,’” Lizzy said.  
Lizzy was given four adhesive bandages from the school clinic. “They had me ‘X’ out my nipples,” she said.
Even if you have a conservative point of view on what is and isn't appropriate for students to wear at school... you can't tell me this story isn't creepy. But this is how we tell girls to think about their bodies now.

Another story from Michigan*:
With prom season in full swing, many teens attending schools with harsh dress codes are taking to social media to call them out. This week, one school in Michigan has decided to take their policies a step further with items that they’re calling “modesty ponchos,” and the students are not having it. 
Prom night at Divine Child High School in Dearborn, Michigan is set for May 12, and the school has already announced that they would be handing a colorful poncho-like piece of fabric to all of the girls who show up wearing something that the school deems too revealing, reports Fox 2 Detroit. A student told the news source that “teachers will determine whether what they’re wearing is compliant or not when they walk in the door.” She added, “I do believe the school has gone too far with this. As we walk into prom, we are to shake hands with all the teachers and if you walk through and a teacher deems your dress is inappropriate you will be given a poncho at the door.”
To be clear: I am not against schools setting some reasonable restrictions on student dress. No student, for example, should be allowed to wear clothing that has wording intended to denigrate others. Reasonable people can disagree about where the lines are. But there is, to my eye, a distinct odor of slut-shaming in many of these policies -- which goes a long way toward explaining the racist skew in how they're implemented.

So, what have we got going on in America's schools these days?

  • Girls can't use the bathroom when they have their periods.
  • Women teachers who file charges of sexual harassment are told they are "hyper-complainers."
  • Teachers -- again, most of whom are women -- are told their protests against making a pittance are "at the expense of kids."
  • Girls are told by school officials they need to cover up, because their bodies are too distracting.
America's schools are swimming in sexism. Both teachers and students suffer from the consequences of systemic misogyny.

Add to all this the hidden (and not so hidden) curricula in racism, homophobia, heteronormativity, Islamophobia, and so on...

You know, I don't know why a social conservative like Betsy DeVos is against public schools. They seem to be transmitting exactly the values she and her ilk hold so dear.

“I think that putting a wife to work is a very dangerous thing.”- Donald Trump.

* OK, yes, Divine Child is a Catholic school. But it's not like the phenomenon of slut-shaming at the prom is restricted to private schools:

Prom is supposed to be the most magical night of your high school life — you get your hair and makeup done; you wear the gorgeous gown that makes your mom cry, "You're all grown up"; and you generally look flawless as you kiss good-bye to your awkward years. 
For these teens, prom was ruined when their outfits were banned. Check out their "inappropriate" and "immodest" choices to see for yourself that these girls look beautiful, no matter what their school says.
I don't have daughters, but if I did, I wouldn't have a problem with them wearing any of these outfits. Your mileage may vary, but that's the point: why is the school making these decisions? As one of the girls -- who is wearing what I would say is a very modest dress -- says:
"Maybe instead of teaching girls that they should cover themselves up, we should be teaching boys that we're not sex objects that they can look at."

By the way: #6 is infuriating. What is wrong with people?

Monday, April 30, 2018

Don't Blame Teachers For School Underfunding: A Data Tale From Jersey City

The animosity between NJ Senate President Steve Sweeney and the NJEA, New Jersey's largest teachers union, is already well-known. Add to that the rivalry between Sweeney and Jersey City mayor Steve Fulop, and Sweeney's desire to amend the state's school funding system... well, Sweeney's latest dig at Jersey City's teachers and board of education really shouldn't have surprised anyone:
In a statement issued Friday, state Senate President Stephen Sweeney blasted the Jersey City Board of Education for approving the agreement, which will increase district spending on teacher salaries by 3.31 percent during the current school year and 2.72 percent during 2018-19. The board approved the contract by a 5-1 vote Thursday night. 
Sweeney (D-Gloucester) said the Jersey City school district already receives more state funding than it should – district officials have dismissed this as untrue. Sweeney added that salary increases amid a $71 million shortfall in the district's proposed budget sends the wrong message to other schools.
"What makes it even worse is that the Jersey City Board of Education wrote a blank check that taxpayers in every other school district in New Jersey are going to have to reach into their pockets to pay," Sweeney said. "That's because Jersey City continues to get $151 million a year more in state aid than it would be receiving if the school funding formula was run fairly with the 10-year-old growth caps and Adjustment Aid eliminated." [emphasis mine]
Others have reported Sweeney claims Jersey City is over-aided by $174 million; let's stick with the lower figure for now to be conservative (you'll see why in a minute). Sweeney arrives at this figure because Jersey City, and several other districts, benefit from a provision in the School Funding Reform Act (SFRA) called "adjustment aid." This aid was included in the original 2008 law to mitigate the shock school districts might face when transitioning to the new formula; it keeps districts from falling below the level of aid they received prior to the new law. However, it has also led to some districts currently receiving more state aid than they would get if the provision weren't included.

Jersey City gets a lot of adjustment aid, which likely helps it keep its local taxes lower than they would be otherwise. To illustrate, I took this chart from the Education Law Center's website†:

There really is little doubt Jersey City should be contributing more local tax revenues toward its schools; whether it can at the moment, given the state's property tax cap, is an open question.* That said, and as ELC** points out in this brief, the district is still not getting all the funding it needs, from either the state or local sources, to provide an adequate education for its students.

Which makes Sweeney's statement even more interesting. Because his clear implication is that Jersey City is giving its teachers a big raise*** on the backs of other school districts, who don't get nearly as much state aid. But he's also claiming property taxes in Jersey City are artificially low, again because of an excess amount of state aid.

Is this possible? Is Jersey City so "over-aided" that it can afford big teacher salaries and low property taxes?

Again, I'll leave aside the question of taxes and instead focus on teacher salaries. Because I happen to have data available to take a reasonable stab at answering this question: Are Jersey City's teachers significantly overpaid compared to their colleagues in neighboring school districts? If not, is it really fair of Sweeney to call this recent contract irresponsible?

Let's start by looking at how much JC's teachers make compared to their colleagues in the other school districts in Hudson County (click to enlarge).

At first glance, when we look just at the average Jersey City salary compared to the rest of the county, it appears JC teachers are doing relatively well -- not spectacularly well, but well. Bayonne, Guttenberg, Weehawken and East Newark**** teachers seem to pay a serious wage penalty for not working in JC...

Or do they? One of the problems with simply comparing average (or even median) salaries is that it doesn't account for how teachers are paid in the real world. For example:

Like all public school teachers (and like many, many others in both the public and private sector), Hudson County teachers are paid more when they have more experience; this explains the upward slope of these lines, showing pay raises as teachers gain seniority. Jersey City (the dashed red line) has a slightly earlier bump up in experience than most other Hudson County districts.

However, when JC teachers reach their 30th year, their pay is rather average. In fact, the best-paying district in Hudson County, accounting for experience, appears to be Hudson County Vo-Tech. Which, again, is interesting, given Sweeney's full-throated support for vo-tech schools.*****

Now, whether Jersey City is paying relatively more than other districts for its teachers also depends on how experience is distributed. So let's look at that next:

Jersey City does have a somewhat larger concentration of teachers with 15 to 19 years of experience; that might help explain a somewhat higher average salary for all JC teachers than other Hudson County districts.

But teacher pay doesn't just vary with experience. Earning an advanced degree leads to higher pay; living in a labor market that's more expensive, or pays more for teachers relative to other professions, changes pay. Keep in mind: these factors are out of the control of both the Jersey City Board of Education and the Jersey City Education Association, the local union that negotiated the contract. It's ridiculous to think either party could buck trends and norms followed across the state.

So how can we determine whether Jersey City teachers are really "overpaid"? I've approached the problem using a regression model: a statistical technique that allows us to "hold things constant." Using seven years of data on every teacher in the state, I've tried to model how experience, full-time/part-time status, labor market, job description, highest degree earned, and other factors affect teacher pay (nerds, I give the details on the regression model below).

The model allows us to predict how much a teacher might earn, given all these factors. The amount above or below prediction (the residual) can't be explained by the variables in the model; we will assume, therefore, that this amount is how much each teacher is "over-" or "under-" paid, relative to other teachers in the state.

So: are Jersey City teachers way overpaid? Put simply: no, not really.
This is expressed as a ratio of actual salary over predicted salary; a ratio of "1" means the salary is exactly what the model predicts, so the teacher isn't "over-" or "under-" paid, given their experience, degree, labor market, etc.
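For anyone who wants to see the mechanics, here is a toy sketch of the actual-over-predicted idea. It is not the model used in this post: it has a single predictor (experience), two invented districts "A" and "B", and made-up salaries, and the way it rolls teachers up into a district ratio (total actual pay over total predicted pay) is an assumption.

```python
# Toy illustration only: districts, experience, and salaries are invented,
# and one predictor stands in for the many variables in the real model.
from statistics import mean

# (district, years_of_experience, actual_salary)
teachers = [
    ("A", 2, 54000), ("A", 10, 71000), ("A", 20, 92000),
    ("B", 3, 55000), ("B", 12, 72000), ("B", 25, 97000),
]

def fit_line(xs, ys):
    """Ordinary least squares with one predictor: y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var = sum((x - x_bar) ** 2 for x in xs)
    slope = cov / var
    return y_bar - slope * x_bar, slope

intercept, slope = fit_line([t[1] for t in teachers], [t[2] for t in teachers])

def predicted(years):
    """Salary the fitted line predicts for a given amount of experience."""
    return intercept + slope * years

def district_ratio(district):
    """Total actual pay over total predicted pay; 1.0 means 'as predicted'."""
    rows = [t for t in teachers if t[0] == district]
    actual = sum(salary for _, _, salary in rows)
    expected = sum(predicted(years) for _, years, _ in rows)
    return actual / expected

for d in ("A", "B"):
    print(d, round(district_ratio(d), 3))
```

Because the residuals from a least-squares fit sum to zero across everyone in the data, one district landing above 1 means another lands below it. The ratio is always relative to the other teachers in the model, not to some absolute standard of fair pay.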

In Jersey City in 2016-17, teachers (as a group) were paid about 3.7 percent more than prediction. That hardly makes them the most "overpaid" teachers in Hudson County: Harrison, Hoboken, Secaucus, and Hudson Vo-Tech teachers were all "overpaid" more than Jersey City school staff (again, this doesn't account for administrators, nor for staff without certificates).

Let me stop here and clarify something: I am deliberately putting "under-" and "over-" paid in quotes, because this model cannot account for many other factors that would affect teacher pay. It may be that Jersey City has to pay more to attract the same quality of teacher candidate for a variety of reasons that can't be measured. Maybe teacher candidates didn't want to teach in a district that was under state control for a quarter of a century. Maybe they've heard, as I have, that the state monitors have made staff feel unappreciated. Maybe the traffic sucks.

All I'm trying to do here is provide some sort of empirical analysis to determine whether there's evidence that Jersey City teachers are the beneficiaries of the "over-aiding" of the district. To that end: let's see what the "overpayment"****** of Jersey City teachers costs the district.

I could choose all sorts of denominators to use, but let's keep this simple: how much of the total appropriations of the Jersey City Public Schools can be attributed to the "overpayment" of teachers? About 1.3 percent.

But let's get back to Senator Sweeney's complaint: how much of the "over-aiding" of Jersey City gets gobbled up by the "overpayment" of Jersey City's teachers? About 6 percent -- that's barely a blip.
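A rough back-of-envelope check on how these two percentages fit together, using only the rounded figures quoted in this post. The dollar amounts it prints are implied by those rounded numbers, not stated anywhere above, so treat them as ballpark values; the variable names are just labels for this sketch.

```python
# Back-of-envelope arithmetic from the rounded figures quoted in the post.
over_aid = 151_000_000           # Sweeney's lower "over-aiding" figure, per year
share_of_over_aid = 0.06         # "about 6 percent"
share_of_appropriations = 0.013  # "about 1.3 percent"

# Implied dollar size of the "overpayment" (an inference, not a quoted number).
implied_overpayment = over_aid * share_of_over_aid

# Implied total appropriations, assuming both shares describe the same dollars.
implied_appropriations = implied_overpayment / share_of_appropriations

# The same dollars measured against the higher $174 million figure.
share_vs_higher_figure = implied_overpayment / 174_000_000

print(f"implied overpayment:    ${implied_overpayment:,.0f}")
print(f"implied appropriations: ${implied_appropriations:,.0f}")
print(f"share of $174M figure:  {share_vs_higher_figure:.1%}")
```

This is also why the lower $151 million figure is the conservative choice: against the $174 million figure, the same "overpayment" is an even smaller share of the "over-aiding."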

The idea that Jersey City's teachers substantially benefit from the "over-aiding" of the district is not supported by a reasonable analysis of the available data.

I'm going to run the risk of pissing off a few friends here, but let me put this on the table:

Senator Sweeney and I have a lot of disagreements. I was, like almost every other teacher in the state, extremely disappointed by his support of Chris Christie's attack on our pensions and health benefits. I think Senator Sweeney is dead wrong about the benefits -- and largely blind to the harms -- of the expansion of charter schools in Camden (call them whatever you want, they're charter schools). I also think Senator Sweeney is dead wrong on taxation.

That said: Steve Sweeney has valid concerns about New Jersey's state school aid formula. He is right to note that the growth caps have got to be addressed. He is right to state that communities like Jersey City ought to be contributing more toward the funding of their schools. He is right to champion the districts in this state that are often overlooked in the debate over school funding, yet whose students are suffering real harm due to inadequate funding.

So I'm willing to take Steve Sweeney at his word. I do believe he is concerned that there are students in New Jersey school districts who are suffering right now because they can't get adequate funding for their schools.


The idea that the students of Bayonne are being denied an adequate education because of the greed of the teachers of Jersey City is just plain wrong.

There is no evidence Jersey City teachers are wildly overpaid. There is no evidence the small bump JCEA members enjoy in their wages is a major part of the "over-aiding" of the district. I understand NJEA gave Sweeney a few bruises. But making arguments that pin the blame for the underfunding of New Jersey schools on Jersey City's teachers is not helpful in the slightest.

Look, schools cost what they cost. If you want certain outcomes, you have to pay for them (we need to have a good long talk about this idea soon...). By the state's own formula, Jersey City's schools are not over-funded.

In addition: if you want good teachers, you need to pay good wages. New Jersey actually underpays its teachers relative to the rest of the labor market. If Jersey City is paying its teachers a bit more, that's a good thing. Why come down on the district for trying to get good people to come into the profession?

Senator Sweeney, instead of slamming Jersey City's teachers for standing up for themselves and demanding decent pay...

Why don't we instead work to get all districts the funding they need to bring the best and the brightest into New Jersey's classrooms?

For the record: I am a proud NJEA member, and I am proud to stand with my fellow public school teachers in Jersey City, and everywhere else in the state.

* I really don't want to wade into this on this post, because, to be honest, I just haven't had time to look at it carefully. But some, like Jeff Bennett, argue Jersey City could increase its revenues without the state raising its property tax cap. Bennett (who, despite our policy differences, I genuinely respect) has told me Jersey City hasn't even raised its tax rates as high as it could under the current cap. I have no reason to doubt Jeff, but I haven't looked into the topic myself. 

** For the record: I have done work as a contractor for ELC in the past.

*** Something worth noting: when you see a report that teachers are getting a "... 3.31 percent during the current school year and 2.72 percent during 2018-19," understand that doesn't mean all of the teachers are getting more money. Public schools operate on salary guides, which provide a raise for every year of service up until a final "step." You need to add money into a guide like that just to maintain it. So those at the "top of the guide" might actually be getting no raise, depending on how the guide is structured.

Teacher salary guides are a really complex topic; maybe I'll try to get to it at some point...
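In the meantime, a minimal sketch of the basic point, using a completely hypothetical five-step guide and four invented teachers. It ignores retirements, new hires, and degree columns, all of which matter in a real guide.

```python
# Hypothetical five-step salary guide; every number here is invented.
guide = [50_000, 53_000, 56_000, 59_000, 62_000]  # salary at steps 1 through 5
staff_steps = [1, 3, 5, 5]                         # current step of each teacher

def salary(step):
    """Salary paid at a given step of the guide."""
    return guide[step - 1]

def next_step(step):
    """Move up one step, unless already at the top of the guide."""
    return min(step + 1, len(guide))

old_payroll = sum(salary(s) for s in staff_steps)
new_payroll = sum(salary(next_step(s)) for s in staff_steps)
payroll_growth = (new_payroll - old_payroll) / old_payroll

# Raise each teacher actually sees when everyone advances one step.
individual_raises = [salary(next_step(s)) - salary(s) for s in staff_steps]

print(f"payroll growth: {payroll_growth:.2%}")
print("individual raises:", individual_raises)
```

Here the guide itself never changes, yet payroll grows because two teachers move up a step, while the two teachers already at the top get nothing. A headline percentage describes total spending on salaries, not what any one teacher takes home.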

**** Actually, the East Newark data for 2016-17 looks off because a lot of the teachers who should be 1.0 full-time equivalents are listed as 0.1 FTEs. I tried as best as I could to clean up this rather obvious mistake.

***** To be clear: I join with Senator Sweeney in supporting vo-tech programs and schools. More Vo-Tech!

I just don't understand why the senator is complaining about Jersey City teachers getting a raise when they make less than teachers at the county's vo-tech school. Why isn't he blaming them for underfunding elsewhere? (OK, he shouldn't, but you get my point, right?)

****** Yes, these quotes are stupid. You have a better idea?

The Regression Model:

I have a panel of certificated staff data from 2010 to 2017. 2013 is excluded because some of the teacher characteristics data weren't included. The model I use is:
salary = f(prior_exp_years FTE i.highest_ed_comp i.metajobcode i.lmencode i.data_year i.charter charter#data_year logEnroll)

  • prior_exp_years: Total years of experience, in and out of NJ or the district.
  • FTE: Full-time equivalency.
  • highest_ed_comp: Highest degree earned.
  • metajobcode: Job description, divided into larger categories (e.g., all science teachers bundled).
  • lmencode: Labor market; I used counties. 
  • data_year: The year. 
  • charter: Whether the school is a charter. I know some of you might push back a bit, but the fact is a teacher suffers a wage penalty for working in a charter. Given that reality, it's not rational to expect Jersey City teachers to make charter school wages; in fact, there is a very good case to be made that JCPS teachers are propping up the city's burgeoning charter sector through wage free-riding.
  • charter#data_year: Given the volatility of the state's charter sector, interacting it over time seemed reasonable. 
  • logEnroll: OK, so this one had me thinking. We know for a fact that school districts enjoy economies of scale. It may well be those districts then use the savings to recruit more desirable teacher candidates, or make up for recruitment hardships that can't be measured. It may also, however, be that larger districts create larger teachers unions, which leverage more bargaining power. But do districts really have much control over how big they are? Hmm... Ultimately, I kept this in the model because it matters -- but I'm open to debate. In any case, removing it does up the "overpayment" ratio for Jersey City, but only to about 1.06. That's not enough to make a serious dent in the amount JC is "over-aided."
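For readers unfamiliar with the notation in the model line: the "i." prefix and the "#" look like Stata-style factor-variable syntax, where "i." turns a categorical variable into a set of 0/1 indicator columns and "#" interacts two variables. Here is a hedged sketch of what that expansion means for a single staff record. The category levels, the baseline choices, and the function names are all invented for illustration, and metajobcode, lmencode, and logEnroll are left out to keep it short.

```python
# Hypothetical sketch: expand one staff record into regression columns.
# Levels and baselines below are invented; they are not the post's data.

LEVELS = {
    "highest_ed_comp": ["BA", "MA", "PhD"],
    "data_year": [2016, 2017],
    "charter": [0, 1],
}

def indicators(name, value, levels):
    """One 0/1 column per level except the first, which is the baseline."""
    return {f"{name}={lvl}": int(value == lvl) for lvl in levels[1:]}

def design_row(rec):
    """Build the columns a regression would see for one record."""
    row = {
        "const": 1,
        "prior_exp_years": rec["prior_exp_years"],
        "FTE": rec["FTE"],
    }
    for name, levels in LEVELS.items():
        row.update(indicators(name, rec[name], levels))
    # charter#data_year: lets the charter effect differ in each non-baseline year
    for yr in LEVELS["data_year"][1:]:
        key = f"charter=1#data_year={yr}"
        row[key] = int(rec["charter"] == 1 and rec["data_year"] == yr)
    return row

example = {
    "prior_exp_years": 12, "FTE": 1.0,
    "highest_ed_comp": "MA", "data_year": 2017, "charter": 0,
}
print(design_row(example))
```

Each indicator's coefficient is then read as a difference from the baseline level, holding the other columns constant. That is what "holding things constant" amounts to in practice.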
† Bad mistake in the original post: I inadvertently put Newark's LFS v. Levy chart up, not Jersey City's. Sorry about that -- correction made.