They will tell you the PARCC will help ensure that students are "college-ready." They will tell you the PARCC will "provide parents with important information." They will tell you the PARCC is "generations better" than previous standardized tests.
People are certainly entitled to their opinions, but let's be clear: at this point, there is very little evidence to back up any assertions of the PARCC's superiority. In truth, there is a great deal we don't know about the PARCC:
We don't know if the PARCC is more reliable or valid than the NJASK, or any other statewide standardized test.
Those who claim that the released PARCC sample items are "better" than the questions on the old NJASK put the rest of us at a disadvantage: we never got to see the NJASK. In fact, any claim of the PARCC's superiority over the old tests fails if only because the NJASK was never properly studied; we don't really know how "good" or "bad" the NJASK actually was.
There are two major considerations for any test: validity and reliability. Validity speaks to whether the test measures what you want to measure; reliability deals with the consistency of a test's results. I've been looking, and, so far, I've found no evidence the PARCC is more reliable than any other standardized statewide test.
And we have very little information as to the external validity of the PARCC, if only because it is so new. We don't know if better results on the PARCC correlate more tightly to better outcomes in college or career. How could we? We haven't even administered the test yet!
When anyone asserts that the PARCC is somehow "better" than another test, they are offering an opinion based on personal preference. That's perfectly fine (and it's worth noting that some people's preferences are better-informed than others). But claims about PARCC's superiority over what came before it are not currently backed up by objective evidence, and PARCC's cheerleaders ought to be far more circumspect in making their claims.
We don't know if the PARCC has better predictive validity for "college and career readiness" than other standardized tests.
I'm going to make a bet right now: $50 (hey, I make a teacher's salary...) says scale scores on the PARCC and scale scores on the NJASK for individual students are highly correlated. Of course, no one with access to this data is going to take me up on this bet, because they know that a student who scores well on one standardized test will almost certainly score well on a different one.
The primary task of standardized tests is to rank and order students. If you doubt me, look at how the NJDOE is going to report the results: it's all based on how students do compared to other students.
The notion of "college and career readiness" (which I think is utterly phony anyway) isn't supposed to be tied to the ranking of students. It's supposed to be about whether students have acquired the knowledge needed to be successful adults. But ranking students is what the PARCC is designed to do; setting the proficiency levels comes later (see below).
I can guarantee you that the ranking and ordering of New Jersey's students on the PARCC will barely differ from their ranking on the NJASK.* If that's the case, what could possibly make the PARCC any "better"?
We don't know whether the level of "rigor" the PARCC is allegedly measuring is developmentally appropriate.
As parents and other stakeholders take a closer look at the sample items that have been released, they grow increasingly concerned that the PARCC is not developmentally appropriate. Russ Walsh has produced evidence that some PARCC sample tests overreach in the difficulty level of their reading passages.
There's no point in setting high standards for students if they can't reasonably achieve them. And I haven't seen any evidence that the standards the PARCC demands can be achieved by the large majority of our students.
What I do know is that tests like the PARCC must have items with various degrees of difficulty in order to create a normal or "bell-curve" distribution of scores. This summer, a committee consisting of lord-knows-who will "benchmark" the test and set the performance levels -- after the test has been administered.
New York went through this process last year, leading to the crashing of proficiency levels and the wailing and gnashing of teeth by reformies like Governor Andrew Cuomo. He promptly decided to dump all the blame on teachers and ignore his own failure to provide adequate funding for New York's schools. This, of course, appears to be the real purpose of standardized tests: ammunition for politicians to get what they want.
It's perfectly fine to benchmark an exam after the fact. But doing so highlights the normative nature of setting proficiency levels. The criteria for setting "cut scores" -- the levels needed to show various degrees of proficiency -- aren't based on some objective idea of learning; they're based on how all of the students did on the test. Which is why the cut scores are going to be set after the PARCC is administered, when the benchmarkers can see the results for each test item and determine how difficult it was.
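To see why "normative" matters, here's a minimal sketch with hypothetical numbers (the scale, the distribution, and the 40% target are all assumptions, not anything PARCC has announced): once the scores are in, a "proficient" cut can simply be placed at whatever point yields a chosen share of students above it. The cut follows the distribution of results, not some external definition of what students should know.

```python
# Sketch of NORMATIVE cut-score setting (all numbers hypothetical):
# after the scores come in, the cut is placed at a percentile of the
# distribution -- so the share deemed "proficient" is a choice, not
# a measurement.
import random

random.seed(7)
scores = [random.gauss(750, 50) for _ in range(10000)]  # hypothetical scale scores

def percentile(data, p):
    """Score below which roughly p percent of students fall (simple method)."""
    s = sorted(data)
    k = int(p / 100 * (len(s) - 1))
    return s[k]

# Decide, after the fact, that the top 40% will count as "proficient":
cut = percentile(scores, 60)
proficient = sum(sc >= cut for sc in scores) / len(scores)
print(f"cut score: {cut:.0f}, share proficient: {proficient:.0%}")
```

Run it and about 40% of students come out "proficient" -- by construction. Pick the 70th percentile instead, and proficiency "crashes" to 30% without a single student learning any less.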
I know this is knotty stuff: lord knows I've struggled with writing about it before. But the critical point is this: given the variation in the abilities of our students and the amount of resources we are willing to devote to public schools, it is reasonable to question whether all students can achieve the levels of "rigor" the PARCC is calling for.
That isn't a statement excusing low expectations for children: it's a statement informed by the knowledge that this test is going to rank and order students. And, logically, not everyone can be above average.
We don't know how large the bias resulting from the computerized format of the PARCC will be.
I was just at a "Take the PARCC" event last night (more on that later). Even the parents and teachers I spoke with who didn't have a problem with the content of the PARCC admitted that children who are more computer literate are going to have an advantage on this exam.
I'll leave it to others to point out the design flaws in the user interface of the PARCC (scroll windows within scroll windows?). And I'll certainly acknowledge paper tests can and do have design flaws.
But there's no question in my mind that a child with regular access to a modern computer with high-speed internet access at home will feel far more comfortable in the PARCC testing environment than a child without that access. At the very least, we ought to study the extent of this bias before we make high-stakes decisions based on PARCC results.
We don't know if the PARCC is sensitive to changes in instruction.
I know regular readers have seen this a billion times, but once again...
It is impossible to deny the correlation between socio-economic status and standardized test scores. And yet we're using these scores to make high-stakes decisions about schools and teachers and even students (yes, we are) without appropriately acknowledging this bias.
Worse: even the PARCC people admit they don't know how this test does at measuring the quality and alignment of instruction. We are attributing all sorts of causes for the variations in PARCC test scores without even knowing the extent of the relationship between school and teacher effectiveness and those scores.
Do I even have to point out how insane this is?
Look, I'm not going to defend the NJASK or any other pre-PARCC test. As I've said many times: we barely knew anything about that test, or many of the other statewide tests that were administered in the wake of No Child Left Behind.
I'll also risk alienating some of you by stating, once again, that I believe there is an appropriate and reasonable use for standardized testing, especially in formulating policy. I think test results can help inform decisions, even if using them to compel decisions is totally unwarranted and, frankly, ignorant.
Lord knows my job as a researcher and blogging smart-ass would be far more difficult if I didn't have test scores to work with. Much of my work in advocating for teacher workplace rights and fair/adequate school funding and reasonable charter school policies relies on standardized test results.
But when I and others use this data, we use it appropriately, with full acknowledgment of its limitations and flaws. And we certainly don't make unsubstantiated claims about how the tests themselves are going to radically improve instruction and outcomes for students.
It's time for the PARCC cheerleaders to take a step back and think more clearly about their claims. It's time for them to start showing a little more humility and a little more healthy skepticism. It's time for them to stop holding on to arguments that have little evidence to back them up.
We know way less about the PARCC than many would have us believe. We have very little evidence that it is "better" than what came before. Let's at least wait until we've studied it before we claim otherwise.
* One caveat: we might see the ceiling go up a bit, especially in math. More on this later.