When the State Board of Education voted earlier this month to remove standardized test scores from teacher evaluations, reaction was mixed.

The teachers’ unions were predictably pleased, while the Hartford Courant’s editorial board denounced the action, calling it a move “back to squishy teacher evals.”

“There needs to be a hammer, some way for administrators who know which teachers need to be put out to pasture, encouraged to consider a different career or given help to improve their performance,” declared the editorial. “A strong objective measure is crucial to that effort. Mushy, ill-defined subjective criteria provide no accountability.”

What’s needed, in other words, is the kind of data produced by standardized tests.

“Data, for all their limitations, are real,” added the editorial. “Data present a clear picture. Data are impartial. Data are not interested in protecting anyone. Data get to the truth. Data are not stories concocted by administrators to ‘nurture’ ineffective teachers.”

Just one problem: The “data” gleaned from standardized test scores are not nearly as clear, impartial, and truthful as the Courant’s editorial board would like to believe.

First of all, standardized tests are written to evaluate students — not teachers — so a methodology was created to translate student scores into teacher scores: the value-added measure, or VAM. “The models work by comparing a student’s estimated score on a standardized test to the student’s actual score — the difference between the two is the teacher’s VAM. The estimated score is based on past student test scores and sometimes other factors such as poverty and disability status. A teacher’s overall VAM score is computed by averaging together the value added to each of his or her individual students.”
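The arithmetic in that description can be sketched in a few lines of Python. This is only an illustration of the averaging step as quoted, not any state's actual model: the function name, the roster, and every score below are invented, the sign convention (actual minus estimated) is an assumption, and the hard part, producing each student's estimated score, is taken as given.

```python
from statistics import mean

def teacher_vam(students):
    """Average of (actual - estimated) across a teacher's students.

    Assumes each student's estimated score has already been produced
    by some prediction model; that model is not shown here.
    """
    return mean(s["actual"] - s["estimated"] for s in students)

# Hypothetical scores, invented purely for illustration
roster = [
    {"estimated": 2500, "actual": 2520},  # 20 points above the estimate
    {"estimated": 2450, "actual": 2440},  # 10 points below the estimate
    {"estimated": 2600, "actual": 2605},  # 5 points above the estimate
]

print(teacher_vam(roster))
```

For this made-up roster the three differences average out to 5 points. Note how much rides on the estimates: shift one student's estimated score and the teacher's number moves with it.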

Not every teacher can receive a VAM score, however, since it is only computed for teachers of grades and subjects that include a standardized test. In Connecticut, the SBAC test is administered in grades 3-8 and the SAT in 11th grade. So how do we evaluate teachers of students in kindergarten, 1st, 2nd, 9th, 10th, and 12th grades?

What’s more, the VAM is hardly foolproof.

“In the 2013-14 school year, Sheri Lederman, a highly-regarded veteran elementary school teacher in Great Neck, [New York], was judged ‘ineffective’ on a specific portion of her annual evaluation, based on a [VAM] measuring how much her students improved on standardized state tests.” Lederman sued and ultimately won, the judge ruling that her VAM-based evaluation was “indisputably arbitrary and capricious.”

Justice Roger D. McDonough noted that “a teacher’s score can fluctuate dramatically from year to year, including in Lederman’s case: the year after being deemed ineffective, she scored effective based on the same statistical model. The court’s decision also said that growth models can penalize teachers of particularly low- or high-achieving students. Finally, the judge criticized the fact that this approach to evaluations creates a bell curve, ensuring that some teachers will be marked below average regardless of overall results.”

So, Hartford Courant, did “the data get to the truth” in this case?

And then there’s this: “About three-quarters of school psychologists from among [New York’s] nearly 700 school districts said state tests are causing greater anxiety than local assessments,” according to survey results in 2015. “The report contended that the test anxiety is more common at the elementary-school level, saying students more often showed ‘internalized’ symptoms such as excessive worry and withdrawal rather than demonstrating ‘externalized’ symptoms, such as increased irritability, frustration and acting out.”

This finding — combined with the fact that 20 percent of students in New York opted out of the state’s Common Core test in 2015 — should raise more than a few questions about the validity and representative nature of student test scores. Is this the type of “clear and impartial data” we want to connect to teacher evaluations?

Admittedly, value-added measures do possess potential as one future measure of teacher quality. Economists, for example, are currently “engaged in cordial debate” regarding the efficacy of VAMs.

As things stand right now, though, connecting teacher evaluations to standardized test scores is ill-advised. The Hartford Courant might call the current evaluation system “squishy,” but using student test scores only exacerbates the “squish factor.” At least that’s what the data tell us.

Barth Keck is an English teacher and assistant football coach who also teaches courses in journalism, media literacy, and AP English Language & Composition at Haddam-Killingworth High School.

DISCLAIMER: The views, opinions, positions, or strategies expressed by the author are theirs alone, and do not necessarily reflect the views, opinions, or positions of CTNewsJunkie.com.

Barth Keck is in his 32nd year as an English teacher and 18th year as an assistant football coach at Haddam-Killingworth High School, where he teaches courses in journalism, media literacy, and AP English Language & Composition. Follow Barth on Twitter @keckb33.
