William J. Reese is a professor of educational policy studies and history at the University of Wisconsin, Madison, and the author of “Testing Wars in the Public Schools: A Forgotten History.” This op-ed appeared in The New York Times on April 20, 2013.
For the nearly 50 million students enrolled in America’s public schools, tests are everywhere, whether prepared by classroom teachers or by the ubiquitous testing industry. Central to school accountability, they assume familiar shapes and forms. Multiple choice. Essay. Aptitude. Achievement. NAEP, ACT, SAT.
To teachers everywhere, the message is clear: Raise test scores. No excuses. The stakes are very high, as the many cheating scandals unfolding nationally reveal, including most spectacularly the recent indictment of 35 educators in Atlanta.
But we should also be wondering, where did all this begin? It turns out that the race to the top has a lot of history behind it.
Members of the Boston School Committee fired the first shots in the testing wars in the summer of 1845. Traditionally, an examination committee periodically inspected the local English grammar schools, questioned some pupils orally, then wrote brief, perfunctory reports that were filed and forgotten.
Many Bostonians smugly assumed that their well-funded public schools were the nation’s best. They, along with many visitors, had long praised the local system, which included a famous Latin school and the nation’s first public high school, founded in 1821.
Citizens were in for a shock. For the first time, examiners gave the highest grammar school classes a common written test, conceived by a few political activists who wanted precise measurements of school achievement. The examiners tested 530 pupils — the cream of the crop below high school. Most flunked. Critics immediately accused the examiners of injecting politics into the schools and demeaning both teachers and pupils.
The testing groundwork was laid in 1837, when a lawyer and legislator in Massachusetts named Horace Mann became secretary of the newly created State Board of Education, part of the Whig Party’s effort to centralize authority and make schools modern and accountable. After a fact-finding trip abroad, Mann claimed in 1844 in a nationally publicized report that Prussia’s schools were more child-friendly and superior to America’s. Boston’s grammar masters, insulted, attacked Mann in print, and he returned the favor. In December, some Whig reformers, including Mann’s close friend Samuel Gridley Howe, were elected to the School Committee and soon landed on the examining committee.
Howe masterminded the use of written tests. His committee arrived at Boston’s grammar schools with preprinted questions, which angered the masters and terrified students. Pupils had one hour to write down their answers on each subject to questions drawn from assigned textbooks.
The examiners explained in a lengthy report that they wanted “positive information, in black and white,” to reveal what students knew. For further comparison, Howe’s committee gave the same test in towns outside of Boston, including Roxbury, then a prosperous suburb.
All summer, Howe and his colleagues hand-graded the tests, evaluating 31,159 responses. The average score was 30 percent. The committee wrote a searching commentary on the outcome and prepared tables ranking the schools by average score. They all fell short of the standard achieved in Roxbury.
At the School Committee meeting in August, when masters were traditionally reappointed, the three examiners presented their findings. Newspapers were packed with editorials and letters to the editor, attacking or praising Howe’s report, which Mann publicized in his influential Common School Journal. Urban districts across the nation started giving similar tests.
The examiners’ report lambasted the schools. “Some of the answers are so supremely absurd and ridiculous,” the committee noted, that one might think the pupils were “attempting to jest with the Committee.” Pupils had memorized material they often did not understand. Those who could repeat lines from the famous poem “Thanatopsis” could not define the word in the title. Students could not explain whether Lake Ontario flowed into Lake Erie or the other way around. Anyone who has ever listened to children who just took a standardized test can imagine their consternation.
The examiners believed that the teacher made the school, a guiding assumption in the emerging ethos of testing. Tests, they said, would identify the many teachers who emphasized rote instruction, not understanding. They named the worst ones and called for their removal.
Controversies surrounded every aspect of the exam. Examiners caught one master leaking questions to his pupils. They censured the head teacher in the segregated Smith School for not seeing potential in African-American children, whose scores were abysmal. They asked why Boston had fallen behind a suburban rival. They presciently suggested that tests would one day compare schools across national boundaries.
Anticipating an angry reaction from parents, Mann told Howe to deflect criticism from the examiners by blaming the masters for low scores. While the School Committee fired a few head teachers, parents nevertheless accused Howe of deliberately embarrassing the pupils and bounced him out of office in the next election.
Other reformers took Howe’s place, and the testing continued. No one could explain, however, why some schools did better than others. Statistical analysis was in its infancy, and written examinations documented unequal results without easy solutions.
“Comparison of schools cannot be just,” the chairman of the examining committee wrote in 1850, “while the subjects of instruction are so differently situated as to fire-side influence, and subjected to the draw-backs inseparable from place of birth, of age, of residence, and many other adverse circumstances.”
What can we learn from the advent of what we have come to call “high-stakes testing”? What transpired then still sounds eerily familiar: cheating scandals, poor performance by minority groups, the narrowing of the curriculum, the public shaming of teachers, the appeal of more sophisticated measures of assessment, the superior scores in other nations, all amounting to a constant drumbeat about school failure.
Raising test scores is still the mantra of every school reformer, whatever their motivations. Yet the same issues that plagued 19th-century Boston remain. Poor children lag, and affluent parents patronize the most exclusive schools to separate their children from anyone labeled “below average.” The survival instinct encourages many teachers to teach to the test, relying on the rote methods that the original exams sought to expose.
The members of Howe’s committee were mesmerized by the charms of numbers, tables and ranked lists, but they also warned that schools performed many important tasks, not easily measured statistically, like teaching norms of civility and good citizenship. And what the public wants from its schools has only grown.
We have come a long way since the summer of 1845. Public education, then in its infancy, is now universal. Testing yields essential, valuable knowledge about school performance, but its exaggerated use distorts teaching and ignores the broader purpose of education. As Howe’s committee insisted, test results should not be the full and final judgment on schools and their teachers. There is more to a child’s education than “positive information, in black and white.”