Haney testimony

 
 

Summary of Haney Comments
Gardner Auditorium
State House, Boston.
June 20, 2001


Dr. Walter M. Haney
Center for the Study of Testing, Evaluation and Educational Policy
Lynch School of Education
BOSTON COLLEGE
Chestnut Hill MA 02467
Ph. 617-552-4521
Email: haney@bc.edu



I Introduction

Thank you for the opportunity to speak with you. Let me start by providing
a brief biographical sketch.
(For more see http://www2.bc.edu/~haney/)

By way of introduction, I would like to make one general point:

Test results should not be used in isolation to make important decisions
about institutions or individuals.

- Use of standardized test results in isolation to make decisions about
students, teachers and/or schools is contrary to professional standards
regarding test use. (See for example the statement of AERA, http://www.aera.net/about/policy/stakes.htm)

- Regarding the folly of using annual test results to rate institutions, see
the recent working paper by Kane and Staiger (2001) from the National Bureau of
Economic Research (http://www.nber.org/papers/w8156). They show that
annual school test results are mainly random noise - not a meaningful
indication of school quality.

- Regarding use of test results to make decisions about individuals,
decades of research on college admissions testing show that it is far
more sound (more valid and with smaller adverse impact on minorities and
females) to make decisions flexibly using test scores, grades and other
information rather than to make decisions mechanically based on test
scores alone.

- Finally, if nothing else, the recent expose of widespread errors in
test scoring and reporting (Henriques & Steinberg, NYT, May 20, 2001;
Steinberg & Henriques, May 21, 2001) should make clear how unwise it is
to make important decisions based on test scores in isolation.

II Weaknesses in MCAS
Even well developed tests should not be used in isolation to make
important decisions, but the MCAS, in particular, is not suitable for
making such decisions.

The MCAS tests have been hastily developed and are of doubtful technical quality.

Low-tech tests (that is, paper-and-pencil tests in which students have
to write longhand), and the MCAS in particular, seriously underestimate
the skills of students used to working on computers. (Russell & Haney,
2000; Russell & Plati, 2001)

Categorical designations based on MCAS results (of students, as
advanced, proficient, needing improvement, or failing), and of schools
(as exemplary or low performing) have no reasonable scientific basis.

Students supposedly "failing" MCAS have been shown to score well on
national tests, such as the Stanford-9, Explore and the Preliminary
Scholastic Aptitude Test (see, for example, figure 9 in Horn, et al,
2000, a copy of which is attached to these notes).


III The myth of the Texas miracle in education
Experience with high stakes testing in other places (such as Texas) has
shown that, while in the short-term gains may be apparent, the longer
term consequences can be highly negative, in terms of: increased rates
of students leaving school before graduation; undermining of academic
skills of students who remain in school, and loss of good teachers from
public schools.

A long account of the "myth of the Texas miracle in education" was
published in the Education Policy Analysis Archives, a peer-reviewed
scholarly journal published on the Internet at
http://epaa.asu.edu

The article can be accessed directly at

http://epaa.asu.edu/epaa/v8n41/

More recently, a summary and update to this work (as draft version 5) is
available at:

http://www.law.harvard.edu/civilrights/publications/dropout.html

A shorter version is appended to these notes. Here I summarize just two
major findings from this work.

When I first started studying education in Texas more than three years
ago, it quickly became apparent that the official statistics on
drop-outs reported by the Texas Education Agency (TEA) were highly
misleading (see Haney 2000, 2001, for a full explanation of why I became
suspicious of the TEA dropout statistics). Hence, I decided to study
data on grade enrollments and graduates in Texas. Thanks to help from a
number of generous people, I was able to assemble a set of TEA data on
the numbers of students enrolled by grade in Texas, and the numbers of
high school graduates from 1975-76 to 1998-99. Using these data I was
able to analyze rates of progression from grade to grade and from
various grades to graduation. I summarize just two such analyses here.

In analyzing enrollment data for Texas over the last 25 years, one of
the most striking patterns I found related to rates of progression from
grade 8 to 9. Specifically, I calculated and graphed the numbers of
students in grade 9 in each academic year divided by the numbers of
students in grade 8 the previous year.
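
As a minimal sketch of the ratio calculation described above: the function name and the enrollment counts below are hypothetical, chosen only for illustration, and are not TEA data. Each ratio is the grade 9 enrollment in one year divided by the grade 8 enrollment of the year before.

```python
def progression_ratios(grade8_by_year, grade9_by_year):
    """Return grade 9 enrollment in each year divided by
    grade 8 enrollment in the previous year."""
    ratios = {}
    for year, grade9 in grade9_by_year.items():
        previous_year = year - 1
        # A ratio can only be formed when the prior-year grade 8 count exists.
        if previous_year in grade8_by_year:
            ratios[year] = grade9 / grade8_by_year[previous_year]
    return ratios


# Hypothetical enrollment counts (illustration only, not TEA data).
grade8 = {1997: 100_000, 1998: 102_000}
grade9 = {1998: 130_000, 1999: 112_200}

ratios = progression_ratios(grade8, grade9)
for year, ratio in sorted(ratios.items()):
    print(year, round(ratio, 2))
```

A ratio above 1 means that more students are enrolled in grade 9 than were in grade 8 the year before. Reading the excess as grade 9 repeaters is a simplification that ignores migration and other movement into and out of the public schools.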

Results are shown in Figure 1. These grade 8 to 9 progression ratios are
greater than 1 because some students are failed in grade 9 and have to
repeat the grade. As can be seen from
Figure 1, in the late 1970s the progression ratios for Black, Hispanic
and White students in Texas were similar - less than 1.10, implying that
less than 10% of grade 9 students were flunked to repeat grade 9. Since
then, however, the progression ratios for minority and White students
have diverged sharply. By 1998-99, the grade 8 to 9 progression ratio
for Whites had increased only slightly to about 1.10. But for Black and
Hispanic students in Texas, the ratio had increased to about 1.30,
evidence that 25 to 30% of Black and Hispanic students were being
flunked to repeat grade 9. (Available evidence indicates that the vast
majority of students who are failed to repeat grade 9 do not persist in
school to high school graduation.) This pattern, with the grade 9
failure rate for Black and Hispanic students in Texas increasing from being
roughly equivalent to the rate for Whites in the late 1970s, to being
triple the rate for Whites by the late 1990s, clearly suggests that
educational inequities have increased sharply in Texas over the last two decades.

In another series of analyses, I sought to study the rates at which
students in Texas graduated from high school. In one set of analyses, I
calculated progress from grade 6 to high school graduation 6.5 years
later for the Texas high school classes of 1982 to 1999 simply in terms
of numbers of students (that is, total numbers of Black, Hispanic and
White students).

Results are shown in Figure 2. Also shown in this figure are the
differences, that is the numbers of students who do not make it from
grade 6 to high school graduation 6.5 years later. As can be seen, the
numbers of children lost between grade 6 and high school graduation in
Texas were in the range of 50 to 60 thousand for the classes of 1982 to
1986. The numbers of lost children started to increase for the classes
of 1986 and 1987 and jumped to almost 90 thousand for the class of 1991.
For the classes of 1992 through 1999, in the range of 75 to 80 thousand
children are being lost in each cohort.



Cumulatively for the classes of 1991 through 1999, there were a total of
2,226,003 White, Black and Hispanic students enrolled in grade 6 (in the
academic years 1984-85 through 1992-93). The total number graduating
from these classes was 1,510,274. In other words, for the graduating
classes of 1991 through 1999, 715,729 children in Texas or 32% were lost
or left behind before graduation from high school. Independent analyses
by Balfanz and Legters (2001, available at:
http://www.law.harvard.edu/civilrights/publications/dropout.html)
confirm that there are large numbers of high schools in Texas with very
low "holding power."
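
The cumulative figures above can be checked with a few lines of arithmetic. This is only a sketch that reuses the two totals quoted in the text; the variable names are illustrative.

```python
# Totals quoted in the text for the Texas classes of 1991 through 1999.
grade6_enrolled = 2_226_003   # White, Black and Hispanic students in grade 6
graduates = 1_510_274         # graduates from those same classes

# Students who did not progress from grade 6 to graduation.
lost = grade6_enrolled - graduates
share_lost = lost / grade6_enrolled

print(lost, f"{share_lost:.1%}")  # prints: 715729 32.2%
```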

IV Conclusion
In conclusion, MCAS results should not be used in isolation to determine
high school graduation or to make any other high stakes decisions about
individuals or institutions. To do so is to violate professional
standards, to ignore available research, and almost surely to undermine
the quality of public education in the Commonwealth.

References
Balfanz, R. & Legters, N. (2001). How many central city high schools
have a severe dropout problem, where are they located and who attends
them? Paper presented at the "Dropout Research: Accurate Counts and
Positive Interventions" Conference Sponsored by Achieve and the Harvard
Civil Rights Project, January 13, 2001, Cambridge MA. (Available at: http://www.law.harvard.edu/civilrights/publications/dropout.html.)
Haney, W. (2000). The myth of the Texas miracle in education. Education
Policy Analysis Archives Volume 8 Number 41, August 19, 2000. Published
on the WWW at: http://epaa.asu.edu/epaa/v8n41/. (A printed version of
this monograph is distributed by the Harvard Education Publishing Group.)
Haney, W. (2001). Revisiting the myth of the Texas miracle in education:
lessons about dropout research and dropout prevention. (Draft v. 5)
Revision of paper presented at the "Dropout Research: Accurate Counts
and Positive Interventions" Conference Sponsored by Achieve and the
Harvard Civil Rights Project, January 13, 2001, Cambridge MA. (Available
at: http://www.law.harvard.edu/civilrights/publications/dropout.html.)
Henriques, D. & Steinberg, J. (May 20, 2001). Right answer, wrong
score: Test flaws take toll, New York Times, p. 1. Available at http://www.nytimes.com/2001/05/20/business/20EXAM.html.
Horn, C., Ramos, M., Blumer, I., & Madaus, G. (2000). Cut scores:
Results may vary. Monograph of the National Board on Testing and Public
Policy Volume 1, No. 1, April 2000.
Russell, M. & Haney, W. (2000). Bridging the gap between technology and
testing. Education Policy Analysis Archives Volume 8 Number 41, March
28, 2000. Available on the WWW at: http://epaa.asu.edu/epaa/v8n41/
Russell, M. & Plati, T. (2001). Effects of Computer Versus Paper
Administration of a State-Mandated Writing Assessment. Teachers College
Record On-line. Available at http://www.tcrecord.org/Content.asp?ContentID=10709
Steinberg, J. & Henriques, D. (May 21, 2001). When a test fails the
schools, careers and reputations suffer, New York Times, p. 1.
Available at: http://www.nytimes.com/2001/05/21/business/21EXAM.html.

The Tragedy behind the Texas Education "Miracle" and the Hazards of High
Stakes Testing

By Walt Haney

One thing the two candidates in the first presidential debate of the new
millennium agreed on: America's schools need more high stakes
standardized testing. Gov. Bush proposes testing students in every
grade to see if they should be promoted and graduate from high school.
Seemingly trying to outdo Bush, Vice President Gore proposed testing all
prospective teachers in the nation to determine if they are competent.
But a close look at the recent educational history of Texas - probably
the most test-crazed state in the nation - shows the hazards of such
mania for high stakes testing.
First, I acknowledge that Texas has been cited recently by observers
from organizations as diverse as RAND, the American Federation of
Teachers and the Heritage Foundation as an example of successful
standards-based school reform: The Texas approach to high stakes testing
has been proposed as a model for other states and for federal
legislation, and its apparent results even described as a "miracle."
Having spent the last two years sifting through the relevant data, I
have become convinced that these claims are ill founded. Indeed, the
facts diverge so sharply from popular assumptions that the Texas
experience might be said to offer a model for how not to pursue
education reform - and how not to take claims of success based on test
scores alone at face value.
Second, a bit of background. In 1971, a federal court ruled that the
system of financing public schools in Texas unconstitutionally
discriminated against students living in poor districts. Although the
U.S. Supreme Court reversed the decision in 1973, the case helped spur
the Texas legislature to pass the Equal Educational Opportunity Act,
which established the first state-mandated testing program. This was the
Texas Assessment of Basic Skills (TABS), without sanctions for test
takers, which was used from 1980 to 1985. Following recommendations of
a Select Committee on Education (chaired by H. Ross Perot), the
legislature passed a comprehensive education reform law that established
a statewide curriculum, required students to achieve a score of 70 to
pass their high school courses, mandated a "no pass, no play" rule for
athletes, required teachers to pass a proficiency test, and mandated a
new basic skills testing program. The Texas Educational Assessment of
Minimum Skills (TEAMS) was implemented in 1985 and tested students in
all the odd-numbered grades from first to eleventh. High school
students were required to pass the TEAMS exit exam in order to receive a
high school diploma.
Five years later, TEAMS was replaced by yet another testing program, the
Texas Assessment of Academic Skills (TAAS), which consists of mostly
multiple-choice items together with one short essay on the writing exam.
Again, if students do not pass the exit level tests in reading,
writing, and math, they cannot graduate from high school, regardless of
their course grades. (The passing scores were set after review of group
performance on TAAS but without any reference to performance criteria
external to that test - in effect setting a norm-referenced standard for
passing.) Moreover, the reputation, funding, and even existence of
schools are contingent on their TAAS scores. By law, the State Board of
Education must rate the performance of schools and districts, and
performance data must be reported by ethnicity and socioeconomic status.

Four claims have been offered to suggest that test-driven school reform
has had a positive impact in Texas: (1) sharp increases in the overall
pass rates on TAAS during the 1990s, (2) apparent confirmation of TAAS
gains by results on the National Assessment of Educational Progress
(NAEP), (3) apparent decreases in the achievement gap between White and
minority students, and (4) fewer students dropping out of school.
On closer examination, however, there is reason to doubt all of these
claims - and, with respect to dropouts, to conclude that high-stakes
testing in Texas has actually made things considerably worse. Before
explaining why, I should point out that the policy of Texas (and a
growing number of other states) to deny diplomas to students on the
basis of a single test violates the recommendation of the National
Academy of Sciences and other professional organizations. The wisdom of
that recommendation is strongly supported by the limitations of TAAS
itself. My calculations suggest that the standard errors of measurement
for those tests may be between 20 and 40 percent higher than the estimates
offered by the state. (The fact that students are permitted to retake
the high school exit exam does not change the fact that everything rests
on the results of a single test. In any case, the data suggest that the
opportunity to take TAAS eight or more times may be largely theoretical:
students who fail the TAAS grade 10 test more than once or twice are
likely to drop out of school.)
The major pillar on which claims of a Texas miracle rest is that TAAS
pass rates increased substantially during the 1990s. The percentage of
grade 10 students who met or exceeded the (arbitrarily determined)
passing scores on tests of reading, writing, and math jumped from 52% in
1994 to 72% in 1998. But this means much less than meets the eye.
Here's why:
* Doubtful validity of TAAS: Texas data reveal that TAAS scores have a
significantly lower correlation with course grades than previous studies
have shown other tests to have. Also, the TAAS reading scores are more
highly correlated with TAAS math scores than they are with TAAS writing
scores. This casts doubt on the validity and reliability of the test.
* Changes in the test-taking population: A major portion of the
apparent rise in the proportion of tenth graders passing the TAAS can be
explained by the number of students who have been excluded from testing.
The numbers and percentages of students taking the grade 10 TAAS, but
classified as "in special education" (so their scores are excluded from
school accountability ratings) nearly doubled between 1994 and 1998.
Moreover, as I explain below, many other students have simply dropped
out of high school in Texas. As best as I can determine, a quarter to a
half of the increase in TAAS exit level pass rates can be explained
simply by looking at special education exemptions and drop-outs.
* Teaching to the test: Anyone familiar with recent educational
history must view with some skepticism any reports of increases in
performance that follow the introduction of a new testing program.
After a few years, students and teachers become familiar with its
content and format. With that familiarity - and explicit coaching -
invariably comes a rise in average test scores. Very simply, the
practice of teaching to the test, which is as rampant in Texas as
anywhere in the country, seriously compromises the validity of that
test. Rising scores are no assurance of greater learning.
* Narrowing and distorting the curriculum: Higher TAAS scores not only
become less meaningful as a result of widespread teaching to the test,
but may actually be cause for concern to the extent that they result
from slighting disciplines, topics, and projects not tested. Moreover,
as Linda McNeil of Rice University has documented, the most devastating
effects of high-stakes testing seem to be visited upon low-achieving and
minority students, some of whom now spend the better part of the school
year drilling on the content and format of the TAAS.
These findings indicate that most of the dramatic gains reported in TAAS
scores are simply not real. That conclusion is corroborated by Texas
secondary students' performance on the SAT, which has not improved since
the early 1990s, at least in comparison with national results.
(Incidentally, the relatively poor SAT scores in Texas cannot be
attributed to the proportion of high school students in that state who
choose to take the test.) Moreover, pass rates on Texas's "college
readiness" test (yes, despite having a high school graduation test, and
college admissions tests, Texas also has a college readiness test) fell
from 78% in 1993 to 32% in 1998.
Faced with questions about the meaning of apparent gains in TAAS
scores, some defenders of high-stakes testing have responded by pointing
to NAEP scores. Indeed, results from 1996 revealed significant gains in
the percentage of Texas students scoring at the proficient or advanced
level, at least in mathematics. Again, however, the reality isn't
nearly so simple.
For starters, the Texas NAEP gains are in the range of 0.12 to 0.33
standard deviation units, far less impressive than the gains on TAAS,
which are in the range of 0.43 to 0.72. The dramatic rise in TAAS
scores during the 1990s therefore is not confirmed by NAEP. In fact,
the improvement in NAEP scores that did take place simply brought Texas
students' achievement up to the point that they performed much like
students nationally - above the average on some measures, and below on
others. In the two subject areas in which state NAEP assessments were
conducted more than once during the 1990s, there is evidence of modest
progress by students in Texas, but it is much like the progress found
for students elsewhere.
Even this gain, however, may reflect changes in the population being
tested. Between 1992 and 1996, the percentages of students excluded
from NAEP testing in Texas increased from 8 to 11 percent at grade 4,
and from 7 to 8 percent at grade 8. This means that 20 to 25 percent
of the 1992 to 1996 NAEP gains for Texas may be due simply to the
increased rates of exclusion of limited English proficiency and special
needs students from those exams. That would leave a gain of 9 points
at grade 4 and 4.3 points at grade 8. The former is still considerably
above the national increase of 4 points at grade 4, but no longer
highest among the states.
But that's not all. It's also important to keep in mind that NAEP
results are inevitably confounded with differences in grade retention
across the states. Where failure and grade repetition are common,
students in grades 4 and 8 will be older than their counterparts in
states that are less likely to make use of retention. Thus, it is
probably no accident that the two states identified in 1997 as having
made unusual "progress" on NAEP math assessments, Texas and North
Carolina, have unusually high rates of holding students back before
grade 4. In short, as with TAAS results, part of the apparent gains on
NAEP math tests in Texas is an illusion arising from exclusion and retention.

The other major claim in support of a Texas miracle has to do with the
relative gains for minority students. Gaps in achievement between White
and nonwhite (specifically, Black and Hispanic) students appear to have
narrowed. In 1994, 29 percent of 10th-grade African-American students
and 35 percent of Hispanic students passed all three TAAS tests. Four
years later, the proportions were 55 and 59 percent, respectively. (For
Whites, the percentage passing jumped from 67 to 85 percent during the
same period.)
If we look at NAEP results, however, the trend is considerably less
impressive. In many comparisons, Black and Hispanic students show about
the same gain in NAEP scores as White students, but the 1998 NAEP
reading results suggest that while White grade 4 reading scores in Texas
have improved since 1992, those of Black and Hispanic students have not.
The gap actually increased.
Still more disturbing is the fact that much of the apparent narrowing
that does exist can be explained by the disproportionate retention of
students of color. Since 1990, Texas schools have been making grade 10
test scores look better by increasingly flunking students - especially
Black and Hispanic students - in ninth grade. The overall ninth-grade
retention rate in Texas is unusually high from a national perspective.
And by the end of the 1990s, 25-30 percent of Black and Hispanic
students, as compared with only 10 percent of White students, were forced to
repeat ninth grade in Texas.
Research has consistently demonstrated that students who are held back
a grade are far more likely to drop out. In Texas, there is a 72
percent chance that a White student will progress through twelve grades
without being retained. For Black and Hispanic students, the
probabilities are 46 and 44 percent, respectively. And indeed, tens of
thousands of students drop out of school after being held back in ninth
grade. Notwithstanding official Texas statistics claiming that the
state's dropout rate has declined, the gap between White and nonwhite
dropout rates widened substantially as the TAAS exit exam requirements
were being phased in.
For the classes of 1982 to 1990, the percentage of Black and Hispanic
students who progressed from grade 6 to graduation six years later
hovered around 65%. For Whites, the corresponding percentage started at
about 80% and gradually declined to about 75% in 1990. When the TAAS
testing was first implemented, the percentages fell dramatically, to 55%
for minorities and to about 68% for Whites. Between 1992 and 1996, the
corresponding percentages were 60% for minorities and 75% for Whites.
Only after Texas was forced by the GED Testing Service to raise its
passing standard for receipt of a so-called high school equivalency
diploma in 1997 did the percentages persisting from grade 6 to high
school graduation begin to creep back up, to 65% for minorities in the
class of 1999, and for White students to 78% in the same class.
Here's the bottom line: since the beginning of the TAAS high school
graduation test in 1991, 22-25% of White students and 35-40% of Black
and Hispanic students dropped out of school some time after sixth grade.
These numbers give the lie to claims that minorities are making
progress on the test. There are fewer of them around to take the test.
But test scores aside, a high dropout rate is a tragedy in its own right
and it is a tragedy that high-stakes testing in Texas seems to have
exacerbated.
As if all this isn't bad enough, there's another dropout problem in
Texas - although, as high-stakes testing spreads through the country, it
is hardly confined to that state. I am referring to the problem of
teachers leaving the profession as a direct result of what is happening
to their jobs in the name of accountability. In a survey of members of
the Texas State Reading Association, a staggering 85 percent of
respondents agreed with the statement that "the emphasis on TAAS is
forcing some of the best teachers to leave teaching because of the
restraints the tests place on decision making and the pressures placed
on them and their students." Anecdotal reports and other surveys of
Texas educators corroborate this finding.
The more familiar one becomes with the actual data - as opposed to the
press releases and political rhetoric - the more unsettling it is to see
Presidential candidates outbidding one another in their calls for more
high stakes testing. What the story behind the Texas "miracle" clearly
shows is that if the stakes are high enough, schools will find ways to
raise test scores, but at huge cost to real education and good teaching.


Walt Haney, professor of education at Boston College, served as an
expert witness in a lawsuit brought against the state of Texas by the
Mexican American Legal Defense and Education Fund. This article is
adapted from a much longer analysis published in the Education Policy
Analysis Archives (http://epaa.asu.edu/epaa/v8n41).

 
 