Stimulating Excellence in Education

Comments on: Stimulating Excellence: Unleashing the Power of Innovation in Education

May 2009, The Center for American Progress et al.

The report focuses on creating the conditions for entrepreneurial innovation and reward in education. It deplores the lack of a quality improvement culture in education, and the general failure to recognize the vital importance of measuring performance for active management. It makes specific recommendations aimed at drastic improvements in information quality. Excellent so far! But the report would be far more powerful and persuasive if it capitalized on two very significant features of the current situation.

First, only on page 34, in the report's penultimate paragraph, do the authors briefly touch on what all educators know is absolutely the most important thing to understand about teaching and learning: it always starts from where the student is, growing out of what is already known. This is doubly important in the context of the report's focus, which is itself a matter of teaching and learning: how to institute a new culture of power metrics and innovation. To try to institute fundamental changes, with little or no concern for what is already in place, is a sure recipe for failure.

Second, there is one feature of the educational system as it currently exists that will be of particular value as we strive to improve the quality of the available information. That feature concerns tests and measurement. Many of the report's recommendations would be quite different if its authors had integrated their entrepreneurial focus with the technical capacities of state-of-the-art educational measurement.

The obvious recommendation with which to start concerns the reason why public education in the United States is such a fragmented system: outcome standards and product definitions are expressed (almost) entirely in terms of locally determined content and expert opinion. Local content, standards, and opinions are essential, but to be meaningful, comparable, practical, and scientific they have to be brought into a common scale of comparison.

The technology for creating such scales is widely available. For over 40 years, commercial testing agencies, state departments of education, school districts, licensure and certification boards, and academic researchers have been developing and implementing stable metrics that transcend the local particulars of specific tests. The authors of the "Stimulating Excellence" report are right to stress the central importance of comparable measures in creating an entrepreneurial environment in education, but they did not do enough to identify existing measurement capabilities and how they could help create that environment.

For instance, all three of the recommendations made at the bottom of page 12 and top of page 13 address capabilities that are already in place in various states and districts around the country. The examples that come easiest to mind involve the Lexile Framework for Reading and Writing, and the Quantile Framework for Mathematics, developed by MetaMetrics, Inc., of Durham, NC.

The Lexile metric for reading ability and text readability unifies all major reading tests in a common scale, and is used to report measures for over 28 million students in all 50 states. Hundreds of publishers routinely obtain Lexile values for their texts, with over 115,000 books and 80 million articles (most available electronically) Lexiled to date.

Furthermore, though one would never know from reading the "Stimulating Excellence" report, materials on the MetaMetrics web site show that the report's three recommendations concerning the maximization of data utility have already been recognized and acted on.

That said, a larger issue concerns the need to create standards that remain invariant across local specifics. A national curriculum and national testing standards seem likely to fall into the trap of either dictating specific content or fostering continued fragmentation when states refuse to accept that content. But in the same way that computer-adaptive testing creates a unique examination for each examinee "without compromising comparability," so, too, must we invest resources in devising a national system of educational standards that both takes advantage of existing technical capabilities and sets the stage for improved educational outcomes.
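To make concrete how examinees who answer different questions can still be placed on one scale, here is a minimal sketch. It assumes a Rasch model, which this commentary does not itself name, and a bank of items whose difficulties have already been calibrated in a shared unit (logits). The function names and the two short test forms are hypothetical illustrations, not any testing agency's actual procedure.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct answer under the (assumed) Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def estimate_ability(responses, difficulties, tol=1e-8, max_iter=100):
    """Maximum-likelihood ability estimate, in logits.

    responses    -- list of 1 (correct) / 0 (incorrect), one per item
    difficulties -- calibrated difficulties of the items actually taken
    """
    if len(responses) != len(difficulties):
        raise ValueError("need exactly one difficulty per response")
    raw_score = sum(responses)
    if raw_score == 0 or raw_score == len(responses):
        raise ValueError("all-wrong or all-right scores have no finite estimate")

    theta = 0.0
    for _ in range(max_iter):
        probs = [rasch_probability(theta, b) for b in difficulties]
        expected_score = sum(probs)
        information = sum(p * (1.0 - p) for p in probs)
        # Newton-Raphson step, capped at one logit to avoid overshooting.
        step = (raw_score - expected_score) / information
        step = max(-1.0, min(1.0, step))
        theta += step
        if abs(step) < tol:
            return theta
    raise RuntimeError("ability estimate did not converge")


# Two hypothetical forms drawn from the same calibrated item bank.
easy_form = [-1.0, 1.0]
hard_form = [1.0, 3.0]

# The same raw score (one of two correct) on each form.
easy_measure = estimate_ability([1, 0], easy_form)
hard_measure = estimate_ability([1, 0], hard_form)
print(round(easy_measure, 3), round(hard_measure, 3))
```

Under these assumptions, one correct answer out of two yields a measure of about 0 logits on the easier form and about 2 logits on the harder one. The raw scores are identical, but the measures differ because each is expressed in the item bank's common unit rather than in terms of the particular questions asked.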

That is what the report's key recommendation ought to have been. An approximation of it comes on page 35, with the suggestion that now is the time for investment in what is referred to as "backbone platforms" like the Internet. Much more ought to have been said about this, and it should have been integrated with the previous recommendations, such as those concerning information quality and power metrics. For instance, on page 27, a recommendation is made to "build on the open-source concept." Upon reading that, my immediate thought was that the authors were going to make an analogy with adaptively administered item banks, not literally recommend actual software implementation processes.

But they took the literal road and missed the analogical boat. That is, we ought to build on the open-source concept by creating what might be called crowd-sourced "wikitests": exams that teachers and researchers everywhere can add to and draw from, with the qualification that the items work in practice to measure what they are supposed to measure, according to agreed-upon data quality and construct validity standards. This process would integrate local content standards with global construct standards in a universally uniform metric not much different from the reference standard units of comparison we take for granted in measuring time, temperature, distance, electrical current, or weight. Michael K. Smith suggests a practical approach to achieving these objectives in "Why not a national test for everyone?", Phi Delta Kappan, 91, 4, Feb. 2010, 54-58.

And this is where the real value of the "backbone platform" concept comes in. The Internet, like phones and faxes before it, and like alphabetic, phonetic and grammatical standards before them, provides the structure of common reference standards essential to communication and commerce. What we are evolving toward is a new level of complexity in the way we create the common unities of meaning through which we achieve varying degrees of mutual understanding and community.

In addition, measurement plays a fundamental role in the economy as the primary means of determining the relation of price to value. The never-ending spiral of increasing costs in education is surely deeply rooted in the lack of performance metrics and an improvement culture. We ought to take the global infrastructure of measurement standards as a model for what we need as a "backbone platform" in education. We ought to take the metaphor of transparency and the need for "clear metrics" much more literally. We really do need instruments that we can look right through, that bring the thing we want to see into focus, without having to be primarily concerned with which particular instrument it is we are using.

Decades of research in educational measurement show that these instruments can be constructed. A great deal still needs to be done, and the challenges are huge, but taking them on will enable us to expand the domains in which we insist on fair dealing, and in which the balance scale applies as a symbol of justice.

When the entrepreneurial vision presented in the "Stimulating Excellence" report is situated in a context better informed by what educators are already doing and what they already know, the stage will be set for a new culture of performance improvement in education, a culture that explicitly articulates, tests, and acts on its educational values. At that point, we can expect great things!

William P. Fisher, Jr.

Fisher W.P. Jr. (2009) Comments on: Stimulating Excellence: Unleashing the Power of Innovation in Education. Rasch Measurement Transactions, 23:3, 1222-1223
