Common Sense for Measurement


Length and weight may be "real" experiences. But we construct their units of measure. "Inches" and "Ounces" are quite unnatural. They are our own creations. Extremely useful, but entirely imaginary constructions.

A variable is an amount of something. We picture a variable as a distance.

From Less ----------------------------> To More

We arrange to observe evidence of that "something". But its measurement line and its units of measurement are up to us to construct and to maintain. The variable and its evidence could be:

benchmarks exceeded
symptoms missing
problems solved
tasks completed
assertions condoned

We provoke occurrences of evidence. Then count how many we observe to occur. But these counts are not measures. They are merely "Raw Counts","Raw Scores"!


To be observed the pieces of evidence must be concrete. But this reality keeps them uneven in size. To measure we need an even abstraction. We need a line marked out in abstractly equal units.

Pieces of evidence are always unstable. They appear and disappear by accident. They are merely probable signs of the variable for which they are designed to evoke manifestations. To measure, we must invent a practical way to connect whatever pieces of evidence we arrange to observe with a simple idea about the probabilities of their occurrences. These probabilities, in turn, must be specified as governed by the measures we want to construct.


Distance (Length) was undoubtedly our First Variable. Squirrels are good at it. Counting steps, arms, hands and fingers was our First Method. The trouble with Counting is the obviously unequal units which are counted.

How many apples fill a basket? How many oranges squeeze a glass? How do we trade apples and oranges? We don't count them. We weigh them!

Weighing is a constructed abstraction. Equal "feet" are abstracted from real feet. Equal "ounces" are abstracted from real stones. There are no "real" equal units. We have to invent them. We construct our instrumentations of lengths and weights to mark out units sufficiently equal to serve our practical purposes.

Four Troubles with Raw Scores

These meditations reveal four troubles with the primitive counting traditionally mistaken for measures in social, educational and health research.

1. Unequal size events , miscounted as equal.

2. Unequal size categories , miscounted as equal.

3. Missing responses, miscounted as "failures".

4. Incoherent responses , miscounted as valid.


The non-linearity of counts is not the only problem. We study evidence collected in order to learn from past experience how to navigate our future. But the future is certainly uncertain. How do we handle that uncertainty?

Imagine two baseball players. Smith bats 400. Jones bats 200. At "Batter Up", who's sure to hit? No way to know! Even Smith is only 4 out of 10.

Inverse Probability

Ask instead, "Who's hit is more likely?" The obvious answer is, "Smith". But, "Why?" Smith's odds are 2/3. Jones's odds are only 1/4. That makes Smith's odds for a hit (2/3)/(1/4)= 8/3 times better than Jones'.

That's how we always handle uncertainty. We learn from the past what to expect from the future. We use past evidence to estimate future probabilities. To do this we construct a reproducible transition from actually counting concrete events (right answers, symptoms, agreements, categories) to imagining abstract units of equal size. To handle uncertainty we apply the stochastic insights of: Bernoulli,Bayes, Laplace, Poisson. We interpret observation X=0,1 as implying a probability Px governed by the variable on which we want to measure.

Conjoint Additivity

To be useful, a variable must be expressed in equal units. Otherwise we cannot do arithmetic with its measures. To construct linearity we apply the additivity insights of: Campbell, Thurstone, Rasch, Luce & Tukey and require the Px implied by X=0,1 to satisfy the equation:

loge(Pni1/Pni0) = Bn - Di

Pni1 is the probability of person n responding Xni=1 to item i, and similarly Pni0. Bn is the ability of person n. Di is the difficulty of item i. This solution is called "Stochastic Conjoint Additivity".

When Xni=0,1,2,3,,,M occurs in ordered increments, like counts, the additive measurement model generalizes to:

loge(Pnix/Pni[x-1]) = Bn - Di - Fix

Fix is the linear difficulty of step x on item i.

When Xnijk = 0,M is the result of Rater j rating the performance of Person n on Task k at Category x of Item i, the measurement model becomes:

loge[Pnijkx/Pnijk[x-1]) = Bn - Di - Cj - Ak - Fix

Three More Raw Score Troubles

This reveals three more troubles with raw counts. The measurement increments implied by one more raw count depend on test targeting and item difficulty distribution.

5. Floors and Ceilings

At the floor and ceiling extremes of "None" and "All" counts of one more imply infinite increments! Whenever tests and questionnaires drift off-target, these intrinsic floors and ceilings destroy the utility of their raw scores.

6. Item Clumping

When uneven item calibrations produce item clumps, one more within a clump implies a smaller increment than one more between clumps. This means that whenever item difficulties are unevenly distributed, raw score implications are distorted by item clumping.

7. Misprecision

We always need estimates of measure precision. For this purpose, raw counts are perverse. Raw counts are most precise in predicting future raw counts at their finite extremes of "None" and "All". They are least precise in their middle at "Half Right". But this contradicts what we see when we move within a clump of items close together in difficulty. Within a clump, each "One More" marks off a small and hence precisely implied step on the measurement scale. Leaping from clump to clump, however, implies larger and hence less precisely implied steps. Acknowledging the infinities implied by "All Right" or "All Wrong" exposes their total lack of precision.

Inversely, the standard errors of measurement estimated by our measurement models are minimal in the middle, as they must be when we are right on target, and infinite at the extremes, as they must be, when we are completely off target.


A long as primitive counts and raw scores are routinely mistaken for measures by our colleagues in Social, Educational and Health research, there is no hope of their professional activities ever developing into a reliable or useful science. We owe it to them, and to ourselves, to teach them how to construct measures which work as well as the ubiquitous physical measures by which they manage their everyday living, so that they can do a better job in making sense out of the profusions of data which they collect so enthusiastically.

It is our job to teach them how to use the measurement models of stochastic conjoint additivity to transform their inevitably raw and concrete ordinal observations into clearly specified, arithmetically useful, reproducible linear measures.

Benjamin D. Wright

The construction of a measurement system does not stop with the initial work of determining that the variable of interest is in fact quantitative, and that the instrument in hand validly and reliably measures it. For our measures to generalize, we must become metrologists, linking different brands of instruments together into national and international systems that create and maintain common currencies for the exchange of quantitative value.
William P. Fisher. Jr.,

Common Sense for Measurement Wright, B.D. … Rasch Measurement Transactions, 1999, 13:3 p. 704

Please help with Standard Dataset 4: Andrich Rating Scale Model

Rasch Publications
Rasch Measurement Transactions (free, online) Rasch Measurement research papers (free, online) Probabilistic Models for Some Intelligence and Attainment Tests, Georg Rasch Applying the Rasch Model 3rd. Ed., Bond & Fox Best Test Design, Wright & Stone
Rating Scale Analysis, Wright & Masters Introduction to Rasch Measurement, E. Smith & R. Smith Introduction to Many-Facet Rasch Measurement, Thomas Eckes Invariant Measurement: Using Rasch Models in the Social, Behavioral, and Health Sciences, George Engelhard, Jr. Statistical Analyses for Language Testers, Rita Green
Rasch Models: Foundations, Recent Developments, and Applications, Fischer & Molenaar Journal of Applied Measurement Rasch models for measurement, David Andrich Constructing Measures, Mark Wilson Rasch Analysis in the Human Sciences, Boone, Stave, Yale
in Spanish: Análisis de Rasch para todos, Agustín Tristán Mediciones, Posicionamientos y Diagnósticos Competitivos, Juan Ramón Oreja Rodríguez

To be emailed about new material on
please enter your email address here:

I want to Subscribe: & click below
I want to Unsubscribe: & click below

Please set your SPAM filter to accept emails from welcomes your comments:

Your email address (if you want us to reply):


ForumRasch Measurement Forum to discuss any Rasch-related topic

Go to Top of Page
Go to index of all Rasch Measurement Transactions
AERA members: Join the Rasch Measurement SIG and receive the printed version of RMT
Some back issues of RMT are available as bound volumes
Subscribe to Journal of Applied Measurement

Go to Institute for Objective Measurement Home Page. The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website,

Coming Rasch-related Events
July 31 - Aug. 3, 2017, Mon.-Thurs. Joint IMEKO TC1-TC7-TC13 Symposium 2017: Measurement Science challenges in Natural and Social Sciences, Rio de Janeiro, Brazil,
Aug. 7-9, 2017, Mon-Wed. In-person workshop and research coloquium: Effect size of family and school indexes in writing competence using TERCE data (C. Pardo, A. Atorressi, Winsteps), Bariloche Argentina. Carlos Pardo, Universidad Catòlica de Colombia
Aug. 7-9, 2017, Mon-Wed. PROMS 2017: Pacific Rim Objective Measurement Symposium, Sabah, Borneo, Malaysia,
Aug. 10, 2017, Thurs. In-person Winsteps Training Workshop (M. Linacre, Winsteps), Sydney, Australia.
Aug. 11 - Sept. 8, 2017, Fri.-Fri. On-line workshop: Many-Facet Rasch Measurement (E. Smith, Facets),
Aug. 18-21, 2017, Fri.-Mon. IACAT 2017: International Association for Computerized Adaptive Testing, Niigata, Japan,
Sept. 15-16, 2017, Fri.-Sat. IOMC 2017: International Outcome Measurement Conference, Chicago,
Oct. 13 - Nov. 10, 2017, Fri.-Fri. On-line workshop: Practical Rasch Measurement - Core Topics (E. Smith, Winsteps),
Oct. 25-27, 2017, Wed.-Fri. In-person workshop: Applying the Rasch Model hands-on introductory workshop, Melbourne, Australia (T. Bond, B&FSteps), Announcement
Jan. 5 - Feb. 2, 2018, Fri.-Fri. On-line workshop: Practical Rasch Measurement - Core Topics (E. Smith, Winsteps),
Jan. 10-16, 2018, Wed.-Tues. In-person workshop: Advanced Course in Rasch Measurement Theory and the application of RUMM2030, Perth, Australia (D. Andrich), Announcement
Jan. 17-19, 2018, Wed.-Fri. Rasch Conference: Seventh International Conference on Probabilistic Models for Measurement, Matilda Bay Club, Perth, Australia, Website
April 13-17, 2018, Fri.-Tues. AERA, New York, NY,
May 25 - June 22, 2018, Fri.-Fri. On-line workshop: Practical Rasch Measurement - Core Topics (E. Smith, Winsteps),
June 29 - July 27, 2018, Fri.-Fri. On-line workshop: Practical Rasch Measurement - Further Topics (E. Smith, Winsteps),
Aug. 10 - Sept. 7, 2018, Fri.-Fri. On-line workshop: Many-Facet Rasch Measurement (E. Smith, Facets),
Oct. 12 - Nov. 9, 2018, Fri.-Fri. On-line workshop: Practical Rasch Measurement - Core Topics (E. Smith, Winsteps),
The HTML to add "Coming Rasch-related Events" to your webpage is:
<script type="text/javascript" src=""></script>


The URL of this page is