# Common Sense for Measurement

Observations

Length and weight may be "real" experiences. But we construct their units of measure. "Inches" and "Ounces" are quite unnatural. They are our own creations. Extremely useful, but entirely imaginary constructions.

A variable is an amount of something. We picture a variable as a distance.

From Less ----------------------------> To More

We arrange to observe evidence of that "something". But its measurement line and its units of measurement are up to us to construct and to maintain. The variable and its evidence could be:

| Variable | Evidence |
| --- | --- |
| length | benchmarks exceeded |
| health | symptoms missing |
| ability, skill | problems solved |
| attitude | assertions condoned |

We provoke occurrences of evidence. Then we count how many we observe to occur. But these counts are not measures. They are merely "Raw Counts", "Raw Scores"!

Evidence

To be observed the pieces of evidence must be concrete. But this reality keeps them uneven in size. To measure we need an even abstraction. We need a line marked out in abstractly equal units.

Pieces of evidence are always unstable. They appear and disappear by accident. They are merely probable signs of the variable for which they are designed to evoke manifestations. To measure, we must invent a practical way to connect whatever pieces of evidence we arrange to observe with a simple idea about the probabilities of their occurrences. These probabilities, in turn, must be specified as governed by the measures we want to construct.

Measurement

Distance (Length) was undoubtedly our First Variable. Squirrels are good at it. Counting steps, arms, hands and fingers was our First Method. The trouble with Counting is the obviously unequal units which are counted.

How many apples fill a basket? How many oranges squeeze a glass? How do we trade apples and oranges? We don't count them. We weigh them!

Weighing is a constructed abstraction. Equal "feet" are abstracted from real feet. Equal "ounces" are abstracted from real stones. There are no "real" equal units. We have to invent them. We construct our instrumentations of lengths and weights to mark out units sufficiently equal to serve our practical purposes.

Four Troubles with Raw Scores

These meditations reveal four troubles with the primitive counting traditionally mistaken for measures in social, educational and health research.

1. Unequal-size events, miscounted as equal.

2. Unequal-size categories, miscounted as equal.

3. Missing responses, miscounted as "failures".

4. Incoherent responses, miscounted as valid.

Uncertainty

The non-linearity of counts is not the only problem. We study evidence collected in order to learn from past experience how to navigate our future. But the future is certainly uncertain. How do we handle that uncertainty?

Imagine two baseball players. Smith bats .400. Jones bats .200. At "Batter Up", who's sure to hit? No way to know! Even Smith hits only 4 times out of 10.

Inverse Probability

Ask instead, "Whose hit is more likely?" The obvious answer is, "Smith". But, "Why?" Smith's odds of a hit are 0.4/0.6 = 2/3. Jones's odds are only 0.2/0.8 = 1/4. That makes Smith's odds for a hit (2/3)/(1/4) = 8/3 times better than Jones's.
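The arithmetic above can be checked with a short Python sketch. It assumes that a batting average is read as hits per 1000 at-bats; the function name `odds` is illustrative only.

```python
from fractions import Fraction

def odds(hits_per_thousand: int) -> Fraction:
    """Convert a batting average (hits per 1000 at-bats) into odds of a hit."""
    p = Fraction(hits_per_thousand, 1000)
    return p / (1 - p)

smith_odds = odds(400)                 # 0.4 / 0.6
jones_odds = odds(200)                 # 0.2 / 0.8
odds_ratio = smith_odds / jones_odds   # how many times better Smith's odds are

print(smith_odds, jones_odds, odds_ratio)  # prints: 2/3 1/4 8/3
```

Exact fractions are used so that the result matches the hand calculation without rounding.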

That's how we always handle uncertainty. We learn from the past what to expect from the future. We use past evidence to estimate future probabilities. To do this we construct a reproducible transition from actually counting concrete events (right answers, symptoms, agreements, categories) to imagining abstract units of equal size. To handle uncertainty we apply the stochastic insights of Bernoulli, Bayes, Laplace and Poisson. We interpret observation X=0,1 as implying a probability Px governed by the variable on which we want to measure.

To be useful, a variable must be expressed in equal units. Otherwise we cannot do arithmetic with its measures. To construct linearity we apply the additivity insights of: Campbell, Thurstone, Rasch, Luce & Tukey and require the Px implied by X=0,1 to satisfy the equation:

$$\log_e\left(\frac{P_{ni1}}{P_{ni0}}\right) = B_n - D_i$$

Pni1 is the probability of person n responding Xni=1 to item i, and similarly Pni0. Bn is the ability of person n. Di is the difficulty of item i. This solution is called "Stochastic Conjoint Additivity".
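A minimal Python sketch of this dichotomous model follows. Solving the log-odds equation for the probability of success gives the logistic form used below. The function names and the example ability and difficulty values are assumptions chosen for illustration.

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """P(X=1) implied by log(P1/P0) = ability - difficulty."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def log_odds(p: float) -> float:
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

# Additivity check: the log-odds gap between two persons (abilities 1.5 and 0.5)
# stays the same whichever item difficulty is used to compare them.
for difficulty in (-1.0, 0.0, 2.0):
    gap = (log_odds(rasch_probability(1.5, difficulty))
           - log_odds(rasch_probability(0.5, difficulty)))
    print(difficulty, round(gap, 6))
```

The gap equals the difference in abilities for every item, which is the sense in which persons and items combine additively on one line.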

When Xni = 0, 1, 2, 3, ..., M occurs in ordered increments, like counts, the additive measurement model generalizes to:

$$\log_e\left(\frac{P_{nix}}{P_{ni(x-1)}}\right) = B_n - D_i - F_{ix}$$

Fix is the linear difficulty of step x on item i.
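The category probabilities implied by this equation can be sketched in Python. Chaining the adjacent-category log-odds from category 0 upward gives each category a weight, and the weights are then normalized to sum to 1. The function name and the example step values are hypothetical.

```python
import math

def category_probabilities(ability, difficulty, steps):
    """Return [P_0, ..., P_M] where steps[x-1] is the step difficulty F_x.

    Built from log(P_x / P_(x-1)) = ability - difficulty - F_x.
    """
    cumulative = [0.0]  # log-weight of category 0 is fixed at 0
    for step in steps:
        cumulative.append(cumulative[-1] + (ability - difficulty - step))
    peak = max(cumulative)  # subtract the maximum to avoid overflow in exp
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

probs = category_probabilities(ability=0.5, difficulty=0.0, steps=[-1.0, 0.0, 1.0])
for x in range(1, len(probs)):
    # Each adjacent log-odds recovers ability - difficulty - F_x.
    print(x, round(math.log(probs[x] / probs[x - 1]), 6))
```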

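When Xnijk = 0, 1, ..., M is the result of Rater j rating the performance of Person n on Task k at Category x of Item i, the measurement model becomes:

$$\log_e\left(\frac{P_{nijkx}}{P_{nijk(x-1)}}\right) = B_n - D_i - C_j - A_k - F_{ix}$$

Cj is the severity of Rater j. Ak is the difficulty of Task k.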
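As a rough illustration of how the extra facets act, the Python sketch below computes the expected rating under this model and compares a lenient rater with a severe one. All numerical values and the function name are assumptions made for the example.

```python
import math

def expected_rating(ability, item, rater, task, steps):
    """Expected category under log(Px/P[x-1]) = B - D - C - A - F_x."""
    location = ability - item - rater - task
    cumulative = [0.0]  # log-weight of category 0
    for step in steps:
        cumulative.append(cumulative[-1] + location - step)
    peak = max(cumulative)  # guard against overflow in exp
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return sum(x * w / total for x, w in enumerate(weights))

steps = [-1.0, 0.0, 1.0]  # hypothetical step difficulties for a 0-3 scale
lenient = expected_rating(1.0, 0.0, rater=-0.5, task=0.0, steps=steps)
severe = expected_rating(1.0, 0.0, rater=0.5, task=0.0, steps=steps)
print(round(lenient, 3), round(severe, 3))
```

The same person on the same item and task is expected to receive a lower rating from the more severe rater, which is why rater severity has to be modelled before ratings can be compared.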

Three More Raw Score Troubles

This reveals three more troubles with raw counts. The measurement increments implied by one more raw count depend on test targeting and item difficulty distribution.

5. Floors and Ceilings

At the floor and ceiling extremes of "None" and "All", counts of one more imply infinite increments! Whenever tests and questionnaires drift off-target, these intrinsic floors and ceilings destroy the utility of their raw scores.
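The effect can be seen in a simplified Python sketch. It assumes a test of 10 items that all share the same difficulty, set to 0, so that the measure implied by a raw score r is log(r / (10 - r)). Real tests have varied difficulties, but the behaviour at the extremes is the same.

```python
import math

def measure_from_score(raw_score: int, n_items: int) -> float:
    """Logit measure implied by a raw score on equally difficult items."""
    if raw_score <= 0:
        return -math.inf   # "None": no finite measure
    if raw_score >= n_items:
        return math.inf    # "All": no finite measure
    return math.log(raw_score / (n_items - raw_score))

n_items = 10
measures = [measure_from_score(r, n_items) for r in range(n_items + 1)]
increments = [b - a for a, b in zip(measures, measures[1:])]

for r, step in enumerate(increments):
    print(f"{r} -> {r + 1}: {step:.3f}")
```

One more right answer is worth about 0.4 logits in the middle of the test, about 0.8 logits near either end, and an infinite amount at the floor and ceiling.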

6. Item Clumping

When uneven item calibrations produce item clumps, one more within a clump implies a smaller increment than one more between clumps. This means that whenever item difficulties are unevenly distributed, raw score implications are distorted by item clumping.
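A small Python sketch can make the clumping effect concrete. It assumes a hypothetical test with five items clumped at -3 logits and five at +3 logits, and finds by bisection the measure whose expected score equals each raw score. The bisection search is one convenient way to invert the score-to-measure relation, not a method taken from the article.

```python
import math

def expected_score(ability, difficulties):
    """Sum of success probabilities over all items."""
    return sum(1.0 / (1.0 + math.exp(-(ability - d))) for d in difficulties)

def measure_for_score(raw_score, difficulties, lo=-20.0, hi=20.0):
    """Bisection for the ability whose expected score equals raw_score.

    Valid for non-extreme scores only (0 < raw_score < number of items).
    """
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < raw_score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

difficulties = [-3.0] * 5 + [3.0] * 5   # two hypothetical clumps
measures = {r: measure_for_score(r, difficulties) for r in range(1, 10)}

within_clump = measures[3] - measures[2]    # one more inside the lower clump
between_clumps = measures[5] - measures[4]  # one more while crossing the gap
print(round(within_clump, 3), round(between_clumps, 3))
```

With these assumed difficulties, one more right answer while crossing the gap between clumps is worth roughly twice as many logits as one more inside a clump.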

7. Misprecision

We always need estimates of measure precision. For this purpose, raw counts are perverse. Raw counts are most precise in predicting future raw counts at their finite extremes of "None" and "All". They are least precise in their middle at "Half Right". But this contradicts what we see when we move within a clump of items close together in difficulty. Within a clump, each "One More" marks off a small and hence precisely implied step on the measurement scale. Leaping from clump to clump, however, implies larger and hence less precisely implied steps. Acknowledging the infinities implied by "All Right" or "All Wrong" exposes their total lack of precision.

Inversely, the standard errors of measurement estimated by our measurement models are minimal in the middle, as they must be when we are right on target, and infinite at the extremes, as they must be, when we are completely off target.
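The contrast can be sketched in Python under two simplifying assumptions: the test has 10 equally difficult items, and the spread of a future raw count is taken as the binomial standard deviation at the observed proportion. The measure's standard error is taken as one over the square root of the summed item information P(1 - P).

```python
import math

def raw_score_sd(raw_score: int, n_items: int) -> float:
    """Binomial standard deviation of a future raw count (assumed model)."""
    p = raw_score / n_items
    return math.sqrt(n_items * p * (1.0 - p))

def measure_se(raw_score: int, n_items: int) -> float:
    """Standard error of the logit measure for equally difficult items."""
    information = raw_score * (n_items - raw_score) / n_items  # sum of P(1-P)
    return math.inf if information == 0 else 1.0 / math.sqrt(information)

n_items = 10
for r in range(n_items + 1):
    print(r, round(raw_score_sd(r, n_items), 3), round(measure_se(r, n_items), 3))
```

The raw-count spread is zero at "None" and "All" and largest at "Half Right", while the standard error of the measure is smallest at "Half Right" and infinite at the extremes.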

Moral

As long as primitive counts and raw scores are routinely mistaken for measures by our colleagues in Social, Educational and Health research, there is no hope of their professional activities ever developing into a reliable or useful science. We owe it to them, and to ourselves, to teach them how to construct measures which work as well as the ubiquitous physical measures by which they manage their everyday living, so that they can do a better job in making sense out of the profusions of data which they collect so enthusiastically.

It is our job to teach them how to use the measurement models of stochastic conjoint additivity to transform their inevitably raw and concrete ordinal observations into clearly specified, arithmetically useful, reproducible linear measures.

Benjamin D. Wright

The construction of a measurement system does not stop with the initial work of determining that the variable of interest is in fact quantitative, and that the instrument in hand validly and reliably measures it. For our measures to generalize, we must become metrologists, linking different brands of instruments together into national and international systems that create and maintain common currencies for the exchange of quantitative value.
William P. Fisher, Jr., wfishe@lsumc.edu

Common Sense for Measurement Wright, B.D. … Rasch Measurement Transactions, 1999, 13:3 p. 704
