Length and weight may be "real" experiences. But we construct their units of measure. "Inches" and "Ounces" are quite unnatural. They are our own creations. Extremely useful, but entirely imaginary constructions.
A variable is an amount of something. We picture a variable as a distance.
We arrange to observe evidence of that "something". But its measurement line and its units of measurement are up to us to construct and to maintain. The variable and its evidence could be:
Variable | Evidence |
---|---|
length | benchmarks exceeded |
health | symptoms missing |
ability | problems solved |
skill | tasks completed |
attitude | assertions condoned |
We provoke occurrences of evidence. Then we count how many we observe. But these counts are not measures. They are merely "Raw Counts", "Raw Scores"!
To be observed the pieces of evidence must be concrete. But this reality keeps them uneven in size. To measure we need an even abstraction. We need a line marked out in abstractly equal units.
Pieces of evidence are always unstable. They appear and disappear by accident. They are merely probable signs of the variable for which they are designed to evoke manifestations. To measure, we must invent a practical way to connect whatever pieces of evidence we arrange to observe with a simple idea about the probabilities of their occurrences. These probabilities, in turn, must be specified as governed by the measures we want to construct.
Distance (Length) was undoubtedly our First Variable. Squirrels are good at it. Counting steps, arms, hands and fingers was our First Method. The trouble with Counting is the obviously unequal units which are counted.
How many apples fill a basket? How many oranges squeeze a glass? How do we trade apples and oranges? We don't count them. We weigh them!
Weighing is a constructed abstraction. Equal "feet" are abstracted from real feet. Equal "ounces" are abstracted from real stones. There are no "real" equal units. We have to invent them. We construct our instrumentations of lengths and weights to mark out units sufficiently equal to serve our practical purposes.
These meditations reveal four troubles with the primitive counting traditionally mistaken for measures in social, educational and health research.
1. Unequal size events, miscounted as equal.
2. Unequal size categories, miscounted as equal.
3. Missing responses, miscounted as "failures".
4. Incoherent responses, miscounted as valid.
The non-linearity of counts is not the only problem. We study evidence collected in order to learn from past experience how to navigate our future. But the future is certainly uncertain. How do we handle that uncertainty?
Imagine two baseball players. Smith bats .400. Jones bats .200. At "Batter Up", who's sure to hit? No way to know! Even Smith is only 4 out of 10.
Ask instead, "Whose hit is more likely?" The obvious answer is, "Smith". But, "Why?" Smith's odds are 4 hits to 6 misses, or 2/3. Jones's odds are only 2 hits to 8 misses, or 1/4. That makes Smith's odds for a hit (2/3)/(1/4) = 8/3 times better than Jones's.
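The odds arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `odds` is introduced here, and exact fractions are used so that the 2/3, 1/4 and 8/3 in the text appear without rounding.

```python
from fractions import Fraction

def odds(p):
    """Convert a probability of success into odds of success, p / (1 - p)."""
    return p / (1 - p)

# Batting averages from the text, treated as probabilities of a hit.
smith = odds(Fraction(400, 1000))
jones = odds(Fraction(200, 1000))

# How many times better Smith's odds are than Jones's.
odds_ratio = smith / jones

print(smith, jones, odds_ratio)  # 2/3 1/4 8/3
```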
That's how we always handle uncertainty. We learn from the past what to expect from the future. We use past evidence to estimate future probabilities. To do this we construct a reproducible transition from actually counting concrete events (right answers, symptoms, agreements, categories) to imagining abstract units of equal size. To handle uncertainty we apply the stochastic insights of: Bernoulli, Bayes, Laplace, Poisson. We interpret observation X=0,1 as implying a probability P_{x} governed by the variable on which we want to measure.
To be useful, a variable must be expressed in equal units. Otherwise we cannot do arithmetic with its measures. To construct linearity we apply the additivity insights of: Campbell, Thurstone, Rasch, Luce & Tukey and require the P_{x} implied by X=0,1 to satisfy the equation:

$$\log\left(\frac{P_{ni1}}{P_{ni0}}\right) = B_{n} - D_{i}$$
P_{ni1} is the probability of person n responding X_{ni}=1 to item i, and similarly P_{ni0}. B_{n} is the ability of person n. D_{i} is the difficulty of item i. This solution is called "Stochastic Conjoint Additivity".
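A small sketch may make the additivity concrete. It assumes the usual dichotomous Rasch form, in which the log-odds of X=1 over X=0 equal B_{n} - D_{i}. The function names and the person and item values are hypothetical, chosen only for illustration.

```python
import math

def p_success(ability, difficulty):
    """Probability of X=1 when log(P1/P0) = ability - difficulty (in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def log_odds(p):
    """Log-odds (logit) of a probability."""
    return math.log(p / (1.0 - p))

# Hypothetical measures in logits, for illustration only.
abilities = {"person_a": 1.5, "person_b": -0.5}
difficulties = {"easy_item": -1.0, "hard_item": 2.0}

# The log-odds gap between the two persons is the same on every item:
# it equals the difference in their abilities, whatever the item difficulty.
gaps = {
    item: log_odds(p_success(abilities["person_a"], d))
          - log_odds(p_success(abilities["person_b"], d))
    for item, d in difficulties.items()
}
print(gaps)
```

The comparison of persons does not depend on which item is used to make it. That is what lets the resulting measures be added and subtracted.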
When X_{ni} = 0, 1, 2, 3, ..., M occurs in ordered increments, like counts, the additive measurement model generalizes to:

$$\log\left(\frac{P_{nix}}{P_{ni(x-1)}}\right) = B_{n} - D_{i} - F_{ix}$$
F_{ix} is the linear difficulty of step x on item i.
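As a hedged illustration, the category probabilities can be computed under the assumption that the log-odds of category x over category x-1 equal B_{n} - D_{i} - F_{ix}. The function name `category_probabilities` and the numeric values are hypothetical.

```python
import math

def category_probabilities(ability, difficulty, steps):
    """Return [P_0, ..., P_M] for one person on one item.

    steps holds the step difficulties F_1..F_M in logits. Adjacent categories
    satisfy log(P_x / P_(x-1)) = ability - difficulty - F_x.
    """
    # Cumulative sums of (ability - difficulty - F_k); category 0 is fixed at 0.
    cumulative = [0.0]
    for f in steps:
        cumulative.append(cumulative[-1] + (ability - difficulty - f))
    # Subtract the maximum before exponentiating, for numerical stability.
    top = max(cumulative)
    weights = [math.exp(c - top) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical four-category item (M = 3) with evenly spaced steps.
steps = [-1.0, 0.0, 1.0]
probs = category_probabilities(ability=1.0, difficulty=0.0, steps=steps)
print(probs)
```

With a single step of difficulty zero, this reduces to the dichotomous case.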
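When X_{nijk} = 0, ..., M is the result of Rater j rating the performance of Person n on Task k at Category x of Item i, the measurement model becomes:

$$\log\left(\frac{P_{nijkx}}{P_{nijk(x-1)}}\right) = B_{n} - D_{i} - C_{j} - A_{k} - F_{ix}$$

Here C_{j} denotes the severity of rater j and A_{k} the difficulty of task k.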
This reveals three more troubles with raw counts. The measurement increments implied by one more raw count depend on test targeting and item difficulty distribution.
5. Floors and Ceilings
At the floor and ceiling extremes of "None" and "All", counts of one more imply infinite increments! Whenever tests and questionnaires drift off-target, these intrinsic floors and ceilings destroy the utility of their raw scores.
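A minimal sketch of the floor and ceiling effect, under a simplifying assumption that is not in the text: every item has the same difficulty, set to zero logits. In that special case the measure implied by a raw score r on L items is log(r / (L - r)). The function name and the 10-item test are hypothetical.

```python
import math

def measure_from_score(r, n_items):
    """Logit measure implied by raw score r on n_items equally difficult items."""
    if r <= 0:
        return -math.inf  # "None" right: the measure is unbounded below
    if r >= n_items:
        return math.inf   # "All" right: the measure is unbounded above
    return math.log(r / (n_items - r))

n_items = 10  # hypothetical test length
measures = [measure_from_score(r, n_items) for r in range(n_items + 1)]

# Size of the step on the measurement line implied by "one more" raw count.
increments = [measures[r + 1] - measures[r] for r in range(n_items)]
for r, step in enumerate(increments):
    print(f"{r} -> {r + 1}: {step:.3f}")
```

The steps are smallest near "Half Right", grow toward either extreme, and become infinite for the first and the last count.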
6. Item Clumping
When uneven item calibrations produce item clumps, one more within a clump implies a smaller increment than one more between clumps. This means that whenever item difficulties are unevenly distributed, raw score implications are distorted by item clumping.
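The clumping effect can be sketched numerically. The example assumes the dichotomous model, a hypothetical 10-item test with five items at -3 logits and five at +3 logits, and a simple bisection search for the measure at which the expected score equals the raw score. None of these specifics come from the text.

```python
import math

def expected_score(measure, difficulties):
    """Expected raw score at a given measure under the dichotomous model."""
    return sum(1.0 / (1.0 + math.exp(d - measure)) for d in difficulties)

def measure_for_score(r, difficulties, lo=-20.0, hi=20.0):
    """Find by bisection the measure whose expected score equals raw score r.

    Only meaningful for non-extreme scores, 0 < r < number of items.
    """
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < r:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical calibrations: one clump of easy items, one clump of hard items.
difficulties = [-3.0] * 5 + [3.0] * 5

measures = {r: measure_for_score(r, difficulties) for r in range(1, 10)}
steps = {r: measures[r + 1] - measures[r] for r in range(1, 9)}

print("one more inside the easy clump (2 -> 3):", round(steps[2], 2))
print("one more across the gap (4 -> 5):", round(steps[4], 2))
```

Under these assumptions the step across the empty middle of the test is about twice the step inside a clump, although both are "one more" raw count.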
7. Misprecision
We always need estimates of measure precision. For this purpose, raw counts are perverse. Raw counts are most precise in predicting future raw counts at their finite extremes of "None" and "All". They are least precise in their middle at "Half Right". But this contradicts what we see when we move within a clump of items close together in difficulty. Within a clump, each "One More" marks off a small and hence precisely implied step on the measurement scale. Leaping from clump to clump, however, implies larger and hence less precisely implied steps. Acknowledging the infinities implied by "All Right" or "All Wrong" exposes their total lack of precision.
Inversely, the standard errors of measurement estimated by our measurement models are minimal in the middle, as they must be when we are right on target, and infinite at the extremes, as they must be, when we are completely off target.
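The contrast between the two kinds of precision can be sketched as follows. The example again assumes L equally difficult items, so that the modelled success probability at raw score r is r/L. It uses the binomial standard deviation for raw scores and the reciprocal square root of the summed P(1-P) for the standard error of the measure. The function names are hypothetical.

```python
import math

def raw_score_sd(r, n_items):
    """Binomial standard deviation of the raw score when P = r / n_items."""
    p = r / n_items
    return math.sqrt(n_items * p * (1.0 - p))

def measure_se(r, n_items):
    """Standard error of the logit measure: 1 / sqrt(sum of P * (1 - P))."""
    information = r * (n_items - r) / n_items  # equals n_items * p * (1 - p)
    return math.inf if information == 0 else 1.0 / math.sqrt(information)

n_items = 10  # hypothetical test length
for r in range(n_items + 1):
    print(r, round(raw_score_sd(r, n_items), 3), measure_se(r, n_items))
```

The raw-score deviation is zero at "None" and "All" and largest at "Half Right". The standard error of the measure runs the other way: smallest in the middle and infinite at the extremes.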
As long as primitive counts and raw scores are routinely mistaken for measures by our colleagues in Social, Educational and Health research, there is no hope of their professional activities ever developing into a reliable or useful science. We owe it to them, and to ourselves, to teach them how to construct measures which work as well as the ubiquitous physical measures by which they manage their everyday living, so that they can do a better job in making sense out of the profusions of data which they collect so enthusiastically.
It is our job to teach them how to use the measurement models of stochastic conjoint additivity to transform their inevitably raw and concrete ordinal observations into clearly specified, arithmetically useful, reproducible linear measures.
Benjamin D. Wright
The construction of a measurement system does not stop with the initial work of determining that the variable of interest is in fact quantitative, and that the instrument in hand validly and reliably measures it. For our measures to generalize, we must become metrologists, linking different brands of instruments together into national and international systems that create and maintain common currencies for the exchange of quantitative value.
William P. Fisher, Jr., wfishe@lsumc.edu
Common Sense for Measurement Wright, B.D. … Rasch Measurement Transactions, 1999, 13:3 p. 704
The URL of this page is www.rasch.org/rmt/rmt133h.htm