In RMT 22:1, Stenner, Burdick, and Stone (2008) distinguished between two different measurement models: reflective or latent variable models and formative or composite variable models (Edwards & Bagozzi, 2000). In the former, the causal action flows from the latent variable (e.g., temperature) to the indicators, whereas in the latter, the causal action flows from the indicators to the composite variable (e.g., socioeconomic status). We believe that the language we use should accentuate these differences. We therefore propose to call reflective models measurement models, to call what these models produce measures, and to call the process of producing these measures measuring. In parallel fashion, formative models will be called index models, what they produce will be called indices, and the process of producing indices will be called indexing. The notion of an index is well developed in economics and sociology and carries the connotations we desire. What follows is a discussion of how indexing and measuring differ and why it is important to make this distinction in the human sciences.
Indices are the effects of their indicators, whereas measures (of latent variables) are the causes of their indicators. For example, changes in stature or in consumer price behavior are caused by changes in height (or weight) and by price changes for market baskets of commodities (computers, milk, gasoline), respectively. Changes in latent variable measures, in contrast, cause a homogeneous (often nonlinear) change in indicator behavior, as when a temperature change causes thermometric fluid to expand in the thermometer or a change in reader ability causes a change in count correct on a reading test.
Altering the indicators of an index changes the definition of the variable being indexed, whereas changing the indicators for a measure will not alter the latent variable (although precision of measurement and/or unit size may be affected). So, if midline girth is added to height and weight as indicators of stature, or all electronic commodities are eliminated from the Consumer Price Index (CPI) market basket, the definition of what is being indexed changes.
In contrast, knowledge of expansion coefficients and viscosity differences allows us to swap new thermometric fluids for mercury without changing the construct being measured. Similarly, new reading items with different text and item types can be swapped for previous items without changing the construct being measured.
Another way to express this point is that the indicators for an index are constitutive of that index, whereas indicators for a latent variable are incidental to the construct's definition.
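The contrast between constitutive and incidental indicators can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standardized indicator values, the equal-weight composite, the item difficulties, and all function names are hypothetical, and the measurement half uses noise-free expected scores under a dichotomous Rasch model rather than real response data.

```python
import math

# --- Index (formative): the indicators constitute the variable -------------
# Hypothetical standardized indicator values for two people.
person_a = {"height": 1.0, "weight": 0.0, "girth": -1.0}
person_b = {"height": 0.0, "weight": 0.5, "girth": 1.0}

def composite(indicators, names):
    """Equal-weight sum over the chosen indicators (one possible index)."""
    return sum(indicators[n] for n in names)

two = ["height", "weight"]
three = ["height", "weight", "girth"]
# Does person A rank above person B before and after girth is added?
a_ranks_higher_before = composite(person_a, two) > composite(person_b, two)
a_ranks_higher_after = composite(person_a, three) > composite(person_b, three)

# --- Measure (reflective): the indicators are interchangeable --------------
def p_correct(ability, difficulty):
    """Dichotomous Rasch probability of success."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def expected_score(ability, difficulties):
    """Expected count correct on a set of items."""
    return sum(p_correct(ability, d) for d in difficulties)

def ability_from_score(score, difficulties, lo=-10.0, hi=10.0):
    """Bisection: find the ability whose expected score equals `score`."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

true_ability = 0.7                 # hypothetical person measure (logits)
form_1 = [-1.0, 0.0, 1.0]          # hypothetical item difficulties
form_2 = [-0.5, 0.5, 1.5, 2.0]     # a different, hypothetical item set

est_1 = ability_from_score(expected_score(true_ability, form_1), form_1)
est_2 = ability_from_score(expected_score(true_ability, form_2), form_2)

print(a_ranks_higher_before, a_ranks_higher_after)
print(round(est_1, 6), round(est_2, 6))
```

In this sketch, adding girth to the composite reverses the ordering of the two people, because the variable itself has been redefined. Swapping one item set for another returns the same ability, because the items only serve to detect a variable that is defined independently of them. With real, noisy responses the two forms would differ in precision, but they would still target the same measure.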
In a generally objective measurement framework (e.g., Lexiles), what is crucial in the definition of the construct is the specification equation, which identifies the cause of the variation detected by the instrument. Because the indicators of an index by design track different kinds of variation (height, weight, midline girth), it is difficult to imagine a specification equation that could, somehow, capture what these indicators share independent of the linear (or otherwise) combination that constitutes the index. What, for example, would a parallel form of Sheldon's somatotype rating scale look like? Difficulty in imagining what new indicators would constitute a parallel form is strongly suggestive of the need for an index rather than a measurement model.
Indices Misinterpreted as Latent Variables
Because both index and measurement models are fundamentally associational (i.e., based on correlations among indicators), traditional applications of Rasch model software often cannot distinguish between an index and a latent variable (Stenner, Burdick, & Stone, 2008). The resulting confusion predominantly takes one form: index variables are interpreted as if they were latent variables. Here is an example typical of many in the Rasch literature [and RMT, Ed.]:
The Rasch model has been shown to fit FIM data reasonably well, which indicates that the scale locations describe adequately the relative order in which these functions are lost in the aging population. The items on the top describe difficult activities, such as climbing stairs, whereas items on the bottom describe easier activities that are maintained relatively well. (Embretson, 2006, p. 52)
Contrary to a latent variable interpretation, the FIM (Functional Independence Measure) appears to be an index of motor functioning, with the causal action moving from indicators to index. If the desired medical outcome is "more functional independence," then rehabilitating bladder control, walking, bathing, and so on should promote the intended outcome rather than the other way around. Alternatively, we could teach the patient to drive a motorized wheelchair, but including this as an indicator would alter the definition of functional independence.
Global fit of data to a Rasch model will not sort out the direction of causal flow and thus will not provide unambiguous evidence for a latent variable interpretation of the construct. A substantive theory and associated specification equation capable of explaining variation in indicator difficulties is a big step in support of a latent variable interpretation. The coup de grace is a demonstration of the specification equation's causal status using experimental manipulation of instrument characteristics (radicals) and subsequent observation of the theoretically predicted change in the measurement outcome.
Correlation is not Causation
It is a property of indices (economic, sociological, or psychological) that the indicator composite may be found to correlate more highly with an unintended criterion than with the intended one. Such a discomforting outcome is yet another reason that a correlational (as opposed to a causal) view of validity is not sustainable.
Latent variable interpretations are most defensible when global fit of data to a Rasch model is accompanied by invariance of the indicator structure throughout the range of the construct. In the language of additive conjoint measurement (Luce & Tukey, 1964) and as realized in the Lexile Framework for reading (Kingdon in press), it should be possible to trade off a difference between reader abilities of 200L for a difference in text readability of 200L to hold comprehension rate (count correct/total items) constant (Burdick, Stone, & Stenner, 2006). This trade-off property has been shown to operate throughout the grade range from kindergarten to advanced adult reading (e.g., Supreme Court decisions) and would not be expected to hold for a reading index variable composed of items such as: (1) number of books in the home, (2) daily newspaper subscription, (3) English as a first language, etc.
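The trade-off property can be illustrated with a minimal sketch. It assumes a dichotomous Rasch-type response function in which the expected comprehension rate depends only on the difference between reader ability and text readability. The constant `LEXILES_PER_LOGIT` and the function name are hypothetical placeholders chosen for illustration, and any constant offset an operational framework might apply is omitted.

```python
import math

# Hypothetical scale factor converting Lexile differences to logits.
# This value is an assumption made for illustration only.
LEXILES_PER_LOGIT = 180.0

def comprehension_rate(reader_l: float, text_l: float) -> float:
    """Expected proportion correct under a Rasch-type model (a sketch).

    The rate depends only on the difference reader_l - text_l.
    """
    logit = (reader_l - text_l) / LEXILES_PER_LOGIT
    return 1.0 / (1.0 + math.exp(-logit))

# Baseline: reader and text at the same (hypothetical) location.
base = comprehension_rate(800.0, 800.0)

# Raise reader ability AND text readability by 200L: the rate is unchanged.
shifted = comprehension_rate(1000.0, 1000.0)

# Raise only the text by 200L: the rate drops.
harder_text_only = comprehension_rate(800.0, 1000.0)

print(base, shifted, harder_text_only)
```

Because only the difference enters the function, a 200L gain in reader ability exactly offsets a 200L gain in text difficulty at any point on the scale. No comparable offsetting relation is implied for a composite of indicators such as books in the home or newspaper subscriptions.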
It may be true that "where there is correlational smoke there is likely to be causational fire" (Holland, 1986, p. 951). Good fit with a Rasch model is correlational smoke, but as we have just seen, it takes an experimental test of a substantive theory to unambiguously distinguish between a latent variable and an index.
A. Jackson Stenner, Mark H. Stone, and Donald S. Burdick
Burdick, D. S., Stone, M. H., & Stenner, A. J. (2006). The Combined Gas Law and a Rasch Reading Law. Rasch Measurement Transactions, 20(2), 1059-1060. www.rasch.org/rmt/rmt202.pdf
Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological Methods, 5, 155-174.
Embretson, S. E. (2006). The continued search for nonarbitrary metrics in psychology. American Psychologist, 61(1), 50-55.
Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81, 945-960.
Luce, R. D., & Tukey, J. W. (1964). Simultaneous conjoint measurement: A new type of fundamental measurement. Journal of Mathematical Psychology, 1, 1-27.
Stenner, A. J., Burdick, D. S., & Stone, M. H. (2008). Formative and reflective models: Can a Rasch analysis tell the difference? Rasch Measurement Transactions, 22(1), 1152-1153. www.rasch.org/rmt/rmt221.pdf
Indexing vs. Measuring. A.J. Stenner, M.H. Stone, and D.S. Burdick. Rasch Measurement Transactions, 2009, 22:4, 1176-7
The URL of this page is www.rasch.org/rmt/rmt224b.htm