S. S. Stevens Revisited

Stevens' Classification of Scales (after Stevens, 1959, pp. 25, 27)
Scale	Operation	Examples	Location	Dispersion	Association	Test
Nominal	Equality	Numbering of players	Mode			Chi-square
Ordinal	Greater or less	Hardness of minerals; Street numbers; Raw scores	Median	Percentiles	Rank-order correlation	Sign test; Run test
Interval	Distance	Temperature: Celsius; Position, Time; Standard scores(?)	Arithmetic mean	Standard deviation	Product-moment correlation	t-test; F-test
Ratio	Ratio	Numerosity (Counts); Length, Density; Position, Time; Temperature: Kelvin; Loudness: sones; Brightness: brils	Geometric mean; Harmonic mean	Percent variation

Stevens' work on measurement, so often quoted, is seldom read. His statement, "measurement is the assignment of numerals to events or objects according to rule" (Stevens, 1959, p. 25), is used to support all kinds of mathematical abuse. Stevens, however, tried to be precise about what kind of arithmetic was valid with what kind of numbers (see Table).

Nominal scales

"Whether a process of classification underlying the nominal scale constitutes measurement is one of those semantic issues that depend on taste... I prefer to call it a form of measurement" (p. 25). Unfortunately this preference has been confused with the everyday restriction of the term "measurement" to numerically linear operations. Nevertheless, Stevens did perceive the difference between mere numerical operations and meaningful statistics: "When we compute the mean of the numerals assigned to a team of football players, are we trying to say something about the players, or only about the numerals? The only 'meaningful' statistic here would be N, the number of players assigned a numeral." (p. 29)

Test scores

Stevens' understanding of the ordinal status of test scores is clear. He categorizes raw scores as ordinal, and, since the "(?)" in the Table is his, he is not convinced that even standardized scores are interval. Further, he says: "When operations are available to determine only rank order, it is of questionable propriety to compute means and standard deviations... If we want to interpret the result of averaging a set of data as an arithmetic mean in the usual sense, we need to begin with more than an ordinal assignment of numerals." (p. 29)
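Why means are of "questionable propriety" for ordinal numbers can be seen in a small sketch. The scores below are invented for illustration, and the square-root rescoring is only an assumed example of an alternative numbering that preserves rank order. Under such a rescoring the comparison of medians survives, but the comparison of arithmetic means can reverse.

```python
import math
import statistics

# Hypothetical raw scores for two groups (illustrative numbers only).
group_a = [1, 1, 10]
group_b = [3, 4, 4]

def rescore(scores):
    """An order-preserving (monotone) renumbering: the square root.

    Any strictly increasing function would serve; sqrt is an assumption
    chosen only to make the example concrete.
    """
    return [math.sqrt(s) for s in scores]

mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
median_a, median_b = statistics.median(group_a), statistics.median(group_b)

mean_a_rescored = statistics.mean(rescore(group_a))
mean_b_rescored = statistics.mean(rescore(group_b))
median_a_rescored = statistics.median(rescore(group_a))
median_b_rescored = statistics.median(rescore(group_b))

# On the raw numbers group A has the higher mean; after rescoring it has the lower.
print("means, raw:       ", mean_a > mean_b)
print("means, rescored:  ", mean_a_rescored > mean_b_rescored)

# The medians tell the same story under both numberings.
print("medians, raw:     ", median_a > median_b)
print("medians, rescored:", median_a_rescored > median_b_rescored)
```

Since both numberings carry exactly the same rank-order information, a conclusion that changes between them says something about the numerals, not about the persons scored.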

Ratio scales

The distinctive feature of a ratio scale is that it has an origin defined by a dominating substantive theory (p. 25). Thus, Time, measured from the "Big Bang", is on a ratio scale, and so is Length when measured from the location of that same event. Length, in yards or meters, and Time, in days or years, are on interval scales. Since Stevens regards his classification as a hierarchy, he lists Length with the super-ordinate ratio scale. His distinction between Length as ratio and Time as interval dissolves.

When obtaining the location of a set of ratio numbers, their ratio relationship with the scale's particular origin must be maintained. This is done by using the geometric mean, not the arithmetic mean, as the "average" for the ratios (p. 27). To obtain the "average" of a set of ratio scale numbers, the logarithm of each number is calculated. The arithmetic mean of these logarithms is computed and then exponentiated to yield the geometric mean of the ratio numbers. To obtain the usual interval-level summaries (means and standard deviations) of numbers whose ratio scale structure is essential, those numbers must be converted first to an interval scale by taking logarithms. Since any interval scale can be transformed to ratio form by choosing an origin and exponentiating, the distinction between these scales has no mathematical importance [though it is important for communicating the meaning of the numbers].
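The log-average-exponentiate procedure just described can be sketched as follows. The readings are invented for illustration, and the function name is hypothetical; this is a minimal sketch, not a prescribed implementation.

```python
import math
import statistics

def geometric_mean_via_logs(values):
    """Take logs, average them arithmetically, then exponentiate."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("ratio-scale values must be non-empty and positive")
    logs = [math.log(v) for v in values]
    return math.exp(sum(logs) / len(logs))

# Hypothetical ratio-scale readings (illustrative numbers only).
readings = [1.0, 10.0, 100.0]

geometric = geometric_mean_via_logs(readings)   # about 10
arithmetic = statistics.mean(readings)          # 37

# Interval-level summaries are computed on the logarithms, not the raw numbers.
log_readings = [math.log10(v) for v in readings]
log_mean = statistics.mean(log_readings)
log_sd = statistics.stdev(log_readings)

# Changing the unit (multiplying every reading by a constant) multiplies
# the geometric mean by the same constant, so ratios to the origin are kept.
rescaled = geometric_mean_via_logs([3.0 * v for v in readings])

print(geometric, arithmetic, log_mean, log_sd, rescaled)
```

The mean of the logarithms, exponentiated, returns the geometric mean; the standard deviation of the logarithms describes spread as a multiplicative factor rather than as an additive distance.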

Counts

"The numerosity of collections of objects [counts] ... belongs to the class I have called ratio scales" (p. 20). Accordingly, in situations where it is important to maintain the notion that a count of 0 means "none at all", rather than "none extra", and a count of 1 means "only one object", rather than "one more to go with those we already have", ratio scale arithmetic applies. To apply the usual interval statistics to such counts requires the logarithms of the counts to be obtained, but the objects cannot vary in size. If we are counting real objects, like right answers or apples, then the counts can have no more than ordinal status because of the variation in object size.

Campbell, Stevens and Rasch

Stevens formulated his classification and discussion as a rejoinder to the British committee who, in 1932, investigated the possibility of "quantitative estimates of sensory events" (p. 22). Physicist Norman Campbell's verdict was: "Why do not psychologists accept the natural and obvious conclusion that subjective measurements of loudness in numerical terms (like those of length) are mutually inconsistent and cannot be the basis of measurement?" Stevens' comment on this is: "Why, he might have asked, does the psychologist not give up and go quietly off to limbo?" (p. 23)

Stevens set out to rescue psychological measurement by changing the problem from that of inventing operations (the physical view) to that of classifying scales (a mathematical view, p. 23). As Warren Torgerson (1958, p. 18) noted, Stevens' and other similar approaches are concerned with different methods for the systematic classification of various limited sets of [concrete] objects, rather than methods of measurement of [an abstract] property. Stevens' solution does not produce linear measures, but merely classifies the numbers already in use. He concludes that brightness and loudness are on ratio scales (linearized by taking logarithms, p. 40). He also discovers that methods such as "just noticeable differences", rating scale categories and paired comparisons produce only ordinal scales (p. 46). Stevens' enormous contribution was his successful argument that there are different kinds of scales, kinds defined in terms of their degree of resemblance to the real number line. The weakness of his writing was its apparent implication that the nature of a scale is somehow defined by the investigator (Cliff, 1992, p. 186).

Rasch surmounts both Campbell's insistence on physical operations and Stevens' substitution of classification for measurement construction. Rasch measurement deliberately engages in producing from intangible qualitative observations the most meaningful (and common) form of measurement, namely that on an interval scale readily analyzed by linear statistics. Since it is almost impossible to think quantitatively without linearity, Rasch refuses to be appeased with less useful numerical assignments.

Cliff N. (1992) Abstract measurement theory and the revolution that never happened. Psychological Science 3(3), pp. 186-190.

Stevens S. S. (1959) Measurement, Psychophysics and Utility, Chap. 2, in C. W. Churchman & P. Ratoosh (Eds.), Measurement: Definitions and Theories. New York: John Wiley.

Later note: S. S. Stevens confused scale-types by combining additive and multiplicative relationships under one term, "ratio". It would have been clearer if he had called all additive scales "interval", and all multiplicative scales "ratio", and then noted that numbering systems can have both properties, but that both sets of properties cannot be maintained simultaneously. For instance, the average of the interval properties of a set of numbers (the arithmetic mean) is not the same as the average of the ratio properties of a set of numbers (the geometric mean). He could then have pointed out that there are other similar properties possible for scales of numbers, e.g., the "harmonic" properties whose average is the harmonic mean, and the "variance" properties, whose average is the "root mean square". But that would have spoiled his "hierarchy of scales" story ....
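One way to see these averages as members of a single family is to treat each as "transform, take the arithmetic mean, transform back". The sketch below assumes that reading of the note. The helper name and the numbers are hypothetical, chosen only for illustration.

```python
import math

def transformed_mean(values, forward, inverse):
    """Average the values on a transformed scale, then map the result back."""
    transformed = [forward(v) for v in values]
    return inverse(sum(transformed) / len(transformed))

# Hypothetical positive numbers (illustrative only).
values = [1.0, 2.0, 4.0]

# Interval (additive) properties: no transformation.
arithmetic = transformed_mean(values, lambda v: v, lambda m: m)

# Ratio (multiplicative) properties: logarithm and exponential.
geometric = transformed_mean(values, math.log, math.exp)

# "Harmonic" properties: reciprocal, which is its own inverse.
harmonic = transformed_mean(values, lambda v: 1.0 / v, lambda m: 1.0 / m)

# "Variance" properties: square and square root.
root_mean_square = transformed_mean(values, lambda v: v * v, math.sqrt)

print(harmonic, geometric, arithmetic, root_mean_square)
```

For these numbers the four averages all differ, which illustrates the point that one set of numbers cannot maintain all of these properties at once.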

Stevens Revisited. Wright B. D. … Rasch Measurement Transactions, 1997, 11:1 p. 552-3.
