## S. S. Stevens Revisited

Stevens' Classification of Scales (after Stevens, 1959, p. 25, 27)

| Scale | Operation | Examples | Location | Dispersion | Association | Test |
|---|---|---|---|---|---|---|
| Nominal | Equality | Numbering of players | Mode | | | Chi-square |
| Ordinal | Greater or less | Hardness of minerals; Street numbers; Raw scores | Median | Percentiles | Rank-order correlation | Sign test; Run test |
| Interval | Distance | Temperature: Celsius; Position, Time; Standard scores(?) | Arithmetic mean | Standard deviation | Product-moment correlation | t-test; F-test |
| Ratio | Ratio | Numerosity (Counts); Length, Density; Position, Time; Temperature: Kelvin; Loudness: sones; Brightness: brils | Geometric mean; Harmonic mean | Percent variation | | |

Stevens' work on measurement, so often quoted, is seldom read. His statement, "measurement is the assignment of numerals to events or objects according to rule" (Stevens, 1959, p. 25), is used to support all kinds of mathematical abuse. Stevens, however, tried to be precise about what kind of arithmetic was valid with what kind of numbers (see Table).

### Nominal scales

"Whether a process of classification underlying the nominal scale constitutes measurement is one of those semantic issues that depend on taste... I prefer to call it a form of measurement" (p. 25). Unfortunately, this preference has been confused with the everyday restriction of the term "measurement" to numerically linear operations. Nevertheless, Stevens did perceive the difference between mere numerical operations and meaningful statistics: "When we compute the mean of the numerals assigned to a team of football players, are we trying to say something about the players, or only about the numerals? The only 'meaningful' statistic here would be N, the number of players assigned a numeral" (p. 29).

### Test scores

Stevens' understanding of the ordinal status of test scores is clear. He categorizes raw scores as ordinal, and, since the "(?)" in the Table is his, he is not convinced that even standardized scores are interval. Further, he says: "When operations are available to determine only rank order, it is of questionable propriety to compute means and standard deviations... If we want to interpret the result of averaging a set of data as an arithmetic mean in the usual sense, we need to begin with more than an ordinal assignment of numerals" (p. 29).
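A small numerical sketch may make the concern about averaging rank-order data concrete. It is not taken from Stevens; the scores and the square-root relabelling are invented for illustration. If scores carry only rank-order information, any order-preserving relabelling of them is equally legitimate, yet a comparison of means can reverse under such a relabelling while a comparison of medians cannot.

```python
import math
from statistics import mean, median

# Hypothetical raw scores for two groups; the numbers are invented for illustration.
group_a = [1, 2, 9]
group_b = [3, 4, 4]

def relabel(scores):
    """Apply an order-preserving relabelling (square root) to every score."""
    return [math.sqrt(s) for s in scores]

# Which group looks "higher" on each statistic, before and after relabelling?
mean_favours_a_before = mean(group_a) > mean(group_b)
mean_favours_a_after = mean(relabel(group_a)) > mean(relabel(group_b))
median_favours_a_before = median(group_a) > median(group_b)
median_favours_a_after = median(relabel(group_a)) > median(relabel(group_b))

print("mean comparison stable:  ", mean_favours_a_before == mean_favours_a_after)
print("median comparison stable:", median_favours_a_before == median_favours_a_after)
```

With these particular numbers the conclusion drawn from the means depends on which ordinal labelling happens to be in use, whereas the conclusion drawn from the medians does not.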

### Ratio scales

The distinctive feature of a ratio scale is that it has an origin defined by a dominating substantive theory (p. 25). Thus, Time, measured from the "Big Bang", is on a ratio scale, and so is Length when measured from the location of that same event. Length, in yards or meters, and Time, in days or years, are on interval scales. Since Stevens regards his classification as a hierarchy, he lists Length with the super-ordinate ratio scale. His distinction between Length as ratio and Time as interval dissolves.

When obtaining the location of a set of ratio numbers, their ratio relationship with the scale's particular origin must be maintained. This is done by using the geometric mean, not the arithmetic mean, as the "average" for the ratios (p. 27). To obtain the "average" of a set of ratio scale numbers, the logarithm of each number is calculated. The arithmetic mean of these logarithms is computed and then exponentiated to yield the geometric mean of the ratio numbers. To obtain the usual interval-level summaries (means and standard deviations) of numbers whose ratio scale structure is essential, those numbers must be converted first to an interval scale by taking logarithms. Since any interval scale can be transformed to ratio form by choosing an origin and exponentiating, the distinction between these scales has no mathematical importance [though it is important for communicating the meaning of the numbers].

### Counts

"The numerosity of collections of objects [counts] ... belongs to the class I have called ratio scales" (p. 20). Accordingly, in situations where it is important to maintain the notion that a count of 0 means "none at all", rather than "none extra", and a count of 1 means "only one object", rather than "one more to go with those we already have", then ratio scale arithmetic applies. To apply the usual interval statistics to such counts requires the logarithms of the counts to be obtained, but the objects cannot vary in size. If we are counting real objects, like right answers or apples, then the counts can have no more than ordinal status because of the variation in object size.
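The log-average-exponentiate procedure described for ratio-scale numbers, and the logarithmic conversion mentioned here for counts, can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name `geometric_mean_via_logs` and the example counts are assumptions. The function rejects non-positive values because zero has no logarithm, so a count of 0 has to be handled separately.

```python
import math
from statistics import mean, stdev

def geometric_mean_via_logs(values):
    """Take logs, average them arithmetically, then exponentiate the result."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("logarithms require a non-empty set of positive numbers")
    log_values = [math.log(v) for v in values]
    return math.exp(mean(log_values))

# Hypothetical counts, chosen so that the ratio structure is easy to see.
counts = [1, 10, 100]

ratio_average = geometric_mean_via_logs(counts)   # location on the ratio scale
interval_average = mean(counts)                   # ordinary arithmetic mean

# Interval-level summaries are computed on the logarithms, not on the raw counts.
log_counts = [math.log(c) for c in counts]
log_mean = mean(log_counts)
log_sd = stdev(log_counts)

print(ratio_average, interval_average, log_mean, log_sd)
```

For these counts each value is ten times the previous one, so the geometric mean falls on the middle value, while the arithmetic mean is pulled toward the largest count.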

### Campbell, Stevens and Rasch

Stevens formulated his classification and discussion as a rejoinder to the British committee who, in 1932, investigated the possibility of "quantitative estimates of sensory events" (p. 22). Physicist Norman Campbell's verdict was: "Why do not psychologists accept the natural and obvious conclusion that subjective measurements of loudness in numerical terms (like those of length) are mutually inconsistent and cannot be the basis of measurement?" Stevens' comment on this is: "Why, he might have asked, does the psychologist not give up and go quietly off to limbo?" (p. 23)

Stevens set out to rescue psychological measurement by changing the problem from that of inventing operations (the physical view) to that of classifying scales (a mathematical view, p. 23). As Warren Torgerson (1958 p.18) noted, Stevens' and other similar approaches are concerned with different methods for the systematic classification of various limited sets of [concrete] objects, rather than methods of measurement of [an abstract] property. Stevens' solution does not produce linear measures, but merely classifies the numbers already in use. He concludes that brightness and loudness are on ratio scales (linearized by taking logarithms, p. 40). He also discovers that methods such as "just noticeable differences", rating scale categories and paired comparisons produce only ordinal scales (p. 46). Stevens' enormous contribution was his successful argument that there are different kinds of scales, kinds defined in terms of their degree of resemblance to the real number line. The weakness of his writing was its apparent implication that the nature of a scale is somehow defined by the investigator (Cliff 1992 p. 186).

Rasch surmounts both Campbell's insistence on physical operations and Stevens' substitution of classification for measurement construction. Rasch measurement deliberately engages in producing from intangible qualitative observations the most meaningful (and common) form of measurement, namely that on an interval scale readily analyzed by linear statistics. Since it is almost impossible to think quantitatively without linearity, Rasch refuses to be appeased with less useful numerical assignments.

Benjamin D. Wright

Cliff, N. (1992). Abstract measurement theory and the revolution that never happened. Psychological Science, 3(3), 186-190.

Stevens, S. S. (1959). Measurement, psychophysics and utility. Chap. 2 in C. W. Churchman & P. Ratoosh (Eds.), Measurement: Definitions and Theories. New York: John Wiley.

Torgerson, W. S. (1958). Theory and methods of scaling. New York: John Wiley.

Later note: S. S. Stevens confused scale-types by combining additive and multiplicative relationships under one term, "ratio". It would have been clearer if he had called all additive scales "interval", and all multiplicative scales "ratio", and then noted that numbering systems can have both properties, but that both sets of properties cannot be maintained simultaneously. For instance, the average of the interval properties of a set of numbers (the arithmetic mean) is not the same as the average of the ratio properties of a set of numbers (the geometric mean). He could then have pointed out that there are other similar properties possible for scales of numbers, e.g., the "harmonic" properties whose average is the harmonic mean, and the "variance" properties, whose average is the "root mean square". But that would have spoiled his "hierarchy of scales" story...
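One way to see how these averages relate is to treat each as an arithmetic mean taken after a transformation and then transformed back. The framing below (a "transform, average, back-transform" helper named `transformed_mean`) is an illustrative assumption, not notation from Stevens or from this note, and the three values are arbitrary.

```python
import math

def transformed_mean(values, forward, inverse):
    """Average the transformed values arithmetically, then undo the transformation."""
    transformed = [forward(v) for v in values]
    return inverse(sum(transformed) / len(transformed))

# Arbitrary positive numbers, for illustration only.
values = [1.0, 2.0, 4.0]

# "Interval" properties: no transformation at all.
arithmetic = transformed_mean(values, lambda v: v, lambda m: m)
# "Ratio" properties: logarithm and exponential.
geometric = transformed_mean(values, math.log, math.exp)
# "Harmonic" properties: reciprocal in both directions.
harmonic = transformed_mean(values, lambda v: 1.0 / v, lambda m: 1.0 / m)
# "Variance" properties: square and square root.
root_mean_square = transformed_mean(values, lambda v: v * v, math.sqrt)

print(harmonic, geometric, arithmetic, root_mean_square)
```

The four results differ for the same set of numbers, which is the point of the note: the additive and multiplicative properties of a numbering system cannot both be summarized by a single "average".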

Stevens Revisited. Wright B. D. … Rasch Measurement Transactions, 1997, 11:1 p. 552-3.
