Sample Size Again

Wright B. D., Tennant A. (1996) Sample Size Again. Rasch Measurement Transactions 9:4 p. 468.

"I notice that in the chapter on latent trait theory in the book Health Measurement Scales by D.L. Streiner and G.R. Norman (1995, New York: Oxford University Press), they argue that 200 subjects are required for the one parameter (Rasch) model when deriving an item characteristic curve. People will challenge my assertion that 50 cases will do! How shall I respond?"
Alan Tennant
Rheumatology and Rehabilitation Research Unit
University of Leeds, United Kingdom

An empirical item characteristic curve (ICC) plots the relationship between person ability (often represented by raw score) on the X-axis and proportion of success on the item on the Y-axis. It has the shape of a jagged line from lower left to upper right (see Rasch, 1992, pp. 71, 95 for many examples). For stable inference, however, this empirical shape must be superseded by an ideal form with clear properties. If the only constraint on the ICC were that increasing ability implies greater probability of success, then any ogive would suffice, e.g., arc tangent or 2- or 3-parameter models. When particular mathematical properties are required, however, the ogive that provides them must be chosen. L. L. Thurstone conceptualized the tested sample as normally distributed and chose the cumulative normal ogive as his ICC.
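As a rough illustration of the empirical ICC described above, the following Python sketch groups persons by raw score and computes the proportion succeeding on one item in each group. The function name `empirical_icc` and the small response matrix are hypothetical, invented only for this example; they do not come from the original note.

```python
from collections import defaultdict


def empirical_icc(responses, item):
    """Proportion of success on `item` at each raw score.

    `responses` is a list of persons, each a list of 0/1 item scores.
    Returns a dict {raw_score: proportion succeeding on the item}.
    """
    successes = defaultdict(int)
    counts = defaultdict(int)
    for person in responses:
        raw = sum(person)            # raw score stands in for ability (X-axis)
        counts[raw] += 1
        successes[raw] += person[item]
    # Proportion of success in each raw-score group (Y-axis)
    return {raw: successes[raw] / counts[raw] for raw in sorted(counts)}


# Hypothetical data: 6 persons by 4 items, for illustration only
responses = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
]

icc = empirical_icc(responses, item=1)
for raw, proportion in icc.items():
    print(raw, proportion)
```

With so few persons in each raw-score group, the plotted proportions jump about from one score to the next, which is why the empirical curve looks jagged.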

Georg Rasch escaped from the awkward constraint that the sample be normally distributed by focussing on the requirement that the item parameters be separable from the person parameters. This leads to a logistic ogive for the ICC. Each item is now represented by one parameter which measures its difficulty relative to the other items. The logistic ICC is derived mathematically and its shape determined without reference to any data. In most cases, however, data is required to estimate each item's "one parameter" of difficulty. With a reasonably targeted sample of 50 persons, there is 99% confidence that the estimated item difficulty is within ±1 logit of its stable value. This is close enough for most practical purposes, especially when persons take 10 or more items. With 200 persons, there is 99% confidence that the estimated value is within ±0.5 logits (see RMT 7:4 p. 328). But for pilot studies, 30 persons are enough to see what's happening (see Best Test Design). Even if you plan to test 200, start the analysis as soon as the first data become available: 200 incorrect administrations are never as good as 50 correct ones.
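The sample-size figures can be checked approximately with a short Python sketch. It uses the standard model result that the standard error of a Rasch item difficulty is the reciprocal square root of the summed binomial variances p(1 - p) over persons. Two simplifications are assumptions made here, not taken from the note: every person is placed at the same distance (`offset`, in logits) from the item, and person abilities are treated as known. The function names are hypothetical, and this is not claimed to be the calculation used in RMT 7:4.

```python
import math

Z99 = 2.576  # two-sided 99% normal deviate


def rasch_icc(ability, difficulty):
    """Logistic ICC: probability of success under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_se(abilities, difficulty):
    """Model standard error of an item difficulty estimate, in logits.

    Statistical information is the sum over persons of p * (1 - p).
    """
    info = 0.0
    for ability in abilities:
        p = rasch_icc(ability, difficulty)
        info += p * (1.0 - p)
    return 1.0 / math.sqrt(info)


def half_width_99(n_persons, offset=0.0):
    """99% confidence half-width for the item difficulty, in logits.

    Assumption: all n_persons sit `offset` logits away from the item.
    """
    return Z99 * item_se([offset] * n_persons, 0.0)


for n in (30, 50, 200):
    on_target = half_width_99(n, offset=0.0)
    off_target = half_width_99(n, offset=1.5)
    print(f"N={n}: +/-{on_target:.2f} (on target), +/-{off_target:.2f} (1.5 logits off)")
```

Under these assumptions, 50 persons give a 99% half-width somewhat below 1 logit and 200 persons give one below 0.5 logits, whether the sample is exactly on target or 1.5 logits off. Quadrupling the sample halves the interval, since the standard error shrinks with the square root of the sample size.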



The URL of this page is www.rasch.org/rmt/rmt94h.htm
