Estimating Rasch Measures for Extreme Scores

Extreme scores (zero and perfect scores) imply extreme, but indefinitely located, measures. Indefinite measures are awkward to report and difficult to use in further analyses, such as computing means and standard deviations. What can be done to give these measures definite values? Here are several approaches. They are all based on the Bayesian idea that we would not have administered the test to the person, or included the item on the test, unless we thought that the person or item was relevant. Consequently, an extreme score implies a measure only slightly out of the measurement range of the test, not a measure a considerable distance away.

I. The extreme score is only barely extreme.

Raw scores are observed on an ordinal scale. Fractional raw scores are unobservable. Consequently, any measure that yields an expected raw score closer than 0.5 score points to an extreme score is expected to be observed as producing an extreme score. The most central measure for a zero score is therefore the one corresponding to 0.5 score points, and for a perfect score the one corresponding to the perfect score less 0.5 score points. After the measures for non-extreme items and persons have been estimated in the usual way, the measures corresponding to these almost-extreme raw scores can be estimated (RMT 10:2 p. 499). Other commonly used extreme score corrections are 1/3 and 1/4.
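As an illustration of approach I, the sketch below finds the measure whose expected raw score equals the adjusted score. It assumes dichotomous Rasch items with already-estimated difficulties; the function names, the bisection search, and the 10 equal-difficulty items are illustrative assumptions, not the procedure of any particular program.

```python
import math

def expected_score(theta, difficulties):
    """Expected raw score on dichotomous Rasch items for a person at theta."""
    return sum(1.0 / (1.0 + math.exp(b - theta)) for b in difficulties)

def measure_for_score(target, difficulties, lo=-20.0, hi=20.0, tol=1e-9):
    """Bisection search for the theta whose expected raw score equals target."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, difficulties) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Assumed example: 10 items, all anchored at 0 logits.
difficulties = [0.0] * 10
adjustment = 0.5  # 1/3 and 1/4 are the other commonly used corrections

zero_measure = measure_for_score(adjustment, difficulties)
perfect_measure = measure_for_score(len(difficulties) - adjustment, difficulties)
print(round(zero_measure, 2), round(perfect_measure, 2))
```

With equal item difficulties the result can be checked by hand, since the measure for a score of 0.5 on 10 items is ln(0.5/9.5).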

II. The extreme measure is only barely extreme.

From raw score R a measure M_R and its standard error SE_R can be estimated. The measure for score R+1 is approximately M_R + SE_R² (see Wright & Stone, BTD, 1979, pp. 192-5). Thus the measure for an extreme score can be estimated from the measure for a score 1 point less extreme [see Table]. If S is the perfect score, then M_S ≈ M_S-1 + SE_S-1².
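A small numerical sketch of approach II follows. It assumes the simplest case of L dichotomous items of equal difficulty, for which the measure for score R is ln(R/(L-R)) and its squared standard error is L/(R(L-R)); the function name and the choice of L = 10 are illustrative.

```python
import math

def measure_and_se(score, n_items):
    """Measure and SE for a raw score on n_items equal-difficulty items
    (item difficulty taken as 0 logits)."""
    measure = math.log(score / (n_items - score))
    se = math.sqrt(n_items / (score * (n_items - score)))
    return measure, se

n_items = 10
m_prev, se_prev = measure_and_se(n_items - 1, n_items)  # score S-1 = 9
m_perfect = m_prev + se_prev ** 2                       # approach II
print(round(m_prev, 3), round(se_prev ** 2, 3), round(m_perfect, 3))
```

In this equal-difficulty case the extrapolation step SE_S-1² equals L/(L-1), which is the "LMD lower bound" row of the Table.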

III. The extreme measure is only barely significantly different.

Only measures statistically significantly more extreme than non-extreme measures would provoke separate consideration. Thus a measure M_S = M_S-1 + 1.65*SE_S-1 is the most central that would cause the rejection of the hypothesis, at the .05 level, that M_S and M_S-1 are statistically equivalent.
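Approach III reduces to one line of arithmetic. The sketch below applies it to assumed values for score S-1 on a 10-item test of equal-difficulty items; the function name and the input values are hypothetical.

```python
import math

def barely_significant_extreme(m_prev, se_prev, z=1.65):
    """Approach III: neighboring measure plus z standard errors."""
    return m_prev + z * se_prev

# Assumed inputs: score 9 of 10 on equal-difficulty items at 0 logits.
m_prev = math.log(9 / 1)
se_prev = math.sqrt(10 / 9)

m_perfect = barely_significant_extreme(m_prev, se_prev)
step = m_perfect - m_prev
print(round(m_perfect, 2), round(step, 2))
```

For these inputs the step beyond M_S-1 is about 1.74 logits.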

IV. The extreme measure aligns smoothly with non-extreme measures.

This can be achieved by curve-fitting. For instance, a quadratic fit of M_S to M_S-1, M_S-2 and M_S-3 yields M_S = 3*M_S-1 - 3*M_S-2 + M_S-3.
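The extrapolation formula can be tried out directly. The sketch below first confirms that it continues an exact quadratic sequence, then applies it to the three highest non-extreme measures of an assumed 10-item test of equal-difficulty items; the function name and inputs are illustrative.

```python
import math

def quadratic_extrapolation(m1, m2, m3):
    """Approach IV: m1, m2, m3 are the measures for scores S-1, S-2, S-3."""
    return 3 * m1 - 3 * m2 + m3

# Sanity check on an exact quadratic: 1, 4, 9 should continue to 16.
check = quadratic_extrapolation(9, 4, 1)

# Assumed example: measures ln(R/(L-R)) for R = 9, 8, 7 with L = 10.
n_items = 10
m9, m8, m7 = (math.log(r / (n_items - r)) for r in (9, 8, 7))
m_perfect = quadratic_extrapolation(m9, m8, m7)
print(check, round(m_perfect - m9, 2))
```

For these assumed measures the step M_S - M_S-1 comes out a little above one logit.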

V. The extreme response string is only barely modal.

The likelihood of each possible response string for a particular measure can be computed as L_R = Π P_nix, the product across items of the probabilities of the responses x in the string, where R is the raw score corresponding to that response string. If L₀ > 0.5 for a measure, then that measure will probably produce a response string with a raw score of zero. If L₀ < 0.5, then a non-zero score will probably be observed. The most central measure likely to produce an extreme score is the one for which L₀ = 0.5.
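To make approach V concrete, the sketch below searches for the measure at which the all-zero response string has likelihood 0.5. It assumes dichotomous Rasch items with known difficulties; the function names, the bisection search, and the 10 equal-difficulty items are assumptions for illustration.

```python
import math

def zero_string_likelihood(theta, difficulties):
    """Likelihood of failing every item: product of (1 - P_ni)."""
    like = 1.0
    for b in difficulties:
        like *= 1.0 / (1.0 + math.exp(theta - b))
    return like

def barely_modal_zero_measure(difficulties, lo=-20.0, hi=20.0, tol=1e-9):
    """Bisection for the theta at which the zero string has likelihood 0.5."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # The likelihood of the zero string falls as theta rises.
        if zero_string_likelihood(mid, difficulties) > 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

difficulties = [0.0] * 10  # assumed: 10 items, all at 0 logits
theta_zero = barely_modal_zero_measure(difficulties)
print(round(theta_zero, 3))
```

With equal difficulties there is a closed form to check against: the measure is ln(2^(1/L) - 1).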

VI. Data augmentation with non-extreme responses.

The belief in test relevancy can be expressed in terms of additional artificial responses (Jannarone et al., 1990). For instance, two further responses could be added to every person and item response string: a "1" and a "0". Then no response string can be extreme. If the additional responses are arranged to alternate "01" and "10", then the additional artificial persons and items will have close to 50% success rates, and so have minimal impact on the measurement system. Once the set of measures has been estimated, the measures can be anchored. Then the augmented data can be dropped, allowing standard errors and fit statistics to be computed from the observed data. If the prior belief is twice as strong, then 4 items can be added. For belief expressed in fractions of an item, the artificial items can be weighted.
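The augmentation step might be sketched as follows for a complete persons-by-items matrix of 0/1 responses. The function name and the way the cells shared by artificial persons and artificial items are filled are assumptions; the estimation and anchoring steps are not shown.

```python
def augment(data):
    """Add two artificial item responses to every person and two artificial
    person responses to every item, alternating "01" and "10"."""
    # Two artificial items: persons alternately receive 0,1 and 1,0.
    rows = [row + ([0, 1] if n % 2 == 0 else [1, 0])
            for n, row in enumerate(data)]
    n_cols = len(rows[0])
    # Two artificial persons with complementary alternating strings
    # (assumption: the pattern simply continues across the artificial items).
    person_a = [i % 2 for i in range(n_cols)]
    person_b = [1 - x for x in person_a]
    return rows + [person_a, person_b]

# Hypothetical data: one perfect, one zero, and one mixed response string.
data = [[1, 1, 1],
        [0, 0, 0],
        [1, 0, 1]]
augmented = augment(data)
row_scores = [sum(row) for row in augmented]
col_scores = [sum(col) for col in zip(*augmented)]
print(row_scores, col_scores)
```

After augmentation every person and every item has at least one success and one failure, so no score is extreme.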

VII. The underlying distribution is specified.

If the underlying distribution of, say, persons is specified to be normal (or any other distribution), then measures can be imputed for extreme scores that result in the best fit to that distribution. These measures are constrained to be more extreme than the measures estimated from similar non-extreme response strings.

VIII. Posterior distribution = Prior distribution.

The distribution of the measures estimated from the data is intended to coincide with the distribution of the measures that generated it. This can be used to refine the measure estimates for extreme scores.

After extreme measures are estimated using one of the methods above, the means and standard deviations of the item and person measure distributions are computed. Then data are simulated using the entire set of measures (extreme and non-extreme). From these data, a new set of measures is estimated for non-extreme and extreme scores. The means and S.D.s of these new measures are computed and compared to their previous values. The previous "extreme" measures are adjusted, and new means and S.D.s computed, so as to make the two distributions as similar as possible. Further data are simulated from the revised measures and the distributions are again compared. The extreme measures are again adjusted to make the distributions coincide. This iterative process continues until no more adjustments are necessary or there is no improvement in distribution coincidence.

"Least Measurable Distance" Extrapolations for
Extreme Score Measures in Logits

Approach                                         Number of dichotomous items
                                                 or polytomous steps, L
                                                   10     25     50    100

I. Extreme Score Adjustment
   R=1/2                  (2L-1)/(L-1)            0.75   0.71   0.70   0.70
   R=1/3                  (3L-1)/(L-1)            1.17   1.13   1.11   1.11
   R=1/4                  (4L-1)/(L-1)            1.57   1.51   1.48   1.48

II. Measure Extrapolation
   LMD lower bound        L/(L-1)                 1.11   1.04   1.02   1.01
   Test width 2 logits    C_f2/L, f=(L-1)/L       1.16   1.04   1.02   1.01
   Test width 4 logits    C_f4/L                  1.22   1.08   1.04   1.01
   Test width 6 logits    C_f6/L                  1.37   1.12   1.07   1.04
   Test width 8 logits    C_f8/L                  1.44   1.17   1.10   1.04

Which to choose?

Most of the difference between these approaches is hair-splitting [see Table], but questions to be addressed include:

(a) Are the items dichotomous, polytomous or mixed?
(b) Is the test fixed length or adaptive?
(c) Are there missing data?
(d) What is known about the underlying distributions?
(e) What computational resources are available?
(f) Are the computed extreme measures reasonable?

Choose an extrapolation approach that provides consistently reasonable measures for your data and is easy to explain. Approach I has proved robust and flexible for small samples with missing data and is implemented in WINSTEPS.

A Rule of Thumb

Measures corresponding to extreme scores 0 and L should be no closer to their next integer neighbors 1 and L-1 than the least measurable distance, LMD, between integer neighbors estimated at 1 and L-1. According to Best Test Design (Wright & Stone, 1979, pp. 132, 135, 192-198, 214), when R=1 or L-1,

LMD = C_fw/L > L/[R(L-R)] > L/(L-1)

From the Table, reasonable values are generally in the range

1.0 ≤ M_S - M_S-1 ≤ 1.2

A rule of thumb follows:

No extreme score extrapolation can be less than one logit. Extrapolations >1.2 logits require convincing justification.

Standard Errors of Extreme Measures

Extreme measures have indefinite standard errors, but the following provide useful values:

(1) SE_S > SE_S-1

(2) SE_S ≈ SE_S-1 + SE_S-1²/2

(3) SE_S ≈ 1/√(Variance of raw score S | M_S)
[This is implemented in WINSTEPS]
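As a hedged illustration of (3), the sketch below takes the standard error as the reciprocal square root of the model variance of the raw score, the sum of P(1-P) across dichotomous items, evaluated at the measure assigned to the extreme score. The function name, the 10 equal-difficulty items, and the use of the 0.5 score-point adjustment to locate M_S are assumptions, not a description of any program's internals.

```python
import math

def model_se(theta, difficulties):
    """1 / sqrt(model variance of the raw score) at measure theta."""
    variance = 0.0
    for b in difficulties:
        p = 1.0 / (1.0 + math.exp(b - theta))
        variance += p * (1.0 - p)
    return 1.0 / math.sqrt(variance)

difficulties = [0.0] * 10           # assumed: 10 items at 0 logits
m_prev = math.log(9.0 / 1.0)        # measure for score S-1 = 9
m_extreme = math.log(9.5 / 0.5)     # measure for score 9.5 (approach I)

se_prev = model_se(m_prev, difficulties)
se_extreme = model_se(m_extreme, difficulties)
print(round(se_prev, 3), round(se_extreme, 3))
```

In this example the standard error at the extreme measure is larger than the one at score S-1, in line with (1).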

Benjamin D. Wright

Jannarone R.J., Yu K.F., Laughlin J.E. (1990) Easy Bayes estimation for Rasch-type models. Psychometrika 55, 3, 449-460.

Estimating Rasch measures for extreme scores. Wright B.D. … Rasch Measurement Transactions, 1998, 12:2 p. 632-3.

The URL of this page is www.rasch.org/rmt/rmt122h.htm
