Polytomous Mean-Square Fit Statistics

Fit statistics in Rasch analysis serve a different purpose from those in regression analysis. In descriptive statistical methodology, fit statistics are used to discover a model that fits the data well enough that the data could be considered to have been generated by the model. In Rasch analysis, the model is already chosen. The purpose of the fit statistics is to aid in measurement quality control, to identify those parts of the data which meet Rasch model specifications and those parts which don't. Parts that don't are not automatically rejected, but are examined to identify in what way, and why, they fall short, and whether, on balance, they contribute to or corrupt measurement. Then the decision is made to accept, reject or modify the data. Modification includes simple actions such as correcting obvious data entry errors and respondent mistakes, and more sophisticated actions such as collapsing rating scale categories.

We have 30 years of experience investigating mean-square (MnSq) statistics for dichotomous data (RMT 8:2, p. 360, www.rasch.org/rmt/rmt82a.htm). The interpretation of fit statistics for polytomous items, however, is a more recent development.

Response String (Easy..........Hard)	INFIT Mean-Square	OUTFIT Mean-Square	Point-Measure Correlation	Point-Measure Expected	Diagnosis
I. modelled: stochastically monotonic in form, strictly monotonic in meaning
33333132210000001011	.98	.99	.78	.70	
31332332321220000000	.98	1.04	.81	.70	
33333331122300000000	1.06	.97	.87	.70	
33333331110010200001	1.03	1.00	.81	.70	
II. overfitting (muted):
33222222221111111100	.18	.22	.92	.70	Guttman pattern
33333222221111100000	.31	.36	.97	.70	most probable responses
33333333221100000000	.80	.77	.93	.70	high discrimination
32222222221111111110	.21	.26	.89	.70	low discrimination
32323232121212101010	.52	.54	.82	.70	tight progression
III. limited categories (range restriction):
33333333332222222222	.24	.24	.87	.55	high (low) categories
22222222221111111111	.24	.34	.87	.70	central categories
33333322222222211111	.16	.20	.93	.66	only 3 categories
IV. informative-noisy:
32222222201111111130	.94	1.22	.55	.70	noisy outliers
33233332212333000000	1.25	1.09	.77	.69	erratic transitions
33133330232300101000	1.49	1.40	.72	.70	noisy progression
33333333330000000000	1.37	1.20	.87	.70	extreme categories
V. non-informative:
22222222222222222222	.85	1.21	.00	.67	one category
12121212121212121212	1.50	1.96	-.09	.70	central flip-flop
01230123012301230123	3.62	4.61	-.19	.68	rotate categories
03030303030303030303	5.14	6.07	-.09	.70	extreme flip-flop
03202002101113311002	2.99	3.59	-.01	.70	random responses
VI. contradictory:
11111122233222111111	1.75	2.02	.00	.70	folded pattern
11111111112222222222	2.56	3.20	-.87	.55	central reversal
22222222223333333333	2.11	4.13	-.87	.70	high reversal
00111111112222222233	4.00	5.58	-.92	.70	Guttman reversal
00000000003333333333	8.30	9.79	-.87	.70	extreme reversal

One subtlety of rating scale fit analysis is the detection of idiosyncratic category usage, particularly respondents' over-use of central or extreme categories. The table shows response strings with their diagnostic fit statistics. Responses are listed left to right from the easiest item to the hardest, i.e., in descending order of expected value. Representative values have been chosen for the item and step calibrations of a 4-category (0-3) rating scale. The response strings that best fit the Rasch model (Section I) descend in value stochastically. They exhibit MnSq's near 1.0 and positive point-measure correlations (which are similar to point-biserial correlations, but correlate responses with Rasch measures rather than raw scores).
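As a rough illustration of the point-measure correlation just described, the sketch below correlates a response string with item easiness (the negative of the item anchor values listed in the control file at the end of this article). The function name and the sign convention are assumptions, chosen so that a string descending from easy to hard items yields a positive correlation. A constant string has no variance, so the sketch returns 0 for it, mirroring the .00 shown in the table.

```python
import math

# Item anchor values from the control file: -1.9 to 1.9 in steps of 0.2.
ITEM_MEASURES = [round(-1.9 + 0.2 * i, 1) for i in range(20)]


def point_measure_correlation(responses: str) -> float:
    """Pearson correlation between responses and item easiness (hypothetical helper)."""
    x = [int(c) for c in responses]
    easiness = [-d for d in ITEM_MEASURES]  # assumed sign convention
    n = len(x)
    mean_x = sum(x) / n
    mean_e = sum(easiness) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    see = sum((e - mean_e) ** 2 for e in easiness)
    if sxx == 0:
        return 0.0  # constant responses: correlation undefined, shown as .00
    sxe = sum((a - mean_x) * (e - mean_e) for a, e in zip(x, easiness))
    return sxe / math.sqrt(sxx * see)


for string in ("33333333330000000000", "00000000003333333333", "22222222222222222222"):
    print(string, round(point_measure_correlation(string), 2))
```

Under these assumptions the "extreme categories" string and the "extreme reversal" string give correlations of equal size and opposite sign, as in the table.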

In Section II, the Guttman pattern matches the model's expectations as closely as possible, and so has low MnSq statistics. MnSq's below 1.0 indicate better-than-expected fit to the model. Such responses agree with, but add little information to, the other responses. This pattern also has a high point-measure correlation.
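To make the low mean-squares of the Guttman pattern concrete, here is a minimal sketch of the usual mean-square computation. It assumes an Andrich rating scale model with the item and step anchor values from the control file at the end of this article, and a maximum-likelihood person measure found by Newton-Raphson. INFIT is taken as the sum of squared residuals divided by the sum of model variances, and OUTFIT as the mean squared standardized residual. All function names are hypothetical, and extreme (minimum or maximum) scores are not handled.

```python
import math

ITEM_MEASURES = [-1.9 + 0.2 * i for i in range(20)]  # item anchors (IAFILE)
STEP_MEASURES = [-1.0, 0.0, 1.0]                     # step anchors 1..3 (SAFILE)


def moments(theta: float, delta: float) -> tuple[float, float]:
    """Expected score and model variance for one item under the rating scale model."""
    logits = [0.0]
    for tau in STEP_MEASURES:
        logits.append(logits[-1] + theta - delta - tau)
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]  # shifted for numerical stability
    total = sum(weights)
    probs = [w / total for w in weights]
    expected = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    return expected, variance


def estimate_measure(x: list[int]) -> float:
    """Maximum-likelihood person measure with items and steps anchored."""
    score = sum(x)
    if score == 0 or score == len(STEP_MEASURES) * len(x):
        raise ValueError("extreme scores have no finite estimate")
    theta = 0.0
    for _ in range(100):
        ms = [moments(theta, d) for d in ITEM_MEASURES]
        step = (score - sum(e for e, _ in ms)) / sum(w for _, w in ms)
        step = max(-1.0, min(1.0, step))  # damp large Newton steps
        theta += step
        if abs(step) < 1e-8:
            break
    return theta


def mean_squares(responses: str) -> tuple[float, float]:
    """Return (INFIT, OUTFIT) mean-squares for a 20-item response string."""
    x = [int(c) for c in responses]
    theta = estimate_measure(x)
    ms = [moments(theta, d) for d in ITEM_MEASURES]
    squared = [(xi - e) ** 2 for xi, (e, _) in zip(x, ms)]
    infit = sum(squared) / sum(w for _, w in ms)
    outfit = sum(s / w for s, (_, w) in zip(squared, ms)) / len(x)
    return infit, outfit


for string in ("33222222221111111100", "00000000003333333333"):
    infit, outfit = mean_squares(string)
    print(string, round(infit, 2), round(outfit, 2))
```

Under these assumptions the sketch gives values close to those tabled for the Guttman pattern (.18, .22) and for the extreme reversal (8.30, 9.79); other strings can be checked the same way.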

Section III shows other over-fitting (muted) response strings: those in which the full range of the categories is not employed. Their low MnSq statistics and high correlations seem to mark them as matching the model especially well, but low MnSq statistics indicate a lack of statistical information, here resulting from a category range restriction. Raters trying not to contradict other raters may emphasize central categories or exhibit less discrimination (Section II) and so be reported with low MnSq's, i.e., as less statistically informative.

Section IV depicts response strings that contain useful measurement information, but also challenge the construct hierarchy. To the naked eye, response strings in Section IV look much like those in Section I. MnSq's above 1.0 indicate the presence of unmodelled variance (noise) along with the useful information in the responses. At some level, the noise in the responses overwhelms the information (music) and the response string, as it stands, is no longer assisting in measurement construction. Looking at Sections V and VI, it appears that MnSq's of 1.5 or more, and correlations near or below 0, are indicative of disruptive response strings for these data.
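The rule of thumb in the last sentence can be checked against the tabled values. The sketch below is one possible reading, not a general criterion: it flags a string when either mean-square is 1.5 or more, or when the point-measure correlation is at or below 0.1. The 0.1 cut-off for "near 0", the either/or combination, and the function name are assumptions made for illustration.

```python
# (section, INFIT, OUTFIT, point-measure correlation), copied from the table.
TABLE = [
    ("I", 0.98, 0.99, 0.78), ("I", 0.98, 1.04, 0.81),
    ("I", 1.06, 0.97, 0.87), ("I", 1.03, 1.00, 0.81),
    ("II", 0.18, 0.22, 0.92), ("II", 0.31, 0.36, 0.97),
    ("II", 0.80, 0.77, 0.93), ("II", 0.21, 0.26, 0.89),
    ("II", 0.52, 0.54, 0.82),
    ("III", 0.24, 0.24, 0.87), ("III", 0.24, 0.34, 0.87),
    ("III", 0.16, 0.20, 0.93),
    ("IV", 0.94, 1.22, 0.55), ("IV", 1.25, 1.09, 0.77),
    ("IV", 1.49, 1.40, 0.72), ("IV", 1.37, 1.20, 0.87),
    ("V", 0.85, 1.21, 0.00), ("V", 1.50, 1.96, -0.09),
    ("V", 3.62, 4.61, -0.19), ("V", 5.14, 6.07, -0.09),
    ("V", 2.99, 3.59, -0.01),
    ("VI", 1.75, 2.02, 0.00), ("VI", 2.56, 3.20, -0.87),
    ("VI", 2.11, 4.13, -0.87), ("VI", 4.00, 5.58, -0.92),
    ("VI", 8.30, 9.79, -0.87),
]

MNSQ_LIMIT = 1.5  # from the text: "MnSq's of 1.5 or more"
CORR_LIMIT = 0.1  # assumed reading of "near or below 0"


def is_disruptive(infit: float, outfit: float, corr: float) -> bool:
    """Flag a response string as disruptive under the assumed rule of thumb."""
    return max(infit, outfit) >= MNSQ_LIMIT or corr <= CORR_LIMIT


flagged_sections = sorted({s for s, i, o, c in TABLE if is_disruptive(i, o, c)})
print(flagged_sections)  # ['V', 'VI']
```

With these cut-offs every string in Sections V and VI is flagged and none in Sections I to IV, which matches the reading of the table given above. The thresholds apply to these data only.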

Section V presents non-informative response sets, used by respondents to avoid engaging the rating scale. The symptom here is the lack of relationship between the responses and the construct. Measurement construction would be aided by omitting or pruning such response strings. Once the measurement framework has been constructed and established with anchor measures and calibrations, all response strings may be reported so that the misfit statistics can be used to inform the use of the measures. Paradoxically, perfect agreement, in which all raters rate an examinee with the same category, is not an ideal of Rasch measurement. Unanimity of choice of response category contributes no information about the relative standing of the categories, but implies that the category is so wide that large differences in perceived levels of performance are still classified under that one category.

Section VI demonstrates response strings that occur when prompts are misunderstood or miscoded. Large fit statistics and negative correlations flag response strings that are coded "backwards". Except with unusual patterns of missing data or rank ordering, response patterns with negative correlations should be recoded or omitted from the measurement system.
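A small sketch of what recoding a "backwards" string might look like, assuming that recoding means reversing the category order (0 becomes 3, 1 becomes 2, and so on) on the 0-3 scale. The helper names are hypothetical, and the correlation is taken against item easiness, i.e., the negative of the item anchor values in the control file at the end of this article.

```python
import math

ITEM_MEASURES = [round(-1.9 + 0.2 * i, 1) for i in range(20)]  # item anchors
TOP_CATEGORY = 3  # highest category of the 0-3 rating scale


def reverse_code(responses: str) -> str:
    """Reverse the category order of every response (assumed recoding rule)."""
    return "".join(str(TOP_CATEGORY - int(c)) for c in responses)


def correlation_with_easiness(responses: str) -> float:
    """Pearson correlation between responses and item easiness."""
    x = [int(c) for c in responses]
    e = [-d for d in ITEM_MEASURES]
    n = len(x)
    mean_x, mean_e = sum(x) / n, sum(e) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    see = sum((b - mean_e) ** 2 for b in e)
    if sxx == 0:
        return 0.0  # constant string: correlation undefined
    sxe = sum((a - mean_x) * (b - mean_e) for a, b in zip(x, e))
    return sxe / math.sqrt(sxx * see)


backwards = "00000000003333333333"  # "extreme reversal" in Section VI
recoded = reverse_code(backwards)
print(recoded)  # 33333333330000000000
print(round(correlation_with_easiness(backwards), 2),
      round(correlation_with_easiness(recoded), 2))
```

Reversing the "extreme reversal" string produces the "extreme categories" string of Section IV, and the sign of the correlation flips accordingly. Whether recoding is justified still depends on establishing that the responses really were coded backwards.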

Richard M. Smith
Rehabilitation Foundation Inc.
P.O. Box 675, Wheaton IL 60189-9931

This is the BIGSTEPS control file for the data above:

&inst
TITLE='COMPUTING STATISTICS'
NI=20
ITEM1=1 ; include response strings in person name
name1=1
namlen=30
CODES=0123
ptbis=no ; compute point-measure correlation
INUMB=YES ; no item labels
TFILE=*
6 ; Table 6 - persons in fit order
18 ; table 18 - persons in entry order
*
IAFILE=* ; item anchor values - uniform
1 -1.9
2 -1.7
3 -1.5
4 -1.3
5 -1.1
6 -0.9
7 -0.7
8 -0.5
9 -0.3
10 -0.1
11 0.1
12 0.3
13 0.5
14 0.7
15 0.9
16 1.1
17 1.3
18 1.5
19 1.7
20 1.9
*
SAFILE=* ; step anchor values
0 0
1 -1
2 0
3 1
*
&end
33333132210000001011 modelled
31332332321220000000 modelled
33333331122300000000 modelled
33333331110010200001 modelled
33222222221111111100 most expected
33333222221111100000 most likely
33333333221100000000 high discrimination
32222222221111111110 low discrimination
32323232121212101010 tight progression
33333333332222222222 high (low) categories
22222222221111111111 2 central categories
33333322222222211111 only 3 categories
32222222201111111130 noisy outliers
33233332212333000000 erratic transitions
33333333330000000000 extreme categories
33133330232300101000 noisy progression
22222222222222222222 one category
12121212121212121212 central flip-flop
03202002101113311002 random responses
01230123012301230123 rotate categories
03030303030303030303 extreme flip-flop
11111122233222111111 folded pattern
22222222223333333333 high reversal
11111111112222222222 central reversal
00111111112222222233 Guttman reversal
00000000003333333333 extreme reversal

Polytomous mean-square fit statistics. Smith R.M. … Rasch Measurement Transactions, 1996, 10:3 p. 516-517.

The URL of this page is www.rasch.org/rmt/rmt103a.htm

Website: www.rasch.org/rmt/contents.htm