Rasch Power Analysis: Size vs. Significance: Infit and Outfit Mean-Square and Standardized Chi-Square Fit Statistic

Which statistic is more informative depends on what null hypothesis you are concerned about:
Null hypothesis: "the data fit the model (perfectly)" - use the t-test significance = Winsteps Zstd
Null hypothesis: "the data fit the model (usefully)" - use the chi-square divided by degrees of freedom = mean-square.

"The first of the distributions characteristic of modern tests of significance, though originating with F.R. Helmert [1875], was rediscovered by Karl Pearson in 1900, for the measure of discrepancy between observation and hypothesis, known as χ2 [chi-square].. ... It supplies an exact and objective measure of the joint discrepancy from their expectations of a number of normally distributed ... variates" (R. A. Fisher, Statistical Methods for Research Workers.)

It is the χ2 distribution which underlies many Rasch-model fit statistics. Even those based on the likelihood of the data capitalize on the fact that -2 log_e(likelihood) is asymptotically χ2 distributed.

A χ2 statistic with k degrees of freedom, d.f., is the sum of the squares of k independent random unit-normal deviates. Therefore its expected value is k, and its model variance is 2k. This provides the convenient feature that the expected value of a mean-square statistic, i.e., a χ2 statistic divided by its d.f., is 1.0. But the model variance of a mean-square statistic is 2/k, which shrinks as k grows. Thus, as the number of degrees of freedom, which grows with the sample size, increases, the power to detect small divergences increases, and ever smaller departures of the mean-square from 1.0 become statistically "significant", i.e., surprising, if the data are indeed as modeled.
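The expected value of 1.0 and the model variance of 2/k can be checked with a small simulation. The following is a minimal sketch in Python (standard library only). The function name `simulate_mean_squares`, the seed, and the number of replications are illustrative assumptions, not part of the original analysis.

```python
import random
import statistics


def simulate_mean_squares(k, replications=5000, seed=1):
    """Draw `replications` mean-squares, each built from k unit-normal deviates."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) / k
        for _ in range(replications)
    ]


# Compare the simulated mean and variance with the model values 1.0 and 2/k.
for k in (10, 50, 200):
    ms = simulate_mean_squares(k)
    print(
        f"d.f.={k:4d}  mean={statistics.fmean(ms):.3f}  "
        f"variance={statistics.pvariance(ms):.4f}  model 2/k={2.0 / k:.4f}"
    )
```

The simulated variance should track 2/k closely, so the same departure from 1.0 becomes more surprising as the d.f. increase.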

For terminology, etc., see www.rasch.org/rmt/rmt162f.htm

The relationship between the size and significance of mean-square statistics is shown in the Figure. The statistical significance is expressed as the corresponding value on a unit normal distribution. For 2-sided t-tests, 1.96 corresponds to p=.05. For dichotomous responses, d.f. is a little less than the sample size (for an item) or the test length (for a person). For polytomous responses, d.f. is somewhat less than (sample size or test length)*(number of polytomous categories - 1).
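The link between size and significance can also be approximated numerically. This is a minimal sketch, assuming the Wilson-Hilferty cube-root transformation is used to convert a mean-square with k d.f. into a unit-normal deviate, and assuming the model variance of 2/k. The source does not state how the plotted values were computed, and the function name `zstd_from_mean_square` is hypothetical.

```python
import math


def zstd_from_mean_square(mean_square, df):
    """Approximate unit-normal deviate for a mean-square with `df` d.f.

    Assumption: Wilson-Hilferty cube-root approximation, with the model
    standard deviation of the mean-square taken as q = sqrt(2 / df).
    """
    q = math.sqrt(2.0 / df)
    return (mean_square ** (1.0 / 3.0) - 1.0) * (3.0 / q) + q / 3.0


# Tabulate the standardized value for several sizes of misfit and d.f.
dfs = (10, 30, 100, 200, 500)
print("mean-square " + "".join(f"{'d.f.=' + str(df):>10}" for df in dfs))
for ms in (1.1, 1.2, 1.5, 2.0):
    row = "".join(f"{zstd_from_mean_square(ms, df):10.2f}" for df in dfs)
    print(f"{ms:11.1f} {row}")
```

Reading along a row shows the same mean-square becoming more significant as d.f. increase. Reading down a column shows how large a mean-square must be before it is flagged at a given d.f.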

Test of Perfect Fit

The null hypothesis for a significance test of "perfect" fit of these data would be "Mean-square=1.0". Since the Rasch model is a mathematical ideal, like a Pythagorean triangle, we never expect to encounter empirical data that match it exactly. So this is an instance in which we know, a priori, that the null hypothesis is false, and that it will be rejected whenever there are enough data.

A mean-square of 1.2 means 1 unit of modeled information and .2 of unmodeled noise. The plot indicates that items with as little misfit as this would be flagged as significantly misfitting if observed in samples of over 200 persons. On the other hand, grossly noisy items, with more unmodeled noise than modeled information, i.e., with mean-squares of 2.0 or more, are not flagged in samples of fewer than 10. Overall, useful sample sizes for standardized fit statistics appear to be in the range 100-250 data points for the "perfect fit" null hypothesis.
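The sample sizes quoted above can be roughly reproduced by searching for the smallest d.f. at which a given mean-square reaches the 2-sided critical value of 1.96. This is a sketch under the same assumptions as before: a Wilson-Hilferty cube-root approximation with model variance 2/k, which may differ slightly from the values read off the plot. Both function names are hypothetical.

```python
import math


def zstd_from_mean_square(mean_square, df):
    """Wilson-Hilferty approximation (an assumption, see the text above)."""
    q = math.sqrt(2.0 / df)  # model S.D. of a mean-square with df d.f.
    return (mean_square ** (1.0 / 3.0) - 1.0) * (3.0 / q) + q / 3.0


def smallest_flagged_df(mean_square, critical=1.96, max_df=100_000):
    """Smallest d.f. at which `mean_square` (> 1.0) reaches `critical`.

    Returns None if the critical value is not reached by `max_df`.
    """
    for df in range(1, max_df + 1):
        if zstd_from_mean_square(mean_square, df) >= critical:
            return df
    return None


for ms in (1.2, 1.5, 2.0):
    print(f"mean-square {ms}: flagged from about {smallest_flagged_df(ms)} d.f.")
```

Under these assumptions, a mean-square of 1.2 is flagged only once the d.f. exceed roughly 200, while a mean-square of 2.0 is flagged from d.f. just above 10, in line with the text.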

Indication of Useful Fit

An indicator of "useful" fit could be "mean-square = 1.5 or less" (e.g., RMT 14:2, p. 743). Then, as the sample size (d.f.) increases, especially beyond 30, there is increasing certainty as to whether these data are productive for measurement (mean-square ≤ 1.5) or unproductive (mean-square > 1.5). This could be formulated as a one-sided t-test of the hypothesis that the mean-square is ≤1.5, with only values >1.5 being of concern.
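One way the one-sided test might be sketched is to rescale the observed mean-square by the criterion value and then standardize it. This assumes that, when the true mean-square is 1.5, the observed mean-square divided by 1.5 behaves like a χ2 statistic divided by its d.f., and it again uses the Wilson-Hilferty cube-root approximation. Neither assumption comes from the source, and the function name `one_sided_z` is hypothetical. The one-sided 5% critical value on the unit normal distribution is 1.645.

```python
import math


def one_sided_z(observed_mean_square, df, criterion=1.5):
    """Approximate unit-normal deviate for H0: true mean-square <= criterion.

    Assumptions: observed / criterion is treated as chi-square / d.f.,
    and standardized with the Wilson-Hilferty cube-root approximation.
    """
    ratio = observed_mean_square / criterion
    q = math.sqrt(2.0 / df)
    return (ratio ** (1.0 / 3.0) - 1.0) * (3.0 / q) + q / 3.0


# The same observed mean-square of 1.7 at increasing d.f.
for df in (30, 100, 400):
    z = one_sided_z(1.7, df)
    verdict = "of concern" if z > 1.645 else "not demonstrably above 1.5"
    print(f"d.f.={df:4d}  z={z:5.2f}  {verdict}")
```

With few d.f., a mean-square of 1.7 cannot be distinguished from 1.5. With several hundred d.f., the same value gives a clear indication that the data are unproductive.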

Informal simulation studies and experience analyzing hundreds of datasets indicate that:

Interpretation of parameter-level mean-square fit statistics:

Mean-square   Implication for measurement
> 2.0         Distorts or degrades the measurement system
1.5 - 2.0     Unproductive for construction of measurement, but not degrading
0.5 - 1.5     Productive for measurement
< 0.5         Less productive for measurement, but not degrading. May produce misleadingly good reliabilities and separations
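The ranges above can be applied mechanically, as in the following minimal sketch. The function name `interpret_mean_square` is hypothetical. The handling of the boundary values is an assumption: 0.5 and 1.5 are treated as productive, following "mean-square ≤ 1.5" in the text, and 2.0 is treated as unproductive but not degrading, following "> 2.0" in the table.

```python
def interpret_mean_square(mean_square):
    """Return the interpretation of a parameter-level mean-square."""
    if mean_square > 2.0:
        return "Distorts or degrades the measurement system"
    if mean_square > 1.5:
        return "Unproductive for construction of measurement, but not degrading"
    if mean_square >= 0.5:
        return "Productive for measurement"
    return "Less productive for measurement, but not degrading"


for ms in (0.3, 0.9, 1.5, 1.7, 2.4):
    print(f"{ms:.1f}: {interpret_mean_square(ms)}")
```

This classification concerns size only. It says nothing about significance, which also depends on the d.f.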

See also "Reasonable mean-square fit values" www.rasch.org/rmt/rmt83b.htm

John Michael Linacre

Plot of mean-squares with significance and degrees of freedom



Rasch Power Analysis: Size vs. Significance: Standardized Chi-Square Fit Statistic. J.M. Linacre … Rasch Measurement Transactions, 2003, 17:1, 918.




The URL of this page is www.rasch.org/rmt/rmt171n.htm
