Rasch Estimation Methods, Statistical Independence and Global Fit

Edited excerpts from the LTEST-L e-mail forum:

John de Jong (language test developer and long-time Rasch practitioner):

No global model-data fit statistics are reported by the Facets Rasch analysis program for judge-awarded ratings, and there can't be, because

1) Facets uses unconditional (JMLE, i.e., UCON) estimation;

2) the data violates the assumption of independence, i.e., the different scores given by different raters are not independent: they are given to the same students on the same tasks.

So, though the Facets program may be nice to tinker with, no generalizations from results can be made.

The problem of consistent estimation when data is missing has been partially solved by Cees Glas and applied in the OPLM computer program (RMT 6:4 p.253). OPLM uses Conditional Maximum Likelihood Estimation (CMLE) to estimate item parameters across large numbers of linked subtests. These item parameter estimates are then used to estimate person measures.

John M. Linacre (author of Facets):

Question: Why doesn't Facets report a global summary statistic for the fit of judge-awarded data to the Rasch measurement model?

Answer: It does when you tell it to! Simply define an extra facet, beyond students, tasks and raters. This facet contains only one element which is specified as a component of all data points. The Infit and Outfit statistics for this element are global mean-square fit statistics (chi-squares divided by degrees of freedom) with z-score-equivalent significance levels. Since this may seem awkward, I'll overcome my revulsion to global fit tests and let Facets report a chi-square test of the null hypothesis that all data fit the model (as the BIGSTEPS Rasch analysis program does). I predict, in advance, that data will never fit - because empirical data never fit a theoretical ideal!
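
The pooled mean-squares described above can be sketched in a few lines. This is only a minimal illustration, assuming dichotomous (0/1) observations and person and item measures that are already known; the function names are hypothetical, the number of observations stands in for the degrees of freedom, and nothing here is claimed to be how Facets itself computes its statistics.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of success under the dichotomous Rasch model (logits)."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))


def global_mean_squares(observations):
    """Pool every data point into one pair of fit statistics.

    observations: iterable of (x, ability, difficulty), x in {0, 1}.
    Returns (outfit, infit):
      outfit = mean of squared standardized residuals
      infit  = sum of squared residuals / sum of model variances
    Assumption: the count of observations is used as the degrees of
    freedom; no adjustment is made for estimated parameters.
    """
    z2_sum = 0.0      # sum of squared standardized residuals
    resid2_sum = 0.0  # sum of squared raw residuals
    var_sum = 0.0     # sum of model variances
    count = 0
    for x, ability, difficulty in observations:
        p = rasch_probability(ability, difficulty)
        variance = p * (1.0 - p)
        residual = x - p
        z2_sum += residual ** 2 / variance
        resid2_sum += residual ** 2
        var_sum += variance
        count += 1
    return z2_sum / count, resid2_sum / var_sum


# Hypothetical data: (observed response, person measure, item measure)
data = [(1, 1.0, 0.0), (0, -0.5, 0.5), (1, 0.0, 2.0), (0, 1.5, -1.0)]
outfit, infit = global_mean_squares(data)
print(f"global outfit = {outfit:.2f}, global infit = {infit:.2f}")
```

Values near 1.0 indicate data about as noisy as the model expects. The pooled number still hides which observations are responsible for any misfit, which is the point made in the next paragraph.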

Why am I repulsed by global fit tests - the neutron bombs of statistical practice? Because misfit is never global and never a statistical event, it is always local and idiosyncratic. Rejecting a dataset or a measurement model on global fit is equivalent to refusing to eat because most things are inedible. In practice, you evaluate what you eat one mouthful at a time, checking as you eat for local fit to your model for food edibility. Otherwise you starve and your species becomes extinct!

There is a crucial difference between measurement models and the descriptive regression models beloved by statisticians:

Descriptive models summarize a particular set of assumed-linear numbers as succinctly as possible. Any model will do. Global fit is a convenient way of choosing the opportunistically "best" model or rejecting this or that "bad" model. Global fit is the only criterion.

Rasch measurement models are different in intent and practice. They extract, from a set of ordered observations, linear measures as generalizable as statistically possible. Only a specific family of models can do this. There is no influence of global (or even local) fit on model choice. The influence of fit is on data choice - its selection, reorganization, reconception.

John objects to fit statistics based on measures obtained with the joint (unconditional) maximum likelihood estimation algorithm (UCON). Like all estimation techniques, UCON has its strengths and weaknesses. Its strengths include efficient and versatile handling of systematic, random or unintentional missing data, and the ability to estimate measures from large, complex many-facet data sets incorporating diverse observation models. Its weaknesses are a minor degree of statistical inconsistency under artificial conditions, and statistical bias with some very small data sets and certain idiosyncratic data configurations.

How troublesome are these weaknesses? Statistical consistency is the property that, when applied to an infinite amount of data, the estimation algorithm will give a "right" answer. In fact, as implemented in Facets, UCON is consistent (Haberman 1977), because persons, items, etc. are all conceptually unlimited!

Bias is the degree to which an estimate based on a finite set of data is misleading. All Rasch estimation methods are biased, i.e., they produce a "wrong" answer (RMT 3:1 p.47). UCON is more biased than conditional methods, but this bias is negligibly small and always less than the standard errors of the estimated measures for real many-facet data sets of any useful size. Further, the bias can be easily corrected or eliminated for many simple situations, e.g., paired comparisons.
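
A worked illustration of how such a correction can operate, using the textbook two-item dichotomous test rather than any Facets computation. In that case only persons who succeed on exactly one item carry information, the conditional estimate of the difficulty gap has a closed form, and the unconditional estimate comes out exactly twice as large. The function names are hypothetical, and the (L-1)/L multiplier is the commonly cited correction for a test of L items, assumed here for illustration.

```python
import math


def cmle_difficulty_gap(n10, n01):
    """Conditional estimate of d2 - d1 for a two-item test.

    n10: persons succeeding on item 1 only.
    n01: persons succeeding on item 2 only.
    Given a raw score of 1, P(success on item 1) = 1 / (1 + exp(d1 - d2)),
    so matching this to n10 / (n10 + n01) gives d2 - d1 = ln(n10 / n01).
    """
    return math.log(n10 / n01)


def jmle_difficulty_gap(n10, n01):
    """Unconditional (UCON) estimate of d2 - d1 for the same data.

    With the items centered at -d and +d, every score-1 person is
    estimated at 0 logits, and the item equation
    (n10 + n01) * P(success on item 1) = n10 gives d = ln(n10 / n01).
    The gap is 2d, i.e., twice the conditional estimate.
    """
    return 2.0 * math.log(n10 / n01)


def corrected_gap(raw_gap, test_length):
    """Shrink an unconditional estimate by the (L - 1) / L correction."""
    return raw_gap * (test_length - 1) / test_length


# Hypothetical counts: 60 persons pass only item 1, 40 pass only item 2.
raw = jmle_difficulty_gap(60, 40)
print(f"UCON gap      = {raw:.3f} logits")
print(f"corrected gap = {corrected_gap(raw, 2):.3f} logits")
print(f"CMLE gap      = {cmle_difficulty_gap(60, 40):.3f} logits")
```

With only two items the bias is at its worst, a factor of 2. As the number of items grows, (L-1)/L approaches 1 and the correction becomes small relative to the standard errors.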

The crucial question here is: Are fit statistics based on UCON estimates useful for quality control? We know that all estimated fit statistics reported by any computer program are "wrong". Empirical data and the estimates derived from them are never exactly in accord with the statistical theory underlying any fit statistic. But 25 years of practical experience with the UCON algorithm provide convincing evidence that UCON-based fit statistics are helpful and trustworthy.

Since it produces reasonable and useful estimates, the UCON algorithm is currently employed in Facets. I watch for statistically better, faster converging, more robust and less computationally intensive estimation algorithms. Each year the Facets estimation algorithm improves. Cees Glas's work on two-facet (person-item) estimation is remarkable and I hope he soon turns his attention to helping "many-facet" practitioners.

John also asserts that the lack of statistical independence in many-facet data invalidates any fit statistics. I agree that, when two raters rate the same performance, their ratings are not statistically independent in general. But, for independently-produced ratings by skilled and perceptive expert judges, the requirement for successful operation of a Rasch measurement model is not unconditional independence, but conditional or local independence. In practice this means: Do all the ratings awarded by judges to students on tasks have about the same amount of statistical independence from each other across judges, across students and across tasks? If so, their interdependence is used to estimate measures, and their relative independence is summarized by fit statistics. Thousands of analyses have been performed by scores of researchers that confirm the utility of Rasch measures and fit statistics derived from judge-awarded ratings.
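
One informal way to look at local independence is to correlate two raters' standardized residuals, i.e., what is left of their ratings after the model expectations have been removed. The sketch below assumes that the expected ratings and model variances have already been obtained from a Rasch analysis; the function names and the numbers are hypothetical, and this is not a procedure taken from Facets.

```python
import math


def standardized_residuals(observed, expected, variance):
    """(observed - expected) / sqrt(model variance), one per rating."""
    return [(x - e) / math.sqrt(v)
            for x, e, v in zip(observed, expected, variance)]


def pearson(u, v):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    ss_u = sum((a - mean_u) ** 2 for a in u)
    ss_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(ss_u * ss_v)


# Hypothetical ratings of five students by two raters, with the
# model-expected ratings and model variances for each rater.
rater_a = [3, 2, 4, 1, 3]
rater_b = [3, 3, 3, 2, 2]
expected_a = [2.6, 2.4, 3.5, 1.4, 2.9]
expected_b = [2.8, 2.6, 3.6, 1.7, 3.0]
variance_a = [0.8, 0.9, 0.6, 0.5, 0.8]
variance_b = [0.8, 0.9, 0.6, 0.6, 0.8]

z_a = standardized_residuals(rater_a, expected_a, variance_a)
z_b = standardized_residuals(rater_b, expected_b, variance_b)
print(f"residual correlation = {pearson(z_a, z_b):.2f}")
```

A residual correlation near zero is what local independence predicts. A strongly positive value for one pair of raters, and not for the others, would point to a local dependency worth investigating rather than to a failure of the whole dataset.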

Facets isn't perfect, but it's good enough for practical work and far better than any existing operational alternative. I thank John de Jong for provoking me to thought, and Stuart Luppescu for bringing his remarks to my attention. I welcome feedback.

Haberman S. J. 1977. Maximum likelihood estimates in exponential response models. Annals of Statistics 5: 815-841.

Estimation methods, statistical independence and global fit. de Jong J, Linacre JM. … Rasch Measurement Transactions, 1993, 7:2 p.296-7

The URL of this page is www.rasch.org/rmt/rmt72n.htm
