Edited excerpts from the LTEST-L e-mail forum:
John de Jong (language test developer and long-time Rasch practitioner):
No global model-data fit statistics are reported by the Facets Rasch analysis program for judge-awarded ratings, and there can't be, because
1) Facets uses unconditional (JMLE, also known as UCON) estimation;
2) the data violates the assumption of independence, i.e., the different scores given by different raters are not independent: they are given to the same students on the same tasks.
So, though the Facets program may be nice to tinker with, no generalizations from results can be made.
The problem of consistent estimation when data is missing has been partially solved by Cees Glas and applied in the OPLM computer program (RMT 6:4 p.253). OPLM uses Conditional Maximum Likelihood Estimation (CMLE) to estimate item parameters across large numbers of linked subtests. These item parameter estimates are then used to estimate person measures.
John M. Linacre (author of Facets):
Question: Why doesn't Facets report a global summary statistic for the fit of judge-awarded data to the Rasch measurement model?
Answer: It does when you tell it to! Simply define an extra facet, beyond students, tasks and raters. This facet contains only one element which is specified as a component of all data points. The Infit and Outfit statistics for this element are global mean-square fit statistics (chi-squares divided by degrees of freedom) with z-score-equivalent significance levels. Since this may seem awkward, I'll overcome my revulsion to global fit tests and let Facets report a chi-square test of the null hypothesis that all data fit the model (as the BIGSTEPS Rasch analysis program does). I predict, in advance, that data will never fit - because empirical data never fit a theoretical ideal!
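To make the pooled statistics concrete, here is a minimal sketch (not Facets' own code) of how Infit and Outfit mean-squares for a single all-inclusive element could be computed. It assumes dichotomous (0/1) ratings and measures that have already been estimated; the function names and the data layout are hypothetical, and the observation count stands in for the degrees of freedom.

```python
import math

def rasch_probability(ability, difficulty, severity):
    """Success probability under a dichotomous student-task-rater Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty - severity)))

def global_fit(observations, abilities, difficulties, severities):
    """Return (infit, outfit) mean-squares pooled over every observation.

    observations: list of (student, task, rater, score) tuples, score 0 or 1.
    The three dicts map element labels to measures in logits (assumed known).
    """
    sum_sq_residual = 0.0  # sum of squared raw residuals
    sum_variance = 0.0     # sum of model variances (statistical information)
    sum_z_squared = 0.0    # sum of squared standardized residuals
    for student, task, rater, score in observations:
        p = rasch_probability(abilities[student], difficulties[task], severities[rater])
        variance = p * (1.0 - p)
        residual = score - p
        sum_sq_residual += residual ** 2
        sum_variance += variance
        sum_z_squared += residual ** 2 / variance
    infit = sum_sq_residual / sum_variance      # information-weighted mean-square
    outfit = sum_z_squared / len(observations)  # unweighted mean-square
    return infit, outfit
```

If every measure is 0 logits, each expected value is 0.5 and both mean-squares equal 1.0 whatever the scores; surprising ratings on well-targeted or off-target observations push Infit or Outfit, respectively, above 1.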
Why am I repulsed by global fit tests - the neutron bombs of statistical practice? Because misfit is never global and never a statistical event; it is always local and idiosyncratic. Rejecting a dataset or a measurement model on global fit is equivalent to refusing to eat because most things are inedible. In practice, you evaluate what you eat one mouthful at a time, checking as you eat for local fit to your model for food edibility. Otherwise you starve and your species becomes extinct!
There is a crucial difference between measurement models and the descriptive regression models beloved by statisticians:
Descriptive models summarize a particular set of assumed-linear numbers as succinctly as possible. Any model will do. Global fit is a convenient way of choosing the opportunistically "best" model or rejecting this or that "bad" model. Global fit is the only criterion.
Rasch measurement models are different in intent and practice. They extract, from a set of ordered observations, linear measures as generalizable as statistically possible. Only a specific family of models can do this. There is no influence of global (or even local) fit on model choice. The influence of fit is on data choice - its selection, reorganization, reconception.
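For reference, the many-facet model for judge-awarded ratings is commonly written as below. The symbols are assumptions chosen for illustration and do not come from the forum exchange: $B_n$ is the ability of student $n$, $D_i$ the difficulty of task $i$, $C_j$ the severity of rater $j$, $F_k$ the difficulty of being rated in category $k$ rather than $k-1$, and $P_{nijk}$ the probability that student $n$ receives category $k$ from rater $j$ on task $i$.

```latex
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
```

Every parameter enters additively on the log-odds scale, which is what makes the resulting measures linear.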
John objects to fit statistics based on measures obtained with the joint (unconditional) maximum likelihood estimation algorithm (UCON). Like all estimation techniques, UCON has its strengths and weaknesses. Its strengths include efficient and versatile handling of systematic, random or unintentional missing data, and the ability to estimate measures from large, complex many-facet data sets incorporating diverse observation models. Its weaknesses are a minor degree of statistical inconsistency under artificial conditions, and statistical bias with some very small data sets and certain idiosyncratic data configurations.
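The way joint estimation copes with missing data can be sketched in a few lines. The following is an illustrative simplification, not the Facets implementation: a dichotomous two-facet (person-item) model, alternating damped Newton-Raphson updates, and a hypothetical `jmle` function. It assumes the data are connected and that no person or item has an all-0 or all-1 score, since such extreme scores have no finite estimate.

```python
import math

def jmle(responses, iterations=200):
    """Joint (unconditional) maximum likelihood for dichotomous Rasch data.

    responses: dict mapping (person, item) -> 0 or 1.
    Cells absent from the dict are missing and simply never enter the sums.
    Returns (ability, difficulty) dicts in logits, difficulties centered at 0.
    """
    ability = {p: 0.0 for p, _ in responses}
    difficulty = {i: 0.0 for _, i in responses}
    for _ in range(iterations):
        # Update items with persons held fixed, then persons with items fixed.
        for update_persons in (False, True):
            residual, information = {}, {}
            for (p, i), x in responses.items():
                prob = 1.0 / (1.0 + math.exp(difficulty[i] - ability[p]))
                key = p if update_persons else i
                residual[key] = residual.get(key, 0.0) + (x - prob)
                information[key] = information.get(key, 0.0) + prob * (1.0 - prob)
            for key in residual:
                # Newton-Raphson step, limited to 1 logit for stability.
                step = max(-1.0, min(1.0, residual[key] / information[key]))
                if update_persons:
                    ability[key] += step
                else:
                    difficulty[key] -= step
        # Fix the origin: shift both facets so mean item difficulty is 0.
        shift = sum(difficulty.values()) / len(difficulty)
        for i in difficulty:
            difficulty[i] -= shift
        for p in ability:
            ability[p] -= shift
    return ability, difficulty

# Hypothetical data: person "C" was never administered item "i3".
data = {
    ("A", "i1"): 1, ("A", "i2"): 1, ("A", "i3"): 0,
    ("B", "i1"): 1, ("B", "i2"): 0, ("B", "i3"): 0,
    ("C", "i1"): 1, ("C", "i2"): 0,
    ("D", "i1"): 0, ("D", "i2"): 1, ("D", "i3"): 1,
}
ability, difficulty = jmle(data)
```

Because each element's estimate depends only on the observations in which it actually took part, no imputation or complete rectangular design is required.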
How troublesome are these weaknesses? Statistical consistency is the property that, when applied to an infinite amount of data, the estimation algorithm will give a "right" answer. In fact, as implemented in Facets, UCON is consistent (Haberman 1977), because persons, items, etc. are all conceptually unlimited!
Bias is the degree to which an estimate based on a finite set of data is misleading. All Rasch estimation methods are biased, i.e., they produce a "wrong" answer (RMT 3:1 p.47). UCON is more biased than conditional methods, but this bias is negligibly small and always less than the standard errors of the estimated measures for real many-facet data sets of any useful size. Further, the bias can be easily corrected or eliminated for many simple situations, e.g., paired comparisons.
The crucial question here is: Are fit statistics based on UCON estimates useful for quality control? We know that all estimated fit statistics reported by any computer program are "wrong". Empirical data and the estimates derived from them are never exactly in accord with the statistical theory underlying any fit statistic. But 25 years of practical experience with the UCON algorithm provide convincing evidence that UCON-based fit statistics are helpful and trustworthy.
Since it produces reasonable and useful estimates, the UCON algorithm is currently employed in Facets. I watch for statistically better, faster converging, more robust and less computationally intensive estimation algorithms. Each year the Facets estimation algorithm improves. Cees Glas's work on two-facet (person-item) estimation is remarkable and I hope he soon turns his attention to helping "many-facet" practitioners.
John also asserts that the lack of statistical independence in many-facet data invalidates any fit statistics. I agree that, when two raters rate the same performance, their ratings are not statistically independent in general. But, for independently produced ratings by skilled and perceptive expert judges, the requirement for successful operation of a Rasch measurement model is not unconditional independence, but conditional or local independence. In practice this means: Do all the ratings awarded by judges to students on tasks have about the same amount of statistical independence from each other across judges, across students and across tasks? If so, their interdependence is used to estimate measures, and their relative independence is summarized by fit statistics. Thousands of analyses have been performed by scores of researchers that confirm the utility of Rasch measures and fit statistics derived from judge-awarded ratings.
Facets isn't perfect, but it's good enough for practical work and far better than any existing operational alternative. I thank John de Jong for provoking me to thought, and Stuart Luppescu for bringing his remarks to my attention. I welcome feedback.
Haberman S. J. 1977. Maximum likelihood estimates in exponential response models. Annals of Statistics 5: 815-841
Estimation methods, statistical independence and global fit. de Jong J, Linacre JM. … Rasch Measurement Transactions, 1993, 7:2 p.296-7
The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website, www.rasch.org.
The URL of this page is www.rasch.org/rmt/rmt72n.htm
Website: www.rasch.org/rmt/contents.htm