"We shall define the difficulty of a multiple choice test item as being a function of that proportion of individuals answering the item which knows which of the alternatives is the best answer. This definition involves the assumption that there is some objective criterion which determines that one particular alternative is a better answer to the item than any of the others."
Paul Horst, "The Difficulty of a Multiple Choice Test Item," Journal of Educational Psychology, xxiv (1933), 229-232.
Question: I have a fairly large sample of 5,000 subjects. As an experiment, I ran the calibration with all subjects and then again with the 500 worst-fitting subjects (OUTFIT mean-square range from 2 to 9.9) excluded. There was some change in parameter estimates and item fit, but it was not huge and not what I expected. This is comforting, but has this been the experience of others, or is it probably a quirk of my data or the large sample size?
Answer: Yes, your experience with trimming misfitting persons is typical. You are removing the most unpredictable, the noisiest part of the data, so the remaining data must have a slightly more orderly, closer-to-Guttman pattern. So expect to see a slight increase in the logit range of the measure estimates when you trim the data. But it is unusual for this slightly wider spread of the measures to have any substantive implications except where subject measures are adjacent to pre-set cut-points.
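As a rough illustration of the trimming step, here is a minimal Python sketch for the dichotomous Rasch model. It assumes person abilities and item difficulties (in logits) have already been estimated elsewhere. The function names, the data layout, and the example values are hypothetical, and the cutoff of 2.0 simply mirrors the lower end of the mean-square range in the question. Real software would also re-estimate the measures after trimming, which this sketch does not do.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of success under the dichotomous Rasch model (logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def outfit_mean_square(responses, ability, difficulties):
    """Outfit mean-square: the average squared standardized residual."""
    total = 0.0
    for x, d in zip(responses, difficulties):
        p = rasch_probability(ability, d)
        total += (x - p) ** 2 / (p * (1.0 - p))
    return total / len(responses)

def trim_misfitting(persons, difficulties, cutoff=2.0):
    """Keep only persons whose outfit mean-square is below the cutoff."""
    return [
        person for person in persons
        if outfit_mean_square(person["responses"], person["ability"], difficulties) < cutoff
    ]

# Hypothetical example: five items ordered from easy to hard.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
persons = [
    # Guttman-like pattern: succeeds on easy items, fails on hard ones.
    {"id": "A", "ability": 0.0, "responses": [1, 1, 1, 0, 0]},
    # Erratic pattern: fails easy items, succeeds on hard ones.
    {"id": "B", "ability": 0.0, "responses": [0, 0, 1, 1, 1]},
]

kept = trim_misfitting(persons, difficulties)
print([person["id"] for person in kept])
```

The erratic person produces large squared residuals on the easiest and hardest items, so that response string is the one removed, leaving the more orderly pattern behind.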
Question: Which one is most relevant to decide if an item is misfitting, the size of the mean-square statistic or its statistical significance?
Answer: When considering measurement dilemmas, it is always helpful to think of the equivalent situation in physical measurement. The statistical significance reports how certain we are that the measurement misrepresents the data - but not how serious the misrepresentation is. The mean-square reports the size of the misrepresentation, but not how certain we are that this isn't merely reflecting the random component in the data predicted by the Rasch model.
In physical measurement, we are usually more concerned about the size of any possible misrepresentation ("measure twice, cut once") than about how certain we are that there is a misrepresentation ("I'm sure I measured it right, so there's no need to measure it again!"). If size of misrepresentation is more important than certainty, then the size of the mean-square is more crucial than its significance. But much of statistics is based on hypothesis testing, where only the probability of misrepresentation is seriously considered.
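The difference between size and certainty can be sketched numerically. The Python example below converts a mean-square into an approximate standardized value using the Wilson-Hilferty cube-root transformation. As a simplifying assumption, the model standard deviation of the mean-square, q, is taken to be sqrt(2/df), as if the mean-square were a chi-square divided by its degrees of freedom; fit software typically derives q from the observations themselves. The function name and the example values are hypothetical.

```python
import math

def standardized_fit(mean_square, df):
    """Approximate standardized fit via the Wilson-Hilferty transformation.

    Assumes q = sqrt(2 / df), i.e. the mean-square behaves like a
    chi-square statistic divided by its degrees of freedom.
    """
    q = math.sqrt(2.0 / df)
    return (mean_square ** (1.0 / 3.0) - 1.0) * (3.0 / q) + q / 3.0

# The same modest mean-square becomes more "significant" as df grows.
for df in (10, 100, 1000):
    print("mean-square 1.2, df", df, "->", round(standardized_fit(1.2, df), 2))

# A large mean-square with very few observations may not reach significance.
print("mean-square 2.0, df 5 ->", round(standardized_fit(2.0, 5), 2))
```

Under these assumptions, a mean-square of 1.2 (a small misrepresentation) passes the conventional 1.96 threshold once there are enough observations, while a mean-square of 2.0 (a large misrepresentation) based on only five observations does not.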
RMT 21:2 Notes and Quotes. Rasch Measurement Transactions, 2007, 21:2
The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website, www.rasch.org.
The URL of this page is www.rasch.org/rmt/rmt212d.htm
Website: www.rasch.org/rmt/contents.htm