DIF detection: Rasch versus Mantel-Haenszel

Schulz et al. (1989) compare the Rasch (RM) and Mantel-Haenszel (MH) procedures for detecting differential item functioning (DIF) (also RMT 1989 3:2 51-53). The RM procedure, following Wright et al. (1976), was implemented with computer programs MSCALE and LINK (Schulz 1984). The MH procedure (Holland and Thayer 1988) was implemented with program MHDIP (Raju 1988).

Sensitivity to DIF: With small groups, MH was significantly less sensitive to DIF than RM. MH indicates significance with null-hypothesis chi-squares, but the observed variance of these chi-squares was less than modelled. Male/female DIF detected by both RM and MH when the groups were N=1000 was lost by MH when the groups were randomly reduced to N=100, but was still detected by RM.

Reliability: Contrary to MH claims, empirical results show RM to be more reliable than MH when groups are small (N=100 to 200), and as reliable as MH when groups are large (N>300).

Validity: When groups are comparable in achievement, RM and MH detect "the same thing". Since RM and MH DIF indices from the male/female contrast correlate at their statistical maximum, .99, one cannot explain the greater sensitivity and reliability of RM as due to the two methods detecting "something different".

DIF versus Between-Group Achievement Differences: DIF must not be confused with real group differences in achievement. To be acceptable, a DIF procedure must produce "no net DIF" over items. Three of the four MH variants fail this criterion. The MH variants differ in 1) whether the studied item is included in or excluded from the total score used for matching, and 2) whether matching is fat or fine (fat: seven or fewer levels of total score; fine: all possible levels of total score). When contrast groups differ significantly in achievement, RM yields "no net DIF". The only MH variant which yields "no net DIF" is the one which includes the studied item in the total score and uses all levels of total score for matching. This is the MH variant most similar to RM. When contrast groups differ in achievement, the other three MH variants yield net DIF across items that is significantly different from zero (p<.001).
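The core computation the MH variants share can be sketched as follows. This is a minimal illustration of the standard Mantel-Haenszel chi-square with continuity correction, not the MHDIP program used in the study; the function and variable names are my own. "Fat" versus "fine" matching, and inclusion or exclusion of the studied item in the total score, determine how the per-level tables are built before they reach this function.

```python
import math

def mh_chi_square(tables):
    """Mantel-Haenszel chi-square (with continuity correction) for one
    studied item. `tables` holds one 2x2 table per matching level of
    total score, as a tuple (a, b, c, d):
      a = reference group correct,  b = reference group incorrect,
      c = focal group correct,      d = focal group incorrect.
    Levels with an empty margin carry no information and are skipped."""
    sum_a = sum_e = sum_v = 0.0
    for a, b, c, d in tables:
        n_ref, n_foc = a + b, c + d      # group sizes at this level
        m1, m0 = a + c, b + d            # totals correct / incorrect
        t = n_ref + n_foc                # everyone at this level
        if t < 2 or 0 in (n_ref, n_foc, m1, m0):
            continue
        sum_a += a
        sum_e += n_ref * m1 / t          # expected `a` under no DIF
        sum_v += n_ref * n_foc * m1 * m0 / (t * t * (t - 1))
    # Refer the result to a chi-square distribution with 1 df
    return (abs(sum_a - sum_e) - 0.5) ** 2 / sum_v
```

For example, two matching levels in which the reference group answers correctly more often than matched focal examinees (illustrative numbers) yield a chi-square well above the 3.84 critical value at the .05 level.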

DIF versus Item-by-Achievement Interactions: DIF is intended to detect item-by-group interaction exclusively. When groups also differ in achievement, some DIF indices confound item-by-group and item-by-achievement interactions. The correlation between RM and MH DIF indices was at its theoretical maximum (.99) for equal-achievement contrasts, but substantially less (r=.81) than the theoretical maximum (.98) for unequal-achievement contrasts. For unequal-achievement contrasts, RM and MH DIF procedures no longer detect "the same thing".

The differences between RM and MH DIF indices estimated from unequal-achievement contrasts are systematically related to item-by-achievement interactions of the kind detected by RM item fit statistics. Highly discriminating (low infit) items are biased in favor of high achievers, while poorly discriminating (high infit) items are biased in favor of low achievers. Thus RM DIF indices correlate positively with RM infit statistics (r=.32), but, inexplicably, MH DIF indices correlate negatively (r=-.32).

Recommendations: When contrast groups differ in achievement, construct achievement-matched samples of the largest possible size. When contrast groups are achievement-matched, RM item Z-scores:

Z12 = (b1 - b2) / sqrt(s1^2 + s2^2),

where b1 and b2 are the item's difficulty estimates in the two groups and s1 and s2 are the standard errors of those estimates,

are more sensitive to DIF than MH chi-squares and at least as reliable. A practical advantage of RM is that it measures DIF in the same units as person achievement.
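As a sketch, the Z-score above translates directly into code; the function name and the illustrative numbers below are mine, not from the study.

```python
import math

def rasch_dif_z(b1, s1, b2, s2):
    """Rasch DIF z-statistic: the standardized difference between one
    item's difficulty estimates (in logits) calibrated separately in
    two achievement-matched groups, where s1 and s2 are the standard
    errors of those estimates."""
    return (b1 - b2) / math.sqrt(s1 ** 2 + s2 ** 2)

# Example: an item 0.3 logits harder for group 1 than for group 2,
# each estimate with a standard error of 0.1 logits.
print(rasch_dif_z(0.5, 0.1, 0.2, 0.1))  # about 2.12
```

Because the numerator is a difference of difficulty estimates in logits, the size of the DIF is read directly on the same scale as person measures, which is the practical advantage noted above.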

Holland PW & Thayer DT 1988 Differential item performance and the Mantel-Haenszel procedure. In H Wainer & H Braun (Eds.), Test Validity. Hillsdale, NJ: Lawrence Erlbaum

Raju NS 1988 MHDIP [computer program]. Chicago: Psychology Department, Illinois Institute of Technology

Schulz EM 1984 LINK: a program for comparing paired Rasch estimates and linking tests. Chicago: MESA Press

Schulz EM, Perlman CP, Rice WK, Wright BD 1989 Empirical comparison of Rasch and Mantel-Haenszel procedures. Paper presented at the AERA annual meeting

Wright BD, Mead RJ, Draba R 1976 Detecting and correcting test item bias with a logistic model. Chicago: MESA

DIF detection: Rasch versus Mantel-Haenszel, E M Schulz … Rasch Measurement Transactions, 1990, 4:2 p. 107


