Differential Rater Functioning

Monitoring the quality of ratings obtained in rater-mediated assessments is of major importance (Engelhard, 2002). One area of concern is differential rater functioning (DRF). DRF focuses on whether raters show evidence of exercising differential severity/leniency when rating students within different subgroups. For example, a rater may rate male students' essays (or female students' essays) more severely or leniently than expected. Ideally, each rater's level of severity/leniency should be invariant across gender subgroups. Residual analyses of raters flagged with DRF can be used to provide a detailed exploration of potential rater biases, and they can also form the basis for conducting a mixed-methods study (Creswell & Plano-Clark, 2007).

To illustrate the use of residual analyses to examine DRF, data from Engelhard and Myford (2003) are used. The purpose of the original study was to examine the rating behavior of raters who scored essays written for the Advanced Placement® English Literature and Composition (AP ELC) exam. Data from the 1999 AP ELC exam were analyzed using the FACETS model. One section of their report focused on DRF among raters scoring the AP ELC exam.

A rater × student gender bias analysis was conducted to determine whether raters were rating essays composed by male and female students in a similar fashion. Were some raters more prone to gender bias than others? The FACETS analyses identified 18 raters who, based on statistical criteria, may have exhibited DRF related to student gender.

Table 1. Summary of Differential Rater Functioning Statistics (Student-Gender Interactions) for Rater 108

* |Z| ≥ 2.00

Based on the overall fit statistics (INFIT MNSQ = 1.1, OUTFIT MNSQ = 1.1), Rater 108 did not appear to be rating in an unusual fashion. However, when the interaction between rater and student gender is specifically examined (Table 1), a different story emerges. Rater 108 tended to rate the male students' essays higher on average (5.33) than expected (4.56). For female students, the observed average (4.83) is lower than the expected average (5.13). In summary, there is a statistically significant gender difference in the rater's severity (z = 2.39).
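The arithmetic behind this comparison can be sketched in a few lines of Python. The observed and expected averages, the z statistic, and the |Z| ≥ 2.00 criterion are taken from the text; the variable and function names are illustrative assumptions, and the z statistic is simply copied from the report rather than recomputed, since FACETS estimates it on the logit scale from the full data set.

```python
# Values reported in the text for Rater 108 (Table 1).
reported = {
    "male": {"observed": 5.33, "expected": 4.56},
    "female": {"observed": 4.83, "expected": 5.13},
}
Z_REPORTED = 2.39   # z statistic quoted in the text (not recomputed here)
Z_CRITERION = 2.00  # flagging criterion from the table note


def mean_residual(cell):
    """Observed minus expected average rating, in raw score units."""
    return round(cell["observed"] - cell["expected"], 2)


# Positive residual: ratings higher than expected (rater more lenient).
residuals = {group: mean_residual(cell) for group, cell in reported.items()}

# Difference between the two subgroup residuals, in raw score units.
contrast = round(residuals["male"] - residuals["female"], 2)

# A rater is flagged when the absolute z statistic meets the criterion.
flagged = abs(Z_REPORTED) >= Z_CRITERION

print(residuals, contrast, flagged)
```

Under these reported values, the mean residual is positive for male students and negative for female students, which is the pattern the bias analysis flags.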

Figure 1. Rater 108's rating profile

Figure 1 shows that Rater 108 assigned higher-than-expected ratings to 8 of the 9 male students' essays, but lower-than-expected ratings to 13 of the 23 female students' essays. This highlights the importance of exploring not only mean differences between observed and expected ratings within each subgroup category but also the variability and spread of residuals within subgroups. Ultimately, DRF involves looking at discrepancies between observed and expected ratings at the individual level. As pointed out many years ago by Wright (1984, p. 285),
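A residual profile of this kind can be tabulated per subgroup, as in the minimal Python sketch below. The function name `residual_profile`, the record layout, and the toy values are all assumptions made for illustration; the toy values are invented and are not the Rater 108 data.

```python
from collections import defaultdict
from statistics import mean, pstdev


def residual_profile(records):
    """Summarize observed-minus-expected residuals within each subgroup.

    records: iterable of (subgroup, observed_rating, expected_rating).
    """
    by_group = defaultdict(list)
    for subgroup, observed, expected in records:
        by_group[subgroup].append(observed - expected)

    summary = {}
    for subgroup, residuals in by_group.items():
        summary[subgroup] = {
            "n": len(residuals),
            "n_above": sum(r > 0 for r in residuals),  # higher than expected
            "n_below": sum(r < 0 for r in residuals),  # lower than expected
            "mean_residual": mean(residuals),
            "sd_residual": pstdev(residuals),  # spread within the subgroup
        }
    return summary


# Toy values invented for illustration only; NOT the Rater 108 data.
toy = [
    ("M", 6, 4.5), ("M", 5, 4.8), ("M", 4, 4.6),
    ("F", 4, 5.2), ("F", 6, 5.0), ("F", 5, 5.1), ("F", 3, 4.9),
]
profile = residual_profile(toy)
for subgroup, stats in profile.items():
    print(subgroup, stats)
```

Reporting the counts and the spread alongside the mean makes it visible when a subgroup-level difference is driven by most essays or by only a few.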

"bias found for groups is never uniformly present among members of the groups or uniformly absent among those not in the group. For the analysis of item bias to do individuals any good, say, by removing the bias from their measures, it will have to be done on the individual level."

In rater-mediated assessments, it is very important to conduct group-level analyses of DRF, but caution is needed if routine statistical adjustments are made for rater severity. The full interpretation of these effects requires a detailed examination of residuals for each rater. Using a mixed-methods framework, suspect raters can then be investigated in more detail using case studies and other qualitative analyses.

George Engelhard, Jr., Emory University

Creswell, J.W., & Plano-Clark, V.L. (2007). Designing and conducting mixed methods research. Sage.

Engelhard, G. (2002). Monitoring raters in performance assessments. In G. Tindal & T. Haladyna (Eds.), Large-scale Assessment Programs for ALL Students: Development, Implementation, and Analysis (pp. 261-287). Mahwah, NJ: Erlbaum.

Engelhard, G., & Myford, C.M. (2003). Monitoring rater performance in the Advanced Placement English Literature and Composition Program with a many-faceted Rasch model. NY: College Entrance Examination Board. http://professionals.collegeboard.com/research/pdf/cbresearchreport20031_22204.pdf

Wright, B.D. (1984). Despair and hope for educational measurement. Contemporary Education Review, 3(1), 281-285. www.rasch.org/memo41.htm

Differential Rater Functioning. … George Engelhard, Jr., Rasch Measurement Transactions, 2008, 21:3 p. 1124

The URL of this page is www.rasch.org/rmt/rmt213f.htm
