Variable maps provide useful tools for communicating the meaning of constructs in the human sciences. It has not been recognized that differential item functioning (DIF) can also be represented in a meaningful way on a variable map. In this case, the underlying continuum represents the differences between subgroups with comparable levels of achievement across a set of test items.
Data from Engelhard, Wind, Kobrin, and Chajewski (2012) are used to illustrate the concept of a DIF map. DIF was calculated as the difference in logits between separate item calibrations within subgroups based on the Rasch model. Two DIF maps are shown in Figures 1 (gender) and 2 (best language). The horizontal bars reflect the magnitude and direction of the differences between item calibrations for the comparison groups. The subset classification and item ID number for each SAT-W item are indicated on the DIF maps (SC = Sentence Correction, U = Usage, RIC = Revision in Context, and Rating = two separate ratings for the essay). Several rules of thumb can be used for interpreting the substantive significance of DIF, such as the half-logit rule proposed by Draba (1977). However, the reader is reminded that DIF maps stress the idea that DIF is a continuous variable, and that arbitrary cut points may not go far enough in aiding the substantive interpretation of DIF.
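The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the item labels and logit values are invented for the example, the function names are hypothetical, and the two sets of calibrations are assumed to be already expressed on a common scale (for example, each centered at zero). DIF is taken as the first group's calibration minus the second group's, matching the direction used in the figure labels, and the half-logit rule is applied only as a descriptive flag.

```python
# Hypothetical item calibrations in logits (invented values, not SAT-W data).
# Keys are item labels; values are (calibration in group A, calibration in group B).
CALIBRATIONS = {
    "SC-1": (-0.40, -0.25),
    "U-2": (0.10, 0.70),
    "RIC-3": (0.55, 0.35),
    "Rating-1": (-0.25, -0.80),
}


def dif_contrasts(calibrations):
    """Return {item: group A calibration minus group B calibration} in logits."""
    return {item: a - b for item, (a, b) in calibrations.items()}


def flag_half_logit(contrasts, cut=0.5):
    """Items whose absolute DIF is at least `cut` logits (half-logit rule of thumb)."""
    return [item for item, d in contrasts.items() if abs(d) >= cut]


def text_dif_map(contrasts, width=20, scale=1.0):
    """Draw one horizontal bar per item; bars left of '|' show negative DIF."""
    lines = []
    for item, d in contrasts.items():
        # Bar length is proportional to |DIF|, capped at `width` characters.
        n = min(width, round(abs(d) / scale * width))
        left = ("#" * n).rjust(width) if d < 0 else " " * width
        right = "#" * n if d > 0 else ""
        lines.append(f"{item:>10} {left}|{right.ljust(width)} {d:+.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    contrasts = dif_contrasts(CALIBRATIONS)
    print(text_dif_map(contrasts))
    print("Flagged by half-logit rule:", flag_half_logit(contrasts))
```

Printing the signed contrast next to each bar keeps the continuous value visible, so the half-logit flag supplements the map instead of replacing it.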
Figure 1 illustrates DIF in terms of gender subgroups. As can be seen in this figure, DIF appears to vary across item subsets, although the magnitudes of the gender differences are generally small. None of the items exhibits gender DIF based on the half-logit rule. Data were also collected on whether English was reported by the students as their best language. The magnitude and directionality of DIF for best language are shown in Figure 2, and they differ somewhat from the DIF patterns shown in Figure 1. Since the SAT-W is designed to measure academic English, it is not surprising that several items exhibit DIF related to best language. For example, the English Best Language group has higher scores on both essay ratings, as would be expected given the purpose of the assessment.
DIF analyses have become a routine part of the test development process (Zumbo, 2007). A variety of methods have been proposed for conducting DIF analyses, and all of the methods yield continuous indicators that can be used to create DIF maps. Rasch-based approaches (Wright, Mead, & Draba, 1976) are used here to guide the creation of the DIF maps.
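Because each method yields a continuous indicator, the raw logit difference can be paired with a second continuous quantity on the same map. One commonly used option with separate Rasch calibrations is a standardized difference: the contrast divided by its joint standard error. The sketch below assumes that form. The source does not state which statistic, if any, was computed beyond the logit difference, and the calibrations and standard errors shown are invented.

```python
import math


def standardized_dif(d_a, se_a, d_b, se_b):
    """Contrast between two calibrations divided by its joint standard error."""
    return (d_a - d_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Invented values: calibrations of -0.25 and -0.80 logits, each with SE 0.10.
z = standardized_dif(-0.25, 0.10, -0.80, 0.10)
print(f"standardized difference = {z:.2f}")
```

Like the logit difference, this value is continuous, so it can be plotted as a bar length without imposing a cut point.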
[Acknowledgement: The College Board provided support for this research. Researchers are encouraged to freely express their professional judgments. Therefore, points of view or opinions stated in College Board supported research do not necessarily represent official College Board position or policy.]
Stefanie A. Wind and George Engelhard, Jr.
Draba, R. E. (1977). The identification and interpretation of item bias. (Research Memorandum No. 25). Chicago: Statistical Laboratory, Department of Education, University of Chicago.
Engelhard, G., Kobrin, J., Wind, S. A., & Chajewski, M. (2012). Differential item and person functioning in large-scale writing assessments within the context of the SAT Reasoning Test. Paper presented at the annual meeting of the American Educational Research Association, Vancouver, BC, Canada.
Wright, B. D., Mead, R., & Draba, R. (1976). Detecting and correcting test item bias with a logistic response model. (Research Memorandum No. 22). Chicago: University of Chicago, MESA Psychometric Laboratory.
Zumbo, B. D. (2007). Three generations of DIF analyses: Considering where it has been, where it is now, and where it is going. Language Assessment Quarterly, 4(2), 223-233.
Figure 1. DIF Map for Gender (Males - Females).
Figure 2. DIF Map for Best Language (English Best Language - Another Language).
Item subsets: SC = Sentence Correction; U = Usage; RIC = Revision in Context; Ratings = two essay ratings.
Mapping Differential Item Functioning (DIF Maps), S. A. Wind and G. Engelhard, Jr., Rasch Measurement Transactions, 2012, 26:1, 1356-7
The URL of this page is www.rasch.org/rmt/rmt261d.htm