Mapping Rasch-Based Measurement onto the Argument-Based Validity Framework

This paper integrates the Rasch validity model (Wright & Stone, 1988, 1999) into the argument-based validity framework (Kane, 1992, 2004). Rasch validity subsumes fit validity and order validity. Order validity has two subcategories: meaning validity (derived from the calibration of test variables) and utility validity (based on the calibration of persons to implement criterion validity). Fit validity concerns the consistency of response patterns. It has three sources: 1) analysis of residuals, i.e., the differences between the Rasch model's expectations and the observed responses, which yields response validity; 2) analysis of item fit, which can help in revising the test and yields item function validity; and 3) analysis of person fit, which can help in diagnosing test takers whose performance does not fit our expectations and yields person performance validity.
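
For the dichotomous case, the residuals mentioned in the paragraph above can be written out as follows. This is the standard dichotomous Rasch formulation rather than notation taken from this paper; the symbols $\theta_n$ (ability of person $n$), $\delta_i$ (difficulty of item $i$), and $x_{ni}$ (the observed 0/1 response) are assumptions introduced here for illustration.

```latex
P_{ni} = \Pr(x_{ni} = 1 \mid \theta_n, \delta_i)
       = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)},
\qquad
y_{ni} = x_{ni} - P_{ni},
\qquad
z_{ni} = \frac{y_{ni}}{\sqrt{P_{ni}\,(1 - P_{ni})}}
```

Here $y_{ni}$ is the raw residual and $z_{ni}$ the standardized residual. Item fit and person fit statistics summarize these residuals over persons and over items, respectively.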

The argument-based approach to validity was proposed by Kane (1992). This framework has two phases: the interpretive argument and the validity argument. First, the interpretive argument (IA) is laid out as a series of statements; the validity argument (VA) then investigates the efficacy of the IA. Figure 1 displays a framework for using Rasch-based measurement to build VAs. Observation, generalization, explanation, and extrapolation are the four major inferences that lead from one validation stage to the next. Warrants comprise any data that back up the postulated inferences. Backings give legitimacy and authority to warrants, e.g., the theoretical assumptions behind the posited warrants.

Warrants for the observation inference in a Rasch-based study can include standardization of the scoring process and conversion of raw scores into measures of item difficulty and person ability. Standardization ensures the uniformity of the scoring procedure. Converting raw scores to interval-level measures in the Rasch analysis is essential because the distances between measures are then real and item difficulty can be compared directly with person ability or trait level. The Rating Scale Model (Andrich) and the Partial Credit Model (Masters) help in further investigating the efficacy of the measurement scales. To generalize the observed scores to expected scores, person and item reliability and person and item separation indexes are proposed as warrants, with the theories behind them as backings.
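
The reliability and separation warrants can be illustrated with a minimal Python sketch. It assumes the commonly used definitions in which reliability is the ratio of error-adjusted ("true") variance to observed variance and separation is the ratio of the true standard deviation to the root-mean-square error. The function name `separation_and_reliability`, the use of population variance, and the input values are hypothetical choices for illustration, not part of the framework described here.

```python
import math
from statistics import pvariance


def separation_and_reliability(measures, standard_errors):
    """Return (separation, reliability) for Rasch measures given in logits.

    measures:         estimated person (or item) measures
    standard_errors:  the standard error of each measure
    """
    observed_var = pvariance(measures)  # population variance (an assumption)
    if observed_var == 0:
        raise ValueError("measures have no spread; reliability is undefined")
    # Mean-square error: the average squared standard error.
    error_var = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    # "True" variance is observed variance with error variance removed.
    true_var = max(observed_var - error_var, 0.0)
    reliability = true_var / observed_var
    separation = math.sqrt(true_var / error_var)
    return separation, reliability


# Made-up person measures and standard errors, for illustration only.
g, r = separation_and_reliability([-2, -1, 0, 1, 2], [0.5] * 5)
print(f"separation = {g:.2f}, reliability = {r:.3f}")
```

With these made-up numbers the reliability is 0.875 and the separation is about 2.65. Under these definitions the two indexes carry the same information, since separation equals the square root of reliability / (1 - reliability).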

The explanation inference bears on the theoretical construct under measurement. Item and person infit and outfit analyses are the first warrants. Backings include the theoretical concepts of fit validity. Investigating item and person fit provides information about construct-irrelevant factors. The Rasch Principal Component Analysis of Residuals (PCAR) investigates construct irrelevancies in the measure (Linacre, 2005).
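
As a rough illustration of the infit and outfit warrants, the sketch below computes the two mean-square statistics for one dichotomous item, assuming person abilities and the item difficulty have already been estimated. It uses the usual definitions (outfit as the mean of squared standardized residuals, infit as the information-weighted version); the function names and the example data are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_fit(responses, abilities, difficulty):
    """Return (infit_mnsq, outfit_mnsq) for one item.

    responses:  0/1 responses of each person to the item
    abilities:  estimated ability (logits) of each person, same order
    difficulty: estimated difficulty (logits) of the item
    """
    squared_residual_sum = 0.0
    information_sum = 0.0
    standardized_sq_sum = 0.0
    for x, theta in zip(responses, abilities):
        p = rasch_probability(theta, difficulty)
        variance = p * (1.0 - p)          # model variance of the response
        squared_residual = (x - p) ** 2
        squared_residual_sum += squared_residual
        information_sum += variance
        standardized_sq_sum += squared_residual / variance
    infit = squared_residual_sum / information_sum    # information-weighted
    outfit = standardized_sq_sum / len(responses)     # unweighted mean
    return infit, outfit


# Made-up data: five persons answering one item of difficulty 0.5 logits.
infit, outfit = item_fit([1, 0, 1, 1, 0], [1.2, -0.8, 0.3, 2.0, 1.5], 0.5)
print(f"infit = {infit:.2f}, outfit = {outfit:.2f}")
```

Both statistics have an expected value of 1 when the data fit the model. Because outfit is unweighted, it reacts more strongly to unexpected responses from persons far from the item's difficulty. The same computation applied across items for one person gives person fit.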

Then, we can extrapolate from the observations to the target scores. The extrapolation inference has an element of subjectivity. Kane, Crooks, and Cohen (1999) indicated that content analysis in the generalization inference can support extrapolation provided that the universe of generalization corresponds to the target domain. Kane (1992, 2004) also proposed the use of criterion-referenced evidence. However, even if this method is used, it may not yield sufficient support for extrapolation. Utility and meaning validity can help here again. The confirmed hierarchy of item difficulty is assessed against the criteria we have set. Observations that do not conform to the theoretical expectations or criteria may be flawed. By the same token, we can anticipate how persons with different characteristics will respond to a particular question. Differential item functioning (DIF) is also useful. DIF occurs when groups of examinees with the same ability have different probabilities of answering an item correctly because of their background (sex, age, ethnicity, etc.). Background is the major criterion because it concerns test takers directly. In this light, background is internal to the assessment.
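
One simple way to screen for DIF, sketched below, is to calibrate the item separately in two groups and compare the two difficulty estimates relative to their joint standard error. This is an illustrative approach under stated assumptions, not necessarily the procedure intended in this framework; the function name `dif_contrast` and the input values are hypothetical, and the cut-offs used to flag an item are left to the analyst.

```python
import math


def dif_contrast(difficulty_a, se_a, difficulty_b, se_b):
    """Return (contrast, t) comparing one item's difficulty in two groups.

    difficulty_a, difficulty_b: item difficulty (logits) estimated per group
    se_a, se_b:                 standard errors of those estimates
    """
    contrast = difficulty_a - difficulty_b
    joint_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return contrast, contrast / joint_se


# Made-up estimates for one item in two groups, for illustration only.
contrast, t = dif_contrast(0.8, 0.3, 0.2, 0.4)
print(f"DIF contrast = {contrast:.2f} logits, t = {t:.2f}")
```

A large contrast together with a large absolute t value suggests that the item is harder for one group than for the other at the same ability level, which would then need to be interpreted against the background variable in question.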

In the current Rasch-based framework, the Rasch analysis is further supported by the theoretical background of the test. This implies that psychometric models should not be dissociated from the psychological and cognitive theories underlying any testing device (Embretson & Gorin, 2001; Wright & Stone, 1999). It is certainly difficult and expensive for academic institutions to carry out many studies in support of the validity arguments of a device (see McNamara, 2003). The Rasch-based validity argument framework can provide reliable and efficient evidence at a lower cost than accumulating evidence from different studies.

Figure 1. Supporting validity arguments using Rasch analysis.

S. Vahid Aryadoust
NIE, NTU
Singapore

Embretson, S., & Gorin, J. (2001). Improving construct validity with cognitive psychology principles. Journal of Educational Measurement, 38(4), 343-368.

Kane, M. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527-535.

Kane, M. (2004). Certification testing as an illustration of argument-based validation. Measurement: Interdisciplinary Research and Perspectives, 2, 135-170.

Kane, M., Crooks, T., & Cohen, A. (1999). Validating measures of performance. Educational Measurement: Issues and Practice, 18(2), 5-17.

Wright, B. D., & Stone, M. H. (1988). Validity in Rasch measurement. University of Chicago: Research Memorandum No. 55.

Aryadoust, S. V. (2009). Mapping Rasch-Based Measurement onto the Argument-Based Validity Framework. Rasch Measurement Transactions, 23(1), 1192-1193.


The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website, www.rasch.org.
