Mapping Rasch-Based Measurement onto the Argument-Based Validity Framework

This paper integrates the Rasch validity model (Wright & Stone, 1988, 1999) into the argument-based validity framework (Kane, 1992, 2004). Rasch validity subsumes fit validity and order validity. Order validity has two subcategories: meaning validity (which originates from the calibration of test variables) and utility validity (based on the calibration of persons to implement criterion validity). Fit validity concerns the consistency of response patterns and draws on three analyses: 1) analysis of residuals, i.e., the differences between the observed responses and the expectations of the Rasch model; 2) analysis of item fit, which can help in revising the test; and 3) analysis of person fit, which can help in diagnosing test takers whose performance does not fit our expectations. These analyses yield response validity, item function validity, and person performance validity, respectively.
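As a minimal sketch of the residual analysis mentioned above, the following Python fragment computes the model expectation and the standardized residual for a single dichotomous response. It assumes the dichotomous Rasch model with person ability and item difficulty expressed in logits; the function names are illustrative and are not taken from any Rasch software.

```python
import math


def rasch_probability(ability, difficulty):
    """Expected score (probability of success) under the dichotomous Rasch model.

    Both arguments are assumed to be in logits.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def standardized_residual(response, ability, difficulty):
    """Observed response (0 or 1) minus its expectation, scaled by the model SD."""
    expected = rasch_probability(ability, difficulty)
    variance = expected * (1.0 - expected)  # binomial variance of one response
    return (response - expected) / math.sqrt(variance)


# A person whose ability equals the item difficulty has a 50% chance of success.
print(rasch_probability(0.0, 0.0))
# An unexpected success by a low-ability person produces a large positive residual.
print(standardized_residual(1, -2.0, 1.0))
```

Large absolute residuals flag responses that the model did not expect; summarizing them by item or by person gives the fit statistics discussed later in the paper.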

The argument-based approach to validity was proposed by Kane (1992). This framework has two phases: the interpretive argument and the validity argument. First, the interpretive argument (IA) is set out as a series of statements; the validity argument (VA) then investigates the efficacy of the IA. Figure 1 displays a framework for using Rasch-based measurement to build VAs. Observation, generalization, explanation, and extrapolation are the four major inferences that carry the argument from one validation stage to the next. Warrants comprise any data that back up the postulated inferences. Backings give legitimacy and authority to warrants, e.g., the theoretical assumptions behind the posited warrants.

Warrants for the observation inference in a Rasch-based study can include standardization of the scoring process and conversion of raw scores into measured scores and ability estimates. Standardization guarantees the uniformity of the scoring procedure. Converting raw scores into interval-level measures in the Rasch analysis is essential because the distances between measures are then real, and item difficulty can be compared directly with person ability or trait level. The Rating Scale Model (Andrich) and the Partial Credit Model (Masters) help to further investigate the efficacy of the measurement scales. To generalize the observed scores to expected scores, person and item reliability and person and item separation indexes are proposed as warrants, with the theories behind them as backings.
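The reliability and separation indexes named above can be sketched as follows. This is an illustration under stated assumptions, not the routine of any particular Rasch program: it takes a list of measures (persons or items, in logits) with their standard errors, treats the mean squared standard error as the error variance, and uses the sample variance (n - 1) for the observed variance. The function name and inputs are hypothetical.

```python
import math


def separation_and_reliability(measures, standard_errors):
    """Return (separation, reliability) for a set of Rasch measures.

    reliability = true variance / observed variance
    separation  = true SD / root mean square error
    """
    n = len(measures)
    mean = sum(measures) / n
    # Assumption of this sketch: sample variance with n - 1 in the denominator.
    observed_var = sum((m - mean) ** 2 for m in measures) / (n - 1)
    error_var = sum(se ** 2 for se in standard_errors) / n
    true_var = max(observed_var - error_var, 0.0)  # cannot be negative
    reliability = true_var / observed_var
    separation = math.sqrt(true_var / error_var)
    return separation, reliability


separation, reliability = separation_and_reliability(
    [-2.0, -1.0, 0.0, 1.0, 2.0], [0.5] * 5
)
print(separation, reliability)
```

The two indexes carry the same information on different scales: separation equals the square root of reliability / (1 - reliability), so a reliability of 0.9 corresponds to a separation of 3.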

The explanation inference bears on the theoretical construct under measurement. Item and person infit and outfit analyses are the first warrants. Backings include the theoretical concepts of fit validity. Investigating item and person fit provides information about construct-irrelevant factors. The Rasch Principal Component Analysis of Residuals (PCAR) investigates construct irrelevancies in the measure (Linacre, 2005).
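A minimal sketch of the infit and outfit mean-square statistics for one dichotomous item is given below. It assumes that person abilities and the item difficulty have already been estimated in logits; the function name is illustrative. Outfit is the unweighted mean of the squared standardized residuals, while infit weights each squared residual by its model variance, which makes it less sensitive to unexpected responses from persons far from the item.

```python
import math


def item_fit_mean_squares(responses, abilities, difficulty):
    """Return (infit, outfit) mean squares for one dichotomous item."""
    squared_residuals = []
    variances = []
    for response, ability in zip(responses, abilities):
        expected = 1.0 / (1.0 + math.exp(-(ability - difficulty)))
        variances.append(expected * (1.0 - expected))
        squared_residuals.append((response - expected) ** 2)
    # Outfit: mean of squared standardized residuals (outlier-sensitive).
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: information-weighted mean square.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


# The first person (ability -3 logits) unexpectedly succeeds on the item.
infit, outfit = item_fit_mean_squares([1, 1, 0, 1], [-3.0, 0.0, 0.0, 3.0], 0.0)
print(infit, outfit)
```

Both statistics have an expected value of 1 when the data fit the model. In the example, the single off-target surprise inflates outfit much more than infit.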

Then, we can extrapolate from the observations to the target scores. The extrapolation inference has an element of subjectivity. Kane, Crooks, and Cohen (1999) indicated that content analysis in the generalization inference can support extrapolation, provided that the universe of generalization corresponds to the target domain. Kane (1992, 2004) also proposed the use of criterion-referenced evidence. However, even if this method is used, it may not yield sufficient support for extrapolation. Utility and meaning validity can help here again. The confirmed hierarchy of item difficulty is assessed against the criteria we have set. Observations that do not conform to the theoretical expectations or criteria may be flawed. By the same token, we can anticipate how persons with different characteristics will respond to a particular question. Differential item functioning (DIF) is also useful. DIF occurs when groups of examinees of comparable ability have different probabilities of answering an item correctly because of their background (sex, age, ethnicity, etc.). Background is the major criterion because it concerns test takers directly. In this light, background is internal to the assessment.
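One common way to screen for DIF in a Rasch analysis is to calibrate the item separately for each group and compare the two difficulty estimates. The sketch below assumes that this has already been done and that each estimate comes with a standard error; the function name and the example values are hypothetical and are not results from this paper.

```python
import math


def dif_contrast(difficulty_a, se_a, difficulty_b, se_b):
    """Return (contrast, t) for one item calibrated in two groups.

    contrast: difference between the group-specific difficulties (logits)
    t:        contrast divided by its joint standard error
    """
    contrast = difficulty_a - difficulty_b
    joint_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return contrast, contrast / joint_se


# Hypothetical values: the item is 0.6 logits harder for group A than for group B.
contrast, t = dif_contrast(1.0, 0.2, 0.4, 0.2)
print(contrast, t)
```

A large contrast with a large absolute t suggests that the item functions differently for the two groups and should be reviewed against the background criteria discussed above.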

In the current Rasch-based framework, the Rasch analysis is further supported by the theoretical background of the test. This implies that psychometric models should not be dissociated from the psychological and cognitive theories underlying any testing device (Embretson & Gorin, 2001; Wright & Stone, 1999). It is certainly difficult and expensive for academic institutions to carry out many studies in support of the validity arguments of a device (see McNamara, 2003). The Rasch-based validity argument framework can provide reliable and efficient evidence at a lower cost than accumulating evidence from many separate studies.

Figure 1. Supporting validity arguments using Rasch analysis.

S. Vahid Aryadoust

Embretson, S., & Gorin, J. (2001). Improving construct validity with cognitive psychology principles. Journal of Educational Measurement, 38(4), 343-368.

Kane, M. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527-535.

Kane, M. (2004). Certification testing as an illustration of argument-based validation. Measurement: Interdisciplinary Research and Perspectives, 2, 135-170.

Kane, M., Crooks, T., & Cohen, A. (1999). Validating measures of performance. Educational Measurement: Issues and Practice, 18(2), 5-17.

Wright, B. D., & Stone, M. H. (1988). Validity in Rasch measurement. University of Chicago: Research Memorandum No. 55.

Aryadoust, S. V. (2009). Mapping Rasch-based measurement onto the argument-based validity framework. Rasch Measurement Transactions, 23(1), 1192-3.


