This paper integrates the Rasch validity model (Wright & Stone, 1988, 1999) into the argument-based validity framework (Kane, 1992, 2004). Rasch validity subsumes fit validity and order validity. Order validity has two subcategories: meaning validity (originating from the calibration of test variables) and utility validity (based on the calibration of persons to implement criterion validity). Fit validity concerns the consistency of response patterns. It yields three further kinds of validity: response validity, from 1) analysis of residuals, i.e., the differences between the responses expected under the Rasch model and the responses observed; item function validity, from 2) analysis of item fit, which can help in revising the test; and person performance validity, from 3) analysis of person fit, which can help in diagnosing test takers whose performance does not fit our expectations.
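The residual analysis described above can be written out for the dichotomous case. The notation is an assumption made here for illustration and is not taken from the paper: $B_n$ is the ability of person $n$, $D_i$ is the difficulty of item $i$, and $x_{ni} \in \{0, 1\}$ is the observed response.

```latex
P_{ni} = \frac{\exp(B_n - D_i)}{1 + \exp(B_n - D_i)}, \qquad
y_{ni} = x_{ni} - P_{ni}, \qquad
z_{ni} = \frac{y_{ni}}{\sqrt{P_{ni}\,(1 - P_{ni})}}
```

Here $y_{ni}$ is the score residual and $z_{ni}$ is the standardized residual. Item fit and person fit statistics summarize these residuals across persons and across items, respectively.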
The argument-based approach to validity was proposed by Kane (1992). This framework has two phases: the interpretive argument and the validity argument. Initially, the interpretive argument (IA) is proposed in the form of statements; the validity argument (VA) then investigates the efficacy of the IA. Figure 1 displays a framework for using Rasch-based measurement to build VAs. Observation, generalization, explanation, and extrapolation are the four major inferences that help one proceed from one validation stage to the next. Warrants comprise any data that back up the postulated inferences. Backings, e.g., the theoretical assumptions behind the posited warrants, give legitimacy and authority to the warrants.
Warrants for the observation inference in a Rasch-based study can include standardization of the scoring process and conversion of raw scores into measured scores and ability estimates. Standardization ensures the consistency of the scoring procedure. Converting raw scores to interval or measured scores in the Rasch analysis is essential because the distances between measured scores are then real and item difficulty can be compared directly with person ability or trait level. The Rating Scale Model (Andrich) and the Partial Credit Model (Masters) help to further investigate the efficacy of the measurement scales. To generalize the observed scores to expected scores, person and item reliability and person and item separation indexes are proposed as warrants, with the theories behind them as backings.
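As a rough illustration of the generalization warrants, the sketch below computes a separation index and a reliability coefficient from a set of Rasch measures and their standard errors. It assumes one common formulation (error variance taken as the mean squared standard error, "true" variance as observed variance minus error variance); the function name and the use of the population variance are choices made here, and specific software may differ in detail.

```python
import math
from statistics import fmean, pvariance


def separation_and_reliability(measures, standard_errors):
    """Return (separation, reliability) for person or item measures in logits.

    Assumed formulation, for illustration only:
      error variance = mean of the squared standard errors
      true variance  = observed variance - error variance (floored at 0)
      reliability    = true variance / observed variance
      separation     = sqrt(true variance / error variance)
    """
    observed_var = pvariance(measures)
    error_var = fmean([se ** 2 for se in standard_errors])
    true_var = max(observed_var - error_var, 0.0)
    reliability = true_var / observed_var
    separation = math.sqrt(true_var / error_var)
    return separation, reliability


# Hypothetical person measures (logits) and standard errors.
person_measures = [-2.0, -1.0, 0.0, 1.0, 2.0]
person_errors = [0.5, 0.5, 0.5, 0.5, 0.5]
separation, reliability = separation_and_reliability(person_measures, person_errors)
print(f"separation = {separation:.2f}, reliability = {reliability:.3f}")
```

Under these definitions the two indexes carry the same information, since reliability equals G²/(1 + G²) for separation G; separation is simply not capped at 1.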
The explanation inference bears on the theoretical construct being measured. Item and person infit and outfit analyses are the first warrants. Backings include the theoretical concepts of fit validity. Investigating item and person fit provides information about construct-irrelevant factors. The Rasch Principal Component Analysis of Residuals (PCAR) further investigates construct irrelevancies in the measure (Linacre, 2005).
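The infit and outfit warrants can be illustrated with a minimal sketch for dichotomous data. It assumes that person abilities and item difficulties have already been estimated elsewhere, and it uses the usual mean-square definitions: outfit is the unweighted mean of squared standardized residuals, and infit is the information-weighted version. The function names and the toy data are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def item_fit(responses, abilities, difficulties):
    """Return a list of (infit, outfit) mean-squares, one pair per item.

    responses[n][i] is the 0/1 response of person n to item i.
    """
    results = []
    for i, difficulty in enumerate(difficulties):
        squared_residuals = 0.0   # sum of (x - P)^2 over persons
        information = 0.0         # sum of model variances P(1 - P)
        squared_z = 0.0           # sum of squared standardized residuals
        for n, ability in enumerate(abilities):
            p = rasch_probability(ability, difficulty)
            variance = p * (1.0 - p)
            residual = responses[n][i] - p
            squared_residuals += residual ** 2
            information += variance
            squared_z += residual ** 2 / variance
        infit = squared_residuals / information
        outfit = squared_z / len(abilities)
        results.append((infit, outfit))
    return results


# Toy example: a very able person unexpectedly fails an easy item.
abilities = [3.0, 0.0]
difficulties = [0.0]
responses = [[0], [1]]
for infit, outfit in item_fit(responses, abilities, difficulties):
    print(f"infit = {infit:.2f}, outfit = {outfit:.2f}")
```

Person fit follows the same pattern with the roles of rows and columns swapped. Mean-squares near 1 indicate responses about as noisy as the model expects; the outfit statistic reacts more strongly to surprising responses far from a person's level, as in the toy data.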
Then, we can extrapolate from the observations to the target scores. The extrapolation inference has an element of subjectivity. Kane, Crooks, and Cohen (1999) indicated that content analysis in the generalization inference can support extrapolation provided that the universe of generalization corresponds to the target domain. Kane (1992, 2004) also proposed the use of criterion-referenced evidence. However, even if this method is used, it may not yield sufficient support for extrapolation. Utility and meaning validity can help again here. The confirmed hierarchy of item difficulty is assessed against the criteria we have set. Observations that do not conform to the theoretical expectations or criteria may be flawed. By the same token, we can anticipate how persons with different characteristics will respond to a particular question. Differential item functioning (DIF) is also useful. DIF occurs when groups of examinees with comparable ability have different probabilities of succeeding on an item because of their background (sex, age, ethnicity, etc.). Background is the major criterion because it concerns test takers directly. In this light, background is internal to the assessment.
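One simple way to screen for DIF, sketched here under stated assumptions, is to calibrate an item separately in two groups and compare the two difficulty estimates relative to their joint standard error. The paper does not prescribe this procedure; the function name and the numbers below are hypothetical.

```python
import math


def dif_contrast(difficulty_a, se_a, difficulty_b, se_b):
    """Return (contrast, t) for one item calibrated in groups A and B.

    contrast is the difference in item difficulty (logits); t divides the
    contrast by the joint standard error of the two estimates.
    """
    contrast = difficulty_a - difficulty_b
    joint_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return contrast, contrast / joint_se


# Hypothetical calibrations of the same item in two background groups.
contrast, t = dif_contrast(1.0, 0.3, 0.2, 0.4)
print(f"DIF contrast = {contrast:.2f} logits, t = {t:.2f}")
```

How large a contrast or t value must be before an item is flagged is a judgment that depends on sample size and on the purpose of the test.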
In the current Rasch-based framework, the Rasch analysis is further supported by the theoretical background of the test. This implies that psychometric models should not be dissociated from the psychological and cognitive theories underlying any testing device (Embretson & Gorin, 2001; Wright & Stone, 1999). It is certainly difficult and expensive for academic institutions to carry out many studies in support of the validity arguments for a device (see McNamara, 2003). The Rasch-based validity argument framework can provide reliable and efficient evidence at lower expense than accumulating evidence from different studies.
Figure 1. Supporting validity arguments using Rasch analysis.
The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website, www.rasch.org.
The URL of this page is www.rasch.org/rmt/rmt231f.htm