The establishment of passing standards is a critical component of a successful examination program. The models available for setting standards vary greatly in their methodological frameworks, yet each, whether acknowledged or not, is ultimately an evaluative process: it uses some form of measurement or statistical assistance but is not defined by it. As with any human endeavor, the sample of participants greatly influences the outcome. For standard setting, this suggests that who sets the passing standard is as important to the outcome as the choice of standard-setting methodology itself. A recent study in the field of high-stakes medical examinations illustrates this phenomenon well.
The study was conducted with a national medical board in charge of a high-stakes certification testing program. The board employed the Rasch-derived Objective Standard Setting model to set the passing standard for the examination. The board consisted of 20 members. Of these members, 10 considered themselves to be primarily practitioners (PRAC) of medicine, while the remaining 10 considered their primary occupation to be that of an educator (EDUC) at a university or hospital training program.
Participants in the exercise began to define their criterion in the traditional Objective manner. After an extensive group discussion about the meaning of minimal competence and the essentiality of items, each member was presented with a complete, previously calibrated examination. The members individually reviewed each item and assessed the content and taxonomic conveyance included. Members would then decide for themselves whether the content as presented in each item was essential for an entry-level practicing physician to understand. Ultimately individual sets of core items were defined whose mean item difficulties represented the quantification of the content selected by each member participant.

[Another attempt at objective standard setting is the Lewis, Mitzel, Green (1996) IRT-based Bookmark standard-setting procedure.]
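The criterion-definition step described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (the mean calibrated difficulty of the items a member judged essential), not the board's actual procedure or software. The item labels, difficulty values, member names, and the function `member_criterion` are all hypothetical and are not taken from the study.

```python
from statistics import mean


def member_criterion(item_difficulties, essential_items):
    """Return the mean difficulty (in logits) of the items one member judged essential."""
    if not essential_items:
        raise ValueError("member selected no essential items")
    return mean(item_difficulties[item] for item in essential_items)


# Hypothetical calibrated item difficulties in logits (NOT data from the study).
item_difficulties = {
    "item01": -0.5,
    "item02": 0.5,
    "item03": 1.0,
    "item04": 1.5,
    "item05": 2.0,
}

# Hypothetical sets of core items chosen by two members (NOT data from the study).
selections = {
    "member_A": ["item03", "item04", "item05"],
    "member_B": ["item01", "item02", "item03", "item04"],
}

# One criterion per member: the mean difficulty of that member's core items.
criteria = {
    member: member_criterion(item_difficulties, items)
    for member, items in selections.items()
}

for member, criterion in criteria.items():
    print(f"{member}: criterion = {criterion:.3f} logits")
```

Because each criterion is a mean of calibrated difficulties, two members who select different core items will generally arrive at different criteria even when they work from the same examination.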
An inspection of the criteria proved interesting. There is a statistically significant difference that is apparent even on simple visual inspection of Figure 1. The practitioners are noticeably stratified above the educators. There is an obvious gap between the criterion (mean = 1.52 logits) established by the practitioner members and the criterion (mean = 0.94 logits) established by the educator members.
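To give the gap between the two reported means a more concrete reading, the sketch below converts it into a success probability. It assumes the dichotomous Rasch model, and the interpretation is ours rather than the study's: it asks how likely a candidate located exactly at the educators' mean criterion would be to succeed on an item located at the practitioners' mean criterion. Only the two means (1.52 and 0.94 logits) come from the text; the function and variable names are illustrative.

```python
import math


def rasch_probability(ability, difficulty):
    """Dichotomous Rasch model: probability of success given ability and difficulty in logits."""
    return 1.0 / (1.0 + math.exp(difficulty - ability))


PRAC_MEAN = 1.52  # mean criterion of practitioner members, from the text
EDUC_MEAN = 0.94  # mean criterion of educator members, from the text

# Distance between the two group criteria on the logit scale.
gap = PRAC_MEAN - EDUC_MEAN

# Assumed interpretation: a candidate sitting exactly on the educators' criterion
# attempts an item whose difficulty equals the practitioners' criterion.
p_success = rasch_probability(EDUC_MEAN, PRAC_MEAN)

print(f"gap = {gap:.2f} logits")
print(f"probability of success = {p_success:.3f}")
```

Under these assumptions the gap is 0.58 logits, and the candidate's chance of success is roughly 36 percent, compared with 50 percent on an item located at the candidate's own level.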
High-stakes testing plays a critical role in the careers of hopeful students. It also provides a measure of safety for our society. The selection of participant members on high-stakes boards must therefore be carefully considered. In our case the question became: whose standard should be adopted? Practitioners are clearly closer to patient care, but educators may sometimes have a broader curricular focus. Should boards require a certain mixture?
While the use of a multi-faceted approach would account for differences in rater severity, it would not eliminate the more fundamental question of legitimate definitional differences. Indeed, while standard setters debate and discuss the merits of methodology, they cannot afford to ignore that most basic of confounding variables - the sample of participants selected.
Gregory E. Stone, The University of Toledo
Note: Wright & Grosse (RMT 7:3, 315) point out that "failing the possibly incompetent" requires a higher standard than "passing the probably competent". Perhaps in Figure 1, practitioners are subconsciously relatively more concerned with protecting patient well-being, while educators are relatively more concerned with enhancing student careers.
A standard of importance: Establishing passing standards. G.E. Stone 17:2, 919-920
The URL of this page is www.rasch.org/rmt/rmt172a.htm