[Later work suggests that modeling rank-orders as partial-credit items is productive. Journal of Applied Measurement 7:1 129-136 ]
The ideals and axioms of fundamental measurement which Georg Rasch espoused have generally been applied to tests in which the data to be analyzed consist of the direct responses made by examinees to items. This type of data can be termed "two-facet", the two facets being agents (test items) and objects (examinees). In the last two years fundamental measurement has been expanded to many-facet tests, such as performance assessment, in which ratings are given by judges to examinees' performances on several tasks, a three-facet (or more) situation.
Several rank orderings of the same examinees, each ordering made by a different judge, present a rather different measurement problem. A rank ordering looks like the outcome of a one-facet test. The performances of examinees are compared with each other, either by direct encounter or by the judge's thought-experiments. Thus the final ordering no longer has any quantifiable connection with the difficulty of the elements of performance on which the comparison was made, or with the severity of the judges who constructed the orderings. Removing judge severity and item difficulty from consideration is often an intended aim of rank ordering. But this type of data does not appear to be amenable to the familiar axioms of fundamental measurement, e.g., that there must be agents and objects, or to analysis by a Rasch program.
The good news is that objective measurement is possible with rank ordered data. In addition, Rasch analysis of rank ordered data exhibits the Rasch model's usual robustness against missing data, so it does not require that every rank ordering contain every examinee. Each judge need only rank the examinees with whose performance he is familiar, and may omit all the others. So long as there is some network of examinee overlap across the rankings made by the different judges, a coherent overall picture can be constructed.
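The overlap requirement can be checked before any estimation is attempted. The sketch below takes one plausible reading of "network of examinee overlap": every examinee should be reachable from every other through shared rankings. The function name `rankings_are_linked` and the example data are hypothetical. Passing this check is necessary for a common frame of reference, but it does not by itself guarantee finite estimates.

```python
def rankings_are_linked(rankings):
    """Return True if all examinees are joined into one network by shared rankings.

    rankings: list of lists of examinee labels (the order within a list is irrelevant here).
    Uses a simple union-find structure; this is an illustrative check, not part of
    any Rasch program.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for order in rankings:
        for examinee in order:
            find(examinee)  # register every examinee, even in a one-person ranking
        for other in order[1:]:
            parent[find(other)] = find(order[0])  # link everyone in this ranking

    roots = {find(x) for x in list(parent)}
    return len(roots) <= 1


# Hypothetical data: judge 2 links the examinees seen by judges 1 and 3.
linked = rankings_are_linked([["A", "B"], ["B", "C"], ["C", "D"]])
# Hypothetical data: two groups of examinees that no judge ever compares.
unlinked = rankings_are_linked([["A", "B"], ["C", "D"]])
```

If the check fails, the disconnected groups of examinees can only be measured in separate frames of reference.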
This overall picture places each examinee at his competence measure on a latent variable, which is marked out in logits and has its local origin at, say, the mean ability of the examinees. Each measure has associated with it a standard error indicating the precision with which the measure has been determined. This information enables examinee measures to be compared in exactly the same manner as the examinee measures derived from the familiar two-facet test, except that it is no longer possible to relate performance levels to item difficulties and their implications for interpreting the substantive meaning of a measure.
Rasch measurement of rank ordered examinees also enables fit statistics to be calculated for the evaluation of the consistency of the performance level of each examinee as reflected in his rankings by the judges. Fit statistics can further report the degree to which each judge's rank ordering is consistent with the estimated measures based on the overall rankings. Especially deviant rankings can be flagged in precisely the same way that unexpected responses are identified in two-facet analysis.
The key to the analysis of rank-ordered data is the deduction that, for measurement to be constructed, each ranking of examinees must function as though it were independent of both the judge who made the ranking and the real or conceptual items which were used by the judge in assessing the relative performance level of examinees.
In the simplest case, rankings of pairs of examinees, what must dominate the data is the paired-comparison of the ordered examinees. In each ordering, any particular examinee is ranked higher or lower than any other particular examinee. Of course, when a set of orderings is obtained, all judges will rarely, if ever, agree perfectly. In fact, we depend on a certain level of stochastic disagreement in order to construct a measurement system.
What is decisive for the quantitative comparison of examinees is the number of times one examinee is ranked higher than another. Examinee n with measure Bn might be ranked HIGHER than examinee m with measure Bm a total of H times across the orderings made by the different judges. In contrast, examinee n might be ranked LOWER than m a total of L times. The ratio H/L is the essential data for the estimation of a distance between examinees n and m as in (Bn - Bm).
A straightforward derivation of a measurement model from objectivity, similar to that in RMT 1:1 (also see Rasch, 1980, p.171-172), yields the model that must underlie the intention of obtaining meaning from multiple rankings of the same examinees. This measurement model for rank orders is remarkably simple and familiar in appearance:
$$\log_e\left(\frac{P_{nm}}{P_{mn}}\right) = B_n - B_m$$
where Pnm is the probability that n is ranked higher than m and Pmn is the probability that m is ranked higher than n. Pmn + Pnm = 1.
The ratio Pnm/Pmn is realized in the rankings as H/L, and this becomes the empirical data for estimating the parameters. This model has the form of the Bradley-Terry model, but that model is motivated by data description, not measurement.
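As a worked illustration of the relation between H/L and (Bn - Bm), the minimal sketch below treats the observed counts as the realization of Pnm/Pmn for a single pair of examinees. The function names and the counts are hypothetical, and a real analysis would estimate all examinees jointly rather than pair by pair.

```python
import math


def logit_distance(h, l):
    """Estimate Bn - Bm in logits from counts for one pair of examinees.

    h: number of orderings in which n is ranked higher than m.
    l: number of orderings in which n is ranked lower than m.
    """
    if h <= 0 or l <= 0:
        # With no disagreement among judges the log-odds is infinite.
        raise ValueError("both counts must be positive for a finite estimate")
    return math.log(h / l)


def prob_ranked_higher(b_n, b_m):
    """Model probability Pnm that n is ranked higher than m."""
    return 1.0 / (1.0 + math.exp(-(b_n - b_m)))


# Hypothetical counts: n is ranked above m by 6 judges and below m by 2.
distance = logit_distance(6, 2)           # log(3), about 1.10 logits
p_nm = prob_ranked_higher(distance, 0.0)  # recovers 6 / (6 + 2) = 0.75
```

The `ValueError` branch reflects the point made earlier in the text: without some stochastic disagreement among the judges, no finite distance can be estimated.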
For rankings of more than two examinees, there are added constraints because examinees are not compared independently, but are reported in a composite rank-order. This alters the final form of the estimation equations from those presented in Rating Scale Analysis (Wright and Masters, 1982) and elsewhere.
Alternative estimation equations can be formulated. One approach is to decompose the rank orderings into paired comparisons. A more convenient conceptualization, however, is to imagine that the judges internalize a rating scale defined such that one examinee is found in each category (or multiple examinees, if tied rankings are allowed). A measurement model for this conceptualization is
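The first approach, decomposition into paired comparisons, can be sketched as follows. This is an illustration under stated assumptions, not the estimation equations referred to above: it fits the Bradley-Terry form with a standard iterative (minorization-maximization) update, and it treats the pairs extracted from one ranking as if they were independent, which they are not. The function names and the rankings are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def pairwise_wins(rankings):
    """Count, for each ordered pair (a, b), the rankings that place a above b.

    rankings: list of lists, each ordered best-first; partial rankings are allowed.
    """
    wins = defaultdict(int)
    for order in rankings:
        for higher, lower in combinations(order, 2):  # higher precedes lower
            wins[(higher, lower)] += 1
    return wins


def paired_comparison_measures(wins, n_iter=500):
    """Fit log(Pnm / Pmn) = Bn - Bm; return measures in logits centered at mean 0."""
    items = sorted({x for pair in wins for x in pair})
    total_wins = {i: sum(w for (a, _), w in wins.items() if a == i) for i in items}
    if any(total_wins[i] == 0 for i in items):
        raise ValueError("an examinee who is never ranked higher has no finite measure")
    strength = {i: 1.0 for i in items}
    for _ in range(n_iter):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                if j == i:
                    continue
                n_ij = wins.get((i, j), 0) + wins.get((j, i), 0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            updated[i] = total_wins[i] / denom
        # Rescale so that the mean log-strength (the local origin) is zero.
        mean_log = sum(math.log(v) for v in updated.values()) / len(updated)
        strength = {i: v / math.exp(mean_log) for i, v in updated.items()}
    return {i: math.log(v) for i, v in strength.items()}


# Hypothetical rankings from four judges; the fourth ranks only two examinees.
rankings = [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"], ["C", "A"]]
wins = pairwise_wins(rankings)
measures = paired_comparison_measures(wins)
```

In this example A is ranked above B twice and below B once, so the data for that pair alone would suggest a distance of log(2); the joint fit also draws on each examinee's comparisons with C.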
$$\log_e\left(\frac{P_{rnk}}{P_{rn(k+1)}}\right) = B_n - B_r - F_{rk}$$
where Prnk is the probability that, in ordering r, examinee n will be ranked k. Bn is the ability of examinee n. Br is the mean ability of the examinees included in ordering r. Frk is the step difficulty up from a ranking of k+1 to a ranking of k within ordering r.
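To show how these adjacent-rank log-odds determine a full set of rank probabilities, the sketch below accumulates them upward from the lowest rank and normalizes. It assumes ranks run from 1 (highest) to K (lowest) and that the K-1 step difficulties are already known; the function name and the numerical values are hypothetical.

```python
import math


def rank_probabilities(b_n, b_r, steps):
    """Return [P_1, ..., P_K] for examinee n in ordering r.

    b_n:   ability of examinee n (logits).
    b_r:   mean ability of the examinees included in ordering r.
    steps: steps[k - 1] is Frk, the step difficulty up from rank k+1 to rank k,
           for k = 1 .. K-1.
    """
    n_ranks = len(steps) + 1
    # Log numerators relative to the lowest rank K, whose log numerator is 0.
    log_num = [0.0] * n_ranks
    for i in range(n_ranks - 2, -1, -1):  # list index i holds rank i + 1
        log_num[i] = log_num[i + 1] + (b_n - b_r - steps[i])
    peak = max(log_num)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values: three ranks, examinee one logit above the ordering's mean.
probs = rank_probabilities(b_n=1.0, b_r=0.0, steps=[0.5, -0.5])
```

By construction, `math.log(probs[0] / probs[1])` equals Bn - Br - Fr1 and `math.log(probs[1] / probs[2])` equals Bn - Br - Fr2, matching the model term by term.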
A delight of these measurement models is that it doesn't matter, in general, how many judges include each examinee in their rankings. Nor does it matter how many examinees each judge ranks, or even what numerical system is used to record the ranks. The estimates of the measures are derived merely from counting each examinee's location in each ordering.
Initial application of this technique looks promising. A more comprehensive paper will be published in Mark Wilson's (1992) Objective Measurement: Theory into Practice, Vol. 1. (See Chap. 11, p. 195-209.)
John M. Linacre
Rank ordering and Rasch measurement. Linacre JM. Rasch Measurement Transactions, 1989, 2:4 p.41-42
The URL of this page is www.rasch.org/rmt/rmt24b.htm