Rasch rating scale structure parameters are also called Andrich thresholds, step calibrations, or taus. They relate directly to category probabilities, that is, to the probability of a category being observed, not to the substantive order of achievement of the categories. So when step calibrations (taus) are disordered, they say that one category is less likely to be observed, not that it is easier to perform.
Here is an example that will produce disordered taus:
Around 100 people work in a building. Let us count the number of people in the building at 10-minute intervals over several days. The "items" are the times of day. The "people" are the days. Here is the rating scale:
Less than 100: category 1.
We will observe categories 1 and 3 far more often than category 2. As people arrive in the morning, the count falls in category 1. At peak times, it falls in category 3. In the evening, it is category 1 again. During a day we may never observe category 2. Category 2 of course lies between 1 and 3, but it is very difficult to observe. The taus will be "disordered".
So, how do we detect when the categories are actually substantively incorrectly ordered? We use fit statistics. An illustrative example follows.
Category disordering occurs when the ordinal numbering of categories does not accord with their substantive meaning. Consider the 7-level FIM™ rating scale. Each level is substantively defined to represent a higher level of functioning than the one below it. The ordinal numbering accords with this. But what would happen if the numbering of two categories were reversed? Then a higher category number could correspond to a lower level of functioning. The categories would be substantively disordered.
Here are the category summary statistics in Table 1 for some patient records with correctly coded FIM levels. Note that the "Average Measure" values advance with category. These indicate that, for this sample, higher patient performance corresponds to higher categories. The category mean-square fit statistics also do not markedly exceed their model values of 1.0. Figure 1 shows the modeled category probability curves. They depict the expected succession of "hills".
Now, suppose that due to a coding or data entry error, the numbering of levels 1 and 2 was reversed, introducing substantive category disordering. Table 2 shows the resultant category statistics. The observed category counts verify that categories 1 and 2 have been reversed. Now the "average measure" values for categories 1 and 2 are disordered, and category 1 is exhibiting large misfit. Counter-intuitively, the step calibrations remain ordered. The modeled category probability curves, shown in Figure 2, still depict a succession of "hills". This is because the measures, the Rasch model parameters, are always estimated on the basis that the data fit the model.
Substantive disordering of the categories is flagged by disordering in the "average measure" values and by mean-square fit statistics much larger than 1.0 (indicating misfit), not by disordering in the step calibrations nor by the shape of the probability curves. Of course, these statistics comment on the functioning of the rating scale for this sample. Whether substantive category disordering is due to a misspecification of the rating scale or to idiosyncrasies only found in the sample requires further investigation.
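The "average measure" check can be sketched in a few lines of Python. This is only an illustration: it assumes that the average measure for a category is the mean of (person measure minus item difficulty) over the observations in that category, and the function names and the toy values are invented for the example. They are not taken from Tables 1 or 2.

```python
from collections import defaultdict

def average_measures(observations):
    """observations: iterable of (category, person_measure, item_difficulty).
    Returns {category: mean of (person_measure - item_difficulty)}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, person, item in observations:
        sums[category] += person - item
        counts[category] += 1
    return {c: sums[c] / counts[c] for c in sorted(sums)}

def disordered_pairs(averages):
    """Adjacent observed categories whose average measure fails to advance."""
    cats = sorted(averages)
    return [(lo, hi) for lo, hi in zip(cats, cats[1:])
            if averages[hi] <= averages[lo]]

# Invented toy data (logits): categories 1 and 2 behave as if swapped.
toy = [
    (1, -2.0, 0.0), (1, -1.0, 0.0),
    (2, -2.5, 0.0), (2, -2.1, 0.0),
    (3,  0.0, 0.0), (3,  0.5, 0.0),
]
averages = average_measures(toy)
print(averages)
print(disordered_pairs(averages))
```

A pair reported by `disordered_pairs` is only a prompt to inspect the category definitions and the fit statistics; it does not by itself establish a coding error.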
Step (Threshold) Disordering
The step calibrations or Rasch/Andrich thresholds correspond to the Rasch model parameters for the rating scale structure. Each step calibration parameterizes the relationship between a pair of adjacent categories. If, for a given item targeted directly at the person's ability level, a step calibration has a positive value, then the lower of the pair of categories is more likely to be observed. If the step calibration has a negative value, then the higher category of the pair is more likely to be observed.
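A minimal sketch of this pairwise relationship, assuming the usual Rasch-Andrich form in which the log-odds of the higher versus the lower category of an adjacent pair equal person ability minus item difficulty minus the step calibration. The function name and argument names are illustrative.

```python
import math

def prob_higher_of_pair(ability, difficulty, step):
    """Probability of the higher category, given that the observation
    falls in one of the two adjacent categories separated by `step`."""
    odds = math.exp(ability - difficulty - step)
    return odds / (1.0 + odds)

# Item targeted directly at the person: ability - difficulty = 0.
for step in (1.0, 0.0, -1.0):
    p = prob_higher_of_pair(0.0, 0.0, step)
    print(f"step {step:+.1f}: P(higher of pair) = {p:.3f}")
```

With a positive step the probability of the higher category of the pair is below 0.5, and with a negative step it is above 0.5.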
Rating scale categories, however, are not observed in pairs but in the entire set simultaneously. This complicates their interpretation. If the step calibrations become successively more positive as category number increases (as in the FIM examples), then the plot of the category probability curves depicts a "range of hills". Each category in turn is the most probable to be observed, and the intersections of the curves for adjacent modal categories correspond to the step calibrations.
If the step calibrations do not increase monotonically with category number, i.e., are disordered, then one or more categories are never modal, and one or more "hill tops" are missing from the range of hills.
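The missing "hill top" can be checked numerically. The sketch below assumes the Rasch-Andrich rating scale model with the measure expressed relative to the item difficulty; the function names, the grid, and the step values are illustrative and are not the FIM estimates.

```python
import math

def category_probabilities(measure, steps):
    """Rating scale model probabilities for len(steps) + 1 categories.
    measure = person ability - item difficulty (logits)."""
    log_num = [0.0]
    for f in steps:
        log_num.append(log_num[-1] + measure - f)
    top = max(log_num)  # subtract the maximum for numerical stability
    nums = [math.exp(v - top) for v in log_num]
    total = sum(nums)
    return [n / total for n in nums]

def modal_categories(steps, lo=-6.0, hi=6.0, n=1201):
    """Categories (numbered from 1) that are most probable somewhere on the grid."""
    modal = set()
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        probs = category_probabilities(x, steps)
        modal.add(probs.index(max(probs)) + 1)
    return sorted(modal)

print(modal_categories([-1.0, 1.0]))  # ordered steps
print(modal_categories([1.0, -1.0]))  # disordered steps
```

With the ordered steps every category is modal somewhere along the variable. With the disordered steps category 2 never is, although its probability is never zero.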
An Example of Step Disordering
To illustrate this, consider the FIM data presented above, but with every other observation of level 2 made missing. Table 3 shows the resulting category statistics. Compare these with Table 1. The count for level 2 is reduced by 50%. The step calibration from level 2 to 3, -2.33, is now less than that from level 1 to 2, -1.49, and so is disordered. As shown in Figure 3, category 2 is no longer modal. The cross-over between the curves for levels 2 and 3 (i.e., the step calibration) is to the left of that for levels 1 and 2. The crossover points are disordered. All other statistics, however, are almost identical. Step disordering has not introduced category disordering (as diagnosed by average measures) nor category misfit (as diagnosed by fit mean-squares).
Step Calibrations and Modality
What is the relationship between step calibrations and modality? Consider a 3-category rating scale. In Figure 4 the steps are ordered. In Figure 5 the steps coincide. The maximum probability of the central category is 0.33. In Figure 6 the steps are disordered. For 3 categories, the relationship between the two step calibrations, F1 and F2, and the maximum probability of the central category, as plotted in Figure 7, is given by the ogive:
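Under the Rasch-Andrich rating scale model, this relationship can be derived directly. With B the measure relative to the item difficulty, the probability of the central category is 1 / (1 + exp(F1 - B) + exp(B - F2)). This peaks at B = (F1 + F2) / 2, where it equals

    maximum P(central category) = 1 / (1 + 2 exp((F1 - F2) / 2))

When F1 = F2 this gives 1/3, in agreement with the coinciding steps case. The Python sketch below compares this closed form with a brute-force search over B; the function names and the step values are illustrative.

```python
import math

def central_probability(measure, f1, f2):
    """P(central category) for a 3-category rating scale."""
    return 1.0 / (1.0 + math.exp(f1 - measure) + math.exp(measure - f2))

def max_central_probability(f1, f2):
    """Closed form: the peak is at measure = (f1 + f2) / 2."""
    return 1.0 / (1.0 + 2.0 * math.exp((f1 - f2) / 2.0))

def grid_max(f1, f2, lo=-10.0, hi=10.0, n=20001):
    """Brute-force check of the closed form over a grid of measures."""
    return max(central_probability(lo + (hi - lo) * i / (n - 1), f1, f2)
               for i in range(n))

for f1, f2 in [(-1.0, 1.0), (0.0, 0.0), (1.0, -1.0)]:
    print(f1, f2,
          round(max_central_probability(f1, f2), 4),
          round(grid_max(f1, f2), 4))
```

The central category is modal only when its maximum probability exceeds 1/3, which under this formula happens exactly when F1 is less than F2.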
Step Calibrations and the Latent Variable
From the perspective of cumulative probabilities, i.e., Thurstone thresholds as computed according to the Rasch model (Figure 8), as the step calibrations become more disordered, the central category becomes narrower. Step disordering does not indicate that the category definitions are out of sequence, but rather that the category defines a narrow section of the variable. Empirically, disordered step calibrations may indicate that the category definition is too narrow, or that too many category options have been presented to respondents. Consequently, combining the narrow category with an adjacent category may simplify use of the rating scale or assist with communication of conclusions based on the scale.
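The narrowing can be illustrated numerically. The sketch below takes a Thurstone threshold to be the measure at which the probability of being observed in a given category or any higher one is 0.5, and locates it by bisection under the Rasch-Andrich rating scale model. The function names and step values are illustrative.

```python
import math

def category_probabilities(measure, steps):
    """Rating scale model probabilities for len(steps) + 1 categories."""
    log_num = [0.0]
    for f in steps:
        log_num.append(log_num[-1] + measure - f)
    top = max(log_num)
    nums = [math.exp(v - top) for v in log_num]
    total = sum(nums)
    return [n / total for n in nums]

def thurstone_threshold(k, steps, lo=-20.0, hi=20.0):
    """Measure at which P(category index >= k) = 0.5, by bisection.
    Category indices run from 0; the cumulative probability rises with measure."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if sum(category_probabilities(mid, steps)[k:]) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def central_width(f1, f2):
    """Distance between the two Thurstone thresholds of a 3-category scale."""
    steps = [f1, f2]
    return thurstone_threshold(2, steps) - thurstone_threshold(1, steps)

for f1, f2 in [(-1.0, 1.0), (0.0, 0.0), (1.0, -1.0)]:
    print(f1, f2, round(central_width(f1, f2), 3))
```

The Thurstone thresholds stay in order even when the step calibrations do not, so the central category keeps a positive, though shrinking, interval on the variable.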
Step Disordering Increases Item Discrimination
Expected score ogives (the model item characteristic curves shown in Figure 9) are steeper with disordered steps. Thus step disordering indicates an item that is highly discriminating over a limited region of the variable, but that is less informative in other regions. Thus "high item discrimination" is not synonymous with "better functioning" or "more effective".
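A small numerical sketch of this effect, assuming the Rasch-Andrich rating scale model with categories scored 1 to 3 and estimating the slope of the expected score ogive by a central finite difference. The function names and step values are illustrative.

```python
import math

def category_probabilities(measure, steps):
    """Rating scale model probabilities for len(steps) + 1 categories."""
    log_num = [0.0]
    for f in steps:
        log_num.append(log_num[-1] + measure - f)
    top = max(log_num)
    nums = [math.exp(v - top) for v in log_num]
    total = sum(nums)
    return [n / total for n in nums]

def expected_score(measure, steps):
    """Model expected score, with categories scored 1, 2, 3, ..."""
    probs = category_probabilities(measure, steps)
    return sum((k + 1) * p for k, p in enumerate(probs))

def ogive_slope(measure, steps, h=1e-4):
    """Central finite-difference slope of the expected score ogive."""
    return (expected_score(measure + h, steps)
            - expected_score(measure - h, steps)) / (2.0 * h)

ordered = [-1.0, 1.0]
disordered = [1.0, -1.0]
for x in (0.0, 3.0):
    print(x, round(ogive_slope(x, ordered), 3), round(ogive_slope(x, disordered), 3))
```

At the center of the item the disordered steps give the steeper ogive, while 3 logits away the ordered steps give the larger slope.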
John M. Linacre
Category Disordering (disordered categories) vs. Threshold Disordering (disordered thresholds). Linacre, J.M. Rasch Measurement Transactions, 1999, 13:1 p. 675
The URL of this page is www.rasch.org/rmt/rmt131a.htm