Category Disordering (disordered categories) vs. Threshold Disordering (disordered thresholds)

Rasch rating scale structure parameters are also called Andrich thresholds, step calibrations, or Taus. They relate directly to category probabilities, that is, to the probability of a category being observed, not to the substantive order of achievement of the categories. So when step calibrations (Taus) are disordered, they say that one category is less likely to be observed, not that it is easier to perform.

Here is an example that will produce disordered Taus. Around 100 people work in a building. Let us count the number of people in the building at 10-minute intervals over several days. The "items" are the times of day. The "people" are the days. Here is the rating scale:

Less than 100: category 1.
Exactly 100: category 2.
More than 100: category 3.

We will observe categories 1 and 3 far more often than category 2. As people arrive in the morning, it will be category 1. At peak times, category 3. In the evening, category 1. During a day we may never observe category 2. Of course, category 2 lies between 1 and 3, but it is a category that is very difficult to observe. The Taus will be "disordered".

So, how do we detect when the categories are actually substantively incorrectly ordered? We use fit statistics. An illustrative example follows.

Category disordering occurs when the ordinal numbering of categories does not accord with their substantive meaning. Consider the 7-level FIM™ rating scale. Each level is substantively defined to represent a higher level of functioning. The ordinal numbering accords with this. But what would happen if the numbering of two categories was reversed? Then a higher category number could correspond to a lower level of functioning. The categories would be substantively disordered.

FIM Level   Count   Average Measure   INFIT MNSQ   OUTFIT MNSQ   Step calibration (Rasch-Andrich threshold)
1             96        -2.80            .98          1.02         NONE
2             88        -2.04            .75           .80         -2.22
3            101        -1.02           1.07          1.03         -1.70
4            168         -.27           1.03          1.19         -1.31
5            210          .85           1.01           .91           .08
6            146         2.34            .75           .83          2.02
7            101         3.32            .87           .89          3.14

Table 1. Satisfactory Category Statistics. Average measures advance, thresholds advance, MNSQs near 1.0.

Here are the category summary statistics in Table 1 for some patient records with correctly coded FIM levels. Note that the "Average Measure" values advance with category. These indicate that, for this sample, higher patient performance corresponds to higher categories. The category mean-square fit statistics also do not markedly exceed their model values of 1.0. Figure 1 shows the modeled category probability curves. They depict the expected succession of "hills".

FIM Level   Count   Average Measure   INFIT MNSQ   OUTFIT MNSQ   Step calibration (Rasch-Andrich threshold)
1 (2)         88        -1.97           1.47          1.41         NONE
2 (1)         96        -2.18            .54           .69         -2.08
3            101         -.95           1.05          1.02         -1.49
4            168         -.25            .91           .99         -1.24
5            210          .80            .97           .87           .08
6            146         2.14            .66           .75          1.87
7            101         3.02            .83           .86          2.86

Table 2. Category Disordering. Average measures disordered, MNSQs misfit > 1.0, but thresholds advance.

Now, suppose that, due to a coding or data entry error, the numbering of levels 1 and 2 was reversed, introducing substantive category disordering. Table 2 shows the resultant category statistics. The observed category counts verify that categories 1 and 2 have been reversed. Now the "average measure" values for categories 1 and 2 are disordered, and category 1 is exhibiting large misfit. Counter-intuitively, the step calibrations are ordered. The modeled category probability curves, shown in Figure 2, still depict a succession of "hills". This is because the measures, the Rasch model parameters, are always estimated on the basis that the data fit the model.

Substantive disordering of the categories is flagged by disordering in the "average measure" values and mean-square fit statistics much larger than 1.0 (indicating misfit), not disordering in the step calibrations nor in the shape of the probability curves. Of course, these statistics comment on the functioning of the rating scale for this sample. Whether substantive category disordering is due to a misspecification of the rating scale or to idiosyncrasies only found in the sample requires further investigation.
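The diagnosis described above can be sketched as a small check on the category statistics of Tables 1 and 2. This is only an illustration: the function names are hypothetical, and the mean-square cutoff of 1.3 is an assumed stand-in for "much larger than 1.0", not a value given in the text.

```python
# Category statistics transcribed from Tables 1 and 2:
# (coded level, average measure, INFIT MNSQ, OUTFIT MNSQ, threshold or None)
TABLE_1 = [
    (1, -2.80, 0.98, 1.02, None), (2, -2.04, 0.75, 0.80, -2.22),
    (3, -1.02, 1.07, 1.03, -1.70), (4, -0.27, 1.03, 1.19, -1.31),
    (5, 0.85, 1.01, 0.91, 0.08), (6, 2.34, 0.75, 0.83, 2.02),
    (7, 3.32, 0.87, 0.89, 3.14),
]
TABLE_2 = [
    (1, -1.97, 1.47, 1.41, None), (2, -2.18, 0.54, 0.69, -2.08),
    (3, -0.95, 1.05, 1.02, -1.49), (4, -0.25, 0.91, 0.99, -1.24),
    (5, 0.80, 0.97, 0.87, 0.08), (6, 2.14, 0.66, 0.75, 1.87),
    (7, 3.02, 0.83, 0.86, 2.86),
]


def is_increasing(values):
    """True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def diagnose(rows, mnsq_cutoff=1.3):
    """Summarize ordering and fit; mnsq_cutoff is an assumed, illustrative value."""
    averages = [avg for _, avg, _, _, _ in rows]
    thresholds = [thr for _, _, _, _, thr in rows if thr is not None]
    misfits = [level for level, _, infit, outfit, _ in rows
               if max(infit, outfit) > mnsq_cutoff]
    return {
        "average_measures_ordered": is_increasing(averages),
        "thresholds_ordered": is_increasing(thresholds),
        "misfitting_levels": misfits,
    }


print(diagnose(TABLE_1))
print(diagnose(TABLE_2))
```

For Table 2 the check reports disordered average measures and misfit in coded level 1, while the thresholds still advance, which is the pattern the text attributes to substantive category disordering.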

Step (Threshold) Disordering

The step calibrations or Rasch/Andrich thresholds correspond to the Rasch model parameters for the rating scale structure. Each step calibration parameterizes the relationship between a pair of adjacent categories. If, for a given item targeted directly at the person's ability level, a step calibration has a positive value, then the lower of the pair of categories is more likely to be observed. If the step calibration has a negative value, then the higher category of the pair is more likely to be observed.
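In symbols, this pairwise relationship can be written as below. The notation is an assumption made for illustration and does not appear in the text: B_n is the person measure, D_i the item difficulty, F_k the step calibration between categories k-1 and k, and P_nik the probability that person n is observed in category k of item i.

```latex
\ln\!\left(\frac{P_{nik}}{P_{ni(k-1)}}\right) = B_n - D_i - F_k
\qquad\Longrightarrow\qquad
\frac{P_{nik}}{P_{ni(k-1)}} = e^{-F_k} \quad \text{when } B_n = D_i .
```

With the item targeted at the person (B_n = D_i), a positive F_k makes the ratio less than 1, so the lower category of the pair is more probable; a negative F_k makes the ratio greater than 1, so the higher category is more probable.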

Rating scale categories, however, are not observed in pairs but in the entire set simultaneously. This complicates their interpretation. If the step calibrations become successively more positive as category number increases (as in the FIM examples), then the plot of the category probability curves depicts a "range of hills". Each category in turn is the most probable, and the intersections of the curves of adjacent modal categories correspond to the step calibrations.

If the step calibrations do not increase monotonically with category number, i.e., are disordered, then one or more categories are never modal, and one or more "hill tops" are missing from the range of hills.

FIM Level   Count   Average Measure   INFIT MNSQ   OUTFIT MNSQ   Step calibration (Rasch-Andrich threshold)
1             96        -2.81            .90           .96         NONE
2             44        -1.96            .88           .92         -1.49
3            101        -1.03           1.02           .98         -2.33
4            168         -.30           1.07          1.22         -1.29
5            210          .82            .96           .88           .05
6            146         2.30            .75           .82          1.97
7            101         3.27            .87           .89          3.09

Table 3. Low Frequency in Category 2. Thresholds disordered, but average measures advance, MNSQs near 1.0.

An Example of Step Disordering

To illustrate this, consider the FIM data presented above, but with every other observation of level 2 made missing. Table 3 shows the resulting category statistics. Compare these with Table 1. The count for level 2 is reduced by 50%. The step calibration from level 2 to 3, -2.33, is now less than that from level 1 to 2, -1.49, and so is disordered. As shown in Figure 3, category 2 is no longer modal. The cross-over between the curves for levels 2 and 3 (i.e., the step calibration) is to the left of that for levels 1 and 2. The crossover points are disordered. All other statistics, however, are almost identical. Step disordering has not introduced category disordering (as diagnosed by average measures) nor category misfit (as diagnosed by fit mean-squares).
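The loss of modality can be checked directly from the reported thresholds. The sketch below assumes the usual Rasch rating scale formulation, an item difficulty of 0, and a simple grid search over the latent variable; the function names and grid settings are illustrative choices, not part of the original analysis.

```python
import math


def category_probabilities(theta, thresholds, difficulty=0.0):
    """Rating scale model probabilities for all categories at one location."""
    # The log-weight of each category is the running sum of (theta - difficulty - F_k).
    log_weights = [0.0]
    for f in thresholds:
        log_weights.append(log_weights[-1] + (theta - difficulty - f))
    peak = max(log_weights)  # subtract the peak for numerical stability
    weights = [math.exp(v - peak) for v in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def modal_levels(thresholds, lo=-8.0, hi=8.0, step=0.01):
    """Levels (numbered from 1) that are the most probable somewhere on the grid."""
    found = set()
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        probs = category_probabilities(lo + i * step, thresholds)
        found.add(probs.index(max(probs)) + 1)
    return sorted(found)


table1_thresholds = [-2.22, -1.70, -1.31, 0.08, 2.02, 3.14]  # ordered
table3_thresholds = [-1.49, -2.33, -1.29, 0.05, 1.97, 3.09]  # first two disordered

print(modal_levels(table1_thresholds))  # [1, 2, 3, 4, 5, 6, 7]
print(modal_levels(table3_thresholds))  # [1, 3, 4, 5, 6, 7]
```

With the Table 3 thresholds, level 2 never becomes the most probable category, although it still has its own probability curve between levels 1 and 3.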

Step Calibrations and Modality

What is the relationship between step calibrations and modality? Consider a 3 category rating scale. In Figure 4 the steps are ordered. In Figure 5 the steps coincide. The maximum probability of the central category is .33. In Figure 6 the steps are disordered. For 3 categories, the relationship between the two step calibrations, F1 and F2, and the maximum probability of the central category, as plotted in Figure 7, is given by the ogive:

Maximum probability of central category = 1 / (1 + 2 exp((F1 - F2) / 2))

When the steps coincide (F1 = F2), this gives 1/3, the .33 of Figure 5.
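As a hedged numerical check, the sketch below assumes the Rasch rating scale model with item difficulty 0. Under that assumption the central category probability peaks midway between the two step calibrations, where it equals 1 / (1 + 2 exp((F1 - F2) / 2)). The code compares this closed form with a brute-force grid search; the function names and grid settings are illustrative.

```python
import math


def central_probability(theta, f1, f2):
    """Probability of the middle of 3 categories (item difficulty assumed 0)."""
    middle = math.exp(theta - f1)
    top = math.exp(2.0 * theta - f1 - f2)
    return middle / (1.0 + middle + top)


def max_central_numeric(f1, f2, half_width=10.0, step=0.001):
    """Largest central-category probability found on a grid around (f1 + f2) / 2."""
    start = (f1 + f2) / 2.0 - half_width
    n_steps = int(round(2.0 * half_width / step))
    return max(central_probability(start + i * step, f1, f2)
               for i in range(n_steps + 1))


def max_central_closed_form(f1, f2):
    """Closed form obtained by maximizing central_probability over theta."""
    return 1.0 / (1.0 + 2.0 * math.exp((f1 - f2) / 2.0))


for f1, f2, label in [(-1.0, 1.0, "ordered"), (0.0, 0.0, "coincident"),
                      (1.0, -1.0, "disordered")]:
    print(label, round(max_central_numeric(f1, f2), 4),
          round(max_central_closed_form(f1, f2), 4))
```

The coincident case reproduces the maximum of about .33 stated in the text, and the maximum shrinks further as the steps become more disordered.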

Step Calibrations and the Latent Variable

From the perspective of cumulative probabilities, i.e., Thurstone thresholds as computed according to the Rasch model (Figure 8), as the step calibrations become more disordered, the central category becomes narrower. Step disordering does not indicate that the category definitions are out of sequence, but rather that the category defines a narrow section of the variable. Empirically, disordered step calibrations may indicate that the category definition is too narrow, or that too many category options have been presented to respondents. Consequently, combining the narrow category with an adjacent category may simplify use of the rating scale or assist with communication of conclusions based on the scale.
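The narrowing can be illustrated for a 3 category scale by locating the two Thurstone thresholds, taken here as the points where the cumulative probability of being in category k or higher equals .5. The sketch assumes the Rasch rating scale model with item difficulty 0 and finds each point by bisection; the function names and the bisection bounds are illustrative assumptions.

```python
import math


def probabilities_3(theta, f1, f2):
    """Category probabilities for a 3 category scale (item difficulty assumed 0)."""
    weights = [1.0, math.exp(theta - f1), math.exp(2.0 * theta - f1 - f2)]
    total = sum(weights)
    return [w / total for w in weights]


def thurstone_threshold(k, f1, f2, lo=-20.0, hi=20.0):
    """Location where P(category >= k) = 0.5, for k = 1 or 2, by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if sum(probabilities_3(mid, f1, f2)[k:]) < 0.5:
            lo = mid  # cumulative probability still too small: move right
        else:
            hi = mid
    return (lo + hi) / 2.0


def central_width(f1, f2):
    """Distance between the two Thurstone thresholds bounding the middle category."""
    return thurstone_threshold(2, f1, f2) - thurstone_threshold(1, f1, f2)


for f1, f2, label in [(-2.0, 2.0, "ordered"), (0.0, 0.0, "coincident"),
                      (2.0, -2.0, "disordered")]:
    print(label, round(central_width(f1, f2), 3))
```

The Thurstone thresholds stay in order in all three cases, so the central category always occupies an interval on the variable; that interval simply shrinks as the step calibrations move from ordered to disordered.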

Step Disordering Increases Item Discrimination

Expected score ogives (the model item characteristic curves shown in Figure 9) are steeper with disordered steps. Step disordering therefore indicates an item that is highly discriminating over a limited region of the variable, but that is less informative in other regions. Consequently, "high item discrimination" is not synonymous with "better functioning" or "more effective".
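A small sketch of this trade-off for a 3 category scale follows. It assumes the Rasch rating scale model with item difficulty 0 and estimates the slope of the expected score ogive by a central difference; the function names and the chosen step calibrations are illustrative assumptions.

```python
import math


def expected_score(theta, f1, f2):
    """Expected score (0 to 2) on a 3 category scale, item difficulty assumed 0."""
    w1 = math.exp(theta - f1)
    w2 = math.exp(2.0 * theta - f1 - f2)
    return (w1 + 2.0 * w2) / (1.0 + w1 + w2)


def ogive_slope(theta, f1, f2, h=1e-4):
    """Central-difference estimate of the slope of the expected score ogive."""
    return (expected_score(theta + h, f1, f2)
            - expected_score(theta - h, f1, f2)) / (2.0 * h)


ordered = (-2.0, 2.0)
disordered = (2.0, -2.0)

# Slope at the center of the item (theta = 0) and two logits away from it.
for label, (f1, f2) in [("ordered", ordered), ("disordered", disordered)]:
    print(label,
          "center:", round(ogive_slope(0.0, f1, f2), 3),
          "off-center:", round(ogive_slope(2.0, f1, f2), 3))
```

With these values the disordered item has the steeper ogive at its center but the flatter ogive two logits away, matching the point that its discrimination is concentrated in a limited region of the variable.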

John M. Linacre

Category Disordering (disordered categories) vs. Threshold Disordering (disordered thresholds). Linacre, J.M. … Rasch Measurement Transactions, 1999, 13:1 p. 675
