# Category Disordering (disordered categories) vs. Threshold Disordering (disordered thresholds)

Rasch rating scale structure parameters are also called Andrich thresholds, step calibrations, or taus. They relate directly to category probabilities, that is, to how likely each category is to be observed, not to the substantive order of achievement of the categories. So when the step calibrations (taus) are disordered, they indicate that a category is less likely to be observed, not that it is easier to perform.

Here is an example that will produce disordered taus. Around 100 people work in a building. Let us count the number of people in the building at 10-minute intervals over several days. The "items" are the times of day. The "people" are the days. The rating scale is: fewer than 100 people, category 1; exactly 100, category 2; more than 100, category 3.

We will observe categories 1 and 3 far more often than category 2. As people arrive in the morning, we observe category 1; at peak times, category 3; in the evening, category 1 again. During a whole day we may never observe category 2. Category 2 certainly belongs between categories 1 and 3, but it is very difficult to observe, so the taus will be "disordered".

How, then, do we detect when the categories are actually substantively misordered? We use fit statistics, as the FIM example in the rest of this note illustrates.
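The building example can be sketched numerically. The following is a rough illustration only: it assumes every observation sits at the same location relative to the item, in which case each threshold reduces to the log of the ratio of adjacent category counts. The counts are hypothetical, and real Rasch software estimates thresholds jointly with the person and item measures.

```python
import math


def thresholds_from_counts(counts):
    """Crude thresholds: log(count of lower category / count of upper category).

    Valid only under the simplifying assumption that all observations share
    one location relative to the item (an assumption made for this sketch).
    """
    return [math.log(counts[k - 1] / counts[k]) for k in range(1, len(counts))]


def is_ordered(taus):
    """True when each threshold is larger than the one before it."""
    return all(a < b for a, b in zip(taus, taus[1:]))


# Hypothetical counts for categories 1, 2, 3 (not data from the article).
building_counts = [45, 10, 45]   # "exactly 100" is rarely observed
balanced_counts = [20, 60, 20]   # middle category is the most common

building_taus = thresholds_from_counts(building_counts)
balanced_taus = thresholds_from_counts(balanced_counts)

print("building:", building_taus, "ordered:", is_ordered(building_taus))
print("balanced:", balanced_taus, "ordered:", is_ordered(balanced_taus))
```

With a rarely observed middle category the first threshold exceeds the second, even though nothing is wrong with the substantive order of the categories.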

Category disordering occurs when the ordinal numbering of categories does not accord with their substantive meaning. Consider the 7-level FIM™ rating scale. Each level is substantively defined to represent a higher level of functioning. The ordinal numbering accords with this. But what would happen if the numbering of two categories was reversed? Then a higher category number could correspond to a lower level of functioning. The categories would be substantively disordered.

| FIM Level | Count | Average Measure | INFIT MNSQ | OUTFIT MNSQ | Step calibration (Rasch-Andrich threshold) |
|---|---|---|---|---|---|
| 1 | 96 | -2.80 | .98 | 1.02 | NONE |
| 2 | 88 | -2.04 | .75 | .80 | -2.22 |
| 3 | 101 | -1.02 | 1.07 | 1.03 | -1.70 |
| 4 | 168 | -.27 | 1.03 | 1.19 | -1.31 |
| 5 | 210 | .85 | 1.01 | .91 | .08 |
| 6 | 146 | 2.34 | .75 | .83 | 2.02 |
| 7 | 101 | 3.32 | .87 | .89 | 3.14 |

Table 1. Satisfactory Category Statistics. Average measures advance, thresholds advance, MNSQs near 1.0.

Table 1 shows the category summary statistics for some patient records with correctly coded FIM levels. Note that the "Average Measure" values advance with category. These indicate that, for this sample, higher patient performance corresponds to higher categories. The category mean-square fit statistics also do not markedly exceed their model values of 1.0. Figure 1 shows the modeled category probability curves. They depict the expected succession of "hills".

| FIM Level | Count | Average Measure | INFIT MNSQ | OUTFIT MNSQ | Step calibration (Rasch-Andrich threshold) |
|---|---|---|---|---|---|
| 1 (2) | 88 | -1.97 | 1.47 | 1.41 | NONE |
| 2 (1) | 96 | -2.18 | .54 | .69 | -2.08 |
| 3 | 101 | -.95 | 1.05 | 1.02 | -1.49 |
| 4 | 168 | -.25 | .91 | .99 | -1.24 |
| 5 | 210 | .80 | .97 | .87 | .08 |
| 6 | 146 | 2.14 | .66 | .75 | 1.87 |
| 7 | 101 | 3.02 | .83 | .86 | 2.86 |

Table 2. Category Disordering. Average measures disordered, MNSQs misfit > 1.0, but thresholds advance.

Now, suppose that due to a coding or data entry error, the numbering of levels 1 and 2 was reversed, introducing substantive category disordering. Table 2 shows the resultant category statistics. The observed category counts verify that categories 1 and 2 have been reversed. Now the "average measure" values for categories 1 and 2 are disordered, and category 1 is exhibiting large misfit. Counter-intuitively, the step calibrations are ordered. The modeled category probability curves, shown in Figure 2, still depict a succession of "hills". This is because the measures, the Rasch model parameters, are always estimated on the basis that the data fit the model.

Substantive disordering of the categories is flagged by disordering in the "average measure" values and mean-square fit statistics much larger than 1.0 (indicating misfit), not disordering in the step calibrations nor in the shape of the probability curves. Of course, these statistics comment on the functioning of the rating scale for this sample. Whether substantive category disordering is due to a misspecification of the rating scale or to idiosyncrasies only found in the sample requires further investigation.
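The two diagnostics described above can be sketched as a small check over the values in Tables 1 and 2. This is a minimal illustration, not a procedure from the article: the function names are hypothetical, and the mean-square cutoff of 1.4 is an assumed value chosen only for this example.

```python
def disordered_average_measures(levels, average_measures):
    """Return levels whose average measure is below that of the level beneath."""
    return [
        levels[i]
        for i in range(1, len(levels))
        if average_measures[i] < average_measures[i - 1]
    ]


def misfitting_levels(levels, infit, outfit, cutoff=1.4):
    """Return levels whose INFIT or OUTFIT mean-square exceeds the cutoff.

    The default cutoff is an illustrative assumption, not a published rule.
    """
    return [lv for lv, i, o in zip(levels, infit, outfit) if max(i, o) > cutoff]


levels = [1, 2, 3, 4, 5, 6, 7]

# Values copied from Table 1 (correct coding).
t1_avg = [-2.80, -2.04, -1.02, -0.27, 0.85, 2.34, 3.32]
t1_infit = [0.98, 0.75, 1.07, 1.03, 1.01, 0.75, 0.87]
t1_outfit = [1.02, 0.80, 1.03, 1.19, 0.91, 0.83, 0.89]

# Values copied from Table 2 (levels 1 and 2 reversed).
t2_avg = [-1.97, -2.18, -0.95, -0.25, 0.80, 2.14, 3.02]
t2_infit = [1.47, 0.54, 1.05, 0.91, 0.97, 0.66, 0.83]
t2_outfit = [1.41, 0.69, 1.02, 0.99, 0.87, 0.75, 0.86]

t1_disordered = disordered_average_measures(levels, t1_avg)
t1_misfit = misfitting_levels(levels, t1_infit, t1_outfit)
t2_disordered = disordered_average_measures(levels, t2_avg)
t2_misfit = misfitting_levels(levels, t2_infit, t2_outfit)

print("Table 1:", t1_disordered, t1_misfit)
print("Table 2:", t2_disordered, t2_misfit)
```

For Table 2 the check flags the drop in average measure at category 2 and the misfit in category 1, while Table 1 raises no flags.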

## Step (Threshold) Disordering

The step calibrations or Rasch/Andrich thresholds correspond to the Rasch model parameters for the rating scale structure. Each step calibration parameterizes the relationship between a pair of adjacent categories. If, for a given item targeted directly at the person's ability level, a step calibration has a positive value, then the lower of the pair of categories is more likely to be observed. If the step calibration has a negative value, then the higher category of the pair is more likely to be observed.
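In symbols, this pairwise relationship can be written as follows. The notation is an assumption of this sketch, following common Rasch usage: $B_n$ is the ability of person $n$, $D_i$ the difficulty of item $i$, $F_k$ the step calibration between categories $k-1$ and $k$, and $P_{nik}$ the probability that person $n$ is observed in category $k$ of item $i$.

$$\ln\left(\frac{P_{nik}}{P_{ni(k-1)}}\right) = B_n - D_i - F_k$$

When the item is targeted on the person ($B_n = D_i$), the log-odds reduce to $-F_k$, so a positive $F_k$ favors the lower category of the pair and a negative $F_k$ favors the higher one.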

Rating scale categories, however, are not observed in pairs but in the entire set simultaneously. This complicates their interpretation. If the step calibrations become successively more positive as category number increases (as in the FIM examples), then the plot of the category probability curves depicts a "range of hills". Each category in turn is the most probable, and the intersections of the curves of adjacent modal categories correspond to the step calibrations.

If the step calibrations do not increase monotonically with category number, i.e., are disordered, then one or more categories are never modal, and one or more "hill tops" are missing from the range of hills.

| FIM Level | Count | Average Measure | INFIT MNSQ | OUTFIT MNSQ | Step calibration (Rasch-Andrich threshold) |
|---|---|---|---|---|---|
| 1 | 96 | -2.81 | .90 | .96 | NONE |
| 2 | 44 | -1.96 | .88 | .92 | -1.49 |
| 3 | 101 | -1.03 | 1.02 | .98 | -2.33 |
| 4 | 168 | -.30 | 1.07 | 1.22 | -1.29 |
| 5 | 210 | .82 | .96 | .88 | .05 |
| 6 | 146 | 2.30 | .75 | .82 | 1.97 |
| 7 | 101 | 3.27 | .87 | .89 | 3.09 |

Table 3. Low Frequency in Category 2. Thresholds disordered, but average measures advance, MNSQs near 1.0.

## An Example of Step Disordering

To illustrate this, consider the FIM data presented above, but with every other observation of level 2 made missing. Table 3 shows the resulting category statistics. Compare these with Table 1. The count for level 2 is reduced by 50%. The step calibration from level 2 to 3, -2.33, is now less than that from level 1 to 2, -1.49, and so is disordered. As shown in Figure 3, category 2 is no longer modal. The cross-over between the curves for levels 2 and 3 (i.e., the step calibration) is to the left of that for levels 1 and 2. The crossover points are disordered. All other statistics, however, are almost identical. Step disordering has not introduced category disordering (as diagnosed by average measures) nor category misfit (as diagnosed by fit mean-squares).
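The loss of modality can be checked directly from the step calibrations in Tables 1 and 3. The sketch below assumes the Rasch rating scale model with the item located at 0 logits, and scans a grid of person measures to see which levels are ever the most probable. The function names and the grid limits are illustrative choices.

```python
import math


def category_probabilities(theta, thresholds, difficulty=0.0):
    """Rating scale model probabilities for all categories at one measure."""
    logits = [0.0]  # bottom category
    for tau in thresholds:
        logits.append(logits[-1] + (theta - difficulty - tau))
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def modal_levels(thresholds, lo=-6.0, hi=6.0, steps=1200):
    """Levels (numbered from 1) that are most probable somewhere on the grid."""
    found = set()
    for i in range(steps + 1):
        theta = lo + (hi - lo) * i / steps
        probs = category_probabilities(theta, thresholds)
        found.add(probs.index(max(probs)) + 1)
    return sorted(found)


# Step calibrations copied from Table 1 and Table 3.
table1_thresholds = [-2.22, -1.70, -1.31, 0.08, 2.02, 3.14]
table3_thresholds = [-1.49, -2.33, -1.29, 0.05, 1.97, 3.09]

table1_modal = modal_levels(table1_thresholds)
table3_modal = modal_levels(table3_thresholds)

print("Table 1 modal levels:", table1_modal)
print("Table 3 modal levels:", table3_modal)
```

With the ordered thresholds of Table 1 every level is modal somewhere; with the Table 3 thresholds level 2 never is, because it would have to be more probable than level 1 (measure above -1.49) and more probable than level 3 (measure below -2.33) at the same time.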

## Step Calibrations and Modality

What is the relationship between step calibrations and modality? Consider a 3-category rating scale. In Figure 4 the steps are ordered. In Figure 5 the steps coincide, and the maximum probability of the central category is .33. In Figure 6 the steps are disordered. For 3 categories, the relationship between the two step calibrations, F1 and F2, and the maximum probability of the central category, as plotted in Figure 7, is given by the ogive:

$$P_{\max} = \frac{1}{1 + 2\exp\left((F_1 - F_2)/2\right)}$$

When the steps coincide (F1 = F2), this gives 1/3, the value seen in Figure 5.
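A short numerical check of this relationship, assuming the Rasch rating scale model with the item at 0 logits: the sketch finds the peak of the central category curve on a grid of measures and compares it with the closed form the model implies, 1 / (1 + 2·exp((F1 − F2)/2)). The function names and example thresholds are illustrative.

```python
import math


def central_probability(theta, f1, f2):
    """Probability of the middle category of a 3-category rating scale."""
    w0 = 1.0
    w1 = math.exp(theta - f1)
    w2 = math.exp(2 * theta - f1 - f2)
    return w1 / (w0 + w1 + w2)


def peak_on_grid(f1, f2):
    """Largest middle-category probability on a grid from -10 to 10 logits."""
    return max(central_probability(i / 100, f1, f2) for i in range(-1000, 1001))


def peak_closed_form(f1, f2):
    """Closed form for the peak, reached midway between the two thresholds."""
    return 1.0 / (1.0 + 2.0 * math.exp((f1 - f2) / 2.0))


# Example thresholds (illustrative): ordered, coincident, disordered.
cases = {"ordered": (-1.0, 1.0), "coincident": (0.0, 0.0), "disordered": (1.0, -1.0)}
peaks = {name: (peak_on_grid(*f), peak_closed_form(*f)) for name, f in cases.items()}

for name, (grid_value, formula_value) in peaks.items():
    print(f"{name}: grid {grid_value:.4f}, closed form {formula_value:.4f}")
```

The peak exceeds 1/3 when the steps are ordered, equals 1/3 when they coincide, and falls below 1/3 when they are disordered.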

## Step Calibrations and the Latent Variable

From the perspective of cumulative probabilities, i.e., Thurstone thresholds as computed according to the Rasch model (Figure 8), as the step calibrations become more disordered, the central category becomes narrower. Step disordering does not indicate that the category definitions are out of sequence, but rather that the category defines a narrow section of the variable. Empirically, disordered step calibrations may indicate that the category definition is too narrow, or that too many category options have been presented to respondents. Consequently, combining the narrow category with an adjacent category may simplify use of the rating scale or assist with communication of conclusions based on the scale.
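The narrowing of the central category can be sketched numerically. Here the Thurstone threshold for category k is taken to be the measure at which the probability of being observed in category k or above is 0.5, located by bisection under the Rasch rating scale model. The example thresholds, search limits and function names are assumptions of this illustration.

```python
import math


def category_probabilities(theta, thresholds, difficulty=0.0):
    """Rating scale model probabilities for all categories at one measure."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - difficulty - tau))
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def thurstone_thresholds(thresholds, lo=-20.0, hi=20.0, tol=1e-10):
    """Measures where P(category >= k) = 0.5, for k = 1 .. top category."""
    result = []
    for k in range(1, len(thresholds) + 1):
        a, b = lo, hi
        while b - a > tol:
            mid = (a + b) / 2
            # P(category >= k) increases with the measure, so bisection works.
            if sum(category_probabilities(mid, thresholds)[k:]) < 0.5:
                a = mid
            else:
                b = mid
        result.append((a + b) / 2)
    return result


ordered = thurstone_thresholds([-1.0, 1.0])
coincident = thurstone_thresholds([0.0, 0.0])
disordered = thurstone_thresholds([1.0, -1.0])

for name, t in (("ordered", ordered), ("coincident", coincident), ("disordered", disordered)):
    print(f"{name}: thresholds {t[0]:.3f}, {t[1]:.3f}; central width {t[1] - t[0]:.3f}")
```

In all three cases the Thurstone thresholds stay in order, but the interval they mark out for the central category shrinks as the step calibrations move from ordered to disordered.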

## Step Disordering Increases Item Discrimination

Expected score ogives (the model item characteristic curves shown in Figure 9) are steeper with disordered steps. Thus step disordering indicates an item that is highly discriminating over a limited region of the variable, but that is less informative in other regions. Thus "high item discrimination" is not synonymous with "better functioning" or "more effective".
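The steeper ogive can be illustrated with a small calculation. Under the Rasch rating scale model, the slope of the expected score curve at a given measure equals the model variance of the observation there, which is also the item information. The sketch below compares ordered and disordered thresholds at the center of a 3-category item and at a point away from it; the thresholds, the comparison points and the function names are illustrative assumptions.

```python
import math


def category_probabilities(theta, thresholds, difficulty=0.0):
    """Rating scale model probabilities for all categories at one measure."""
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - difficulty - tau))
    top = max(logits)
    weights = [math.exp(v - top) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score_and_slope(theta, thresholds):
    """Expected score (categories scored 0, 1, 2, ...) and its slope."""
    probs = category_probabilities(theta, thresholds)
    expected = sum(k * p for k, p in enumerate(probs))
    # The derivative of the expected score equals the score variance.
    slope = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    return expected, slope


ordered = [-1.0, 1.0]
disordered = [1.0, -1.0]

_, slope_ordered_center = expected_score_and_slope(0.0, ordered)
_, slope_disordered_center = expected_score_and_slope(0.0, disordered)
_, slope_ordered_far = expected_score_and_slope(3.0, ordered)
_, slope_disordered_far = expected_score_and_slope(3.0, disordered)

print("slope at 0 logits:", slope_ordered_center, slope_disordered_center)
print("slope at 3 logits:", slope_ordered_far, slope_disordered_far)
```

At the center of the item the disordered thresholds give the steeper slope; three logits away the ordered thresholds give the steeper slope, consistent with discrimination being concentrated in a limited region.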

John M. Linacre

Category Disordering (disordered categories) vs. Threshold Disordering (disordered thresholds). Linacre, J.M. … Rasch Measurement Transactions, 1999, 13:1 p. 675
