How to Assign Item Weights: Item Replication or Rating Scales?

Recommendation: If the additional weight is intended to indicate a higher level of performance, then use a rating scale.
If the additional weight is intended to indicate replications of the same level of performance, then use item weighting.
If the additional weight is merely to make the scores look nicer, then use linear-rescaling of the measurement units.

    Examples: a dichotomous item is scored 0-4 instead of 0-1:
  1. Score levels 1,2,3 exist conceptually, but are not observed in these data. Analyze 0-4 as a rating scale or partial credit item. (In Winsteps, STKEEP=Yes, IWEIGHT=1)
  2. 0-4 is specified because this item is considered to be 4 times as important as a 0-1 item. Analyze as 0-1 but give the item a weight of 4 or 4 replications in the data. (In Winsteps, STKEEP=No, IWEIGHT=4)
  3. 0-4 is specified because there are 25 items and we want the raw score range to be 0-100. Analyze as 0-1 but report the raw scores as 0-4. (In Winsteps, STKEEP=No, IWEIGHT=1)

In general, each observation is expected to be an independent and equal witness to examinee ability. The scientific motivation for this expectation is comparable to the motivations for random sampling and randomization. The introduction of arbitrary emphases, such as item weights, degrades the inferential stability of results and biases conclusions in an unreproducible way.

In the political world of examinations, however, some observations are decreed more important than others. For instance, if a pass- fail decision is to be made on the composite outcome of a 100 item MCQ test and one essay graded from 0 to 10, then the examination board may decide to assign the essay rating a weight 10 times heavier in order to give the essay and the MCQ items supposedly "equal" weight in the final decision.

Should you fall victim to such a decree, there are several ways the weights can be implemented with Rasch computer programs. Since each method has its drawbacks, initial data screening and quality control should proceed as though no weights existed. Once the measurement process has been validated, the following assignment methods may help:

1. The essay ratings and the MCQ items are analyzed separately, yielding two ability measures for each examinee. If there is insufficient overlap among the essay ratings, then additional constraints are required, such as modelling the ratings as binomial trials, and asserting that each grader is equally severe in order for a coherent set of essay measures to be produced. For the pass- fail decision, a weighted sum of the pairs of ability measures is used " the precise formula will be complicated by the different logit ranges of the two variables. The way to see what to do is to plot MCQ vs. Essay measures, and then to draw on this plot the line that best asserts the conjoint judgment of the standard setting committee. This method is the most comprehensible.

2. Each essay rating is entered 10 times (or each essay is given a weight of 10 times), and then the MCQ items and the essay ratings are analyzed together. This diminishes local independence among the observations but avoids the complication of two measurement scales. The replicated data will make the reported standard errors too small. In this example, they should be inflated about 75%. The 10 essay difficulties will be reported at about the same location on the variable as the one original essay difficulty.

3. Use explicit item weights, e.g., using IWEIGHT= in Winsteps, but adjusting the item weights to maintain approximately correct standard errors and score range. The original score range is 0-110. The essay is to be upweighted 10 times. This would give a score range 0-200. So to keep the meaningful score range, the weights needs to be adjusted by 110/200 = .55. So each MCQ item is weighted .55, and the essay item is weighted 5.50. This method is operationally the simplest.

4. Each essay rating is multiplied by 10, and then the rescaled 0- 100 essay ratings are analyzed with the MCQ items. Since only every 10th category of the 0-100 essay rating scale is observed, the analysis must allow for structurally present, but empirically absent, categories (Wilson RMT 5:1 p. 128). Again, standard errors will need to be inflated about 75% due to the effect of the fictitious categories. Only one essay difficulty will be reported, but it will not be at the same location on the variable as the 0-10 essay would have been. By convention, the difficulty of a rating scale item is chosen so that the sum of the step difficulties is zero, i.e., at the location on the variable where the highest and lowest possible ratings on the item are equally probable. If the difficulty of the 0-10 essay item is D logits from the center of the person ability distribution, the difficulty of the 0-100 essay item will be much closer to the mean ability, only about D/10 logits away. This makes the construct harder to understand, and can be confusing if the assigned weights are changed.

Assigning item weights: Item Replication or Rating Scales?. Linacre JM, Wright BD. … Rasch Measurement Transactions, 1995, 8:4 p.403

Rasch Publications
Rasch Measurement Transactions (free, online) Rasch Measurement research papers (free, online) Probabilistic Models for Some Intelligence and Attainment Tests, Georg Rasch Applying the Rasch Model 3rd. Ed., Bond & Fox Best Test Design, Wright & Stone
Rating Scale Analysis, Wright & Masters Introduction to Rasch Measurement, E. Smith & R. Smith Introduction to Many-Facet Rasch Measurement, Thomas Eckes Invariant Measurement: Using Rasch Models in the Social, Behavioral, and Health Sciences, George Engelhard, Jr. Statistical Analyses for Language Testers, Rita Green
Rasch Models: Foundations, Recent Developments, and Applications, Fischer & Molenaar Journal of Applied Measurement Rasch models for measurement, David Andrich Constructing Measures, Mark Wilson Rasch Analysis in the Human Sciences, Boone, Stave, Yale
in Spanish: Análisis de Rasch para todos, Agustín Tristán Mediciones, Posicionamientos y Diagnósticos Competitivos, Juan Ramón Oreja Rodríguez

To be emailed about new material on
please enter your email address here:

I want to Subscribe: & click below
I want to Unsubscribe: & click below

Please set your SPAM filter to accept emails from welcomes your comments:

Your email address (if you want us to reply):


ForumRasch Measurement Forum to discuss any Rasch-related topic

Go to Top of Page
Go to index of all Rasch Measurement Transactions
AERA members: Join the Rasch Measurement SIG and receive the printed version of RMT
Some back issues of RMT are available as bound volumes
Subscribe to Journal of Applied Measurement

Go to Institute for Objective Measurement Home Page. The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website,

Coming Rasch-related Events
May 17 - June 21, 2024, Fri.-Fri. On-line workshop: Rasch Measurement - Core Topics (E. Smith, Winsteps),
June 12 - 14, 2024, Wed.-Fri. 1st Scandinavian Applied Measurement Conference, Kristianstad University, Kristianstad, Sweden
June 21 - July 19, 2024, Fri.-Fri. On-line workshop: Rasch Measurement - Further Topics (E. Smith, Winsteps),
Aug. 5 - Aug. 6, 2024, Fri.-Fri. 2024 Inaugural Conference of the Society for the Study of Measurement (Berkeley, CA), Call for Proposals
Aug. 9 - Sept. 6, 2024, Fri.-Fri. On-line workshop: Many-Facet Rasch Measurement (E. Smith, Facets),
Oct. 4 - Nov. 8, 2024, Fri.-Fri. On-line workshop: Rasch Measurement - Core Topics (E. Smith, Winsteps),
Jan. 17 - Feb. 21, 2025, Fri.-Fri. On-line workshop: Rasch Measurement - Core Topics (E. Smith, Winsteps),
May 16 - June 20, 2025, Fri.-Fri. On-line workshop: Rasch Measurement - Core Topics (E. Smith, Winsteps),
June 20 - July 18, 2025, Fri.-Fri. On-line workshop: Rasch Measurement - Further Topics (E. Smith, Facets),
Oct. 3 - Nov. 7, 2025, Fri.-Fri. On-line workshop: Rasch Measurement - Core Topics (E. Smith, Winsteps),


The URL of this page is