A Rasch-Informed Standard Setting Procedure

Items are written for the different levels of ability that exist in the institution. Alternatively, judges assign already written items to the existing levels of ability. In other words, they decide what ability level a person should be to answer each item. If there are say, five levels of proficiency from A to E, A being the highest and E the lowest, then the items are rated on a five point scale from 1-5. Five corresponding to the lowest level, E and 1 to the highest level, A. For instance, if the judges agree that "border line" Level B students can answer an item correctly the item is rated 2 and if they envisage that for answering the item a test-taker should be at least a border line Level A student then the item is rated 1. The average judge ratings of an item is considered as its final difficulty estimate. All the items are rated in this way.

Afterwards, the items are administered to a group of test-takers and then Rasch analyzed and the location calibrations for the items are estimated. The success and preciseness of the standard setting procedure heavily depends on the accordance between the judge-envisaged item difficulties and empirical student-based item difficulties. Any standard setting procedures in which this accordance is not achieved is futile.

If the judges have done their job properly then there must be a correspondence between the empirical item estimates and judge-based item difficulties. Figure 1 shows the item estimates hierarchy on an item-person map. The Level A items are clustered at the top of the map and the other levels' items are ordered accordingly. However, there are some items which are misplaced. It is obvious that judge-intended levels of the items never correspond exactly with the Rasch measures. For instance, as can be seen in Figure 1, Rasch has reported some A items down in the B region (or below) and some B items up in the A region (or above).

Standard-setting always requires a compromise between the judges' item hierarchy and the empirical (Rasch) item hierarchy which corresponds to actual examinee performance. Standard-setting also requires negotiation about the location of the criterion levels. There will be several reasonable positions for the criterion level, from least-demanding to most demanding.

We might choose the transition points to be the lines at which the minimum number of items are misclassified between two adjacent levels. For example, the transition point between Level A and Level B is the point where the items predominantly become Level B items (as is done in Figure 1). That is, the difficulty level of item 18A or 97B which is 1.53 logits.

Or we might choose the line corresponding to 60% chances of success on the items that fall in the transition points determined by the procedure above. For example, the items at the transition points between Level A and Level B have a difficulty estimate of 1.53 logits. This is an item of borderline difficulty. In other words, an ability estimate of 1.53 logits can be the minimum cut-off score for Level A. This is the ability level required to have 50% chances of getting this item right. To be on a safe side, one can also define:

"cut-off score" = 60% chances of success on an item of borderline difficulty

Therefore, the cut-off score for Level A will be:

Pni (Xni=1| θn, δi) = exp(θn - δi) / (1 + exp(θn - δi))

0.60 = exp(θn - 1.53) / (1 + exp(θn - 1.53))

loge(1.5) = θn - 1.53

θn = 1.93

The cut-off scores for the other levels can be determined in a similar way. The items at the point of transition between Level B and Level C are 15C, 41B, 4B, 70B, 81B, 82B with difficulty estimates of 0.68 logits. Therefore the cut-off score for Level B can either be 0.68 logits, if we consider the 50% chances of success on the items at the transition point, or 1.08 logits if we consider 60% chances of success at the transition point.


Figure 1: Difficulty order of items and their judge-based corresponding levels



Baghaei P. (2009) A Rasch-Informed Standard Setting Procedure, Rasch Measurement Transactions, 2009, 23:2, 1214



Rasch Publications
Rasch Measurement Transactions (free, online) Rasch Measurement research papers (free, online) Probabilistic Models for Some Intelligence and Attainment Tests, Georg Rasch Applying the Rasch Model 3rd. Ed., Bond & Fox Best Test Design, Wright & Stone
Rating Scale Analysis, Wright & Masters Introduction to Rasch Measurement, E. Smith & R. Smith Introduction to Many-Facet Rasch Measurement, Thomas Eckes Invariant Measurement: Using Rasch Models in the Social, Behavioral, and Health Sciences, George Engelhard, Jr. Statistical Analyses for Language Testers, Rita Green
Rasch Models: Foundations, Recent Developments, and Applications, Fischer & Molenaar Journal of Applied Measurement Rasch models for measurement, David Andrich Constructing Measures, Mark Wilson Rasch Analysis in the Human Sciences, Boone, Stave, Yale
in Spanish: Análisis de Rasch para todos, Agustín Tristán Mediciones, Posicionamientos y Diagnósticos Competitivos, Juan Ramón Oreja Rodríguez

To be emailed about new material on www.rasch.org
please enter your email address here:

I want to Subscribe: & click below
I want to Unsubscribe: & click below

Please set your SPAM filter to accept emails from Rasch.org

www.rasch.org welcomes your comments:

Your email address (if you want us to reply):

 

ForumRasch Measurement Forum to discuss any Rasch-related topic

Go to Top of Page
Go to index of all Rasch Measurement Transactions
AERA members: Join the Rasch Measurement SIG and receive the printed version of RMT
Some back issues of RMT are available as bound volumes
Subscribe to Journal of Applied Measurement

Go to Institute for Objective Measurement Home Page. The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website, www.rasch.org.

Coming Rasch-related Events
June 29 - July 27, 2018, Fri.-Fri. On-line workshop: Practical Rasch Measurement - Further Topics (E. Smith, Winsteps), www.statistics.com
July 25 - July 27, 2018, Wed.-Fri. Pacific-Rim Objective Measurement Symposium (PROMS), (Preconference workshops July 23-24, 2018) Fudan University, Shanghai, China "Applying Rasch Measurement in Language Assessment and across the Human Sciences", www.promsociety.org
July 29 - August 4, 2018 Vth International Summer School `Applied Psychometrics in Psychology and Education`, Institute of Education at the Higher School of Economics, St. Petersburg, Russia, https://ioe.hse.ru/en/announcements/215681182.html
July 30 - Nov., 2018Online Introduction to Classical and Rasch Measurement Theories (D.Andrich), University of Western Australia, Perth, Australia, http://www.education.uwa.edu.au/ppl/courses
Aug. 10 - Sept. 7, 2018, Fri.-Fri. On-line workshop: Many-Facet Rasch Measurement (E. Smith, Facets), www.statistics.com
August 25 - 28, 2018, Sat.-Tue.Análisis de Rasch introductorio (en español). (Agustín Tristán), Instituto de Evaluación e Ingeniería Avanzada. San Luis Potosí, México. www.ieia.com.mx
Sept. 3 - 6, 2018, Mon.-Thurs. IMEKO World Congress, Belfast, Northern Ireland, www.imeko2018.org
Oct. 12 - Nov. 9, 2018, Fri.-Fri. On-line workshop: Practical Rasch Measurement - Core Topics (E. Smith, Winsteps), www.statistics.com

 

The URL of this page is www.rasch.org/rmt/rmt232f.htm

Website: www.rasch.org/rmt/contents.htm