September 2009

Item writers often find it difficult to write multiple choice items that comply with good item-writing guidelines. This study shows that the extra effort spent writing good items is worthwhile.

Ross Brown
Manager, Test Development and Analysis

Consequences of Flawed Items
Many guidelines for writing good multiple choice items are intended to reduce the measurement error that results when candidates who may well know the information being tested get an item wrong because of the way the item is constructed. Two examples of item flaws that may introduce such measurement error are multiple true/false items and items with negative stems.

Multiple true/false items violate the principle that items should be focused on a single idea or issue. Multiple true/false items usually consist of a minimal stem and options that are conceptually unrelated. Candidates are required to assess each option independently and determine whether it is true or false. For example:

                        The common cold:
                        A.  is transmitted through saliva only.
                        B.  is evident in a chest X-ray.
                        C.  will most often clear up after two days.
                        D.  is treatable with Tamiflu.

Items with negative stems require candidates to select the one option that does NOT meet the conditions described in the stem. Candidates may answer these items incorrectly because they skim over the negative word in the stem and mistakenly choose a response that does meet the conditions in the stem. In addition, these items do not assess what the candidate actually knows, but rather whether the candidate can identify an incorrect response to the issue presented in the stem. For example, a candidate can answer the question below without knowing the color of a pomegranate.

                        Which of the following is NOT red?
                        A.  apples
                        B.  pomegranates
                        C.  pears
                        D.  tomatoes

This study looked at the consequences of using items with these flaws in terms of 1) item difficulty and 2) candidate outcomes. It is patterned after a study of items administered to medical school students by Downing (2005). The analysis was conducted on a group of 138 items, of which 69 were flawed and 69 were unflawed. The item flaws were multiple true/false items and items with negative stems.

Item p-value is the percentage of candidates who answered the item correctly. The table below shows that the average p-value for the flawed items was lower than for the unflawed items and the total items, indicating these items are more difficult for candidates to answer correctly.


Table: Average item p-value for Flawed Items, Unflawed Items, and Total Items (values not preserved in this copy).
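
The p-value comparison described above can be illustrated with a short Python sketch. The response matrix, the flawed/unflawed flags, and the function names below are hypothetical and chosen only for illustration; they are not the study's data or its analysis code. The p-value is expressed here as a proportion (multiply by 100 for a percentage).

```python
from statistics import mean


def item_p_values(responses):
    """Return, for each item, the proportion of candidates who answered it correctly.

    `responses` is a list of candidate rows; each row holds one 0/1 score per item.
    """
    n_candidates = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_candidates for i in range(n_items)]


def average_p_value(p_values, item_indices):
    """Average the p-values of the items listed in `item_indices`."""
    return mean(p_values[i] for i in item_indices)


# Hypothetical toy data: 4 candidates x 4 items (1 = correct, 0 = incorrect).
responses = [
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 1],
]
flawed_items = [0, 1]    # assumed flags, for illustration only
unflawed_items = [2, 3]
all_items = flawed_items + unflawed_items

p_values = item_p_values(responses)
avg_flawed = average_p_value(p_values, flawed_items)
avg_unflawed = average_p_value(p_values, unflawed_items)
avg_total = average_p_value(p_values, all_items)

print(p_values)                              # [0.75, 0.25, 1.0, 0.75]
print(avg_flawed, avg_unflawed, avg_total)   # 0.5 0.875 0.6875
```

In this toy example the flawed group has the lower average p-value, which is the pattern the study reports for its real items.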




For purposes of this study, the passing standard was set arbitrarily at a score of 65% correct. Candidate outcomes were then determined based on the total items, the flawed items only, and the unflawed items only. Only 37% of the candidates passed when the flawed items were used, compared to 71% when the unflawed items were used and 52% based on the total items.

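
The pass-rate comparison can be sketched the same way. The example below assumes that a candidate passes when the percent-correct score on the chosen item set is at or above the 65% standard; the article does not say how a score of exactly 65% was treated. The data and names are again hypothetical, not the study's.

```python
def percent_correct(row, item_indices):
    """Percent of the selected items that one candidate answered correctly."""
    return 100 * sum(row[i] for i in item_indices) / len(item_indices)


def pass_rate(responses, item_indices, cut_score=65.0):
    """Percent of candidates scoring at or above `cut_score` on the selected items."""
    passed = sum(
        1 for row in responses if percent_correct(row, item_indices) >= cut_score
    )
    return 100 * passed / len(responses)


# Hypothetical toy data: 4 candidates x 4 items (1 = correct, 0 = incorrect).
responses = [
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 1],
]
flawed_items = [0, 1]    # assumed flags, for illustration only
unflawed_items = [2, 3]
all_items = flawed_items + unflawed_items

rate_flawed = pass_rate(responses, flawed_items)
rate_unflawed = pass_rate(responses, unflawed_items)
rate_total = pass_rate(responses, all_items)

print(rate_flawed, rate_unflawed, rate_total)   # 25.0 75.0 75.0
```

The same candidates are scored three times against one fixed standard, so any difference in pass rates comes from the items selected rather than from the candidates.
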
While this study is simulated from real data, it confirms the impact of flawed items found by Downing (2005). It also provides concrete evidence that supports eliminating multiple true/false items and items with negative stems from examinations.

Downing, S. M. (2005). The effects of violating standard item writing principles on tests and students: The consequences of using flawed test items on achievement examinations in medical education. Advances in Health Sciences Education, 10, 133-143.
Measurement Research Associates, Inc.
505 North Lake Shore Dr., Suite 1304
Chicago, IL  60611
Phone: (312) 822-9648     Fax: (312) 822-9650
