Multiple-choice items continue to dominate educational testing because they are an effective and relatively easy way to measure constructs like ability and achievement. The professional attention given to analyzing responses to multiple-choice test items is considerable and has led to advances in item response theories. In contrast, there is little scientific basis for multiple-choice item writing (Haladyna & Downing 1988a,b). Most item writing knowledge is based on personal experience or wisdom passed on from particular mentors. A paradox exists: we place more emphasis on analyzing responses than we do on how we obtain them. This article works toward restoring the balance by focusing on advances in item writing.
Theories of Item Writing:
There are many benefits in item-writing theories like those of
Bormuth, Guttman (facet theory), Hively, Tiemann & Markle, and
Williams & Haladyna. First, they emphasize the operational definition
of content in terms of the types and extent of cognitive behavior to
be tested. Second, they reduce the idiosyncrasies of item writers by
developing standardizing rules. Third, they facilitate the generation
of items to cover all of the relevant content domain. Fourth, they
provide evidence of construct and content validity. On the other
hand, the laborious nature of their item writing rules limits their
usefulness. Nevertheless, they offer the hope that item writing will
surmount the problems that limit test development today.
Old Item Formats:
* Multiple-choice (MCQ). A stem, in question or partial-sentence
form, and four or five options. There is some support for preferring
three options.
For what is San Diego best known?
A. Outstanding restaurants and fine dining B. Mild climate C. Major league baseball and football teams
* Matching. A modified MCQ in which a single set of options precedes the stems, thus focusing testing on one set of concepts.
Match the city with a distinguishing feature.
A. San Diego B. San Francisco C. Seattle D. Los Angeles
1. Proximity to islands and protected ocean water
2. Entertainment and tourist attractions
3. Cool climate and unpolluted air
* True-false. This format has gained a negative reputation because of substantial evidence against its use. But there are revisions which seem worthwhile (see Alternate-choice and Multiple true-false below).
New Item Formats:
Here are some new multiple-choice formats, with my recommendations
based on a survey of the research.
* Alternate-choice. A two-option MCQ with the limitation of a 50% probability of guessing the right answer. Offsetting this is efficiency: one can administer many more alternate-choice items than conventional MCQs in a fixed time. Lord argues that two-option testing is ideal for high-achieving examinees, while four- or five-option MCQs work best with low achievers. When Steve Downing and I analyzed distractor use for three standardized tests, we found that most items contained only one or two working distractors. Recommended.
What is more popular in downtown San Diego?
A. Horton Plaza B. Gaslite District
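The trade-off between the 50% guessing rate and the larger number of items can be made concrete with a binomial calculation. The following is a minimal sketch, not a result from the research surveyed here: the test lengths (20 and 60 items) and the 70% cut score are hypothetical values chosen for illustration, and the function name `guessing_tail` is an assumption of this example. It shows that the chance of reaching a fixed proportion correct by blind guessing shrinks as more two-option items are administered.

```python
from math import comb

def guessing_tail(n_items: int, n_options: int, min_correct: int) -> float:
    """Probability of at least min_correct right answers by blind guessing.

    Assumes every item has n_options equally attractive options, so the
    chance of guessing any one item correctly is 1 / n_options.
    """
    p = 1.0 / n_options
    return sum(
        comb(n_items, k) * p**k * (1.0 - p) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )

# Hypothetical comparison: a short and a long alternate-choice test,
# each with a cut score of 70% correct.
for n_items, min_correct in [(20, 14), (60, 42)]:
    prob = guessing_tail(n_items, n_options=2, min_correct=min_correct)
    print(f"{n_items} two-option items, {min_correct} needed: {prob:.4f}")
```

The same function can be called with `n_options=4` or `n_options=5` to compare conventional MCQs of a given length.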
* Complex multiple-choice (Type K). Sets of answers are combined to form the multiple-choice options. These items are usually more difficult and less discriminating than MCQs, and they require more development and administration time. The National Board of Medical Examiners has discontinued the use of this format. Not recommended.
What best represents San Diego's attractiveness?
1. Climate and location 2. Beaches and water sports 3. Tourist attractions (e.g. Sea World)
A. 1 & 2 B. 2 & 3 C. 1 & 3 D. 1, 2, & 3
* Multiple true-false. Like an MCQ, but the examinee evaluates the truthfulness of each option. Each option is numbered because each is a true-false item, while the stem is not numbered because it is the stimulus. The obstacle to this format is lack of familiarity. Another problem may be the tendency for one option to influence the response to another. This format is efficient, and any MCQ can be presented this way. Recommended.
What are major attractions in San Diego?
1. Sailing and water sports 2. Restaurants 3. Shopping 4. Tourist attractions (e.g. Sea World)
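Because any MCQ can be presented as a multiple true-false item, the conversion is mechanical: the stem stays unnumbered and each option becomes a numbered true-false item with its own key. The sketch below illustrates this under stated assumptions. The names `TrueFalseItem`, `to_multiple_true_false`, and `score` are hypothetical, and the answer keys passed in are placeholders for illustration only, since no keys are given for the example item.

```python
from dataclasses import dataclass

@dataclass
class TrueFalseItem:
    number: int      # each option is numbered because it is scored as an item
    statement: str
    key: bool        # True if the statement is keyed as true

def to_multiple_true_false(stem, options, true_options):
    """Turn an MCQ stem and its options into an unnumbered stem plus T/F items."""
    items = [
        TrueFalseItem(number, text, text in true_options)
        for number, text in enumerate(options, start=1)
    ]
    return stem, items

def score(items, responses):
    """Count the options the examinee judged correctly (one point each)."""
    return sum(1 for item, answer in zip(items, responses) if answer == item.key)

options = [
    "Sailing and water sports",
    "Restaurants",
    "Shopping",
    "Tourist attractions (e.g. Sea World)",
]
# Placeholder keys, assumed for illustration only.
assumed_true = {"Sailing and water sports", "Tourist attractions (e.g. Sea World)"}

stem, items = to_multiple_true_false(
    "What are major attractions in San Diego?", options, assumed_true
)
print(stem)
for item in items:
    print(f"{item.number}. {item.statement}  (T/F)")
```

Scoring each option separately is what makes the format efficient: one stem yields as many scored responses as it has options.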
* Context-dependent item set (Testlet). Stimulus material and 5-12 related test items. Any item format may be used. There are four types of context-dependent item sets: (1) pictorial: pictures, maps, drawings, graphs, data, photographs, art; (2) interlinear: a passage with denotations which provide a basis for questioning, usually for the detection of grammatical, spelling, punctuation, and capitalization errors; (3) interpretive: for reading comprehension; and (4) problem-solving. The item set is inefficient to construct and administer, but it is versatile and able to measure higher-level thinking. It is used extensively in certification and licensing testing programs. Recommended.
You are planning a one-week vacation to a West Coast city with a mild summer climate.
1. What is a reasonable estimate of daily food and lodging costs for two?
A. $50 B. $100 C. $200
2. What is a reasonable estimate of the minimum weekly rental rate of a sub-compact car?
A. $120 B. $150 C. $175
* Item shell. A successfully performing item from which the content
has been removed, leaving the syntactic structure. A mathematics
teacher wants to test problem solving in the context of financing the
purchase of an automobile. The original successful question states:
What is the annual interest charge on an auto loan of $10,000 at
8.5%?
The teacher strips out the loan amount and percentage rate, making an item shell, and replaces them with new values to generate similar items. This example may appear facile, but sophisticated item shells that tap aspects of higher-level thinking have been developed for medical problem solving and pharmacy. The ease of constructing the item shell counteracts "writer's block" and makes it appealing. Recommended.
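The auto-loan shell can be sketched as a template with two slots. This is a minimal illustration rather than a prescribed procedure: the names `SHELL`, `fill_shell`, and `generate_items` are hypothetical, the replacement loan amounts and rates are invented for the example, and the answer key assumes simple interest for one year (amount times rate), which the original item does not state explicitly.

```python
from itertools import product

# The item shell: the syntactic structure of the original question,
# with the loan amount and percentage rate removed.
SHELL = "What is the annual interest charge on an auto loan of ${amount:,} at {rate}%?"

def fill_shell(amount: int, rate: float) -> dict:
    """Return one item: the completed stem plus its key.

    Assumption: the key is simple interest for one year.
    """
    return {
        "stem": SHELL.format(amount=amount, rate=rate),
        "key": round(amount * rate / 100, 2),
    }

def generate_items(amounts, rates) -> list:
    """Generate one item for every combination of amount and rate."""
    return [fill_shell(amount, rate) for amount, rate in product(amounts, rates)]

# The original item, followed by new items built from hypothetical values.
print(fill_shell(10000, 8.5)["stem"])
for item in generate_items([8000, 12000], [6.0, 7.5, 9.0]):
    print(item["stem"], "Key:", item["key"])
```

Distractors are left out of the sketch; in practice they would need to be written or generated with the same care as the key.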
Haladyna TM, Downing SM (1988a) A taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 1, 37-50.
Haladyna TM, Downing SM (1988b) The validity of a taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 1, 51-78.
Advances in Item Design, T Haladyna Rasch Measurement Transactions, 1990, 4:2 p. 103-104
The URL of this page is www.rasch.org/rmt/rmt42b.htm