MEASUREMENT RESEARCH ASSOCIATES TEST INSIGHTS
October 2009
Greetings

Selecting the appropriate amount of time to allow candidates to complete a computer-based test can be a difficult decision. We hope that this simple study will provide some criteria for selecting an appropriate amount of time.

Phil Higgins
Manager, Computer-Based Testing

Candidate Measured Ability and Use of Time
I spoke with a candidate several days ago who was concerned that one minute per item would not be adequate time to answer the items. This prompted another review of how candidates use testing time, with the purpose of ascertaining how candidates of different measured ability levels use their time to respond to and review items. The items are four-option multiple-choice items.

Candidates were divided into three groups based on their overall percent correct. The high group had scores above 68% correct, the moderate group had scores between 52% and 68% correct, and the low group had scores of 52% correct or less. On average, candidates in every group spent less than a minute on their initial response to each item. Individual candidates varied, however: the minimum was about 0.5 minutes (31 seconds) and the maximum was less than 1.5 minutes (84 seconds). There was no statistically significant difference in the time per item used by high scoring and low scoring candidates. The table below shows the results.

Time in Seconds for Initial Response to Items

Candidate Group              Mean Seconds per Item   SD   Min   Max
High scoring candidates      49                      15   31    84
Moderate scoring candidates  55                      13   31    83
Low scoring candidates       57                      14   39    84
Total Population             54                      14   31    84
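
The grouping rule described above can be written as a short Python sketch. The function name `ability_group` is hypothetical and not part of the study. The text does not say which group a score of exactly 68% falls into; this sketch assumes it counts as moderate, because the high group is defined as "above 68%".

```python
def ability_group(percent_correct: float) -> str:
    """Assign a candidate to a group from overall percent correct (0 to 100).

    Cut points follow the article: above 68 is high, 52 or less is low,
    and everything in between is moderate. Treating exactly 68 as
    moderate is an assumption, not something the article states.
    """
    if not 0 <= percent_correct <= 100:
        raise ValueError("percent_correct must be between 0 and 100")
    if percent_correct > 68:
        return "high"
    if percent_correct > 52:
        return "moderate"
    return "low"


# Illustrative scores only; these are not data from the study.
for score in (75, 68, 60, 52, 40):
    print(score, ability_group(score))
```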

A second issue is how much time candidates spend reviewing items. Using the same candidate performance groups, the number of seconds candidates spent reviewing each item was calculated. On average, candidates spent less than 15 seconds reviewing an item. The minimum was 0 seconds (no review) and the maximum was 44 seconds, which is still less than a minute. It is interesting that the lowest scoring candidates had the smallest maximum review time (24 seconds). There was no statistically significant difference in the time per item used by high scoring and low scoring candidates.

Time in Seconds for Review of Items

Candidate Group              Mean Seconds per Item to Review   SD   Min   Max
High scoring candidates      14                                12   0     42
Moderate scoring candidates  9                                 10   0     44
Low scoring candidates       10                                10   0     24
Total Population             11                                11   0     44

The total amount of time a candidate interacted with each item, combining the initial response and any review, is presented in the next table. There was no statistically significant difference in the time per item used by high scoring and low scoring candidates.

Total Time in Seconds Used by Candidates to Answer Items

Candidate Group              Total Mean Seconds per Item   SD   Min   Max
High scoring candidates      63                            15   42    84
Moderate scoring candidates  65                            14   39    84
Low scoring candidates       67                            15   46    84
Total Population             65                            15   39    84

This simple study points to two conclusions. First, candidates, regardless of their ability, use approximately the same amount of time to respond to items. Second, allowing 60 to 90 seconds per item is ample time for candidates to respond to and review items. Since computer testing costs are often calculated by the amount of testing time used, this information may be useful in calculating the amount of time to allow for the test.

Measurement Research Associates, Inc.
505 North Lake Shore Dr., Suite 1304
Chicago, IL 60611
Phone: (312) 822-9648   Fax: (312) 822-9650
www.MeasurementResearch.com
