Calibrating readers of the Test of Written English (6.60), Carol M. Myford, Diana B. Marr
Reader assignment can have important consequences for examinees, particularly those whose scores lie in critical cut-score regions: such examinees may pass or fail depending on which readers they are assigned. This study sought to determine how interchangeable readers for the Test of Written English are, and how stable their harshness is.
Student self-assessment of foreign language speaking proficiency (6.60), Dorry M. Kenyon
Data from a new task-based self-assessment of foreign language speaking proficiency, collected from 300 high-school and college learners of Spanish, French and German, were analyzed to obtain a student performance-based scaling of the tasks. This scaling is compared with the American Council on the Teaching of Foreign Languages task hierarchy.
Changes in ratings of standard-setting judges over time (6.60), George Engelhard, Jr., David W. Anderson
The quality of judgments obtained from standard-setting judges is examined using a binomial trials model with a time effect parameter. A study of 25 Math judges and 22 English Language Arts judges suggests that there are significant differences in the quality of judgments from different judges. The practical implications of this are discussed.
Feasibility of producing user-based norms for the NBME examinations (12.46), J. Folske, C. Iwamoto, Ronald J. Nungester, Richard M.
In the past, item parameter estimates obtained by calibrating certification examinations administered in the 4th year of medical school were used to produce scores and norms for the 3rd year examinations. But 4th year-based norms are not applicable to 3rd year performance. Developing 3rd year-specific scores and norms is found to be practical when several 3rd year examination forms are calibrated concurrently.
Does cheating (test-wiseness) on CAT pay: NOT! (12.46), Richard Gershon, Betty Bergstrom
When CAT tests allow review, examinees can give themselves an easier, off-target test by deliberately answering items incorrectly and then, on review, changing their answers from wrong to right, so raising their ability estimate. Simulated data indicate that this form of test-wise "cheating" is risky. When the test is much too easy and also short, examinee ability is severely underestimated if the examinee fails to correct even 1 or 2 deliberately wrong answers.
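The risk described in this abstract can be illustrated with a small calculation. The sketch below is not the authors' simulation: it assumes a dichotomous Rasch model, a hypothetical 10-item test whose items all have difficulty -2.0 logits (much too easy for an able examinee), and a hypothetical helper `rasch_ability` that finds the maximum-likelihood ability estimate by Newton-Raphson.

```python
import math

def rasch_ability(responses, difficulties, tol=1e-6, max_iter=100):
    """Maximum-likelihood ability estimate (in logits) under the
    dichotomous Rasch model. Illustrative helper, not from the paper."""
    score = sum(responses)
    if score == 0 or score == len(responses):
        # Zero and perfect scores have no finite ML estimate.
        raise ValueError("extreme score: ML estimate is not finite")
    theta = 0.0
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        expected = sum(probs)                      # expected raw score at theta
        info = sum(p * (1.0 - p) for p in probs)   # test information at theta
        step = (score - expected) / info           # Newton-Raphson update
        step = max(-1.0, min(1.0, step))           # damp large steps
        theta += step
        if abs(step) < tol:
            break
    return theta

# Assumed scenario: the examinee has steered the CAT to a short, easy test.
easy_test = [-2.0] * 10

# One deliberately wrong answer left uncorrected (9 of 10 right).
# Closed form for equal difficulties: -2 + ln(9/1), about 0.20 logits.
one_missed = rasch_ability([1] * 9 + [0], easy_test)

# Two left uncorrected (8 of 10 right): -2 + ln(8/2), about -0.61 logits.
two_missed = rasch_ability([1] * 8 + [0] * 2, easy_test)

print(round(one_missed, 3), round(two_missed, 3))
```

Under these assumptions, an examinee whose true ability is well above 0 logits would be reported near 0.2 logits after one uncorrected answer and below -0.6 logits after two, because each response on an off-target test carries a large share of the available information.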
Validity of item selection: computerized-adaptive and paper-and-pencil (12.46), Mary E. Lunz, Craig W. Deville
A validation committee rated CAT and P&P constructed examinations as having similar face validity, adherence to test specifications, ordering of items, and cognitive skill distribution. Psychometric properties were also found to be similar. Because CAT quality depends on the item pool, the characteristics of a well-constructed item pool are discussed.
ANOVA with Rasch measures (31.49), John M. Linacre
Ordinal data can be stratified at various levels to produce Rasch measures. These measures, with their standard errors, can then be further analyzed. One stratification estimates a measure for each examinee and then uses these measures to estimate demographic effects. Another stratification estimates measures for demographic effects directly from the ordinal observations. An example is used to contrast these approaches and their outcomes.
Partial credit modeling for theory of developmental sequence (31.49), Weimo Zhu, Karen A. Kurz
An instrument with 5 multi-level, partial credit items, based on the theory of developmental sequence, was administered to 517 children. Partial credit analysis confirmed and elucidated the developmental sequence theory.
Using Rasch to create measures from survey data (50.57), Rita K.
In secondary data analysis, items from surveys designed by others are chosen to act as proxies for variables a researcher wishes to define. Rasch analysis constructs measures that result in a fuller description of the variables than is provided by merely obtaining composite raw scores. In an example, measures are constructed for teachers' use of various grouping arrangements to assist tailoring of mathematics instruction.
Model for multifaceted tests: GAEL-C grammatical categories (50.57), Zora M. Ziazi, Betsy Jane Becker
Ability measures of 27 hearing-impaired students in 7 grammatical categories are used as outcome variables for a MANOVA model in which students' gender and degree of impairment are the independent variables. Significant differences are found for the within-subjects grammatical categories, but the sample size was too small to detect significant between-subject (gender, impairment) effects.
The effect of misfit on measurement (IOMW), John M. Linacre
Unmodeled behavior, misfit, degrades the quality of measures and so inflates their imprecision (standard errors). But how severe must misfit become for measures to be misleading or the measurement corrupt? A simulation study indicates that the levels of misfit usually encountered with carefully constructed tests present minimal threats to the validity of measure-based inferences.
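The abstract does not say which misfit statistics the simulation used. As a minimal sketch, assuming the infit and outfit mean-squares commonly reported in Rasch analysis and a dichotomous Rasch model, misfit in one person's response string could be quantified as follows. The function name `fit_mean_squares` and the example values are hypothetical.

```python
import math

def fit_mean_squares(responses, theta, difficulties):
    """Return (infit, outfit) mean-squares for one response string under
    the dichotomous Rasch model. Values near 1.0 match model expectation;
    larger values indicate unmodeled behavior (misfit)."""
    probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
    variances = [p * (1.0 - p) for p in probs]            # model variance per item
    sq_resid = [(x - p) ** 2 for x, p in zip(responses, probs)]
    # Outfit: unweighted mean of squared standardized residuals,
    # sensitive to surprises on items far from the person's ability.
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(responses)
    # Infit: information-weighted mean-square, dominated by on-target items.
    infit = sum(sq_resid) / sum(variances)
    return infit, outfit

# Assumed example: a person at 0 logits fails a very easy item (-3)
# and passes a very hard one (+3), two highly unexpected responses.
infit, outfit = fit_mean_squares([0, 1, 1], 0.0, [-3.0, 0.0, 3.0])
print(round(infit, 2), round(outfit, 2))
```

In this assumed example both statistics are well above 1.0, and outfit exceeds infit because the surprising responses occur on off-target items.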
Rasch factor analysis (IOMW), Benjamin D. Wright
Factor analysis and Rasch measurement are compared to show that they use the same data and estimation method to solve the same problem. But factor analysis is faulted for mistaking stochastic, ordinal observations for linear measures, and then failing to construct linear measures on the factors from the data. The utility of the Rasch approach is demonstrated in a comparative example.
AERA abstracts. Rasch Measurement Transactions, 1995, 8:4 p.392
Go to Institute for Objective Measurement Home Page. The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website, www.rasch.org.
The URL of this page is www.rasch.org/rmt/rmt84d.htm