At a 1990 AERA Rasch SIG session, the history of some key ideas underlying fundamental measurement was reviewed by both the presenters and the discussants. It was suggested that measurement in the physical sciences is "direct", e.g. a ruler measuring the height (length) of a person itself has a length. On the other hand, measurement in the social sciences is "indirect", e.g. an item measuring the ability of a person does not itself have an ability. In this analysis I will show that items do have abilities, and that it is only because we call their abilities "difficulties" that we lose sight of the similarity between physical and social measurement.
When one thinks of a ruler and its application, one thinks of equally spaced marks on a continuum with the feature that, when the ruler covers the appropriate range, any person can be located between a pair of adjacent marks. Latent trait theory and Rasch measurement specify that the items of a test can be located on a continuum onto which persons can also be mapped. The response of any person to any item is characterized by a probabilistic model and we expect that, from a set of relevant responses, each person can be located on the continuum, and therefore, in general, be located above some and below other items on that continuum. The locations of the items are counterparts to the marks on a ruler. So far, then, the analogy between physical and social measurement is maintained. There are two differences, but they are not critical to the case to be made: one is that items are seldom equally spaced like the markings on a ruler, and the other is that the location of a person is probabilistic and not deterministic, as is mistakenly presumed in the case of measurement by a ruler. In fact, when the markings on the ruler are very close, a probabilistic model, usually the normal distribution, is applied to determine whether the object to be measured is located above or below any mark.
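The probabilistic model mentioned above can be made concrete with a minimal sketch. It assumes the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between the person location and the item location on the common continuum (in logits). The function name `rasch_probability` and the example locations are illustrative assumptions, not values from the session.

```python
import math


def rasch_probability(person_location: float, item_location: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    Both locations lie on the same continuum and are expressed in logits.
    """
    return 1.0 / (1.0 + math.exp(-(person_location - item_location)))


# Hypothetical item locations playing the role of marks on a ruler.
item_locations = [-2.0, -1.0, 0.5, 1.5]
# Hypothetical person location on the same continuum.
person_location = 0.0

for item_location in item_locations:
    p = rasch_probability(person_location, item_location)
    side = "above" if person_location > item_location else "below"
    print(f"item at {item_location:+.1f}: person is {side}, P(correct) = {p:.2f}")
```

When the person and the item share a location, the probability is 0.5, which is the probabilistic counterpart of an object that sits exactly on a ruler mark.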
The locations of the items on the continuum are generally called "difficulties" and the locations of the persons are generally called "abilities". However, because both the items and the persons are located on the same continuum, it follows that either the items and the persons both have difficulties or the items and the persons both have abilities. It is only a matter of convention that different words are used for the locations of persons and items. Thus it may be said that, by definition, the items have abilities.
In order to provide a substantive, in addition to semantic, case that an item has an ability, consider the history of assessment and the intention behind the construction and use of items. It is only recently that emphasis has been placed on the written item. Even now much testing is carried out without formal written items. The main form of assessment, conducted by teachers across the whole range of education, is in oral interchange between teacher and student. At one end of the education continuum, there is the Ph.D. oral examination; at the other is the oral assessment of students by teachers in schools as they supervise their students' studies. The written item has become so widespread because of the development of educational assessment technology. Literacy, including literacy in particular subject disciplines, can only be assessed by including a written component in the assessment. In addition, it is necessary to retain records of the tasks and performances, sometimes for legal reasons. The science of educational measurement has shown the potential for unreliability in oral assessments. The need to increase reliability and validity by controlling disturbing factors has enhanced the establishment of written forms of assessment. The assessment of abilities through written items, then, can be seen as an evolution from oral assessment, but a technical evolution that has retained the essential intention and process of oral assessment.
In an oral assessment, the examiner may pose a question, and the student may answer it. By focusing on oral assessment, it becomes clear that the examiner, the instrument of assessment, must have an ability of the same kind as is being assessed in the student (and hopefully more of it, just as a ruler should have a greater length than the person whose height is being measured). The examiner may choose to manifest his or her ability in posing a question at a particular level. If the student's answer manifests an ability that the examiner is satisfied is above that level, then the examiner may manifest a higher level of ability by posing a harder question, and thus check whether the student can also manifest a correspondingly higher level of ability. This process continues, as it does in individually administered intelligence tests such as the Binet, until the student cannot answer further questions requiring yet greater levels of ability.
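The questioning procedure described above can be sketched as a simple loop. This is an illustration under stated assumptions, not a description of how the Binet or any actual test is administered: questions are represented only by hypothetical difficulty levels, and the `answers_correctly` callback stands in for the examiner's judgment of each answer.

```python
from typing import Callable, Optional, Sequence, Tuple


def locate_student(
    question_levels: Sequence[float],
    answers_correctly: Callable[[float], bool],
) -> Tuple[Optional[float], Optional[float]]:
    """Pose ever harder questions until the student fails one.

    Returns (highest level passed, first level failed). Either element is
    None if the student fails the first question or passes every question.
    """
    highest_passed: Optional[float] = None
    for level in sorted(question_levels):
        if answers_correctly(level):
            highest_passed = level
        else:
            return highest_passed, level
    return highest_passed, None


# Hypothetical student who answers every question at or below level 1.2.
student_ability = 1.2
bounds = locate_student(
    [-1.0, 0.0, 1.0, 2.0, 3.0],
    lambda level: level <= student_ability,
)
print(bounds)  # (1.0, 2.0)
```

The returned pair brackets the student between two question levels, much as an object is bracketed between two adjacent marks on a ruler. A real examiner's judgments would be probabilistic rather than deterministic, as the earlier discussion of the continuum notes.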
This kind of interaction between examiner and student can be compared to aligning an object with a ruler to see at what point the object is beyond one mark but before the next one. Here we can invoke another analogy with physical measurement - the measurement of the mass of an object by adding a series of calibrated masses, "weights", to one side of a beam balance until it balances the other side.
Thus it can be seen that the written item is a substitute, a formal and written substitute, for an interaction between two persons, the examiner and the student. In writing the item, the examiner has to manifest the same ability that is to be assessed in the student. Clearly, the construction of the items can fall short of assessing only the intended ability. This is why developments in educational measurement have centered on concepts like reliability and validity. However, the key point remains. When items are to be constructed in a particular field, it is those with a level of knowledge at least up to the required level in the field, and not those with less than the required level of knowledge, who are asked to construct the items. In constructing the items, they are supposed to manifest their knowledge in the items. In short, the items are expressions of the ability of the person constructing the items. When students respond to the items, they are engaging in an interaction with the manifested ability of another person. The conclusion is unavoidable, therefore, that items do have abilities, and that either physical measurement is equally "indirect" or social measurement is equally "direct".
A ruler manifests its length through its elongated shape and its sequence of markings. Both sides of a beam balance manifest their masses through the effects of gravity on them. Likewise, examiners, who write test items against which students are to pit their abilities, manifest their own abilities through the items they construct. Moreover, the ruler, the beam balance, and test items all must be administered and used according to prescribed procedures before their intended property is manifested well enough to lead to measurement. The ruler and the beam balance can be used for all kinds of other purposes - as weapons, toys, souvenirs, and so on. Without correct administration, they do not manifest their property of length or mass in any special way. Indeed, every object has some length or mass, and yet not every object is used for measuring length or mass. An engagement between examiners and students is an engagement of abilities on both sides. To ensure that the examiners' and the students' abilities are manifested successfully for measurement, the abilities of the examiners are administered as written test items, and the abilities of the students are recorded as written responses.
The Ability of an Item, D. Andrich. Rasch Measurement Transactions, 1990, 4:2, p. 101-102
The URL of this page is www.rasch.org/rmt/rmt42a.htm