Construct Deficiency/Saturation: Content Coverage and Item Writing

A construct is an underlying latent trait that cannot be directly observed or measured (e.g., a mental property). The goal of measurement, particularly in social science research, is to develop questionnaire or test items that assess such unobservable constructs indirectly. The items should cover as much of the construct's continuum as possible so that they yield information about a wide range of person performance.

In order to correctly estimate a person's location on a construct, it is imperative to define that construct well (Wright & Stone, 1979). When items are developed, they are intended to cover the spectrum of the construct being defined. However, there are instances when this is not the case, and the result is insufficient or redundant coverage. These two instances are referred to as 1) construct deficiency (insufficient coverage) and 2) construct saturation (redundancy). Each has implications for item bank development, which in turn affects the development of computer-based and computer-adaptive tests.

An item bank is a comprehensive catalog of items for use in creating psychometrically sound fixed-length, brief-form, and/or adaptive tests. These items should span the various construct dimensions and function along their respective continua at various difficulty levels. "The idea is that the test user can select test items as required to make up a particular test" (Choppin, 1978). The flexibility provided by an item bank allows the researcher to use valid and reliable items without having to re-calibrate them each time they are used. The items selected can differ from one use to the next, which allows optimized use of individual items.

Construct deficiency (CD) represents "gaps" on the construct continuum. These "gaps" are the points at which the construct is poorly defined by the items (Schulz, 1995). In this situation, the goal is to develop items that fill these "gaps" at the specified logit values. There are two specific types of CD of interest: 1) statistically meaningful construct deficiency (SMCD), and 2) clinically meaningful construct deficiency (CMCD).

SMCD is a flexible index whose threshold is set by the principal investigator and item-banking team. A gap of 0.30 to 0.50 logits is recommended as evidence of SMCD. CMCD is conceptualized on two levels: 1) an important content area is not covered, and 2) the overall content area is not fully covered. If an item is deemed clinically meaningful by consensus, it is kept in the bank regardless of fit.
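The SMCD criterion above can be illustrated with a small sketch. It assumes that a "gap" is the distance between adjacent item calibrations after sorting, which is one plausible reading of the recommendation; the function name `find_gaps` and the item calibrations are hypothetical and are not taken from the CORE item bank.

```python
from typing import List, Tuple


def find_gaps(difficulties: List[float], threshold: float = 0.30) -> List[Tuple[float, float, float]]:
    """Return (lower, upper, width) for adjacent items spaced more than `threshold` logits apart."""
    ordered = sorted(difficulties)
    gaps = []
    for lower, upper in zip(ordered, ordered[1:]):
        width = upper - lower
        if width > threshold:
            # Round only the reported width; the comparison uses the raw value.
            gaps.append((lower, upper, round(width, 2)))
    return gaps


# Hypothetical item calibrations in logits (illustrative values only).
bank = [-2.1, -1.9, -1.6, -0.4, -0.2, 0.0, 0.1, 0.9, 1.1]

for lower, upper, width in find_gaps(bank, threshold=0.50):
    print(f"possible SMCD: {width:.2f} logits between {lower:+.2f} and {upper:+.2f}")
```

The threshold is left as a parameter because the text describes SMCD as a flexible index chosen by the team. A sketch like this can only flag statistical gaps; CMCD still requires content review.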

The goal of an item bank is to cover the full spectrum of a construct and thereby produce a reliable measure. When a construct is poorly defined, the implications for future use are: 1) floor and ceiling effects will affect individuals whose ability levels fall outside the range of item difficulties, so that inadequate information is obtained about them; and 2) individuals whose ability levels lie at a "gap" will be given items that poorly target their ability. A poorly defined construct has two further ramifications: its impact on 1) the development of computer-based tests, and 2) the development of computer-adaptive tests.

Construct deficiency can affect the results of a computer-based test because a poorly defined construct reduces the amount of information obtained for each individual. This is problematic on two levels: 1) items are not targeted at the person's ability level, and 2) error estimates for the person's ability are higher, which lowers precision and interpretability.
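The link between targeting and error can be made concrete with a short sketch. It assumes the dichotomous Rasch model, under which an item contributes information p(1 - p) at a given ability and the standard error is the reciprocal square root of the total information. The function names and the two illustrative item sets are hypothetical.

```python
import math
from typing import Iterable


def rasch_probability(theta: float, b: float) -> float:
    """Probability of success for ability `theta` on an item of difficulty `b` (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def standard_error(theta: float, difficulties: Iterable[float]) -> float:
    """Model standard error of the ability estimate at `theta` for a given set of items."""
    information = 0.0
    for b in difficulties:
        p = rasch_probability(theta, b)
        information += p * (1.0 - p)  # item information under the dichotomous Rasch model
    return 1.0 / math.sqrt(information)


# Two hypothetical 20-item tests: one centered on the person, one shifted 2.5 logits away.
on_target = [-0.5, -0.25, 0.0, 0.25, 0.5] * 4
off_target = [b + 2.5 for b in on_target]

print("SE with well-targeted items:  ", round(standard_error(0.0, on_target), 3))
print("SE with poorly targeted items:", round(standard_error(0.0, off_target), 3))
```

With the same number of items, the off-target test yields the larger standard error, which is the loss of precision described above.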

Construct deficiency affects computer-adaptive tests in much the same way as it affects computer-based tests. Maximum-information-based computer-adaptive tests function specifically to target the person at his/her ability level with items at the same level of difficulty. If there is no item located at that person's ability level, the test is forced to move to an item further away, thus increasing the error of the ability estimate. Because each item is presented on the basis of the response to the preceding item, it is necessary to fully define the construct along the continuum before attempting to produce this type of test. A bank of items limited by construct deficiency cannot measure individuals along the entire ability continuum with high precision (Halkitis, 1996).
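A minimal sketch of maximum-information item selection follows. It assumes the dichotomous Rasch model, where the most informative unused item is simply the one whose difficulty is closest to the current ability estimate. The function name `select_next_item` and the bank values are hypothetical.

```python
from typing import Sequence, Set


def select_next_item(theta_hat: float, difficulties: Sequence[float], administered: Set[int]) -> int:
    """Return the index of the unused item whose difficulty is nearest to `theta_hat`."""
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    if not candidates:
        raise ValueError("no items left in the bank")
    return min(candidates, key=lambda i: abs(difficulties[i] - theta_hat))


# Hypothetical bank with a gap between -1.0 and +1.0 logits.
bank = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
theta_hat = 0.2  # current ability estimate, inside the gap

index = select_next_item(theta_hat, bank, administered=set())
print(f"selected item {index} at {bank[index]:+.1f} logits, "
      f"{abs(bank[index] - theta_hat):.1f} logits from the ability estimate")
```

Because the estimate sits inside the gap, the nearest available item is well off target, so each administered item contributes less information than a well-targeted one would.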

Setting up a computer-adaptive test requires thresholds for item selection (i.e., logit range) and for precision (i.e., stopping rules based on the individual standard error). When a construct is poorly defined, the available items are further from the person's ability and each provides less information, so the individual is forced to take more items in order to achieve a reliable estimate.
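The effect of a gap on test length under a standard-error stopping rule can be sketched as follows. This is a deliberately simplified, deterministic illustration: it assumes the dichotomous Rasch model and holds the ability fixed at a known value instead of re-estimating it after each response, as a real adaptive test would. The function name, the banks, and the target standard error of 0.5 logits are all hypothetical.

```python
import math
from typing import List, Optional


def items_needed(theta: float, difficulties: List[float], target_se: float) -> Optional[int]:
    """Count the nearest-first items required before the standard error drops to `target_se`.

    Returns None if the bank is exhausted before the target is reached.
    """
    ordered = sorted(difficulties, key=lambda b: abs(b - theta))
    information = 0.0
    for count, b in enumerate(ordered, start=1):
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        information += p * (1.0 - p)
        if 1.0 / math.sqrt(information) <= target_se:
            return count
    return None


# Hypothetical banks: items every 0.1 logits from -3 to +3, with and without a central gap.
dense = [i / 10 for i in range(-30, 31)]
gappy = [b for b in dense if abs(b) >= 1.0]

print("items needed, full coverage:", items_needed(0.0, dense, target_se=0.5))
print("items needed, gap at center:", items_needed(0.0, gappy, target_se=0.5))
```

For a person located in the gap, the stopping rule is reached only after substantially more items have been administered than with the fully covered bank.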

Construct saturation is over-representation by similar items at a specific logit value. More fully, it is the point on the construct continuum where several items measure the same thing in almost the same way. The overall goal is to have all of the items measure the same construct, but we also want each item to produce new information at its level of the continuum. A useful item is "as similar as possible, but as different as possible" (Linacre, 2000). An item bank may have many items at the same difficulty level. Over-representation occurs when some of those items are so similar that they are no longer independent. The redundancy incurred by administering two almost identical items slightly distorts the person ability measures, but does not noticeably affect the overall measures.

The implications of construct saturation in an item bank are more positive than negative. Incorporating items that measure the same thing on a construct extends the test developer's choices for item selection. However, overly similar items should be identified as alternatives to one another when they are used in the construction of any particular test.
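One way to record "alternatives" in a bank is to tag each overly similar item with a shared set label and to admit at most one item per label into any given test. The sketch below assumes that representation; the item identifiers, the labels, and the function `build_form` are hypothetical and do not describe how any particular item bank is stored.

```python
from typing import Dict, List, Optional


def build_form(alternative_set: Dict[str, Optional[str]], length: int) -> List[str]:
    """Pick up to `length` items in bank order, using at most one item from each alternative set."""
    form: List[str] = []
    used_sets = set()
    for item_id, set_label in alternative_set.items():
        if set_label is not None and set_label in used_sets:
            continue  # an alternative to this item is already on the form
        form.append(item_id)
        if set_label is not None:
            used_sets.add(set_label)
        if len(form) == length:
            break
    return form


# Hypothetical bank: items sharing a label are near-duplicates; None means no alternatives.
bank = {
    "item_01": "A",
    "item_02": "A",
    "item_03": None,
    "item_04": "B",
    "item_05": "B",
    "item_06": None,
}

print(build_form(bank, length=4))
```

The sketch takes the first item encountered in each set; a test developer could equally choose within a set on content grounds.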

The impact of construct saturation on a computer-based test is negative if more than one alternative item is included. Respondents may become frustrated when presented with several items that ask essentially the same thing. Further, statistical estimates usually assume that the items are independent, and it is difficult to make adjustments for non-independent items.

Impact on Development of Computer-Adaptive Tests

Construct saturation on a computer-adaptive test is beneficial for the test developer because it allows different alternative items with similar logit values to be presented to different individuals as they proceed through the test. This overcomes the problem of "tracking", which occurs when all persons of similar ability are administered essentially the same test. Therefore, to avoid both over-exposure of individual items and "tracking", it is actually beneficial to have redundant alternative items.
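The use of redundant alternatives to spread item exposure can be sketched as a random choice among the items that lie close to the best-targeted one. The tolerance of 0.05 logits, the item identifiers, and the function `pick_alternative` are assumptions made for illustration, not a procedure described in the text.

```python
import random
from collections import Counter
from typing import Dict


def pick_alternative(theta_hat: float, difficulties: Dict[str, float],
                     tolerance: float, rng: random.Random) -> str:
    """Choose at random among items within `tolerance` logits of the best-targeted item."""
    distances = {item: abs(b - theta_hat) for item, b in difficulties.items()}
    best = min(distances.values())
    pool = sorted(item for item, d in distances.items() if d <= best + tolerance)
    return rng.choice(pool)


# Hypothetical bank: a1, a2 and a3 are alternatives at nearly the same logit value.
bank = {"a1": 0.48, "a2": 0.50, "a3": 0.52, "b1": 1.40}

rng = random.Random(0)  # seeded so the illustration is repeatable
exposure = Counter(pick_alternative(0.5, bank, 0.05, rng) for _ in range(300))
print(exposure)
```

With a tolerance of zero the same item would be given to every person at this ability level, which is the "tracking" problem; with alternatives in the pool, exposure is shared among them.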

When SMCDs and CMCDs are present, the following seven steps are recommended as a possible solution:
Step 1: Identify any clinically or statistically meaningful gaps or redundancies in the continuum. This requires labeling the gaps as statistical, clinical, or both, and identifying sets of alternative items.
Step 2: Determine the number of items needed to fill each gap (e.g., 5-10 items, depending on the gap size).
Step 3: Formulate new items in a committee composed of clinical and statistical experts.
Step 4: Have the oversight committee review the items, recording the reasons for rejecting any item in hard copy.
Step 5: Test new and revised items with clinical collaborators and a selected group of patients.
Step 6: Test patients using computer-based-testing procedures that incorporate old and new items.
Step 7: Calibrate the new items along the anchored continuum of the previous items.
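For Step 2, a rough lower bound on the number of new items can be computed from the gap width and the chosen SMCD threshold. The sketch below assumes evenly spaced target locations and is only a statistical starting point: a committee would typically write more items than this minimum (the text suggests 5-10 per gap), since not every new item will calibrate where intended. The function name `target_locations` is hypothetical.

```python
import math
from typing import List


def target_locations(lower: float, upper: float, max_spacing: float) -> List[float]:
    """Evenly spaced logit targets that leave no spacing wider than `max_spacing` inside the gap."""
    width = upper - lower
    # Small tolerance so floating-point noise does not add a spurious extra item.
    n_new = max(0, math.ceil(width / max_spacing - 1e-9) - 1)
    step = width / (n_new + 1)
    return [round(lower + step * k, 2) for k in range(1, n_new + 1)]


# Hypothetical gap between items at -1.0 and +1.0 logits, with a 0.50-logit threshold.
targets = target_locations(-1.0, 1.0, max_spacing=0.50)
print(f"{len(targets)} target locations:", targets)
```

The targets give item writers intended difficulty levels; the locations actually achieved are only known after the new items are calibrated against the anchored items in Step 7.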

Stacie Hudgens, Kelly Dineen, Kimberly Webster, Jin-Shei Lai, David Cella on behalf of the CORE Item Banking Team

Choppin, B. H. (1978). Item Banking and the Monitoring of Achievement. Research in Progress Series, I. NFER.

Linacre, J. M. (2000). Redundant Items, Overfit and Measure Bias. Rasch Measurement Transactions, 14(3), p. 755.

Assessing Statistically and Clinically Meaningful Construct Deficiency/Saturation: Recommended Criteria for Content Coverage and Item Writing, Stacie Hudgens, Kelly Dineen, Kimberly Webster, Jin-Shei Lai, David Cella on behalf of the CORE Item Banking Team, … Rasch Measurement Transactions, 2004, 17:4 p.954-955


The Rasch Measurement SIG (AERA) thanks the Institute for Objective Measurement for inviting the publication of Rasch Measurement Transactions on the Institute's website, www.rasch.org.
