# Why Fuss about Statistical Sufficiency?

Statistical sufficiency is an obscure property of the Rasch model. Although Georg Rasch does not use the term, he writes: "The best estimate of the ability parameter for a person can be derived from his raw score only" (Rasch, 1980, p.76). For Rasch, this is equivalent to the statement that a raw score is a sufficient statistic for an ability measure.

Ronald Fisher (1922) writes of a sufficient statistic "that the statistic chosen should summarize the whole of the relevant information supplied by the sample." The Fisher-Neyman factorization theorem asserts that "T" is a sufficient statistic for the unknown measure underlying the data if, and only if, the probabilities associated with the data can be factored into two parts: one part depending on the unknown measure only through the value of the sufficient statistic, "T", and the other part entirely independent of the unknown measure (Halmos & Savage, 1949, p.226).
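The factorization criterion can be written compactly. In its familiar density form (a standard textbook rendering, not Halmos & Savage's measure-theoretic statement), T is sufficient for the parameter θ if, and only if, the probability of the data x factors as

$$
f(x;\theta) \;=\; g\bigl(T(x);\theta\bigr)\,h(x),
$$

where g depends on the data only through T(x), and h does not involve θ at all.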

Halmos & Savage (1949, p.240) provide an illustration of sufficiency that can be exported to the field of educational testing. Suppose a test of 20 items of known difficulty conforms to the Rasch model. Scoring these items is arduous, so the examination board decides to make pass-fail decisions for each examinee based on that examinee's success on a single item. To remove the possibility of bias in item selection, the board selects the item at random for each examinee. The board awards a "pass" if the examinee succeeds on that item.

Next time, the board installs a scoring machine which reports each examinee's raw score on the entire test. But the board wants to maintain the same pass-fail procedure as before, so it again selects a test item at random for each examinee. Then, using the raw score and the known difficulty of the selected item, the probability of the examinee's success on that item is computed. This probability is compared with a random probability in order to assign the examinee a success or failure on that item. The pass-fail decision is then made on that simulated outcome just as it had previously been made on an observed response.

In the long run, which pass-fail method will be more accurate? The answer is that they will be equally accurate. This is because the raw score is a sufficient statistic. Statistical sufficiency implies that the examining board is just as well off knowing the value of the sufficient statistic as it is knowing the observations that comprise it. The extra details provided by knowing the actual responses do not provide the examining board any further useful guidance as to the size of the measure.
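The board's two procedures can be sketched in a short simulation. Everything below is illustrative: the 20 item difficulties, the examinee's ability, and all function names are assumptions, not anything given in the article. The key fact exploited is that, under the Rasch model, the probability of success on an item given the raw score can be computed from elementary symmetric functions of the item parameters alone; no ability term appears in that formula, which is sufficiency at work.

```python
import math
import random

def esf(eps):
    """Elementary symmetric functions gamma_0..gamma_n of the values in eps."""
    g = [1.0]
    for e in eps:
        new = [0.0] * (len(g) + 1)
        for k in range(len(g)):
            new[k] += g[k]          # item answered incorrectly
            new[k + 1] += e * g[k]  # item answered correctly
        g = new
    return g

def cond_prob(i, r, eps):
    """P(success on item i | raw score r) under the Rasch model.
    Depends only on r and the item parameters, never on ability."""
    n = len(eps)
    if r == 0:
        return 0.0
    if r == n:
        return 1.0
    g_all = esf(eps)
    g_minus = esf(eps[:i] + eps[i + 1:])  # symmetric functions without item i
    return eps[i] * g_minus[r - 1] / g_all[r]

def p_correct(theta, delta):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

random.seed(1)
deltas = [-2 + 4 * j / 19 for j in range(20)]  # assumed item difficulties
eps = [math.exp(-d) for d in deltas]           # item easiness parameters
theta = 0.5                                    # assumed examinee ability

N = 50_000
pass_a = pass_b = 0
for _ in range(N):
    responses = [random.random() < p_correct(theta, d) for d in deltas]
    i = random.randrange(20)
    # Method A: pass iff the actual response to the randomly chosen item is correct.
    pass_a += responses[i]
    # Method B: the scoring machine reports only the raw score; the same
    # decision is reconstructed from the raw score and item difficulty.
    r = sum(responses)
    pass_b += random.random() < cond_prob(i, r, eps)

print(pass_a / N, pass_b / N)  # the two long-run pass rates agree
```

As a built-in check on `cond_prob`, the conditional success probabilities across all 20 items must sum to the raw score itself, since the expected number of successes given the score r is r.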

Statistical sufficiency is the same as the requirement that person measures be "sample-free." Among all relevant items of the same known difficulty, it must make no difference which one happened to be included in the test.

But what if there is differential item functioning? What if it matters which items of a certain difficulty are included in the test? Then the raw score is no longer a sufficient statistic. More information about the items must be provided. Perhaps the data can be decomposed into subsets, each with a sufficient statistic, or perhaps the test only admits of a qualitative description of each of the responses observed.

Sufficiency is an ideal based on a probability model for the data. Ideal sufficiency is never met in practice. Rather, the pattern of the observed data must be compared with that which would be expected were sufficiency to exist. This comparison forms the basis of the decision as to whether the data approximate the ideal closely enough for measurement.

What is the relationship between the sufficient statistic and the underlying measure? A sufficient statistic does not provide an exact value for the underlying measure; rather, the statistic summarizes all that is known on which to base an estimate of that measure. Whether that estimate is statistically unbiased or consistent or "best" (minimum variance) are matters quite apart from statistical sufficiency.

Achievement of statistical sufficiency is a theoretical ideal. This ideal corresponds to the practical intention that estimates of measures be as free as possible of the context from which they were obtained. To belittle sufficiency is to reject the goal of liberating measures from the local particulars of the measuring instrument and environment.

Halmos PR & Savage LJ. 1949. Application of the Radon-Nikodym theorem to the theory of sufficient statistics. Annals of Mathematical Statistics, 20, 225-241.

Why Fuss about Statistical Sufficiency?, J Linacre … Rasch Measurement Transactions, 1992, 6:3 p. 230
