Correlation Coefficients: Describing Relationships

Correlation coefficients summarize the association between two variables. Which coefficient applies depends on the form of the variables:

(a) Both variables are expressed as perfectly precise, normally distributed, real numbers (PPNDRN): the Pearson product-moment correlation.

(b) Both variables are PPNDRN, but one is grouped into two classes (high and low): the biserial correlation.

(c) Both variables are PPNDRN, and both are grouped into two classes (high and low): the tetrachoric correlation, or into multiple ordered classes: the polychoric correlation.

(d) One variable is PPNDRN, the other is a discrete variable with only two values (such as gender): the point-biserial correlation, or with more than two equally spaced values: the point-polyserial correlation.

(e) Both variables are discrete with only two values: the phi correlation.

If the numbers to be correlated are not perfectly precise, then it may be possible to disattenuate the correlation coefficient for measurement error (RMT 10:1, 479).
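The usual correction for attenuation divides the observed correlation by the square root of the product of the two reliabilities. A minimal sketch in Python, assuming a reliability estimate is available for each variable; the function name and the example values are illustrative and are not taken from the cited RMT note:

```python
import math

def disattenuate(r_observed, reliability_x, reliability_y):
    """Correct an observed correlation for measurement error.

    Illustrative helper: divides the observed correlation by the
    geometric mean of the two reliabilities (Spearman's correction).
    """
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)

# Hypothetical values, for illustration only.
r_corrected = disattenuate(0.50, 0.80, 0.90)
print(f"disattenuated r = {r_corrected:.2f}")
```

With low reliabilities the corrected value can exceed 1, which is usually taken as a sign that the reliability estimates are too low for these data.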

Pearson's product-moment correlation is the most commonly reported, even for those data for which it is superficially not a good match. Of course, the same is true of other familiar statistics, such as the mean and standard deviation.

So which correlation coefficient is most indicative in any particular instance? Here statistical theory encounters harsh reality. No empirical variable exactly matches the assumptions of a correlation coefficient. Even with natural discrete dichotomies, such as gender, there is always some fuzziness. Mendel's genetic experiments have come under attack for the manner in which he may have manipulated the fuzziness in his data.

So there are two criteria for choosing a coefficient: (i) ease of communication and (ii) protection against misleading inferences.

For ease of communication, the more familiar the coefficient is, the better, provided it does not produce a misleadingly incorrect value.

For those coefficients which produce reasonable values, the temptation is almost always to report the highest (or most significant) relationship possible. This temptation is evident in factor analysis: the choice of communalities, rotation and obliqueness tends to be guided by the desire for a conspicuous finding. Thus, after the correlation reporting the highest or most significant value has been discovered, it is tempting to rationalize why that particular correlation coefficient is the "correct" one for those data.

Guilford (1965, p. 325) points out that if the data accord with the assumptions of the biserial correlation, then there is an exact mathematical relationship between the biserial and the point-biserial. If both are computed, their ratio must therefore approximate specific values. When this ratio is observed for empirical data, the biserial may be the correlation of choice. Under essentially all other conditions, Guilford recommends the more conservative point-biserial correlation.
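The standard textbook form of this relationship, assumed here to be the one Guilford has in mind, is r_biserial = r_point-biserial * sqrt(p*q) / y, where p and q are the proportions in the two classes and y is the height of the unit normal curve at the point that divides them. A minimal sketch of the expected ratio, with an illustrative function name:

```python
import math
from statistics import NormalDist

def biserial_ratio(p):
    """Expected ratio r_biserial / r_point_biserial for a split with
    proportion p in one class, assuming an underlying normal variable."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    q = 1 - p
    unit_normal = NormalDist()       # mean 0, standard deviation 1
    z = unit_normal.inv_cdf(p)       # cut point that splits off proportion p
    y = unit_normal.pdf(z)           # normal ordinate at the cut point
    return math.sqrt(p * q) / y

ratio_even = biserial_ratio(0.5)     # even split
ratio_uneven = biserial_ratio(0.9)   # 90% / 10% split
print(f"even split: {ratio_even:.2f}, uneven split: {ratio_uneven:.2f}")
```

The ratio is about 1.25 for an even split and grows as the split becomes more uneven, so the biserial always exceeds the point-biserial in absolute value.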

      0     1
1   167   374
0   203   186

Phi correlation = Pearson = Point-biserial = 0.21

Biserial correlation = 0.27 or 0.31

Tetrachoric correlation = 0.34
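As a rough check on these figures, the phi coefficient can be computed directly from the four cell counts, and the tetrachoric can be approximated with the classical cosine-pi formula. This is a sketch only: the cell counts are as read from the 2x2 table above, the function names are illustrative, and the cosine-pi formula is a quick approximation, not the estimation method used for the reported value.

```python
import math

# Cell counts as read from the 2x2 table above (an assumption about
# its layout): nRC is the count in row category R, column category C.
n10, n11 = 167, 374
n00, n01 = 203, 186

def phi_coefficient(n00, n01, n10, n11):
    """Pearson correlation of two 0/1 variables, from cell counts."""
    numerator = n11 * n00 - n10 * n01
    denominator = math.sqrt(
        (n10 + n11) * (n00 + n01) * (n00 + n10) * (n01 + n11)
    )
    return numerator / denominator

def tetrachoric_cos_pi(n00, n01, n10, n11):
    """Cosine-pi approximation to the tetrachoric correlation.
    Assumes both off-diagonal cells are non-zero."""
    odds_ratio = (n11 * n00) / (n10 * n01)
    return math.cos(math.pi / (1 + math.sqrt(odds_ratio)))

phi = phi_coefficient(n00, n01, n10, n11)
tetrachoric = tetrachoric_cos_pi(n00, n01, n10, n11)
print(f"phi = {phi:.2f}, tetrachoric (approx.) = {tetrachoric:.2f}")
```

With these counts the sketch gives values of about 0.21 and 0.34, in line with those reported above.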

An early objection to the tetrachoric correlation was that its value could only be approximated. With modern computer power, the approximation can be so precise as to be considered exact. But other objections remain.

Nunnally (1967, 123-4) remarks "There are very strong reasons for not [his emphasis] using the biserial and tetrachoric correlations in most of the ways they have been used in the past. ... Unless subsequent steps are made to turn the dichotomous variables into continuous variables, such estimates only serve to fool one into thinking that his variables have explanatory power beyond that which they actually have. It is tempting to employ biserial and tetrachoric correlations rather than phi and point-biserial correlations because the former are usually larger." He adds "When the assumption of normality is not met, the estimates can be off by more than 20 points of correlation."

      1     2     3     4     5
1     0     0    12    32    40
2     0     4    23    66    23
3     1    10    67    77    15
4     1    22   133    40     3
5     8    71   125    21     2

Pearson correlation = 0.61

Polychoric correlation = 0.67

Computation by Uebersax (2000)
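The Pearson value can be checked by scoring the categories with the integers 1 to 5 and weighting each pair of scores by its cell count. A minimal sketch, assuming the cell counts as read from the 5x5 table above; the function name is illustrative, and the polychoric correlation is not attempted here because it requires maximum-likelihood estimation.

```python
import math

# Cell counts as read from the 5x5 table above (an assumption about
# how the flattened digits split into cells). Rows and columns are
# both scored 1..5.
table = [
    [0, 0, 12, 32, 40],
    [0, 4, 23, 66, 23],
    [1, 10, 67, 77, 15],
    [1, 22, 133, 40, 3],
    [8, 71, 125, 21, 2],
]
scores = [1, 2, 3, 4, 5]

def pearson_from_table(table, row_scores, col_scores):
    """Pearson correlation from a contingency table of counts."""
    n = sx = sy = sxx = syy = sxy = 0
    for x, row in zip(row_scores, table):
        for y, count in zip(col_scores, row):
            n += count
            sx += count * x
            sy += count * y
            sxx += count * x * x
            syy += count * y * y
            sxy += count * x * y
    covariance = n * sxy - sx * sy
    return covariance / math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))

r = pearson_from_table(table, scores, scores)
print(f"|r| = {abs(r):.2f}")
```

With the rows numbered as printed, the coefficient comes out negative. Its magnitude, about 0.61, agrees with the Pearson value reported above. The sign depends only on the direction in which the categories of one variable are numbered.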

Coote (1998, p. 404) has a provocative paragraph: "Product-moment correlation matrices are often used ... although they are only appropriate for continuous variables (Joreskog and Sorbom, 1996). Information collected using five and seven-point Likert scales have ordinal properties (Bollen, 1989). Ordinal variables do not have origins or units of measurement and should not be treated as though they are continuous (Joreskog, 1994). Treating ordinal data as continuous increases the likelihood of correlated error variances, particularly where the initial factor loadings are large. Another disadvantage of using a product-moment correlation matrix with categorical data is that the standard errors and chi-square test statistics are incorrect (Anderson and Gerbing, 1988). Where Likert scales are used polychoric correlations should be computed and analyzed."

But reality is rarely this clear-cut. The conceptualization of the ordinal scale required for the polychoric correlation accords with Samejima's "graded response" model. The categorization is regarded as ordered but the categorization itself is considered arbitrary.

A Rasch-consistent conceptualization would be a variant of the point-polyserial, with category intervals consistent with the corresponding Rasch-model ICC. In this regard, integer intervals are exact for two and three category scales, but generally too central for the extreme categories of longer rating scales. Consequently an integer-spaced point-polyserial would tend to misestimate the actual correlation for long rating scales, but generally only by a small amount. The effort required to improve on integer spacing does not appear to have a corresponding benefit.

So, for correlations of Rasch-analyzed data for which the categorization is considered qualitatively substantive, the analyst would need to make a strong case to depart from correlation coefficients with the algebraic form of the Pearson product-moment correlation. These coefficients are the product-moment correlation itself, and the point-biserial, point-polyserial and phi coefficients.

John Michael Linacre

Note: the biserial correlation originated in Karl Pearson, "On a New Method of Determining Correlation ....", Biometrika, Vol. VII, pp. 96-105, 1909, and the point-biserial correlation originated in Richardson, M.W. & Stalnaker, J.M. (1933). "A note on the use of bi-serial r in test research". Journal of General Psychology, 8, 463-465.

Anderson, J. C. & Gerbing, D. W. (1988) Structural Equation Modeling in Practice: A Review and Recommended Two-Step Approach. Psychological Bulletin, 103:3, 411-423.

Bollen, K. A. (1989) Structural Equations with Latent Variables. New York: John Wiley and Sons.

Coote, L. (1998) A Review and Recommended Approach for Estimating Conditional Structural Equation Models. Australia and New Zealand Marketing Academy Conference, University of Otago, Dunedin.

Guilford, J. P. (1965) Fundamental Statistics in Psychology and Education. New York: McGraw-Hill.

Joreskog, K. G. (1994) On the Estimation of Polychoric Correlations and their Asymptotic Covariance Matrix. Psychometrika, 59:3, 381-389.

Joreskog, K. G. and Sorbom, D. (1996) PRELIS 2: User's Reference Guide. Chicago: Scientific Software International.

Nunnally, J. (1967) Psychometric Theory. New York: McGraw-Hill.

Olsson, U. (1979) Maximum Likelihood Estimation of the Polychoric Correlation Coefficient. Psychometrika, 44:4, 443-460.

Uebersax, J. S. (2000) POLYCORR Polychoric Correlation EZ Version software.


Correlation Coefficients: Describing Relationships. Linacre J. M. … Rasch Measurement Transactions, 2005, 19:3 p. 1028-9

The URL of this page is www.rasch.org/rmt/rmt193c.htm
