Correlation and Reliability Functions

The following is a summary of the correlation and reliability worksheet functions provided in the Real Statistics Resource Pack.

These functions are organized into the following categories:

  • Covariance and Correlation
  • Association
  • Interrater Reliability
  • Internal Consistency Reliability
  • Partitioning
  • Bradley-Terry Model
  • Testing Theory (Item Analysis and Item Response)
  • Statistical Power
  • Sample Size

Covariance and Correlation

COVARP(R1, R2) population covariance of the populations defined by ranges R1 and R2; equivalent to COVAR(R1, R2)
COVARS(R1, R2) sample covariance of the samples defined by ranges R1 and R2; equivalent to COVARIANCE.S(R1, R2) 
CORREL_ADJ(R1, R2) estimated population correlation coefficient corresponding to the sample data in ranges R1 and R2
MCORREL(R, R1, R2) multiple correlation of dependent variable z with x and y where the samples for z, x and y are the ranges R, R1 and R2 respectively
PART_CORREL(R, R1, R2) partial correlation rzx,y of variables z and x holding y constant based on Pearson’s correlation where the samples for z, x and y are the ranges R, R1 and R2 respectively
SEMIPART_CORREL(R, R1, R2) semi-partial correlation rz(x,y) based on Pearson’s correlation where the samples for z, x and y are the ranges R, R1 and R2 respectively
PCORREL(R1, i, j) partial correlation coefficient between the ith and jth columns in R1
SCORREL(R1, R2, lab, tails, α) array function which returns the Spearman’s rank correlation (rho), t-stat, p-value, lower and upper bounds of a 1–α confidence interval for the data in R1 and R2
KCORREL(R1, R2, lab, tails, α, ties) array function which returns Kendall’s tau, standard error, z-stat, z-crit, p-value, lower and upper bounds of a 1–α confidence interval of Kendall’s tau for the data in R1 and R2
BCORREL(R1, R2, lab, tails, α) array function which returns the biserial correlation, z-stat, p-value, lower and upper bounds of a 1–α confidence interval of the biserial correlation for the data in R1 and R2
PART_KCORREL(R, R1, R2) partial correlation τzx,y of variables z and x holding y constant based on Kendall’s correlation where the samples for z, x and y are the ranges R, R1 and R2 respectively
SEMIPART_KCORREL(R, R1, R2) semi-partial correlation τz(x,y) based on Kendall’s correlation where the samples for z, x and y are the ranges R, R1 and R2 respectively
COV(R1, b) array function which returns the sample covariance matrix for the data in R1
COVP(R1, b) array function which returns the population covariance matrix for the data in range R1
CORR(R1, b) array function which returns the correlation matrix for the data in range R1
PCORR(R1) array function which returns the partial correlation matrix for the data in range R1

For CORR, COV and COVP if b = TRUE (default) then any row in R1 which contains a non-numeric cell is not used, while if b = FALSE then correlation/covariance coefficients are calculated pairwise (by columns) and so any row which contains non-numeric data for either column in the pair is not used to calculate that coefficient value.
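
The difference between the two settings of b can be illustrated outside Excel. The following Python sketch is not the Resource Pack's implementation; corr_matrix and its listwise flag are hypothetical names standing in for CORR and b, and None stands in for a non-numeric cell.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def is_number(v):
    return isinstance(v, (int, float)) and not isinstance(v, bool)

def corr_matrix(rows, listwise=True):
    """Correlation matrix of the columns of `rows` (hypothetical helper).

    listwise=True  : drop every row containing a non-numeric cell (b = TRUE).
    listwise=False : for each pair of columns, drop only the rows that are
                     non-numeric in one of those two columns (b = FALSE).
    """
    k = len(rows[0])
    if listwise:
        rows = [r for r in rows if all(is_number(v) for v in r)]
    out = [[1.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            pairs = [(r[i], r[j]) for r in rows
                     if is_number(r[i]) and is_number(r[j])]
            xs, ys = zip(*pairs)
            out[i][j] = out[j][i] = pearson(xs, ys)
    return out

data = [[1, 2, 3], [2, 4, None], [3, 6, 1], [4, 8, 5], [5, None, 2]]
listwise = corr_matrix(data, listwise=True)    # uses rows 1, 3 and 4 only
pairwise = corr_matrix(data, listwise=False)   # row set differs per pair
```

With this data the coefficient for the first and third columns differs between the two modes, because the pairwise calculation can also use the last row.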

For KCORREL, if lab = TRUE then the output takes the form of a 7 × 2 range with the first column consisting of labels, while if lab = FALSE (default) the output takes the form of a 7 × 1 range without labels. If ties = TRUE (default) then a ties correction is applied.

Similarly, SCORREL returns a 5 × 2 range if lab = TRUE and a 5 × 1 range if lab = FALSE.

The following are functions for correlation coefficient hypothesis testing:

CorrTTest(ρ, r, n, tails) p-value for one-sample t-test of the correlation coefficient ρ = 0
CorrTLower(r, n, α) lower bound of 1–α confidence interval of the population correlation coefficient 
CorrTUpper(r, n, α) upper bound of 1–α confidence interval of the population correlation coefficient 
CorrelTTest(r, n, α, lab, tails) array function which returns the test statistic t, p-value and lower and upper bounds of the 1–α confidence interval of the population correlation coefficient based on a hypothetical correlation ρ = 0
CorrelTTest(R1, R2, α, lab, tails) CorrelTTest(r, n, α, lab, tails) where r = CORREL(R1, R2) and n = CountPairs(R1,R2)
CorrTest(ρ, r, n, tails) p-value for one sample test of the correlation coefficient 
CorrLower(r, n, α) lower bound of 1–α confidence interval of the population correlation coefficient 
CorrUpper(r, n, α) upper bound of 1–α confidence interval of the population correlation coefficient 
CorrelTest(r, n, ρ, α, lab, tails) array function which returns the test statistic z, p-value and lower and upper bounds of 1–α confidence interval of the population correlation coefficient
CorrelTest(R1, R2, ρ, α, lab, tails) CorrelTest(r, n, ρ, α, lab, tails) where r = CORREL(R1, R2) and n = CountPairs(R1,R2)
CorrETest(r, n, ρ, tails) p-value for one sample exact test of the correlation coefficient ρ
CorrETest(R1, R2, ρ, tails) CorrETest(r, n, ρ, tails) where r = CORREL(R1, R2) and n = CountPairs(R1,R2)
CorrelETest(r, n, ρ, lab, α, tails) array function which returns the p-value and the lower and upper bounds of the 1–α confidence interval of the population correlation coefficient based on hypothetical correlation ρ
CorrelETest(R1, R2, ρ, lab, α, tails) CorrelETest(r, n, ρ, lab, α, tails) where r = CORREL(R1, R2) and n = CountPairs(R1,R2)
Correl2Test(r1, n1, r2, n2, α, lab) array function which returns the two independent sample test statistic z, p-value (two-tailed test) and lower and upper bounds of the 1–α confidence interval of the difference of population correlation coefficients ρ1 – ρ2, based on sample correlations r1 and r2 and sample sizes n1 and n2
Correl2Test(R1, R2, R3, R4, α, lab) Correl2Test(r1, n1, r2, n2, α, lab) where r1 = CORREL(R1, R2), r2 = CORREL(R3, R4), n1 = CountPairs(R1,R2) and n2 = CountPairs(R3,R4)
Correl2OverlapTTest(r12, r13, r23, n, α, lab) array function for the two overlapping dependent sample test that returns |r12-r13|, the t test statistic, p-value (two-tailed test) and lower and upper bounds of the 1–α confidence interval for the difference of population correlation coefficients; n = size of all three samples
Corr2OverlapTTest(R1, R3, R2, α, lab) Correl2OverlapTTest(r12, r13, r23, n, α, lab) where r12 = CORREL(R1, R2), r13 = CORREL(R1, R3), r23 = CORREL(R2, R3) and n = COUNT(R1)
Correl2OverlapTest(r12, r13, r23, n, α, lab) array function which is just like Correl2OverlapTTest except that the Fisher transformation is used and the output consists of |r12-r13| and lower and upper bounds of the 1–α confidence interval for the difference of population correlation coefficients
Corr2OverlapTest(R1, R3, R2, α, lab) Correl2OverlapTest(r12, r13, r23, n, α, lab) where r12 = CORREL(R1, R2), r13 = CORREL(R1, R3), r23 = CORREL(R2, R3) and n = COUNT(R1)
Correl2NonOverlapTest(R1, n1, n2, α, lab) array function which returns |r12-r34| and lower and upper bounds of the 1–α confidence interval for the difference of population correlation coefficients, where R1 contains the correlation matrix for 4 samples, n1 = size of sample 1 (or 2) and n2 = size of sample 3 (or 4)
Corr2NonOverlapTest(R1, α, lab) Correl2NonOverlapTest(R2, n1, n2, α, lab) where R2 = CORR(R1), n1 = # of elements in the 1st (or 2nd) column of R1 and n2 = # of elements in the 3rd (or 4th) column of R1
PCRIT(n, α, tails) critical value for the Pearson’s correlation coefficient based on two samples of size n with alpha of α (default .05) and tails = 1 or 2 (default)

For CorrelTTest, CorrelTest and Correl2Test if lab = TRUE then the output takes the form of a 4 × 2 range with the first column consisting of labels, while if lab = FALSE (default) the output takes the form of a 4 × 1 range without labels. Similarly, a column of labels is appended to the output for all the other functions when lab = TRUE (default FALSE).

The CorrTest, CorrLower, CorrUpper, CorrelTest and Correl2Test functions are based on a normal test using a Fisher transformation. The CorrTTest, CorrTLower, CorrTUpper and CorrelTTest functions are based on a t-test where the population correlation is assumed to be zero. 

r = sample correlation, ρ = hypothetical population correlation, n = sample size and tails = the # of tails: 1 or 2 (default).
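
As a rough illustration of the two approaches just described, the sketch below computes the textbook Fisher-transformation z-test and the t statistic for ρ = 0. It uses only the Python standard library; the function names fisher_corr_test and corr_t_stat are hypothetical, and treating the one-tailed p-value as half the two-tailed value is an assumption, so results may differ in detail from the worksheet functions.

```python
import math
from statistics import NormalDist

def fisher_corr_test(r, n, rho=0.0, alpha=0.05, tails=2):
    """Normal test of H0: population correlation = rho (Fisher transformation).

    Returns (z, p_value, lower, upper), where lower and upper bound the
    1 - alpha confidence interval for the population correlation.
    """
    se = 1 / math.sqrt(n - 3)                    # s.e. of atanh(r)
    z = (math.atanh(r) - math.atanh(rho)) / se   # test statistic
    p_value = tails * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = math.tanh(math.atanh(r) - z_crit * se)
    upper = math.tanh(math.atanh(r) + z_crit * se)
    return z, p_value, lower, upper

def corr_t_stat(r, n):
    """t statistic (n - 2 degrees of freedom) for testing H0: rho = 0."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

z, p_value, lower, upper = fisher_corr_test(0.5, 28)
t = corr_t_stat(0.5, 28)
```

The p-value for the t statistic needs the t distribution, which the standard library does not provide, so it is left out of the sketch.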

Association functions

The following array functions return a column array based on the contingency table in R1.

LAMBDA_COEFF(R1, lab) returns a column array with lambda(C|R), lambda(R|C), and symmetric lambda for the data in R1.
LAMBDA_TEST(R1, lab, alpha, lambda0) returns a column array with lambda(C|R), standard error, z-stat, p-value, and lower and upper ends of the 1-alpha confidence interval.
GAMMA_TEST(R1, lab, alpha, gamma0) returns a column array with gamma, # of concordance pairs, # of discordance pairs, standard error, z-statistic, p-value, and the lower and upper ends of the 1-alpha confidence interval for gamma.
SOMERS_TEST(R1, lab, alpha, d0) returns a column array with Somers’ d, # of concordance pairs, # of discordance pairs, standard error, z-statistic, p-value, and the lower and upper ends of the 1-alpha confidence interval for d.

The following non-array functions are used by some of the above functions.

CONCORD(R1) returns the number of concordant pairs
DISCORD(R1) returns the number of discordant pairs
MSOMERS(R1) returns the M statistic used to calculate the variance of Somers’ d statistic
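
For readers unfamiliar with the pair counts, the following Python sketch shows the usual definition of concordant and discordant pairs for a contingency table with ordered rows and columns, together with the Goodman-Kruskal gamma built from them. The function names are hypothetical, and the sketch does not reproduce the standard errors or tests returned by the worksheet functions.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in an ordered contingency table.

    A pair of observations is concordant when one lies in a higher row AND a
    higher column than the other, and discordant when it lies in a higher row
    but a lower column. Pairs tied on row or column are not counted.
    """
    rows, cols = len(table), len(table[0])
    conc = disc = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):        # strictly lower rows in the table
                for l in range(cols):
                    if l > j:
                        conc += table[i][j] * table[k][l]
                    elif l < j:
                        disc += table[i][j] * table[k][l]
    return conc, disc

def goodman_kruskal_gamma(table):
    """gamma = (C - D) / (C + D)."""
    c, d = concordant_discordant(table)
    return (c - d) / (c + d)

example = [[10, 5], [3, 12]]
pairs = concordant_discordant(example)   # 10*12 concordant, 5*3 discordant
gamma = goodman_kruskal_gamma(example)
```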

Interrater Reliability functions

KAPPA(R1, k, lab, α, orig) array function which returns Cohen’s (Fleiss’) kappa for the data in R1 when k = 0 (default) and kappa for category k when k > 0, plus standard error, z-stat, p-value and lower and upper bound of 1−α confidence interval (α defaults to .05)
WKAPPA(R1, R2, lab, α) array function which returns weighted kappa for the data in R1 using the weights in R2, plus standard error and lower and upper bound of 1−α confidence interval (α defaults to .05).
ICC(R1, class, type, lab, α) array function which outputs the intraclass correlation coefficient ICC(class, type) plus the lower and upper bound of the 1−α confidence interval for the data in R1; default values are class = 2, type = 1, α = .05.
KENDALLW(R1, lab, ties) array function which returns a column range consisting of Kendall’s W, r, χ², df and p-value; if ties = TRUE then a ties correction is applied
KENDALLU(R1, lab, comp, alpha, lookup) array function which returns a column array with u-stat, W-stat, χ²-stat, df, and p-value for Kendall’s u test on the square preference matrix in R1 (with pairwise comparisons if comp = TRUE (default) and rankings otherwise).
TC_TEST(R1, lab, alpha, lookup) array function that returns a column array with TC-stat, z-stat, and p-value for Kendall’s TC test on the square preference matrix in R1 (without headings).
PREF_MATRIX(R1) returns an n × n preference matrix corresponding to the rankings in a k × n array or range R1 that contains the rankings for n subjects by k raters.
KALPHA(R1, weights, ratings) Krippendorff’s alpha for the agreement table in range R1 based on the weights and ratings in the second and third arguments.
KTRANS(R1, col) array with agreement table for Krippendorff’s alpha that corresponds to the rating table in R1 when column col is removed; no column is removed if col = 0 (default)
KRIP_SES(R1, lab, weights, ratings, alpha, scorrection) column array that contains: Krippendorff’s alpha for the data in the agreement table in range R1, the corresponding standard error for subjects and the lower and upper ends of the 1 – alpha confidence interval for Krippendorff’s alpha.
KRIP(R1, lab, weights, ratings, alpha, scorrection, rcorrection) column array that contains: Krippendorff’s alpha for the data in the rating table in range R1, the corresponding standard error for subjects and the lower and upper ends of the 1 – alpha confidence interval for Krippendorff’s alpha, followed by the total s.e. (including subjects and raters) and the lower and upper ends of the 1 – alpha confidence interval corresponding to this s.e.
KRIP_SER(R1, lab, weights, ratings, alpha, rcorrection) the standard error for raters, where range R1 contains data from a rating table for Krippendorff’s alpha.
GWET_AC2(R1, weights, ratings) Gwet’s AC2 for the agreement table in range R1 based on the weights and ratings in the second and third arguments.
GTRANS(R1, col) array with agreement table for Gwet’s AC2 that corresponds to the rating table in R1 when column col is removed; no column is removed if col = 0 (default)
GWET_SES(R1, lab, weights, ratings, alpha, scorrection) column array that contains: Gwet’s AC2 for the data in the agreement table in range R1, the corresponding standard error for subjects and the lower and upper ends of the 1 – alpha confidence interval for Gwet’s AC2.
GWET(R1, lab, weights, ratings, alpha, scorrection, rcorrection) column array that contains: Gwet’s AC2 for the data in the rating table in range R1, the corresponding standard error for subjects and the lower and upper ends of the 1 – alpha confidence interval for Gwet’s AC2, followed by the total s.e. (including subjects and raters) and the lower and upper ends of the 1 – alpha confidence interval corresponding to this s.e.
GWET_SER(R1, lab, weights, ratings, alpha, rcorrection) the standard error for raters, where range R1 contains data from a rating table for Gwet’s AC2.
LINCCC(R1, R2, lab, alpha) returns a column array with the values Lin’s concordance correlation coefficient plus the lower & upper ends of the 1–alpha confidence interval
BKAPPA_SD(κ, p1, q1) the standard deviation when κ = Cohen’s kappa, p1 = the marginal probability that rater 1 chooses category 1 and q1 = the marginal probability that rater 2 chooses category 1

If lab = TRUE, then an extra column of labels is appended to the output (default = FALSE). 

If range R2 is omitted in WKAPPA it defaults to the unweighted measure where the weights on the main diagonal are all zeros and the other weights are ones. Range R2 can also be replaced by a number r. A value of r = 1 means the weights are linear (as in Figure 1), a value of 2 means the weights are quadratic. In general, this means that the equivalent weights range would contain zeros on the main diagonal and values (|i−j|)^r in the ith row and jth column when i ≠ j.
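
The weighting scheme in the preceding paragraph can be sketched in Python as follows. The helper names kappa_weights and weighted_kappa are hypothetical, and the kappa formula used (one minus the ratio of weighted observed to weighted expected disagreement) is the standard textbook definition, assumed rather than confirmed to match WKAPPA.

```python
def kappa_weights(q, r=1):
    """q x q disagreement weights: 0 on the diagonal, |i - j|**r elsewhere.

    r = 0 gives the unweighted (0/1) scheme, r = 1 linear weights and
    r = 2 quadratic weights.
    """
    return [[0 if i == j else abs(i - j) ** r for j in range(q)]
            for i in range(q)]

def weighted_kappa(table, weights):
    """Weighted kappa for a square agreement table of counts."""
    q = len(table)
    n = sum(sum(row) for row in table)
    row_p = [sum(table[i]) / n for i in range(q)]
    col_p = [sum(table[i][j] for i in range(q)) / n for j in range(q)]
    observed = sum(weights[i][j] * table[i][j] / n
                   for i in range(q) for j in range(q))
    expected = sum(weights[i][j] * row_p[i] * col_p[j]
                   for i in range(q) for j in range(q))
    return 1 - observed / expected

linear = kappa_weights(3, 1)       # [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
quadratic = kappa_weights(3, 2)    # [[0, 1, 4], [1, 0, 1], [4, 1, 0]]
kappa = weighted_kappa([[20, 5], [10, 15]], kappa_weights(2, 0))
```

With the 0/1 weights the result reduces to the ordinary Cohen's kappa for the table.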

The weights argument is a square range containing the weights or the value 0 (default) if categorical weights are used, 1 if ordinal weights are used, 2 for interval weights and 3 for ratio weights.

The ratings argument is a row or column range containing the rating values. If omitted then the ratings 1, …, q are used where q is the size of the range.

alpha is the significance level (default .05). scorrection and rcorrection are the subject and rater correction factors (referred to as the f parameters).

If lookup = TRUE (default), then a table lookup is conducted instead of the normal or chi-square approximation for small samples.
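
As a point of reference for the KENDALLW output, here is a minimal Python sketch of the standard formulas for Kendall's W, the average rank correlation r and the chi-square statistic, without a ties correction. The function name kendall_w and the layout of the input (one row per rater, one column per subject) are assumptions and may not match the layout expected by the worksheet function.

```python
def kendall_w(ranks):
    """Kendall's coefficient of concordance for untied rankings.

    ranks[i][j] is the rank that rater i assigns to subject j.
    Returns (W, r, chi2, df) where r is the average pairwise Spearman
    correlation between raters and chi2 = k(n - 1)W with df = n - 1.
    """
    k = len(ranks)          # number of raters
    n = len(ranks[0])       # number of subjects
    rank_sums = [sum(row[j] for row in ranks) for j in range(n)]
    mean_sum = k * (n + 1) / 2
    s = sum((t - mean_sum) ** 2 for t in rank_sums)
    w = 12 * s / (k ** 2 * (n ** 3 - n))
    r = (k * w - 1) / (k - 1)
    chi2 = k * (n - 1) * w
    return w, r, chi2, n - 1

agree = kendall_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
oppose = kendall_w([[1, 2, 3], [3, 2, 1]])
```

The p-value would come from the chi-square distribution (or a table lookup for small samples), which is omitted here.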

Internal Consistency Reliability functions

SPLIT_HALF(R1, R2) split-half coefficient for the data in ranges R1 and R2
SPLITHALF(R1, type) split-half coefficient for the scores in the first half of the items in R1 vs. the second half of the items if type = 0 and the odd items in R1 vs. the even items if type = 1.
GUTTMAN_SPLIT(R1, s) Guttman’s lambda for the data in range R1 based on the split described by string s consisting of 0’s and 1’s
GUTTMAN(R1) the Guttman’s reliability measure for the data in range R1, i.e. the maximum Guttman’s lambda based on all possible splits; when the number of splits is too large, a second argument iter can be used to find an approximate maximum Guttman’s lambda based on a randomly generated iter number of splits.
SB_SPLIT(R1, s) split half coefficient (after Spearman-Brown correction) for data in R1 based on the split described by string s consisting of 0’s and 1’s
SB_CORRECTION(r, n, m) Spearman-Brown correction when the split-half correlation based on an m vs. n–m split is r. If n is omitted, then it is assumed that there is a 50-50 split. If n is present but m is omitted, then it is assumed that m = n/2.
SB_PRED(m, rho, n) Spearman-Brown predicted reliability based on m items when Spearman-Brown for n items is rho.
SB_SIZE(rho1, rho, n) the number of items necessary to bring the Spearman-Brown predicted reliability up (or down) to rho1 from n items with Spearman-Brown of rho.
KUDER(R1) Kuder-Richardson Formula 20 coefficient for the data in range R1
CRONALPHA(R1, k) Cronbach’s alpha for the data in range R1 if k = 0 and Cronbach’s alpha with kth item (i.e. column) removed if k > 0
CALPHA(R1) array function which returns a row of Cronbach’s alpha for R1 with each item removed
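
The Spearman-Brown and Cronbach's alpha calculations listed above follow well-known formulas, sketched below in Python. The function names mirror the worksheet functions only loosely and are hypothetical; sb_correction covers the 50-50 split only, and the worksheet functions may handle edge cases differently.

```python
from statistics import variance

def sb_correction(r):
    """Spearman-Brown correction for a 50-50 split with half-test correlation r."""
    return 2 * r / (1 + r)

def sb_pred(m, rho, n):
    """Predicted reliability for m items when the reliability for n items is rho."""
    k = m / n
    return k * rho / (1 + (k - 1) * rho)

def sb_size(rho1, rho, n):
    """Number of items needed to reach reliability rho1, starting from
    n items with reliability rho (inverse of sb_pred)."""
    return n * rho1 * (1 - rho) / (rho * (1 - rho1))

def cronbach_alpha(rows):
    """Cronbach's alpha; each row holds one subject's scores on k items."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

predicted = sb_pred(20, 0.7, 10)        # doubling a 10-item test
needed = sb_size(predicted, 0.7, 10)    # recovers the 20 items
alpha = cronbach_alpha([[1, 2], [2, 3], [3, 4]])
```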

Partitioning

INIT_SPLIT(n, m) a string of length n consisting of m 0’s followed by n–m 1’s. If omitted m defaults to n/2.
NEXT_SPLIT(s) the string representing the next split after the split represented by s.
RAND_SPLIT(n, m) a random string of length n consisting of m 0’s and n–m 1’s. If omitted m defaults to n/2.
COV_SPLIT(R1,  s) sample covariance for the data in range R1 based on the split described by string s.
CORR_SPLIT(R1,  s) correlation for the data in range R1 based on the split described by string s.
INIT_PARTITION(n) a string consisting of n 0’s
NEXT_PARTITION(s) the string representing the next partition after the partition represented by s.
RAND_PARTITION(n) a random string of length n

These strings consist only of 0’s and 1’s.
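
To make the string encoding concrete, the sketch below builds split strings in Python and uses one to divide item scores into two half-test totals. All names are hypothetical; rounding n/2 down for odd n and treating position i of the string as item i are assumptions, and the ordering used by NEXT_SPLIT is not reproduced.

```python
import random

def init_split(n, m=None):
    """String of m 0's followed by n - m 1's (m defaults to n // 2)."""
    m = n // 2 if m is None else m
    return "0" * m + "1" * (n - m)

def rand_split(n, m=None, rng=random):
    """Random string of length n with m 0's and n - m 1's."""
    chars = list(init_split(n, m))
    rng.shuffle(chars)
    return "".join(chars)

def split_totals(rows, s):
    """Per-subject totals of the items marked 0 and of the items marked 1.

    Each row holds one subject's item scores; character i of s says which
    half item i belongs to.
    """
    zeros = [sum(v for v, c in zip(row, s) if c == "0") for row in rows]
    ones = [sum(v for v, c in zip(row, s) if c == "1") for row in rows]
    return zeros, ones

first = init_split(6)                     # "000111"
shuffled = rand_split(8, 3)
halves = split_totals([[1, 2, 3, 4], [5, 6, 7, 8]], "0101")
```

A split-based statistic would then be computed from the two lists of totals, for example their correlation or covariance.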

Bradley-Terry Model

In the following, R1 is a square array of the results of head-to-head matches and R2 is a column array with the (population) rankings of the competitors.

BT_MODEL(R1, prec, iter) column array with the rankings of the competitors; iter = maximum # of iterations (default 100); iterations stop when the sum of the ranks <= 1+prec (prec defaults to .0000001)
PairwiseRanks(R1) square array with the pairwise probabilities based on R1
PairwiseRanks(R2) square array with the pairwise probabilities based on R2
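
For orientation, here is a minimal Python sketch of a common iterative scheme for fitting a Bradley-Terry model from a matrix of head-to-head wins, followed by the pairwise win probabilities it implies. It is not the Resource Pack's algorithm: the names bt_model and pairwise_probs are hypothetical, wins[i][j] is assumed to hold the number of times competitor i beat competitor j, and the stopping rule (largest change below prec) is an assumption that differs from the rule quoted above.

```python
def bt_model(wins, prec=1e-7, max_iter=100):
    """Estimate Bradley-Terry strengths, normalized to sum to 1.

    Update: p_i <- W_i / sum_j n_ij / (p_i + p_j), where W_i is the total
    number of wins of i and n_ij the number of matches between i and j.
    Assumes every competitor has at least one win.
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(max_iter):
        new = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            new.append(total_wins / denom)
        scale = sum(new)
        new = [v / scale for v in new]
        converged = max(abs(a - b) for a, b in zip(new, p)) < prec
        p = new
        if converged:
            break
    return p

def pairwise_probs(p):
    """Matrix whose (i, j) entry is P(i beats j) = p_i / (p_i + p_j);
    the diagonal is 0.5 by this formula and carries no meaning."""
    return [[p[i] / (p[i] + p[j]) for j in range(len(p))]
            for i in range(len(p))]

two = bt_model([[0, 3], [1, 0]])                     # 3 wins vs. 1 win
three = bt_model([[0, 2, 3], [1, 0, 2], [0, 1, 0]])
probs = pairwise_probs(two)
```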

Testing Theory (Item Analysis and Item Response)

ITEMDIFF(R1, mx) item difficulty index for the scores in R1 where mx is the maximum score for the item
ITEMDISC(R1, R2, p, mx) item discrimination index where R1 contains the scores for each subject for a single item and R2 contains the corresponding scores for all items based on the top/bottom p% of total scores and mx is the maximum score for the item whose scores are contained in R1
RASCH(R1, head, iter, prec) returns Rasch dichotomous UCON model expected values plus ability and difficulty parameter estimates for the data in R1.
RASCHFIT(R1, head, iter, prec) returns fit matrix for Rasch dichotomous UCON model based on the data in R1; includes infit and outfit values.
RASCH_EXP(R1, R2, R3) returns Rasch dichotomous UCON model expected values plus ability and difficulty parameter estimates for the data in R1 based on the ability parameters in R2 and difficulty parameters in R3.
RASCH_FIT(R1, R2, R3) returns Rasch dichotomous UCON model fit matrix for the data in R1 based on the ability parameters in R2 and difficulty parameters in R3.
RASCH_SUBJ(R1, head, iter, prec) returns an array with three columns consisting of the subject labels, and corresponding ability parameters and standard errors for Rasch dichotomous UCON model based on the data in R1.
RASCH_ITEM(R1, head, iter, prec) returns an array with three columns consisting of the item labels, and corresponding difficulty parameters and standard errors for Rasch dichotomous UCON model based on the data in R1.
UCON(R1, head, iter, prec) returns Rasch polytomous UCON model expected values plus ability and difficulty parameter estimates for the data in R1.
UCONFIT(R1, head, iter, prec) returns fit matrix for Rasch polytomous UCON model based on the data in R1; includes infit and outfit values.
UCON_SUBJ(R1, head, iter, prec) returns an array with three columns consisting of the subject labels, and corresponding ability parameters and standard errors for Rasch polytomous UCON model based on the data in R1.
UCON_ITEM(R1, head, iter, prec) returns an array with three columns consisting of the item labels, and corresponding difficulty parameters and standard errors for Rasch polytomous UCON model based on the data in R1.
UCON_THRESH(R1, head, iter, prec) returns an array with two columns consisting of the category labels and corresponding category threshold parameters for Rasch UCON polytomous model based on the data in R1.
RASCH_INIT(R1) returns output similar to that in R1, but with all items and subjects with all or no correct answers eliminated; assumes R1 has row and column headings
PROX_SUBJ(R1) returns an array with 4 columns consisting of the subject labels, and corresponding ability parameters,  s.e. and % correct scores for Rasch PROX model based on the data in R1 (with headings).
PROX_ITEM(R1) returns an array with 4 columns consisting of the item labels, and corresponding difficulty parameters, s.e. and % correct scores for Rasch PROX model based on the data in R1 (with headings).
PROXX_SUBJ(R1, R2) returns an array with two columns consisting of the ability parameters and standard errors for the subject scores in R2 for the Rasch PROX model based on the data in R1 (w/o headings).
PROXX_ITEM(R1, R2) returns an array with two columns consisting of the difficulty parameters and standard errors for the item scores in R2 for the Rasch PROX model based on the data in R1 (w/o headings).

If head = TRUE (default FALSE), then R1 contains row and column headings; iter = # of iterations (default 25 for dichotomous functions and 100 for polytomous functions); prec = precision (default .00001 for dichotomous functions and .001 for polytomous functions).
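
The two item analysis indices at the start of this section can be illustrated with the usual classical test theory definitions. In the Python sketch below the function names are hypothetical, p is taken as a percentage, and the group size (p% of the subjects, rounded, with ties in total score broken by original order) is an assumption that may differ from ITEMDISC.

```python
def item_difficulty(scores, mx=1):
    """Mean item score as a fraction of the maximum score mx."""
    return sum(scores) / (len(scores) * mx)

def item_discrimination(item_scores, total_scores, p, mx=1):
    """Difference in mean item score between the top and bottom p% of
    subjects (ranked by total score), as a fraction of mx."""
    n = len(item_scores)
    g = max(1, round(n * p / 100))              # size of each group
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    top = [item_scores[i] for i in order[:g]]
    bottom = [item_scores[i] for i in order[-g:]]
    return (sum(top) - sum(bottom)) / (g * mx)

item = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]       # one item, ten subjects
totals = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]    # total test scores
difficulty = item_difficulty(item)
discrimination = item_discrimination(item, totals, p=30)
```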

Statistical power functions

CALPHA_POWER(ca0, ca1, n, k, tails, α) power of Cronbach’s alpha test where ca0 = Cronbach’s alpha under the null hypothesis, ca1 = Cronbach’s alpha under the alternative hypothesis and k = # of items; tails = # of tails: 1 or 2 (default)
ICC_POWER(ρ0, ρ1, n, k, α) power of ICC(1,1) test where ρ0 = ICC(1,1) under the null hypothesis, ρ1 = ICC(1,1) under the alternative hypothesis and k = # of items
BKAPPA_POWER(κ0, κ1, p1, q1, n, tails, α) power of Cohen’s kappa test achieved for a sample of size n when the null and alternative hypothesis kappa are κ0 and κ1 and the marginal probabilities that rater 1 and rater 2 choose category 1 are p1 and q1

Sample size functions

CALPHA_SIZE(ca0, ca1, k, 1−β, tails, α) minimum sample size required to obtain power of at least 1−β for Cronbach’s alpha test where ca0 = Cronbach’s alpha under the null hypothesis, ca1 = Cronbach’s alpha under the alternative hypothesis and k = # of items; tails = # of tails: 1 or 2 (default)
ICC_SIZE(ρ0, ρ1, k, 1−β, α) minimum sample size required to obtain power of at least 1−β for ICC(1,1) test where ρ0 = ICC(1,1) under the null hypothesis, ρ1 = ICC(1,1) under the alternative hypothesis and k = # of items
BKAPPA_SIZE(κ0, κ1, p1, q1, 1−β, tails, α) minimum sample size required to obtain power of at least 1−β for Cohen’s kappa test when the null and alternative hypothesis kappa are κ0 and κ1 and the marginal probabilities that rater 1 and rater 2 choose category 1 are p1 and q1

α = alpha (default = .05), default for 1−β = .80, n = the sample size.
