cor
Correlation, Variance, and Covariance (Matrices)

Description

Calculates the variance of a vector, the variance-covariance (or correlation) matrix of a data matrix, or covariances between matrices or vectors. Converts a variance-covariance matrix to a correlation matrix.

Usage

var(x, y = NULL, na.rm = FALSE, use)

cov(x, y = NULL, use = "everything", method = c("pearson", "kendall", "spearman"))

cor(x, y = NULL, use = "everything", method = c("pearson", "kendall", "spearman"))

cov2cor(V)

Arguments

x a numeric or logical vector, matrix, or data frame. If x is a matrix or data frame, columns represent variables and rows represent observations.
V a square covariance matrix for the cov2cor function. Missing values (NAs) are allowed but result in missing values in the result.
y a numeric or logical vector, matrix, or data frame with the same number of observations as x. If y is a matrix or data frame, columns represent variables and rows represent observations. If x is a matrix or data frame, you can leave y = NULL, in which case the computation is performed between the columns of x.

For cor() and cov(), y is required if x is a vector.

na.rm a logical value. If TRUE, missing values (NAs) are removed before computing. Default value is FALSE. If you specify use, the value of na.rm is ignored.
use a character string that specifies how missing values (NAs) are handled when computing the results. The value can be one of the following:
  • all.obs means that all observations must be numeric and that missing values (NAs) are not allowed. An error is returned if there are any missing values (NAs) in x or y.

  • complete.obs means that rows that contain a missing value (NA) are ignored. An error is returned if all rows contain at least one missing value.

  • everything means that values for all pairs of columns are computed, but a missing value (NA) is returned for pairs that contain at least one missing value (NA). This is equivalent to supplying na.rm = FALSE and not specifying a value for use.

  • na.or.complete means that rows that contain a missing value (NA) are ignored in the computations. An error is returned if all rows contain at least one missing value. This is equivalent to supplying na.rm = TRUE and not specifying a value for use.

  • pairwise.complete.obs means variances are computed for each variable using all non-missing values, and covariances or correlations for each pair of variables are computed using observations with no missing data for that pair.
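
The following sketch illustrates how the use settings described above treat a single missing value. The data frame d and its column names are made up for illustration, and the results are stored in hypothetical variables so they can be compared. The error message text produced by all.obs is not shown because it may differ between implementations.

```r
# Small illustrative data: column b has one missing value (row 2).
d <- data.frame(a = c(1, 2, 3, 4, 5),
                b = c(2, NA, 6, 8, 10),
                c = c(5, 3, 4, 1, 2))

# "everything" (the default): any pair that involves column b is NA.
r_all <- cor(d, use = "everything")

# "complete.obs": row 2 is dropped for every pair of columns.
r_complete <- cor(d, use = "complete.obs")

# "pairwise.complete.obs": row 2 is dropped only for pairs that involve b,
# so the a/c correlation still uses all five rows.
r_pairwise <- cor(d, use = "pairwise.complete.obs")

# "all.obs": a missing value anywhere in the data is an error.
r_err <- tryCatch(cor(d, use = "all.obs"), error = function(e) "error")
```

Comparing r_complete["a", "c"] with r_pairwise["a", "c"] shows the practical difference: the first is computed from four rows, the second from five.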

method a character string that specifies the method used to compute the covariance or correlation. Available methods are pearson (the default), kendall, or spearman.
  • pearson means that the Pearson product-moment correlation coefficient is computed.

  • kendall means that Kendall's tau statistic is used to compute a rank correlation coefficient.

  • spearman means that Spearman's rho statistic is used to compute a rank correlation coefficient.
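
As a small sketch of how the three methods differ, the example below uses made-up vectors x and y with a monotone but nonlinear relationship. It relies on the standard definition of Spearman's rho as the Pearson coefficient of the ranks. The variable names are illustrative only.

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- x^3   # increases with x, but not linearly

# Pearson measures linear association, so it is below 1 here.
r_pearson <- cor(x, y, method = "pearson")

# Both rank-based methods see a perfectly monotone relationship.
r_spearman <- cor(x, y, method = "spearman")
r_kendall <- cor(x, y, method = "kendall")

# Spearman's rho is the Pearson coefficient computed on the ranks.
r_rank <- cor(rank(x), rank(y), method = "pearson")
all.equal(r_spearman, r_rank)
```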

Value

If x is a vector, the return value is a vector whose length equals the number of columns in y. If you do not supply y, the length is 1.
If x is a matrix or a data frame, the return value is a matrix whose [i,j] element is the covariance (or correlation) of x[,i] and either y[,j], if y is supplied, or x[,j] otherwise.
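
The shape of the result can be checked with a short sketch. The matrix m and its column names are invented for illustration.

```r
m <- cbind(u = c(1, 4, 2, 8, 5),
           v = c(3, 1, 7, 2, 9),
           w = c(6, 5, 4, 3, 1))

# One row and one column per variable in m.
cv <- cov(m)
dim(cv)

# Element [i, j] is the covariance of columns i and j of m.
all.equal(cv[1, 2], cov(m[, 1], m[, 2]))

# The diagonal holds the variance of each column.
all.equal(cv[2, 2], var(m[, 2]))

# With both x and y supplied as matrices, rows correspond to the columns
# of x and columns correspond to the columns of y.
cxy <- cov(m[, 1:2], m[, 3, drop = FALSE])
dim(cxy)

# A vector with no y gives a single value from var().
v1 <- var(m[, 1])
length(v1)
```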

References

Becker, R.A., Chambers, J.M., and Wilks, A.R. (1988). The New S Language. Wadsworth & Brooks/Cole.
Chan, T., Golub, G., and LeVeque, R. (1983). Algorithms for computing the sample variance: analysis and recommendations. The American Statistician 37: 242-247.
Gnanadesikan, R. (1977). Methods for Statistical Data Analysis of Multivariate Observations. New York: Wiley.
Gnanadesikan, R., and Kettenring, J.R. (1972). Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28: 81-124.
Huber, P.J. (1981). Robust Statistics. New York: Wiley.
Little, R.J.A., and Rubin, D.B. (1987). Statistical Analysis with Missing Data. New York: Wiley.
Schafer, J.L. (1997). Analysis of Incomplete Multivariate Data. London: Chapman & Hall.

See Also

mad, mean

Examples

# 7 by 7 correlation matrix for the longley data
cor(Sdatasets::longley)
# The same thing
cov2cor(cov(Sdatasets::longley))

# Correlation matrices using each available method
cor(Sdatasets::longley, method="pearson")
cor(Sdatasets::longley, method="kendall")
cor(Sdatasets::longley, method="spearman")

# Covariance matrices using each available method
cov(Sdatasets::longley, method="pearson")
cov(Sdatasets::longley, method="kendall")
cov(Sdatasets::longley, method="spearman")

var(Sdatasets::longley)
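
The conversion performed by cov2cor can also be written out by hand. The sketch below assumes the usual definition of a correlation as a covariance divided by the product of the two standard deviations. It is an illustration of that relationship, not a description of how cov2cor is implemented, and the matrix m is made-up data.

```r
m <- cbind(u = c(1, 4, 2, 8, 5),
           v = c(3, 1, 7, 2, 9),
           w = c(6, 5, 4, 3, 1))

V <- cov(m)

# Standard deviations are the square roots of the diagonal of V.
s <- sqrt(diag(V))

# Divide element [i, j] of V by s[i] * s[j].
R_manual <- V / outer(s, s)

# Both comparisons should agree up to floating-point tolerance.
all.equal(R_manual, cov2cor(V))
all.equal(R_manual, cor(m))
```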

Package stats version 6.0.0-69