cor
Correlation, Variance, and Covariance (Matrices)
Description
Calculates the variance of a vector, the variance-covariance (or
correlation) matrix of a data matrix, or covariances between matrices
or vectors. Converts a variance-covariance matrix to a correlation
matrix.
Usage
var(x, y = NULL, na.rm = FALSE, use)
cov(x, y = NULL, use = "everything", method = c("pearson",
"kendall", "spearman"))
cor(x, y = NULL, use = "everything", method = c("pearson",
"kendall", "spearman"))
cov2cor(V)
Arguments
x |
a numeric or logical vector, matrix, or data frame.
If x is a matrix or data frame, columns represent variables and
rows represent observations.
|
V |
a square covariance matrix for the cov2cor function. Missing
values (NAs) are allowed but result in missing values in the
result.
|
y |
a numeric or logical vector, matrix, or data frame with the same
number of observations as x.
If y is a matrix or data frame, columns represent variables and rows represent observations. If x is a matrix or data frame, you can set y = NULL, in which case the columns of x are compared with each other.
For cor() and cov(), y is required if x is
a vector.
|
na.rm |
a logical value. If TRUE, missing values (NAs) are removed before computing. Default value is FALSE. If you specify use, the value of na.rm is ignored.
|
use |
a character string that specifies how missing values (NAs)
are handled when computing the results.
The value can be one of the following:
- all.obs means that all observations must be numeric
and that missing values (NAs) are not allowed.
An error is returned if there are any missing values (NAs) in x or y.
- complete.obs means that rows that contain a missing value (NA) are ignored. An error is returned if all rows contain at least one missing value.
- everything means that values for all pairs of columns
are computed, but a missing value (NA) is returned for pairs that contain at least one missing value (NA). This is equivalent to supplying na.rm = FALSE and not specifying a value for use.
- na.or.complete means that rows that contain a missing
value (NA) are ignored in the computations. An error is returned if all rows contain at least one missing value. This is equivalent to supplying na.rm = TRUE and not specifying a value for use.
- pairwise.complete.obs means variances are computed for
each variable using all non-missing values, and covariances or correlations for each pair of variables are computed using observations with no missing data for that pair.
|
method |
a character string that specifies the standard method to employ
for the computation of the covariance or correlation.
Available methods are pearson (the default), kendall,
or spearman.
- pearson means that the Pearson product-moment correlation
coefficient is computed.
- kendall means that Kendall's tau statistic is
used to compute a rank correlation coefficient.
- spearman means that Spearman's rho statistic
is used to compute a rank correlation coefficient.
|
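The relationship between the three method values can be checked numerically. The following is a minimal sketch, not taken from the original page: the vectors x and y are made-up data with no ties, and the code assumes behavior matching open-source R, where Spearman's rho is the Pearson correlation of the ranks and Kendall's tau (without ties) is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.

```r
# Made-up illustrative data; no tied values in either vector
x <- c(1.2, 3.4, 2.2, 8.9, 5.0, 4.1)
y <- c(2.0, 2.9, 3.1, 9.5, 4.4, 6.0)

# Spearman's rho is the Pearson correlation applied to the ranks
rho <- cor(x, y, method = "spearman")
stopifnot(isTRUE(all.equal(rho, cor(rank(x), rank(y), method = "pearson"))))

# Kendall's tau from concordant and discordant pairs (valid without ties)
pairs <- combn(length(x), 2)   # each column is one pair of observation indices
agree <- sign(x[pairs[1, ]] - x[pairs[2, ]]) *
         sign(y[pairs[1, ]] - y[pairs[2, ]])
tau <- cor(x, y, method = "kendall")
stopifnot(isTRUE(all.equal(tau, sum(agree) / ncol(pairs))))
```

Because both rank-based methods depend only on the ordering of the values, they are unchanged by any strictly increasing transformation of x or y, which is not true of the Pearson coefficient.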
Value
- var returns variances
- cor returns correlations
- cov returns covariances
- cov2cor returns a correlation matrix like V
If x is a vector, the return value is a vector where the length is equal to the number of columns in y. If you do not supply y, the length will be 1.
If x is a matrix or a data frame, the return value is a matrix such that the [i,j] element is the covariance (correlation) of x[,i] and either y[,j] or x[,j].
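The matrix structure described above, and the conversion performed by cov2cor, can be illustrated with a small sketch. The matrices x and y below are made-up data, and the scaling formula shown (dividing each element of V by the product of the corresponding standard deviations) is the standard definition of a correlation matrix, assumed here rather than quoted from the implementation.

```r
# Made-up data: two matrices with the same number of rows (observations)
x <- cbind(u = c(1, 3, 2, 5, 4), v = c(2, 1, 4, 3, 6))
y <- cbind(w = c(5, 4, 3, 2, 1), z = c(1, 4, 9, 16, 25))

# cov(x, y) has one row per column of x and one column per column of y
cxy <- cov(x, y)
stopifnot(identical(dim(cxy), c(2L, 2L)))

# The [i, j] element is the covariance of x[, i] and y[, j]
stopifnot(isTRUE(all.equal(cxy["u", "z"], cov(x[, "u"], y[, "z"]))))

# cov2cor rescales a covariance matrix by the standard deviations
# taken from the square roots of its diagonal
V <- cov(x)
s <- sqrt(diag(V))
stopifnot(isTRUE(all.equal(cov2cor(V), V / outer(s, s))))

# The result agrees with computing the correlation matrix directly
stopifnot(isTRUE(all.equal(cov2cor(V), cor(x))))
```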
References
Chan, T., Golub, G., and LeVeque, R. (1983).
Algorithms for computing the sample variance: analysis and recommendations.
The American Statistician 37: 242-247.
Gnanadesikan, R. (1977).
Methods for Statistical Data Analysis of Multivariate Observations.
New York: Wiley.
Gnanadesikan, R. and Kettenring, J.R. (1972).
Robust estimates, residuals, and outlier detection with multiresponse data.
Biometrics 28: 81-124.
Huber, P.J. (1981).
Robust Statistics.
New York: Wiley.
Little, R.J.A., and Rubin, D.B. (1987).
Statistical Analysis with Missing Data.
New York: Wiley.
Schafer, J.L. (1997).
Analysis of Incomplete Multivariate Data.
London: Chapman & Hall.
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988)
The New S Language.
Wadsworth & Brooks/Cole.
Examples
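# 7 by 7 correlation matrix for the longley data
cor(Sdatasets::longley)
# The same matrix, obtained by rescaling the covariance matrix
cov2cor(cov(Sdatasets::longley))
# Correlation matrices for each of the three methods
cor(Sdatasets::longley, method="pearson")
cor(Sdatasets::longley, method="kendall")
cor(Sdatasets::longley, method="spearman")
# Covariance matrices for each of the three methods
cov(Sdatasets::longley, method="pearson")
cov(Sdatasets::longley, method="kendall")
cov(Sdatasets::longley, method="spearman")
# Variance-covariance matrix of the longley variables
var(Sdatasets::longley)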
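The examples above use data without missing values, so they do not exercise the use argument. The following sketch shows how the use options behave on a small data frame with NAs. The data frame d is made up for illustration, and the behavior shown assumes compatibility with open-source R.

```r
# Made-up data: columns a and b each contain one NA; column c is complete
d <- data.frame(a = c(1, 2, 3, 4, NA),
                b = c(2, 4, NA, 8, 10),
                c = c(5, 3, 4, 1, 2))

# "everything": pairs involving a column with an NA are returned as NA
r_all <- cor(d, use = "everything")

# "complete.obs": only rows 1, 2, and 4 have no NA, so only they are used
r_complete <- cor(d, use = "complete.obs")

# "pairwise.complete.obs": each pair uses the rows complete for that pair,
# so a and c use rows 1 to 4 while b and c use rows 1, 2, 4, and 5
r_pairwise <- cor(d, use = "pairwise.complete.obs")

# "all.obs": signals an error because d contains missing values
r_error <- try(cor(d, use = "all.obs"), silent = TRUE)

print(r_all)
print(r_complete)
print(r_pairwise)
print(inherits(r_error, "try-error"))
```

Note that with pairwise.complete.obs different entries of the result can be based on different subsets of rows, so the resulting matrix is not guaranteed to be positive semidefinite.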