ks.test
Kolmogorov-Smirnov Tests

Description

Performs a one-sample or two-sample Kolmogorov-Smirnov test. The one-sample test compares the distribution of x with a hypothesized distribution; the two-sample test compares the distributions of x and y.

Usage

ks.test(x, y, ..., alternative = c("two.sided", "less", "greater"), exact = NULL, 
    distribution = "normal")

Arguments

x a numeric vector that contains the sample values from one of the distributions. Missing values (NAs) and infinite values (Infs) are ignored.
y
two-sample test:
a numeric vector that contains the sample values for the two-sample test. Missing values (NAs) and infinite values (Infs) are ignored.
one-sample test:
a character string that names the cumulative distribution function of the hypothesized distribution, which can be one of pnorm, pbeta, pcauchy, pchisq, pexp, pf, pgamma, plnorm, plogis, pt, punif, pweibull, pbinom, pgeom, phyper, pnbinom, ppois, pwilcox.
alternative a character string that specifies the alternative hypothesis. To test the hypothesis, type one of the following:

two.sided CDF of x is not equal to the CDF under the null hypothesis.
greater CDF of x lies above the CDF under the null hypothesis.
less CDF of x lies below the CDF under the null hypothesis.
Note: You only need to enter enough of the character string to create a unique match for the value.
... For the one-sample test, parameter arguments for the cumulative distribution function named by y. For example, if y = "pnorm", those arguments are passed to pnorm.
exact a logical value that specifies whether the function computes an exact p-value. exact is valid only in the two-sided case and only when the sample values contain no ties (duplicates).

If you do not specify a value for exact, the function sets exact = FALSE except in the following cases:

  • one-sample test: the number of non-NA values in the x input vector is less than 100.
  • two-sample test: the product of the numbers of non-NA values in x and y is less than 10000.
distribution a character string that specifies the hypothesized distribution if y is not provided. It can be one of "normal", "beta", "cauchy", "chisquare", "exponential", "f", "gamma", "lognormal", "logistic", "t", "uniform", "weibull", "binomial", "geometric", "hypergeometric", "negbinomial", "poisson", "wilcoxon".
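
The default rule for exact described above can be sketched as a small helper. This is only an illustration: default_exact is a hypothetical name, not part of ks.test, and the sketch deliberately ignores ties, which also affect whether an exact p-value is available.

```r
# Hypothetical helper (not part of ks.test): mirrors the documented default
# that applies when exact is left unspecified.
default_exact <- function(x, y = NULL) {
  nx <- sum(!is.na(x))                  # non-NA values in x
  if (is.null(y) || is.character(y)) {
    nx < 100                            # one-sample rule
  } else {
    ny <- sum(!is.na(y))                # non-NA values in y
    nx * ny < 10000                     # two-sample rule
  }
}

default_exact(rnorm(50), "pnorm")       # TRUE:  50 < 100
default_exact(rnorm(100), "pnorm")      # FALSE: 100 is not less than 100
default_exact(rnorm(90), rnorm(8))      # TRUE:  90 * 8 = 720 < 10000
default_exact(rnorm(200), rnorm(100))   # FALSE: 200 * 100 = 20000
```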

Details

The function uses several algorithms to calculate p-values. For the asymptotic distribution of the one-sample and two-sample two-sided Kolmogorov-Smirnov statistics, it uses the kstwo algorithm. For the greater and less alternatives, it approximates the p-value with the exponential algorithm. These defaults do not apply when an exact p-value is computed:
one-sample test The Kolmogorov algorithm is used to get an exact p-value for every alternative.
two-sample test The Smirnov algorithm is used to get an exact p-value only for the two-sided alternative.
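
As a rough illustration of the quantities involved, the one-sample statistics can be computed by hand. The sketch below assumes that the exponential approximation for the one-sided alternatives takes the standard asymptotic form exp(-2 * n * D^2); the text above does not spell out the formula, so treat that line as an assumption. All variable names are illustrative.

```r
set.seed(1)                                 # reproducible sample
z  <- rnorm(200)
n  <- length(z)
Fz <- pnorm(sort(z))                        # hypothesized CDF at the ordered sample

d_plus  <- max(seq_len(n) / n - Fz)         # largest gap with the ECDF above ("greater")
d_minus <- max(Fz - (seq_len(n) - 1) / n)   # largest gap with the ECDF below ("less")
d       <- max(d_plus, d_minus)             # two-sided statistic

# Assumed form of the one-sided approximation: exp(-2 * n * D^2)
p_greater <- exp(-2 * n * d_plus^2)
p_less    <- exp(-2 * n * d_minus^2)

# The hand-computed two-sided statistic should agree with ks.test
fit <- ks.test(z, y = "pnorm", exact = FALSE)
all.equal(unname(fit$statistic), d)
```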

Value

Returns a list of class htest that contains the following components:
statistic the KS statistic along with a names attribute that lists the statistic:
"D" two.sided
"D^+" greater
"D^-" less
p.value p-value for the test.
alternative a character string that returns the alternative hypothesis (two.sided, greater, or less) as specified in the alternative argument.

For a one-sample test, the character string for each alternative hypothesis is:

  • two.sided: two-sided.
  • greater: the CDF of x lies above the null hypothesis.
  • less: the CDF of x lies below the null hypothesis.
For a two-sample test, the character string for each alternative hypothesis is:
  • two.sided: two-sided.
  • greater: the CDF of x lies above that of y.
  • less: the CDF of x lies below that of y.
method a character string for the name of the method used for the calculation.
data.name a character string (vector of length 1) that contains the names of the x and y input vectors.
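
A short sketch of how the components of the returned list might be inspected. The sample data and the variable name res are illustrative only.

```r
set.seed(42)                         # reproducible samples
x <- rnorm(90)
y <- rnorm(8, mean = 2, sd = 1)

res <- ks.test(x, y, alternative = "less", exact = FALSE)

class(res)               # "htest"
names(res$statistic)     # label of the statistic for this alternative
res$p.value              # p-value for the test
res$alternative          # description of the alternative hypothesis
res$method               # name of the method used
res$data.name            # names of the input vectors
```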
Differences between Spotfire Enterprise Runtime for R and Open-source R
References
Birnbaum, Z. W. and Tingey, F. H. 1951. One-sided confidence contours for probability distribution functions. The Annals of Mathematical Statistics. Volume 22, Issue 4. 592-596.
Conover, W. J. 1971. Practical nonparametric statistics. New York, NY: John Wiley & Sons. 295-301 (one-sample Kolmogorov test), 309-314 (two-sample Smirnov test).
Durbin, J. 1973. Distribution theory for tests based on the sample distribution function. Philadelphia, PA: SIAM proceedings.
Marsaglia, G., Tsang, W. W., and Wang, J. 2003. Evaluating Kolmogorov's distribution. Journal of Statistical Software. Volume 8, Issue 18. http://www.jstatsoft.org/v08/i18/.
See Also
chisq.test, ppoints (to create QQ plots), qqnorm, qqplot.
Examples
# one sample 
z <- rnorm(100)                   
ks.test(z, y = "pnorm")          # hypothesize a normal distn. 
ks.test(z, y = "pchisq", df = 2)  # hypothesize a chisquare distn. 
ks.test(z, y = "pgamma", shape = 3, scale = 2, exact = FALSE, alternative = "greater")

# two sample
x <- rnorm(90)
y <- rnorm(8, mean = 2.0, sd = 1)
ks.test(x, y)
ks.test(x, y, exact = FALSE, alternative = "less")

Package stats version 6.1.1-7