ks.test
Kolmogorov-Smirnov Tests

Description

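Performs a one-sample or two-sample Kolmogorov-Smirnov test. The one-sample test compares the distribution of a sample with a hypothesized distribution; the two-sample test compares the distributions of two samples.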

Usage

ks.test(x, y, ..., alternative = c("two.sided", "less", "greater"), exact = NULL, 
    distribution = "normal")

Arguments

x a numeric vector that contains the sample values from one of the distributions. Missing values (NAs) and infinite values (Infs) are ignored.
y
two-sample test:
a numeric vector that contains the sample values for the two-sample test. Missing values (NAs) and infinite values (Infs) are ignored.
one-sample test:
a character string that names the cumulative distribution function (CDF) of the hypothesized distribution, which can be one of pnorm, pbeta, pcauchy, pchisq, pexp, pf, pgamma, plnorm, plogis, pt, punif, pweibull, pbinom, pgeom, phyper, pnbinom, ppois, pwilcox.
alternative a character string that specifies the alternative hypothesis. It must be one of the following:

two.sided CDF of x is not equal to the null hypothesis.
greater CDF of x lies above the null hypothesis.
less CDF of x lies below the null hypothesis.
Note: You only need to enter enough of the character string to create a unique match for the value.
... For the one-sample test, parameters of the hypothesized distribution, which are passed to the function named by y. For example, if y = "pnorm", these arguments are passed to pnorm.
exact a logical value that specifies whether the function should compute an exact p-value. exact is valid only for the two-sided alternative and only when the sample values contain no duplicates (that is, no ties).

If you do not specify a value for exact, the function sets exact = FALSE except in the following cases:

  • one-sample test: the number of non-NA values in the x input vector is less than 100.
  • two-sample test: the product of the numbers of non-NA values in x and y is less than 10000.
distribution a character string that specifies the hypothesized distribution if y is not provided. It can be one of "normal", "beta", "cauchy", "chisquare", "exponential", "f", "gamma", "lognormal", "logistic", "t", "uniform", "weibull", "binomial", "geometric", "hypergeometric", "negbinomial", "poisson", "wilcoxon".
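
The default rule for exact described above can be sketched as a small helper. This is an illustration only: default_exact is a hypothetical name, not part of ks.test, and the sketch ignores the additional restrictions on ties and on the alternative.

```r
# Hypothetical helper (not part of ks.test): mirrors the documented default for exact.
default_exact <- function(x, y = NULL) {
  nx <- sum(!is.na(x))                 # number of non-NA values in x
  if (is.null(y) || is.character(y)) {
    nx < 100                           # one-sample rule
  } else {
    nx * sum(!is.na(y)) < 10000        # two-sample rule
  }
}

default_exact(rnorm(50), "pnorm")      # one-sample, 50 values
default_exact(rnorm(200), "pnorm")     # one-sample, 200 values
default_exact(rnorm(90), rnorm(8))     # two-sample, 90 * 8 = 720
default_exact(rnorm(500), rnorm(50))   # two-sample, 500 * 50 = 25000
```

Under this sketch the first and third calls give TRUE and the other two give FALSE.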

Details

The function uses several algorithms to calculate p-values. To compute the asymptotic distribution of the one-sample and two-sample two-sided Kolmogorov-Smirnov statistics, the function uses the kstwo algorithm. To approximate the p-value for the greater and less alternatives, the function uses the exponential algorithm. The exceptions are the cases in which an exact p-value is computed:
one-sample test The Kolmogorov algorithm is used to get an exact p-value for every alternative.
two-sample test The Smirnov algorithm is used to get an exact p-value only for the two-sided alternative.
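
As a rough illustration of the quantities involved, the one-sample statistics can be computed by hand from the ordered sample. The sketch below assumes that the exponential approximation for the one-sided alternatives has the standard form exp(-2 * n * D^2); this page does not state the formula, so treat that line as an assumption. The seed and sample size are arbitrary.

```r
# Sketch only: hand computation of the one-sample statistics against pnorm.
set.seed(1)                 # arbitrary seed so the run is reproducible
z  <- rnorm(30)
n  <- length(z)
Fz <- pnorm(sort(z))        # hypothesized CDF at the ordered sample values

d_plus  <- max(seq_len(n) / n - Fz)        # "D^+": how far the ECDF rises above the CDF
d_minus <- max(Fz - (seq_len(n) - 1) / n)  # "D^-": how far the ECDF falls below the CDF
d       <- max(d_plus, d_minus)            # "D": two-sided statistic

# Assumption: one-sided exponential approximation of the p-value.
p_greater_approx <- exp(-2 * n * d_plus^2)
p_less_approx    <- exp(-2 * n * d_minus^2)

# The hand-computed D should agree with the statistic reported by ks.test.
all.equal(unname(ks.test(z, "pnorm")$statistic), d)
```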
Value
Returns a list of class htest that contains the following components:
statistic the value of the KS statistic, with a names attribute that identifies the statistic:
"D" two.sided
"D^+" greater
"D^-" less
p.value p-value for the test.
alternative a character string that describes the alternative hypothesis selected by the alternative argument (two.sided, greater, or less).

For a one-sample test, the character string for each alternative hypothesis is:

  • two.sided: two-sided.
  • greater: the CDF of x lies above the null hypothesis.
  • less: the CDF of x lies below the null hypothesis.
For a two-sample test, the character string for each alternative hypothesis is:
  • two.sided: two-sided.
  • greater: the CDF of x lies above that of y.
  • less: the CDF of x lies below that of y.
method a character string for the name of the method used for the calculation.
data.name a character string (vector of length 1) that contains the names of the x and y input vectors.
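
The components listed above can be read from the returned list with the usual $ operator. The following is a minimal sketch; the seed and sample size are arbitrary choices made for illustration.

```r
# Sketch: inspect the components of the object returned by ks.test.
set.seed(42)                       # arbitrary seed for reproducibility
z   <- rnorm(50)
res <- ks.test(z, "pnorm", alternative = "greater")

inherits(res, "htest")             # the result is a list of class htest
names(res$statistic)               # label of the statistic for this alternative
res$p.value                        # p-value for the test
res$alternative                    # description of the alternative hypothesis
res$method                         # name of the method used
res$data.name                      # name of the input data
```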
Differences between TIBCO Enterprise Runtime for R and Open-source R
References
Z. W. Birnbaum and Fred H. Tingey (1951), One-sided confidence contours for probability distribution functions. The Annals of Mathematical Statistics, 22/4, 592--596.
William J. Conover (1971), Practical Nonparametric Statistics. New York: John Wiley & Sons. Pages 295--301 (one-sample Kolmogorov test), 309--314 (two-sample Smirnov test).
J. Durbin (1973), Distribution theory for tests based on the sample distribution function. SIAM.
George Marsaglia, Wai Wan Tsang and Jingbo Wang (2003), Evaluating Kolmogorov's distribution. Journal of Statistical Software, 8/18. http://www.jstatsoft.org/v08/i18/.
See Also
chisq.test, ppoints (to create QQ plots), qqnorm, qqplot.
Examples
# one sample 
z <- rnorm(100)                   
ks.test(z, y = "pnorm")          # hypothesize a normal distn. 
ks.test(z, y = "pchisq", df = 2)  # hypothesize a chisquare distn. 
ks.test(z, y = "pgamma", shape = 3, scale = 2, exact = FALSE, alternative = "greater")

# two sample
x <- rnorm(90)
y <- rnorm(8, mean = 2.0, sd = 1)
ks.test(x, y)
ks.test(x, y, exact = FALSE, alternative = "less")

Package stats version 4.0.0-28