density
Kernel Estimate of Probability Density Function

Description

Returns x and y coordinates of a non-parametric estimate of the probability density function of the data.

Usage

density(x, ...)
density.default(x, bw = "nrd0", adjust = 1, kernel = c("gaussian", 
    "epanechnikov", "rectangular", "triangular", "biweight", 
    "cosine", "optcosine"), weights = NULL, window = kernel, 
    width, give.Rkern = FALSE, n = 512, from, to, cut = 3, na.rm = FALSE, 
    ...) 

Arguments

x the vector of observations from the distribution whose density is to be estimated. Missing values (NAs) are allowed if na.rm is TRUE.
bw the smoothing bandwidth to be used in density estimation.

bw can be either a positive number specifying the bandwidth explicitly, or it can be a character string.

When bw is specified as a character string, case is ignored. The following table lists each accepted character string, the function it refers to, and a description of that function.

character string     function    description
"nrd0"               bw.nrd0     normal reference density, never returning 0.0
"nrd"                bw.nrd      normal reference density, possibly returning 0.0
"bcv"                bw.bcv      biased cross-validation
"ucv"                bw.ucv      unbiased cross-validation
"sj", "sj-ste"       bw.SJ       the Sheather-Jones "plug-in" estimator with method "ste"
"sj-dpi"             bw.SJ       the Sheather-Jones "plug-in" estimator with method "dpi"
These referenced functions use various algorithms to choose the bandwidth from the data x. All of them ignore the weights argument.
adjust the number derived from the bw argument is multiplied by adjust to make the bandwidth.
kernel a character string giving the type of kernel function used in the computations. Must be one of: "gaussian", "epanechnikov", "rectangular", "triangular", "biweight", "cosine", "optcosine" (one character is sufficient).
weights a vector of the same length as x for computing a weighted density estimate. The weights must be nonnegative and sum to 1.0. When weights is NULL (the default), all points in x are equally weighted.
width For compatibility with S-PLUS, this can be used instead of bw. width is multiplied by a kernel-dependent quantity to obtain the equivalent bw.
give.Rkern a logical flag. If TRUE, the quantity integral(u^2 * K(u) * du) * integral(K(u)^2 *du) of the selected kernel function is returned instead of the usual return value.
n the number of equally-spaced points at which to estimate the density. If n is greater than 512, it is rounded up to a power of 2.
from, to the n estimated values of density are equally-spaced between from and to. The default is the range of the data extended by bw*cut.
cut the multiple of the bandwidth by which the range of the x values is extended when from and to are not given. The default is 3. cut is ignored if from and to are used.
na.rm a logical flag. If TRUE, then missing values (NAs) are removed before estimation. If FALSE (the default), then missing values are not allowed.
... other arguments for non-default methods.
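
The following is a minimal sketch of how several of the arguments above interact. It reuses the data from the Examples section; the variable names (d_default, d_adj, d_grid, d_plain, d_wt) are illustrative only, and the behavior noted in the comments is that of open-source R, which this function is assumed to match.

```r
x <- (cos(1:300) + 0.09)^3

# Character-string bw: the named selector computes the bandwidth from x.
d_default <- density(x, bw = "nrd0")
bw_from_selector <- bw.nrd0(x)    # the function that "nrd0" refers to

# Numeric bw: the final bandwidth is bw multiplied by adjust.
d_adj <- density(x, bw = 0.25, adjust = 2)

# from, to, and n fix the output grid; cut is not needed in this case.
d_grid <- density(x, bw = 0.25, from = -1, to = 1.5, n = 256)

# Equal weights that sum to 1 should reproduce the unweighted estimate.
d_plain <- density(x, bw = 0.25)
d_wt <- density(x, bw = 0.25, weights = rep(1 / length(x), length(x)))

print(c(d_default$bw, bw_from_selector, d_adj$bw))
print(range(d_grid$x))
```

A numeric bw is used in the weighted call because the bandwidth selectors ignore weights.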

Details

These are kernel estimates. For each x value in the output, the window is centered on that x, and the heights of the window at each data point are summed. This sum, after normalization, is the corresponding y value in the output: the value at x[i] is
y[i] = 1/N * sum(K(x[i] - X))
where K is the kernel function specified by kernel and bw (or by window and width), X is the input data, and N is the length of X. In the presence of weights, the value is
y[i] = 1/sum(weights) * sum(weights * K(x[i] - X)).
For efficiency, the convolution is computed using the discrete Fourier transform.
The bandwidth functions bw.SJ, bw.ucv, and bw.bcv are not yet in TIBCO Enterprise Runtime for R.
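
As a rough illustration of the sums in this section, the sketch below evaluates them directly on a grid instead of using the discrete Fourier transform. It assumes a Gaussian kernel whose standard deviation equals the bandwidth (the convention in open-source R); the names K, h, grid, y_direct, and y_weighted are hypothetical and are not part of the density interface.

```r
set.seed(42)                         # arbitrary seed, for reproducibility
X <- rnorm(200)                      # input data
N <- length(X)
h <- bw.nrd0(X)                      # bandwidth from the default selector

# Assumption: Gaussian kernel scaled so that its standard deviation is h.
K <- function(u) dnorm(u, sd = h)

# Output grid: the range of the data extended by 3 bandwidths (cut = 3).
grid <- seq(min(X) - 3 * h, max(X) + 3 * h, length.out = 512)

# Unweighted estimate: y[i] = 1/N * sum(K(x[i] - X))
y_direct <- sapply(grid, function(g) sum(K(g - X)) / N)

# Weighted estimate: y[i] = 1/sum(weights) * sum(weights * K(x[i] - X))
w <- rep(1 / N, N)
y_weighted <- sapply(grid, function(g) sum(w * K(g - X)) / sum(w))

# The estimate should integrate to roughly 1 over the grid.
area <- sum(y_direct) * (grid[2] - grid[1])

# Compare with density() on the same grid; small differences are expected
# because density() uses a faster approximate computation.
d <- density(X, bw = h, from = min(grid), to = max(grid), n = 512)
max_diff <- max(abs(d$y - y_direct))
print(c(area = area, max_diff = max_diff))
```

The direct sum costs one kernel evaluation per pair of grid point and data point, which is why a transform-based convolution is preferred for large inputs.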

Value

Returns the R-kernel value when give.Rkern is TRUE. Otherwise, returns a list object of class "density" with the following components. The x and y components are suitable for passing to approx or for plotting.
x the vector of n points at which the density is estimated.
y the density estimate at each x point.
bw the smoothing bandwidth used in density estimation.
n the number of non-NA observations used to calculate the estimate.
call the function call.
data.name the deparsed name of x.
has.na a logical flag indicating whether the observations contain NAs.
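
A short sketch of working with the returned object follows. It assumes the components behave as described above; the names d and y_at_zero are illustrative.

```r
x <- (cos(1:300) + 0.09)^3
d <- density(x, bw = 0.25)

# Inspect the documented components.
print(class(d))
print(d$bw)            # bandwidth used
print(d$n)             # number of non-NA observations
print(length(d$x))     # number of grid points (n = 512 by default)

# x and y can be passed to approx to interpolate the estimate
# at a point that is not on the grid.
y_at_zero <- approx(d$x, d$y, xout = 0)$y
print(y_at_zero)
```

The same x and y components can be passed to plotting functions to draw the estimated curve.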

Background

Density estimation is essentially a smoothing operation. Inevitably, there is a trade-off between bias in the estimate and the estimate's variability: wide windows produce smooth estimates that may hide local features of the density, while narrow windows follow the data closely but produce noisier estimates.
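
The trade-off can be seen by estimating the same data with two bandwidths. This is only an illustrative sketch: the bandwidth values 0.05 and 0.5 are arbitrary choices, and the variable names are hypothetical.

```r
x <- (cos(1:300) + 0.09)^3

d_narrow <- density(x, bw = 0.05)   # narrow window: follows local features
d_wide   <- density(x, bw = 0.5)    # wide window: smoother, flatter estimate

# The narrow-window estimate keeps a sharp peak that the wide window
# smooths away, so its maximum height is larger.
peak_narrow <- max(d_narrow$y)
peak_wide   <- max(d_wide$y)
print(c(narrow = peak_narrow, wide = peak_wide))
```

Plotting both estimates on the same axes shows the same effect visually.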

References

Becker, R. A., Chambers, J. M., and Wilks, A. R. 1988. The New S Language: A Programming Environment for Data Analysis and Graphics. Pacific Grove, CA: Wadsworth & Brooks/Cole Advanced Books and Software.
Scott, D. W. 1992. Multivariate Density Estimation: Theory, Practice, and Visualization. New York, NY: John Wiley & Sons.
Sheather, S. J. and Jones, M. C. 1991. A reliable data-based bandwidth selection method for kernel density estimation. J. Roy. Statist. Soc. Series B. Volume 53. 683-690.
Silverman, B. W. 1986. Density Estimation for Statistics and Data Analysis. London, UK: Chapman and Hall.
Venables, W. N. and Ripley, B. D. 2002. Modern Applied Statistics with S. Fourth Edition. New York, NY: Springer.
Wegman, E. J. 1972. Nonparametric probability density estimation. Technometrics. Volume 14. 533-546.

See Also

bw.nrd, bw.nrd0, hist, approx.
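
Examples

# Estimate the density using the default bandwidth selector ("nrd0").
density((cos(1:300)+0.09)^3)
# Estimate the density of the same data with an explicit bandwidth of 0.25.
density((cos(1:300)+0.09)^3, bw=0.25)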
Package stats version 6.0.0-69