chisq.gof
Chi-Square Goodness-of-Fit Test
Description
Performs a chi-square goodness-of-fit test.
Usage
chisq.gof(x, n.classes = ceiling(2 * (length(x)^(2 / 5))),
cut.points = NULL, distribution = "normal", n.param.est = 0, ...)
Arguments
x: numeric vector. NAs and Infs are allowed but will be removed.

n.classes: the number of cells into which the observations are to be allocated. If the vector cut.points is supplied, then n.classes is set to length(cut.points) - 1. The default is recommended by Moore (1986).

cut.points: vector of cutpoints that define the cells. x[i] is allocated to cell j if cut.points[j] < x[i] <= cut.points[j+1]. If x[i] is less than or equal to the first cutpoint or greater than the last cutpoint, then x[i] is treated as missing. If the hypothesized distribution is discrete, cut.points must be supplied.

distribution: character string that specifies the hypothesized distribution. distribution can be one of: "normal", "beta", "cauchy", "chisquare", "exponential", "f", "gamma", "lognormal", "logistic", "t", "uniform", "weibull", "binomial", "geometric", "hypergeometric", "negbinomial", "poisson", or "wilcoxon". You need supply only the first characters that uniquely specify the distribution name. For example, "logn" and "logi" uniquely specify the lognormal and logistic distributions.

n.param.est: number of parameters estimated from the data.

...: parameters of the hypothesized distribution (for example, rate = 1.0), passed to the function that computes probabilities for that distribution.
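The half-open-interval allocation rule for cut.points can be sketched as follows. This is an illustrative Python sketch only; allocate_cells is a hypothetical helper and is not part of chisq.gof, which is an S-PLUS/R function:

```python
def allocate_cells(x, cut_points):
    """Return the cell index (1-based) for each value, or None when the
    value is at or below the first cutpoint or above the last one (such
    values are treated as missing)."""
    cells = []
    for xi in x:
        cell = None
        for j in range(len(cut_points) - 1):
            # x[i] goes to cell j if cut.points[j] < x[i] <= cut.points[j+1]
            if cut_points[j] < xi <= cut_points[j + 1]:
                cell = j + 1
                break
        cells.append(cell)
    return cells

cuts = [0.0, 1.0, 2.0, 3.0]   # three cells: (0,1], (1,2], (2,3]
print(allocate_cells([0.5, 1.0, 2.5, 0.0, 3.5], cuts))
# -> [1, 1, 3, None, None]: 0.0 is <= the first cutpoint and 3.5 is > the
#    last cutpoint, so both are treated as missing
```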
Details
The chi-square test, introduced by Pearson in 1900, is the oldest and best-known goodness-of-fit test. The idea is to reduce the goodness-of-fit problem to a multinomial setting by comparing the observed cell counts with their expected values under the null hypothesis. Grouping the data sacrifices information, especially if the underlying variable is continuous. On the other hand, chi-square tests can be applied to any type of variable: continuous, discrete, or a combination of the two.
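The comparison of observed and expected cell counts can be sketched as follows (Python used purely for illustration; the counts below are invented, and pearson_chisq is a hypothetical helper, not part of chisq.gof):

```python
def pearson_chisq(observed, expected):
    """Pearson's chi-square statistic: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 22, 30, 30]   # counts that fell into each of 4 cells
expected = [25, 25, 25, 25]   # equal cell probabilities under the null
print(pearson_chisq(observed, expected))   # -> 4.32
```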
Value
list of class "htest", containing the following components:

statistic: chi-square statistic, with names attribute "chisq".

parameters: degrees of freedom of the chi-square distribution associated with the statistic. Component parameters has names attribute "df".

p.value: p-value for the test.

data.name: character string (vector of length 1) containing the actual name of the input vector x.

counts: vector of the number of data points that fall into each cell.

expected: vector of counts expected under the null hypothesis.
Null hypothesis:
Let G(x) denote a distribution function.
The null hypothesis is that G(x) is the true
distribution function of x. The alternative hypothesis
is that the true distribution function of x is not G(x).
Test statistic:
Pearson's chi-square statistic, the same as that used in the function
chisq.test.
Asymptotically, this statistic has a chi-square distribution.
If the hypothesized distribution function is completely specified,
the degrees of freedom are m - 1, where m is the number of cells.
If any parameters are estimated, the degrees of freedom depend on
the method of estimation. The usual procedure is to estimate the
parameters from the original (i.e., not grouped) data, and then to
subtract one degree of freedom for each parameter estimated.
In fact, if the parameters are estimated by maximum likelihood, the
degrees of freedom are bounded between (m-1) and (m-1-k), where k is the
number of parameters estimated. Therefore,
especially when the sample size is small,
it is important to compare the test statistic to the
chi-square distribution with both (m-1) and (m-1-k) degrees of freedom.
See Kendall and Stuart (1979) for a more complete discussion.
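The recommendation above can be sketched numerically. In this hypothetical Python illustration the chi-square upper-tail probability is computed with the Wilson-Hilferty cube-root normal approximation (an assumption of the sketch; the real test uses the exact chi-square distribution):

```python
import math

def chisq_upper_tail(stat, df):
    """Approximate P(X > stat) for X ~ chi-square(df), using the
    Wilson-Hilferty cube-root normal approximation."""
    mean = 1 - 2 / (9 * df)
    sd = math.sqrt(2 / (9 * df))
    z = ((stat / df) ** (1 / 3) - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))   # standard normal upper tail

# Hypothetical situation: m = 10 cells, k = 2 parameters estimated by
# maximum likelihood, observed chi-square statistic 15.0.
m, k, stat = 10, 2, 15.0
p_hi = chisq_upper_tail(stat, m - 1)       # df = m - 1     (roughly 0.09)
p_lo = chisq_upper_tail(stat, m - 1 - k)   # df = m - 1 - k (roughly 0.04)
# The true p-value lies between p_lo and p_hi; when both sit on the same
# side of the chosen significance level, the df ambiguity is harmless.
print(p_lo, p_hi)
```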
References
Conover, W. J. (1980).
Practical Nonparametric Statistics.
New York: John Wiley and Sons. pp. 189-199.
Kendall, M. G., and Stuart, A. (1979).
The Advanced Theory of Statistics, Volume 2: Inference and Relationship
(4th edition).
New York: Oxford University Press. Chapter 30.
Moore, D. S. (1986). Tests of chi-squared type. In
Goodness-of-Fit Techniques
(R. B. D'Agostino and M. A. Stephens, eds.).
New York: Marcel Dekker.
Note
The distribution theory of chi-square statistics is a large-sample
theory. The expected cell counts are assumed to be at least moderately
large; as a rule of thumb, each should be at least 5. Although some
authors have found this rule to be conservative (especially when the
class probabilities are not too unequal), the user should regard
p-values with caution when expected cell counts are small.
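The rule of thumb can be checked mechanically before trusting the p-value. A minimal sketch in Python (small_cells is a hypothetical helper, not part of chisq.gof):

```python
def small_cells(expected, threshold=5):
    """Indices (0-based) of cells whose expected count falls below the
    rule-of-thumb threshold of 5."""
    return [j for j, e in enumerate(expected) if e < threshold]

expected = [12.5, 8.0, 4.2, 0.9]
print(small_cells(expected))   # -> [2, 3]: those two cells are suspect
```

Cells flagged this way are often merged with a neighbor before the test is rerun.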
See Also
chisq.test.
Examples
# generate an exponential sample
x <- rexp(50, rate = 1.0)
chisq.gof(x) # hypothesize a normal distribution
chisq.gof(x, dist = "exponential", rate = 1.0) # hypothesize an exponential distn.
x <- rpois(50, lambda = 3)
breaks <- quantile(x)
breaks[1] <- breaks[1] - 1 # want to include the minimum value
z <- chisq.gof(x, cut.points = breaks, dist = "poisson", lambda = 3)
z$counts
z$expected