chisq.test(x, y = NULL, correct = TRUE, p = rep(1/length(x), length(x)), rescale.p = FALSE, simulate.p.value = FALSE, B = 2000)
x |
a factor, or a two-dimensional contingency table in matrix or data frame form
(a data frame is coerced to a matrix with as.matrix).
If x is a contingency table, it must have at least two rows and two columns. All elements must be non-negative, and neither NAs nor Infs are allowed. The elements of the contingency table should be whole numbers, because the test is based on counts; however, because all computations are carried out to double precision accuracy, where possible, the storage mode of x is coerced to double. If x is a factor, certain restrictions are imposed. See argument y for details. |
y |
a factor object.
If x or y is not a factor object (and x is not a contingency table), it is implicitly coerced to one. In this case, pairs (x[i], y[i]) containing NAs are removed, but pairs containing Infs are not. This coercion is intended for data of mode numeric whose elements are typically small integers. |
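The coercion and NA handling described for y can be checked with a short sketch. The numeric codes below are hypothetical (they are not part of the original example), and the sketch assumes base R's chisq.test, which drops incomplete pairs before tabulating.

```r
# Hypothetical numeric codes: 1/2 for group (x) and 1/2 for outcome (y).
x_num <- rep(c(1, 2), times = c(20, 20))
y_num <- c(rep(c(1, 2), times = c(12, 8)), rep(c(1, 2), times = c(7, 13)))

# Append two incomplete pairs; each has an NA in one position.
x_na <- c(x_num, NA, 1)
y_na <- c(y_num, 2, NA)

# Both calls coerce the numeric vectors to factors internally.
res_clean <- chisq.test(x_num, y_num)
res_na <- chisq.test(x_na, y_na)

# The incomplete pairs are dropped, so both tables hold the same 40 observations.
res_na$observed
sum(res_na$observed)
```

If the pairs containing NAs are removed as described, the two results have the same statistic and p-value.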
correct | a logical scalar. If TRUE (the default) and simulate.p.value = FALSE, Yates' continuity correction is applied, but only for dichotomous categories (2 by 2 tables). |
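A minimal sketch of the effect of correct on a 2 by 2 table follows. The counts are the ones tabulated in the example at the end of this page; the object name tab is introduced here only for illustration.

```r
# 2 by 2 table of counts (rows: x = A, B; columns: y = No, Yes).
tab <- matrix(c(6, 11, 9, 5), nrow = 2,
              dimnames = list(x = c("A", "B"), y = c("No", "Yes")))

# Default: Yates' continuity correction is applied to a 2 by 2 table.
with_yates <- chisq.test(tab, correct = TRUE)

# Same table without the continuity correction.
without_yates <- chisq.test(tab, correct = FALSE)

with_yates$statistic     # about 1.5534
without_yates$statistic  # about 2.5839
with_yates$method
without_yates$method
```

The corrected statistic is smaller because 0.5 is subtracted from each absolute difference between observed and expected counts before squaring.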
p | a numeric vector, with the same length as x, that contains the probabilities. Elements with a negative value are not allowed. p is used to calculate the return value for expected. |
rescale.p | a logical value. If TRUE and sum(p) > 1, then p is rescaled to sum to 1. Otherwise, the function stops with the error "probabilities must sum to 1". The default is FALSE. |
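The following sketch illustrates how p and rescale.p interact. It assumes base R's behavior when x is a plain vector of counts and y is omitted (a test against given probabilities), which this page does not describe explicitly. The counts and weights are hypothetical.

```r
# Hypothetical category counts (100 observations in total).
counts <- c(25, 30, 45)

# Probabilities that already sum to 1.
gof <- chisq.test(counts, p = c(0.2, 0.3, 0.5))
gof$expected   # sum(counts) * p
gof$statistic

# Unnormalized weights: rescale.p = TRUE divides them by their sum.
gof_rescaled <- chisq.test(counts, p = c(2, 3, 5), rescale.p = TRUE)

# Without rescaling, the same weights trigger an error, captured here as text.
msg <- tryCatch(chisq.test(counts, p = c(2, 3, 5)),
                error = function(e) conditionMessage(e))
msg
```

Passing weights with rescale.p = TRUE is convenient when the hypothesized proportions are given as ratios rather than as probabilities.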
simulate.p.value | a logical value. If TRUE, p-values are computed by Monte Carlo simulation. The default is FALSE. |
B | an integer specifying the number of replicates to use in the Monte Carlo test. |
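A minimal sketch of the Monte Carlo option, assuming base R's chisq.test. The table reuses the counts from the example at the end of this page. The seed and the value of B are arbitrary choices made here so that the run is reproducible.

```r
tab <- matrix(c(6, 11, 9, 5), nrow = 2,
              dimnames = list(x = c("A", "B"), y = c("No", "Yes")))

set.seed(1)  # arbitrary seed, only for reproducibility
sim <- chisq.test(tab, simulate.p.value = TRUE, B = 5000)

# With simulation, no continuity correction is applied, so the statistic
# matches the uncorrected Pearson statistic.
uncorrected <- chisq.test(tab, correct = FALSE)
sim$statistic
uncorrected$statistic

# The p-value is estimated from the B simulated tables.
sim$p.value
```

A larger B reduces the Monte Carlo error of the p-value at the cost of run time.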
statistic | Pearson's X-squared statistic, with the names attribute "X-squared". See the details section for the definition. |
parameter | degrees of freedom of the asymptotic chi-square distribution associated with statistic, with the names attribute "df". Given by the product (R-1)*(C-1), where R is the number of rows and C is the number of columns of the contingency table. |
p.value | asymptotic p-value for the test, or the Monte Carlo p-value when simulate.p.value = TRUE. |
method | a character string listing the name of the method, along with whether Yates' continuity correction was applied. |
data.name | a character string (vector of length 1) containing the name of the input argument x, and of y if both x and y are factor objects. |
observed | the observed counts. The value of x. |
expected | the expected counts under the null hypothesis. |
residuals | the Pearson residuals, whose value is (x - E)/sqrt(E), where E is expected. |
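The returned components can be cross-checked against their definitions. The sketch below assumes base R's chisq.test and reuses the counts from the example that follows. The names tab, E, and pearson are introduced only for illustration.

```r
tab <- matrix(c(6, 11, 9, 5), nrow = 2,
              dimnames = list(x = c("A", "B"), y = c("No", "Yes")))

# correct = FALSE so that the statistic equals the sum of squared residuals.
res <- chisq.test(tab, correct = FALSE)

# Expected counts under independence: (row total * column total) / grand total.
E <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Pearson residuals: (observed - expected) / sqrt(expected).
pearson <- (tab - E) / sqrt(E)

all.equal(as.vector(res$expected), as.vector(E))
all.equal(as.vector(res$residuals), as.vector(pearson))

# Degrees of freedom: (R - 1) * (C - 1).
res$parameter
(nrow(tab) - 1) * (ncol(tab) - 1)

# Without the continuity correction, the statistic is the sum of squared residuals.
sum(res$residuals^2)
res$statistic
```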
x <- factor(c(
  "A","B","A","A","B","B","B","A","B","B","B","B","B","A","B",
  "B","A","B","A","A","A","A","B","A","A","B","A",
  "B","B","A","A"))
y <- factor(c(
  "Yes","No","No","No","No","No","Yes","Yes","Yes","No",
  "No","Yes","No","Yes","No","No","Yes","Yes","Yes","No","Yes",
  "Yes","No","No","No","Yes","No","No","No","Yes","Yes"))

table(x, y)
#    y
# x   No Yes
#   A  6   9
#   B 11   5

chisq.test(x, y)
# Pearson's Chi-squared test with Yates' continuity correction
# data: x and y
# X-squared = 1.5534, df = 1, p-value = 0.2126

chisq.test(table(x, y))
# Pearson's Chi-squared test with Yates' continuity correction
# data: table(x, y)
# X-squared = 1.5534, df = 1, p-value = 0.2126