glm
Fit a Generalized Linear Model

Description

Produces an object of class "glm" that is a generalized linear fit of the data.

Usage

glm(formula, family = gaussian, data, weights, subset,
na.action, start = NULL, etastart, mustart, offset,
control = list(...), model = TRUE, method = "glm.fit",
x = FALSE, y = TRUE, contrasts = NULL, ...)

Arguments

formula
    a formula expression as for other regression models, of the form response ~ predictors. For details, see the documentation for lm and formula. See the Details section for the special forms the response variable can take in logistic regression.

family
    a family object. This is a list of expressions defining the link, variance function, initialization values, and iterative weights for the generalized linear model. Supported families are:
    • gaussian
    • binomial
    • poisson
    • Gamma
    • inverse.gaussian
    • quasi
    • quasibinomial
    • quasipoisson
    Functions like binomial produce a family object and can be given without the parentheses. Family functions can take arguments, as in binomial(link=probit). For more details, see the help files for family and family.object.

data
    an environment, data.frame, or list in which to look up the names occurring in the formula. The default value is the environment in which the formula was constructed, environment(formula). If names of variables or functions in the formula cannot be found in data, they are looked up in the environment from which glm was called.

weights
    the weights for the fitting criterion. By default, all observations are weighted equally.

subset
    an expression defining which subset of the rows in the data to use in the fit. This can be a logical vector, which is replicated to have a length equal to the number of observations, a numeric vector indicating which observation numbers to include, or a character vector of the row names to include. By default, all observations are included.

na.action
    a function or the name of a function to handle missing values in the data. This is applied to the model.frame of variables used in the model after any subset argument is used. The default value is taken from the global options vector, getOption("na.action"). na.action="na.exclude" deletes observations that contain one or more missing values. (Note that it attaches information about where the missing values were, so predict and residuals can return results that line up with the original data.) A possible alternative is "na.fail", which signals an error if any missing values are found.

start
    a vector of starting values for the parameters in the linear predictor. This vector is passed to glm.fit as an argument. It is useful in the rare cases where the default starting values cause convergence problems in the underlying algorithm. For more information, see Chambers and Hastie (1993).

etastart
    an optional vector passed to glm.fit. It is used as the starting values for the linear predictor.

mustart
    an optional vector passed to glm.fit. It is used as the starting values for the vector of means.

offset
    an optional offset passed to glm.fit. It is added to the linear predictor.

control
    a list of iteration and algorithmic constants. See glm.control for their names and default values. These can also be given directly as arguments to glm itself, instead of through control.

model
    a logical flag. If TRUE (the default), the model.frame is returned as the component named model of the glm object.

method
    the method to use in fitting the model. By default, the function glm.fit is used and the model is fit via iteratively reweighted least squares. An alternative value is "model.frame", which returns the model frame instead of fitting the model; other fitting methods can be defined by the user. See Chambers and Hastie (1993), pages 245 to 246, for more information.

x
    a logical flag. If TRUE, the model.matrix is returned as the component named x of the glm object. By default, it is FALSE.

y
    a logical flag. If TRUE (the default), the response variable is returned as the component named y of the glm object.

contrasts
    a list of contrasts to use for some or all of the factors appearing as variables in the model formula. The names of the list should be the names of the corresponding variables. The elements of the list should be either contrast-type matrices (matrices with as many rows as levels of the factor, and with columns linearly independent of each other and of a column of ones), or functions that compute such contrast matrices. See the help file for contr.helmert for examples.

...
    additional arguments. These are used to build the control list when control is left at its default value of list(...).
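
As an illustration of how several of these arguments combine, the following is a minimal sketch on simulated data. The data frame d and its columns x, g, and y are hypothetical names invented for this example, and the comments describe the behavior expected of na.exclude as documented above rather than a guaranteed result.

```r
# Simulated data; every variable name here is illustrative only.
set.seed(1)
d <- data.frame(x = rnorm(200), g = rep(c("a", "b"), each = 100))
d$y <- rbinom(200, size = 1, prob = plogis(0.5 + d$x))
d$x[c(3, 17)] <- NA   # introduce two missing predictor values in group "a"

# Probit link, fit to group "a" only; na.exclude drops the incomplete rows
# but remembers where they were.
fit <- glm(y ~ x, family = binomial(link = "probit"), data = d,
           subset = g == "a", na.action = na.exclude)

coef(fit)                    # intercept and slope
length(residuals(fit))       # padded to line up with the 100 subset rows
sum(is.na(residuals(fit)))   # positions of the rows that were dropped
```

The subset is applied before the missing-value handling, which is why the padded residuals line up with the rows of group "a" rather than with the full data frame.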

Details

Other generic functions that have methods for glm objects are drop1 and add1.
The required formula argument to glm is in the same format as most other formulas in TIBCO Enterprise Runtime for R, with the response on the left side of a tilde (~) and the predictor variables on the right. However, in logistic regression the response can assume a few different forms:
• If the response is a logical vector or a two-level factor, it is treated as a 0/1 binary vector. The zero values correspond to failures and the ones correspond to successes.

• If the response is a multilevel factor, TIBCO Enterprise Runtime for R assumes the first level codes failures (0) and all of the remaining levels code successes (1).

• If the response is a two-column matrix, TIBCO Enterprise Runtime for R assumes the first column holds the number of successes for each trial and the second column holds the number of failures.

• If the response is a general numeric vector, TIBCO Enterprise Runtime for R assumes that it holds the proportion of successes. That is, the ith value in the response vector is s[i]/n[i], where s[i] denotes the number of successes out of n[i] total trials. Supply the n[i] through the weights argument to indicate the relative importance of the different cases.

Note: The weights are not interpreted as counts. This does not affect predictions or coefficients estimated by the model, but degrees of freedom and standard errors are calculated as if the number of observations is length(weights) rather than sum(weights).

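
The two-column-matrix form and the proportion-plus-weights form describe the same binomial data, so they would be expected to produce the same coefficient estimates. The sketch below checks this on simulated data; the names n, x, and s are hypothetical and chosen only for this example.

```r
# Simulated grouped binomial data; names are illustrative only.
set.seed(2)
n <- c(10, 20, 15, 30, 25)                 # trials per group
x <- c(-1, -0.5, 0, 0.5, 1)                # one predictor value per group
s <- rbinom(5, size = n, prob = plogis(0.3 + 0.8 * x))  # successes

# Form 1: two-column matrix of (successes, failures).
fit_counts <- glm(cbind(s, n - s) ~ x, family = binomial)

# Form 2: proportion of successes, with the number of trials as weights.
fit_props <- glm(s / n ~ x, family = binomial, weights = n)

# Compare the coefficient estimates of the two fits.
all.equal(coef(fit_counts), coef(fit_props))
```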
The model is fit using iteratively reweighted least squares (IRLS). At each iteration, the working response and the iterative weights are computed using the functions contained in the family object, and a weighted least-squares fit of the working response on the predictors updates the coefficients. The workhorse of glm is the function glm.fit, which expects x and y arguments rather than a formula.
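
To make the IRLS description concrete, here is a simplified sketch of the iteration for a Poisson model with a log link, compared against glm.fit called directly with a design matrix and response. It is not the internal implementation of glm.fit: the fixed iteration count, the starting means, and the variable names (zwork, gprime, and so on) are assumptions made for this illustration, and there is no convergence test or step-halving.

```r
# Simulated Poisson data; names are illustrative only.
set.seed(3)
x <- cbind(Intercept = 1, z = rnorm(100))          # design matrix
y <- rpois(100, lambda = exp(0.5 + 0.3 * x[, "z"]))

fam <- poisson()           # family object supplying link and variance
mu <- y + 0.1              # crude starting means (an assumption)
eta <- fam$linkfun(mu)     # corresponding linear predictor

for (iter in 1:25) {       # fixed number of iterations for simplicity
  gprime <- 1 / fam$mu.eta(eta)              # d(eta)/d(mu)
  zwork <- eta + (y - mu) * gprime           # working response
  w <- 1 / (gprime^2 * fam$variance(mu))     # iterative weights
  beta <- lm.wfit(x, zwork, w)$coefficients  # weighted least-squares step
  eta <- drop(x %*% beta)
  mu <- fam$linkinv(eta)
}

# Reference fit through the matrix interface.
ref <- glm.fit(x, y, family = poisson())
max(abs(beta - ref$coefficients))   # discrepancy between the two fits
```

The family object is the only model-specific ingredient in the loop, so the same sketch applies to other families by changing the call that creates fam and choosing suitable starting means.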

Value

Returns an object of class "glm". This object inherits from lm. See glm.object for details.
The output object from glm has all of the components of an lm object, with a few more components. You can examine it using one of the following:
• print
• summary
• plot
• anova
You can extract the components using one of the following:
• predict
• fitted
• residuals
• coefficients
• deviance
• effects
• formula
• family
You can modify a glm object using update.
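
The following sketch shows a typical sequence of these calls on a fitted object. The data are simulated, and the names d, x1, x2, fit, and fit2 are hypothetical.

```r
# Simulated count data; names are illustrative only.
set.seed(4)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d$y <- rpois(50, lambda = exp(0.2 + 0.4 * d$x1))

fit <- glm(y ~ x1, family = poisson, data = d)

print(summary(fit))     # coefficient table and deviances
coef(fit)               # estimated coefficients
deviance(fit)           # residual deviance
head(fitted(fit))       # fitted means on the response scale

# Add a second predictor with update, then compare the nested models.
fit2 <- update(fit, . ~ . + x2)
anova(fit, fit2, test = "Chisq")
```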

References

Chambers, J. M. and Hastie, T. J. (1993). Statistical Models in S. London: Chapman and Hall.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models, 2nd ed. London: Chapman and Hall.