lm(formula, data = environment(formula), subset, weights, na.action = getOption("na.action"), method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL, offset, ...)
formula | a formula object. The response variable, specified as either a single numeric variable or a matrix, must be on the left of the tilde (~) operator, and the explanatory variables must be on the right. Additive explanatory variables are combined with the plus (+) operator, and interactions are encoded by placing an asterisk (*) between the interacting terms. See the help file for formula.object for details on formula syntax. |
data |
If supplied, this is usually a data frame or named list containing the
variables named in the formula, subset, weights, and offset arguments.
Variables or functions not found in data are then expected to be found
in environment(formula), which is usually the environment
in which the formula was created.
data may also be an environment from which the variables and functions in the formula may be extracted, or a positive integer that is passed to parent.frame() to reference an environment in the call stack. If data is an environment (or a positive integer signifying an environment), then environment(formula) is not used as a backup source of variables or functions. If you do not supply a value for data, or you supply data=NULL, then all variables and functions in the formula must be accessible from environment(formula). |
subset |
a vector that specifies which subset of observations to use in
the fit. This can be a logical vector that is replicated so that its
length is equal to the number of observations, a numeric vector
indicating the observation numbers to include, or a character
vector of the observation names that should be included. By default,
all observations are included.
The variables and functions used in the expression given to subset are searched for in the same manner as those in formula. |
weights |
a numeric vector that contains the observation weights. If supplied,
the fitting algorithm minimizes the sum of the weights multiplied by
the squared residuals. For additional technical details, see the
details section.
The number of observations must match length(weights).
The value for each weight must not be negative; however, because zero
weights are ambiguous, we recommend that the value for each weight be
strictly positive. To exclude particular observations from the model,
use the subset argument instead of assigning zero weights.
The variables and functions used in the expression given to weights are searched for in the same manner as those in formula. |
na.action | a function to filter missing data that is applied to model.frame after the application of any subset argument. The default na.fail returns an error if any missing values are found. An alternative is na.exclude, which excludes observations that contain one or more missing values. |
method | a character string that specifies the least squares fitting method to use in the function. The only available method is "qr". Other values are accepted for historical reasons, but the qr method is always used. The pseudo-method "model.frame" causes lm to return only the model.frame containing the variables that would be used in the model fitting process. |
model | a logical value. If TRUE, then the model.frame is returned as the model component of the fitted object. |
x | a logical value. If TRUE, then the model.matrix is returned as the x component of the fitted object. The default is FALSE. |
y | a logical value. If TRUE, then the response is returned as the y component of the fitted object. The default is FALSE. |
qr | a logical value. If TRUE (the default), then the QR decomposition of the model matrix is returned as the qr component of the fitted object. |
singular.ok | a logical value telling what to do if the explanatory variables are not all linearly independent (to a small numerical tolerance). If FALSE, then give an error. If TRUE (the default), then set the coefficients of the redundant variables to NA. |
contrasts | a list that gives contrasts for some or all of the factors that appear in the model formula. An element in the list should have the same name as the factor variable it encodes, and it should be either a contrast matrix (any full-rank matrix with as many rows as there are levels in the factor) or a function that computes such a matrix given the number of levels. If omitted or NULL, the contrasts listed in getOption("contrasts") are used. |
offset |
A numeric vector that will be subtracted from the response before fitting the model.
The expression given as the offset argument can instead be included
on the right side of the formula as the term +offset(offsetExpression).
The variables and functions used in the expression given to offset are searched for in the same manner as those in formula. |
... | additional arguments are passed to the fitting routine, lm.fit. |
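The offset argument described above can be illustrated with a minimal sketch. It assumes simulated data (the data frame d and its columns x, z, and y are hypothetical, not from this help file) and assumes lm handles offsets here as it does in open-source R. It compares the offset argument, the offset() formula term, and subtracting the offset from the response by hand.

```r
# Simulated data; all names are illustrative assumptions.
set.seed(1)
d <- data.frame(x = rnorm(50), z = runif(50))
d$y <- 2 + 3 * d$x + d$z + rnorm(50, sd = 0.1)

# Offset supplied through the offset argument.
fit_arg <- lm(y ~ x, data = d, offset = z)

# Offset supplied as a term on the right side of the formula.
fit_term <- lm(y ~ x + offset(z), data = d)

# Offset subtracted from the response by hand before fitting.
fit_manual <- lm(I(y - z) ~ x, data = d)

# All three fits should give the same coefficients.
all.equal(coef(fit_arg), coef(fit_term))
all.equal(coef(fit_arg), coef(fit_manual))
```

In all three fits the coefficient of z is fixed at 1 rather than estimated, which is what distinguishes an offset from an ordinary explanatory variable.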
lm(y ~ ., data = Sdatasets::freeny)
summary(lm(Fuel ~ Weight + Disp., data = Sdatasets::fuel.frame))
# Formulas have intercepts by default, so include
# a -1 for regression without an intercept.
lm(Fuel ~ Weight - 1, data = Sdatasets::fuel.frame)
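The x, y, and method arguments control what lm returns. The following sketch uses simulated data (the data frame d is a hypothetical stand-in) and assumes the components are stored as in open-source R. Components are extracted with [[ ]] rather than $, because $ performs partial name matching on lists and could pick up a different component.

```r
# Simulated data; names are illustrative assumptions.
set.seed(4)
d <- data.frame(x = rnorm(20))
d$y <- 1 + d$x + rnorm(20)

# Ask for the model matrix and the response to be stored in the fit.
fit <- lm(y ~ x, data = d, x = TRUE, y = TRUE)
dim(fit[["x"]])      # model matrix: one column for the intercept, one for x
length(fit[["y"]])   # response vector

# By default neither component is stored.
fit_default <- lm(y ~ x, data = d)
c("x", "y") %in% names(fit_default)

# The pseudo-method "model.frame" returns the model frame without fitting.
mf <- lm(y ~ x, data = d, method = "model.frame")
is.data.frame(mf)
```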
# Example of weighted regression
lm(cost ~ age + type + car.age, data = Sdatasets::claims,
   weights = number, na.action = na.exclude)
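As a rough check on what the weights argument does, the sketch below uses simulated data (the data frame d and the weight column w are hypothetical). It relies on the standard identity that minimizing the sum of the weights multiplied by the squared residuals is the same as ordinary least squares after scaling the response and every column of the model matrix, including the intercept, by sqrt(w).

```r
# Simulated data with strictly positive weights; names are assumptions.
set.seed(2)
d <- data.frame(x = rnorm(40))
d$y <- 1 + 2 * d$x + rnorm(40)
d$w <- runif(40, min = 0.5, max = 2)

# Weighted fit using the weights argument.
fit_w <- lm(y ~ x, data = d, weights = w)

# Unweighted fit on data scaled by sqrt(w). The intercept column is
# scaled too, so it enters as the explicit term sqrt(w) and the
# implicit intercept is removed with 0 +.
fit_scaled <- lm(I(sqrt(w) * y) ~ 0 + sqrt(w) + I(sqrt(w) * x), data = d)

# The coefficient names differ, so compare the values only.
all.equal(unname(coef(fit_w)), unname(coef(fit_scaled)))
```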
# Are there significant interactions between driver age
# and car type when modelling number of claims?
anova(lm(number ~ age + type, data = Sdatasets::claims),
      lm(number ~ age * type, data = Sdatasets::claims))
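The effect of singular.ok can be seen with a deliberately redundant explanatory variable. This is a sketch on simulated data (d, x1, and x2 are hypothetical names) and assumes the behavior described in the singular.ok entry: an NA coefficient when TRUE, an error when FALSE.

```r
# Simulated data in which x2 is an exact multiple of x1.
set.seed(3)
d <- data.frame(x1 = rnorm(30))
d$x2 <- 2 * d$x1
d$y <- 1 + d$x1 + rnorm(30)

# Default singular.ok = TRUE: the redundant coefficient is set to NA.
fit <- lm(y ~ x1 + x2, data = d)
coef(fit)

# singular.ok = FALSE: the same call stops with an error, caught here.
res <- tryCatch(
  lm(y ~ x1 + x2, data = d, singular.ok = FALSE),
  error = function(e) "error"
)
res
```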