model.matrix
Matrix of Predictors
Description
Creates a matrix of predictors from a terms object. Used primarily
as an internal call in other model-fitting functions.
Usage
model.matrix(object, ...)
model.matrix.default(object, data = environment(object),
contrasts.arg = NULL, xlev = NULL, ...)
Arguments
object |
an object from which the function can infer a model matrix. For the
default method, object is usually a formula or a terms object
constructed by a model-fitting function based on the model formula.
You can also specify a fitted model for any class of model that
inherits from class lm.
|
data |
any data frame or other source for the data; it can also be missing.
It is commonly the model frame constructed by model.frame. If it is
not a model frame, the function coerces it into one by a call to
model.frame. The data from which the function computes the columns of
the matrix are extracted from the model frame. In standard use of
model.matrix, the variables are numeric vectors, factors, ordered
factors, or numeric matrices. You can also supply character vectors,
logical vectors (which are coerced to factors), or subsidiary data
frames with numeric columns.
|
contrasts.arg |
a list (optional) that specifies contrasts for some or all of the
factors that appear in the terms object. The elements of the list
should have the same name as the variable and should be either a
contrast matrix (specifically, any full-rank matrix with as many rows
as there are levels in the factor), or else a function to compute such
a matrix given the number of levels.
The complete contrast list (any element specified as an argument plus
any additional contrast matrices that are computed) is returned as the
"contrasts" attribute of the model matrix, and hence as the
"contrasts" component of fitted models returned by lm()
and its descendants.
|
xlev |
a named list (optional). Each name identifies a factor contained in
"data", and the corresponding value gives the levels to retain in the
resulting model frame. Note that the function may not drop levels that
occur in "data". However, if some levels are not contained in "data",
you can use "xlev" to retain those levels in the resulting model frame.
|
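As a rough illustration of the contrasts.arg and xlev arguments, the sketch below builds three model matrices from a small made-up data frame. The data frame d and its columns f and x are assumptions introduced only for this example; contr.sum is the standard sum-to-zero contrast function.

```r
# Hypothetical data: a two-level factor and a numeric variable.
d <- data.frame(f = factor(c("a", "b", "a", "b")),
                x = c(1.5, 2.0, 3.5, 4.0))

# Default contrasts, taken from options("contrasts").
m0 <- model.matrix(~ f + x, data = d)

# contrasts.arg: request sum-to-zero contrasts for f only.
m1 <- model.matrix(~ f + x, data = d,
                   contrasts.arg = list(f = contr.sum))

# xlev: retain a level "c" that never occurs in d, so that f is
# treated as a three-level factor.
m2 <- model.matrix(~ f + x, data = d,
                   xlev = list(f = c("a", "b", "c")))

dim(m0)
dim(m2)                  # one more column than m0, for the extra level
attr(m1, "contrasts")    # the contrasts recorded for f
```

Comparing m0 with m2 shows the effect of xlev: the extra level contributes a column even though no row of d uses it.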
Value
an object of class "model.matrix" that inherits from
"matrix". This object is a matrix of predictor variables that
includes contrasts for all factors and ordered factors in the terms
object. If the model includes an intercept, the first column is the
vector of 1s. The matrix has several special attributes:
assign |
an integer vector, of length equal to the number of terms in the
model. The elements of the vector identify which columns of the model
matrix encode the corresponding term.
Note that the assign attribute in R does not operate in the same way
as this assign attribute.
|
dimnames |
the row labels constructed from the model frame and the column labels
constructed from the variable names. (For more information about the
variable names see below.) The column labels define the names for the
coefficients and effects of the fitted model. (In the case of
multivariate response models, read "row labels" for "names".) The row
labels are the same as the names or row labels for fitted values and
residuals, but these usually come directly from the model frame
through model.extract.
|
contrasts |
a named list that contains contrast matrices or character vectors.
Any contrast matrices used are returned in an element of the list
with the same name as the corresponding variable. For more
information, see lm.object.
|
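The attributes described above can be inspected directly. The following is a minimal sketch on made-up data (the data frame d and its columns g and w are assumptions for illustration). It prints the attributes rather than asserting their layout, because the form of the assign attribute can differ between implementations.

```r
# Hypothetical data with one factor and one numeric variable.
d <- data.frame(g = factor(c("lo", "hi", "lo", "hi", "mid")),
                w = c(2.1, 3.4, 1.9, 4.0, 2.8))
mm <- model.matrix(~ g + w, data = d)

class(mm)                # class of the returned object
attr(mm, "assign")       # mapping between terms and columns
dimnames(mm)             # row labels and column (coefficient) labels
attr(mm, "contrasts")    # contrasts used for each factor
```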
Contrasts
Factors, including ordered factors, are turned into columns of numeric
variables using contrasts or dummy variables according to the
instructions coded in the terms object's "factors" attribute.
Particular contrasts are chosen from the contrasts.arg argument, if
supplied (typically passed down from lm() and similar functions), from
the "contrasts" attribute of the factor, if any, or from the
default choice of contrast functions. In the absence of this
attribute, the two character strings in options("contrasts")
define the choice of contrast function for factors and ordered
factors. Note that the same variable may be used both with and without
contrasts. Interaction terms are formed by computing the various main
effects and then taking all products of the corresponding columns.
However, in practice the computations do look back at previously
computed terms in an attempt to avoid re-computation. For details
concerning specifying contrast functions as arguments, see
contrasts and C.
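The description of interaction terms as products of main-effect columns can be checked with a short sketch. The data frame d and its columns f and x are made up for illustration, and the column positions in the comments assume the usual ordering (intercept, main effects, then interactions).

```r
# Hypothetical data: a three-level factor and a numeric variable.
d <- data.frame(f = factor(c("a", "b", "c", "a", "b", "c")),
                x = c(1, 2, 3, 4, 5, 6))

# The default contrast functions for factors and ordered factors.
getOption("contrasts")

mm <- model.matrix(~ f * x, data = d)

# Assumed column positions: 1 intercept, 2-3 contrasts for f,
# 4 the numeric x, 5-6 the f:x interaction.
prod_cols <- mm[, 2:3] * mm[, 4]
interaction_matches <- all(unname(prod_cols) == unname(mm[, 5:6]))
interaction_matches
```

If the interaction columns are the products of the factor contrast columns and x, interaction_matches is TRUE.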
Labels
The column labels are constructed by the following definition. Numeric
variables inherit the corresponding term label. Numeric matrices
produce column labels that concatenate the term label with the column
labels of the matrix, if any, or with "1", "2", and so
on.
Main effects for factors or ordered factors use the term label
concatenated with the column labels of the contrast matrix, again
using "1", "2", and so on as the default. For both
cases, the term label is used alone if there is only one column or one
contrast.
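To see how these labels come out in practice, the sketch below prints the column labels for a numeric variable, a numeric matrix with named columns, and a factor. The data frame d, the matrix M, and the column names p and q are assumptions introduced for illustration; the exact labels depend on the implementation.

```r
# Hypothetical data: a numeric variable and a three-level factor.
d <- data.frame(x = c(1, 2, 3, 4),
                f = factor(c("a", "b", "c", "a")))

# M is a numeric matrix stored as a single variable, with columns p and q.
d$M <- matrix(c(5, 6, 7, 8, 1, 0, 1, 0), nrow = 4,
              dimnames = list(NULL, c("p", "q")))

labs <- colnames(model.matrix(~ x + M + f, data = d))
labs
```

The numeric variable x should appear under its own term label, and the matrix columns under the term label M joined with p and q.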
This function is primarily a support routine, called by lm and
by other model-fitting functions that call or derive from lm,
such as aov, glm, and gam. Note that the model-fitting functions
loess and tree do not use model.matrix, primarily because they do
not use contrasts to handle factors.
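The relationship between lm and model.matrix can be sketched as follows. The data frame d is made up for illustration. The example compares the column labels of the model matrix with the coefficient names, and then refits by least squares directly from the matrix using lm.fit.

```r
# Hypothetical data: a response, a numeric predictor, and a factor.
d <- data.frame(y = c(3.1, 4.9, 7.2, 8.8, 11.1, 13.0),
                x = c(1, 2, 3, 4, 5, 6),
                f = factor(c("a", "b", "a", "b", "a", "b")))

fit <- lm(y ~ x + f, data = d)
X <- model.matrix(fit)

# The column labels of the model matrix name the coefficients.
same_names <- identical(colnames(X), names(coef(fit)))

# A least-squares fit computed directly from the matrix gives the
# same coefficients as the lm fit.
refit <- lm.fit(X, d$y)
same_coef <- isTRUE(all.equal(unname(coef(fit)),
                              unname(refit$coefficients)))

c(same_names, same_coef)
```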
Note
When model.matrix processes a column of character data, it
converts the column into a factor and warns you that it is doing so.
This warning mimics its behavior in R. You can suppress the warning
by setting options(char.to.factor.warn=FALSE).
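A minimal sketch of the situation described in this note, using a made-up data frame d with a character column g. The option name char.to.factor.warn is taken from the note above; whether a warning appears at all depends on the implementation and version in use.

```r
# Hypothetical data with a character (not factor) column.
d <- data.frame(y = c(1, 2, 3, 4),
                g = c("a", "b", "a", "b"),
                stringsAsFactors = FALSE)

# Suppress the conversion warning, as described in the note, and
# remember the previous setting so it can be restored.
old <- options(char.to.factor.warn = FALSE)
mm <- model.matrix(y ~ g, data = d)
options(old)

# Alternative: convert the column explicitly before building the matrix.
d$g <- factor(d$g)
mm2 <- model.matrix(y ~ g, data = d)
```

Converting the column explicitly makes the factor levels visible in the data and does not rely on the option.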
See Also
Examples
# Fit a linear model, then extract the matrix of predictors it used.
fl <- lm(Fuel ~ Weight + Disp., data = Sdatasets::fuel.frame)
model.matrix(fl)