Matrix of Predictors


Description
Creates a matrix of predictors from terms.object. Primarily used as an internal call in other model functions.


Usage
model.matrix(object, ...)
model.matrix.default(object, data = environment(object),
    contrasts.arg = NULL, xlev = NULL, ...)


Arguments
object an object from which the function can infer a model matrix. For the default method, object is usually a formula or a terms object constructed by a model-fitting function based on the model formula. You can also specify a fitted model for any class of model that inherits from class lm.
data a data frame that is the source for the data, or it can be missing. It is commonly the model frame constructed by model.frame; if it is not a model frame, the function coerces it into one by calling model.frame. The data from which the function computes the columns of the matrix is extracted from the model frame. In standard use of model.matrix, the variables are numeric vectors, factors, ordered factors, or numeric matrices. You can also submit character vectors, logical vectors (which are coerced to factors), or subsidiary data frames with numeric columns.
contrasts.arg a list (optional) that specifies contrasts for some or all of the factors that appear in the terms object. The elements of the list should have the same name as the variable and should be either a contrast matrix (specifically, any full-rank matrix with as many rows as there are levels in the factor), or else a function to compute such a matrix given the number of levels. The complete contrast list (any element specified as an argument plus any additional contrast matrices that are computed) is returned as the "contrasts" attribute of the model matrix, and hence as the "contrasts" component of fitted models returned by lm() and its descendants.
xlev a named list (optional). Each name identifies a factor contained in "data", and the corresponding value gives the levels to retain in the resulting model frame. Note that the function does not drop levels that occur in "data". However, if some levels are not contained in "data", you can use "xlev" to retain those levels in the resulting model frame.
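The contrasts.arg and xlev arguments described above can be sketched as follows. This is a minimal illustration that assumes the behavior of model.matrix in open-source R; the data frames d and d_small and their columns are hypothetical, and results in other implementations may differ.

```r
# Hypothetical toy data; d, x, and grp are illustrative names only.
d <- data.frame(
  x   = c(1.5, 2.0, 3.5, 4.0),
  grp = factor(c("a", "b", "c", "a"))
)

# No contrasts.arg: contrasts for grp come from options("contrasts").
m_default <- model.matrix(~ x + grp, data = d)

# contrasts.arg: the list element is named after the variable and supplies
# a function that builds the contrast matrix from the number of levels.
m_sum <- model.matrix(~ x + grp, data = d,
                      contrasts.arg = list(grp = contr.sum))

# xlev: grp has only the levels "a" and "b" in d_small, but level "c"
# is retained because it is listed in xlev.
d_small <- data.frame(x = c(1, 2), grp = factor(c("a", "b")))
m_xlev <- model.matrix(~ grp, data = d_small,
                       xlev = list(grp = c("a", "b", "c")))

print(colnames(m_default))
print(colnames(m_sum))
print(colnames(m_xlev))
```

In this sketch, m_xlev has a column for each contrast of the three-level factor even though only two levels appear in d_small.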
Value
an object of class "model.matrix" that inherits from "matrix". This object is a matrix of predictor variables that includes contrasts for all factors and ordered factors in the terms object. If the model includes an intercept, the first column is a vector of 1s. The matrix has several special attributes:

assign an integer vector, of length equal to the number of terms in the model. The elements of the vector identify which columns of the model matrix encode the corresponding term. Note that the assign attribute in R does not work the same way as this assign attribute.
dimnames the row labels constructed from the model frame and the column labels constructed from the variable names. (For more information about the variable names see below.) The column labels define the names for the coefficients and effects of the fitted model. (In the case of multivariate response models, read "row labels" for "names".) The row labels are the same as the names or row labels for fitted values and residuals, but these usually come directly from the model frame through model.extract.
contrasts a named list that contains contrast matrices or character vectors. Any contrast matrices used are returned in an element of the list with the same name as the corresponding variable. For more information, see lm.object.
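The attributes listed above can be inspected directly. The sketch below assumes open-source R and a hypothetical data frame d. It prints the assign attribute without interpreting it, because its layout differs between implementations, as noted above.

```r
# Hypothetical toy data; names are illustrative only.
d <- data.frame(
  y   = c(10, 12, 15, 11),
  x   = c(1.5, 2.0, 3.5, 4.0),
  grp = factor(c("a", "b", "c", "a"))
)

m <- model.matrix(y ~ x + grp, data = d)

# The model has an intercept, so the first column is all 1s.
first_col_is_ones <- all(m[, 1] == 1)

# Row labels come from the model frame; column labels from the terms.
dn <- dimnames(m)

# Named list with one element per factor in the model.
ctr <- attr(m, "contrasts")

# Printed only; its exact form is implementation-specific.
asg <- attr(m, "assign")

print(dn)
print(ctr)
print(asg)
```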
Details
Factors, including ordered factors, are turned into columns of numeric variables using contrasts or dummy variables, according to the instructions coded in the "factors" attribute of the terms object. The particular contrasts are taken from the contrasts.arg argument as supplied (typically passed down from lm() and similar functions), from the "contrasts" attribute of the factor, if any, or from the default choice of contrast functions. In the absence of this attribute, the two character strings in options("contrasts") define the choice of contrast function for factors and ordered factors, respectively. Note that the same variable may be used both with and without contrasts. Interaction terms are formed by computing the various main effects and then taking all products of the corresponding columns. In practice, however, the computations look back at previously computed terms in an attempt to avoid re-computation. For details about specifying contrast functions as arguments, see contrasts and C.
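The coding choices described above can be sketched as follows, assuming open-source R and a hypothetical data frame d. The sketch shows the default contrast setting, the same factor coded with contrasts and as dummy variables, and an interaction built from products of main-effect columns.

```r
# Hypothetical toy data; names are illustrative only.
d <- data.frame(
  x   = c(1.5, 2.0, 3.5, 4.0),
  grp = factor(c("a", "b", "c", "a"))
)

# Two strings: contrast functions for unordered and for ordered factors.
default_contrasts <- options("contrasts")$contrasts

# Contrast coding: with an intercept, grp contributes nlevels - 1 columns.
m_contr <- model.matrix(~ grp, data = d)

# Dummy (indicator) coding: without an intercept, grp contributes one
# column per level.
m_dummy <- model.matrix(~ grp - 1, data = d)

# Interaction: columns 3:4 hold the grp main effect and columns 5:6 hold
# x:grp, which should equal the elementwise products of x with columns 3:4.
m_int <- model.matrix(~ x * grp, data = d)
products <- m_int[, "x"] * m_int[, 3:4]
products_match <- isTRUE(all.equal(unname(products), unname(m_int[, 5:6])))

print(default_contrasts)
print(colnames(m_int))
print(products_match)
```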
The column labels are constructed as follows. Numeric variables inherit the corresponding term label. Numeric matrices produce column labels that concatenate the term label with the column labels of the matrix, if any, or with "1", "2", and so on. Main effects for factors or ordered factors use the term label concatenated with the column labels of the contrast matrix, again using "1", "2", and so on as the default. In both cases, the term label is used alone if there is only one column or one contrast.
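The labeling rules for numeric variables and numeric matrices can be checked with a small sketch. It assumes open-source R; the data frame d and the matrix columns M_named and M_plain are hypothetical. The labels for grp depend on the contrast functions in effect, so the sketch prints them without assuming a particular form.

```r
# Hypothetical toy data; names are illustrative only.
d <- data.frame(
  x   = c(1.5, 2.0, 3.5, 4.0),
  grp = factor(c("a", "b", "c", "a"))
)

# A numeric matrix with column labels "u" and "v".
d$M_named <- cbind(u = c(1, 2, 3, 4), v = c(4, 3, 2, 1))

# A numeric matrix without column labels.
d$M_plain <- cbind(c(1, 2, 3, 5), c(4, 3, 2, 1))

m <- model.matrix(~ x + M_named + M_plain + grp, data = d)
labels <- colnames(m)

# The numeric variable keeps its term label; matrix columns append either
# the matrix column label or "1", "2", and so on.
print(labels)
```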
This function is primarily a support routine, called by lm and by other model-fitting functions that call or derive from lm, such as aov, glm, and gam. Note that the model-fitting functions loess and tree do not use model.matrix, primarily because they do not use contrasts to handle factors.
When model.matrix processes a column of character data, it converts the column into a factor and warns you that it is doing so. This warning mimics the behavior in R. You can suppress the warning by setting options(
See Also
model.frame, model.extract, terms.object.
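Examples
# Fit a linear model to the fuel.frame data set.
fl <- lm(Fuel ~ Weight + Disp., data=Sdatasets::fuel.frame)
# Extract the model matrix from the fitted model.
model.matrix(fl)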
Package stats version 6.0.0-69