model.frame
Construct or Extract a Model Frame

Description

Given a formula and some data, model.frame returns a data frame with a terms attribute that contains sufficient information for many fitting functions (e.g., lm or glm) to fit the formula to the data. The model frame contains a column for each elementary term in the formula, but no columns for interaction terms. For example, the formula ~sqrt(x)+log(x)*log(y) produces 3 columns: sqrt(x), log(x), and log(y).
Given a fitted model, model.frame will return the data.frame-with-terms object used to fit the model.
get_all_vars returns a data.frame containing a column for each variable mentioned in the formula. For example, the formula ~sqrt(x)+log(x)*log(y) produces 2 columns: x and y.
Function model.frame is an S Version 3 generic (see Methods); method functions can be written to handle specific S Version 3 classes of data. Besides the default method, classes that already have methods for model.frame include aovlist, glm and lm.
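The difference between the two functions described above can be seen in a minimal sketch. The data frame d and its columns x and y are hypothetical, made up only for illustration.

```r
# Hypothetical toy data; the names d, x and y are illustrative only.
d <- data.frame(x = c(1, 4, 9, 16), y = c(2, 3, 5, 7))

# One column per elementary term; the interaction log(x):log(y) gets no column.
mf <- model.frame(~ sqrt(x) + log(x) * log(y), data = d)
names(mf)

# One column per underlying variable mentioned in the formula.
av <- get_all_vars(~ sqrt(x) + log(x) * log(y), data = d)
names(av)
```

Here names(mf) should list sqrt(x), log(x), and log(y), while names(av) should list only x and y.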

Usage


model.frame(formula, ...)

## S3 method for class 'lm':
model.frame(formula, ...)


## S3 method for class 'aovlist':
model.frame(formula, data = NULL, ...)



## Default S3 method:
model.frame(formula, data = NULL, subset = NULL, na.action = na.fail,
    drop.unused.levels = FALSE, xlev = NULL, ...)

get_all_vars(formula, data = NULL, ...)

Arguments

formula the formula or other object defining what terms should be included in the model frame. Besides being a formula object, this can be a fitted model of various kinds, in which case the formula used in fitting the model defines the terms.
data data frame from which the model frame is to be constructed. After looking for variables in the data argument, model.frame will look in the environment of the formula, which is usually the environment in which the formula was constructed.
subset a vector that specifies the subset of rows of the data frame (data) to use when constructing the model frame.
na.action a function to filter missing data. The default is the function na.fail.
drop.unused.levels If TRUE, unused levels in any factor are omitted from the "levels" attribute of that factor in the resulting model frame. If FALSE (the default), unused factor levels are not dropped, so the model frame contains factors with all of their original levels, used or not.
xlev an optional named list whose names identify factor columns in data and whose values give the levels those factors should have in the resulting model frame. Levels that do not occur in data may otherwise be dropped; to retain such levels in the resulting model frame, supply them through xlev. If xlev is provided, drop.unused.levels is ignored.
... other arguments passed to or from the methods. These can include data, subset, na.action, weights, and so on.
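As a rough illustration of how subset, na.action, and drop.unused.levels interact in the default method, consider the sketch below. The data frame d and its columns are hypothetical, and na.action is passed explicitly rather than relying on any default.

```r
# Hypothetical data: one missing response and a three-level factor.
d <- data.frame(
  y = c(1.2, 2.3, NA, 4.1, 5.0, 6.2),
  g = factor(c("a", "a", "b", "b", "c", "c")),
  w = c(3, 5, 8, 10, 12, 14)
)

# subset is evaluated using the columns of data; na.omit then removes
# rows with missing values; unused factor levels are dropped last.
mf_drop <- model.frame(y ~ g, data = d, subset = w < 11,
                       na.action = na.omit, drop.unused.levels = TRUE)

# Same call, but keeping every original factor level.
mf_keep <- model.frame(y ~ g, data = d, subset = w < 11,
                       na.action = na.omit, drop.unused.levels = FALSE)

nrow(mf_drop)        # rows surviving both subset and na.omit
levels(mf_drop$g)    # only the levels still in use
levels(mf_keep$g)    # all original levels
```

Level "c" never appears in the selected rows, so it should be absent from levels(mf_drop$g) but still present in levels(mf_keep$g).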

Details

The response and any extra variables other than subset are stored in the data frame. They should be retrieved from the frame by using model.extract(fr, response) for response, model.extract(fr, weights) for weights, and so on for whatever names were used in the arguments to model.frame. Other than subset, the names of such extras are arbitrary; they only need to evaluate to a legitimate variable for the data frame (e.g., a numeric vector, a matrix, or a factor). The names of such variables are specially coded in the model frame so as not to conflict with variable names occurring in the terms. You should always use model.extract, which shares the knowledge of the coded names with model.frame, rather than assuming a specific coding.
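The retrieval pattern described above might look like the following sketch. The data frame d and the column name w are hypothetical; the only extra variable passed here is weights.

```r
# Hypothetical data with a column of case weights.
d <- data.frame(y = c(2.1, 3.9, 6.2, 7.8), x = 1:4, w = c(1, 2, 2, 1))

# weights is stored in the frame under a specially coded column name.
fr <- model.frame(y ~ x, data = d, weights = w)

# Retrieve the pieces through model.extract instead of guessing that name.
resp <- model.extract(fr, response)
wts  <- model.extract(fr, weights)
```

Going through model.extract keeps the calling code independent of whatever coding model.frame uses for the extra columns.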
The function get_all_vars gets all variables from formula, or from data if formula is not given.

Value

model.frame and its S3 methods return a data frame representing all the terms in the model (precisely, all terms of order 1; i.e., main effects), plus the response if any, and any special extra variables (such as weight arguments to fitting functions). One such argument, subset=, is handled specially: if it is present, it is used to compute a subset of the rows of the data, and it is this subset that is returned. The returned data frame has an attribute terms containing the terms object defined by the formula, constructed by the terms function.
get_all_vars returns a data frame containing the variables in formula or data.
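A small sketch of the returned object, using hypothetical data (d, y, a, and b are made-up names), shows the response, the main effects, the effect of subset, and the terms attribute:

```r
# Hypothetical data for a model with an interaction.
d <- data.frame(y = c(1, 3, 2, 5, 4),
                a = c(0.5, 1.5, 2.5, 3.5, 4.5),
                b = c(2, 1, 4, 3, 5))

mf <- model.frame(y ~ a * b, data = d, subset = a > 1)

names(mf)                  # response and main effects; no a:b column
nrow(mf)                   # only the rows selected by subset

# The terms attribute still records the full formula structure.
tt <- attr(mf, "terms")
attr(tt, "term.labels")    # includes the interaction label
```

The interaction is absent from the columns of the frame but remains available to fitting functions through the terms object.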

Note

Model frames are more typically produced as a side effect of fitting a model than by calling model.frame directly. Functions like lm take an argument, model=TRUE/FALSE, that controls whether the model frame is stored as part of the fitted model object.
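The effect of the model argument can be checked with a sketch like the one below. The data frame d is hypothetical, and the sketch assumes that lm stores the frame in a component named model and that model.frame can rebuild the frame from the fitted object when it was not stored, as it does in open-source R.

```r
# Hypothetical data.
d <- data.frame(y = c(1.1, 2.0, 2.9, 4.2, 5.1), x = 1:5)

fit_keep <- lm(y ~ x, data = d, model = TRUE)   # frame stored in the fit
fit_drop <- lm(y ~ x, data = d, model = FALSE)  # frame not stored

is.null(fit_keep[["model"]])   # expected FALSE under the assumption above
is.null(fit_drop[["model"]])   # expected TRUE under the assumption above

# Assumed behavior: the frame is reconstructed from the stored call,
# so d must still be available when this line runs.
mf <- model.frame(fit_drop)
names(mf)
```

Setting model=FALSE saves memory in the fitted object at the cost of re-evaluating the data when the model frame is requested later.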

References

Chambers, J. M. and Hastie, T. J. (Eds.) 1992. Statistical Models in S. Pacific Grove, CA: Wadsworth & Brooks/Cole. Chapter 3.

See Also

model.extract, Methods, terms, all.vars.

Examples

model.frame(ozone ~ radiation + temperature, subset=(wind < 9.7),
    data=Sdatasets::air)

fit <- aov(plants ~ variety * treatment + Error(flats), data=Sdatasets::guayule)
model.frame(fit)

get_all_vars(Mileage ~ Disp. + log(HP), data=Sdatasets::car.all[Sdatasets::car.all$Country == "USA", ])

glm.fit <- glm(Kyphosis ~ ., family = binomial, data = Sdatasets::kyphosis)
model.frame(glm.fit)

Package stats version 6.1.1-7