GRM Syntax - Example 3: Best-Subset Regression with Categorical Predictors
This example illustrates the best-subset regression facilities of GRM and how they can be applied to experimental designs. The FORCE keyword forces all five main effects into the model; GRM then searches for the best subset containing up to five additional two-way interactions (i.e., START = 6, STOP = 10). A unique feature of GRM is that when categorical predictor variables or effects have more than one degree of freedom (as in this example), the stepwise and best-subset procedures move the coded (sigma-restricted) variables that represent each categorical effect into or out of the model as a block, so that the final model always includes or excludes complete multiple-degree-of-freedom effects. You can run the example shown below using the example data file Tomatoes.sta.
GRM;
{ Dependent variable (list): }
DEPENDENT = POUNDS;
{ Specification of grouping variables (factors); note that
no codes (values) are specified, so the program will by default
take all grouping codes found in the data file. }
GROUPS = 'SOIL CONDITION' POTSIZE VARIETY 'PRODUCTION METHOD' LOCATION;
{ Here the bar operator and the @ operator are used to construct the
factorial design to degree 2; the bar operator will evaluate to all main
effects and interactions up to the number specified after the @ operator }
DESIGN = 'SOIL CONDITION' | POTSIZE | VARIETY | 'PRODUCTION METHOD' | LOCATION @2;
{ Best-subset regression is requested as the model building method. }
MBUILD = BESTSUBSET;
{ Here the first 5 effects, i.e., main effects, are "forced" into the model. }
FORCE = 5;
{ Mallows' Cp index will be used to evaluate the subsets. }
BESTCRIT = MALLOWSCP;
{ The search will begin with subsets of size 6 and continue up to
subsets of size 10. }
START = 6;
STOP = 10;
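To make the size of the search concrete, the following Python sketch enumerates the candidate models implied by the settings above. It is only an illustration, not GRM's actual search procedure. It assumes that the bar operator lists the five main effects first and the ten two-way interactions after them (consistent with the comment on FORCE), and that subset size counts whole effects rather than individual coded columns. The variable names are hypothetical.

```python
from itertools import combinations
from math import comb

# Factor names taken from the GROUPS statement in the listing above.
factors = ["SOIL CONDITION", "POTSIZE", "VARIETY", "PRODUCTION METHOD", "LOCATION"]

# Assumed expansion of the bar operator with @2:
# main effects first, then all two-way interactions.
main_effects = [(f,) for f in factors]
interactions = list(combinations(factors, 2))
effects = main_effects + interactions

FORCE, START, STOP = 5, 6, 10

forced = effects[:FORCE]      # always in the model
candidates = effects[FORCE:]  # effects the search may add

def candidate_subsets():
    """Yield every model of START..STOP effects that contains the forced effects.

    Each effect is treated as one unit, mirroring the rule that a
    multiple-degree-of-freedom effect enters or leaves as a block.
    """
    for size in range(START, STOP + 1):
        for extra in combinations(candidates, size - FORCE):
            yield forced + list(extra)

# Number of candidate models for each subset size.
subset_counts = {size: comb(len(candidates), size - FORCE)
                 for size in range(START, STOP + 1)}
total = sum(1 for _ in candidate_subsets())

print(subset_counts)
print(total)
```

Under these assumptions there are 15 effects in the design, 10 of which are open to selection, and the search covers 10 + 45 + 120 + 210 + 252 = 637 candidate models, each of which would be scored with the chosen criterion.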
For more examples, see GRM Syntax - Examples.