GRM Syntax - Example 3: Best-Subset Regression with Categorical Predictors

This example illustrates the best-subset regression facilities of GRM and how they can be applied to experimental designs. The FORCE keyword forces all five main effects into the model; GRM then searches for the best subset containing up to five additional two-way interactions (i.e., START = 6, STOP = 10). A unique feature of GRM is that when categorical predictor variables or effects have more than one degree of freedom (as in this example), the stepwise and best-subset procedures move the coded (sigma-restricted) variables that represent each categorical effect into or out of the model as a block, so that the final model always includes or excludes complete multiple-degree-of-freedom effects. You can run the GRM syntax shown below using the example data file Tomatoes.sta.
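
The block-wise handling of categorical effects described above can be illustrated outside of GRM. The following Python sketch is not GRM syntax and not GRM's implementation; it assumes the common textbook form of sigma-restricted coding (a factor with k levels yields k - 1 columns, and the last level is coded -1 in every column). The level labels A, B, C and the function names are hypothetical, chosen only for illustration.

```python
def sigma_restricted_columns(effect, levels):
    """Code one categorical effect with k levels as k - 1 columns.

    Assumed convention: level i (i < k) gets 1 in column i and 0 elsewhere;
    the last level gets -1 in every column.
    """
    if len(levels) < 2:
        raise ValueError("a categorical effect needs at least two levels")
    coded_levels = levels[:-1]
    names = [f"{effect}[{level}]" for level in coded_levels]
    coding = {}
    for level in levels:
        if level == levels[-1]:
            coding[level] = [-1] * len(coded_levels)
        else:
            coding[level] = [1 if level == other else 0 for other in coded_levels]
    return names, coding


def columns_for_model(selected_effects, effect_columns):
    """Expand selected effects into coded columns, one whole block per effect.

    An effect contributes either all of its columns or none of them, so a
    multiple-degree-of-freedom effect is never split.
    """
    columns = []
    for effect in selected_effects:
        columns.extend(effect_columns[effect])
    return columns


# Hypothetical level labels; the actual codes come from the data file.
names, coding = sigma_restricted_columns("VARIETY", ["A", "B", "C"])
effect_columns = {"VARIETY": names, "POTSIZE": ["POTSIZE[small]"]}
model_columns = columns_for_model(["VARIETY"], effect_columns)
```

Because selection operates on effect names rather than on individual coded columns, the two VARIETY columns in this sketch can only enter or leave the model together.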

GRM;

{ Dependent variable (list): }

   DEPENDENT = POUNDS;

{ Specification of grouping variables (factors); note that
   no codes (values) are specified, so the program will by default
   take all grouping codes found in the data file. }

   GROUPS = 'SOIL CONDITION' POTSIZE VARIETY 'PRODUCTION METHOD' LOCATION;

{ Here the bar operator and the @ operator are used to construct the
  factorial design to degree 2; the bar operator expands to all main
  effects and interactions up to the degree specified after the @ operator. }

   DESIGN = 'SOIL CONDITION' | POTSIZE | VARIETY | 'PRODUCTION METHOD' | LOCATION @2;

{ Best-subset regression is requested as the model building method. }

   MBUILD = BESTSUBSET;

{ Here the first 5 effects, i.e., main effects, are "forced" into the model. }

   FORCE = 5;

{ Mallows' Cp index will be used to evaluate the subsets. }

   BESTCRIT = MALLOWSCP;

{ The search for the subsets will begin with subsets of size 6, up to
   subsets of size 10. }

   START = 6;

   STOP = 10;
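
To make the size of this search concrete, the following Python sketch (not GRM syntax) expands the factorial design to degree 2 and counts the effect subsets implied by FORCE = 5, START = 6, and STOP = 10. It assumes that subset size counts effects (including the forced ones) and that every eligible subset is a candidate; how GRM actually organizes the search is not stated here. The mallows_cp function uses the standard textbook definition, Cp = SSE_p / MSE_full - n + 2p, and whether GRM counts the intercept in p is likewise an assumption. All function names are hypothetical.

```python
from itertools import combinations
from math import comb

FACTORS = ["SOIL CONDITION", "POTSIZE", "VARIETY", "PRODUCTION METHOD", "LOCATION"]


def expand_bar(factors, max_degree):
    """List all effects up to max_degree, main effects first."""
    effects = []
    for degree in range(1, max_degree + 1):
        effects.extend(combinations(factors, degree))
    return effects


def count_candidate_subsets(n_effects, force, start, stop):
    """Count effect subsets of size start..stop that contain the forced effects."""
    free = n_effects - force  # effects that may or may not be selected
    return sum(comb(free, size - force) for size in range(start, stop + 1))


def mallows_cp(sse_subset, mse_full, n_obs, n_params):
    """Textbook Mallows' Cp for a subset model with n_params parameters."""
    return sse_subset / mse_full - n_obs + 2 * n_params


effects = expand_bar(FACTORS, 2)
n_candidates = count_candidate_subsets(len(effects), force=5, start=6, stop=10)
print(len(effects), n_candidates)  # prints: 15 637
```

The design therefore contains 15 effects (5 main effects and 10 two-way interactions), and with the main effects forced there are 637 candidate combinations of one to five interactions. These counts refer to effects, not to the individual coded columns that represent them.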

For more examples, see GRM Syntax - Examples.