Example 2: Stepwise Regression Analysis
- Data File
- This example is based on the examples data file Job_prof.sta (from Neter, Wasserman, and Kutner, 1989, page 473). Open this data file by selecting Open Examples from the File menu (classic menus) or by selecting Open Examples from the Open menu on the Home tab (ribbon bar); it is in the Datasets folder. The first four variables (Test1-Test4) represent four different aptitude tests that were administered to each of the 25 applicants for entry-level clerical positions in a company. Regardless of their test scores, all 25 applicants were hired. Once their probationary period had expired, each of these employees was evaluated and given a job proficiency rating (variable Job_prof).
- Research problem
- Stepwise regression will be used to identify the variables (or subset of variables) that best predict job proficiency. Thus, the dependent variable will be Job_prof, and variables Test1-Test4 will be the independent (predictor) variables.
- Starting the analysis
- Select Multiple Regression from the Statistics menu. In the Multiple Linear Regression Startup Panel, click the Variables button and specify variable Job_prof as the Dependent variable and variables Test1-Test4 as the Independent variable list; then click the OK button. Next, click the Advanced tab and select the Advanced options (stepwise or ridge regression) check box. Then click the OK button to display the Model Definition dialog box.
- Specifying the Stepwise Regression
- You can choose to analyze the data using a Standard, Forward stepwise, or Backward stepwise regression method. The popular Forward stepwise method evaluates the independent variables at each step, adding or deleting them from the model based on user-specified criteria (for more information, see Neter, Wasserman, and Kutner, 1989, and Regression Notes). The forward stepwise method will be used to analyze the data for this example.
On the Model Definition dialog box - Quick tab, click the Method drop-down box and select Forward stepwise. Next, on the Stepwise tab you can change the F to enter and F to remove values; however, for this example, accept the default values of 1 and 0, respectively. In order to view the results at each step of the analysis, select At each step in the Display results drop-down box.
Now, accept all other defaults in this dialog box and click the OK button to begin the forward stepwise regression.
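The add-then-check-for-removal logic described above can be sketched outside the dialog box. The following is a minimal NumPy sketch, not the program's own implementation: the function names (`rss`, `partial_f`, `forward_stepwise`), the step cap, and the synthetic demonstration data are all assumptions made for illustration, and the data are random numbers, not the Job_prof.sta values.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares for an OLS fit (with intercept) on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def partial_f(rss_reduced, rss_full, df_resid_full):
    """F statistic for the single variable that separates the reduced and full models."""
    return (rss_reduced - rss_full) / (rss_full / df_resid_full)

def forward_stepwise(X, y, f_enter=1.0, f_remove=0.0):
    """Hypothetical helper: returns column indices in the order they were kept."""
    n, k = X.shape
    selected = []
    for _ in range(2 * k):  # assumed hard cap on steps, a guard against cycling
        candidates = [c for c in range(k) if c not in selected]
        df_new = n - len(selected) - 2  # residual df after adding one variable
        if not candidates or df_new <= 0:
            break
        rss_cur = rss(X, y, selected)
        f_in = {c: partial_f(rss_cur, rss(X, y, selected + [c]), df_new)
                for c in candidates}
        best = max(f_in, key=f_in.get)
        if f_in[best] < f_enter:
            break  # no remaining variable meets the F to enter value
        selected.append(best)
        # After each entry, re-examine the variables already in the model.
        rss_full = rss(X, y, selected)
        df_full = n - len(selected) - 1
        f_out = {c: partial_f(rss(X, y, [s for s in selected if s != c]),
                              rss_full, df_full)
                 for c in selected}
        worst = min(f_out, key=f_out.get)
        if f_out[worst] < f_remove:
            selected.remove(worst)
    return selected

# Synthetic demonstration data (NOT the Job_prof.sta values): 25 cases, 4 predictors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(25, 4))
y_demo = 5.0 * X_demo[:, 2] + 1.0 * X_demo[:, 0] + 0.1 * rng.normal(size=25)
selected = forward_stepwise(X_demo, y_demo)
print(selected)
```

With an F to remove value of 0, as in this example, the removal check never drops a variable, because a partial F value cannot fall below zero; the check only matters when a larger F to remove value is chosen.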
- Step 0
- First, the Results dialog box will be displayed for step 0, when no variables have been entered in the model.
- Step 1
- Click the Next button to proceed to the next step in the analysis. In the first step, each of the independent variables is evaluated individually, and the variable that has the largest F value greater than or equal to the F to enter value is entered into the regression equation.
Here, variable Test3 met the F to enter criterion (F > 1.0) and was added to the model. Select the Advanced tab, and then click the Stepwise regression summary button to produce a spreadsheet with a summary of the steps so far in the analysis.
Click the Next button in the Multiple Regression Results dialog box to proceed to the next step.
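For a model with a single predictor, the F value used in step 1 can be computed directly from the correlation between that predictor and the dependent variable, F = r^2 (n - 2) / (1 - r^2). The sketch below illustrates this under that assumption; the function name `single_predictor_f` and the five data points are made up for illustration and are not taken from Job_prof.sta.

```python
import numpy as np

def single_predictor_f(x, y):
    """F statistic for a simple regression of y on x (one predictor plus intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]   # Pearson correlation between x and y
    n = len(y)
    return r**2 * (n - 2) / (1.0 - r**2)

# Illustrative values only: here r = 0.8, so F = 0.64 * 3 / 0.36.
f_value = single_predictor_f([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(f_value >= 1.0)  # compare against an F to enter value of 1.0
```

In step 1, this quantity would be computed once per candidate variable, and the candidate with the largest value is entered if that value meets the F to enter threshold.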
- Step 2
- In each subsequent step, after a variable is added to the model (based on the F to enter criterion), the forward stepwise regression method re-examines the variables already in the model and, based on the F to remove criterion, determines whether any of them should be removed. In the second step, variable Test1 is entered into the model. Click the Stepwise regression summary button to produce a results spreadsheet summarizing the steps so far.
Once again, click the Next button in the Multiple Regression Results dialog box to proceed to step 3 in the forward stepwise analysis.
- Step 3 (Final Solution)
- Two variables remain to be evaluated (Test2 and Test4). In this step, Test4 gave the largest F value, so it was added to the model. When Test2 was evaluated, its F value was less than the F to enter value of 1.0, so it was not entered into the model.
The Stepwise regression summary results spreadsheet now summarizes the variables that were entered into and kept in the model.
Now, according to the Forward stepwise regression procedure, the subset of aptitude tests (independent variables) that best predicts the job proficiency score (dependent variable) contains Test3, Test1, and Test4. Therefore, the regression equation appears as follows:
y = B0 + B1*X3 + B2*X1 + B3*X4
To obtain the regression coefficients from the regression summary spreadsheet, click the Summary: Regression results button.
The final regression equation is:
y = -124.200 + 1.357*X3 + 0.296*X1 + 0.517*X4
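As a quick check on how the final equation is used, the sketch below evaluates it for one applicant. The coefficients are those of the equation above; the function name `predict_job_prof` and the test scores of 100 are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def predict_job_prof(test1, test3, test4):
    """Predicted job proficiency from the final stepwise equation (Test2 is not in the model)."""
    return -124.200 + 1.357 * test3 + 0.296 * test1 + 0.517 * test4

# Hypothetical applicant scoring 100 on each retained test:
# -124.200 + 135.7 + 29.6 + 51.7
predicted = predict_job_prof(test1=100, test3=100, test4=100)
print(round(predicted, 1))
```

Because Test2 was never entered, its score has no effect on the prediction.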