How to: |
Binary logistic regression finds the best linear separation between two classes in the space spanned by the predictors. The target variable has two values (0 and 1), corresponding to the two classes. The predicted value is either a class assignment (0 or 1) or the probability of belonging to class 1.
CLASSIFY_BLR('options', predictor_field1[, predictor_field2, ...] target_field)
where:
Is a dictionary of advanced parameters that control the model attributes, enclosed in single quotation marks. Most of these parameters have a default value, so you can omit them from the request, if you want to use the default values. Even with no advanced parameters, the single quotation marks are required. The format of the advanced parameter dictionary is:
'{"parm_name1": "parm_value1", ... ,"parm_namei": "parm_valuei"}'
The following advanced parameters are supported:
Indicates whether to display a probability of being in the target class or a class prediction. Valid values are "no" (the default) or "yes". The value "yes" produces a probability Y (0<=Y<=1) for belonging to the class specified by "pos_label" (default is "1") for each point in the space. The value "no" produces a class-prediction Y (0 or 1) for each point in the space.
Is relevant when "proba" is "yes". Specifies the class for which the probability of belonging is to be computed. Valid values are "0" and "1". The default is "1".
Is a value between 0 and 1 that specifies the fraction of data used for training the model. The default value is "0.8".
Is a value between 0 and 1 that specifies the fraction of data used for testing the model. The default value is "0.2".
Is a grid consisting of comma-separated positive numbers to be used as L2-regularization strengths. The default value is "0,1,1,10". The optimal value is chosen by cross-validation.
Is the number of folds used for cross-validation. Suggested values are integers between "2" and "10". The default value is "4".
Numeric
Are one or more predictor field names.
Numeric
Is the target field.
The following TABLE request uses CLASSIFY_BLR to compute the number of doors in a car, using the default values for the advanced parameters and predictors price, height, horsepower, peak RPM, city MPG, and highway MPG.
TABLE FILE imports85 PRINT numOfDoors COMPUTE predictedNumOfDoors/A6 = CLASSIFY_BLR('{"proba":"no","kfold":"4","test_ratio":"0.2"}', price, height, horsepower, peakRpm, cityMpg, highwayMpg, numOfDoors); WHERE price LT 30000 ON TABLE SET PAGE NOPAGE ON TABLE SET STYLE * GRID=OFF,$ ENDSTYLE END
The partial output is shown in the following image.