Workspace Node: Advanced Classification CHAID - Specifications - Validation Tab
In the Advanced Classification CHAID workspace node dialog box, under the Specifications heading, select the Validation tab to specify the cross-validation method to be used in the analysis. Two cross-validation methods are available on this tab: V-fold cross-validation and Test sample.
Element Name | Description |
---|---|
V-fold cross-validation | V-fold cross-validation is particularly useful when no test sample is available and the learning sample is too small to have the test sample taken from it. Select the V-fold cross-validation check box to make use of v-fold cross-validation. Additional specifications for v-fold cross-validation include Seed for random number generator and V-fold cross-validation; v-value. These values are used to control the sampling that Statistica performs to obtain cross-validation error estimates. If this check box is selected when you click the OK button, the program will automatically grow the (best) tree, and will compute risk estimates separately for the training and cross-validation samples if you select the Risk estimates check box on the Results - Quick tab. |
Seed for random number generator | The positive integer value entered in this box is used as the seed for a random number generator that produces v-fold random subsamples from the learning sample to test the predictive accuracy of the computed trees. |
V-fold cross-validation; v-value | The value entered in this box determines the number of cross-validation samples that will be generated from the learning sample to provide a cross-validation error estimate for the current tree. See also the Introductory Overview for details. |
Test sample | The test sample option enables you to use a subsample of cases for estimating the accuracy of the classifier or prediction. Click the Test sample button to display the Test-Sample dialog box, through which you can switch on or off the Test sample option as well as select a variable that will be used as the sample identifier variable. Click the Sample Identifier Variable button to display a variable selection dialog box, in which to choose the sample identifier variable. In addition, you need to select the code for the selected variable that uniquely identifies the cases to be used in the test sample. By default, when a sample identifier variable has been selected, a valid code will be displayed in the Code for analysis sample box. If this is not the desired code for identifying the test sample, double-click on the box (or press the F2 key on your keyboard) to display a dialog box from which you can select the desired code from the list of valid variable codes. If a Test sample is identified, the Risk estimates for the final tree (see the Results - Quick tab) and predicted values or classifications (and residuals; see the Results - Prediction tab) can be computed separately for the training and the testing sample. |
Options / C / W | See Common Options. |
OK | Click this button to accept all the specifications made in the dialog box and to close it. The analysis results are placed in the Reporting Documents workspace node after running (updating) the project. |
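
The two validation methods described in the table can be illustrated outside the product with a short Python sketch. It is not Statistica's implementation: the function names (`v_fold_indices`, `v_fold_error`, `split_by_sample_code`), the round-robin fold assignment, and the majority-class stand-in classifier are all assumptions chosen for illustration, and the actual sampling scheme Statistica uses may differ. The sketch shows how a seed makes the v-fold subsamples reproducible, how the v-value sets the number of folds, and how a sample identifier variable and code can separate test cases from training cases.

```python
import random
from collections import Counter


def v_fold_indices(n_cases, v, seed):
    """Assign case indices 0..n_cases-1 to v folds, reproducibly for a given seed."""
    if v < 2 or v > n_cases:
        raise ValueError("v must be between 2 and the number of cases")
    rng = random.Random(seed)  # same seed -> same shuffle -> same folds
    indices = list(range(n_cases))
    rng.shuffle(indices)
    # Deal shuffled indices round-robin so fold sizes differ by at most one.
    return [indices[i::v] for i in range(v)]


def v_fold_error(cases, labels, v, seed, fit, predict):
    """Cross-validated misclassification rate: each fold is held out once."""
    errors = 0
    for held_out in v_fold_indices(len(cases), v, seed):
        held = set(held_out)
        train = [i for i in range(len(cases)) if i not in held]
        model = fit([cases[i] for i in train], [labels[i] for i in train])
        errors += sum(predict(model, cases[i]) != labels[i] for i in held_out)
    return errors / len(cases)


def split_by_sample_code(rows, id_variable, test_code):
    """Split rows (dicts) into training and test cases using an identifier code."""
    training = [r for r in rows if r[id_variable] != test_code]
    testing = [r for r in rows if r[id_variable] == test_code]
    return training, testing


# Stand-in classifier (NOT CHAID): always predicts the most frequent training label.
def fit_majority(cases, labels):
    return Counter(labels).most_common(1)[0][0]


def predict_majority(model, case):
    return model


if __name__ == "__main__":
    folds = v_fold_indices(n_cases=10, v=3, seed=1)
    print([len(f) for f in folds])  # fold sizes: [4, 3, 3]

    rows = [{"x": 1, "sample": "train"}, {"x": 2, "sample": "test"}]
    training, testing = split_by_sample_code(rows, "sample", "test")
    print(len(training), len(testing))
```

Passing `fit` and `predict` as arguments keeps the fold logic independent of the model, so the same routine could wrap any tree-growing procedure in place of the majority-class stand-in.
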
Copyright © 2021. Cloud Software Group, Inc. All Rights Reserved.