WS Node - Define Training Testing Sample - Specifications Tab
The Define Training Testing Sample workspace node can be accessed from the Feature Finder, ribbon bar, or Node Browser. Double-click the node to display the specifications dialog box.
This node creates a variable that can be used as a sampling code variable to identify the training, testing, and validation samples, or any number of other categories, in downstream analysis nodes. For example, ITrees can use the sample code to distinguish the training and testing samples, and SANN can use the sample variable to specify the training, testing, and validation samples. The node does not physically split the data into separate data sets; instead, it lets you define a sample code variable with as many codes as you need.
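The idea of a sampling code variable can be illustrated with a short Python sketch. This is not how the node itself is implemented; the column name Sample, the codes Train and Test, and the helper select_sample are hypothetical names chosen for the example. It only shows that the data stay in one table and that a downstream step picks its cases by code.

```python
# A single data set; the "Sample" column is the sampling code variable.
# Column names and code values here are illustrative assumptions.
rows = [
    {"x": 1.2, "y": 0, "Sample": "Train"},
    {"x": 0.7, "y": 1, "Sample": "Test"},
    {"x": 3.1, "y": 1, "Sample": "Train"},
    {"x": 2.4, "y": 0, "Sample": "Validation"},
]


def select_sample(data, sample_var, code):
    """Return the cases whose sampling code variable equals the given code."""
    return [row for row in data if row[sample_var] == code]


# A downstream analysis filters by code; the original table is left untouched.
train_rows = select_sample(rows, "Sample", "Train")
test_rows = select_sample(rows, "Sample", "Test")
```

Because the split lives in a single variable, adding another category later only means adding another code value, not creating another data set.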
Create new sampling variable/Sampling Variable Name. Select the Create new sampling variable check box and enter a new variable name in the Sampling Variable Name text box to define a new sampling variable for the training testing data sample.
Create new sampling variable/Select Variable. Clear the Create new sampling variable check box, and click the Select Variable button to display a standard variable selection dialog box, where you can select an existing variable to define a sampling variable for the training testing data sample.
Seed. Enter a number or adjust the number using the microscrolls to define the seed value for generating random sampling data.
Specify recode categories with proportions for each category. Each recode category includes a name and a proportion. At least two recode categories are required; the OK button is unavailable (dimmed) when fewer than two are defined. Recode category names must be unique. If there are duplicate category names, an error message is displayed when you click the OK button.
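The interaction of the seed, the category proportions, and the validation rules described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the node's actual algorithm: the function name assign_sample_codes is hypothetical, and the sketch assumes each case is drawn independently with probability given by the proportions, so the realized category sizes are only approximately proportional.

```python
import random


def assign_sample_codes(n_rows, categories, seed):
    """Assign one category name to each of n_rows cases.

    categories is a list of (name, proportion) pairs.
    Hypothetical helper; the per-case independent draw is an assumption.
    """
    names = [name for name, _ in categories]
    weights = [proportion for _, proportion in categories]

    # Mirror the dialog rules: at least two categories, unique names.
    if len(categories) < 2:
        raise ValueError("at least two recode categories are required")
    if len(set(names)) != len(names):
        raise ValueError("recode category names must be unique")
    if any(w <= 0 for w in weights):
        raise ValueError("proportions must be positive")

    # A dedicated generator makes the result reproducible for a given seed.
    rng = random.Random(seed)
    return rng.choices(names, weights=weights, k=n_rows)


codes = assign_sample_codes(
    1000,
    [("Train", 0.6), ("Test", 0.2), ("Validation", 0.2)],
    seed=42,
)
```

Running the function again with the same seed and categories returns the same sequence of codes, which is the practical reason for recording the seed value.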