Combining Groups Classification with Deployment

The program automatically finds and applies (e.g., to data marked Data for deployment) the best recoding scheme for predicting a categorical variable from one or more categorical predictors with many classes (such as SIC codes with over 10,000 distinct values). It uses an efficient CHAID-like algorithm to determine which combinations of classes yield a strong relationship to the outcome variable of interest. The recoded (aggregated) class variables, which now have fewer distinct values, can then be submitted to subsequent analyses with the various tools for predictive data mining.
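The merging idea behind a CHAID-like recoding can be illustrated with a minimal Python sketch. It is not the program's actual implementation: it assumes a binary outcome (so each pairwise test has 1 degree of freedom), treats the predictor as nominal (any two classes may be merged), and uses the hypothetical names chi2_p_2x2, merge_classes, p_merge, and min_groups. The class pair whose outcome distributions differ least (largest p value) is merged repeatedly until no pair exceeds the merge threshold.

```python
import math
from itertools import combinations

def chi2_p_2x2(a, b):
    """p value of a Pearson chi-square test (1 df) comparing two classes.

    a and b are [count_outcome_0, count_outcome_1] for each class.
    """
    row_totals = [sum(a), sum(b)]
    col_totals = [a[0] + b[0], a[1] + b[1]]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        return 1.0  # degenerate table: no evidence of a difference
    chi2 = 0.0
    for row, rt in zip((a, b), row_totals):
        for obs, ct in zip(row, col_totals):
            expected = rt * ct / n
            chi2 += (obs - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))

def merge_classes(counts, p_merge=0.05, min_groups=2):
    """Greedily merge the most similar classes (assumed stopping rules).

    counts maps each original class to [count_outcome_0, count_outcome_1].
    Returns a dict mapping tuples of merged classes to pooled counts.
    """
    groups = {(k,): list(v) for k, v in counts.items()}
    while len(groups) > min_groups:
        best_pair, best_p = None, p_merge
        for g1, g2 in combinations(sorted(groups), 2):
            p = chi2_p_2x2(groups[g1], groups[g2])
            if p > best_p:  # least significant difference so far
                best_pair, best_p = (g1, g2), p
        if best_pair is None:
            break  # every remaining pair differs significantly
        g1, g2 = best_pair
        pooled = [x + y for x, y in zip(groups.pop(g1), groups.pop(g2))]
        groups[tuple(sorted(g1 + g2))] = pooled
    return groups

# Illustrative counts only (not real data): four classes, binary outcome.
example = {"A": [90, 10], "B": [88, 12], "C": [20, 80], "D": [22, 78]}
recoded = merge_classes(example, p_merge=0.05, min_groups=1)
```

In this made-up example, classes A and B have similar outcome rates, as do C and D, so the sketch recodes four classes into two. A real CHAID-style procedure would also handle outcomes with more than two categories and could re-split merged groups.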

General

Element Name	Description
Min-N to stop (% of cases)	The minimum number of cases (observations) per recoded class (node), expressed as a percentage of the total number of observations. If the specified percentage evaluates to fewer than 5 observations, the minimum number of cases per recoded class (node) is set to 5.
Minimum number of categories	Minimum number of categories to recode.
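p value for splitting	The p value (significance level) used as the criterion for splitting.
p value for merging	The p value (significance level) used as the criterion for merging categories.
Splitting after merging	Allows splitting of categories after they have been merged.
Bonferroni adjustment	Applies a Bonferroni adjustment to the p values.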
Add new variables	Add new variables to the input spreadsheet to hold the recoded variables.
Generates data source	Generates a data source for further analyses with other Data Miner nodes.
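
Two of the options above involve simple arithmetic that a short Python sketch can make concrete. The function names min_cases_per_class and bonferroni are hypothetical, rounding the percentage up to a whole case is an assumption, and the Bonferroni multiplier is taken here to be a plain count of comparisons; the program may compute both differently.

```python
import math

def min_cases_per_class(n_total, min_pct, floor=5):
    """Minimum cases per recoded class from a percentage of all cases.

    The floor of 5 follows the Min-N option description; rounding up
    to a whole number of cases is an assumption of this sketch.
    """
    return max(floor, math.ceil(n_total * min_pct / 100.0))

def bonferroni(p, n_comparisons):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p * n_comparisons)

small = min_cases_per_class(200, 1.0)     # 1% of 200 is 2, so the floor of 5 applies
large = min_cases_per_class(10000, 1.0)   # 1% of 10,000 is 100
adjusted = bonferroni(0.01, 3)            # three comparisons assumed for illustration
```

With 200 observations and a 1% setting, the percentage evaluates to 2 cases, so the minimum falls back to 5; with 10,000 observations the same setting requires 100 cases per recoded class.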
