Decision Tree Regression - CART
Generates a decision tree that predicts the value of a numeric column based on several independent columns.
Information at a Glance
This operator uses the MADlib built-in function, tree_train(). The generated tree is a binary tree, with each node representing either a branching condition or a predicted value. The output of the operator can be sent to a predictor or confusion matrix. MADlib 1.8 or higher must be installed on the database.
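The tree shape described above can be illustrated with a minimal Python sketch. It is not the operator's or MADlib's implementation. The `Node` class, the `predict` helper, and the column names `sqft` and `age` are hypothetical, and a simple "less than or equal to a threshold" test stands in for the branching condition.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Node:
    # Hypothetical structure: a leaf sets `prediction`; a branch sets
    # `feature`, `threshold`, and both children.
    prediction: Optional[float] = None
    feature: Optional[str] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None    # taken when row[feature] <= threshold
    right: Optional["Node"] = None   # taken when row[feature] > threshold

    def is_leaf(self) -> bool:
        return self.feature is None


def predict(node: Node, row: Mapping[str, float]) -> float:
    """Walk from the root to a leaf and return that leaf's predicted value."""
    while not node.is_leaf():
        node = node.left if row[node.feature] <= node.threshold else node.right
    return float(node.prediction)


# Made-up two-level tree over the illustrative columns "sqft" and "age".
tree = Node(
    feature="sqft", threshold=1500.0,
    left=Node(prediction=180000.0),
    right=Node(
        feature="age", threshold=20.0,
        left=Node(prediction=320000.0),
        right=Node(prediction=260000.0),
    ),
)
```

Calling `predict(tree, {"sqft": 2000.0, "age": 10.0})` follows the right branch at the root and the left branch at the second node. Because every branch has exactly two children, a prediction costs at most one comparison per tree level.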
For more information about working with decision trees, see Classification Modeling with Decision Tree.
Input
The input table must have a single numeric (floating-point) column to predict and one or more independent columns to serve as input.
Restrictions
This operator works only on databases with MADlib 1.8+ installed. Source data tables must have a numeric ID column that uniquely identifies each row in the source table. The prediction column must be numeric, and all predictions are double-precision values.
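As a rough illustration of the data restrictions above, the following Python sketch checks a small in-memory sample for a unique numeric ID and a numeric prediction column. It is only a sketch under assumptions: `check_rows` and the column names `id` and `price` are hypothetical, and on a real database the same checks would more likely be run as queries against the source table.

```python
from numbers import Real
from typing import Any, Iterable, List, Mapping


def is_numeric(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, Real) and not isinstance(value, bool)


def check_rows(rows: Iterable[Mapping[str, Any]],
               id_col: str, target_col: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    seen = set()
    for i, row in enumerate(rows):
        row_id = row.get(id_col)
        if not is_numeric(row_id):
            problems.append(f"row {i}: {id_col} is not numeric")
        elif row_id in seen:
            problems.append(f"row {i}: duplicate {id_col} {row_id}")
        else:
            seen.add(row_id)
        if not is_numeric(row.get(target_col)):
            problems.append(f"row {i}: {target_col} is not numeric")
    return problems


# Made-up sample: the second row repeats an ID, the third has a text target.
sample = [
    {"id": 1, "price": 10.5},
    {"id": 1, "price": 11.0},
    {"id": 2, "price": "n/a"},
]
problems = check_rows(sample, "id", "price")
```

Collecting every problem in one pass, rather than stopping at the first, makes it easier to clean a source table before running the operator.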
Configuration
Outputs
- Visual Output
This operator produces the following tabs.
- Decision Tree Text - Contains a text representation of the generated decision tree. Each node lists the number of rows it covers and a prediction. Branch nodes also list a branching condition.
- Decision Tree Graph - Contains a tree graph. Branches reflect split conditions and associated predictions.
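To give a feel for what a text representation of such a tree can look like, here is a minimal Python sketch. The layout is an assumption made for illustration and is not the operator's actual output format. `TextNode`, `render`, and the condition `x1 <= 3.5` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TextNode:
    rows: int                         # number of rows reaching this node
    prediction: float                 # predicted value at this node
    condition: Optional[str] = None   # branching condition; None for a leaf
    children: List["TextNode"] = field(default_factory=list)


def render(node: TextNode, depth: int = 0) -> List[str]:
    """Return one indented line per node, with parents before children."""
    label = f"if {node.condition}" if node.condition else "leaf"
    indent = "  " * depth
    lines = [f"{indent}{label}: rows={node.rows}, prediction={node.prediction:.1f}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Made-up tree: the root prediction equals the row-weighted mean of its leaves.
root = TextNode(rows=100, prediction=42.0, condition="x1 <= 3.5", children=[
    TextNode(rows=60, prediction=30.0),
    TextNode(rows=40, prediction=60.0),
])
text = "\n".join(render(root))
```

Printing `text` yields one line per node, indented by depth, so the nesting of split conditions is visible without a graph.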