ROC
Generates a Receiver Operating Characteristic (ROC) curve.
Information at a Glance
| Parameter | Description |
|---|---|
| Category | Model Validation |
| Data source type | HD, DB |
| Send output to other operators | No |
| Data processing tool | MapReduce |
The ROC operator verifies and compares the trained model or models passed from one or more preceding model operators by applying each model to the data set passed from a preceding operator. The ROC-AUC method pairs the false positive rate (FP) with the true positive rate (TP) at each classification threshold. The resulting set of coordinates forms the Receiver Operating Characteristic (ROC) curve.
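For reference, the two rates are conventionally defined from the confusion-matrix counts at a given threshold. These are the standard textbook definitions, written out in words here; this page does not state the exact formulas the operator uses.

$$
\text{true positive rate} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}},
\qquad
\text{false positive rate} = \frac{\text{false positives}}{\text{false positives} + \text{true negatives}}
$$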
The value of the ROC curve can be summarized by calculating the Area Under the ROC curve (AUC).
A random model typically has an ROC curve running along the diagonal, which corresponds to an AUC of about 0.5. A better model's curve bows toward the upper left-hand corner, so its AUC approaches one.
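To make the relationship between the curve and the AUC concrete, the following is a minimal sketch in plain Python, not the operator's own implementation. It assumes each row has an actual class label and a confidence score for the value to predict. The function names `roc_points` and `auc` and the sample data are hypothetical.

```python
def roc_points(labels, scores, positive):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    labels   -- actual class value for each row
    scores   -- model confidence that the row belongs to `positive`
    positive -- the value to predict (the event being analyzed)
    """
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative row")

    # Rank rows from most to least confident, then lower the threshold step by step.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Rows tied at the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == positive:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical sample: six rows, "yes" is the value to predict.
labels = ["yes", "yes", "no", "yes", "no", "no"]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(auc(roc_points(labels, scores, "yes")))
```

In this sample, eight of the nine positive/negative pairs are ranked correctly, so the AUC is 8/9. A model that gives every row the same score produces only the points (0, 0) and (1, 1), which is the diagonal with an AUC of 0.5.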
This operator can be applied, in general, to any classification model (for example, CART, Decision Tree, Logistic Regression, Naive Bayes, Neural Network and Alpine Forest Classification).
Input
- A data set from the preceding operator.
- One or more models from the preceding operator or operators. This input is optional for database data sources.
Configuration
| Parameter | Description |
|---|---|
| Notes | Notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk appears on the operator. |
| Dependent Column | Defines the column used as the class variable. (Not present in Hadoop) |
| Value to Predict | The value that represents the event to analyze. Note: The value of this column must match the data as it is stored in the database, which matches how it is displayed in the data explorer. For example, consider a column that contains Boolean values. |
| Use Model | Specifies whether the evaluation should use a model from its preceding operator(s), or if it should use the data in the prediction columns of the input data set. (Not present in Hadoop) |
| Confidence Columns | Specifies the list of columns in the input data set to compare to the Dependent Column. (Not present in Hadoop) |
Output
