Exploration Operators
Exploration operators provide different ways to explore and visualize your data.
Knowing which type of visualization is appropriate for your data is important for successful modeling. For more information about selecting the best visualization type, see Visualizing data with charts and graphs.
- Bar Chart
Visualizes the attributes of a data set in bar chart format.
- Box Plot
Visualizes the attributes of a data set in box plot format.
- Correlation (DB)
Use to specify two or more numeric type attributes (columns) in a data set for relative analysis against each other by calculating the correlation between each pair of selected columns.
- Correlation (HD)
Use to specify two or more numeric type attributes (columns) in a data set for relative analysis against each other by calculating the correlation between each pair of selected columns.
- Frequency
Analyzes the values of selected fields in a table, helping to interpret the shape of the data column by column.
- Histogram
Analyzes the values of the selected fields of a data set, and generates a graphical representation of the frequency distribution of the numeric data.
- Information Value
Calculates both the information value (IV) and weight of evidence (WOE) of attributes. These are measures of the overall "relevance" of a data variable in predicting the dependent column's desired value or outcome.
- Line Chart
Produces a line chart for a user-specified X-axis and Y-axis, and aggregates the values in the Y-axis.
- Scatter Plot Matrix
Creates pairwise scatter plots of the selected columns. This gives a visual sense of the relationship between each of the paired attributes, as well as the calculated correlation.
- Summary Statistics (DB)
Provides useful summary information for the selected columns of the data set passed by the preceding operator.
- Summary Statistics (HD)
Provides useful summary information for the selected columns of the data set passed by the preceding operator.
- Variable Selection (DB)
Identifies and prioritizes the variables of interest to a prediction task or model. This is especially helpful when there are a large number of potential variables for a model, enabling the modeler to focus on only a subset of those that show the strongest relevance.
- Variable Selection (HD)
Identifies and prioritizes the variables of interest to a prediction task or model. This is especially helpful when there are a large number of potential variables for a model, enabling the modeler to focus on only a subset of those that show the strongest relevance.
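
To make the weight of evidence (WOE) and information value (IV) measures mentioned for the Information Value operator more concrete, the following Python sketch computes them for one categorical attribute against a binary outcome. It uses the common textbook definitions and is not the operator's own implementation; the exact formula, binning, and handling of empty categories used by the operator are not stated here and might differ. The function name woe_iv and the sample data are hypothetical.

```python
import math
from collections import Counter


def woe_iv(values, outcomes):
    """Return ({category: WOE}, IV) for a categorical attribute.

    Assumes the common textbook definitions (not confirmed as the
    operator's own formula):
      WOE(c) = ln(share of events in c / share of non-events in c)
      IV     = sum over c of (event share - non-event share) * WOE(c)
    outcomes must contain 1 for the desired outcome and 0 otherwise.
    """
    events = Counter(v for v, y in zip(values, outcomes) if y == 1)
    non_events = Counter(v for v, y in zip(values, outcomes) if y == 0)
    total_events = sum(events.values())
    total_non_events = sum(non_events.values())
    if total_events == 0 or total_non_events == 0:
        raise ValueError("need at least one event and one non-event")

    woe = {}
    iv = 0.0
    for category in sorted(set(values)):
        # Share of all events / non-events that fall in this category.
        pct_event = events[category] / total_events
        pct_non_event = non_events[category] / total_non_events
        if pct_event == 0 or pct_non_event == 0:
            # WOE is undefined for such a category; this sketch skips it.
            continue
        woe[category] = math.log(pct_event / pct_non_event)
        iv += (pct_event - pct_non_event) * woe[category]
    return woe, iv


# Hypothetical sample: attribute value per row and the binary outcome.
values = ["a", "a", "a", "a", "b", "b", "b", "b"]
outcomes = [1, 1, 1, 0, 1, 0, 0, 0]
woe, iv = woe_iv(values, outcomes)
print(woe, iv)
```

In this sample, category "a" holds three of the four events and one of the four non-events, so its WOE is positive, while "b" is the mirror image and has a negative WOE. A larger IV suggests that the attribute separates the two outcomes more strongly.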