Box Plot

Visualizes the attributes of a data set in box plot format.

Information at a Glance

Category: Explore
Data source type: DB, HD
Sends output to other operators: No
Data processing tool: Pig

Use this operator to create a graphical representation of a data set's attributes using box-and-whisker diagrams.

For each box-and-whisker, the following applies:
  • The bottom and top of the box indicate the 25th and 75th percentiles of the data.
  • The band in the middle of the box indicates the 50th percentile (median).
  • The bottom and top of the whisker represent the minimum and maximum within the data set.
  • The mean is denoted by a small circle.
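
The values listed above can be computed with a short Python sketch. This is an illustration only, not the operator's implementation: the function name box_stats is hypothetical, and the percentile interpolation method ("inclusive") is an assumption, since this page does not say how the operator calculates percentiles.

```python
from statistics import mean, quantiles

def box_stats(values):
    """Return the summary values that one box-and-whisker displays."""
    data = sorted(values)
    # quantiles(..., n=4) returns the three quartile cut points.
    # method="inclusive" is an assumed interpolation choice.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],      # bottom of the whisker
        "q1": q1,            # bottom of the box (25th percentile)
        "median": q2,        # band in the middle of the box (50th percentile)
        "q3": q3,            # top of the box (75th percentile)
        "max": data[-1],     # top of the whisker
        "mean": mean(data),  # small circle
    }

stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(stats)
```

For the sample values 1 through 9, the box spans 3.0 to 7.0, the median band sits at 5.0, and the whiskers reach 1 and 9.
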
The Box Plot operator allows you to choose the following three attributes (columns) in the data table:
  • X dimension (Analysis Type) - Accepts a category-type attribute to construct the box plot's X-axis.
  • Y dimension (Analysis Value) - Accepts a numerical attribute to construct the box plot's Y-axis.
  • Grouped dimension (Analysis Series) - Accepts a category-type attribute, which is shown in diagrams of different colors.
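
As a rough sketch of how the three attributes fit together, the Python below collects the Y values for every combination of X category and series. It assumes that each combination is drawn as its own box, which this page does not state explicitly. The function name group_values and the column names segment, tier, and amount are hypothetical.

```python
from collections import defaultdict

def group_values(rows, type_col, value_col, series_col):
    """Collect Y values per (X category, series) pair."""
    groups = defaultdict(list)
    for row in rows:
        key = (row[type_col], row[series_col])
        groups[key].append(row[value_col])
    return dict(groups)

# Hypothetical input rows, used for illustration only.
rows = [
    {"segment": "A", "tier": "low", "amount": 3000},
    {"segment": "A", "tier": "low", "amount": 5000},
    {"segment": "A", "tier": "high", "amount": 2500},
    {"segment": "B", "tier": "low", "amount": 6000},
]

boxes = group_values(rows, type_col="segment", value_col="amount", series_col="tier")
for (category, series), values in sorted(boxes.items()):
    print(category, series, values)
```

Here segment plays the role of Analysis Type, amount the role of Analysis Value, and tier the role of Analysis Series.
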

Input

A data set from the preceding operator.

Configuration

Parameter - Description
Notes - Any notes or helpful information about this operator's parameter settings. When you enter content in the Notes field, a yellow asterisk is displayed on the operator.
Analysis Series - The categorical column for the groups.
Analysis Type (X-axis) - The categorical column for the X-axis.
Analysis Value (Y-axis) - The numerical column for the Y-axis.
Use approximation (faster) - Specifies whether to use approximation: Yes (the default) or No.

Output

Visual Output
A box plot diagram.
Data Output
None. This is a terminal operator.

Example

The following example shows monthly income (Analysis Value) for delinquent and non-delinquent credit customers (Analysis Type), grouped by the Number of Times 90 Days Late (Analysis Series).

Figure: Box plot example