Box plot
Box plots are graphical tools to visualize key statistical measures, such as median, mean and quartiles.
A single box plot can be used to represent all the data. It is also possible to show separate statistics for subsets of the data by selecting a column for the category axis.
The individual box plot is a visual aid for examining key statistical properties of a variable. The diagram below shows how the shape of a box plot encodes these properties. The vertical scale ranges from the minimum to the maximum value of the selected column, or to the highest or lowest of the displayed reference points.
You specify which reference points to show in the box plot when you configure the visualization. A reference point can be shown as either a marker or a line, and you can specify its color and shape. For details of each measure, see Aggregations and statistical measures.
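As a rough illustration of the measures a box plot encodes, the following Python sketch computes them for a small list of numbers. The function name `box_plot_summary` and the sample data are hypothetical, and the quartile method used here (the standard library's "inclusive" method) is an assumption; quartile definitions vary, so the values shown in the visualization may differ slightly.

```python
import statistics


def box_plot_summary(values):
    """Return the measures a box plot typically encodes.

    Assumes at least two numeric values. The "inclusive" quartile
    method is an illustrative choice, not necessarily the one used
    by the visualization itself.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(data),
        "q3": q3,
        "max": data[-1],
    }


# Hypothetical sample column
summary = box_plot_summary([2, 4, 4, 5, 7, 9, 11, 12, 15])
```

The box would span `q1` to `q3`, with the median and mean drawn as reference points inside it.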
The axis selectors control which column is mapped to which axis. See Selecting columns on an axis for more information about how they work.
The value axis should be set to the column or columns on which the statistical measures should be based. You can adjust the scales and scale labels in the visualization properties. You can also choose to use a Relative scale, which sets the maximum and minimum for all box plots to 100% and 0%, respectively.
A separate box is drawn for each unique value in the column or hierarchy on the category axis, so the category axis should not contain too many unique values. To summarize the data in a single box, select (None) on the category axis. If multiple columns are used on the value axis, (Column Names) must be used either on the category axis or in one of the trellis options.
All visualizations can be configured to show only data limited by one or more markings in other visualizations (details visualizations). Box plots can also be limited by one or more filterings. Alternatively, you can configure a box plot without any filtering at all. See Adding data limitations for a visualization for more information.
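The effect of a relative scale can be sketched as a linear rescaling applied to each box separately, so that boxes with very different value ranges become comparable in shape. This is a minimal sketch under that assumption; the function name `to_relative_scale` and the sample groups are hypothetical, and the exact calculation used by the visualization is not documented here.

```python
def to_relative_scale(values):
    """Map values linearly so the minimum becomes 0.0 and the maximum 100.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        # Returning 0.0 for every value is an arbitrary illustrative choice.
        return [0.0 for _ in values]
    return [100.0 * (v - lo) / (hi - lo) for v in values]


# Hypothetical groups with very different absolute ranges
groups = {"A": [10, 15, 20], "B": [100, 400, 500]}
relative = {name: to_relative_scale(vals) for name, vals in groups.items()}
```

After rescaling, both groups run from 0% to 100%, and only the position of the values within each range differs.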
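The three category-axis choices described above can be pictured as three ways of splitting rows into boxes. The following Python sketch is only an analogy: the row data, column names, and both helper functions are hypothetical, and it does not reflect how the product stores or processes data internally.

```python
from collections import defaultdict

# Hypothetical data table with one category column and two value columns
rows = [
    {"region": "North", "sales": 12.0, "cost": 7.0},
    {"region": "South", "sales": 9.0, "cost": 6.5},
    {"region": "North", "sales": 15.0, "cost": 8.0},
]


def group_by_category(rows, value_column, category_column=None):
    """Collect the values behind each box.

    Passing category_column=None mimics selecting (None):
    all rows end up in a single box.
    """
    boxes = defaultdict(list)
    for row in rows:
        key = "(None)" if category_column is None else row[category_column]
        boxes[key].append(row[value_column])
    return dict(boxes)


def group_by_column_names(rows, value_columns):
    """One box per value column, similar in spirit to (Column Names)."""
    return {col: [row[col] for row in rows] for col in value_columns}


per_region = group_by_category(rows, "sales", "region")  # one box per region
single_box = group_by_category(rows, "sales")            # one box for all rows
per_column = group_by_column_names(rows, ["sales", "cost"])
```

A category column with many unique values would produce one key, and therefore one box, per value, which is why such columns make the plot hard to read.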
- Reference points
You can decide which reference points to show in a box plot, change their color, and choose whether to present them as a marker or as a line.
- Adding measures to the statistics table
You can decide which statistical measures to show in a box plot.
- Comparison circles
Comparison circles are a way to show whether the mean values for various categories (boxes in the box plot) are significantly different from each other. Each circle is drawn with its center at the mean value of the box to which it corresponds. You can add comparison circles from the visualization properties panel in any client.
- Confidence interval (95%)
You can add a black line next to the box showing the confidence interval.
- Adding a distribution histogram to a box plot
You can add a histogram showing the distribution for each category to the box plot.
- Creating a violin plot
You can add a kernel density estimate (KDE) to the box plot to effectively turn it into a violin plot.
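To give a sense of what a kernel density estimate is, the following Python sketch builds a Gaussian KDE for one category and evaluates it on a grid; in a violin plot, the density at each value would determine the width of the shape. The function name `gaussian_kde`, the sample data, the Gaussian kernel, and the normal-reference bandwidth rule are all assumptions made for illustration, and the visualization may use a different kernel or bandwidth.

```python
import math
import statistics


def gaussian_kde(values, bandwidth=None):
    """Return a function estimating the density of `values`.

    Assumes at least two values that are not all identical.
    The default bandwidth uses the normal-reference rule of thumb,
    which is an illustrative choice.
    """
    n = len(values)
    if bandwidth is None:
        bandwidth = 1.06 * statistics.stdev(values) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump centered on each observed value
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


# Hypothetical values for a single category
values = [2, 4, 4, 5, 7, 9, 11, 12, 15]
density = gaussian_kde(values)

# Evaluate on a grid from -30 to 50 in steps of 0.5
step = 0.5
grid = [i * step for i in range(-60, 101)]
widths = [density(x) for x in grid]
```

A smaller bandwidth gives a bumpier outline that follows individual values more closely, while a larger one gives a smoother shape.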