Workspace Node: 2D Box Plots - Advanced Tab

In the 2D Box Plots workspace node dialog box, select the Advanced tab to display the following options.

Element Name Description
Variables Click this button to display a variable selection dialog box, in which you select the grouping (category) and dependent variables for the graph. If you select a category variable and more than one dependent variable for Regular box plots (see the graph formats below in the Graph type option description), a sequence of graphs (one for each dependent variable) is produced; if the format is set to Multiple, box plots for all selected variables are combined in one graph.
Note: If you select only one dependent variable and no category variable, a single box plot representing the distribution of the dependent variable is produced. If you select multiple dependent variables and no category variable, a single graph with one box plot for each selected variable is produced. The latter plot is useful for comparing the distributions of several variables.
Graph type Select the type of box plot to be plotted. Click a link below for a brief description of that type of graph.

Box Whiskers

High-Low Close

Boxes

Columns

Whiskers

You can also choose between two types of graph formats. Click the links below to learn more about the formats.

Regular

Multiple

Grouping intervals Specify a method of categorization for the selected variable(s). Each of the methods is discussed in Method of Categorization.
Change Variable Click this button to display a variable selection dialog box in which you can change the selection of the Dependent variables and/or the Grouping variables. If you change the variable(s) using this button, display of the variable selection under the Variables button changes accordingly.
Fit type You can choose to fit an equation to the mid-points in the box plots by selecting one of the predefined functions.

Linear

Polynomial

Logarithmic

Exponential

Distance Weighted

Negative Exponential Weighted

Spline

Lowess
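As an illustration of the Linear fit type, the following minimal sketch fits a least-squares line through the middle points of a sequence of boxes. It is not Statistica's implementation: the function name linear_fit, the use of box positions 1, 2, 3, ... as x values, and the sample middle points are all assumptions made for the example.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squares of x and cross-products of x and y around their means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Assumed example: four boxes at positions 1..4 with these middle points.
positions = [1, 2, 3, 4]
mid_points = [2.0, 4.0, 6.0, 8.0]
slope, intercept = linear_fit(positions, mid_points)
```

The other fit types (for example, Spline or Lowess) use different smoothing methods and are not covered by this sketch.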

Note: The graph icon above the Middle point group box represents the currently selected Graph type and the selection of values for the Middle point, Box, and Whiskers. The graph icon previews these three selections and the specific statistics that define the current box plot.
Middle point The options in this group box control the type of value and appearance of the middle point.
Value The middle point can be either the Mean, Median, Mean/Median (uses the Mean as the middle point, plus it has an added marker for the Median), or Median/Mean (uses the Median as the middle point, plus it has an added marker for the Mean) of the selected variable. The options available for the Box and Whisker depend on this selection.
Style Specify how the middle point will be represented (by a Line or Point).
Pooled variance

This check box is available when you select Mean as the Middle point Value. The setting of this check box determines how the standard deviations and standard errors (for the means) are computed from grouped data. When the Pooled variance check box is selected, Statistica computes the pooled within-group (category) variance for all groups (categories), and uses this value as an estimate of σ (sigma) when computing the standard errors for the means (see, for example, Milliken and Johnson, 1984). Specifically, the program computes the pooled within-group (category) variance as:

$$s_{\mathrm{pooled}}^2 = \frac{1}{n-k}\left[s_1^2 (n_1 - 1) + \dots + s_k^2 (n_k - 1)\right]$$

In this equation, k is the number of groups in the plot, $s_i^2$ is the variance in the i'th category or group, $n_i$ is the number of valid observations in the i'th category or group, and n is the overall number of valid observations in the plot.

The standard error of the mean for the i'th group is then computed as:

$$\mathrm{s.e.}(\mathrm{mean}_i) = \frac{s_{\mathrm{pooled}}}{\sqrt{n_i}}$$
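The pooled variance computation can be checked by hand with a short script. The following is a minimal sketch that follows the two formulas above using only the Python standard library. The function name pooled_standard_errors and the sample data are assumptions for illustration, and each group is assumed to contain at least two valid observations.

```python
from math import sqrt
from statistics import variance


def pooled_standard_errors(groups):
    """Return the pooled variance and the standard error of each group mean."""
    k = len(groups)                   # number of groups in the plot
    n = sum(len(g) for g in groups)   # overall number of valid observations
    # Sum of the within-group sample variances, each weighted by (n_i - 1).
    weighted = sum(variance(g) * (len(g) - 1) for g in groups)
    s2_pooled = weighted / (n - k)
    s_pooled = sqrt(s2_pooled)
    # Standard error of the mean for the i'th group: s_pooled / sqrt(n_i).
    return s2_pooled, [s_pooled / sqrt(len(g)) for g in groups]


# Assumed example data: two groups with 3 and 4 observations.
groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0]]
s2_pooled, std_errors = pooled_standard_errors(groups)
```

For this data, the within-group variances are 1 and 20/3, so the pooled variance is (1*2 + (20/3)*3) / (7 - 2) = 4.4.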

Multiple box layout

When the Multiple box plot style is selected for the Graph type format (see above), you can choose to display the boxes, whiskers, columns or box-whiskers in one of two styles:

Overlaid Series of box plots are displayed one on top of the other.
Shifted Series of box plots are displayed side by side.
Trim distrib. extremes Specify the percent of cases to be trimmed from each extreme (i.e., tail) of the distribution of cases for the selected variable. For example, if you specify 10% for a variable with 100 cases, Statistica removes the 10 lowest-value cases and the 10 highest-value cases from the distribution, and plots only the 80 middle cases. If you enter a value for Trim distrib. extremes for a mean-based box plot, so-called trimmed means will be plotted.
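The trimming example above can be sketched as follows. This is an illustration only, not Statistica's implementation: the function name trim_extremes is hypothetical, and rounding the number of trimmed cases down when it is not a whole number is an assumption.

```python
from statistics import mean


def trim_extremes(values, percent):
    """Drop `percent` percent of the cases from each tail of the sorted values."""
    ordered = sorted(values)
    # Number of cases removed per tail; rounding down is an assumption.
    cut = int(len(ordered) * percent / 100)
    return ordered[cut:len(ordered) - cut]


# 100 cases with values 1..100, trimmed by 10% at each tail.
cases = list(range(1, 101))
kept = trim_extremes(cases, 10)
trimmed_mean = mean(kept)   # mean of the 80 middle cases
```

Here the 10 lowest and 10 highest cases are dropped, leaving the 80 cases with values 11 through 90.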
Statistics Select statistics to be included as footnotes in the graph.
Kruskal-Wallis Select this check box to include the Kruskal-Wallis test statistic as a footnote on the graph.
F test and p (ANOVA) Select this check box to include the F statistic and p-value from the ANOVA as a footnote on the graph.
Box
  • If Median is specified as the Middle point, the range (box) can be represented by Percentiles or the Min-Max values of the selected variable, or a specified Constant value (when you want a fixed size box around the medians).
  • If Mean is specified as the Middle point, the range (box) can be defined in terms of standard deviations (Std Dev), standard errors (Std Error), Conf. Interval, Min-Max values of the selected variable, or a specified Constant value (when you want a fixed size box around the means).

You can also specify a Coefficient by which the selected range value is multiplied. Note that, except for unusual applications, the default value of the coefficient should not be changed if the box Value is Min-Max. If Median is specified as the Middle point and Percentiles is specified as the Value, the Coefficient entered must be between 0.01 and 50.0. If Mean is specified as the Middle point and Conf. Interval is specified as the Value, the Coefficient entered must be between 0.15 and 0.9999.

Whisker
  • If Median is specified as the Middle point, the range (whiskers) can be represented by Percentiles or the Min-Max values of the selected variable, a specified Constant value, or Non-outlier range (see Outliers and extremes).
  • If Mean is specified as the Middle point, the range (whiskers) can be defined in terms of standard deviations (Std Dev), standard errors (Std Error), or Min-Max values of the selected variable, or Non-outlier range.

If you select Non-outlier range, Statistica determines which points in the data set are outliers (see Outliers and extremes), and then uses the highest and lowest data points which are closest to the outliers (but are not outliers) as the whiskers in the plot.

You can also specify a Coefficient by which the selected range value will be multiplied. In most typical applications the coefficient should be set to 1 when the value of the whisker is Min-Max or Non-outlier range.
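The Non-outlier range option described above can be sketched as follows. The exact rule that Statistica uses to classify outliers is given in Outliers and Extremes and is not reproduced here. This sketch assumes a common convention in which a point is an outlier when it lies more than 1.5 box heights beyond the box. The function name non_outlier_range, the outlier_coef parameter, and the box bounds used in the example are all hypothetical.

```python
def non_outlier_range(values, box_low, box_high, outlier_coef=1.5):
    """Return the lowest and highest values that are not outliers.

    Assumes a point is an outlier when it lies more than
    outlier_coef * (box_high - box_low) beyond the box.
    """
    spread = box_high - box_low
    lower_fence = box_low - outlier_coef * spread
    upper_fence = box_high + outlier_coef * spread
    # Keep only the points inside the fences; the whiskers end at their extremes.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return min(inside), max(inside)


# Assumed example: box bounds of 4.5 and 8.5, with one clear outlier (30).
data = [2, 4, 5, 6, 7, 8, 9, 30]
whiskers = non_outlier_range(data, box_low=4.5, box_high=8.5)
```

In this example the fences fall at -1.5 and 14.5, so the value 30 is treated as an outlier and the whiskers extend from 2 to 9.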

Outliers You can elect to display none (select Off), only Outliers, only Extreme values, or both Outliers & Extremes in the box plot. For more details, see Outliers and Extremes.
Connect middle points Select this check box to connect the middle points of the box plots with a line. The middle point is the Mean or Median (or trimmed Mean or Median) of the selected variable, drawn in the selected style (point or line; see Middle point above). With this option selected, you can produce, for example, a line plot (or categorized line plot) of means with error bars, or of medians with quantile and min-max range bars. Selecting the Overlaid option button in the Multiple box layout group box aligns the respective plots from each line. This format is also used in some statistical procedures to create predefined graphical output.
Display raw data Select this check box to display the raw data points.
Jitter Use the options in this group box to jitter the data points, i.e., modify the original position of the data point from the center of the graph in order to more easily identify/brush overlapping points.
Off No jitter is applied to the raw data points, outliers, and extremes.
Sequential The jitter is applied sequentially to the raw data points, outliers, and extremes. The jitter is applied such that the first case in the data set is maximally shifted to the left and the last case is shifted maximally to the right.
Random The data point is randomly shifted within the available range.
Width Specify the maximum jitter width, defined as a percentage of the box width. Possible values range from 0 to 250.
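The jitter modes can be sketched as horizontal offsets applied to each data point. This is an illustration, not Statistica's implementation: the function name jitter_offsets and its parameters are hypothetical, and centering the jitter range on the box (half of the jitter width on each side) is an assumption.

```python
import random


def jitter_offsets(n_points, box_width, width_percent, mode="sequential", seed=None):
    """Return one horizontal offset per data point, in case order."""
    # Maximum jitter width, expressed as a percentage of the box width.
    max_width = box_width * width_percent / 100
    half = max_width / 2
    if mode == "off" or n_points == 0:
        return [0.0] * n_points
    if mode == "sequential":
        if n_points == 1:
            return [0.0]
        # First case is shifted maximally left, last case maximally right.
        step = max_width / (n_points - 1)
        return [-half + i * step for i in range(n_points)]
    if mode == "random":
        rng = random.Random(seed)
        # Each point is shifted randomly within the available range.
        return [rng.uniform(-half, half) for _ in range(n_points)]
    raise ValueError(f"unknown jitter mode: {mode}")


# Five points, a box 2.0 units wide, and a jitter width of 50%.
offsets = jitter_offsets(5, box_width=2.0, width_percent=50)
```

With these settings the maximum jitter width is 1.0 unit, so the sequential offsets are evenly spaced from -0.5 (first case) to 0.5 (last case).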

Options / C / W. See Common Options.

OK Click this button to accept all the specifications made in the dialog box and to close it. The results are placed in the Reporting Documents workspace node after running (updating) the project.