Details on Insert Binned Column


Select Insert > Binned Column....

[Figure: Insert Binned Column dialog]

Option

Description

Data table

Only available when more than one data table is present in the analysis and the dialog has been opened via the main menu.

Specifies the data table where the binned column will be inserted.

Column

Lists the columns available for binning. To search for a column, type in the field provided when the drop-down list is expanded. The values from the selected column will be sorted into several bins, or categories, based on your selections.

Specific limits

Allows you to explicitly enter the values of the limits to use for each bin.

Enter the values to use as the limits of your bins, separated by semicolons. For example, typing "20;30;40" results in the following bins:

x ≤ 20

20 < x ≤ 30

30 < x ≤ 40

40 < x
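
The limit rule above can be sketched in Python. This is only an illustration of the inclusive-upper-limit behavior shown in the list, not the application's own implementation; the function names parse_limits and bin_label are hypothetical.

```python
from bisect import bisect_left

def parse_limits(text):
    """Parse a semicolon-separated limits string such as "20;30;40"."""
    return sorted(float(part) for part in text.split(";") if part.strip())

def bin_label(x, limits):
    """Return a label for the bin that x falls into (upper limits inclusive)."""
    i = bisect_left(limits, x)  # number of limits strictly below x
    if i == 0:
        return f"x ≤ {limits[0]:g}"
    if i == len(limits):
        return f"{limits[-1]:g} < x"
    return f"{limits[i - 1]:g} < x ≤ {limits[i]:g}"

limits = parse_limits("20;30;40")
labels = [bin_label(v, limits) for v in (5, 20, 25, 40, 41)]
print(labels)
```

Three limits always give four bins: one below the first limit, one above the last, and one between each adjacent pair.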

Even intervals

Allows you to specify the desired number of bins and divides the value range into equal intervals.

This method works for all data types except string. The current data range is divided into the specified number of bins of equal width. Empty values remain empty in the new column. When linked data tables are loaded, new values are placed in one of the existing bins.
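
As a rough Python sketch of equal-width binning: the range between the smallest and largest value is split into the requested number of intervals, and empty values are passed through. Treating each upper limit as inclusive is an assumption carried over from the Specific limits option, and the function names are hypothetical.

```python
from bisect import bisect_left

def even_interval_limits(values, n_bins):
    """Return the n_bins - 1 inner limits that split [min, max] into equal widths."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    width = (hi - lo) / n_bins
    return [lo + width * k for k in range(1, n_bins)]

def assign_bins(values, limits):
    """Map each value to a 0-based bin index; empty (None) values stay empty."""
    # Assumption: a value equal to a limit belongs to the lower bin.
    return [None if v is None else bisect_left(limits, v) for v in values]

values = [0, 10, None, 25, 50, 75, 100]
limits = even_interval_limits(values, 4)
bins = assign_bins(values, limits)

# Values arriving later are placed with the limits already computed,
# so anything outside the original range lands in the first or last bin.
later = assign_bins([150, -5], limits)
```

Here the limits are computed once and reused, which is one way to read the statement that new values are placed in one of the existing bins.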

Even distribution of unique values

Allows you to specify the desired number of bins and divides the bins so that each one contains an equal number of unique values.

The suggested division works for all data types and is made so that the bins each contain an equal number of unique values. When the unique values cannot be divided evenly, the extra values are placed in the final bins. For example, if you have four unique values and ask for three bins, the first two bins contain one value each and the third bin contains two. Empty values remain empty in the new column. When linked data tables are loaded, the bin ranges are modified to fit the new data range.
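
The division can be sketched in Python as follows. This is one possible reading of "extra values are placed in the final bins" (the last bins each take one additional unique value), not a description of the actual implementation; unique_value_bins is a hypothetical name.

```python
def unique_value_bins(values, n_bins):
    """Map each distinct non-empty value to a 0-based bin index."""
    uniques = sorted({v for v in values if v is not None})
    base, extra = divmod(len(uniques), n_bins)
    mapping, start = {}, 0
    for b in range(n_bins):
        # Assumption: the last `extra` bins each take one additional unique value.
        size = base + (1 if b >= n_bins - extra else 0)
        for v in uniques[start:start + size]:
            mapping[v] = b
        start += size
    return mapping

values = ["a", "b", "b", None, "c", "d"]
mapping = unique_value_bins(values, 3)
# Empty (None) values stay empty; repeated values share the bin of their unique value.
binned = [None if v is None else mapping[v] for v in values]
```

With four unique values and three bins, "a" and "b" each get their own bin and "c" and "d" share the third, matching the example in the text.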

Based on standard deviation

Divides the range into sections as described by the selected standard deviation multipliers.

This method works for numeric columns only. Bin limits can be placed at any combination of ±0.5, ±1, ±2, ±3, and ±6 standard deviations from the average. In the example below, which corresponds to the multipliers 1 and 3, the range is divided into the following six subsections (µ denotes the average value for the column and s the corresponding standard deviation):

lower limit -> (µ-3s)

(µ-3s) -> (µ-s)

(µ-s) -> µ

µ -> (µ+s)

(µ+s) -> (µ+3s)

(µ+3s) -> upper limit

Empty values will be empty in the new column, and when loading linked data tables the standard deviation will be recalculated.
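
The six subsections listed above correspond to limits at µ ± s and µ ± 3s, with µ itself as the middle limit. A minimal Python sketch of how such limits could be computed is shown here. Using the sample standard deviation and treating each upper limit as inclusive are both assumptions, and stdev_limits is a hypothetical name.

```python
from bisect import bisect_left
from statistics import mean, stdev

def stdev_limits(values, multipliers):
    """Return sorted limits at mu - k*s, mu, and mu + k*s for each multiplier k."""
    present = [v for v in values if v is not None]
    mu = mean(present)
    s = stdev(present)  # assumption: sample standard deviation
    below = [mu - k * s for k in sorted(multipliers, reverse=True)]
    above = [mu + k * s for k in sorted(multipliers)]
    return below + [mu] + above

values = [2, 4, 4, None, 4, 5, 5, 7, 9]
limits = stdev_limits(values, [1, 3])  # 5 limits, so 6 bins
# Empty (None) values stay empty; other values get a 0-based bin index.
bins = [None if v is None else bisect_left(limits, v) for v in values]
```

Calling stdev_limits again after the data changes yields new limits, which mirrors the note that the standard deviation is recalculated when linked data tables are loaded.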

Substring

Groups the rows by the first or last characters of the values in the column to be binned. The exact number of characters to take into account must be supplied.

Example:

Suppose the column to be binned contains family names, beginning with Adams and ending with Winter. To bin the rows according to the first letter in the name, use the Substring option considering one character from the beginning. Bin names are generated from the substring, and if Ignore case is used, the bin names are all formatted as upper case.

Empty values will be empty in the new column, and when loading linked data tables the new values will be placed in suitable bins, taking the substrings into consideration.
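
The substring rule can be illustrated with a short Python sketch. The function and parameter names are hypothetical; the upper-casing under ignore_case follows the description above.

```python
def substring_bin(value, n_chars, from_end=False, ignore_case=False):
    """Return the bin name for value, based on its first or last n_chars characters."""
    if n_chars < 1:
        raise ValueError("n_chars must be at least 1")
    if value is None:
        return None  # empty values stay empty
    part = value[-n_chars:] if from_end else value[:n_chars]
    # With ignore_case, bin names are formatted as upper case.
    return part.upper() if ignore_case else part

names = ["Adams", "adler", None, "Baker", "Winter"]
first_letter = [substring_bin(n, 1, ignore_case=True) for n in names]
last_two = [substring_bin(n, 2, from_end=True) for n in names]
```

Because the bin name is derived from each value on its own, a value added later simply falls into the bin named after its substring, and a new bin appears if that substring has not been seen before.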

New column name

The name of the new, binned column.

See also:

What is Binning?

How to Use Binning

Binning Functions