Statistical Functions


Tip: The DISTINCT keyword can be used to return a result using the unique values only. For example, Avg(DISTINCT[Column]) would return the average of the unique values rather than the average of all values in the specified column. UniqueCount([Column]) is the equivalent of Count(DISTINCT[Column]).

Function

Description

Avg(Arg1, ...)

Returns the average (arithmetic mean) of the arguments. The arguments and the result are of type real. If one argument is given, then the result is the average of all rows. If more than one argument is given, then the result is the average for each row. Null arguments are ignored and do not contribute to the average.

Examples:

Avg([Column])

Avg(2,-3,4)           → 1

Avg(-1)                  → -1
Avg(1.5, -2, 3.5)   → 1
Avg(1, null, 3)       → 2
Avg(null)               → (Empty)

ChiDist(Arg1)

Returns the (upper tail) chi-square p-value of the argument.

Example:

ChiDist(x, deg_freedom)

ChiDist(7.377759, 2) =0.025

ChiInv(Arg1)

Returns the (upper tail) chi-square quantile value of the argument.

Example:

ChiInv(p, deg_freedom)

ChiInv(0.025, 2) =7.377759

Count(Arg1)

Calculates the number of non-empty values in the argument column, or, if no argument is specified, the total number of rows.

Example:

Count([Column])

CountBig(Arg1)

 

Calculates the number of non-empty values in the argument column, or, if no argument is specified, the total number of rows. This function returns a LongInteger.

Example:

CountBig([Column])

Covariance(Arg1, Arg2)

Calculates the covariance of two columns given as arguments.

Example:

Covariance([Column1], [Column2])

FDist(Arg1

Returns the upper tail F p-value of the argument.

Example:

FDist(x, deg_freedom1, deg_freedom2)

FDist(6.936728, 1, 10) =0.025

FInv(Arg1)

Returns the upper tail F quantile value of the argument.

Example:

FInv(p, deg_freedom1, deg_freedom2)

FInv(0.025, 1, 10) =6.936728

First(Arg1)

Returns the first valid value based on the physical order of the rows of data in the argument column.

Example:

First([Column])

GeometricMean()

Calculates the geometric mean value. If any input value is negative then the result will be "Empty". If any input value is equal to zero then the result will be zero.

Example:

GeometricMean([Sales])

IQR(Arg1)

Calculates the value difference Q3-Q1, or, the 75th percentile minus the 25th percentile. IQR is also referred to as the H-spread.

Example:

IQR([Column])

L95(Arg1)

Calculates the lower endpoint of the 95% confidence interval.

Example:

L95([Column])

Last(Arg1)

Returns the last valid value based on the physical order of the rows of data in the argument column.

Example:

Last([Column])

LAV(Arg1)

Calculates the lower adjacent value.

Example:

LAV([Column])

LIF(Arg1)

Calculates the lower inner fence. This is the threshold located at Q1 – (1.5*IQR).

Example:

LIF([Column])

LOF(Arg1)

Calculates the lower outer fence. This is the threshold located at Q1 – (3*IQR).

Example:

LOF([Column])

Max(Arg1, ...)

Calculates the maximum value. If one argument is given, then the result is the maximum for the entire column. If more than one argument is given, then the result is the maximum for each row. The argument and the result are of type real. Null arguments are ignored.

Examples:

Max([Column])

Max(-1)                 → -1
Max (1.5, -2, 3)    → 3
Max (1, null, 3)     → 3
Max (null)              → (Empty)

MeanDeviation(Arg1, ...)

Calculates the mean deviation value (average absolute deviation, AAD). If one argument is given, then the result is the mean deviation of all rows. If more than one argument is given, then the result is the mean deviation for each row.

Examples:

MeanDeviation([Column])

MeanDeviation(2,-3,4)        → 2.67

Median(Arg1)

Calculates the median of the argument. If one argument is given, then the result is the median of all rows. If more than one argument is given, then the result is the median for each row.

Examples:

Median([Column])

Median(2,-3,4)

MedianAbsoluteDeviation(Arg1, ...)

Calculates the median absolute deviation value (MAD). If one argument is given, then the result is the median absolute deviation of all rows. If more than one argument is given, then the result is the median absolute deviation for each row.

Examples:

MedianAbsoluteDeviation([Sales])

MedianAbsoluteDeviation(2,-3,4)

Min(Arg1, ...)

Calculates the minimum value. If one argument is given, then the result is the minimum for the entire column. If more than one argument is given, then the result is the minimum for each row. The argument and the result are of type real. Null arguments are ignored.

Examples:

Min([Column])

Min(-1)                 → -1
Min (1.5, -2, 3)    → -2
Min (1, null, 3)    → 1
Min (null)             → (Empty)

NormDist(Arg1)

Returns the (upper tail) normal p-value of the argument. If you do not specify them yourself, the default is mean=0 and standard deviation=1.

Example:

NormDist(x, mean, standard_dev)

NormDist(1.96) =0.025

NormInv(Arg1)

Returns the (upper tail) normal quantile value of the argument. If you do not specify them yourself, the default is mean=0 and standard deviation=1.

Example:

NormInv(p, mean, standard_dev)

NormInv(0.025) =1.96

NthLargest(Arg1, Arg2)

The nth largest value. The first argument is the column to analyze and the second argument is the value of n.

If n is larger than the number of values in the column, then the smallest value is returned.

Example:

NthLargest([Column], 10)

NthSmallest(Arg1, Arg2)

The nth smallest value. The first argument is the column to analyze and the second argument is the value of n.

If n is larger than the number of values in the column, then the largest value is returned.

Example:

NthSmallest([Column], 10)

Outliers(Arg1)

Outer value count. Calculates the count of values that are greater than the upper adjacent value or lower than the lower adjacent value.

Example:

Outliers([Column])

P10(Arg1)

The 10th percentile is the value at which 10 percent of the data values are equal to or lower than the value.

Example:

P10([Column])

P90(Arg1)

The 90th percentile is the value at which 90 percent of the data values are equal to or lower than the value.

Example:

P90([Column])

PctOutliers(Arg1)

Outer value percentile. Calculates the percent of values that are greater than the upper adjacent value or lower than the lower adjacent value.

Example:

PctOutliers([Column])

Percent(Arg1, Arg2)

The percent is the value calculated a certain percent above the minimum value within the value range (max value - min value). The first argument is the column to analyze and the second argument is the percent.

Example:

Percent([Column], 15.0)

Percentile(Arg1, Arg2)

The percentile is the value at which a certain percent of the data values are equal to or lower than the value. The first argument is the column to analyze and the second argument is the percent.

Example:

Percentile([Column], 15.0)

Q1(Arg1)

Calculates the first quartile.

Example:

Q1([Column])

Q3(Arg1)

Calculates the third quartile.

Example:

Q3([Column])

Range(Arg1)

The range between the largest and the smallest value in the column.

The result will be presented as a real or a timespan, depending on the data type of the argument.

Example:

Range([Column])

StdDev(Arg1)

Calculates the standard deviation.

Example:

StdDev([Column])

StdErr(Arg1)

Calculates the standard error.

Example:

StdErr([Column])

TDist(Arg1)

Returns the (upper tail) t p-value of the argument.

Example:

TDist(x, deg_freedom)

TDist(4.302653, 2) =0.025

TERR_Binary


Calls the TIBCO Enterprise Runtime for R engine and returns an output of the data type specified by the function name.

The first argument is a script and the following arguments are the arguments to the script. Requires at least one argument in addition to the script.

Inputs must be placed in variables named input1, input2, ... inputN.

The output must be placed in a variable called output.

Returns a column containing the same number of rows as the input.

Examples:

TERR_Real("output <- input1*100 + input2", [Record No], [Sales])

TERR_String("output <- input1", [String Column])

TERR_Boolean

See TERR_Binary above.

TERR_DateTime

See TERR_Binary above.

TERR_Integer

See TERR_Binary above.

TERR_LongInteger

See TERR_Binary above.

TERR_Real

See TERR_Binary above.

TERR_SingleReal

See TERR_Binary above.

TERR_String

See TERR_Binary above.

TrimmedMean(Arg1, Arg2)

Calculates the trimmed mean value (trimmed average). The first argument is the column to analyze and the second argument is, in percent, how many values to exclude from the calculation. If the trim value is set to 10%, then the highest 5%  and the lowest 5% of the values are excluded from the calculated mean.

Example:

TrimmedMean([Sales], 10)

U95(Arg1)

Calculates the upper endpoint of the 95% confidence interval.

Example:

U95([Column])

UAV(Arg1)

Calculates the upper adjacent value.

Example:

UAV([Column])

UIF(Arg1)

Calculates the upper, inner fence. This is the threshold located at Q3 + (1.5*IQR).

Example:

UIF([Column])

UniqueCount(Arg1)

Calculates the number of unique, non-empty values in the argument column.

Example:

UniqueCount([Column])

UOF(Arg1)

Calculates the upper, outer fence. This is the threshold located at Q3 + (3*IQR).

Example:

UOF([Column])

ValueForMax(Arg1, Arg2)

Returns the value of column 2 for the maximum value of column 1.

If there are more than one of the column 1 maximum value, then the result will be the value for the first max row.

Example:

ValueForMax([Column 1], [Column 2])

ValueForMin(Arg1, Arg2)

Returns the value of column 2 for the minimum value of column 1.

If there are more than one of the column 1 minimum value, then the result will be the value for the first min row.

Example:

ValueForMin([Column 1], [Column 2])

Var(Arg1)

Calculates the variance.

Example:

Var([Column])

WeightedAverage(Arg1, Arg2)

Calculates the weighted average of two columns. Arg1 is the weight column and Arg2 is the value column.

Example:

WeightedAverage([Column1],[Column2])

See also:

Binning functions

Conversion functions

Date and Time functions

Logical functions

Math functions

OVER functions

Property functions

Spatial functions

Ranking functions

Text functions