PARTITION_AGGR: Creating Rolling Calculations

How to:

Reference:

Using the PARTITION_AGGR function, you can generate rolling calculations based on a block of rows from the internal matrix of a TABLE request. In order to determine the limits of the rolling calculations, you specify a partition of the data based on either a sort field or the entire TABLE. Within either type of break, you can start calculating from the beginning of the break or a number of rows prior to the current row. You can stop the rolling calculation at the current row or the end of the partition.

By default, the field values used in the calculations are the summed values of a measure in the request. Certain prefix operators can be used to add a column to the internal matrix and use that column in the rolling calculations. The rolling calculation can be SUM, AVE, CNT, MIN, MAX, FST, or LST.

Syntax: How to Generate Rolling Calculations Using PARTITION_AGGR

PARTITION_AGGR([prefix.]measure,{sortfield|TABLE},from,to,operation)

where:

prefix.

Defines an aggregation operator to apply to the measure before using it in the rolling calculation. Valid operators are:

  • SUM. which calculates the sum of the measure field values. SUM is the default operator.
  • CNT. which calculates a count of the measure field values.
  • AVE. which calculates the average of the measure field values.
  • MIN. which calculates the minimum of the measure field values.
  • MAX. which calculates the maximum of the measure field values.
  • FST. which retrieves the first value of the measure field.
  • LST. which retrieves the last value of the measure field.

Note: The operators PCT., RPCT., TOT., MDN., and DST. are not supported. COMPUTEs that reference those unsupported operators are also not supported.

measure

Is the measure field to be aggregated. It can be a real field in the request or a calculated value generated with the COMPUTE command, as long as the COMPUTE does not reference an unsupported prefix operator.

sortfield

Is a BY or ACROSS field that defines the boundary of the partition. Operations will not cross a boundary. In the request the BY HIGHEST phrase to sort high-to-low is supported. ACROSS COLUMNS AND is also supported, but BY ROWS OVER and FOR are not supported.

Specifying TABLE as the boundary makes the partition boundary the entire internal matrix.

For example, if the sort is BY YEAR BY MONTH, with data from both 2014 and 2015, specifying the boundary as YEAR means that January 2015 - 2 will be valued as zero (0) or MISSING, as two months prior to January 2015 would cross the YEAR boundary. However, specifying TABLE as the boundary and requesting - 2 months would return the data for November 2014.

from

Identifies the starting point for the rolling calculation. Valid values are:

  • -n, which starts the calculation n rows back from the current row.
  • B, which starts the calculation at the beginning of the current sort break (the first line with the same sort field value as the current line).
to

Identifies the ending point of the rolling calculation. Valid values are:

  • C, which ends the rolling calculation at the current row in the internal matrix.
  • E, which ends the rolling calculation at the end of the sort break (the last line with the same sort value as the current row.)
operation

Specifies the rolling calculation used on the values in the internal matrix. Supported operations are:

  • SUM. which calculates a rolling sum.
  • AVE. which calculates a rolling average.
  • CNT. which counts the rows in the partition.
  • MIN. which returns the minimum value in the partition.
  • MAX, which returns the maximum value in the partition.
  • FST. which returns the first value in the partition.
  • LST. which returns the last value in the partition.

The calculation is performed prior to any WHERE TOTAL tests, but after any WHERE_GROUPED tests.

Example: Calculating a Rolling Average

The following request calculates a rolling average of the current line and the previous line in the internal matrix within the quarter.

TABLE FILE WF_RETAIL_LITE
SUM COGS_US 
COMPUTE AVE1/D12.2M = PARTITION_AGGR(COGS_US, TIME_QTR, -1, C, AVE);
BY BUSINESS_REGION
BY TIME_QTR
BY TIME_MTH
WHERE BUSINESS_REGION EQ 'North America' OR 'South America'
ON TABLE SET PAGE NOLEAD
END

The output is shown in the following image. Within each quarter, the first average is just the value from Q1, as going back 1 would cross a boundary. The second average is calculated using the first two rows within that quarter, and the third average is calculated using rows 2 and 3 within the quarter.

The following changes the rolling average to start from the beginning of the sort break.

COMPUTE AVE1/D12.2M = PARTITION_AGGR(COGS_US, TIME_QTR ,B, C, AVE);

The output is shown in the following image. Within each quarter, the first average is just the value from Q1, as going back would cross a boundary. The second average is calculated using the first two rows within that quarter, and the third average is calculated using rows 1 through 3 within the quarter.

The following command uses the partition boundary TABLE.

COMPUTE AVE1/D12.2M = PARTITION_AGGR(COGS_US, TABLE, B, C, AVE);

The output is shown in the following image. The rolling average keeps adding the next row to the average until a break in the business region sort field.

Reference: Usage Notes for PARTITION_AGGR

  • Fields referenced in the PARTITION_AGGR parameters but not previously mentioned in the request will not be counted in column notation or propagated to HOLD files.
  • Using the WITHIN phrase for a sum is the same as computing PARTITION_AGGR on the WITHIN sort field from B (beginning of sort break) to E (end of sort break) using SUM, as in the following example.
    TABLE FILE WF_RETAIL_LITE
    SUM COGS_US WITHIN TIME_QTR AS 'WITHIN Qtr'
    COMPUTE PART_WITHIN_QTR/D12.2M = PARTITION_AGGR(COGS_US, TIME_QTR, B, E, SUM); 
    BY BUSINESS_REGION AS Region
    BY TIME_QTR
    BY TIME_MTH
    WHERE BUSINESS_REGION EQ 'North America' OR 'South America'
    ON TABLE SET PAGE NOPAGE
    END

    The output is shown in the following image.

    With other types of calculations, the results are not the same. For example, the following request calculates the average within quarter using the WITHIN phrase and the average with quarter using PARTITION_AGGR.

    TABLE FILE WF_RETAIL_LITE
    SUM  COGS_US AS Cost 
    CNT.COGS_US AS Count AVE.COGS_US WITHIN TIME_QTR AS 'Ave Within'
    COMPUTE PART_WITHIN_QTR/D12.2M = PARTITION_AGGR(COGS_US, TIME_QTR, B, E, AVE); 
    BY BUSINESS_REGION AS Region
    BY TIME_QTR 
    ON TIME_QTR SUBTOTAL COGS_US CNT.COGS_US
    BY TIME_MTH
    WHERE BUSINESS_REGION EQ 'North America'
    ON TABLE SET PAGE NOPAGE
    END

    The output is shown in the following image. The average using the WITHIN phrase divides the total cost for the quarter by the total count of instances for the quarter (for example, $435,992.00/1514 =$287.97), while PARTITION_AGGR divides the total cost for the quarter by the number of report rows in the quarter (for example, $435,992.00/3 = $145,330.67).

  • If you use PARTITION_AGGR to perform operations for specific time periods using an offset, for example, an operation on the quarters for different years, you must make sure that every quarter is represented. If some quarters are missing for some years, the offset will not access the correct data. In this case, generate a HOLD file that has every quarter represented for every year (you can use BY QUARTER ROWS OVER 1 OVER 2 OVER 3 OVER 4) and use PARTITION_AGGR on the HOLD file.