Aggregate Functions

Aggregate functions are used in the select list of SELECT statements and use the values of multiple rows to calculate a single value.

ActiveSpaces supports the following aggregate functions:
  • COUNT
  • SUM
  • MIN
  • MAX
  • AVG
  • GROUP_CONCAT
For more information on aggregate functions, see the section on "Aggregate Functions" in Functions Used in The Select List .

Performance of Aggregate Functions

When using an aggregate function as a value expression, the performance of the aggregation depends on how many rows are used in the aggregation. A query that includes aggregation, but does not contain a WHERE clause performs aggregation using all the rows from the table but must be avoided, if possible.

For the best aggregation performance, the column used as an argument to your aggregation function must have an index defined, which uses that column, and your query must include a WHERE clause that causes that index to be selected for the query. For example, if you have a secondary index named lastname_idx defined for the lastname column, the following queries use lastname_idx to reduce the number of rows in the query on which aggregation is performed:
 SELECT COUNT(lastname) FROM mytable WHERE lastname LIKE 'B%'
SELECT lastname, COUNT(lastname) FROM mytable WHERE lastname LIKE 'B%' GROUP BY lastname
For more information on query performance see Tips on Constructing an Efficient WHERE Clause.
Timeouts
On large data sets, calculating the aggregate results can take a long time because large numbers of rows must be processed before the computation can be completed. In such cases, when creating or executing the statement, it is advisable to set TIBDG_STATEMENT_PROPERTY_DOUBLE_FETCH_TIMEOUT to a value that prevents timeouts. For example, if the client request timeout is 5 seconds, a more complex query can set the fetch timeout to a value larger than 5 seconds to prevent a timeout. For more information on Statement properties, see Properties.
Floating Point Calculations in Aggregate Functions
By default, floating point calculations are inexact and the order in which they are carried out can subtly affect the results. When executing queries involving aggregate functions, ActiveSpaces processes the data from the different copysets in parallel leading to slight variations in the results of floating point calculations. Even if the variations might be very small, they can be amplified by rounding the decimal value. For example, the average of a column can be calculated as 80.2849999999999966 one time and 80.2850000000000108 another time. This variation is clearly very small (1.4e-14), but if the results are rounded to 2 decimal places, that is 80.28 and 80.29, the variation appears greater than that.
Proxy Binding
Applications that run large queries that use aggregation must work with their administrator to determine the best approach for binding to proxies in the system. Your administrator can recommend a proxy binding strategy that keeps computation intensive queries from interfering with queries that need to run more quickly by either isolating or distributing your queries across the proxies available to you.
Limitations of the Aggregate Functions
The current support for aggregate functions requires that the argument to each aggregate function must be a column name. For example:
AVG(col1)
The aggregate function argument cannot be an expression or function. For example, the following function arguments are invalid uses of aggregate functions:
AVG(col1 + col2)
MIN(trim(col1))
For SQL keywords that cannot be used with aggregation functions, see Unsupported SQL Features.