Aggregation Rules

You can define aggregation rules to optimize the performance of aggregation (GROUP BY) queries.

After you create and enable aggregation rules, the system pre-computes the aggregations as and when log events arrive in the system. The aggregations are in turn used to compute search query results. As time progresses, the aggregations are pre-computed at real time, providing query results much faster than the queries that were not optimized. Such optimized aggregation queries can be vital for creating responsive dashboards.

An aggregation rule is defined as a regular EQL or SQL query that contains a GROUP BY statement and aggregated projections. For details, see GROUP BY Statement.

Note: Advanced Features must be enabled to use aggregation rules.

For example, consider the data model for BW process information. This data model has columns such as Domain, AppSpace, Application, Process, Activity, exectionTime, success, and failure.

Consider the aggregation query as:
USE BWProcesses | GROUP BY Domain, AppSpace,  Application, Process COLUMNS avg(executionTime), max(failure), min(success)
For this search query, the aggregation rule maintains the aggregate metrics avg(executionTime), max(failure), min(success) for each combination of the GROUP BY fields: Domain, AppSpace, Application, and Process. The computation happens on every relevant log event ingested in the system at real time.

When you use an aggregation query for which there is a matching aggregation rule, the query is a candidate for optimization. If pre-computed metrics are already present, the results are returned without any query-time computation.

Such optimized aggregation queries are very sensitive to time filter conditions. In the query matching an aggregation rule, the time filter condition must be greater than the rule creation time and the rule retention period. For example, suppose the rule was created at 10 am on Jan 1, with a retention period of 2 weeks. After 2 days, on Jan 3, if you query for aggregated data which matches the rule but with filter condition of -2w, the result should still be un-optimized because 2 weeks from the time of search will result in time range starting from Dec 19, but the aggregation rule was not created on Dec 19.

Aggregation rules also support grouping by time aggregates. The supported time aggregations are:

  • Group by Years: using the scalar function years(sys_eventTime)
  • Group by Months of the year: using the scalar function months(sys_eventTime)
  • Group by Weeks of the month: using the scalar function weeks(sys_eventTime)
  • Group by Days of the week: using the scalar function days(sys_eventTime)
  • Group by Hours of the day: using the scalar function hours(sys_eventTime)
  • Group by Minutes of the hours: using the scalar function minutes(sys_eventTime)
  • Group by Seconds of the minute: using the scalar function seconds(sys_eventTime)
For example:
USE BWProcesses | GROUP BY Domain, AppSpace,  Application, Process, days(sys_eventTime), hours(sys_eventTime), minutes(sys_eventTime) COLUMNS avg(executionTime), max(failure), min(success)
This rule computes the metrics avg(executionTime), max(failure), min(success) for each combination of Domain, AppSpace, Application, and Process across each possible minute of the hour, each hour, and each day.

Such queries retrieve the time series data for trend analysis. For example, avg(executionTime), max(failure), min(success) for AppSpace=‘appSpace1’ AND Process='Process1' AND Domain='domain1' aggregated for each hour for the day='Monday'.

This query can be used in the Advanced Dashboard to create a time series chart showing the trend, for example, average execution time for a process across hours of the day.

Tip: After creating an aggregation rule, a filter Blok is automatically saved in the system. You can use the filter Blok in Advanced Dashboards instead of typing the entire query to save on query response time.
An aggregation query can be equal to or subset of another rule query if all of the following conditions are met:
  • The FROM clauses are equal
  • The GROUP BY non-time aggregates are exactly equal
  • The time aggregates are equal or subset
  • The projection aggregates of the two queries are equal or subset

Examples

  • Rule1 query is:
    use LogLogic_Monitor_Cpu | GROUP BY ll_nodeId, weeks(sys_eventTime), days(sys_eventTime), hours(sys_eventTime) COLUMNS max(ll_systemCPU), max(ll_processCPU), avg(ll_systemCPU), avg(ll_processCPU)
    If a search query 'Search1' query contains the USE clause as
    use LogLogic_Monitor_Cpu
    then the USE clause in Search1 is equal to that of Rule1.
  • Rule1 contains the GROUP BY clause as: ll_nodeId, weeks(sys_eventTime), days(sys_eventTime), hours(sys_eventTime) and Search1 contains the GROUP BY clause as: ll_nodeId, weeks(sys_eventTime), then the Search1 GROUP BY clause is a subset of the Rule1 GROUP BY clause.
  • Rule1 contains the projection aggregates as: max(ll_systemCPU), max(ll_processCPU), avg(ll_systemCPU), avg(ll_processCPU) and Search1 contains the projection aggregates as: max(ll_systemCPU),avg(ll_processCPU), then the Search1 projection aggregates are a subset of Rule1.

Results of an Aggregation Query

When the query that you type on the Advanced Search page matches an aggregation rule and the time filter value is within the optimization criteria of the aggregation rule, optimized results are displayed.

When the query on the Advanced Search page matches an aggregation rule but the time range is not within the time optimization criteria of the aggregation rule, optimized results are not displayed. The notification The execution of this query was not optimized due to mismatching optimization criteria is displayed on the screen.

When the query on the Advanced Search page matches an aggregation rule but the aggregation is not possible, optimized results are not displayed. The notification Query Optimization unavailable at this time is displayed on the screen.