Aggregate Processor

Description

With the Aggregate Processor you can aggregate metrics and events for specified fields, and then evaluate those aggregations using defined conditions to send alerts when those conditions are met.

Use

Evaluate Metric or Log event fields using any aggregation strategy such as Sum, Average, Min, or Max and trigger alerts based on specified conditions.

Tumbling windows are a series of fixed-size, non-overlapping, contiguous time intervals. For example, with a five-minute tumbling window, elements with timestamp values in [0:00:00-0:05:00) fall in the first window, and elements with timestamp values in [0:05:00-0:10:00) fall in the second window.
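
The half-open intervals above can be illustrated with a little arithmetic. This is only a sketch: tumblingWindowFor is a hypothetical helper, not part of the product, and it assumes timestamps are expressed in seconds from an arbitrary origin.

```javascript
// Hypothetical helper (not a Mezmo API): returns the tumbling window that
// contains a timestamp. Windows are half-open intervals [start, end).
function tumblingWindowFor(timestampSec, windowSizeSec) {
  const index = Math.floor(timestampSec / windowSizeSec);
  const start = index * windowSizeSec;
  return { index, start, end: start + windowSizeSec };
}

const fiveMinutes = 5 * 60;
// 0:04:59 still belongs to the first window [0:00:00-0:05:00).
const w1 = tumblingWindowFor(4 * 60 + 59, fiveMinutes);
// 0:05:00 is excluded from the first window and opens the second one.
const w2 = tumblingWindowFor(5 * 60, fiveMinutes);
console.log(w1, w2);
```

Because the end of each interval is exclusive, every timestamp lands in exactly one window.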

A sliding window has a fixed time length and moves forward, or “slides”, at a time interval smaller than the window’s length. For example, a sliding window can be five minutes long and slide every minute, so each window captures five minutes of data. The slide length is not user-configurable; the system automatically calculates an appropriate slide based on the window size.
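
To see why sliding windows overlap, the sketch below lists every window that contains a given timestamp. slidingWindowsFor is a hypothetical helper, and the one-minute slide is taken from the example above purely for illustration; the slide the system actually calculates may differ.

```javascript
// Hypothetical helper (not a Mezmo API): lists the sliding windows that
// contain a timestamp, assuming windows start at multiples of the slide.
// All values are in seconds from an arbitrary origin.
function slidingWindowsFor(timestampSec, lengthSec, slideSec) {
  const found = [];
  // The latest window that can contain the timestamp starts at or before it.
  const lastStart = Math.floor(timestampSec / slideSec) * slideSec;
  // Walk backwards while the window [start, start + length) still covers it.
  for (let start = lastStart; start > timestampSec - lengthSec; start -= slideSec) {
    found.push({ start, end: start + lengthSec });
  }
  return found.reverse();
}

// A five-minute window sliding every minute: an event at 0:07:30
// belongs to five overlapping windows, starting at 0:03:00 through 0:07:00.
const overlapping = slidingWindowsFor(7 * 60 + 30, 5 * 60, 60);
console.log(overlapping);
```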

Each Processor input to the Aggregate Processor is a single thread. Inputs from three or more Processors can result in slower processing times.

Option: Group By
Description: Select one or more field names. The processor aggregates the data based on a unique set of field values. Uses the Name, Namespace, and Tag fields for the grouping.
Example: .app or .tags.cluster
Option: Evaluate
Description: Choose the evaluation method for the fields. Note that these evaluation methods apply only to metrics. For logs, you will need to create a custom evaluation method.

  • Add
  • Sum
  • Minimum
  • Maximum
  • Average
  • Set intersection
  • Distribution concatenation
  • Custom
Option: Window Type
Description:
  • Tumbling
  • Sliding
Example: Sliding

Option: Tumbling | Interval in seconds
Description: Range: 1 minute to 25 hours (60 to 90,000 seconds)
Example: 1800

Option: Sliding | Interval
Example: 1800

Option: Sliding | Minimum Duration
Example: 180
Option: Condition
Description: The conditions that trigger an alert based on the aggregated or evaluated value. Two types of alerting are supported:

Threshold Alert

Compare the aggregated value to a specified threshold value.

Comparison operators:

  • greater
  • greater_or_equal
  • less
  • less_or_equal

Example: .value.value <greater_or_equal> 90

Change Alert

Set conditions based on how much the aggregated value changed compared to the prior evaluation. The change can be measured as a percent change or as an absolute value change.

Percent change operators:

  • percent_change_greater
  • percent_change_greater_or_equal
  • percent_change_less
  • percent_change_less_or_equal

Value change operators:

  • value_change_greater
  • value_change_greater_or_equal
  • value_change_less_or_equal

Examples:

.value.value <percent_change_greater> 50

.value.value <value_change_greater> 200
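
The two alert types can be sketched as plain comparisons. This is an illustration, not the processor's implementation: evaluateCondition is a hypothetical function, and it assumes that a change is the signed difference from the prior evaluation and that percent change is measured relative to the prior value. The processor may define these differently.

```javascript
// Basic comparisons shared by threshold and change alerts.
const comparisons = new Map([
  ["greater", (v, t) => v > t],
  ["greater_or_equal", (v, t) => v >= t],
  ["less", (v, t) => v < t],
  ["less_or_equal", (v, t) => v <= t],
]);

// Hypothetical evaluator (not a Mezmo API). Assumption: "change" is the
// signed difference between the current and the prior aggregated value.
// Note: this sketch accepts any prefix + comparison combination, including
// ones that the operator lists above do not mention.
function evaluateCondition(operator, threshold, current, previous) {
  let name = operator;
  let observed = current;
  if (operator.startsWith("percent_change_")) {
    name = operator.slice("percent_change_".length);
    // A previous value of 0 yields Infinity or NaN; handling is left out.
    observed = ((current - previous) / Math.abs(previous)) * 100;
  } else if (operator.startsWith("value_change_")) {
    name = operator.slice("value_change_".length);
    observed = current - previous;
  }
  const compare = comparisons.get(name);
  if (!compare) throw new Error(`unknown operator: ${operator}`);
  return compare(observed, threshold);
}

// Threshold alert: 92 >= 90, so the condition is met.
const thresholdHit = evaluateCondition("greater_or_equal", 90, 92);
// Change alert: 100 -> 180 is an 80% increase, which exceeds 50.
const percentHit = evaluateCondition("percent_change_greater", 50, 180, 100);
// Change alert: 100 -> 250 is a change of 150, which does not exceed 200.
const valueHit = evaluateCondition("value_change_greater", 200, 250, 100);
console.log(thresholdHit, percentHit, valueHit);
```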

Custom Option

If the event isn't an OTEL metrics event (for example, the metric value is not in the path .value.value), you can aggregate the value with custom aggregation logic based on Mezmo's JavaScript framework. The topic for the Script Execution Processor provides more details about Mezmo’s JavaScript framework.

For example, if you are looking to sum the error_count property of all log events, you would use this script:

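A minimal sketch of such a script, assuming the framework calls a function with the running accum object and the current event and expects the updated accum back. The function name processEvent and its signature are assumptions here; the Script Execution Processor topic is the place to confirm the exact contract. The loop at the bottom is a local stand-in for the processor, using made-up events.

```javascript
// Assumed script shape: the function name and signature are assumptions.
function processEvent(accum, event) {
  // Add the current event's error_count to the running total.
  accum.error_count += event.error_count;
  return accum;
}

// Local illustration only, with made-up log events. The running accum
// starts as the first event in the window.
const events = [{ error_count: 2 }, { error_count: 5 }, { error_count: 1 }];
let accum = { ...events[0] };
for (const event of events.slice(1)) {
  accum = processEvent(accum, event);
}
console.log(accum.error_count);
```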

With a Custom aggregation strategy, note that the initial value of the accum object is the first event in the window. Your script is executed only for subsequent events in the window. Each time the script runs within the window, it is called with the previous value of accum and the current event. When the window elapses, the value of accum is emitted as the aggregated event.

For example, if you are looking to aggregate a count of events into a new field:

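A hedged sketch of a counting script follows. As before, the function name processEvent and its signature are assumptions, the field name count is illustrative, and runWindow is a hypothetical local harness that only mimics the window behavior described above (accum starts as the first event, and the script runs for each later event).

```javascript
// Assumed script shape: the function name and signature are assumptions.
function processEvent(accum, event) {
  // The first event becomes accum without the script running, so account
  // for it the first time the script is called.
  if (accum.count == null) {
    accum.count = 1;
  }
  accum.count += 1;
  return accum;
}

// Hypothetical harness (not a Mezmo API) mimicking one window.
function runWindow(events, script) {
  let accum = { ...events[0] };
  for (const event of events.slice(1)) {
    accum = script(accum, event);
  }
  return accum;
}

const result = runWindow([{ msg: "a" }, { msg: "b" }, { msg: "c" }], processEvent);
const single = runWindow([{ msg: "only" }], processEvent);
console.log(result.count, single.count);
```

Under these assumptions, a window that contains a single event never runs the script, so the emitted event would carry no count field. Downstream logic may need to treat a missing count as 1.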

Metadata Fields

The Aggregate Processor adds these metadata fields when an event is emitted.

Metadata Field: .metadata.aggregate.flush_timestamp
Description: The time when the Processor emitted the aggregation event. This could be due to the following:

  • Window time has been completed
  • Triggered by the condition

Metadata Field: .metadata.aggregate.start_timestamp
Description: Aggregation window start time

Metadata Field: .metadata.aggregate.end_timestamp
Description: Aggregation window end time

Metadata Field: .metadata.aggregate.event_count
Description: Number of events aggregated
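
As an illustration of how these fields might be used downstream, the sketch below derives the window length and event rate from them. The values are made up, summarizeWindow is a hypothetical helper, and the timestamps are assumed to be epoch milliseconds; the actual timestamp format should be checked against real output.

```javascript
// Hypothetical metadata for one emitted event. All values are illustrative
// and the epoch-millisecond format is an assumption.
const metadata = {
  aggregate: {
    start_timestamp: 1700000000000,
    end_timestamp: 1700001800000,
    flush_timestamp: 1700001800000,
    event_count: 3600,
  },
};

// Hypothetical helper (not a Mezmo API): window length and event rate.
function summarizeWindow(meta) {
  const { start_timestamp, end_timestamp, event_count } = meta.aggregate;
  const windowSeconds = (end_timestamp - start_timestamp) / 1000;
  return { windowSeconds, eventsPerSecond: event_count / windowSeconds };
}

const summary = summarizeWindow(metadata);
console.log(summary);
```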

Detecting Alert vs Aggregation Output

You can use these fields to determine whether an event was emitted because of a threshold breach or as a normal aggregation event.

An alert is triggered if


Normal Aggregation Event if
