Configuring alerts based on the index rate (the retention and storage rate) of your log data provides a way to monitor anomalous behavior in your systems, such as sudden spikes in volume. The Index Rate Alert feature in Mezmo provides actionable insight into which applications or sources produced the data spike, as well as any recently added sources. Index rate alerts can also help managers who are responsible for budgets analyze and predict storage costs.
The alert message includes information about the top 20 sources and apps that have had the largest index rate growth in the past 2 hours, as well as a list of the 20 newest sources added. The alert also includes a link to download generated lists of all applications and sources. You can view the “Manage Usage” Dashboard to download these lists at any time.
The Index Rate Alert page gives you an overview of your account's current rate of indexing, and also the current rate of ingestion, so that you can see the ratio between the log lines that Mezmo is ingesting versus the number of log lines that are being stored. This can answer the question of "how much of what I ingest actually gets indexed?"
There are two types of index rate alerts:
- Max lines/s: an “excessive flow rate” alert which measures, in lines per second, the log lines indexed (stored).
- Max z-score: an alert for anomalous flow rates, based on the number of standard deviations away from the "normal" baseline of the last 30 days.
To configure an alert you define a threshold value that, when surpassed, triggers the alert. In order to define that threshold, it is helpful to understand what is considered the normal, or standard, index rate for your environment.
Mezmo collects index metrics for a rolling 30-day period, and calculates the index rate for each hour in that 30-day time frame. Knowing the "standard" hourly index rate allows Mezmo to identify spikes, or deviations, in volume relative to standard index volume. The severity of a spike is measured in "standard deviations"; that is, how far away from the normal index rate is the spike?
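As an illustration of how a spike can be expressed in standard deviations, here is a minimal Python sketch. It is not Mezmo's implementation: the function name `z_score`, the sample rates, and the use of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev


def z_score(current_rate: float, historical_rates: list[float]) -> float:
    """Return how many standard deviations current_rate is from the historical mean.

    historical_rates stands in for the hourly index rates of the rolling
    30-day window (hypothetical input; the real data lives inside Mezmo).
    """
    baseline = mean(historical_rates)
    # Assumption: population standard deviation. The source does not say
    # whether a population or sample formula is used.
    spread = pstdev(historical_rates)
    if spread == 0:
        # A perfectly flat history has no meaningful deviation scale.
        return 0.0
    return (current_rate - baseline) / spread


# Hypothetical hourly index rates in lines/s (a real window would hold ~720 values).
history = [90.0, 110.0, 90.0, 110.0]
print(z_score(130.0, history))  # 3 standard deviations above the baseline of 100
```

With a history that averages 100 lines/s and has a standard deviation of 10, a current rate of 130 lines/s sits 3 standard deviations above normal.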
The Index Rate Alert page displays a graph that visualizes the index rate for your log data over the past 7 days. You can toggle the graph to view visualizations of either:
- Index Rate: the daily average index rates contrasted to the daily minimum and maximum index rates
- Standard Deviation: the daily average index rates contrasted to the standard range from the past 30 days
You can also toggle between viewing index rates for the past 30 days, the past seven days, or the past one day (measured as the past 24 hours).
For more details about how Mezmo measures and analyzes index rate metrics, refer to Metrics definitions and terminology below. For information about analyzing your data usage on the Usage Dashboard, refer to our documentation.
- One alert per 60-minute/24-hour time frame: Mezmo will only send one alert (per alert channel) until another 60 minutes/24 hours has elapsed. This alert will indicate all thresholds crossed at that time. So if you configure Index Rate Alerts to be sent on each threshold separately, then you will get an alert indicating whichever threshold is crossed first (either Max lines/s or Max z-score, or both).
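The one-alert-per-time-frame behavior described above can be pictured as a per-channel throttle. The sketch below is only an illustration under assumptions: the class name `AlertThrottle`, the channel names, and the exact boundary handling are hypothetical and not taken from Mezmo.

```python
from datetime import datetime, timedelta


class AlertThrottle:
    """Allow at most one alert per channel within a fixed time window."""

    def __init__(self, window: timedelta) -> None:
        self.window = window
        self._last_sent: dict[str, datetime] = {}

    def should_send(self, channel: str, now: datetime) -> bool:
        last = self._last_sent.get(channel)
        if last is not None and now - last < self.window:
            # Still inside the window for this channel: suppress the alert.
            return False
        self._last_sent[channel] = now
        return True


throttle = AlertThrottle(timedelta(minutes=60))  # hourly frequency
start = datetime(2024, 1, 1, 12, 0)
print(throttle.should_send("email", start))                          # first alert goes out
print(throttle.should_send("email", start + timedelta(minutes=30)))  # suppressed
print(throttle.should_send("slack", start + timedelta(minutes=30)))  # separate channel
```

Each channel is tracked independently, which matches the "per alert channel" wording; a daily frequency would use `timedelta(hours=24)` instead.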
This section covers how to define alerts and specify recipients for the alerts.
To create an index rate alert, follow the steps below.
- Open the Index Rate Alerts page by navigating to Settings -> Usage -> Index Rate Alerts.
- On the Index Rate Alerts page, decide if you want to set alerts based on Max lines/s (the average lines per second over the current rolling hour), the Max z-score (based on the number of standard deviations away from the "normal" baseline of the last 30 days), or both.
- Typically, setting a maximum of 3 standard deviations as a threshold for triggering an alert is a good starting point.
- Alerts for Max z-score, which are based on anomalous index rates, require that at least 30 days of data have been ingested and analyzed to produce the standard deviation figure.
- Select if you want to Alert on each threshold separately or Alert only when both thresholds have been exceeded.
Alerting only when both thresholds are exceeded can help prevent false positives, but it assumes high confidence in the accuracy of the thresholds. A false positive could occur if you set a Max lines/s threshold and overall log ingestion then increases over time, which might trigger alerts prematurely. Requiring that both the lines/s and the standard deviation values be surpassed prevents the premature alert.
- Add alert recipients, and select which notification channels (email, Slack, PagerDuty) to use.
- Specify the Alert Frequency (hourly or daily).
- Optionally, define a custom schedule of specific days and times when you want to receive alerts.
Note that after configuring your alerts, Mezmo requires approximately 15 minutes before the alerts are fully implemented.
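To make the difference between the two alerting modes in the steps above concrete, here is a minimal Python sketch. The function `should_alert`, its parameters, and the example values are hypothetical; it also assumes a threshold counts as surpassed only when the value is strictly greater than it.

```python
def should_alert(
    lines_per_s: float,
    z: float,
    max_lines_per_s: float,
    max_z: float,
    require_both: bool,
) -> bool:
    """Decide whether an index rate alert fires for the current hour."""
    over_rate = lines_per_s > max_lines_per_s  # Max lines/s threshold
    over_z = z > max_z                         # Max z-score threshold
    if require_both:
        # "Alert only when both thresholds have been exceeded"
        return over_rate and over_z
    # "Alert on each threshold separately"
    return over_rate or over_z


# Ingestion has grown gradually: the rate is above a Max lines/s threshold
# set earlier, but it is not anomalous relative to the 30-day baseline.
print(should_alert(1200.0, 0.8, 1000.0, 3.0, require_both=False))  # True
print(should_alert(1200.0, 0.8, 1000.0, 3.0, require_both=True))   # False
```

In this example the separate-threshold mode fires on the outdated lines/s threshold, while the both-thresholds mode stays quiet because the z-score is still well under 3.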
- historical hours: Mezmo records the ingest and index rates for fixed hours, e.g. June 22 from 1 pm to 2 pm. These rates for the “historical hours” are used to calculate a rolling 30-day average index rate, as well as a rolling 30-day standard deviation once 30 days of data have been collected.
- index rate: This rate is the average value of all index rates measured over the last moving (rolling) hour. Mezmo records the number of log lines that are indexed (log lines that are retained/stored) every five minutes and adds these counts to get the total number of lines indexed over the hour. Dividing that total by 60 yields the average per minute, and dividing by 60 again yields the average index rate, per second, for that specific hour. If no log lines are ingested for over a 60-minute period, you will see a "Check Later" message instead of the index rate value. After log lines resume, the "Check Later" message is replaced with the index rate value. Note that because measurements are taken every five minutes, the first five-minute measurement after log lines resume is averaged over the full hour. If there is an interruption in ingestion, we suggest waiting at least an hour before using the index rate value for analysis.
- ingest rate: The ingest rate measures the average hourly rate at which log lines are ingested, or consumed, by Mezmo. To determine the ingest rate, Mezmo applies the same formula used to determine the average index rate (finding the average rate across the moving hour using five-minute increments).
- lines/s: The lines-per-second value is the measurement of the average number of lines that were indexed, per second, over the current rolling hour.
- no data: On the graph for average index rates, any area for which no data was collected is marked as “no data.”
- std deviation: The standard deviation is calculated by comparing each historical hour in a rolling 30-day period with the average of those same hours.
- z-score: The z-score is the number of standard deviations by which the current hourly index rate differs from the historical average. To identify significant spikes, Mezmo contrasts the current hourly average against the hourly index rate averages of the past 30 rolling days.
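The arithmetic in the index rate definition above (five-minute counts summed over the hour, then divided by 60 twice) can be checked with a short Python sketch. The function name and the sample counts are hypothetical, and the sketch assumes exactly twelve five-minute measurements per rolling hour.

```python
def index_rate_lines_per_s(five_minute_counts: list[int]) -> float:
    """Average lines indexed per second over one rolling hour."""
    if len(five_minute_counts) != 12:
        raise ValueError("expected twelve five-minute measurements for one hour")
    total_lines = sum(five_minute_counts)  # lines indexed over the hour
    per_minute = total_lines / 60          # average lines per minute
    return per_minute / 60                 # average lines per second


# Steady flow: 30,000 lines in every five-minute interval.
steady = [30_000] * 12
print(index_rate_lines_per_s(steady))  # 100.0

# Ingestion has just resumed: only the latest interval has data.
resumed = [0] * 11 + [30_000]
print(round(index_rate_lines_per_s(resumed), 2))  # 8.33
```

The second case shows why it helps to wait an hour after an interruption: the actual flow during the last five minutes is the same as in the steady case, but it is averaged over the full hour and so appears far lower.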