How Rolling‑Window Threshold Alerts Work
ThresholdIQ doesn’t fire alerts based on single noisy points. Instead, it evaluates rolling windows of data, a technique also used by Datadog, Grafana, CloudWatch, and other enterprise monitoring systems. This guide explains how windows work, how thresholds are applied, and how multi-tier severity is determined.
1. Why Rolling Windows Matter
A single data point is an unreliable signal: one spike may be nothing more than noise. A rolling window smooths that noise by evaluating a slice of recent data (for example, the last 15 minutes). Every time a new data point arrives, the window slides forward and the engine re-evaluates the rule.
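To make the sliding behavior concrete, here is a minimal Python sketch of a time-based rolling window. It is not ThresholdIQ's implementation: the `RollingWindow` class and its methods are hypothetical names, and the sketch assumes points arrive in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta


class RollingWindow:
    """Hypothetical helper: keeps only points within the last `duration`."""

    def __init__(self, duration: timedelta):
        self.duration = duration
        self.points = deque()  # (timestamp, value) pairs, oldest first

    def add(self, ts: datetime, value: float) -> None:
        self.points.append((ts, value))
        # Slide forward: evict points that are now older than the window.
        cutoff = ts - self.duration
        while self.points and self.points[0][0] <= cutoff:
            self.points.popleft()

    def values(self) -> list:
        return [v for _, v in self.points]


window = RollingWindow(timedelta(minutes=15))
start = datetime(2024, 1, 1, 12, 0)
for minute, value in enumerate([1.0, 1.2, 9.5, 1.1]):
    window.add(start + timedelta(minutes=minute), value)

# The 9.5 spike is averaged together with its neighbours.
window_avg = sum(window.values()) / len(window.values())
```

A deque keeps eviction cheap because old points only ever leave from the front.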
2. What’s Inside a Window?
Each window contains:
- All metric values within the window duration
- Timestamp range (oldest → newest)
- Derived stats: avg, median, p95, min, max, rate-of-change
- Dimension context (Region, Store, Signal ID, etc.)
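The derived stats listed above could be computed roughly as follows. This is a sketch under stated assumptions: `WindowStats` and `summarize` are illustrative names, p95 uses the nearest-rank method, and rate-of-change is taken as (newest value minus oldest value) divided by elapsed seconds. ThresholdIQ may define these differently.

```python
import math
import statistics
from dataclasses import dataclass


@dataclass
class WindowStats:
    avg: float
    median: float
    p95: float
    min: float
    max: float
    rate_of_change: float  # value units per second (assumed definition)


def summarize(points):
    """points: non-empty list of (unix_seconds, value), oldest first."""
    values = [v for _, v in points]
    ordered = sorted(values)
    # Nearest-rank percentile: smallest value with >= 95% of points at or below it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    elapsed = points[-1][0] - points[0][0]
    return WindowStats(
        avg=statistics.fmean(values),
        median=statistics.median(values),
        p95=ordered[rank - 1],
        min=ordered[0],
        max=ordered[-1],
        rate_of_change=(values[-1] - values[0]) / elapsed if elapsed else 0.0,
    )


stats = summarize([(0, 1.0), (60, 2.0), (120, 3.0), (180, 6.0)])
```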
3. How Threshold Rules Are Defined
A rule in ThresholdIQ looks like this:
IF metric = "error_rate"
AND window = "last 15 minutes"
AND aggregation = "avg"
AND avg(value) > 2.0
THEN fire "Critical" alert
The engine computes the aggregation for each window and compares it to the threshold.
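As an illustration, the rule above can be modeled as a small data structure plus an evaluation function. The `Rule` fields, the `evaluate` function, and the lookup tables are assumptions made for this sketch, not ThresholdIQ's actual API.

```python
import operator
import statistics
from dataclasses import dataclass

AGGREGATIONS = {"avg": statistics.fmean, "median": statistics.median,
                "min": min, "max": max}
COMPARATORS = {">": operator.gt, ">=": operator.ge,
               "<": operator.lt, "<=": operator.le}


@dataclass
class Rule:
    metric: str
    window_minutes: int
    aggregation: str
    comparator: str
    threshold: float
    severity: str


def evaluate(rule, window_values):
    """Return the rule's severity if the window breaches it, else None."""
    if not window_values:
        return None  # empty window: nothing to compare
    aggregate = AGGREGATIONS[rule.aggregation](window_values)
    breached = COMPARATORS[rule.comparator](aggregate, rule.threshold)
    return rule.severity if breached else None


rule = Rule("error_rate", 15, "avg", ">", 2.0, "Critical")
sustained = evaluate(rule, [1.8, 2.4, 2.6])        # avg is above 2.0
single_spike = evaluate(rule, [0.5, 5.0, 0.4, 0.3])  # avg stays below 2.0
```

Here `sustained` is "Critical" while `single_spike` is `None`: one high point does not push the 15-minute average over the threshold.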
4. Static vs Baseline‑Relative Thresholds
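- Static thresholds: Compare the window aggregate directly to a fixed number.
- Baseline-relative thresholds: Compare the window aggregate to learned behavior, e.g. avg > baseline + 3σ (three standard deviations above the baseline).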
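A baseline-relative check can be sketched as below. How ThresholdIQ learns its baseline is not described here, so this example simply assumes the baseline is the mean of past window averages and σ is their sample standard deviation. `baseline_threshold` is a hypothetical helper.

```python
import statistics


def baseline_threshold(history, sigmas=3.0):
    """history: past window averages treated as normal behavior (assumed)."""
    baseline = statistics.fmean(history)
    sigma = statistics.stdev(history)  # sample std dev; needs >= 2 values
    return baseline + sigmas * sigma


history = [1.0, 1.2, 0.9, 1.1, 0.8]
limit = baseline_threshold(history)   # baseline + 3σ

current_avg = 1.9
is_anomalous = current_avg > limit
```

Unlike a static threshold, `limit` moves with the data: a noisier history widens the band, and a quieter one tightens it.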
5. The Alert Evaluation Flow
- New data point arrives.
- Window slides forward.
- Engine computes aggregates.
- Each rule is evaluated against the window.
- Hysteresis/debounce logic prevents flapping (an alert rapidly toggling between firing and resolved).
- Alert is fired with severity and context.
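The hysteresis/debounce step is the least obvious part of this flow, so here is one possible sketch. It assumes an alert fires only after several consecutive breaching evaluations (debounce) and resolves only once the aggregate drops below a separate, lower clear threshold (hysteresis). The `DebouncedAlert` class and its parameters are hypothetical.

```python
class DebouncedAlert:
    def __init__(self, fire_above, clear_below, min_breaches=3):
        self.fire_above = fire_above
        self.clear_below = clear_below
        self.min_breaches = min_breaches
        self.breaches = 0
        self.firing = False

    def update(self, aggregate):
        """Feed one window aggregate; return 'fired', 'resolved', or None."""
        if not self.firing:
            # Debounce: count consecutive breaching evaluations.
            self.breaches = self.breaches + 1 if aggregate > self.fire_above else 0
            if self.breaches >= self.min_breaches:
                self.firing = True
                return "fired"
        elif aggregate < self.clear_below:
            # Hysteresis: resolve only once clearly below the firing level.
            self.firing = False
            self.breaches = 0
            return "resolved"
        return None


alert = DebouncedAlert(fire_above=2.0, clear_below=1.5, min_breaches=3)
events = [alert.update(x) for x in [2.1, 1.9, 2.2, 2.3, 2.4, 1.8, 2.1, 1.4]]
```

In this run the alert fires on the fifth evaluation (the third consecutive breach) and stays firing while the aggregate hovers around 2.0, resolving only at 1.4.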
6. Why This Approach Works
- Reduces noise and false positives
- Captures sustained trends rather than momentary spikes
- Matches real-world SLO/SLA monitoring
- Supports multi-tier severity (Warning → Critical → Emergency)
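Multi-tier severity can be pictured as an ordered list of thresholds checked from most to least severe. The tier values below are made up for illustration (only the 2.0 Critical level echoes the example rule earlier in this guide), and `severity_for` is a hypothetical function.

```python
# Hypothetical tier thresholds for avg error_rate, most severe first.
TIERS = [("Emergency", 10.0), ("Critical", 2.0), ("Warning", 1.0)]


def severity_for(aggregate):
    """Return the most severe tier whose threshold is exceeded, else None."""
    for name, threshold in TIERS:
        if aggregate > threshold:
            return name
    return None


levels = [severity_for(x) for x in [0.5, 1.5, 2.5, 12.0]]
```

Checking the highest tier first ensures a window that breaches several tiers reports only the most severe one.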
Conclusion
Rolling windows make ThresholdIQ behave like a full monitoring system rather than a simple “if value > X” script. By evaluating trends, applying baselines, and supporting multi-tier thresholds, it produces alerts that reflect sustained operational conditions instead of momentary noise.