Monitoring
Published on July 2, 2026
Reading time: 7 min

Network Monitoring That Thinks: Why Static Thresholds Miss Outages

Florian Hödl

Project Lead

> Summary: For most providers, "monitored 24/7" means a dashboard of red and green dots. That is not enough. A fixed threshold reports an outage only once it has already happened, produces false alarms, and never sees the slow degradation coming. Anexum runs more than 200 connections over fiber, DSL, leased lines, and 5G. To make "monitored" mean more than a green light, we build our monitoring intelligence (internally ANXEngine) around three questions: What happens next? Why is it happening, one site or the whole carrier? And what do we do now?

Why "monitored 24/7" means little at most providers

Almost every managed-service provider puts "24/7 monitoring" on its website. In practice it often comes down to a simple test: does the connection answer a ping? Is the response time above X milliseconds? Does the signal level fall below Y dBm? A value crosses a fixed line, and an alert fires.

This model has three well-known weaknesses:

  • It reports too late. A fixed threshold triggers only once the connection is already impaired. The customer notices the outage at the same moment the monitoring does.
  • It reports the wrong thing too often. In the evening, response times rise across many connections because more traffic flows. A rigid threshold produces false alarms during those hours. Anyone woken without cause too often stops taking the next alert seriously.
  • It does not see slow degradation. An antenna whose signal fades over two weeks stays on the safe side of any fixed threshold, so no alert fires, right up until one Tuesday morning when it crosses the line and the connection is already failing.

What is "normal" for this exact connection?

Here is the core of the problem: a fixed threshold knows one number, but no "normal". A 40-millisecond response time is a warning sign for a fiber connection and a good value for a rural 5G site. The same threshold is too strict for one connection and too loose for another.

Instead of one number for all, ANXEngine learns what is usual for each individual connection at each time of day. A reading becomes an anomaly only when it deviates from its own pattern, not from an average across hundreds of different sites. Each connection gets its own expected band across the day.
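
The idea of a per-connection, per-hour expected band can be sketched in a few lines. This is a minimal illustration, not ANXEngine's actual method: the function names, the connection IDs, the sample latencies, and the choice of median plus median absolute deviation (with a multiplier `k` and a minimum spread) are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import median


def learn_bands(samples, k=3.0, min_spread=1.0):
    """Learn an expected band per (connection, hour of day).

    samples: iterable of (connection_id, hour_of_day, latency_ms).
    Returns {(connection_id, hour_of_day): (low, high)}.
    k and min_spread are illustrative tuning values, not taken from the article.
    """
    grouped = defaultdict(list)
    for conn, hour, value in samples:
        grouped[(conn, hour)].append(value)

    bands = {}
    for key, values in grouped.items():
        centre = median(values)
        # Median absolute deviation: robust against single outliers in the history.
        mad = median([abs(v - centre) for v in values])
        spread = max(mad, min_spread)
        bands[key] = (centre - k * spread, centre + k * spread)
    return bands


def is_anomaly(bands, conn, hour, value):
    """True if the value lies outside the band learned for this connection and hour."""
    band = bands.get((conn, hour))
    if band is None:
        return False  # no history for this slot yet: do not alert
    low, high = band
    return not (low <= value <= high)


# Hypothetical history for 20:00 on two very different connections.
history = (
    [("fiber-01", 20, v) for v in (8, 9, 10, 9, 8)]
    + [("5g-rural-07", 20, v) for v in (38, 42, 40, 45, 41)]
)
bands = learn_bands(history)

# The same 40 ms reading is judged against each connection's own pattern.
fiber_alert = is_anomaly(bands, "fiber-01", 20, 40.0)
rural_alert = is_anomaly(bands, "5g-rural-07", 20, 40.0)
```

With this sample history, the 40 ms reading falls outside the fiber connection's band and inside the rural 5G site's band, which is the distinction a single shared threshold cannot make.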

In our own network these patterns are clear: response times rise in the evening hours, in the early morning many connections show brief dropouts from maintenance and backup windows, and mobile connections vary more than fiber. Monitoring that knows these patterns does not alert on every predictable evening peak, and instead notices the deviation that genuinely breaks the pattern.

One outage or a hundred? The question of cause

When twenty connections drop at three in the morning, weak monitoring turns that into twenty alarms. The on-call engineer works through them one by one, searching twenty sites for a local fault that does not exist.

Usually there is a single cause: a carrier disruption. So on every event, ANXEngine checks not just the individual connection but the level above it:

  • Is an above-average number of connections on the same carrier dropping at once? Then it is a carrier disruption, and the right response is a call to the network operator rather than a technician dispatch.
  • Does it affect connections in the same region? Then it is probably a regional event.
  • Does a single connection deviate from its neighbours? Then the fault is likely on site, at the antenna, router, or cabling.

Twenty separate alerts become one statement with direction: not "twenty sites red", but "carrier disruption, twenty sites affected". That decides who gets the call and how quickly the fault is resolved.
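
The three checks above could be sketched roughly as follows. This is an assumption-laden illustration rather than ANXEngine's implementation: the `Connection` fields, the carrier and region names, and the rule "at least three connections and at least half of the group down" (standing in for "above-average") are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    conn_id: str
    carrier: str
    region: str


def classify_outage(down, inventory, min_affected=3, min_share=0.5):
    """Group simultaneous drops into carrier, regional, or on-site reports.

    down: set of connection IDs currently down.
    inventory: list of all Connection records.
    min_affected and min_share are illustrative thresholds.
    """
    total_by_carrier = Counter(c.carrier for c in inventory)
    total_by_region = Counter(c.region for c in inventory)
    down_conns = [c for c in inventory if c.conn_id in down]
    down_by_carrier = Counter(c.carrier for c in down_conns)
    down_by_region = Counter(c.region for c in down_conns)

    def widespread(down_counts, totals):
        # A group counts as widely affected if enough of its members are down.
        return {
            group: n
            for group, n in down_counts.items()
            if n >= min_affected and n / totals[group] >= min_share
        }

    reports = []
    explained = set()

    # Level 1: carrier-wide disruption.
    for carrier, n in widespread(down_by_carrier, total_by_carrier).items():
        reports.append(f"carrier disruption ({carrier}), {n} sites affected")
        explained |= {c.conn_id for c in down_conns if c.carrier == carrier}

    # Level 2: regional event, only if a carrier disruption does not already cover it.
    for region, n in widespread(down_by_region, total_by_region).items():
        members = {c.conn_id for c in down_conns if c.region == region}
        if not members <= explained:
            reports.append(f"probable regional event ({region}), {n} sites affected")
            explained |= members

    # Level 3: whatever remains deviates from its neighbours.
    for c in down_conns:
        if c.conn_id not in explained:
            reports.append(f"likely on-site fault at {c.conn_id}")
    return reports


# Hypothetical inventory: 24 connections on one carrier, 10 on another.
inventory = (
    [Connection(f"a-{i}", "carrier-a", "north" if i % 2 == 0 else "south") for i in range(24)]
    + [Connection(f"b-{i}", "carrier-b", "west") for i in range(10)]
)
down = {f"a-{i}" for i in range(20)} | {"b-3"}
reports = classify_outage(down, inventory)
for line in reports:
    print(line)
```

In this example, twenty-one drops collapse into two statements: one carrier disruption with twenty affected sites, and one isolated connection that still needs a local check.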

Not every alert matters equally

One connection is not like another. A temporary construction-site connection and the headquarters of a retail company with forty-five workstations must not raise the same alarm. Good monitoring weighs by actual impact: how critical the site is, how many people depend on it, which SLA applies, and how far the value strays from normal.

That puts the outage at the Gold-SLA site at the top of the list and the brief flicker at a minor connection further down. Both stay visible, but in the right order.
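
One way to turn these factors into an ordering is a simple weighted score. The sketch below is purely illustrative: the SLA tiers other than Gold, all weights, the 1-to-5 criticality scale, and the formula itself are assumptions, not values Anexum publishes.

```python
from dataclasses import dataclass
from math import sqrt

# Hypothetical SLA tiers and weights; only "Gold" is named in the article.
SLA_WEIGHT = {"gold": 3.0, "silver": 2.0, "bronze": 1.0}


@dataclass
class Alert:
    site: str
    sla: str
    users: int          # people who depend on the connection
    criticality: int    # 1 (minor) to 5 (business-critical), assumed scale
    deviation: float    # how far the value strays from normal, in band widths


def priority(alert):
    """Higher score means the alert is handled first."""
    # The square root dampens head count so one large site does not drown out everything else.
    return (
        SLA_WEIGHT[alert.sla]
        * alert.criticality
        * sqrt(1 + alert.users)
        * (1 + alert.deviation)
    )


alerts = [
    Alert("construction-site-12", "bronze", users=3, criticality=1, deviation=0.5),
    Alert("retail-hq", "gold", users=45, criticality=5, deviation=4.0),
]

# Both alerts stay in the list; only the order changes.
ranked = sorted(alerts, key=priority, reverse=True)
```

A multiplicative score means that a site scoring low on every factor sinks in the list without ever being filtered out, which matches the goal of keeping both alerts visible.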

| Question | Static threshold monitoring | Context-aware monitoring |
| --- | --- | --- |
| When does it alert? | When a fixed threshold is crossed | When a connection deviates from its own pattern |
| Evening traffic rise | False alarm | Recognised as a normal daily pattern |
| Slow degradation | Invisible until the outage | Detected early as a deviation from trend |
| 20 connections drop at once | 20 separate alarms | 1 report: carrier disruption, 20 affected |
| Prioritisation | All alarms equal | By site criticality and SLA |

Three questions good monitoring has to answer

That experience shaped the structure behind ANXEngine. It answers three questions in order:

1. What happens next? From a connection's learned pattern, you can estimate where a value is heading. A signal that has been falling for hours is a warning before the connection goes down. The goal is lead time, not a report at the moment of failure.

2. Why is it happening? Correlation across carrier, region, and technology separates the local fault from the network fault. That is the difference between "we send a technician" and "we call the network operator".

3. What do we do? An event in plain language: what is affected, what the likely cause is, and which next step makes sense. An alert is only useful once it is clear what it means.
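
For the first question, lead time can be estimated by fitting a straight line to recent readings and extrapolating to the level at which the connection is expected to fail. The sketch below uses ordinary least squares as a stand-in technique; the function name, the sample readings, and the -100 dBm floor are hypothetical, and a real predictive layer would need validation against actual telemetry.

```python
def hours_until_floor(readings, floor):
    """Estimate the hours left until a falling signal reaches the floor.

    readings: list of (hours_since_start, signal_dbm), oldest first.
    floor: signal level (dBm) at which the connection is assumed to fail.
    Returns hours from the last reading, or None if the trend is flat or rising.
    """
    n = len(readings)
    if n < 2:
        return None

    mean_t = sum(t for t, _ in readings) / n
    mean_v = sum(v for _, v in readings) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in readings)
    if var_t == 0:
        return None  # all readings share one timestamp: no trend to fit

    # Ordinary least-squares slope and intercept.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in readings) / var_t
    if slope >= 0:
        return None  # signal is stable or improving
    intercept = mean_v - slope * mean_t

    t_cross = (floor - intercept) / slope
    return max(0.0, t_cross - readings[-1][0])


# Hypothetical signal that loses 1 dB per hour.
falling = [(0, -90.0), (1, -91.0), (2, -92.0), (3, -93.0)]
lead_time = hours_until_floor(falling, floor=-100.0)

stable = [(0, -90.0), (1, -90.0), (2, -90.0)]
no_warning = hours_until_floor(stable, floor=-100.0)
```

For the falling series the fitted line reaches the floor about seven hours after the last reading, while the stable series produces no warning at all.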

Where ANXEngine stands today

ANXEngine is our own system and grows with our network. The building blocks that carry it today are the per-connection expected bands and the correlation across carrier and region. They run against the telemetry of our own connections. Two further layers are being built and validated step by step: the predictive layer, which calls an incident before it happens, and the generative layer, which explains the event in plain language.

One principle holds throughout: we do not publish a figure we cannot back with real data. A model whose advantage does not prove out against the existing rules in a test run does not go into operation. Monitoring that thinks is work on accuracy, not a switch you flip.

What this means for you

For a company with distributed sites, what counts in the end is not how many metrics a dashboard shows, but whether the right alert reaches the right person at the right time. That is the standard behind the line "monitored 24/7" at Anexum: fewer false alarms, earlier warnings, and a clear statement of whether one site or a whole carrier is the problem.

Monitoring is part of [Managed Connectivity](/en/services/managed-connectivity/): procurement, operation, and monitoring of your connections from one source, with one contact and one invoice. [Let us talk about your sites →](/en/contact/)

Frequently Asked Questions

What does "24/7 monitoring" mean at Anexum in concrete terms?

It means continuously watching your connections against their own normal behaviour, not against a single fixed threshold. Anexum runs more than 200 connections over fiber, DSL, leased lines, and 5G, and evaluates their telemetry continuously to catch deviations early and classify them by cause and criticality.

Why are fixed thresholds not enough for network monitoring?

A fixed threshold has no context. It reports the outage only once it has occurred, produces false alarms on predictable patterns such as the evening traffic rise, and misses the slow degradation that stays under the line. Context-aware monitoring compares each connection with its own pattern across the day.

What is ANXEngine?

ANXEngine is Anexum's internal monitoring intelligence. It learns an expected band for each connection, correlates events across carrier, region, and technology, and ranks alarms by actual impact. The system grows with the network; improvements go live only once they prove out against the existing rules on real data.

How does ANXEngine tell a carrier disruption from a single outage?

On every event it checks the level above the individual connection. When an above-average number of connections on the same carrier or in the same region drop at once, that becomes one report instead of many separate alarms. When a single connection deviates from its neighbours, the fault is likely on site.
