
Nagios, Parent Hosts, and traceroute on the Internet

Nagios has the very useful feature of "parent hosts". If it deems a host A to be down, it first checks A's parent host, B, and reports A as down only if B is up. This check recurses up the chain of parents until a host in state "up" is found, and only the first "down" host is actually reported. This keeps on-call people from being bombarded with alerts during major network outages and makes sure that the alerts that are sent out describe the actual outage reasonably accurately.
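The parent logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Nagios's actual implementation: the host names, the `PARENTS` map, and the `classify` helper are made up for this example, and the set `UP_HOSTS` stands in for real host checks. The state names UP, DOWN, and UNREACHABLE are the ones Nagios uses for hosts.

```python
# Hypothetical topology: each host maps to its list of parent hosts.
PARENTS = {
    "server-c": ["router-b"],
    "router-b": ["gateway-a"],
    "gateway-a": [],
}

# Stand-in for real host checks: the hosts that currently answer.
UP_HOSTS = {"gateway-a"}


def classify(host, parents, up_hosts):
    """Return 'UP', 'DOWN', or 'UNREACHABLE' for a host.

    A host that fails its check is DOWN only if it has no parents or
    at least one parent is UP; otherwise it is merely UNREACHABLE and
    the real problem lies further up the chain.
    """
    if host in up_hosts:
        return "UP"
    host_parents = parents.get(host, [])
    if not host_parents:
        return "DOWN"
    if any(classify(p, parents, up_hosts) == "UP" for p in host_parents):
        return "DOWN"
    return "UNREACHABLE"


states = {h: classify(h, PARENTS, UP_HOSTS) for h in PARENTS}
# Only hosts in state DOWN would trigger an alert in this sketch.
to_report = [h for h, s in states.items() if s == "DOWN"]
```

With the sample data, only `router-b` ends up in `to_report`; `server-c` is classified as UNREACHABLE because its only parent is not up.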

As an individual with some "external" servers in various data centers on the Internet, I would like to avoid being alerted separately that my servers at ISPs C, D, and E are down when the real problem is an outage at ISP F, which hosts my Nagios installation, or at one of the exchange points in between. Such an outage only makes the servers temporarily unreachable, and there is nothing I can do about it.

The solution sounds easy but is surprisingly hard.


Is ping evil?

Providers of dedicated rental servers are apparently not interested in their customers being able to diagnose where the fault lies when something breaks. Otherwise it would not be so common for core routers to ignore ICMP echo requests. That is annoying, because it makes my Nagios generate far more alerts than it needs to.

So the only option left is to pester the provider immediately whenever there is an outage, since that is evidently what they want.

P.S. I want a Nagios plugin that can evaluate traceroutes, because the server providers' core routers do always send TTL exceeded.
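A rough sketch of what the core of such a plugin might look like, in Python. It is not an existing Nagios plugin. It assumes the text output of `traceroute -n` (numeric addresses, IPv4 only) is already available as a string, and the helper names `responding_hops` and `check_router` as well as the sample addresses are invented for this example. The return codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
import re

# Conventional Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3

HOP_NUMBER = re.compile(r"^\s*(\d+)\s+")
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def responding_hops(traceroute_output):
    """Return (ttl, ip) for every hop line that contains a reply."""
    hops = []
    for line in traceroute_output.splitlines():
        number = HOP_NUMBER.match(line)
        if not number:
            continue  # header line or other noise
        address = IPV4.search(line, number.end())
        if address:  # lines consisting only of "* * *" have no address
            hops.append((int(number.group(1)), address.group(0)))
    return hops


def check_router(traceroute_output, router_ip):
    """Decide whether router_ip sent a TTL exceeded reply on the path."""
    hops = responding_hops(traceroute_output)
    if not hops:
        return UNKNOWN, "UNKNOWN - no responding hops in traceroute output"
    for ttl, ip in hops:
        if ip == router_ip:
            return OK, f"OK - {router_ip} answered at hop {ttl}"
    last_ttl, last_ip = hops[-1]
    return CRITICAL, (
        f"CRITICAL - {router_ip} not seen, "
        f"last reply from {last_ip} at hop {last_ttl}"
    )


# Made-up sample output using documentation address ranges.
SAMPLE = """traceroute to 192.0.2.10 (192.0.2.10), 30 hops max, 60 byte packets
 1  198.51.100.1  0.412 ms  0.398 ms  0.377 ms
 2  203.0.113.5  1.921 ms  1.874 ms  1.902 ms
 3  * * *
"""

code, message = check_router(SAMPLE, "203.0.113.5")
```

A real plugin would additionally have to run traceroute itself, print `message`, and exit with `code`. Used as the host check for a core router configured as a parent host, a check along these lines could let Nagios see that router as up even though it does not answer pings.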