Revision 2 (Nico Schottelius, 12/17/2023 11:50 AM) → Revision 3/8 (Nico Schottelius, 12/17/2023 11:51 AM)

h1. The ungleich monitoring infrastructure 2024 (WIP) 

 {{toc}} 

 h2. Intro 

 This is a work-in-progress update of [[The_ungleich_monitoring_infrastructure]]. The infrastructure is still based on Prometheus and the blackbox exporter, but now also makes use of Kubernetes-native objects.

 h2. Monitoring definition 

 h3. External primary router/link monitoring 

 * Objective: determine from an external point of view whether the links are functioning
 * Implementation:
 ** Collecting/alerting with Prometheus on place12
 ** blackbox exporter on place12
 ** blackbox exporter on place11
 * Targets:
 ** ipv6/router1.place10/snr 
 ** ipv4/router1.place10/snr 
 ** ipv6/server12X.place10/snr 
 ** ipv4/server12X.place10/snr 
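
The probe matrix implied by the list above (two blackbox exporters, four targets) can be sketched in Python. This is an illustration under assumptions, not the deployed configuration: the target notation is read as @<family>/<host>/<check>@, and the exporter addresses and the module naming scheme are hypothetical. Only the @/probe?target=...&module=...@ endpoint and port 9115 are standard blackbox exporter behaviour.

```python
from itertools import product
from urllib.parse import urlencode

# Targets copied from the list above; notation assumed to be <family>/<host>/<check>.
TARGETS = [
    "ipv6/router1.place10/snr",
    "ipv4/router1.place10/snr",
    "ipv6/server12X.place10/snr",
    "ipv4/server12X.place10/snr",
]

# Hypothetical exporter addresses; 9115 is the blackbox exporter's default port.
EXPORTERS = {
    "place12": "http://blackbox.place12.example:9115",
    "place11": "http://blackbox.place11.example:9115",
}


def parse_target(spec):
    """Split 'family/host/check' into its three parts (assumed notation)."""
    family, host, check = spec.split("/")
    if family not in ("ipv4", "ipv6"):
        raise ValueError(f"unknown address family in {spec!r}")
    return family, host, check


def probe_urls(targets=TARGETS, exporters=EXPORTERS):
    """Return one (place, target, url) tuple per exporter/target pair."""
    urls = []
    for (place, base), spec in product(exporters.items(), targets):
        family, host, check = parse_target(spec)
        # Hypothetical module naming: one blackbox module per check and family.
        query = urlencode({"target": host, "module": f"{check}_{family}"})
        urls.append((place, spec, f"{base}/probe?{query}"))
    return urls


if __name__ == "__main__":
    for place, spec, url in probe_urls():
        print(place, spec, url)
```

Probing every target from both places lets a failure seen from only one exporter be told apart from a failure seen from both.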

 h3. External primary router 

 * Objective: determine whether a router is reachable via any path
 * Implementation:
 ** Collecting/alerting with Prometheus on place12
 ** blackbox exporter on place12
 ** blackbox exporter on place11
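
"Reachable via any path" can be read as: a router counts as up if at least one probe from at least one vantage point succeeds. A minimal Python sketch of that rule follows. It assumes probe results arrive as @(vantage_point, target, success)@ tuples. That layout, the sample values and the host @router2.example@ are made up for illustration, and this is not the actual alerting rule.

```python
from collections import defaultdict


def reachable_via_any_path(results):
    """Map each router to True if any probe toward it succeeded.

    `results` is an iterable of (vantage_point, target, success) tuples,
    a hypothetical layout chosen for this sketch.
    """
    status = defaultdict(bool)
    for _vantage, target, success in results:
        router = target.split("/")[1]  # assumed notation: family/host/check
        status[router] = status[router] or bool(success)
    return dict(status)


def unreachable(results):
    """Return the sorted list of routers for which every probe failed."""
    return sorted(r for r, ok in reachable_via_any_path(results).items() if not ok)


# Illustrative sample data, not real measurements.
SAMPLE = [
    ("place12", "ipv6/router1.place10/snr", False),
    ("place12", "ipv4/router1.place10/snr", True),
    ("place11", "ipv6/router1.place10/snr", False),
    ("place11", "ipv6/router2.example/snr", False),
]
```

Under this reading, @unreachable(SAMPLE)@ lists only routers for which every address family and every vantage point failed, so a single broken path does not trigger this particular check.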

 h3. Test external monitoring 

 * Objective: determine whether the external monitoring itself is alive
 * Implementation:
 ** Collecting/alerting with Prometheus on place10
 * Targets:
 ** ipv6/emonitor1.place12/prometheus 
 ** ipv6/emonitor1.place12/blackbox 
 ** ipv6/emonitor1.place12/alertmanager 
 ** ipv6/vm1.place11/blackbox
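
A liveness check for the components listed above could be sketched as follows. Prometheus, Alertmanager and the blackbox exporter each serve a @/-/healthy@ endpoint and default to ports 9090, 9093 and 9115. Whether this deployment uses those ports, plain HTTP, or the short host names exactly as written is an assumption, and the address family part of each target is ignored here.

```python
import urllib.request

# Default ports of the upstream projects; the real deployment may differ.
DEFAULT_PORTS = {"prometheus": 9090, "alertmanager": 9093, "blackbox": 9115}

# Targets copied from the list above; notation assumed to be <family>/<host>/<service>.
TARGETS = [
    "ipv6/emonitor1.place12/prometheus",
    "ipv6/emonitor1.place12/blackbox",
    "ipv6/emonitor1.place12/alertmanager",
    "ipv6/vm1.place11/blackbox",
]


def health_url(spec, ports=DEFAULT_PORTS):
    """Build a health-check URL from 'family/host/service' (assumed notation)."""
    _family, host, service = spec.split("/")
    return f"http://{host}:{ports[service]}/-/healthy"


def is_alive(url, timeout=5.0):
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False


if __name__ == "__main__":
    for spec in TARGETS:
        url = health_url(spec)
        print(spec, url, "alive" if is_alive(url) else "DOWN")
```

Running this from place10 mirrors the idea of the section: the internal side watches the external monitors, so a silent failure of the external monitoring does not go unnoticed.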