Task #7185
closed
Setup network monitoring system on new off-site VPS
Added by Nico Schottelius about 5 years ago.
Updated 11 months ago.
Description
Objective
- We want to have a live update similar to https://status.ungleich.ch/ running on the VPS.
- We want to do this via some open source monitoring.
- We want to check ping, HTTP and maybe HTTPS reachability.
- The information should be public
Hints
Check on one of our VMs whether OpenNMS allows creating a public interface.
Check out the list of systems on https://www.pcwdld.com/linux-network-monitor-software-and-tools#OpenNMS
Another interesting project that we already have running is smokeping.
- Easiest way: reproduce what we already have:
  - set up prometheus
  - set up grafana
  - set up blackbox-exporter
  - create a public dashboard
-> Check out the tools, then discuss with me in the chat which one to use.
- Related to Task #7184: Create a mailing list for maintenance notifications added
- Description updated (diff)
- Status changed from New to Seen
- Assignee changed from Ahmed Bilal to Nico Schottelius
you can also add another name to the VPS like
monitoring.place11.ungleich.ch ;-)
- Assignee changed from Nico Schottelius to Ahmed Bilal
Seems not to be solved:
monitoring.place11.ungleich.ch’s server IP address could not be found.
Ah, sorry, thought it was assigned to me!
/etc/prometheus/blackbox.yml
modules:
  http_2xx:
    prober: http
    http:
  http_post_2xx:
    prober: http
    http:
      method: POST
  tcp_connect:
    prober: tcp
  pop3s_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^+OK"
      tls: true
      tls_config:
        insecure_skip_verify: false
  ssh_banner:
    prober: tcp
    tcp:
      query_response:
      - expect: "^SSH-2.0-"
  irc_banner:
    prober: tcp
    tcp:
      query_response:
      - send: "NICK prober"
      - send: "USER prober prober prober :prober"
      - expect: "PING :([^ ]+)"
        send: "PONG ${1}"
      - expect: "^:[^ ]+ 001"
  icmp:
    prober: icmp
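
To try one of these modules by hand before Prometheus scrapes it, the exporter's /probe endpoint can be queried directly with `module` and `target` parameters. A minimal Python sketch, assuming the exporter listens on port 9115 (the blackbox exporter default); the helper names `probe_url` and `probe_succeeded` are made up for illustration and are not part of any tool:

```python
from urllib.parse import urlencode


def probe_url(exporter: str, module: str, target: str) -> str:
    """Build the blackbox exporter probe URL for one module/target pair."""
    query = urlencode({"module": module, "target": target})
    return f"http://{exporter}/probe?{query}"


def probe_succeeded(metrics_text: str) -> bool:
    """Return True if the exporter's metrics output reports probe_success 1."""
    for line in metrics_text.splitlines():
        # Skip "# HELP" / "# TYPE" comment lines; only the sample line matches.
        if line.startswith("probe_success "):
            return float(line.split()[1]) == 1.0
    return False


if __name__ == "__main__":
    # Assumed exporter address; fetch the printed URL with curl or a browser.
    print(probe_url("post.ungleich.ch:9115", "icmp", "router1.place6.ungleich.ch"))
```

Fetching the printed URL should return plain-text metrics; passing that text to `probe_succeeded` tells whether the target answered.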
/etc/prometheus/prometheus
global:
  scrape_interval: 10s # can be overridden by setting scrape_interval in a job
  evaluation_interval: 30s # for rules
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
  # monitor: 'place5-prod'
scrape_configs:
  - job_name: 'routers'
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets:
        - router1.place6.ungleich.ch
        - router2.place6.ungleich.ch
        - router1.place5.ungleich.ch
        - router2.place5.ungleich.ch
        - router3.place5.ungleich.ch
        - router4.place5.ungleich.ch
    relabel_configs:
      # Pass the listed target to the exporter as ?target=..., keep it as the
      # instance label, and send the actual scrape to the blackbox exporter.
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: post.ungleich.ch:9115
  - job_name: 'core-services'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
        - https://ungleich.ch
        - https://monitoring.place5.ungleich.ch
        - https://monitoring.place6.ungleich.ch
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: post.ungleich.ch:9115
rule_files:
  - /etc/prometheus/*.rules
alerting:
  alertmanagers:
    - consul_sd_configs:
      - server: '127.0.0.1:8500'
        services:
          - alertmanager
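
For anyone reproducing the setup, the three relabel rules are the part that is easiest to misread. A rough Python sketch of what they do to one target, assuming only the default `replace` action with the default regex and no capture groups; `apply_relabel` is an illustrative helper, not Prometheus code:

```python
def apply_relabel(labels: dict, rules: list) -> dict:
    """Apply simplified 'replace' relabel rules (default regex, no capture groups)."""
    result = dict(labels)
    for rule in rules:
        # Prometheus joins the source label values with ';' by default.
        source = ";".join(result.get(name, "") for name in rule.get("source_labels", []))
        # Without an explicit replacement, the default "$1" yields the whole source value.
        result[rule["target_label"]] = rule.get("replacement", source)
    return result


# The three rules from the 'routers' job, written as plain dicts.
RULES = [
    {"source_labels": ["__address__"], "target_label": "__param_target"},
    {"source_labels": ["__param_target"], "target_label": "instance"},
    {"target_label": "__address__", "replacement": "post.ungleich.ch:9115"},
]

scraped = apply_relabel({"__address__": "router1.place6.ungleich.ch"}, RULES)
for name, value in sorted(scraped.items()):
    print(f"{name} = {value}")
```

After the rules run, the router name ends up in `__param_target` and `instance`, while `__address__` points at the exporter, so Prometheus scrapes the exporter and the exporter probes the router.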
- File Ungleich-1570814496841.json added
- File deleted (Ungleich-1570814496841.json)
- Assignee changed from Ahmed Bilal to Nico Schottelius
- Status changed from Seen to New
- Assignee changed from Nico Schottelius to ll nu
Well. done. Balazs, can you confirm/ensure that you can
- reproduce the setup
- understand how monitoring is done there
- make changes to prometheus and grafana
Additionally, please create a new ticket for creating an email account on the external system and configuring alertmanager as follows:
Send alerts to
- Status changed from New to Seen
ABK is added to sre@
IMAP mailbox creation is pending
- Assignee changed from ll nu to Nico Schottelius
- Status changed from Seen to Rejected