Task #6541

Monitor fans and power supplies

Added by Nico Schottelius over 2 years ago. Updated 7 months ago.

Status:
In Progress
Priority:
Normal
Target version:
-
Start date:
03/21/2019
Due date:
04/17/2019
% Done:

0%

Estimated time:
PM Check date:
04/10/2019

Description

  • Fans and PSUs of switches need to be monitored
    • Probably using SNMP to get the metrics
    • Maybe/likely storing them in Prometheus
  • Fans and PSUs of servers need to be monitored
    • Probably using local ipmitool
    • Maybe extending node_exporter or writing our own exporter
    • A suitable exporter might already exist
    • Saving in Prometheus

For both, appropriate alarms should be created.
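
To illustrate the "writing our own exporter" option for servers, here is a minimal sketch that turns sensor readings into Prometheus text-format lines. It is only an illustration under assumptions: the sample input imitates the comma-separated layout that freeipmi's ipmi-sensors is assumed to print, the metric name server_fan_rpm and the host name are made up, and PSU rows are ignored here.

```python
import csv
import io

# Hypothetical sample; the column layout is an assumption about what
# `ipmi-sensors --comma-separated-output` prints, not captured output.
SAMPLE = """ID,Name,Type,Reading,Units,Event
1,Fan1,Fan,3600.00,RPM,'OK'
2,Fan2,Fan,N/A,RPM,N/A
3,PS1 Status,Power Supply,N/A,N/A,'Presence detected'
"""


def fan_metrics(sensor_csv: str, host: str) -> list[str]:
    """Convert fan rows into Prometheus text-format lines.

    The metric name `server_fan_rpm` is invented for this sketch.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(sensor_csv)):
        if row["Type"] != "Fan":
            continue  # only fan sensors are handled in this sketch
        try:
            rpm = float(row["Reading"])
        except ValueError:
            continue  # sensor listed but without a numeric reading
        name = row["Name"]
        lines.append(f'server_fan_rpm{{host="{host}",name="{name}"}} {rpm:g}')
    return lines


if __name__ == "__main__":
    print("\n".join(fan_metrics(SAMPLE, "server1.example")))
```

An existing exporter would make this unnecessary; the sketch only shows how little is needed to get fan readings into a scrapeable format.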

History

#1

Updated by Jason Kim over 2 years ago

  • PM Check date set to 03/30/2019
#2

Updated by Jason Kim over 2 years ago

  • PM Check date changed from 03/30/2019 to 03/31/2019
  • Status changed from New to Seen
#3

Updated by Jason Kim over 2 years ago

  • PM Check date changed from 03/31/2019 to 04/10/2019
  • % Done changed from 0 to 10
  • Assignee changed from Jason Kim to Jin-Guk Kwon
  • Status changed from Seen to In Progress
  • Due date set to 04/17/2019
#4

Updated by Jason Kim over 2 years ago

  • % Done changed from 10 to 0
  • Assignee changed from Jin-Guk Kwon to Samuel Hailu
#5

Updated by Nico Schottelius over 2 years ago

Not sure if this is a good fit for Samuel.

#6

Updated by Mirjana Rupar about 2 years ago

  • Assignee deleted (Samuel Hailu)
#7

Updated by Mirjana Rupar about 2 years ago

  • Project changed from backlog to queue
#8

Updated by Mirjana Rupar about 2 years ago

  • Status changed from In Progress to New
#9

Updated by Nico Schottelius about 2 years ago

  • Assignee set to Dominique Roux
  • Project changed from queue to Open Infrastructure
#10

Updated by Dominique Roux about 2 years ago

  • Status changed from New to Seen
#12

Updated by Nico Schottelius about 2 years ago

Most important for this task is handing over / explaining it to llnu & kjg.

#13

Updated by Dominique Roux about 2 years ago

Current freeipmi version available on Devuan: 1.4.11
Version needed for IPv6 support: >= 1.6.1
...
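
A quick way to check an installed version against that requirement could look like the sketch below. It assumes plain dotted version strings (a Debian-style suffix such as "-1" would need to be stripped first), and the function names are made up for illustration. Comparing integer tuples avoids the string-comparison trap where "1.10" sorts before "1.6".

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Split a plain dotted version like '1.4.11' into integers."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_ipv6(installed: str, required: str = "1.6.1") -> bool:
    """Return True if the installed freeipmi version meets the minimum."""
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    for version in ("1.4.11", "1.6.1"):
        print(version, supports_ipv6(version))
```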

#14

Updated by Dominique Roux about 2 years ago

Nico Schottelius wrote:

Most important for this task is handing over / explaining it to llnu & kjg.

OK, so in general, the IPMI part should not be a big deal:
I recommend running the ipmi exporter only on monitoring.place{5,6} and using remote IPMI to access the servers. That way we don't need to change the BootOS, and we can get the data as long as the iDRAC interface is up.

IPMI

  • Getting the required version running on monitoring.place{5,6}
    • Either compile it yourself or check whether the prepackaged version for Debian buster works
    • It is probably best to move it to our package mirror: add a new repository (probably the ungleich repo) on the monitoring hosts and install it from there
  • Installing ipmi_exporter on monitoring.place{5,6}
  • Configuring Prometheus accordingly
  • Put everything in cdist:
    • Update the monitoring.place{5,6} cdist manifest (to add the new repository and install the required freeipmi package)
    • Update the Prometheus config according to the example config in the README
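
For the Prometheus side, one option is to keep the list of iDRAC targets in a file that Prometheus reads through file-based service discovery. The sketch below generates such a file. The host names and the job label are placeholders, and the actual scrape job (including any relabeling the exporter needs) should still follow the ipmi_exporter README.

```python
import json


def file_sd_targets(hosts: list[str], labels: dict[str, str]) -> str:
    """Render hosts as a Prometheus file_sd JSON document.

    file_sd expects a list of groups, each with "targets" and "labels".
    """
    groups = [{"targets": sorted(hosts), "labels": labels}]
    return json.dumps(groups, indent=2)


if __name__ == "__main__":
    # Placeholder iDRAC host names, not taken from the real inventory.
    idracs = ["idrac-server2.example", "idrac-server1.example"]
    print(file_sd_targets(idracs, {"job": "ipmi"}))
```

Generating the file from one host list would also fit the cdist step, since the manifest could then own a single source of truth for the targets.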
#15

Updated by Dominique Roux about 2 years ago

SNMP

Documentation for enabling SNMP on the Arista switches: https://www.arista.com/en/um-eos/eos-section-43-3-configuring-snmp#ww1159793

The installation of the snmp-exporter is quite similar to that of freeipmi:
  • Get the prebuilt package from https://github.com/prometheus/snmp_exporter/releases
  • Create a Devuan package and move it to the ungleich mirror
    • Don't forget the init.d file
    • Use the same path hierarchy as the other prometheus packages (also log etc.)
  • Install the exporter on monitoring.place{5,6}
  • Create the config file according to the README
  • cdistify everything (similar to IPMI)
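
Once the exporter is running, a manual test of a single switch can help before wiring it into Prometheus. The sketch below only builds the probe URL. It assumes the exporter listens on its default port 9116 and serves probes at /snmp with target and module parameters, as I recall from the snmp_exporter README. The switch name is a placeholder, and if_mib is only an example module: fan and PSU metrics will likely need a different module.

```python
from urllib.parse import urlencode


def snmp_probe_url(exporter: str, target: str, module: str = "if_mib") -> str:
    """Build the URL that asks snmp_exporter to probe one device."""
    query = urlencode({"target": target, "module": module})
    return f"http://{exporter}/snmp?{query}"


if __name__ == "__main__":
    # Placeholder switch name; run this on a monitoring host.
    print(snmp_probe_url("localhost:9116", "switch1.example"))
```

Fetching the printed URL with curl on monitoring.place{5,6} should return metrics in Prometheus text format if SNMP is enabled on the switch and the port is reachable.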
#16

Updated by Dominique Roux about 2 years ago

For both services you'll have to open up ports.
Open them only from the inside (so that only monitoring.place{5,6} is allowed to talk to these ports).

#17

Updated by Nico Schottelius about 2 years ago

  • Dominique, please approach Jin-Guk Kwon in chat/infrastructure and implement it together with him
  • Jin-Guk Kwon: please read the ticket and ping Dominique Roux when you have understood it -> you'll then implement it together
#18

Updated by Nico Schottelius 7 months ago

  • Assignee changed from Dominique Roux to Nico Schottelius
