Task #6519

Stabilisation spring 2019

Added by Nico Schottelius over 2 years ago. Updated over 2 years ago.

Status: New
Priority: Urgent
Assignee: -
Target version: -
Start date: 03/10/2019
Due date: 03/27/2019 (over 2 years late)
% Done: 0%
Estimated time:
PM Check date: 03/13/2019

Description

We need to add / change:

  • CHANGE mirror.ungleich.ch cannot be used in cdist types that are used on servers
    • I suggest introducing "mirror.placeX.ungleich.ch" in each place
    • I suggest pointing mirror.placeX.ungleich.ch to the active router IP (IPv6 only)
    • I suggest using http[s]://mirror.placeX.ungleich.ch in the netboot image and cdist
  • CREATE mirror.placeX.ungleich.ch
    • Install nginx on both routers
    • Ensure that the essential packages are present
      • Devuan packages
      • hwraid repo
      • consul / prometheus (?)
    • Restrict access to our own ranges (i.e. 2a0a:e5c0::/29 and the other v6 network)
    • Use nftables to enforce this restriction
  • code.ungleich.ch cannot be used in cdist types that are used on servers:
    • code.ungleich.ch is a VM
    • if the VM is down, servers don't get configured
    • Suggestion: mirror the ungleich-tools repo to the routers (mirror.placeX...), accessible via http(s)
  • We will try to connect all systems to the UPS only
    • My theory is that they experience outages because they are also connected to the regular grid
    • We started with router1
    • Need to get in touch with Juanito or Bernegger (electricity company) to test whether the UPS-only setup works
  • Update the monitoring infrastructure:
    • Ensure that Prometheus (port 9090) is not reachable without authentication
    • Ensure that there is one entry point for both monitoring systems
    • Ensure that changes (dashboards) are saved to both monitoring systems
    • Ensure that all production systems are monitored
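
The access restriction for the mirror ("only usable for our own range") can be illustrated with a small check. This is only a sketch in Python to make the rule concrete, not the nftables ruleset itself; the function name `is_allowed_client` and the list `ALLOWED_NETWORKS` are made up for illustration, and "the other v6 network" is not specified in this ticket, so it is deliberately left out.

```python
import ipaddress

# Range named in this ticket. "The other v6 network" is not specified here,
# so it has to be appended by whoever knows it (assumption: one list entry
# per allowed network).
ALLOWED_NETWORKS = [ipaddress.ip_network("2a0a:e5c0::/29")]


def is_allowed_client(address: str) -> bool:
    """Return True if `address` parses as an IP inside an allowed network."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not an IP address at all: reject.
        return False
    # Compare only against networks of the same IP version (IPv6 only here).
    return any(
        ip.version == net.version and ip in net for net in ALLOWED_NETWORKS
    )


if __name__ == "__main__":
    for candidate in ("2a0a:e5c0::1", "2a0a:e5c8::1", "192.0.2.1"):
        print(candidate, is_allowed_client(candidate))
```

A check like this could be used to test sample client addresses against the intended range before the actual firewall rules are written.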

Objective:

  • Servers only depend on routers
  • If there is no network to the outside, servers are still booted/configured
  • If VMs are down, servers are still booted/configured
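
To verify the objective, one option is to scan cdist types and netboot scripts for the two hostnames this ticket says servers must not depend on. The following is a minimal sketch in Python under that assumption; `find_forbidden_hosts` and `FORBIDDEN_HOSTS` are hypothetical names, and `mirror.place5.ungleich.ch` in the example is only a stand-in for mirror.placeX.ungleich.ch.

```python
import re

# Hostnames that, per this ticket, must not be used in cdist types that run
# on servers.
FORBIDDEN_HOSTS = ("mirror.ungleich.ch", "code.ungleich.ch")

# Matches dotted hostnames; the lookbehind ensures we start at the beginning
# of a hostname, so "mirror.place5.ungleich.ch" is seen as one name.
_HOST_RE = re.compile(r"(?<![A-Za-z0-9.-])([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def find_forbidden_hosts(text: str) -> list[str]:
    """Return the forbidden hostnames in `text`, in order of first use."""
    found = []
    for host in _HOST_RE.findall(text):
        host = host.lower()
        if host in FORBIDDEN_HOSTS and host not in found:
            found.append(host)
    return found


if __name__ == "__main__":
    sample = (
        "wget https://mirror.ungleich.ch/devuan/\n"
        "wget http://mirror.place5.ungleich.ch/devuan/\n"
    )
    print(find_forbidden_hosts(sample))
```

Running such a check over the type directories would give a list of places that still need to be switched to the per-place mirror.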

Jason, can you coordinate this with Dominique (consultant), Jin-Guk (implementation) and Roli/Marc/Sami (learning, understanding)?
