Task #6519

Stabilisation spring 2019

Added by Nico Schottelius over 2 years ago. Updated about 2 years ago.

Status: New
Priority: Urgent
Assignee: -
Target version: -
Start date: 03/10/2019
Due date: 03/27/2019
% Done: 0%
Estimated time:
PM Check date: 03/13/2019

Description

We need to add / change:

  • CHANGE mirror.ungleich.ch cannot be used in cdist types that are used on servers
    • I suggest to introduce "mirror.placeX.ungleich.ch" in each place
    • I suggest to point mirror.placeX.ungleich.ch to the active router IP (IPv6 only)
    • I suggest to use http[s]://mirror.placeX.ungleich.ch in the netboot image and cdist
  • CREATE mirror.placeX.ungleich.ch
    • Install nginx on both routers
    • Ensure that the essential packages are present
      • Devuan packages
      • hwraid repo
      • consul / prometheus (?)
    • Only usable for our own range (i.e. 2a0a:e5c0::/29 and the other v6 network)
    • Use nftables for it
  • code.ungleich.ch cannot be used in cdist types that are used on servers:
    • code.ungleich.ch is a VM
    • if the VM is down, servers don't get configured
    • Suggestion: mirror the ungleich-tools repo to the routers, mirror.placeX..., accessible by http(s)
  • We will try to connect all systems to the UPS ONLY
    • My theory is that they experience outages because they are also connected to the regular grid
    • We started with router1
    • Need to get in touch with Juanito or Bernegger (electricity company) to test whether a UPS-only setup works
  • Update the monitoring infrastructure:
    • Ensure that Prometheus (port 9090) is not reachable without authentication
    • Ensure that there is a single entry point for both monitoring systems
    • Ensure that changes (dashboards) are saved to both monitoring systems
    • Ensure that all production systems are monitored
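
The access restriction in the CREATE item (mirror only usable from our own range) is meant to be enforced with nftables on the routers. To sanity-check which client addresses would fall inside the allowed range, a minimal Python sketch could look like this. Only 2a0a:e5c0::/29 comes from this ticket; the "other v6 network" is not named here, so it is left out rather than guessed. The function name is hypothetical.

```python
import ipaddress

# Only this prefix is named in the ticket. The "other v6 network" is not
# specified there and has to be added once it is known.
ALLOWED_NETWORKS = [ipaddress.ip_network("2a0a:e5c0::/29")]


def is_allowed_client(address: str) -> bool:
    """Return True if the address lies inside one of the allowed networks."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not a valid IP address at all
        return False
    # Compare only against networks of the same IP version
    return any(ip.version == net.version and ip in net for net in ALLOWED_NETWORKS)
```

With this list, 2a0a:e5c1::1 counts as inside the range, while 2a0a:e5c8::1 and any IPv4 address do not.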

Objective:

  • Servers only depend on routers
  • If there is no network to the outside, servers are still booted/configured
  • If VMs are down, servers are still booted/configured
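
One possible way to track progress towards this objective is to scan the cdist types for the two hostnames that, according to the description, must not be used on servers. The following is a rough Python sketch under that assumption. Only the two hostnames come from this ticket; the function name and the way the text is passed in are hypothetical.

```python
import re

# Hostnames that, per the description, must not be used in cdist types
# that are used on servers.
FORBIDDEN_HOSTS = ("mirror.ungleich.ch", "code.ungleich.ch")

# Match a forbidden host only when it is not part of a longer hostname,
# so per-place names of the form mirror.placeX.ungleich.ch are not flagged.
_PATTERN = re.compile(
    r"(?<![A-Za-z0-9.-])("
    + "|".join(re.escape(host) for host in FORBIDDEN_HOSTS)
    + r")(?![A-Za-z0-9-])"
)


def find_forbidden_hosts(text: str) -> list[tuple[int, str]]:
    """Return (line number, hostname) for every forbidden host found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

Feeding the content of each cdist type file into find_forbidden_hosts would list the lines that still depend on a VM-hosted service.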

Jason, can you coordinate this with Dominique (consultant), Jin-Guk (implementation) and Roli/Marc/Sami (learning, understanding)?

History

#1

Updated by Nico Schottelius over 2 years ago

  • Description updated (diff)

#2

Updated by Jason Kim over 2 years ago

  • PM Check date set to 03/11/2019
  • Status changed from New to Seen

#3

Updated by Jason Kim over 2 years ago

  • PM Check date changed from 03/11/2019 to 03/13/2019
  • Status changed from Seen to In Progress
  • Due date set to 03/27/2019

#4

Updated by Mirjana Rupar about 2 years ago

Jason Kim, can you please update the status of the task?

#5

Updated by Nico Schottelius about 2 years ago

  • Assignee changed from Jason Kim to Jin-Guk Kwon

Better suited for Jin-Guk

#6

Updated by Nico Schottelius about 2 years ago

  • Priority changed from Normal to Urgent
  • Assignee deleted (Jin-Guk Kwon)
  • Status changed from In Progress to New
  • Project changed from datacenterlight to queue

Important, but queued for the moment.
