Task #6869

Updated by Nico Schottelius over 5 years ago

{{toc}} 

 h2. Introduction 

 For proper growth and stability, we need to challenge our setup. We will try to replace OpenNebula with ucloud and at the same time challenge OpenStack. 

 This is not only a technical project but also a public one: Sanghee will write its public story. 

 ucloud supports IPv6 first and might only support IPv4 via NAT64 or proxying. 

 h2. General requirements 

 * a fully portable cloud management system that is API based and exposes all internals without secret keys 
 ** supporting console via guacamole 
 ** users in ldap 
 ** api authentication in ungleich-otp 
 ** firewalling in ufirewall (or similar) 
 * A great team 
 * To be built in less than 100 days 


 h2. Technology stack 

 * python3 
 ** easy to read 
 * flask 
 ** easy to understand 
 * ldap 
 ** well known 
 * ungleich-otp 
 ** for API authentication 
 * etcd 
 ** storing VMs, networks, etc. 
 * nft (Linux), pf (BSD) 
 * JSON 
 ** describing the data, easy to handle 
 * Prometheus 
 ** For monitoring hosts and VMs 
 * VXLAN 
 ** For networks 
 * Ceph 
 ** as a datastore 

 h2. Technical requirements 

 * ucloud should be portable 
 ** While the primary target is Linux, it should run on FreeBSD/OpenBSD as well 
 * There should be no single point of failure 
 ** APIs should be announced via BGP to the routers 
 ** Switches will then use ECMP to load balance 
 ** All APIs write to a distributed data store (v1: etcd) -> all data is distributed, too. 
 * Fast dead host detection 
 ** Dead hosts should be detected quickly, and their VMs should be rescheduled quickly 


 h2. Components 

 h3. User API 

 Entry point for communication with the user through the CLI. Flask based. Allows for the following actions in v1: 

 * Create VM 
 * Delete VM 
 * Create new network 
 * Attach network to VM 
 * Detach network from VM 
 * Delete network 
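
 The six v1 actions can be sketched as plain functions that write JSON records to the data store. This is a minimal sketch, not the ucloud implementation: a dict stands in for etcd, the functions are framework independent (a Flask view could call them), and the /v1/network/ prefix and all record fields (owner, cores, ram_gb, image, networks, host) are assumptions made for illustration. Only the /v1/vm/ prefix appears in this document.

```python
import json
import uuid

# Only /v1/vm/ is named in this document; /v1/network/ is a hypothetical sibling.
VM_PREFIX = "/v1/vm/"
NETWORK_PREFIX = "/v1/network/"


def create_vm(store, owner, cores, ram_gb, image):
    """Store a new VM description as JSON and return its key."""
    key = VM_PREFIX + uuid.uuid4().hex
    store[key] = json.dumps({
        "owner": owner, "cores": cores, "ram_gb": ram_gb,
        "image": image, "networks": [], "host": None,
    })
    return key


def delete_vm(store, key):
    """Remove a VM description; return True if it existed."""
    return store.pop(key, None) is not None


def create_network(store, owner):
    """Store a new network description and return its key."""
    key = NETWORK_PREFIX + uuid.uuid4().hex
    store[key] = json.dumps({"owner": owner})
    return key


def delete_network(store, key):
    """Remove a network description; return True if it existed."""
    return store.pop(key, None) is not None


def attach_network(store, vm_key, network_key):
    """Add an existing network to the VM's network list (idempotent)."""
    if network_key not in store:
        raise KeyError(network_key)
    vm = json.loads(store[vm_key])
    if network_key not in vm["networks"]:
        vm["networks"].append(network_key)
    store[vm_key] = json.dumps(vm)


def detach_network(store, vm_key, network_key):
    """Remove a network from the VM's network list."""
    vm = json.loads(store[vm_key])
    vm["networks"] = [n for n in vm["networks"] if n != network_key]
    store[vm_key] = json.dumps(vm)
```

 Keeping the handlers free of web framework code means the same functions can be exercised from tests and from the CLI without an HTTP server.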

 h3. User web interface 

 Might be based on original dynamicweb code (https://code.ungleich.ch/ungleich-public/dynamicweb). 

 h3. Scheduler 

 The scheduler knows about hosts, their capacities and their usage. The scheduler decides which VM gets scheduled where. The scheduler is also responsible for rescheduling VMs (for instance, because another host has become a better fit for a specific VM). 

 How it works: 

 * Has a list of hosts for usage 
 * Knows about the capacity (installed cores, installed ram) of a host 
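
 A placement decision based on these two inputs could look like the following. This is a minimal sketch under assumptions: the Host class, its field names and the "most free RAM wins" policy are hypothetical and not taken from ucloud.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cores: int            # installed cores
    ram_gb: int           # installed RAM
    used_cores: int = 0
    used_ram_gb: int = 0

    def fits(self, cores, ram_gb):
        """True if the host has enough free cores and RAM for the VM."""
        return (self.cores - self.used_cores >= cores
                and self.ram_gb - self.used_ram_gb >= ram_gb)


def schedule(hosts, cores, ram_gb):
    """Pick the fitting host with the most free RAM and reserve the resources.

    Returns the chosen Host, or None if no host can take the VM.
    """
    candidates = [h for h in hosts if h.fits(cores, ram_gb)]
    if not candidates:
        return None
    best = max(candidates, key=lambda h: h.ram_gb - h.used_ram_gb)
    best.used_cores += cores
    best.used_ram_gb += ram_gb
    return best
```

 Rescheduling can reuse the same function: release the resources on the old host, then call schedule again for the VM.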

 h3. Host manager 

 Manages hosts. If a host crashes, it instructs the scheduler to restart the affected VMs (the new host will be selected by the scheduler). 
 If a host is added, the scheduler can use it. 
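
 One way to detect a crashed host is a heartbeat timeout. The following is a minimal sketch, assuming every host periodically records a timestamp; the function names, the data shapes and the 10 second timeout are hypothetical.

```python
def find_dead_hosts(last_seen, now, timeout=10.0):
    """Return the hosts whose last heartbeat is older than `timeout` seconds.

    last_seen maps host name -> timestamp of the last heartbeat.
    """
    return sorted(host for host, ts in last_seen.items() if now - ts > timeout)


def vms_to_reschedule(placement, dead_hosts):
    """Return the VMs that were placed on one of the dead hosts.

    placement maps VM id -> host name.
    """
    dead = set(dead_hosts)
    return sorted(vm for vm, host in placement.items() if host in dead)
```

 The resulting VM list is what the host manager would hand to the scheduler for restart.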


 h3. VM manager 

 Starts and stops VMs. Runs on every VM host. 
 Reacts to information from the scheduler. This component needs to support qemu on Linux and bhyve and similar hypervisors on OpenBSD/FreeBSD. 

 How it works: 

 * watches a specific key in etcd, for instance: /v1/vm/ 
 * if a key is added, check if it is a VM that should be started on THIS host. 
 ** if yes, start it 
 * if a key is modified, check whether a VM that is running on this host should be stopped 
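
 The decision logic for a single watch event can be sketched independently of any etcd client library. This is a sketch under assumptions: the event types "added" and "modified", the JSON fields host and desired_state, and the set of running VM keys are hypothetical and chosen only to illustrate the steps above.

```python
import json

VM_PREFIX = "/v1/vm/"  # example prefix from this document


def handle_event(event_type, key, value, this_host, running):
    """Decide what to do for one watch event.

    Returns "start", "stop" or None and updates the `running` set of VM keys.
    """
    if not key.startswith(VM_PREFIX):
        return None
    vm = json.loads(value)
    if vm.get("host") != this_host:
        return None  # the VM belongs to another host
    if event_type == "added" and key not in running:
        running.add(key)
        return "start"
    if (event_type == "modified" and key in running
            and vm.get("desired_state") == "stopped"):
        running.discard(key)
        return "stop"
    return None
```

 The caller would translate "start" and "stop" into the hypervisor-specific command for the platform it runs on.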


 h3. Network manager 

 Creates and manages virtual layer 2 networks. Basically does the following: 

 * Create VXLAN with correct IPv6 multicast address 
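
 On Linux, creating such a network comes down to one iproute2 call per VXLAN. The following sketch only builds the argument list and does not execute it. The mapping from VXLAN ID to multicast group (a base of ff05::100:0 plus the ID) is an assumption made for illustration, as are the interface naming scheme and the underlay device name.

```python
import ipaddress

# Hypothetical convention: site-local multicast base with the VXLAN ID
# stored in the low 24 bits.
MULTICAST_BASE = int(ipaddress.IPv6Address("ff05::100:0"))


def vxlan_multicast_group(vni):
    """Map a 24-bit VXLAN ID to an IPv6 multicast group address."""
    if not 0 < vni < 2 ** 24:
        raise ValueError("VXLAN ID must be between 1 and 2**24 - 1")
    return ipaddress.IPv6Address(MULTICAST_BASE + vni)


def vxlan_create_command(vni, underlay_dev):
    """Build the `ip link add` arguments for a multicast VXLAN interface."""
    return [
        "ip", "link", "add", f"vxlan{vni}", "type", "vxlan",
        "id", str(vni),
        "group", str(vxlan_multicast_group(vni)),
        "dev", underlay_dev,
        "dstport", "4789",
    ]
```

 For example, vxlan_create_command(100, "eth0") yields the arguments for a vxlan100 interface in group ff05::100:64. On BSD the equivalent would be an ifconfig call, which this sketch does not cover.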

 h3. IPAM 

 Network address manager. Provides dhcpd/dhcp6d/router advertisements. 

 h3. Legacy IP supporter 

 For users who require legacy IP (IPv4), add a service that  

 * adds a 1:1 NAT64 entry 
 * adds a protocol based proxy entry 
 ** http(s) 
 ** smtp(s) 
 ** ssh jumphost 
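
 The bookkeeping behind 1:1 NAT64 entries can be sketched as a small allocation table that hands out one public IPv4 address per VM IPv6 address. This is a minimal sketch: the class name, the pool handling and the example addresses (documentation ranges) are assumptions, and the translation itself would still have to be configured in the packet filter.

```python
import ipaddress


class Nat64Table:
    """Keeps a 1:1 mapping between VM IPv6 addresses and pool IPv4 addresses."""

    def __init__(self, ipv4_pool):
        self._free = list(ipaddress.IPv4Network(ipv4_pool).hosts())
        self._by_v6 = {}

    def add(self, vm_ipv6):
        """Return the IPv4 address mapped to vm_ipv6, allocating one if needed."""
        v6 = ipaddress.IPv6Address(vm_ipv6)
        if v6 in self._by_v6:
            return self._by_v6[v6]
        if not self._free:
            raise RuntimeError("IPv4 pool exhausted")
        v4 = self._free.pop(0)
        self._by_v6[v6] = v4
        return v4

    def remove(self, vm_ipv6):
        """Drop the mapping and return the IPv4 address to the pool."""
        v4 = self._by_v6.pop(ipaddress.IPv6Address(vm_ipv6))
        self._free.append(v4)
        return v4
```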

 h3. Firewall 

 The VMs should not be able to interfere with other VMs or hosts in a malicious way. The following protective measures need to be implemented: 

 * prevent dhcpd answers in public networks 
 * prevent router advertisements in public networks 
 * prevent VM from using incorrect mac address 
 * prevent VM from using incorrect ipv6 addresses 
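
 On Linux, the four measures could be expressed as nft rules on the VM's tap interface. The following sketch only generates the rule text and has not been tested against a running nft. The table and chain names (bridge filter, forward), the tap interface name and the function name are assumptions, and the table and chain must already exist.

```python
import ipaddress


def vm_guard_rules(tap, mac, ipv6_addresses,
                   table="bridge filter", chain="forward"):
    """Build nft commands that restrict what a VM may send on its tap device."""
    allowed = ", ".join(str(ipaddress.IPv6Address(a)) for a in ipv6_addresses)
    prefix = f'add rule {table} {chain} iifname "{tap}"'
    return [
        # the VM may only use its assigned MAC address
        f"{prefix} ether saddr != {mac} drop",
        # the VM may only use its assigned IPv6 source addresses
        f"{prefix} ip6 saddr != {{ {allowed} }} drop",
        # no router advertisements from the VM
        f"{prefix} icmpv6 type nd-router-advert drop",
        # no DHCPv4 / DHCPv6 server answers from the VM
        f"{prefix} udp sport 67 drop",
        f"{prefix} udp sport 547 drop",
    ]
```

 The allowed address list needs to include the VM's link-local address, and a real rule set would also have to permit duplicate address detection, which uses the unspecified source address. A pf rule set for BSD is not covered here.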


 Note to Nico: ping reyk for possible involvement 

 h3. Image store 

 Ceph will be used for storing images. 

 Features: 

 * Allow uploading of images 
 * Allow cloning of images (required for starting a VM based on an existing image) 
 * Allow deleting of images 
 ** For cloned images after shutting down the VM 
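
 With Ceph RBD, cloning an image for a VM and deleting the clone afterwards map to a few rbd commands. The sketch below only builds the argument lists and does not run them. The pool name, the snapshot name "base" and the vm-<id> naming scheme are assumptions for illustration.

```python
def clone_image_commands(pool, image, vm_id, snapshot="base"):
    """Build the rbd commands that clone `image` for a new VM.

    A clone needs a protected snapshot as its parent; the first two
    commands only have to run once per image.
    """
    parent = f"{pool}/{image}@{snapshot}"
    return [
        ["rbd", "snap", "create", parent],
        ["rbd", "snap", "protect", parent],
        ["rbd", "clone", parent, f"{pool}/vm-{vm_id}"],
    ]


def delete_clone_command(pool, vm_id):
    """Build the rbd command that removes a VM's cloned image."""
    return ["rbd", "rm", f"{pool}/vm-{vm_id}"]
```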


 h3. Metadata 

 Provides access to 

 * public ssh keys 
 * other data users provide 

 How it works: 

 * VMs need well known entry point 
 * Should likely be DNS based 
 * Might be reachable by http://metadata 
 ** This excludes https! 
 ** Maybe network configuration can contain metadata server? 
 *** dhcp option 
 *** router advertisement? 
 ** Maybe by convention: metadata.$domain 
 *** $domain injected by dhcp/router advertisements 
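
 If the metadata.$domain convention is chosen, a VM could derive the metadata URL from the domain it received. This is a minimal sketch: the function name and the "ssh-keys" path are hypothetical, and no request is made.

```python
def metadata_url(domain, path="ssh-keys"):
    """Build the metadata URL from the domain injected via DHCP or router advertisements."""
    domain = domain.strip().strip(".").lower()
    if not domain:
        raise ValueError("no domain received from the network")
    return f"http://metadata.{domain}/{path.lstrip('/')}"
```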


 h3. Payment service 

 If a user requests a service and the service is not free, the user will be asked to pay for it. Should support at least 

 * credit card 
 * bank transfer 

 Might also manage existing money / coupon / etc. 


 



 h3. CLI 

 All services will be primarily available via the API; the web interface is a second-class citizen. The CLI might be based on / related to https://code.ungleich.ch/ungleich-public/ungleich-cli. 


 h3. Authentication 

 Time-based one-time tokens, as implemented in ungleich-otp, will be used for service authentication: https://code.ungleich.ch/ungleich-public/ungleich-otp 
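
 For reference, time-based one-time tokens can be computed with the standard library alone. The sketch below follows the generic TOTP algorithm from RFC 6238 (HMAC-SHA1, 30 second steps). It is not the ungleich-otp code, and ungleich-otp's actual parameters and API may differ. The verify helper and its window parameter are assumptions.

```python
import hashlib
import hmac
import struct


def totp(secret, now, step=30, digits=6):
    """Compute an RFC 6238 token for the bytes `secret` at Unix time `now`."""
    counter = int(now) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation as in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret, token, now, window=1, step=30, digits=6):
    """Accept tokens from the current step and `window` steps around it."""
    for delta in range(-window, window + 1):
        t = now + delta * step
        if t >= 0 and hmac.compare_digest(totp(secret, t, step, digits), token):
            return True
    return False
```

 Accepting neighbouring time steps tolerates small clock differences between the client and the service.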

 h2. Order of development 

 First, an MVP will be created that can 

 * create and delete VMs 

 Version 2 will then additionally support 

 * network service: so VMs get an IPv6 address 

 Version 3 will then additionally support 

 * metadata service (for injecting ssh keys and more) 
