ucloud (pre v1)

Preface

The main development work is described in ticket #6869.
This manual is at an early stage of development.

Notations:

  • $DOMAIN: refers to the DNS domain that you will use for running ucloud

Administrator Manual

(being written: how to set up the ucloud infrastructure)

Requirements

ucloud is an opinionated cloud management framework. It is built to solve the problem of virtual machine hosting, including migrations, scale-out, etc. While there are many ways to run, store and distribute VMs, few of them work reliably and scale out. For this reason ucloud is based on existing software that has proven itself in virtual machine hosting.

The following components are required for running ucloud:

1. Python 3.7, pip, pipenv (the programming language of choice)
2. etcd (for data storage)
3. ceph (for storage of VMs)
4. Guacamole (for console access)

Setting up these components is outside the scope of this manual.

Base installation

Note: This installation guide assumes that the home directory (~) is /root

  • Install QEMU
    sudo apt install qemu qemu-kvm
    
  • Create Directory Structure
    mkdir -p /var/vm/
    mkdir -p /var/www/
    
  • Create User Directories
    mkdir /var/www/ahmedbilal-admin
    mkdir /var/www/nico
    mkdir /var/www/kjg
    cd ~
    
  • Download and place Alpine Linux image in /var/www/ahmedbilal-admin
    wget "https://www.dropbox.com/s/5eyryhxun6847hx/alpine.zip?dl=1" -O alpine.zip
    unzip alpine.zip
    mv alpine.qcow2 /var/www/ahmedbilal-admin
    
  • Clone repos
    git clone --recurse-submodules git@code.ungleich.ch:ungleich-public/ucloud-api.git
    git clone --recurse-submodules git@code.ungleich.ch:ungleich-public/ucloud-scheduler.git
    git clone --recurse-submodules git@code.ungleich.ch:ungleich-public/ucloud-vm.git
    git clone --recurse-submodules git@code.ungleich.ch:ungleich-public/ucloud-file-scan.git
    git clone --recurse-submodules git@code.ungleich.ch:ungleich-public/ucloud-image-scanner.git
    

Env file installation

  • Run "ucloud-api" as daemon
    cd ucloud-api
    pipenv install
    pipenv run gunicorn --bind [::]:80 main:app --daemon
    wget "https://www.dropbox.com/s/lyu3atymxyw3846/setup.py?dl=1" -O setup.py
    pipenv run python setup.py
    cd ~
    
  • Run "ucloud-scheduler" as daemon
    cd ucloud-scheduler
    pipenv install
    pipenv run python main.py &>/dev/null &
    cd ~
    
  • Run "ucloud-vm" as daemon
    cd ucloud-vm
    pipenv install
    pipenv run python main.py /v1/host/1 --vm True &>/dev/null &
    cd ~
    
  • Install ucloud-file-scan
    cd ucloud-file-scan
    pipenv install
    cd ~
    
  • Install ucloud-image-scanner
    cd ucloud-image-scanner
    pipenv install
    cd ~
    
  • Run crontab -e, add the following entries, then save and exit (in the nano editor: press Ctrl+x, then y, then Enter)
    */1 * * * * (cd /root/ucloud-file-scan && /usr/local/bin/pipenv run python main.py)
    */1 * * * * (cd /root/ucloud-image-scanner/ && /usr/local/bin/pipenv run python main.py)
    

User Manual

(being written: how to use ucloud as a user)

Creating a VM

(to be filled in by ahmed)

Viewing the console of the VM

  • Go to console.$DOMAIN
  • Log in with your username and password
  • Select the VM that you want to view

Deleting a VM

API reference

(to be moved here by Ahmed)

ucloud-api

It handles the (indirect) communication between the outside world and the internal systems.

Its responsibilities are:
  • Create VM (Done)
  • Delete VM (Done)
  • Status VM (Done)
  • Create new network
  • Attach network to VM
  • Detach network from VM
  • Delete network

ucloud-scheduler

It schedules and reschedules VMs that are created using ucloud-api. It does so by watching for changes under the /v1/vm/ prefix. It deals with two kinds of etcd entries:

1. Virtual machine entries, which look like

/v1/vm/1
{
  "owner": "ahmedbilal-admin",
  "specs": {
    "cpu": 20,
    "ram": 2,
    "hdd": 10,
    "ssd": 10
  },
  "hostname": "",
  "status": "REQUESTED_NEW" 
}

2. Host entries, which look like

/v1/host/1
{
    "cpu": 32,
    "ram": 128,
    "hdd": 1024,
    "ssd": 0,
    "status": "UP"
}

It loops through the host entries found under /v1/host/ and checks whether the incoming VM can run on each host. If it can, ucloud-scheduler sets the hostname key of the incoming VM to that host. Suppose the scheduler decides that the VM /v1/vm/1 shown earlier can run on /v1/host/1. The updated entry would then look like

/v1/vm/1
{
  "owner": "ahmedbilal-admin",
  "specs": {
    "cpu": 20,
    "ram": 2,
    "hdd": 10,
    "ssd": 10
  },
  "hostname": "/v1/host/1",
  "status": "REQUESTED_NEW" 
}

Note the hostname key/value pair.
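
The host selection described above can be sketched as a simple first-fit loop. This is only a minimal illustration, not the actual ucloud-scheduler code: the function names fits and schedule are hypothetical, the comparison against every spec key is an assumption, and resources already used by other VMs are ignored. A strict comparison on every key would reject the example pair shown earlier (the VM asks for 10 ssd, the host offers 0), so the sketch uses its own hypothetical host entries.

```python
def fits(vm_specs, host):
    """Return True if the host is UP and offers at least the requested resources."""
    if host.get("status") != "UP":
        return False
    return all(host.get(key, 0) >= amount for key, amount in vm_specs.items())


def schedule(vm, hosts):
    """Set the VM's hostname to the first suitable host key (first-fit)."""
    for host_key in sorted(hosts):
        if fits(vm["specs"], hosts[host_key]):
            vm["hostname"] = host_key
            break
    return vm


# Hypothetical host entries, keyed the same way as in etcd.
hosts = {
    "/v1/host/1": {"cpu": 8, "ram": 16, "hdd": 1024, "ssd": 0, "status": "UP"},
    "/v1/host/2": {"cpu": 32, "ram": 128, "hdd": 1024, "ssd": 512, "status": "UP"},
}
vm = {
    "owner": "ahmedbilal-admin",
    "specs": {"cpu": 20, "ram": 2, "hdd": 10, "ssd": 10},
    "hostname": "",
    "status": "REQUESTED_NEW",
}

scheduled = schedule(vm, hosts)
print(scheduled["hostname"])  # /v1/host/2
```

If no host fits, the hostname stays empty and the VM remains unscheduled.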

ucloud-vm

It is responsible for creating, deleting, starting, shutting down, suspending, resuming, migrating and monitoring virtual machines. It does that by watching the /v1/vm/ prefix for events and acting upon them.
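
As a rough illustration of "watching a prefix and acting on events", the sketch below filters a stream of change events down to the VMs assigned to one host. It is not the real ucloud-vm code and does not use an etcd client: the events list is a stand-in for an etcd watch, and the names watch_prefix and events_for_host are hypothetical. Filtering by the hostname key is an assumption based on the host key passed to ucloud-vm on its command line.

```python
import json


def watch_prefix(events, prefix):
    """Yield (key, decoded value) for every event whose key is under the prefix."""
    for key, raw_value in events:
        if key.startswith(prefix):
            yield key, json.loads(raw_value)


def events_for_host(events, host_key):
    """Return the VM events that the ucloud-vm instance on host_key should handle."""
    return [
        (key, vm)
        for key, vm in watch_prefix(events, "/v1/vm/")
        if vm.get("hostname") == host_key
    ]


# Simulated change events as (key, JSON string) pairs; a stand-in for an etcd watch.
events = [
    ("/v1/vm/1", json.dumps({"hostname": "/v1/host/1", "status": "REQUESTED_START"})),
    ("/v1/host/1", json.dumps({"status": "UP"})),
    ("/v1/vm/2", json.dumps({"hostname": "/v1/host/2", "status": "REQUESTED_START"})),
]

mine = events_for_host(events, "/v1/host/1")
print([key for key, _ in mine])  # ['/v1/vm/1']
```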

Current Problems

Intermingling of Actual State + Requested State

Currently, a VM can have the following states:

  1. REQUESTED_NEW (assigned to a brand new VM)
  2. SCHEDULED_DEPLOY (ucloud-scheduler has assigned a host to the VM. ucloud-vm looks for such VMs, creates their image files in Ceph and sets their status to REQUESTED_START)
  3. REQUESTED_START (ucloud-vm starts the VM)
  4. RUNNING (the VM is running)
  5. REQUESTED_SUSPEND (the user requested to suspend execution of the VM. If the suspension request is accepted, the VM goes to the SUSPENDED state; otherwise it remains in REQUESTED_SUSPEND)
  6. SUSPENDED (the VM is suspended, i.e. no CPU is being used but the memory is still owned/maintained. It is not yet clear what happens to the network resources; assume they are still owned/maintained)
  7. REQUESTED_RESUME (the user requested to resume execution of the VM. If the resumption request is accepted, the VM goes to the RUNNING state; otherwise it remains in REQUESTED_RESUME)
  8. REQUESTED_SHUTDOWN (the user requested to shut down the VM. If the shutdown request is accepted, the VM goes to the STOPPED state; otherwise it remains in REQUESTED_SHUTDOWN)
  9. STOPPED (the VM is stopped)
  10. KILLED (the VM is killed, i.e. ucloud-vm detects that it is not running when it should be, or ucloud-scheduler detects that the host on which the VM runs is dead and sets all VMs running on that host to KILLED)
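
The transitions stated explicitly in the list above can be written down as a small lookup table. This is a sketch for illustration only: the names ON_ACCEPT and next_status are hypothetical, and only transitions that the list spells out are included.

```python
# Status a VM moves to once the pending step or request succeeds.
ON_ACCEPT = {
    "SCHEDULED_DEPLOY": "REQUESTED_START",
    "REQUESTED_START": "RUNNING",
    "REQUESTED_SUSPEND": "SUSPENDED",
    "REQUESTED_RESUME": "RUNNING",
    "REQUESTED_SHUTDOWN": "STOPPED",
}


def next_status(status, accepted):
    """Return the status after processing; a rejected request keeps its status."""
    if accepted and status in ON_ACCEPT:
        return ON_ACCEPT[status]
    return status


print(next_status("REQUESTED_SUSPEND", accepted=True))   # SUSPENDED
print(next_status("REQUESTED_SUSPEND", accepted=False))  # REQUESTED_SUSPEND
```

The second call shows the weak spot of a single status field: after a rejection the VM is left in the REQUESTED_* state, and the state it had before the request is no longer recorded.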

There are some problems with the current design. For example, we can't track the previous state when a user requests some action. Suppose our VM is in the KILLED state and the user issues REQUESTED_SUSPEND. ucloud-vm would see that the VM isn't running and reject the request, but the VM would stay in REQUESTED_SUSPEND. This creates a second problem: ucloud-scheduler only schedules VMs that are either brand new (status REQUESTED_NEW) or have status KILLED. Because the status was changed to REQUESTED_SUSPEND by the user's request, ucloud-scheduler won't schedule the VM on any host.

Solution

  1. Have two statuses (so that we check the actual state before taking any action)
    • One for the actual state (in our previous case KILLED)
    • One for the requested state (in our previous case REQUESTED_SUSPEND)
  2. Make ucloud-api more intelligent (for example, if the VM is not running, ucloud-api should not accept REQUESTED_SUSPEND, i.e. we won't change the status of the VM from KILLED to REQUESTED_SUSPEND)
    • Check whether the request is valid before changing the status of the VM.
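
A minimal sketch of how the two proposals could fit together is shown below. The field names actual_status and requested_status, the function request_action and the VALID_FROM table are all hypothetical, and the set of states from which each request is allowed is an assumption made for the example, not a decision recorded on this page.

```python
# Actual states from which each request would be accepted (assumed for illustration).
VALID_FROM = {
    "REQUESTED_START": {"STOPPED"},
    "REQUESTED_SUSPEND": {"RUNNING"},
    "REQUESTED_RESUME": {"SUSPENDED"},
    "REQUESTED_SHUTDOWN": {"RUNNING", "SUSPENDED"},
}


def request_action(vm, requested):
    """Record the request only if it is valid for the VM's actual state."""
    if vm["actual_status"] not in VALID_FROM.get(requested, set()):
        return False  # rejected: neither field is modified
    vm["requested_status"] = requested
    return True


killed_vm = {"actual_status": "KILLED", "requested_status": None}
running_vm = {"actual_status": "RUNNING", "requested_status": None}

print(request_action(killed_vm, "REQUESTED_SUSPEND"))   # False
print(killed_vm["actual_status"])                       # KILLED
print(request_action(running_vm, "REQUESTED_SUSPEND"))  # True
```

Because the rejected request never overwrites the actual state, the killed VM keeps its KILLED status and remains eligible for rescheduling.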

Updated by Ahmed Bilal 4 months ago · 16 revisions