Revision 14 (Ahmed Bilal, 07/30/2019 12:41 PM) → Revision 15/16 (Ahmed Bilal, 07/30/2019 01:45 PM)
h1. ucloud (pre v1)

{{toc}}

h2. Preface

The main development work is described in ticket #6869. This manual is in an *early development status*.

Notations:

* $DOMAIN: refers to a DNS domain that you will use for running ucloud

h2. Administrator Manual

(being written: how to set up the ucloud infrastructure)

h3. Requirements

ucloud is an opinionated cloud management framework. It is made to solve the problem of "virtual machine hosting", including migrations, scale out, etc. While there are many different ways to run, store and distribute VMs, not many of them work reliably and scale out. For this reason ucloud is based on existing software that has proven to work for virtual machine hosting.

The following components are required for running ucloud:

1. Python 3.7, pip, pipenv (the programming language of choice)
2. etcd (for data storage)
3. ceph (for storage of VMs)
4. Guacamole (for console access)

Setting up these components is out of scope of this manual.

h3. Base installation

**Note:** This installation guide assumes that the home directory (@~@) is @/root@.

* **Install QEMU**

<pre><code class="shell">
sudo apt install qemu qemu-kvm
</code></pre>

* **Create Directory Structure**

<pre><code class="shell">
mkdir /var/vm/
mkdir /var/www/
</code></pre>

* **Create User Directories**

<pre><code class="shell">
mkdir /var/www/ahmedbilal-admin
mkdir /var/www/nico
mkdir /var/www/kjg
cd ~
</code></pre>

* **Download and place Alpine Linux image in _/var/www/ahmedbilal-admin_**

<pre><code class="shell">
# The URL is quoted so that the shell does not treat "?" as a glob character
wget "https://www.dropbox.com/s/5eyryhxun6847hx/alpine.zip?dl=1" -O alpine.zip
unzip alpine.zip
mv alpine.qcow2 /var/www/ahmedbilal-admin
</code></pre>

* **Clone repos**

<pre><code class="shell">
git clone --recurse-submodules git@code.ungleich.ch:ungleich-public/ucloud-api.git
git clone --recurse-submodules git@code.ungleich.ch:ungleich-public/ucloud-scheduler.git
git clone --recurse-submodules git@code.ungleich.ch:ungleich-public/ucloud-vm.git
git clone --recurse-submodules git@code.ungleich.ch:ungleich-public/ucloud-file-scan.git
git clone --recurse-submodules git@code.ungleich.ch:ungleich-public/ucloud-image-scanner.git
</code></pre>

h3. Env file installation

* **Run "ucloud-api" as daemon**

<pre><code class="shell">
cd ucloud-api
pipenv install
# The bind address is quoted so that the shell does not treat "[::]" as a glob pattern
pipenv run gunicorn --bind '[::]:80' main:app --daemon
wget "https://www.dropbox.com/s/lyu3atymxyw3846/setup.py?dl=1" -O setup.py
pipenv run python setup.py
cd ~
</code></pre>

* **Run "ucloud-scheduler" as daemon**

<pre><code class="shell">
cd ucloud-scheduler
pipenv install
pipenv run python main.py &>/dev/null &
cd ~
</code></pre>

* **Run "ucloud-vm" as daemon**

<pre><code class="shell">
cd ucloud-vm
pipenv install
pipenv run python main.py /v1/host/1 --vm True &>/dev/null &
cd ~
</code></pre>

* **Install ucloud-file-scan**

<pre><code class="shell">
cd ucloud-file-scan
pipenv install
cd ~
</code></pre>

* **Install ucloud-image-scanner**

<pre><code class="shell">
cd ucloud-image-scanner
pipenv install
cd ~
</code></pre>

* **Run @crontab -e@, add the following entries and save by pressing @Ctrl+x@, then @y@, then @enter@**

<pre>
*/1 * * * * (cd /root/ucloud-file-scan && /usr/local/bin/pipenv run python main.py)
*/1 * * * * (cd /root/ucloud-image-scanner/ && /usr/local/bin/pipenv run python main.py)
</pre>

h2. User Manual

(being written: how to use ucloud as a user)

h3. Creating a VM

(to be filled in by ahmed)

h3. Viewing the console of the VM

* Go to console.$DOMAIN
* Log in with your username and password
* Select the VM that you want to view

h3. Deleting a VM

h2. API reference

(to be moved here by Ahmed)

h3. ucloud-api

It lets the outside world communicate (indirectly) with the internal systems.

Its responsibilities are:

* Create VM *(Done)*
* Delete VM *(Done)*
* Status VM *(Done)*
* Create new network
* Attach network to VM
* Detach network from VM
* Delete network

h3. ucloud-scheduler

It schedules/reschedules VMs that are created using ucloud-api. How does it schedule? It watches for any changes under the _/v1/vm/_ prefix.

Basically, it deals with two kinds of etcd entries:

1. Virtual Machines

Entries that look like

<pre><code class="javascript">
/v1/vm/1
{
  "owner": "ahmedbilal-admin",
  "specs": {
    "cpu": 20,
    "ram": 2,
    "hdd": 10,
    "ssd": 10
  },
  "hostname": "",
  "status": "REQUESTED_NEW"
}
</code></pre>

2. Hosts

Entries that look like

<pre><code class="javascript">
/v1/host/1
{
  "cpu": 32,
  "ram": 128,
  "hdd": 1024,
  "ssd": 0,
  "status": "UP"
}
</code></pre>

It loops through the list of host entries found under _/v1/host/_ and checks whether the incoming VM can run on each host. If it can, ucloud-scheduler sets the hostname key of the incoming VM to that host.

Suppose our scheduler decides that we can run the earlier shown VM _/v1/vm/1_ on _/v1/host/1_. Then our updated entry would look like

<pre><code class="javascript">
/v1/vm/1
{
  "owner": "ahmedbilal-admin",
  "specs": {
    "cpu": 20,
    "ram": 2,
    "hdd": 10,
    "ssd": 10
  },
  "hostname": "/v1/host/1",
  "status": "REQUESTED_NEW"
}
</code></pre>

*Note* the hostname key/value pair.

h3. ucloud-vm

It is responsible for creating (running), deleting (stopping), starting, shutting down, suspending, resuming, migrating and monitoring virtual machines. It does that by watching the _/v1/vm/_ prefix for any event and acting upon it.

Currently, a VM can have the following states:

# *REQUESTED_NEW* (Assigned to a brand new VM)
# *SCHEDULED_DEPLOY* ( _ucloud-scheduler_ assigns a host to the VM. _ucloud-vm_ looks for such VMs, creates their image file in ceph and sets their status to *REQUESTED_START* )
# *REQUESTED_START* ( _ucloud-vm_ starts the VM)
# *RUNNING* (VM is running now)
# *REQUESTED_SUSPEND* (User requested to suspend execution of the VM. If the suspension request is accepted, the VM goes to the *SUSPENDED* state; otherwise it remains in *REQUESTED_SUSPEND* )
# *SUSPENDED* (VM is suspended, i.e. no CPU is being used, but the memory is still owned/maintained. Not sure about the network resources yet, but assume they are still owned/maintained)
# *REQUESTED_RESUME* (User requested to resume execution of the VM. If the resumption request is accepted, the VM goes to the *RUNNING* state; otherwise it remains in *REQUESTED_RESUME* )
# *REQUESTED_SHUTDOWN* (User requested to shut down the VM. If the shutdown request is accepted, the VM goes to the *STOPPED* state; otherwise it remains in *REQUESTED_SHUTDOWN* )
# *STOPPED* (VM is stopped)
# *KILLED* (VM is killed, i.e. _ucloud-vm_ detects that it is not running when it should be, or _ucloud-scheduler_ detects that the host on which the VM is hosted is dead and therefore sets all VMs running on that host to *KILLED* )

There are some problems with the current design. For example, we can't track the previous state when a user requests some action. Suppose our VM is in the KILLED state and the user issues REQUESTED_SUSPEND. _ucloud-vm_ would check that the VM isn't running, so it would reject the request. But our VM would stay in REQUESTED_SUSPEND. This creates another problem: _ucloud-scheduler_ only schedules VMs that are either brand new (status == "REQUESTED_NEW") or have status == KILLED. Because the status was changed to REQUESTED_SUSPEND by the user's request, _ucloud-scheduler_ won't schedule the VM on any host.

h4. Solution

# Have two statuses (so we would check the actual state before doing any kind of action)
* One for the actual state (in our previous case KILLED)
* The other one for the requested state (in our previous case REQUESTED_SUSPEND)
# Make _ucloud-api_ more intelligent (for example, if the VM is not running, _ucloud-api_ should not accept REQUESTED_SUSPEND, i.e. we won't change the status of the VM from KILLED to REQUESTED_SUSPEND)
* Check whether the request is valid before changing the status of the VM.
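
The two proposals above could be combined roughly as follows. This is only a minimal sketch, not the actual ucloud code: the @requested_status@ key, the @ALLOWED_REQUESTS@ table and the @request_action@ function are hypothetical names, and the rule for which actual state accepts which request is an assumption derived from the state descriptions above. The VM entry is modelled as a plain dict instead of an etcd value.

```python
from typing import Dict, Set

# Assumed rule table (not from the ucloud source): for each request,
# the actual states in which that request would be accepted.
ALLOWED_REQUESTS: Dict[str, Set[str]] = {
    "REQUESTED_SUSPEND": {"RUNNING"},
    "REQUESTED_RESUME": {"SUSPENDED"},
    "REQUESTED_SHUTDOWN": {"RUNNING"},
}


def request_action(vm: dict, requested: str) -> bool:
    """Record a user request on a VM entry if it is valid.

    "status" holds the actual state and is never overwritten here.
    "requested_status" (hypothetical key) holds the pending request.
    Returns True if the request was accepted, False if it was rejected.
    """
    allowed = ALLOWED_REQUESTS.get(requested)
    if allowed is None:
        raise ValueError(f"unknown request: {requested}")
    if vm["status"] not in allowed:
        # Invalid for the current actual state: leave the entry untouched.
        return False
    vm["requested_status"] = requested
    return True


# The problematic case from the text: a KILLED VM receives a suspend request.
vm = {
    "owner": "ahmedbilal-admin",
    "hostname": "/v1/host/1",
    "status": "KILLED",
    "requested_status": None,
}
accepted = request_action(vm, "REQUESTED_SUSPEND")
print(accepted, vm["status"], vm["requested_status"])
```

Because the actual state stays KILLED when the request is rejected, a scheduler that looks for status == KILLED would still find this VM and could reschedule it.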