This time we describe how to store information in a database and why we selected etcd as the primary database.
The previous time we described how to generate MAC addresses, a key element of uncloud.
We now have a couple of running VMs, and we want to remember which VMs are running and attach more information to them. Who owns a VM? And, later, where is the VM running?
We decided to use etcd as our primary database. The main reason is that we don't want to add a single point of failure to uncloud, and we don't need the guarantees provided by a standard SQL database.
An alternative we are still considering is PostgreSQL. While it is not inherently distributed (at all), it supports storing JSON and has quite a sophisticated messaging system.
Refactoring: phasing in a database
So far we have used a couple of python and shell scripts to create the base
of uncloud. Now that things are becoming a bit more serious, we need to
refactor our code. The shell and python scripts are cleaned up and
become a proper python module, which we lovingly call
Python, ETCD and JSON
We decided to use python-etcd3 to access etcd from the python world, as it supports version 3 of the etcd API.
For the data format we decided to use JSON, as it is easy to read.
Each VM is identified by a random UUID, so we don't need to store a counter for VMs.
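To make the idea concrete, here is a minimal sketch of how a VM record could be written to and read from etcd as JSON under a random UUID. It is not the actual uncloud code: the key prefix `/uncloud/vm/`, the record fields (`owner`, `mac`) and the helper names are assumptions made up for illustration. Only `etcd3.client()`, `put()` and `get()` come from python-etcd3.

```python
import json
import uuid

# Hypothetical key prefix; the real uncloud key layout may differ.
VM_PREFIX = "/uncloud/vm/"


def connect(host="localhost", port=2379):
    """Return a python-etcd3 client (needs the etcd3 package installed)."""
    import etcd3  # imported lazily so the helpers below work with any client
    return etcd3.client(host=host, port=port)


def register_vm(client, owner, mac):
    """Store a new VM record under a random UUID and return that UUID."""
    vm_id = str(uuid.uuid4())  # random, so no VM counter has to be stored
    record = {"owner": owner, "mac": mac}  # illustrative fields only
    client.put(VM_PREFIX + vm_id, json.dumps(record))
    return vm_id


def get_vm(client, vm_id):
    """Return the decoded VM record, or None if the key does not exist."""
    value, _metadata = client.get(VM_PREFIX + vm_id)
    if value is None:
        return None
    return json.loads(value)
```

With a reachable etcd, `register_vm(connect(), "some-owner", "02:00:00:00:00:01")` would return the new UUID, and `get_vm()` with that UUID would return the stored dictionary. Because the ID is random, two hosts can register VMs at the same time without coordinating on a counter.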
At this point uncloud can create VMs, and the VMs are registered in etcd, our database. So while we don't yet have logic for (automatic) VM migration, the information about VMs is already stored in a distributed database.
So if one of our hosts vanishes, we can in theory already redeploy the existing VMs.
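As a rough sketch of what that could look like, the following helper collects all VM records that name a given host, which is the list one would need to redeploy. It assumes that each record is a JSON object stored under the hypothetical prefix `/uncloud/vm/` and carries a hypothetical `host` field; neither is part of uncloud as described so far. The only python-etcd3 call used is `get_prefix()`, which yields `(value, metadata)` pairs.

```python
import json

# Hypothetical key prefix; the real uncloud key layout may differ.
VM_PREFIX = "/uncloud/vm/"


def vms_on_host(client, hostname):
    """Return {vm_id: record} for all VMs whose record names `hostname`."""
    found = {}
    for value, metadata in client.get_prefix(VM_PREFIX):
        record = json.loads(value)
        # "host" is an assumed field; records without it are skipped.
        if record.get("host") == hostname:
            vm_id = metadata.key.decode()[len(VM_PREFIX):]
            found[vm_id] = record
    return found
```

Scanning every record is fine for a handful of VMs. With many VMs, a key layout that includes the host name would avoid reading all records, at the cost of having to move keys when a VM migrates.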