How to build an OpenStack alternative: Step 5, adding metadata

This time we describe how virtual machines can get information about themselves, such as which ssh keys should have access to them.

Last time, we added a database to uncloud.

Motivation

If we started VMs without a metadata service, all of the VMs would look identical, and none of them would know whom to allow access.

To customise a VM or to make it usable, we need to tell it who has access to it and potentially inject even more information.

Metadata service: how others do it

Enter the metadata service. OpenNebula solves this problem quite nicely by attaching a virtual CD-ROM to the VMs. That CD-ROM contains only one file, context.sh. This file contains information about

networking
ssh keys
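To make the context.sh idea concrete, here is a minimal sketch of how a guest could read such a file. It assumes the file consists of shell-style KEY='value' lines; the sample content and the helper name parse_context are illustrative, and multi-line values are not handled.

```python
import shlex


def parse_context(text):
    """Parse shell-style KEY='value' lines into a dict (simplified)."""
    values = {}
    for line in text.splitlines():
        # shlex resolves the shell quoting and drops '#' comments.
        for token in shlex.split(line, comments=True):
            key, sep, value = token.partition("=")
            if sep:
                values[key] = value
    return values


# Illustrative sample only; the addresses and the key are placeholders.
SAMPLE = """\
# Context variables (illustrative)
ETH0_IP='192.0.2.10'
SSH_PUBLIC_KEY='ssh-ed25519 AAAAEXAMPLE user@example'
"""
```

Calling parse_context(SAMPLE) yields a dictionary with the network and ssh key entries, which a boot script could then apply.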

OpenStack with cloud-init, on the other hand, uses an HTTP-based service that is reachable at http://169.254.169.254/.

Both schemes come with disadvantages that we don't want to replicate in uncloud:

In the OpenNebula case, changing metadata while the VM is running requires creating a new CD-ROM, and if the old one is still mounted, the VM might not get the up-to-date information. This is a somewhat theoretical case, as the metadata is rarely reused after booting.

However, changing the information provided in context.sh inside the ISO always requires generating a new ISO. While technically possible, this is not very elegant.

The OpenStack-based approach has (from our point of view) a much bigger problem: it relies on IPv4. VMs running on uncloud primarily run IPv6 and should function without any IPv4 stack.

The motivation for using the 169.254.0.0/16 network is clear: it works without having an IP address management system in place.

Solving it the smart way

So the general approach of OpenStack/cloud-init is actually quite elegant, if only it did not force IPv4.

In the IPv6 world, we always have link local addresses in the fe80::/10 network. Should we just replace the OpenStack approach with IPv6?

We don't think so: operators of IPv4 networks would have the same argument against an IPv6-only address that we have against an IPv4-only one.

Instead, we suggest a simple change to the OpenStack approach: use http://metadata instead of an IP address.

http://metadata

So how should this work and why is this better than using http://169.254.169.254/?

When a name is used, it doesn't matter whether the VM is on an IPv4 or IPv6 network.

Using just the hostname, not an FQDN (e.g. metadata.example.com), makes it portable.

The name can be resolved via various methods:

uncloud: the name is resolved via DNS
OpenStack: either via DNS (like uncloud) or, if there is no IPAM, via a static entry in /etc/hosts

With DNS resolution, this gets even more interesting, because we can use the DNS search path: while the client tries to resolve the hostname metadata, the underlying resolver library will also look for metadata.example.com if example.com is in the search path.
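The search path behaviour can be sketched as follows. The function search_candidates is a hypothetical, simplified model of how a stub resolver expands a short name (loosely following the search and ndots options of resolv.conf); it is not the resolver's actual code. The function resolve_metadata simply asks the system resolver, which may return IPv6 addresses, IPv4 addresses, or both.

```python
import socket


def search_candidates(name, search_domains, ndots=1):
    """Return the names a stub resolver would try, in order (simplified model)."""
    # A trailing dot marks the name as fully qualified: no search list is applied.
    if name.endswith("."):
        return [name]
    expanded = [f"{name}.{domain}" for domain in search_domains]
    # Names with fewer than `ndots` dots are tried with the search list first.
    if name.count(".") < ndots:
        return expanded + [name]
    return [name] + expanded


def resolve_metadata(name="metadata", port=80):
    """Resolve the metadata host via the system resolver (needs a working resolver)."""
    infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return [info[4][0] for info in infos]
```

With example.com in the search path, search_candidates("metadata", ["example.com"]) lists metadata.example.com as well as the bare name, so the same http://metadata URL works in any network that publishes a matching record.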

uncloud implementation

In uncloud we have implemented a sample metadata service.

However, IPAM (i.e. router advertisements) and DNS servers are not part of uncloud; they can be provided by the regular system infrastructure.

In one of the next versions we plan to include helpers that allow you to bootstrap IPAM and DNS easily.
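To illustrate what a metadata service of this kind could look like, here is a minimal sketch using only the Python standard library. It is not the uncloud implementation. Identifying the VM by its source address, the JSON layout with an ssh-keys list, the in-memory METADATA store, and the port are all assumptions made for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import json
import socket

# Hypothetical in-memory store: VM source address -> metadata (placeholder values).
METADATA = {
    "2001:db8::10": {"ssh-keys": ["ssh-ed25519 AAAAEXAMPLE user@example"]},
}


def metadata_response(client_ip, store):
    """Return (status, body) for the VM with the given source address."""
    entry = store.get(client_ip)
    if entry is None:
        return 404, b"{}"
    return 200, json.dumps(entry).encode("utf-8")


class MetadataHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Assumption: the VM is identified by the address it connects from.
        status, body = metadata_response(self.client_address[0], METADATA)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


class DualStackServer(HTTPServer):
    address_family = socket.AF_INET6

    def server_bind(self):
        # Also accept IPv4 clients on the IPv6 socket, where the platform allows it.
        self.socket.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        super().server_bind()


def serve(port=8080):
    """Run the sketch server until interrupted."""
    DualStackServer(("::", port), MetadataHandler).serve_forever()
```

Calling serve() starts the server on all addresses. Binding to :: keeps the service usable from IPv6-only VMs, which is the point of addressing it by name rather than by an IPv4 literal.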

Status

At this point uncloud can create VMs, and the VMs can retrieve from the metadata service the ssh keys that should have access.
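Inside a VM, retrieving the keys could look roughly like the sketch below. It assumes the service answers at http://metadata/ with a JSON object containing an ssh-keys list; the path, the JSON layout, and both function names are hypothetical and chosen for illustration only.

```python
import json
import urllib.request

# The host name comes from the article; the path and format are assumptions.
METADATA_URL = "http://metadata/"


def fetch_metadata(url=METADATA_URL, timeout=5):
    """Fetch and decode the metadata document (assumed to be JSON)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def render_authorized_keys(metadata):
    """Turn the hypothetical "ssh-keys" list into authorized_keys content."""
    keys = [key.strip() for key in metadata.get("ssh-keys", []) if key.strip()]
    # One key per line, each terminated by a newline.
    return "".join(key + "\n" for key in keys)
```

A boot script could write render_authorized_keys(fetch_metadata()) to the authorized_keys file of the target user. Because the URL contains a name and not an address, the same script runs unchanged on IPv4 and IPv6 networks.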

With this latest addition, uncloud approaches a usable prototype. A lot of things will probably need to be refactored in the future, but at the moment uncloud already supports:

creating VMs (using qemu)
securing the VM network (using nftables)
generating unique MAC addresses (uncloud python code)
storing information in a distributed database (using python-etcd3 and etcd)
providing basic metadata information (uncloud python code)