The “Robot Operating System” (ROS) is widely used on several robotics platforms, and also runs on the TurtleBot robots in our lab. One of the ideas behind cloud robotics is to enable ROS components (so-called ROS nodes) to run distributed across the cloud infrastructure and the robot itself, so we can shift certain parts of the robotics application to the cloud. As a logical first step, we tried to run existing ROS nodes, such as a ROS master, in containers on Kubernetes. We then tried a proper Platform as a Service (PaaS) solution, in our case Red Hat OpenShift.
OpenShift offers a full PaaS experience: you can build and run code from source or run pre-built containers directly. All of those features can be managed via an intuitive web interface.
However, OpenShift imposes tight security restrictions on the containers it runs:
- Processes in containers are prevented from running as root
- Containers run under a random user ID (Support Arbitrary User IDs)
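To make the second restriction more concrete, here is a minimal sketch of a pre-flight check for the directories an image needs to write to. It assumes the commonly documented OpenShift behaviour that the arbitrary user ID is always a member of the root group (GID 0); the function name and the example modes are ours and purely illustrative.

```python
import stat


def writable_by_arbitrary_uid(mode: int, gid: int) -> bool:
    """Return True if a path with this mode and group owner is writable
    by a process running as a random, non-owner UID in group 0.

    Assumption: the container's random UID belongs to the root group,
    so the path is writable if it is owned by group 0 with the
    group-write bit set, or if it is world-writable.
    """
    group_ok = gid == 0 and bool(mode & stat.S_IWGRP)
    world_ok = bool(mode & stat.S_IWOTH)
    return group_ok or world_ok


# Hypothetical examples: a directory prepared with `chgrp 0` and
# `chmod g+w`, and one left with default 0755 permissions.
prepared = writable_by_arbitrary_uid(0o775, gid=0)
default = writable_by_arbitrary_uid(0o755, gid=0)
print(prepared, default)
```

In a real image the mode and group would come from `os.stat()` on each path the application writes to; an existing ROS image that assumes it runs as root would typically fail this check.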
Following the series that we started with the Vamp blog post, we take a look at one more container management tool, running a simple practical example while paying attention to its main advantages and limitations. This series is part of the work on cloud-native applications in the Service Prototyping Lab, which explores how easily developers can decompose their applications and fit them into the emerging platforms.
On this occasion, we inspect Kubernetes, one of the most popular open-source container orchestration tools for production environments. Kubernetes builds upon 15 years of experience of running production workloads at Google. Moreover, the Kubernetes community appears to be the biggest among all the open-source container management communities. Kubernetes provides a Slack channel with more than 8000 users, many of them Kubernetes engineers, who share ideas. One can also find community support on Stack Overflow using the tag kubernetes. In the GitHub repository, we can see more than 970 contributors, 1500 watchers, 18500 stars and 6000 forks. In the community it is popular to abbreviate the system as K8s.
Following our previous blog post, we are still looking at tools for collecting metrics from an OpenStack deployment in order to understand its resource utilization. Although Monasca has a comprehensive set of metrics and alarm definitions, the complex installation process combined with a lack of documentation makes it a frustrating experience to get it up and running. Further, despite its complexity and many moving parts, it was difficult to configure it to obtain the analysis we wanted from the raw data, namely how many of our servers are overloaded over different timescales in different respects (CPU, memory, disk I/O, network I/O). For these reasons we decided to try Prometheus with Grafana, which turned out to be much easier to install and configure (taking less than an hour to set up!). This blog post covers the installation and configuration of Prometheus and Grafana in a Docker container, and how to install and configure Canonical’s Prometheus OpenStack exporter to collect a small set of metrics related to an OpenStack deployment.
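The analysis we were after can be stated in a few lines of plain Python. This is only a sketch of the question itself, not of how Prometheus or Grafana evaluate it; the function name, the server names and the sample values are all made up for illustration.

```python
from statistics import mean


def count_overloaded(samples, window, threshold):
    """Count servers whose mean utilisation over the last `window`
    samples exceeds `threshold`.

    `samples` maps a server name to a list of utilisation values
    (0.0 to 1.0), oldest first. `window` is assumed to be >= 1.
    Servers without samples are skipped.
    """
    overloaded = 0
    for series in samples.values():
        recent = series[-window:]
        if recent and mean(recent) > threshold:
            overloaded += 1
    return overloaded


# Hypothetical CPU utilisation, one sample per minute per server.
cpu = {
    "node-1": [0.20, 0.30, 0.95, 0.97],
    "node-2": [0.90, 0.92, 0.91, 0.93],
    "node-3": [0.10, 0.10, 0.15, 0.12],
}

short_term = count_overloaded(cpu, window=2, threshold=0.8)
long_term = count_overloaded(cpu, window=4, threshold=0.8)
print(short_term, long_term)
```

With these made-up numbers, node-1 counts as overloaded only on the short timescale, which is exactly the distinction between a brief spike and sustained load that we wanted to see. The same function could be fed memory, disk I/O or network I/O series.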
In one of our blog posts we presented a basic tool which extends the OpenStack Nova client and supports executing API calls at some point in the future. Much has evolved since then: the tool is no longer just a wrapper around OpenStack clients; instead, we rebuilt it in the context of the OpenStack Mistral project, which provides very nice workflow-as-a-service capabilities – this will be elaborated a bit more in a future blog post. During this process we came across a very interesting feature in Keystone which we were not aware of – Trusts. Trusts are a mechanism in Keystone which enables delegation of roles, and even impersonation of users, from a trustor to a trustee; the mechanism has many uses but is particularly useful in an OpenStack administration context. In this blog post we will cover basic command line instructions to create and use trusts.
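To give an idea of what a trust consists of, the sketch below builds the JSON body for creating one. It assumes the request shape of the Keystone v3 OS-TRUST API as we understand it (a POST to `/v3/OS-TRUST/trusts`); the helper name and all IDs are hypothetical placeholders, and the field names should be checked against the Keystone API reference for your release.

```python
import json


def build_trust_request(trustor_id, trustee_id, project_id, role_names,
                        impersonation=False, expires_at=None):
    """Build the body of a trust-creation request.

    The trustor delegates the named roles on one project to the
    trustee. With impersonation enabled, tokens issued from the
    trust appear to belong to the trustor.
    """
    trust = {
        "trustor_user_id": trustor_id,
        "trustee_user_id": trustee_id,
        "project_id": project_id,
        "impersonation": impersonation,
        "roles": [{"name": name} for name in role_names],
    }
    if expires_at is not None:
        # Optional expiry timestamp, passed through unchanged.
        trust["expires_at"] = expires_at
    return {"trust": trust}


# Placeholder IDs; real ones come from the Keystone user and project lists.
body = build_trust_request(
    trustor_id="TRUSTOR_USER_ID",
    trustee_id="TRUSTEE_USER_ID",
    project_id="PROJECT_ID",
    role_names=["member"],
    impersonation=True,
)
print(json.dumps(body, indent=2))
```

The sketch only constructs the payload and does not contact Keystone, so it is safe to run anywhere.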
In one of our projects, we needed to test some MongoDB-based backend functionality: we wrote a small application which consisted of a MongoDB backend and a Python app which communicated with the backend via pymongo. We like the flexibility of MongoDB in a rapid prototyping context and did not want to go with a full-fledged ORM model for this app. Here we describe how we used MockupDB to perform some unit testing on this app.
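As a taste of the general idea, here is a minimal sketch of testing pymongo-style code without a running database. Note that it deliberately uses `unittest.mock` from the standard library rather than MockupDB, so that it runs without extra packages; MockupDB works differently, acting as a mock server that speaks the MongoDB wire protocol. The function and the document fields are hypothetical.

```python
from unittest.mock import MagicMock


def find_user_email(collection, username):
    """Return the e-mail of `username`, or None if there is no such user.

    `collection` is expected to behave like a pymongo Collection.
    """
    doc = collection.find_one({"username": username})
    return doc["email"] if doc else None


# Stand-in for a pymongo collection; no MongoDB server is involved.
fake_collection = MagicMock()
fake_collection.find_one.return_value = {
    "username": "alice",
    "email": "alice@example.org",
}

email = find_user_email(fake_collection, "alice")

# Verify that the code under test issued the query we expected.
fake_collection.find_one.assert_called_once_with({"username": "alice"})
```

Mocking at the object level like this checks the application logic only; a wire-level mock additionally exercises the real driver code path.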
In the Service Engineering research area, we aim at producing high-quality output in terms of software, publications, lecture materials and other results. From time to time, this implies departing from old habits and taking a bit of extra effort to reach new quality levels. For publications, there are excellent tools like LaTeX to achieve a compelling layout and typesetting. Using the standard templates and the rubber tool is enough to produce a distributable PDF quickly. Now, quality and effort are seemingly in a good balance.
[This post originally appeared on the XiFi blog – ICCLab@ZHAW is a partner in XiFi and is responsible for operating the Zurich node.]
As with any open compute system, security is a serious issue which cannot be taken lightly. XiFi takes security seriously and has regular reviews of security issues which arise during node operations.
As well as being reactive to specific incidents, proper security processes require regular upgrading and patching of systems. The Venom threat, which was announced in April, is real for many of the systems in XiFi, as the KVM hypervisor is quite widely used. Consequently, it was necessary to upgrade systems to secure them against this threat. Here we offer a few points on our experience with this quite fundamental upgrade.
The Venom vulnerability exploits a weakness in the Floppy Disk Controller in qemu. Securing systems against Venom requires upgrading to a newer version of qemu (terminating any existing qemu processes and typically restarting the host). In an operational KVM-based system, the VMs are running in qemu environments, so a simple qemu upgrade without terminating existing qemu processes does not remove the vulnerability; for this reason, upgrading the system with minimal user impact is a little complex.
Our basic approach to performing the upgrade involved evacuating a single host – moving all VMs on that host to other hosts in the system – and then performing the upgrade on that host. As OpenStack is not yet a bulletproof platform, we did this with caution, moving VMs one by one and ensuring that VMs were not affected by the move (by checking network connectivity for those that had public IP addresses and checking the console for a sample of the remainder). We used the block migration mechanism supported by OpenStack – even though this can be somewhat less efficient (depending on configuration), it is more widely applicable and does not require setup of NFS shares between hosts. Overall, this part of the process was quite time-consuming.
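The cautious one-by-one procedure can be sketched as a small loop. This is an illustration of the approach rather than the scripts we ran: `migrate` and `is_healthy` are hypothetical caller-supplied functions, which in practice could wrap a block migration request and a ping or console check respectively.

```python
def evacuate_host(vms, migrate, is_healthy):
    """Move VMs off a host one at a time, stopping at the first problem.

    `migrate(vm)` moves a single VM; `is_healthy(vm)` checks it after
    the move. Returns (moved, failed_vm), where failed_vm is None if
    every VM was moved and passed its check.
    """
    moved = []
    for vm in vms:
        migrate(vm)
        if not is_healthy(vm):
            # Stop immediately so the operator can investigate
            # before any further VMs are touched.
            return moved, vm
        moved.append(vm)
    return moved, None


# Dry run with stand-in callables; "vm-b" simulates a VM that
# loses connectivity after the move.
migration_log = []
moved, failed = evacuate_host(
    ["vm-a", "vm-b", "vm-c"],
    migrate=migration_log.append,
    is_healthy=lambda vm: vm != "vm-b",
)
print(moved, failed)
```

Stopping at the first failed check means remaining VMs stay where they are, which matches the conservative stance described above at the cost of making the evacuation slower.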
Once all VMs had been moved from a host, it was relatively straightforward to upgrade qemu. As we had deployed our node using Mirantis Fuel, we followed the instructions provided by Mirantis to perform the upgrade. For us, there were a couple of points missing in this documentation – there were more package dependencies (not so many – about 10) which we had to install manually from the Mirantis repo. Also, for a deployment with Fuel 5.1.1 (which we had), the documentation erroneously omits an upgrade to one important process – qemu-kvm. Once we had downloaded and installed the packages manually (using dpkg), we could reboot the system and it was then secure.
In this manner, we upgraded all of our hosts and service to the users was not impacted (as far as we know)…and now we wait for the next vulnerability to be discovered!
[Note: This blog post was originally published on the XiFi blog here.]
One of the main jobs performed by the infrastructures in XiFi is resource management: the resources available are not infinite, so their use has to be limited. In OpenStack this is done through quotas. Here we discuss how we work with them in OpenStack.
I introduced Disaster Recovery (DR) services in the tutorial of the last ICCLab newsletter, which also gave an overview of possible OpenStack configurations. Several configuration options could be considered. In particular, when a stakeholder has the role of both cloud provider and DR service provider, a suitably safe configuration consists of distributing the infrastructure across different geographic locations. OpenStack makes it possible to organise the controllers in different Regions which share the same Keystone. Here you will find an overall specification using Heat; I will simulate the same configurations using DevStack in VirtualBox environments. One of the aims of this blog post is to support the students who are using Juno DevStack.
There will be a second part of this tutorial to show a possible implementation of DR services lifecycle between two regions.