We organised a Docker Switzerland User Group Meetup at very short notice last night at the Swisscom offices in Hardbrücke, Zürich.
Luke Marsden, CTO & Co-Founder of ClusterHQ, gave an insightful talk on “Docker, data and extension”. He spoke about Flocker, a data volume manager for Docker that makes it easy to move storage (e.g., databases, queues and key-value stores) along with containers across different hosts.
Despite the very short-notice announcement on the meetup page, 19 people came to listen to Luke and enjoy the delightful cold beer (from us) and, afterwards, a delicious round of pizzas (from ClusterHQ). Unfortunately we couldn’t video-capture the talk, but you can see Luke giving a very similar talk at a previous event in London here.
Many thanks to Swisscom for providing space at such short notice and to ClusterHQ for not sending us home with empty stomachs! Thanks to all the kind folks who joined us and made the event fun with their interesting discussions and stories.
The Docker team will be back soon!
The ICCLab organized the International Workshop on Automated Incident Management in Cloud (AIMC’15) in conjunction with the EuroSys’15 conference.
Two members of the lab were present: Victor Munteanu, who chaired the workshop, and Florian Dudouet, who presented his papers.
Three ICCLab members were in sunny Würzburg for the ACROSS COST action meeting last week. ACROSS stands for “Autonomous Control for a Reliable Internet of Services” and our own TMB is a member of the management committee (MC) for Switzerland.
For those of you unfamiliar with COST actions, they are an instrument for research funding from the EU that provides networking opportunities for researchers.
The meeting was spread over three days with:
- The 1st ACROSS Open Workshop on Autonomous Control for the Internet of Services on Tuesday;
- Task forces and Management Committee on Wednesday;
- Plenary and work group meetings on Thursday.
The keynote speakers for the workshop gave motivating talks that spawned interesting discussions on autonomous control spanning multiple domains including mobile, compute, and application-level quality of experience (QoE). The keynote speakers were: Marco Hoffmann from Nokia Networks, Thomas Zinner from Würzburg University, and Maris van Sprang from IBM.
ICCLab’s Giovanni Toffetti (that’s me!) gave a talk on the Mobile Cloud Networking (MCN) project motivation and architecture. Here are the slides I used for the talk: MCN-Vision-Scope-Architecture.
Many different technologies can increase the availability of a cloud infrastructure. In our newest Techscouting paper we evaluate several HA technologies in order to define an HA architecture for an OpenStack deployment that is part of the XiFi project. HA technologies can be grouped into the following classes:
- Resource monitors that check if IT-services are alive and (sometimes automatically) recover them in case of failure.
- Load balancers that direct end user requests to those resources that are still alive and show reasonable performance.
- Distributed disks and file systems that increase redundancy of data and help to prevent data loss in case of failure.
- Distributed databases which help to prevent loss of database records.
Every OpenStack component exists to deliver a service to an end user. The availability of a cloud instance depends on the availability of the delivered services as perceived by end users. If we want to use an HA technology to increase the availability of OpenStack, we have to analyze how end user services depend on IT and infrastructure components. We therefore created a dependability model of the provided IT services and the business services consumed by end users.
Since availability always depends on the requirements defined by end users, we surveyed several OpenStack end users about the importance of each business service. End users tended to rate “Infrastructure Management” and “Security Management” as the most important services, so we had to ensure that these services have high availability levels.
By linking the importance of a service to the IT components that provide it, we can assign a target availability level to each component. Furthermore, we can compare several HA architectures with each other and check the availability levels they can achieve. We built several fault tree diagrams that link component failures to service outages.
A simulation of service outages for given failure rates revealed that adding HA technologies to OpenStack can add up to 7 to 8 percentage points to the average availability level of the provided services.
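To make the fault tree idea concrete, here is a minimal sketch of how availability propagates through such a tree. It assumes independent component failures, and the component names and availability values are hypothetical placeholders, not figures from our paper. An OR gate (any failed component causes an outage) corresponds to components in series, an AND gate (all replicas must fail) to redundant components.

```python
from math import prod


def series(*availabilities):
    """OR gate: the service is down if ANY component is down."""
    return prod(availabilities)


def redundant(*availabilities):
    """AND gate: the service is down only if ALL replicas are down."""
    return 1.0 - prod(1.0 - a for a in availabilities)


# Hypothetical availabilities for illustration only (not measured values).
api, db, storage = 0.99, 0.98, 0.97

# Without HA: a single instance of each component, all required.
single = series(api, db, storage)

# With HA: each component duplicated, assuming independent failures.
ha = series(
    redundant(api, api),
    redundant(db, db),
    redundant(storage, storage),
)

# Gain expressed in percentage points of availability.
gain_points = (ha - single) * 100

print(f"without HA: {single:.4f}")
print(f"with HA:    {ha:.4f}")
print(f"gain:       {gain_points:.2f} percentage points")
```

The gain depends entirely on the assumed input availabilities and on the independence assumption, so the numbers this sketch prints should not be compared directly with the simulation result above.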
We tested several technologies from each of the HA technology classes. Our evaluation covered the opportunities and risks associated with implementing each technology as well as its technological maturity. We assigned each technology an opportunity, a risk and a maturity score.
The result of our evaluation is that we prefer to use keepalived, HAProxy, Ceph/RADOS and MySQL Galera as HA technologies to improve availability of our OpenStack installation. These technologies are all open-source. They have been preferred because their performance is not significantly lower than the performance of commercial products, but they are available for free, while commercial products are not. The final HA architecture is able to increase availability levels of all OpenStack services up to three nines – which is a very high availability level in cloud computing.
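As a quick illustration of what “three nines” means in practice, the following sketch converts an availability level into the downtime it allows per year. The helper names are our own for this example, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def nines(n):
    """Availability with n nines, e.g. nines(3) is 0.999."""
    return 1.0 - 10 ** (-n)


def downtime_hours_per_year(availability):
    """Maximum yearly downtime in hours allowed by an availability level."""
    return (1.0 - availability) * HOURS_PER_YEAR


for n in (2, 3, 4):
    hours = downtime_hours_per_year(nines(n))
    print(f"{n} nines: {hours:.2f} hours of downtime per year")
```

Three nines (99.9%) therefore allows roughly 8.76 hours of downtime per year.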
Another organization would likely come to other conclusions when selecting a concrete implementation of an HA technology. The evaluation methodology used in our paper nevertheless shows how to make better-founded technology choices: link end user requirements to system architecture characteristics, then rate the architectural alternatives by the availability levels they can reasonably achieve.
In our previous blog posts we mostly focused on virtual machine live migration performance, comparing pre-copy, post-copy and hybrid approaches in an OpenStack context, rather than exploring other live migration features. Libvirt together with the Qemu hypervisor provides many migration configuration options. One of these is tunneled live migration. Recently we found that libvirt's current tunneling implementation does not support post-copy migration. Consequently, in order to make the post-copy patch more production ready, we decided to support the community and add post-copy tunneled live migration to libvirt ourselves. This blog post tells the whole story of immersing ourselves in the open source community and hacking on an established open source project, since we believe this experience can be generalized.
Balazs stumbled across the ICCLab when he was looking for part-time employment during his master's studies. He immediately got hooked on the technological challenges of cloud computing. At the Lab, Balazs is working in the Distributed Computing in the Cloud initiative, juggling several state-of-the-art frameworks. Before he found his way to the ICCLab, he developed embedded software for a small company based in a town in the Swiss Alps, after having worked in business integration for several years.
When he’s not in front of the computer screen, he likes volunteering and traveling the world, especially to lesser-known places. He also likes playing the guitar, learning Japanese and getting involved in Couchsurfing activities (which is not to be confused with being a couch potato!).