Software Defined Networking for Clouds

Description

Software Defined Networking (SDN) is a technology that has introduced an important paradigm shift in the networking world. With the OpenFlow protocol as a main technological enabler, the essential goal is to extend the conventional network configuration approach by introducing the concept of network control and programmability.

Advances in the OpenFlow protocol and the strong community behind the OpenDaylight (ODL) framework have significantly advanced SDN over the past few years and established it as a de facto technology for datacenter network management. Following this trend, OpenStack has been a pioneer in providing direct SDN support for Neutron. Such an approach has introduced new challenges arising from the direct mapping of network traffic between the physical hosts and the virtual tenant networks.

Identifying scenarios that expose the different issues to consider has a high priority in the current SDN world. Focusing on SDN-managed datacenter networks, this initiative will provide a technical implementation and know-how for managing cloud-based network resources in a straightforward manner.

Objectives

The “SDN for clouds at the ICCLab” mission involves establishing use cases, revealing potential issues, and analyzing alternative approaches and optimizations in order to achieve efficient networking for classical datacenters, network carriers, Internet Service Providers and cloud providers. The tasks to achieve this include:

  • Provide on-demand, scalable, commodity deployment to facilitate SDN knowledge transfer to academia and business partners
  • Provide Network as a Service for the tenants
  • Monitor and optimize intra-cloud-traffic
  • Automate changing flows with the SDN-controller
  • Minimize complexity of the network logic
  • Handle QoS and QoE network parameters efficiently
  • Remain independent of network-hardware vendors

Research Challenges

The technologies and protocols currently applied to cloud networking are not optimal in terms of resource usage, reliability, deployment and maintenance. For example, the current implementation of OpenStack Neutron relies on different tunnelling mechanisms to provide isolation and multi-tenancy support. From a network application developer's point of view, this is inefficient, since it adds overhead and impedes transparent application development.
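To give a feeling for the overhead that tunnelling adds, the following minimal sketch computes the per-packet cost of one common encapsulation. It assumes VXLAN over IPv4 without VLAN tags or IP options; the text above does not name a specific mechanism, so VXLAN is only an illustrative choice, and the function names are hypothetical.

```python
# Header sizes in bytes for VXLAN over IPv4 (assumption: no VLAN tags, no IP options).
OUTER_ETHERNET = 14
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14


def vxlan_overhead_bytes() -> int:
    """Bytes placed on the wire in front of each tenant Ethernet frame."""
    return OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER


def tenant_ip_mtu(physical_ip_mtu: int) -> int:
    """Largest tenant IP packet that still fits into one underlay IP packet."""
    return physical_ip_mtu - OUTER_IPV4 - OUTER_UDP - VXLAN_HEADER - INNER_ETHERNET


def overhead_ratio(tenant_frame_bytes: int) -> float:
    """Fraction of transmitted bytes that is encapsulation rather than tenant data."""
    extra = vxlan_overhead_bytes()
    return extra / (tenant_frame_bytes + extra)


if __name__ == "__main__":
    print(vxlan_overhead_bytes())         # 50
    print(tenant_ip_mtu(1500))            # 1450
    print(f"{overhead_ratio(64):.1%}")    # 43.9% for a minimum-size tenant frame
    print(f"{overhead_ratio(1464):.1%}")  # 3.3% for a full-size tenant frame
```

The relative cost is highest for small frames, which is one reason why tunnelling-based isolation can be noticeable for chatty applications.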

To address the issues in that context, we define the following research tasks:

  • Reconsider the current concepts and state-of-the-art proposals and determine a sophisticated solution towards an optimized SDN design for modern cloud architectures
  • Define competitive use cases as direct controllers and evaluators of our SDN solution
  • Provide a high-level framework for managing cloud-based network resources in a uniform manner

Relevance to current and future markets

An in-house deployment provides an up-and-running environment in which ideas can be deployed and tested on commodity hardware. The ICCLab SDN testbed will facilitate the validation of use cases towards comprehensive solutions. The high-level framework on top of the ODL controller will provide smart virtual datacenter management in OpenStack deployments and could target industry partners such as content delivery network companies (Akamai, for example), IPTV providers and streaming service providers. We also aim to expand the cooperation by exchanging technical expertise with industry partners involved in the SDN-Cloud field.

Impact

Articles and Info

Contact Point

Irena Trajkovska – traj@zhaw.ch

High Availability on OpenStack

Motivation for OpenStack High Availability

ICCLab’s MobileCloud Networking solution is intended to offer private cloud services to end users. MobileCloud is based on OpenStack. Since our OpenStack installation will be used mainly by end users, it must provide High Availability.

As mobile end users, we all want our IT services to be available at any time and anywhere: 24 hours per day, 7 days per week, 365 days per year. End users rarely consider that this requirement is a challenge for the system architects, developers and engineers who offer the IT services. Cloud components must be maintained regularly to remain stable and secure, and engineers have to shut down components while performing maintenance changes. At the same time, the service should remain available to the end user. Achieving High Availability in a cloud environment is therefore a very complex and challenging task.

Requirements for OpenStack High Availability

Delivering High Availability in an OpenStack environment entails several requirements:

  • Availability of a cloud service is the result of the availability of all its participating components. An app hosted in the cloud is only available if its supporting OS is available. The OS is only available if its underlying virtual or physical server is available. And everything breaks down if the network devices between service user and service provider break down. If one crucial component participating in the service fails, the whole service becomes unavailable. Therefore “High Availability on OpenStack” means High Availability on all components managed by OpenStack.
  • To maintain the availability of service components, it is necessary to implement redundancy. If a crucial service component fails, a redundant component must take over its function to maintain availability of the service.
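  • There’s a trade-off between redundancy and costs: if you establish redundancy of the MobileCloud service by doubling its components, you increase the overall availability of the service (for two independent components with availability A, it rises from A to 1 - (1 - A)^2 rather than doubling), but you also double the costs of the service.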
  • 100% availability is an illusion, since no service component can be available all the time. A better solution is to define availability levels, or classes of availability, that specify the permissible downtime of each service component. Availability classes have to be assigned to service components according to their importance to the total availability of the service.
  • High Availability is related to the concept of Event Management. Service components must be able to react to events that could lead to outages in order to maintain their stability.
  • High Availability closely depends on monitoring tools. High Availability can only be implemented if outages and events that are harmful to the availability of components can be monitored. The High Availability on OpenStack project depends on the Monitoring on OpenStack project.
  • The High Availability solution for the OpenStack installation must contain the following parts: an architectural overview of all components (virtual and physical servers, network devices, operating systems, subsystems and software) that are crucial for service operation, an assignment of availability levels to all those components, redundant components, a monitoring tool that captures events (traffic, load, outage etc.) and an event management system that reacts to events.
  • Availability information of the monitored resources must be assignable to their tenant.
  • The metered values must be collected and correlated automatically.
  • The collection of values must be able to trigger events.
  • The event management system must be able to drive changes (e.g. switch traffic to a redundant device) in the service architecture and reconfigure components automatically.
  • The monitoring tool and the event management system must be as generic as possible to ensure support of any device.
  • The monitoring tool and event management system must offer an API.
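The availability reasoning in the requirements above can be made concrete with a short calculation. The following is a minimal sketch, assuming that component failures are independent; the component names and availability figures are hypothetical and not measurements from the ICCLab installation.

```python
from math import prod

HOURS_PER_YEAR = 365 * 24


def serial_availability(availabilities):
    """Availability of a chain in which every component is required."""
    return prod(availabilities)


def redundant_availability(availability, copies):
    """Availability of `copies` independent replicas where one working replica suffices."""
    return 1.0 - (1.0 - availability) ** copies


def downtime_hours_per_year(availability):
    """Expected downtime per year implied by an availability level."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical availability classes for the components of one service.
    chain = {"network": 0.999, "server": 0.995, "os": 0.999, "app": 0.99}

    single = serial_availability(chain.values())
    doubled = serial_availability(
        redundant_availability(a, copies=2) for a in chain.values()
    )

    for label, value in (("single", single), ("doubled", doubled)):
        print(f"{label}: {value:.6f} "
              f"({downtime_hours_per_year(value):.1f} h downtime per year)")
```

The chain is always less available than its weakest component, and doubling every component raises availability without doubling it, while the hardware cost does double.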

Architecture

Figure: OpenStack High Availability Architecture

As-is state

Currently, an extended version of the Ceilometer monitoring tool is used for the OpenStack environment of the ICCLab. Possible Event Management functionality is currently being evaluated. There is also an ongoing evaluation of solutions that implement redundancy in OpenStack.
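The kind of Event Management functionality called for in the requirements (collected values trigger events, and events drive changes such as switching traffic to a redundant device) could look roughly like the sketch below. It is only an illustration under stated assumptions: it does not use Ceilometer's API, and all class, function and resource names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Sample:
    tenant: str      # tenant the monitored resource belongs to
    resource: str
    meter: str
    value: float


@dataclass(frozen=True)
class Event:
    tenant: str
    resource: str
    kind: str


Rule = Callable[[Sample], Optional[Event]]
Handler = Callable[[Event], None]


class EventManager:
    """Turns metered samples into events and dispatches them to handlers."""

    def __init__(self) -> None:
        self._rules: List[Rule] = []
        self._handlers: Dict[str, List[Handler]] = {}

    def add_rule(self, rule: Rule) -> None:
        self._rules.append(rule)

    def on(self, kind: str, handler: Handler) -> None:
        self._handlers.setdefault(kind, []).append(handler)

    def ingest(self, sample: Sample) -> List[Event]:
        """Evaluate every rule against the sample and run matching handlers."""
        events = []
        for rule in self._rules:
            event = rule(sample)
            if event is not None:
                events.append(event)
                for handler in self._handlers.get(event.kind, []):
                    handler(event)
        return events


def outage_rule(sample: Sample) -> Optional[Event]:
    """Hypothetical rule: a heartbeat value of 0 means the resource is down."""
    if sample.meter == "heartbeat" and sample.value == 0:
        return Event(sample.tenant, sample.resource, "outage")
    return None


def make_failover_handler(active: Dict[str, str], standby: Dict[str, str]) -> Handler:
    """Return a handler that points the tenant at the standby device."""
    def fail_over(event: Event) -> None:
        active[event.tenant] = standby[event.resource]
    return fail_over


if __name__ == "__main__":
    active = {"tenant-1": "router-a"}
    manager = EventManager()
    manager.add_rule(outage_rule)
    manager.on("outage", make_failover_handler(active, {"router-a": "router-b"}))
    manager.ingest(Sample("tenant-1", "router-a", "heartbeat", 0.0))
    print(active)  # {'tenant-1': 'router-b'}
```

Keeping rules and handlers as plain callables is one way to keep the event management generic with respect to device types; a real system would also need persistence and an external API.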

ICCLab Present on Ceilometer at 2nd Swiss OpenStack User Group Meeting

On 19 February, the 2nd Swiss OpenStack User Group Meeting took place. One of the presentations, on Ceilometer, was given by Toni and Lucas from the ICCLab. They talked about the history, the current and future features, the architecture and the requirements of Ceilometer and explained how to use and extend it. You can take a look at the presentation here:

A video of the presentation is available here

Cloud Automation & Orchestration

Description

At the heart of the ICCLab is the Management Server, which provides an easy way to stage different setups for different OpenStack instances (productive, experimental, etc.). The Management Server provides DHCP, PXE and pre-configured processes that allow a bare-metal computing unit to be provisioned automatically using Foreman, and then to have preassigned roles installed using a combination of Foreman and Puppet. This provides a great deal of flexibility and support for different usage scenarios.

Puppet is a key enabler. It is an infrastructure automation system for the efficient management of large-scale infrastructures. Using a declarative approach, Puppet can manage infrastructure lifecycles, from provisioning and configuration to update and compliance management. All of these management capabilities are managed logically in a centralised fashion; however, the system itself can be implemented in a distributed manner. A key motivation for using Puppet is that all system configuration is codified in Puppet’s declarative language. This enables the sharing of “infrastructure as code” not only throughout an organisation but also outside it, by following open source models.
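The declarative idea can be illustrated independently of any tool: the desired state is described as data, and an engine works out which changes are needed to reach it. The sketch below is written in Python rather than Puppet's own language and is not how Puppet is implemented; the resource names and the `plan` and `converge` functions are hypothetical.

```python
from typing import Dict, List, Tuple

State = Dict[str, str]          # resource name -> state
Step = Tuple[str, str, str]     # (action, resource name, target state)


def plan(desired: State, actual: State) -> List[Step]:
    """Return the steps needed to bring `actual` in line with `desired`."""
    steps: List[Step] = []
    for name, value in sorted(desired.items()):
        if name not in actual:
            steps.append(("create", name, value))
        elif actual[name] != value:
            steps.append(("update", name, value))
    # Resources that are not mentioned in `desired` are left untouched
    # (a simplification chosen for this sketch).
    return steps


def converge(desired: State, actual: State) -> List[Step]:
    """Apply the plan to `actual` in place and return the steps taken."""
    steps = plan(desired, actual)
    for _action, name, value in steps:
        actual[name] = value
    return steps


if __name__ == "__main__":
    desired = {"package:ntp": "installed", "service:ntp": "running"}
    node = {"package:ntp": "installed", "service:ntp": "stopped"}

    print(converge(desired, node))  # [('update', 'service:ntp', 'running')]
    print(converge(desired, node))  # [] because the node already matches
```

Because the second run produces no steps, applying the same description repeatedly is safe, which is what makes codified configuration easy to share and re-run.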

Problem Statement

With the seemingly unlimited resources available today through cloud computing, it is quite possible to have large numbers of cloud resources (e.g. compute, storage, networking) delivering services to end users. Managing these cloud resources and the software stacks deployed on top of them is a huge challenge once the number of resources to configure and manage grows beyond single digits. The only way forward is to investigate, adopt and improve automated management, configuration and orchestration tools. With automation comes increased testability, reliability (when done right) and ultimately faster time to market, as exemplified by continuous integration and DevOps practices.

Articles and Info

There are a number of blog posts detailing how Foreman and Puppet are used in the ICCLab:

Contact Point

Konstantin Benz

Konstantin Benz is a researcher at the ICCLab. His field of expertise ranges from reporting, data analysis and applied statistics to the design of distributed system architectures. His main research interest is the system architecture of cloud computing infrastructure.

In 2008 he started his professional career in IT Service Management at IBM, where he worked on several projects aimed at re-engineering and further developing IBM’s internal infrastructure.

In 2011, he moved to the Technical University of Applied Science in Zurich, which has since become part of ZHAW. He participates in an MSE program at the Institute of Information Technology, where he studies Service Engineering.

Cloud Interoperability

Description

To be interoperable means to give cloud service instances common mobility capabilities: to extract any service instance in a common representation, to share all data related to a cloud service instance in and out of providers, and to allow cloud service instances to work together.

To bring interoperability, it must be present at the lowest level of the cloud stack, so IaaS should be the first target, with those interoperability capabilities then offered to the upper PaaS layer, where lock-in is even more prevalent. To achieve this, standard specifications need to be agreed upon by both the research and industrial domains. In the context of IaaS, this essentially means agreeing upon standardised ways to import and export IaaS customer deployments, to interface with those deployments in a common way during their lifecycle and runtime, and to access the data supplied in creating that deployment and generated by it. These three types of standards must cooperate and integrate, as no single standards defining organisation (SDO) can capture research and industry interest and supply all the relevant skills on its own. In terms of the IaaS domain this specifically means:

  • Standardised specifications for the import and export of virtualised infrastructure service instances
  • Standardised runtime specification to allow the run-time and life cycle management of virtualised infrastructure service instances
  • Standardised data access, import and export capabilities for the data that was used to create, and was generated by, the virtualised service instances
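As a small illustration of what import and export through a common representation means in practice, the sketch below serialises a service instance description to JSON and reads it back. The format name `example-common-v1` and all field names are hypothetical and do not correspond to any of the actual standards.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List

FORMAT = "example-common-v1"  # hypothetical format identifier


@dataclass
class ServiceInstance:
    name: str
    cpus: int
    memory_mb: int
    networks: List[str] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


def export_instance(instance: ServiceInstance) -> str:
    """Serialise an instance into the provider-neutral representation."""
    return json.dumps({"format": FORMAT, "instance": asdict(instance)}, sort_keys=True)


def import_instance(document: str) -> ServiceInstance:
    """Rebuild an instance from the provider-neutral representation."""
    payload = json.loads(document)
    if payload.get("format") != FORMAT:
        raise ValueError("unsupported representation")
    return ServiceInstance(**payload["instance"])


if __name__ == "__main__":
    original = ServiceInstance("web", 2, 4096, ["private"], {"owner": "tenant-1"})
    document = export_instance(original)      # what one provider would hand out
    restored = import_instance(document)      # what another provider would read
    print(restored == original)  # True
```

The round trip only works because both sides agree on the format identifier and the field names, which is exactly the role a standardised specification plays.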

Problem Statement

There are many challenges in cloud computing, but one that is core to enabling further value is the removal of lock-in and the enabling of interoperability between cloud services. Typical approaches to providing interoperability include setting standards through standards defining organisations such as DMTF, OGF and SNIA. The other approach is to provide software toolkits and frameworks such as jClouds, Apache libcloud and fog.io, which offer abstract programmatic APIs whose implementation carries out the semantic and syntactic mapping from the abstract interface to the target cloud service provider’s interface. Whereas both approaches provide some uniformity in operating cloud services, they do not cover other life cycle aspects. One area of investigation within the ICCLab is how to relocate services using one of the two approaches (or potentially both).
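The toolkit approach can be sketched as an abstract interface with one driver per provider, where each driver maps the abstract call onto that provider's own request format. The example below is a minimal illustration only: it does not reproduce the API of jClouds, Apache libcloud or fog.io, and both providers and their request fields are invented.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class ComputeDriver(ABC):
    """Abstract interface that application code programs against."""

    @abstractmethod
    def create_server(self, name: str, cpus: int, memory_mb: int) -> dict:
        """Return the provider-specific request for creating a server."""


class ProviderADriver(ComputeDriver):
    """Hypothetical provider that expresses sizes as a flavor string."""

    def create_server(self, name: str, cpus: int, memory_mb: int) -> dict:
        flavor = f"c{cpus}-m{memory_mb // 1024}g"
        return {"api": "provider-a",
                "request": {"server_name": name, "flavor": flavor}}


class ProviderBDriver(ComputeDriver):
    """Hypothetical provider that takes explicit CPU and memory fields."""

    def create_server(self, name: str, cpus: int, memory_mb: int) -> dict:
        return {"api": "provider-b",
                "request": {"InstanceLabel": name,
                            "VCpuCount": cpus,
                            "MemoryMiB": memory_mb}}


DRIVERS: Dict[str, Type[ComputeDriver]] = {"a": ProviderADriver, "b": ProviderBDriver}


def get_driver(provider: str) -> ComputeDriver:
    """Look up and instantiate the driver for a provider key."""
    return DRIVERS[provider]()


if __name__ == "__main__":
    for key in ("a", "b"):
        # The calling code is identical for both providers.
        print(get_driver(key).create_server("web", cpus=2, memory_mb=4096))
```

The calling code stays the same across providers, but nothing here helps with moving an already running service from one provider to the other, which is the life cycle gap noted above.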

Articles and Info

Contact Point

Andy Edmonds