Cloud Application Management

Overview

Today, large internet-scale services are still architected using the principles of service orientation. The overarching idea is that a service is not one large monolith but a composite of cooperating sub-services. How these sub-services are designed and implemented is determined either by the respective business function, as in traditional SOA, or by technical function or domain context, as in the microservice approach to SOA.

In the end, both approaches result in a set of services, each of which carries out a specific task or function. However, to bring all these service units together, an overarching process is needed to stitch them together and manage their runtimes, and in doing so to present the complete service to the end-user and to the developer/provider of the service.

The basic management process of stitching these services together is known as orchestration.

Orchestration & Automation

These are two concepts that are often conflated and used as if they were equivalent. They are not, but they are certainly related, especially when automation refers to configuration management (CM; e.g. Puppet, Chef, etc.).

Nonetheless, what both certainly share is that they are oriented around the idea of software systems that expose an API. With that API, manual processes once conducted through user interfaces or command line interfaces can now be programmed and then directed by higher level supervisory software processes. 

Orchestration goes beyond automation in this regard. Automation (CM) is the process that enables the provisioning and configuration of an individual node without consideration for the dependencies that node might have on others, or vice versa. This is where orchestration comes into play. Orchestration, in combination with automation, carries out the following phases:

  1. “Deploy”: the complete fleet of resources and services is deployed according to a plan. At this stage they are not configured.

  2. “Provision”: each resource and service is correctly provisioned and configured. This must be done such that no service or resource is left without a required operational dependency (e.g. a PHP application without its database).

This process is of course a simplified one and does not include the steps of design, build, runtime management and disposal of the orchestrated components (services and/or resources):

  • Design: where the topology and dependencies of each component are specified. The model here typically takes the form of a graph.

  • Build: how the deployable artefacts such as VM images, Python eggs and Java WAR files are created, either from source or from pre-existing assets. This usually has a relationship to a continuous build and integration process.

  • Runtime: once all components of an orchestration are running, the next key element is that they are managed. To manage means, at the most basic level, to monitor the components. Based on the metrics extracted, performance indicators can be formulated using logic-based rules. When notified that an indicator’s threshold has been breached, an orchestrator can take remedial action to ensure reliability.

  • Disposal: where a service is deployed through cloud services (e.g. infrastructure; VMs), it may be necessary to destroy the complete orchestration in order to redeploy a new version, or to destroy only part of the orchestration.
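The runtime step above (metrics, rule-based indicators, remedial actions) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Rule` class, the `evaluate` function, the metric names, the thresholds and the two actions are all hypothetical and do not come from any particular orchestrator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A threshold rule over one monitored metric (all names are illustrative)."""
    metric: str                     # e.g. "cpu_util"
    threshold: float                # the rule is breached when the value exceeds this
    action: Callable[[str], str]    # remedial action, given the component name


def scale_out(component: str) -> str:
    # Placeholder remedial action; a real orchestrator would call a provider API.
    return f"scale-out:{component}"


def restart(component: str) -> str:
    # Placeholder remedial action for a misbehaving component.
    return f"restart:{component}"


def evaluate(component: str, metrics: Dict[str, float], rules: List[Rule]) -> List[str]:
    """Return the remedial actions triggered by the current metric values."""
    triggered = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is not None and value > rule.threshold:
            triggered.append(rule.action(component))
    return triggered


rules = [Rule("cpu_util", 0.80, scale_out), Rule("error_rate", 0.05, restart)]
actions = evaluate("web-tier", {"cpu_util": 0.93, "error_rate": 0.01}, rules)
print(actions)
```

Keeping the rules as plain data separates the monitoring logic from the remedial actions, so either side can be changed without touching the other.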

Ultimately the goal of orchestration is to stitch together (deploy, provision) many components to deliver a functional system (e.g. replicated database system) or service (e.g. a 3-tier web application with API) that operates reliably.
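To make the dependency-aware "deploy, then provision" idea concrete, here is a minimal sketch. It assumes the design-time topology is a simple dictionary mapping each component to the components it depends on; the component names and the `orchestrate` helper are hypothetical. It uses Python's standard-library `graphlib` (Python 3.9+) to derive an order in which no component is provisioned before its dependencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical topology: each component maps to the components it depends on.
topology = {
    "database": set(),
    "php-app": {"database"},
    "load-balancer": {"php-app"},
}


def provisioning_order(graph):
    """Return components ordered so that dependencies come before dependents."""
    return list(TopologicalSorter(graph).static_order())


def orchestrate(graph):
    """Phase 1 deploys everything unconfigured; phase 2 provisions in dependency order."""
    log = [("deploy", name) for name in sorted(graph)]
    log += [("provision", name) for name in provisioning_order(graph)]
    return log


for phase, component in orchestrate(topology):
    print(phase, component)
```

A cyclic dependency makes `static_order()` raise `graphlib.CycleError`, which is a useful design-time check in its own right.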

Objectives

The key objectives of this initiative are:

  • Provide a reactive architecture that covers not only the case of controlling services but also service provider specific resources. What this means is that the architecture will exhibit responsiveness, resiliency, elasticity and be message-oriented. This architecture will accommodate all aspects that answer our identified research challenges. 
  • Deliver an open-source framework that implements orchestration for services in general and more specifically cloud-based services. 
  • Provide orchestration that enables reliable, cloud-native service delivery.

There are other objectives that are more related to delivering other research challenges.

Research Challenges

  • How best to enable and support SOA and microservice design patterns?
  • How to get insight and tracing within each service and across services, so that problems can be identified and understood?
  • Efficient management of large-scale composed service and resource instance graphs 
  • Scaling based on ‘useful’ monitoring, resource- and service-level metrics 
    • Consider monitoring and scaling systems, e.g. Monasca
    • How to program the scaling of an orchestrator spanning multiple providers and very different services? 
  • Provision of architectural recommendations and optimisation based on orchestration logic analysis
  • How to exploit orchestration capabilities to ensure reliability, e.g. a “load balancer for high availability” for cloud applications? How can a load-balancing service be automatically injected to ensure automatic scaling?
    • How could a service orchestration framework bring the techniques of Netflix and Amazon (internal services) to a wider audience?
    • Snapshot your service, rollback to your service’s previous state 
    • Reliability of the Service Orchestrator – how to implement this? HAProxy? Pacemaker? 
  • Orchestration logic should be able to be written in many popular languages 
  • Continuous integration of orchestration code and assets
  • Provider-independent orchestration execution that accommodates many resource/service providers.
    • Hybrid cloud deployments not well considered. How can this be done? 
    • Adoption of well known standards, openid, openauth and custom providers 
  • Authentication services – how to do this over disparate providers? 
  • How to create market places to offer services. Either the service being orchestrated or that service consuming others. 
  • Integration of business services that service owners can charge clients 
  • Containers for service workloads. Where might CoreOS, Docker, Rocket, Solaris Zones fit in the picture? 
    • If Windows is not a hard requirement, then it makes sense from a provider’s perspective to utilise container technology.
    • Do we really need full-blown “traditional” IaaS frameworks to offer orchestration?

Relevance to Current & Future Markets

Many companies’ products aim to provide orchestration of resources in the cloud, such as SixSq (SlipStream), Cloudify, Zenoss Control Center and Nirmata. There are also several open source projects, especially related to OpenStack, that touch on the orchestration topic: OpenStack Heat, Murano and Solum.

Our market survey established a lack of cross-domain (spanning different service providers), service-oriented orchestration: many solutions take the lower-level approach of orchestrating resources directly, and very often on a single provider. These solutions also differ widely in their programming models, although there is growing interest in a standards-based orchestration description, with TOSCA being the most talked about. Another identified issue is the lack of reliability of the services and resources orchestrated by these products, a barrier to adoption that this initiative aims to remove. In addition, many solutions have either no runtime management or only limited capabilities.

From a more general point of view, cloud orchestration brings the following benefits to customers:

  • Orchestration reduces the overhead of manually configuring all the services that comprise a cloud-native application.
  • Orchestration gets new updates to a service implementation out faster and better tested, through continuous testing, integration and deployment.
  • Reliable orchestration ensures that the linkage and composition of services keeps running at all times, even where one or more components fail. This reduces the downtime experienced by clients and keeps the service provider’s service available.
  • Orchestration brings reproducibility and portability to cloud services, which may run on any cloud provider that the orchestration software controls.

Architecture

The key entities of the architecture and their relationships to basic entities are shown in the following diagram.

[Figure: c-orch-arch-entity-model]

Related Projects

Contact

Energy Aware Cloud Load Management

Resource Management in Cloud Computing is a topic that has received much interest both within the research community and within the operations of the large cloud providers; naturally, as it has a significant impact on the cloud provider’s bottom line. Much of the work to date on resource management focuses on Service Level Agreements (for different definitions of an SLA); some of the work also considers energy as a factor.

Objectives

The primary objective of this work is to develop an energy aware load management solution for OpenStack: variants of this have been proposed before and indeed implemented in other stacks (e.g. Eucalyptus), but no such capability exists for OpenStack as yet. As well as realizing the solution, the work will involve deploying a variant of it on the cloud platform without impacting the operation of the platform, and determining what energy savings can be made. It is worth noting that the classical load balancing approach, which is very typical for resource managers in cloud contexts, is somewhat contradictory to minimizing energy consumption: balancing spreads load across all servers, whereas saving energy favours consolidating it onto fewer of them. Consequently, standard load management tools are not suitable for minimizing cloud energy consumption.

Research Challenges

The research challenges are the following:

  • How to characterize the load in the system, particularly relating to spikes in demand
  • How much buffer space to maintain to accommodate load spikes
  • How to perform load consolidation – what load should be moved to what machines?
  • When to perform load consolidation – how frequently should it take place?
  • What are the energy gains that can be achieved from such a dynamic system?
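As a rough illustration of the consolidation and buffer-space questions above, the sketch below packs VM loads onto as few hosts as possible using the first-fit decreasing bin-packing heuristic, while keeping a fixed fraction of each host free for load spikes. This is a textbook heuristic chosen for illustration, not the control mechanism of this project; the function name, the load units and the `headroom` parameter are assumptions.

```python
def consolidate(vm_loads, host_capacity, headroom=0.2):
    """Pack VM loads onto as few hosts as possible (first-fit decreasing).

    vm_loads: dict of VM name -> load, in the same unit as host_capacity.
    headroom: fraction of each host kept free to absorb load spikes.
    Returns a list of hosts, each given as a list of VM names.
    """
    usable = host_capacity * (1.0 - headroom)
    hosts = []  # each entry is [remaining_capacity, [vm names]]
    # Place the largest loads first; ties keep their original order.
    for vm, load in sorted(vm_loads.items(), key=lambda kv: kv[1], reverse=True):
        if load > usable:
            raise ValueError(f"{vm} does not fit on any host with this headroom")
        for host in hosts:
            if load <= host[0]:
                host[0] -= load
                host[1].append(vm)
                break
        else:
            # No existing host has room, so another host must stay powered on.
            hosts.append([usable - load, [vm]])
    return [vms for _, vms in hosts]


loads = {"vm-a": 4, "vm-b": 3, "vm-c": 2, "vm-d": 2, "vm-e": 1}
placement = consolidate(loads, host_capacity=10, headroom=0.2)
print(len(placement), placement)
```

Hosts that receive no VMs are candidates for being powered down. A larger `headroom` trades some of that potential saving for a bigger buffer against demand spikes, which is exactly the tension the challenges above describe.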

Relevance to current and future markets

Advanced resource management mechanisms are a necessity for cloud computing generally. In the case of large deployments, Facebook’s Autoscale is an example of how they can be used to achieve energy savings of the order of 15%. In the case of smaller deployments, it is still the case that there are many highly underutilized servers (https://gigaom.com/2013/11/30/the-sorry-state-of-server-utilization-and-the-impending-post-hypervisor-era/) in typical Data Centres and ultimately there will be a need to reduce costs and realize energy efficiencies. The problem is a large, general problem and energy is one specific aspect of it – one of the challenges for this work is how to integrate with other active parts of the ecosystem.

There are some commercial offerings which explicitly address energy efficiency in the cloud context. These include:

Impact

Architecture

See the Energy Theme for the larger system architecture.

Implementation Roadmap

The next steps on the implementation roadmap are as follows:

  • Get tunnelled post-copy live migration working with modifications to libvirt (Jan 2015)
  • See if this can be pushed upstream to libvirt
  • Consolidate live migration work into clearer message relating to the potential of live migration (Jan 2015)
  • Devise control mechanism which can be used to provide energy based control (Feb 2015)
  • Deploy and test on Arcus servers (Mar 2015)
  • Determine if it is ready for deployment on Bart/Lisa (April 2015)

Contact

 

Understanding Cloud Energy Consumption

Energy in general and energy consumption in particular is a major issue for the large cloud providers today. Smaller cloud providers – both private and public – also have an interest in reducing their energy consumption, although it is often not their most important concern. With increasing competition and decreasing margins in the IaaS sector, management of energy costs will become increasingly important.

A basic prerequisite of advanced energy management solutions is a good understanding of energy consumption. This is increasingly available in multiple ways as energy meters proliferate: as well as having energy meters on racks, energy meters typically exist in modern hardware and even at subsystem level within today’s hardware. That said, energy metering is something that is commonly coupled to proprietary management systems.

The focus of this initiative is to develop an understanding of cloud energy consumption through measurement and analysis of usage.

Objectives

The objectives of the energy monitoring initiative are:

  • to develop a tool to visualize how energy is being consumed within the cloud resources;
  • to understand the correlation between usage of cloud resources and energy consumption;
  • to understand what level of granularity is appropriate for capturing energy data;
  • to devise mechanisms to disaggregate energy consumption amongst users of cloud platforms.
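One simple way to approach the last objective is to split a host's measured energy among its users in proportion to their CPU time, with an optional idle portion shared evenly. The sketch below is only one possible attribution policy under stated assumptions; the function name, the `idle_share` parameter and the sample figures are hypothetical and not taken from the tool described here.

```python
def disaggregate(host_energy_wh, cpu_seconds_by_user, idle_share=0.0):
    """Split a host's measured energy among users in proportion to CPU time.

    idle_share: fraction of the energy treated as idle overhead and split
    evenly among users; the remainder is split by CPU usage.
    """
    users = list(cpu_seconds_by_user)
    if not users:
        return {}
    total_cpu = sum(cpu_seconds_by_user.values())
    idle_energy = host_energy_wh * idle_share
    dynamic_energy = host_energy_wh - idle_energy
    result = {}
    for user in users:
        # Fall back to an even split if nobody used any CPU at all.
        share = cpu_seconds_by_user[user] / total_cpu if total_cpu else 1 / len(users)
        result[user] = idle_energy / len(users) + dynamic_energy * share
    return result


# Made-up figures for illustration: 1000 Wh measured on one host.
usage = {"alice": 600.0, "bob": 300.0, "carol": 100.0}
attribution = disaggregate(1000.0, usage, idle_share=0.4)
print(attribution)
```

Whatever the policy, the per-user shares should add up to the metered total, which gives a cheap sanity check for any disaggregation mechanism.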

Research Challenges

Understanding cloud energy consumption does not give rise to fundamental research challenges – indeed, it is more of an enabler for a more advanced energy management system. However, to have a comprehensive understanding of cloud energy consumption, some research effort is required. The following research challenges arise in this context:

  • How to consolidate energy consumption from disparate sources to realize a clear understanding of energy consumption within the cloud environment
  • How to correlate energy consumption with revenue generating services at a fine-grained level (compute, storage and networking)
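A first step towards correlating usage with energy consumption could look like the sketch below: a Pearson correlation coefficient and a least-squares fit of power against CPU utilisation, where the intercept approximates idle power. The sample values are invented purely for illustration and are not measurements from this work.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def linear_fit(xs, ys):
    """Least-squares fit ys = intercept + slope * xs; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


# Invented samples: CPU utilisation (0..1) and power draw in watts.
cpu_util = [0.1, 0.3, 0.5, 0.7, 0.9]
power_w = [110.0, 130.0, 150.0, 170.0, 190.0]

r = pearson(cpu_util, power_w)
idle_w, watts_per_unit_util = linear_fit(cpu_util, power_w)
print(round(r, 3), round(idle_w, 1), round(watts_per_unit_util, 1))
```

With real data the relationship is unlikely to be this clean, and the same analysis would need to be repeated for storage and networking metrics.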

Relevance to current and future markets

Understanding energy consumption is essential for the large cloud providers as well as for today’s Data Centre providers. Consequently, there are already solutions available which support monitoring of energy consumption of IT resources. Today’s solutions typically do not have specific knowledge of cloud resource utilization and consequently, there is an opportunity for new tools which correlate cloud usage with energy monitoring.

In the Gartner Hype Cycle for Green IT 2014, there are some related technologies which have growth potential over the coming years. Specifically, these are:

  • DCIM Tools
  • Server Digital Power Management Module
  • Demand Response Management Tools

As such, there are future market opportunities for such energy related work. However, we are still evaluating its commercial potential.

Impact

Architecture

TBA.

Implementation Roadmap

This work has largely resulted in a live demonstrator. At present, there is not a significant effort to add more features and capabilities.

The current tasks on the roadmap are:

  • Ensure system is live – maintenance task
  • Periodically review energy consumption
  • Review usage of cloud resources and determine the amount of resources necessary to support this amount of utilization; thus the potential energy saving can be determined.
  • Promote the tool somewhat
  • Presentation at next OpenStack Meetup
  • Investigate deployment opportunities

Contact

Apache CloudStack for NFV (ACeN)

Apache CloudStack for NFV (ACeN) is a project that is funded by the Commission for Technology and Innovation.

The ACeN project seeks to deliver services and prototypes based on the ETSI Network Function Virtualisation (NFV) standard and Apache CloudStack. A novel hybrid load-balancing service (HLBS) will be created and key NFV demonstrators will be prototyped. All will follow a common architectural approach, on common technology, with contributions to open-source communities (under ASL 2.0) by Swiss implementation partners. This work will leverage and can enable access to a market worth up to $2.4 billion by 2018.

The HLBS will deliver, in a cloud-native fashion, a combination of two key functionalities:

  1. Elastic Load Balancing (ELB)
  2. Elastic IPs (EIP)

Proof of CloudStack capabilities will be demonstrated through two NFV prototypes that are of importance to partners, namely:

  1. NFV use case (UC) one through the implementation of an on-demand tenant-based inter-data centre connectivity VPN using CloudStack.
  2. NFV UC five with an on-demand IMS service using CloudStack.

Contact

 

SafeSwissCloud

Title: Flexible Billing and Cyber Intelligence in Virtual Data Centres

Industry Partner: SafeSwissCloud

Research Partner: ICCLab, ZHAW

Funded by: Commission for Technology and Innovation

Summary: Flexible billing in Virtual Data Centers allows Safe Swiss Cloud to easily apply various charging models to its products and services and thus better meet clients’ needs. It also gives resellers and partners instant access to the flexible billing system, so they can start doing business immediately without having to invest in their own billing systems.

Cloud billing models currently in use do not deliver the flexibility required to align the charging of existing and new cloud services with clients’ needs, nor do they deliver these capabilities to third-party providers such as resellers and other partners.
Flexible billing will provide the required flexibility by introducing a highly customizable system. Flexible billing will be provided for Safe Swiss Cloud as well as its resellers and other partners.

Cyber intelligence is a powerful security argument which will enable Safe Swiss Cloud to reach a client base with high security requirements. This will lead to additional clients Safe Swiss Cloud would otherwise not be able to acquire. Already today, with its current positioning, Safe Swiss Cloud attracts the interest of clients with high security requirements and consequently has to deliver on that.

The innovations proposed enable resellers and partners to start selling secure Swiss cloud services immediately, with the help of a flexible rating, charging and billing system. The innovations also involve intelligent mechanisms to detect internal and external security threats by monitoring and learning from the resource consumption and network traffic patterns of the cloud. Together these are designed to create a competitive edge for Safe Swiss Cloud.


Giovanni Toffetti Carughi

Giovanni Toffetti Carughi is a senior lecturer in the InIT Cloud Computing Lab and SPLab at the Zurich University of Applied Sciences.

Apart from country hopping, in the last 15 years Giovanni has had his fair share of startup, academic, and large industry research experience.
He graduated in 2001 from Politecnico di Milano (PoliMi) after a five-year engineering degree, and right away joined WebRatio in the early days of WebML.
He received his PhD in information technology from PoliMi in 2007 with a thesis on modelling and code generation of data-intensive rich internet applications.

He then went on to be a postdoc and a research fellow respectively at the University of Lugano (USI), and University College London (UCL).
In January 2013 he joined the IBM Haifa research labs where he was part of the cloud operating systems team until early December 2014.

During his professional career Giovanni has been involved with different roles in several EU funded projects, namely PLASTIC, RESERVOIR, UNIVERSELF, and FIWARE.

His main research interests are currently cloud-native applications with a focus on elasticity/scalability/availability, web engineering, IaaS/PaaS cloud computing, and cluster schedulers.

At InIT, Giovanni is currently leading the Cloud Robotics initiative.
You can contact him at toff(at)zhaw.ch

FUSECO Forum Asia 2014

ICCLab participated in the first FUSECO (Future Seamless Communications) Forum in Asia, held in Bali, Indonesia on the 9th and 10th June 2014. The Forum was organised jointly by PT Telkom Indonesia and Fraunhofer FOKUS. The event was attended by around 300 guests from 14 countries and hosted talks on several technical and business aspects of emerging ecosystems within smart cities and more.

The two-day event featured four keynotes, six technical sessions, one vendor session and a final panel discussion on the challenges and opportunities in the establishment of smart city infrastructure. In addition, there were vendor exhibitions from Huawei, Fiberhome, ZTE, Cisco and Fraunhofer.

The first day of the Forum began with the opening talk by Prof. Dr. Thomas Magedanz (Fraunhofer FOKUS), Ir. Joddy Hernady and Ir. Indra Utoyo (Telkom Group Indonesia). Following the opening, two keynotes were presented: the first by Prof. Dr. Radu P-Zeletin on “ICT Convergence enabling Smart Cities” and the second by Indra Utoyo on “Enabling a Converged World though Ecosystem Solution”.

Three technical sessions followed the keynotes. The first, “Digital Lifestyle and Smart City Applications as drivers for mobile broadband network evolution”, was chaired by Prof. Alfonso Ehijo (Uni. of Chile). The second, “Smart City Network - Evolution Path from LTE towards 5G and SDN”, was chaired by Prof. Rui Aguiar (Universidade de Aveiro, Portugal). The third, “Internet of Things/M2M as backbone of smart cities”, was chaired by Dr. Adel Al-Hezmi (Fraunhofer FOKUS, Germany) and Prof. Noel Crespi (Telecom SudParis).

The first day ended with a Balinese beach party and buffet for the delegates, giving them a chance to discuss the subjects raised at the Forum and offering a glimpse of the local food, dance and music.

The second day of the Forum started with two keynotes: the first on “Mastering the Innovation Challenges of the Future Network Operators in an Emerging IoT World” by Dr. Roberto Minerva (Telecom Italia), and the second by Ir. Rizkan Chandra, who talked about the challenges of the transition from a telecommunications company into a digital company.

As on the first day, three technical sessions followed the keynotes. The first, on “Future Internet Technologies and Enablers as Foundation for Smart Cities”, was chaired by Serge Fdida (UPMC Sorbonne University, France). The second and third sessions were both on “Smart Cities - Best practices and current rollout plans”: the second was chaired by Dr. Niklas Blum (Fraunhofer FOKUS, Germany) and Prof. Akihiro Nakao (Tokyo University, Japan), and the final session by Prof. Alfonso Ehijo (Uni. of Chile) and Assist. Prof. Dr. Supavadee Aramvith (Chulalongkorn University, Thailand).

On the second day our own Thomas Bohnert gave a talk on “FIWARE – A European Perspective on enabling Future Internet and Smart Cities”. He was also one of the panelists on the panel “Smart City Ecosystems, Challenges, and Opportunities”.

After the technical sessions concluded, a Dedicated Best Practice Vendor Session brought the Vendor exhibitors (mentioned above) to the stage to present their current engagements in Cloud based IMS deployments, Smart City portals and solutions in China.

The event ended with a final announcement that, following the successful execution of the first FFAsia, the second FUSECO Forum Asia would be held in Bali again in mid 2015, and would also feature an additional IEEE FUSECO Workshop.


Managing ceilometer data in OpenStack

Ceilometer can collect a large amount of data, particularly in a system with a large number of servers and high activity: in such a scenario, the numbers of meters and samples can be large, which affects ceilometer performance and gives rise to quite large databases. In our particular case we are studying energy consumption in servers and how resource utilization (mainly CPU) may relate to overall energy consumption. The energy data is collected through Kwapi and stored in ceilometer every 10 seconds (yes, this is probably too fine-grained!). We had the problem that the database grew too quickly, filling up the root disk partition on the controller and causing significant problems for the system. In this blog post, we describe the approach we now use for managing ceilometer data, which ensures that the resources consumed by ceilometer remain under control.
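To get a feel for why a 10-second sampling interval adds up quickly, the following back-of-the-envelope sketch estimates the number of samples stored per day. Only the 10-second interval comes from the text; the server count, the number of meters per server and the bytes per stored sample are assumptions chosen for illustration, not measured values.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(servers, meters_per_server, interval_s):
    """Number of samples written per day at a fixed sampling interval."""
    return servers * meters_per_server * (SECONDS_PER_DAY // interval_s)


def storage_per_day_mb(samples, bytes_per_sample):
    """Approximate daily storage growth in megabytes."""
    return samples * bytes_per_sample / 1e6


# Assumed deployment: 10 servers, one energy meter each, sampled every 10 s.
daily_samples = samples_per_day(servers=10, meters_per_server=1, interval_s=10)
# Assumed size of one stored sample, including metadata: 1000 bytes.
daily_mb = storage_per_day_mb(daily_samples, bytes_per_sample=1000)
print(daily_samples, daily_mb)
```

Under these assumptions a single meter type already produces tens of thousands of samples a day; every additional meter or server multiplies that figure.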

Performance analysis of “post-copy” live migration in OpenStack

Previously we described how to set up post-copy live migration in OpenStack Icehouse (and it should not be a problem to set it up in the same way in Juno). Naturally, we were curious to see how it performs. In this blog post we focus on performance analysis of post-copy live migration in OpenStack Icehouse using QEMU/KVM with libvirt.

Swiss FIWARE acceleration conference – ICCLab 5th Dec.14


The first Swiss FIWARE acceleration Conference was successfully held on 5 December 2014 at the Zurich University of Applied Sciences ICCLab. There were around 40 participants from the public and private sectors of Switzerland, and some from Germany and Italy.
Overall, it was an excellent opportunity to receive a relevant introduction to the FIWARE enablers and the related upcoming open calls.

The agenda of the day was organised to provide the views necessary for understanding the opportunities offered by the FIWARE project. The European Commission (Ragnar Bergström) gave an overview of the Future Internet PPP and its progress in Brazil and Mexico. The speech was followed by an extensive introduction to the FIWARE project by Thomas M. Bohnert (ICCLab), covering its ecosystem and the Generic Enablers. In the morning Sandro Brunner (ICCLab) gave a first demo on how to utilise FIWARE Lab and how to mash an application up utilising the sensor infrastructure available in the city of Santander, Spain. The demo was repeated in the afternoon and followed by a Q&A session.

Since the entire FIWARE platform is strongly based on the cloud, the participants also had the opportunity to hear the view of Equinix, sponsor of the event. The speech by Sachin Sony (Equinix UK) met its objectives of giving information on the market trends for cloud services and on how his company is involved worldwide as a provider for many of the big Over The Top players. The conference also offered some participants the opportunity to introduce project ideas to the two A16 accelerators (Speedup! Europe and SOUL-FI), represented by Olaf-Gerd Gemein and Maria Augusta Mancini respectively. They explained at length their project goals and the support offered for market acceleration and for the upcoming open calls (Fig. 1).

Many thanks to everyone who attended the event, and special thanks to the European Commission, represented by Ragnar Bergström, and to all the speakers.

by Antonio Cimmino (ICCLab)

Fig. 1 – Open Call plan

Sponsors :

CSfi-ppp concord_logo_bigger

equinix