Energy Aware Cloud Load Management

Resource management in cloud computing has received much interest both within the research community and within the operations of the large cloud providers, naturally so, as it has a significant impact on a provider’s bottom line. Much of the work to date focuses on meeting Service Level Agreements (SLAs, under varying definitions); some of it also considers energy as a factor.

Objectives

The primary objective of this work is to develop an energy-aware load management solution for OpenStack. Variants of this have been proposed before, and indeed implemented in other stacks (e.g. Eucalyptus), but no such capability exists for OpenStack as yet. As well as realizing the solution, the work will involve deploying a variant of it on the cloud platform without impacting the platform’s operation, and determining what energy savings can be made. It is worth noting that the classical load balancing approach typical of resource managers in cloud contexts runs somewhat counter to minimizing energy consumption: balancing spreads load across all machines and so keeps them all powered on, whereas saving energy requires concentrating load so that idle machines can be powered down. Consequently, standard load management tools are not suitable for minimizing cloud energy consumption.
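The tension between load balancing and energy minimization can be illustrated with a small back-of-the-envelope sketch. It assumes a simple linear power model in which a powered-on host draws a fixed idle power plus a component proportional to utilization; the wattages (100 W idle, 200 W at full load) and the function names are illustrative assumptions, not measurements from this project.

```python
def host_power(utilization, idle_w=100.0, max_w=200.0):
    """Power draw in watts of one powered-on host under an assumed linear model."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_w + (max_w - idle_w) * utilization


def cluster_power(active_utilizations):
    """Total power of the powered-on hosts; powered-off hosts are assumed to draw 0 W."""
    return sum(host_power(u) for u in active_utilizations)


# The same total load (1.2 host-equivalents) placed in two ways on a 4-host cluster.
balanced = cluster_power([0.3, 0.3, 0.3, 0.3])   # load spread over all four hosts
consolidated = cluster_power([0.6, 0.6])         # two hosts loaded, two powered off

print(f"balanced:     {balanced:.0f} W")
print(f"consolidated: {consolidated:.0f} W")
```

Under these assumed figures the balanced placement draws roughly 520 W and the consolidated one roughly 320 W. The difference comes entirely from the idle power of the two hosts that consolidation allows to be switched off.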

Research Challenges

The research challenges are the following:

  • How to characterize the load in the system, particularly relating to spikes in demand
  • How much buffer space to maintain to accommodate load spikes
  • How to perform load consolidation – what load should be moved to what machines?
  • When to perform load consolidation – how frequently should it take place?
  • What are the energy gains that can be achieved from such a dynamic system?
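To make the consolidation and buffer-space questions above more concrete, the following is a minimal sketch of one possible placement heuristic: first-fit decreasing bin packing with a configurable headroom reserved on each host for load spikes. It is not the mechanism developed in this work; the `Host` class, the `consolidate` function, the capacity units and the 20% default headroom are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    capacity: float                          # e.g. vCPUs; units are illustrative
    vms: dict = field(default_factory=dict)  # VM name -> demand

    @property
    def used(self):
        return sum(self.vms.values())


def consolidate(vm_demands, hosts, headroom=0.2):
    """Place VMs on as few hosts as possible using first-fit decreasing.

    Each host is filled only up to capacity * (1 - headroom), so that the
    remaining fraction is kept free as a buffer for load spikes.
    Returns the hosts left empty, i.e. candidates for powering down.
    """
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    # Largest VMs first: this tends to leave less fragmented free space.
    ordered = sorted(vm_demands.items(), key=lambda kv: kv[1], reverse=True)
    for vm, demand in ordered:
        for host in hosts:
            if host.used + demand <= host.capacity * (1.0 - headroom):
                host.vms[vm] = demand
                break
        else:
            raise RuntimeError(f"no host has room for {vm}")
    return [h for h in hosts if not h.vms]


hosts = [Host("h1", 10), Host("h2", 10), Host("h3", 10)]
idle = consolidate({"a": 4, "b": 3, "c": 2, "d": 2}, hosts)
print([h.name for h in idle])
```

In this example the four VMs fit on two hosts while respecting the headroom, leaving the third host empty. A real solution would also have to weigh the cost of each live migration and decide how often to re-run the placement, which is where the remaining research questions come in.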

Relevance to current and future markets

Advanced resource management mechanisms are a necessity for cloud computing generally. In large deployments, Facebook’s Autoscale is an example of how they can be used to achieve energy savings of the order of 15%. In smaller deployments, there are still many [[ https://gigaom.com/2013/11/30/the-sorry-state-of-server-utilization-and-the-impending-post-hypervisor-era/ | highly underutilized servers ]] in typical data centres, and ultimately there will be a need to reduce costs and realize energy efficiencies. Resource management is a large, general problem and energy is one specific aspect of it – one of the challenges for this work is how to integrate with other active parts of the ecosystem.

There are some commercial offerings that explicitly address energy efficiency in the cloud context. These include:

Impact

Architecture

See the Energy Theme for the larger system architecture.

Implementation Roadmap

The next steps on the implementation roadmap are as follows:

  • Get tunnelled post-copy live migration working with modifications to libvirt (Jan 2015)
  • See if this can be pushed upstream to libvirt
  • Consolidate the live migration work into a clearer message about the potential of live migration (Jan 2015)
  • Devise a control mechanism that can be used to provide energy-based control (Feb 2015)
  • Deploy and test on Arcus servers (Mar 2015)
  • Determine whether it is ready for deployment on Bart/Lisa (Apr 2015)

Contact