Open Stack’s capabilities to support High Availability are very limited. If a virtual machine crashes, there is no automatic recovery. Clustering software seems a to be a great workaround to allow redundancy and implement High Availability (HA).
Pacemaker is a scalable cluster resource manager developed by Clusterlabs. Its advantages are:
- Support of many different deployment scenarios
- Monitoring of resources
- Recovery from outtages
According to the OpenStack documentation website the OpenStack HA environment builds on Pacemaker and Corosync. Corosync is Pacemaker’s message layer which is responsible for the distribution of clustering messages. The Pacemaker software uses resource agents that manage different ressources and communicate via Corosync. Corosync is responsible for synchronizing DRBD block devices which are virtual devices layered on top of the machine node devices themselves (like hard-disks etc.). The DRBD block device layer allows clustering of different machine nodes, while Corosync organizes the synchronicity of data in these clusters. Pacemaker resource agents control the DRBD devices via Corosync and are therefore able to organize high availability of machine nodes in an OpenStack environment.
Integration of Pacemaker into OpenStack is a major step towards creating a HA cloud environment. There’s an ongoing evaluation how Pacemaker fits into the MobileCloud environment, but it is obvious that there should be a test procedure to evaluate availability of cloud resources in different integration scenarios. Follow up information on this subject will be posted in a further blog post.