In the Linux world, a popular approach to build highly available clusters is with a set of software tools that include pacemaker (as resource manager) and corosync (as the group communication system), plus other libraries on which they depend and some configuration utilities.
On Illumos (and in our particular case, OmniOS), the ihac project is abandoned and I couldn’t find any other platform-specific open source and mature framework for clustering. Porting pacemaker to OmniOS is an option and this post is about our experience with this task.
The objective of the post is to describe how to get an active/passive pacemaker cluster running on OmniOS and to test it with a Dummy resource agent. The use case (or test case) is not relevant, but what should be achieved in a correctly configured cluster is that, if the node of the cluster running the Dummy resource (active node) fails, then that resource should fail-over and be started on the other node (high availability).
This is what we will cover:
- Configuring the machines
- Patching and compiling the tools
- Running pacemaker and corosync from SMF
- Running an active/passive cluster with two nodes to manage the Dummy resource