Use pacemaker and corosync on Illumos (OmniOS) to run an HA active/passive cluster

In the Linux world, a popular approach to build highly available clusters is with a set of software tools that include pacemaker (as resource manager) and corosync (as the group communication system), plus other libraries on which they depend and some configuration utilities.

On Illumos (and in our particular case, OmniOS), the ihac project is abandoned and I couldn’t find any other mature, open source, platform-specific framework for clustering. Porting pacemaker to OmniOS is an option, and this post is about our experience with this task.

The objective of this post is to describe how to get an active/passive pacemaker cluster running on OmniOS and to test it with a Dummy resource agent. The use case (or test case) itself is not relevant; what matters is that, in a correctly configured cluster, if the node running the Dummy resource (the active node) fails, the resource fails over and is started on the other node (high availability).

I assume you are starting from a fresh installation of OmniOS 151012 with a working network configuration (and ssh, for your comfort!). Check the general administration guide, if needed.

This is what we will cover:

  • Configuring the machines
  • Patching and compiling the tools
  • Running pacemaker and corosync from SMF
  • Running an active/passive cluster with two nodes to manage the Dummy resource

Installing packages

Some packages are installed from the default repositories, but others need to be retrieved from opencsw.

Install from default repositories:

# pkg install developer/gnu-binutils text/gnu-grep \
 rsync gnu-tar wget text/gnu-sed compatibility/ucb text/gawk \
 autoconf gnu-m4 system/header header-math \
 ipmitool gnu-make developer/build/libtool library/libtool/libltdl \
 library/ncurses library/security/openssl text/gnu-gettext \
 developer/versioning/mercurial \
 developer/versioning/git \
 SUNWcs driver/network/ofk system/header \
 developer/library/lint developer/object-file \
 system/library/mozilla-nss/header-nss library/nspr/header-nspr \
 xz pkg://omnios/developer/swig \
 package/pkg file/gnu-coreutils \
 system/header/header-picl developer/gcc48

Install the OpenCSW utility and update the repos:

# pkgadd -d
# /opt/csw/bin/pkgutil -U

Install the packages from CSW:

# /opt/csw/bin/pkgutil -i ggettext pkgconfig libnet gnutls libgnutls_dev libgnutls13 libev_dev libevent_dev libgcrypt11
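Before moving on, it can help to confirm that the build tools used later in this post are actually reachable on your PATH. The following is only a minimal sketch: the function name missing_tools is made up for illustration, and the list of tools to pass in is up to you.

```shell
# Print every command from the argument list that is not found on PATH.
# Returns 0 when nothing is missing, 1 otherwise.
# (missing_tools is a hypothetical helper name, not part of any package.)
missing_tools() (
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    exit "$status"
)
```

For example, after sourcing the environment file described in the next section: missing_tools gmake gtar autoreconf git wget perl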

Configure the environment


Some environment variables will be needed by the tools and scripts, but also during the building process. The easiest thing is to create a file (e.g., pacemaker.rc) and source it to get the pacemaker environment ready. You may want to separate the variables needed only for running the tools from the various flags needed during the build.

NOTE: there should be no particular reason to change the installation prefix (PREFIX), but if you do, adapt the rest of the instructions accordingly where needed.

Content of pacemaker.rc:

export PCMK_ipc_type=socket
export PREFIX=/opt
export CFLAGS="-D__EXTENSIONS__ -D_POSIX_PTHREAD_SEMANTICS -DNAME_MAX=255 -DHOST_NAME_MAX=255 -I/opt/gcc-4.8.1/include -I/usr/include -I${PREFIX}/include -I/opt/ha/include -I/opt/gcc-4.8.1/lib/gcc/i386-pc-solaris2.11/4.8.1/include/ -lsocket -lnsl"
export LDFLAGS="-R/usr/gnu/lib -L${PREFIX}/lib -L/opt/gcc-4.8.1 -L/usr/gnu/lib -L/lib -L/usr/lib"
export PATH=/usr/gnu/bin:/opt/gcc-4.8.1/bin/:/opt/csw/bin:/usr/gnu/bin:/usr/bin:/usr/sbin:/usr/local/bin:$PREFIX/bin:/sbin/:/opt/csw/gnu/:${PREFIX}/sbin
export PKG_CONFIG_PATH='/opt/lib/pkgconfig:/usr/lib/pkgconfig:/usr/local/lib/pkgconfig'
export PKG_CONFIG_LIBDIR='/opt/lib/pkgconfig:/usr/lib/pkgconfig:/usr/local/lib/pkgconfig'
export LCRSODIR=/usr/libexec/lcrso 
export CLUSTER_USER=hacluster
export CLUSTER_GROUP=haclient
export BUILDPATH=/export/builds
export LD_ALTEXEC=/usr/gnu/i386-pc-solaris2.11/bin/ld
export CONFIG_SHELL=/usr/gnu/bin/sh
export PYTHONPATH=${PREFIX}/lib/python2.6/site-packages
export OCF_ROOT=/opt/usr/lib/ocf

Then source the file to have the configuration on the current shell:

# source pacemaker.rc
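A quick way to confirm that the file was really sourced in the current shell is to check that the variables it defines are non-empty. This is a small sketch under that assumption; unset_vars is a hypothetical helper name, and it expects plain variable names as arguments.

```shell
# Report which of the named variables are unset or empty in the current shell.
# Returns 0 when all are set, 1 otherwise.
# (unset_vars is a hypothetical helper; pass only plain variable names.)
unset_vars() (
    status=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name"
            status=1
        fi
    done
    exit "$status"
)
```

For example: unset_vars PREFIX BUILDPATH CLUSTER_USER CLUSTER_GROUP OCF_ROOT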

Now update the library path on the system to include the CSW objects:

# crle -l /opt/csw/lib/ -u


Create the directories used for the build and by the cluster user:

# mkdir -p $BUILDPATH
# mkdir -p $PREFIX/var
# mkdir -p $PREFIX/lib/heartbeat/cores/$CLUSTER_USER

Cluster user and group

We create the hacluster user and the haclient group that will run the cluster, then we set some permissions on the folders that we created before.

Note that the corosync and pacemaker processes will run as the hacluster user (as per the SMF script that comes later), so a common problem when using resource agents is missing permissions on directories or executables.

# getent group ${CLUSTER_GROUP} >/dev/null || groupadd ${CLUSTER_GROUP}
# getent passwd ${CLUSTER_USER} >/dev/null || useradd -g ${CLUSTER_GROUP} -d $PREFIX/lib/heartbeat/cores/$CLUSTER_USER -s /bin/bash -c "cluster user" ${CLUSTER_USER}

Similarly, hacluster won’t have enough rights to run commands that modify the system, such as ipadm create-addr, so we give it passwordless sudo powers. If a resource agent that you want to run needs sudo for some of its instructions, it will need to be patched.


# visudo

Then append this line at the end of the file to enable passwordless sudo:

hacluster    ALL=(ALL) NOPASSWD: ALL

An alternative would be to set appropriate Role-Based Access Control (RBAC) authorizations.

UPDATE: running pacemaker and corosync as root should work without issues. So you can edit the SMF script and use “root” instead of “hacluster” in the “CLUSTER_USER” variable.


Set the hostnames of both machines by appending an entry at the end of /etc/hosts (use the output of uname -n to get the symbolic name), example:     ha-test-1

Check that from each machine you can ping the other with the symbolic name.
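If you script the node setup, appending the hosts entry can be made safe to repeat. The sketch below is an illustration only: add_host_entry is a hypothetical helper, the address 192.0.2.11 is a documentation-range placeholder (use your real addresses), and it assumes an awk that accepts -v, such as the gawk installed earlier and found first on the PATH from pacemaker.rc.

```shell
# Append "ip<TAB>name" to a hosts file unless a non-comment line already
# lists that name. The file path is a parameter so that the function can be
# tried on a copy before touching /etc/hosts.
# (add_host_entry is a hypothetical helper name.)
add_host_entry() (
    file=$1
    ip=$2
    name=$3
    if awk -v n="$name" '
        $1 ~ /^#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }' "$file"; then
        echo "already present: $name"
    else
        printf '%s\t%s\n' "$ip" "$name" >> "$file"
    fi
)
```

For example, on each node: add_host_entry /etc/hosts 192.0.2.11 ha-test-1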

Installation of the tools

This section describes which tools to build and install, and how. Install them in the order shown below.

This information is re-elaborated from Andreas’s page on libqb (check credits and references).

I will mention the version of the tools that I used on my setup (or explicitly add a checkout command). You are encouraged to try the latest “masters/tips” when available, but that might need patching work not documented here.

Also, the fact that the code compiles doesn’t mean much. There are differences between Linux and Solaris, such as expected return values, that will break the code at runtime. With the patches described here, I managed to run the Dummy, IPaddr and ZFS resources correctly, but the line of code that crashes everything will be executed sooner or later; I just haven’t traversed that code yet :D!

General note on and

I had the need to do this change for many packages, so I will document it here as a general note and reference this paragraph if this change is needed to compile a certain package.

So if you see the note “apply the changes described in the general section about and” during the installation instructions of a package, come back to this paragraph and do the two changes described here below.

Add the following line in, after the AC_INIT directive:


If already has an ACLOCAL_AMFLAGS variable, then append


to that line, otherwise add the complete entry



help2man

# wget
# tar xf help2man-1.46.1.tar.xz
# cd help2man-1.46.1
# ./configure
# make
# make install


libtool

# wget
# tar zxf libtool-2.4.2.tar.gz
# cd libtool-2.4.2
# ACLOCAL=aclocal-1.14 AUTOMAKE=automake-1.14 ./bootstrap
# chmod +x libltdl/config/install-sh
# ./configure
# make install


libesmtp

# export CFLAGS='-std=c89 -D__EXTENSIONS__ -DNAME_MAX=255 -DHOST_NAME_MAX=255'
# wget
# gtar zxf libesmtp-1.0.6.tar.gz
# cd libesmtp-1.0.6
# mkdir m4
# autoreconf -i
# ./configure --prefix=$PREFIX 
# perl -pi -e 's#// TODO: handle GEN_IPADD##' smtp-tls.c
# gmake
# gmake install 
# cp auth-client.h $PREFIX/include
# cp auth-plugin.h $PREFIX/include
# cp libesmtp.h $PREFIX/include
# unset CFLAGS
# export CFLAGS="-D__EXTENSIONS__ -D_POSIX_PTHREAD_SEMANTICS -DNAME_MAX=255 -DHOST_NAME_MAX=255 -I/opt/gcc-4.8.1/include -I/usr/include -I${PREFIX}/include -I/opt/ha/include -I/opt/gcc-4.8.1/lib/gcc/i386-pc-solaris2.11/4.8.1/include/ -lsocket -lnsl"


check

# wget
# gtar zxf check-0.9.8.tar.gz
# cd check-0.9.8

Edit the file to add two lines (just after AC_CONFIG_MACRO_DIR([m4]) ):


Then continue with the build:

# ACLOCAL=aclocal-1.14 AUTOMAKE=automake-1.14 autoreconf --install
# ./configure 
# make
# make install


asciidoc

# wget
# gtar zxf asciidoc-8.6.8.tar.gz
# cd asciidoc-8.6.8
# ./configure
# gmake install

cluster glue

(Note: I used version 2ce85bfab4c1 for my setup)

# wget -O cluster-glue.tar.bz2
# gtar jxf cluster-glue.tar.bz2
# cd Reusable-Cluster-Components-*
# perl -pi -e 's#\$\(XSLTPROC\) \\#\$\(XSLTPROC\) --novalid \\#g' doc/

Search for “solaris” in and match that section with the following (you should only add the CFLAGS line):


Search for “cc_supports_flag()” in and check that it matches the following:

cc_supports_flag() {
       local CFLAGS="$@"
       AC_MSG_CHECKING(whether $CC supports "$@")
       AC_COMPILE_IFELSE([AC_LANG_SOURCE(int main(){return 0;})] ,[RC=0; AC_MSG_RESULT(yes)],[RC=1; AC_MSG_RESULT(no)])
       return $RC
}


# ./ 
# chmod +x install-sh
# sed -i 's/-fstack-protector-all//g'

Then apply the changes described in the general section about and

Complete the build:

# ACLOCAL=aclocal-1.14 AUTOMAKE=automake-1.14 autoreconf --install
# LDFLAGS='-L/opt/csw/lib' ./configure --prefix=$PREFIX --enable-fatal-warnings=no --enable-doc=no --with-daemon-user=${CLUSTER_USER} --with-daemon-group=${CLUSTER_GROUP}
# make
# make install

Resource agents

(Note: I used version b644395 for my setup)

# wget -O resource-agents.tar.gz
# gtar zxvf resource-agents.tar.gz
# cd ClusterLabs-resource-agents-*/
# ACLOCAL=aclocal-1.14 AUTOMAKE=automake-1.14 ./
# chmod +x install-sh
# ./configure --prefix=$PREFIX
# gmake clean
# gmake
# gmake install


libqb

# git clone libqb
# cd libqb
# git checkout v0.17.1

Then apply the changes described in the general section about and

Complete the build:

# ACLOCAL=aclocal-1.14 AUTOMAKE=automake-1.14 ./
# LDFLAGS='-R/opt/gcc-4.8.1/lib' CFLAGS="-D_REENTRANT -D_POSIX_PTHREAD_SEMANTICS -D__EXTENSIONS__ -march=i486 -mtune=native" ./configure --prefix=$PREFIX --enable-debug --with-check=yes --enable-slow-tests
# make clean
# make
# make install


libstatgrab

# wget
# gtar zxvf libstatgrab-0.91.tar.gz
# cd libstatgrab-0.91
# ACLOCAL=aclocal-1.14 AUTOMAKE=automake-1.14 ./configure --prefix=$PREFIX 
# make
# make install


corosync

Get the sources:

# git clone corosync
# cd corosync
# git checkout v2.3.4

Set the environment:

# export LDFLAGS='-R/opt/gcc-4.8.1/lib -R/usr/lib/mps -R/opt/lib -L/opt/gcc-4.8.1/lib -L/usr/lib/mps -L/opt/lib -lnss3 -lsmime3 -lssl3 -lnssutil3 -lplds4 -lplc4 -lnspr4 -lpthread -ldl -lposix4'
# export nss_CFLAGS='-I/usr/include/mps'
# export nss_LIBS='-R/usr/lib/mps -L/usr/lib/mps'
# export PKG_CONFIG_PATH='/opt/lib/pkgconfig:/usr/lib/pkgconfig:/usr/local/lib/pkgconfig'
# export PKG_CONFIG_LIBDIR='/opt/lib/pkgconfig:/usr/lib/pkgconfig:/usr/local/lib/pkgconfig'

In case of previous failed attempts, clean the configuration cache:

# rm -f config.status
# rm -rf autom4te.cache

Apply the changes described in the general section about and

Complete the build:

# ACLOCAL=aclocal-1.14 AUTOMAKE=automake-1.14 ./
# ./configure --prefix=$PREFIX --localstatedir=$PREFIX/var --enable-monitoring --enable-snmp --enable-xmlconf --enable-testagents --enable-augeas --enable-debug --enable-coverage
# make
# make install

Now log out and log back in, then source pacemaker.rc to continue from a clean environment.


heartbeat

Get the sources:

# wget
# gtar jxf tip.tar.bz2
# cd Heartbeat-3-0-*/

Apply the changes described in the general section about and (for heartbeat the file is

In case of previous failed attempts, clean the configuration cache:

# rm -f config.status
# rm -rf autom4te.cache

Complete the build:

# ACLOCAL=aclocal-1.14 AUTOMAKE=automake-1.14 autoreconf -i
# CFLAGS="-I/opt/include -I/opt/csw/include/ " LDFLAGS='-R/opt/lib -L/opt/lib -L/opt/csw/lib/ -lnsl /opt/csw/lib/ -lsocket' ./configure --prefix=$PREFIX --enable-quorumd
# chmod +x install-sh
# make CPPFLAGS="-L/opt/csw/lib/ -lgnutls -I/usr/include/glib-2.0/ -I/usr/lib/glib-2.0/include/"
# make install


pacemaker

Get the sources:

# git clone pacemaker
# cd pacemaker
# git checkout 272814b6423d4cdc21a0a83cd9007a4d57bd542d

Set the environment:

# export CFLAGS='-O3 -D_REENTRANT -D_POSIX_PTHREAD_SEMANTICS -march=i486 -mtune=native -I/usr/include/ncurses/'
# export LDFLAGS="-R/usr/lib/mps:/opt/gcc-4.8.1/lib -L'/usr/lib/mps:/opt/gcc-4.8.1/lib' -lssp_nonshared"
# export PKG_CONFIG_PATH='/opt/lib/pkgconfig:/usr/lib/pkgconfig:/usr/local/lib/pkgconfig'
# export CONFIG_SHELL=/usr/gnu/bin/sh


# perl -pi -e 's/-Wunsigned-char//g'
# perl -pi -e 's#-Wunused-but-set-variable##'
# perl -pi -e 's/-fstack-protector-all//g'
# sed -i 's/\(ACLOCAL_AMFLAGS\s*=\s*\-I\s*m4\)/\1 \-I\/opt\/csw\/share\/aclocal\//g'
# find . -name "*.c" -o -name "*.h" | xargs sed -i 's/syscall\.h/sys\/syscall\.h/g'
# sed -i 's/reboot(RB_AUTOBOOT)/reboot(RB_AUTOBOOT, \"pacemaker\")/g' lib/common/watchdog.c
# sed -i 's/\(sysrq_init()\)/\/\/\1/g' mcp/pacemaker.c

Apply the hack of shame: this is a terrible workaround for the missing signalfd system call on Illumos (the patch target is a file named services_linux.c!). We just wait 5 seconds for the forked process to finish providing stdout (instead of listening to signals)…

Get the patch file from the gist, extract it and apply it (this will also make minor changes to the Dummy resource agent):

# wget
# git apply pacemaker.patch

Complete the build:

# ACLOCAL=aclocal-1.14 AUTOMAKE=automake-1.14 ./
# ./configure --prefix=$PREFIX --enable-fatal-warnings=no --with-corosync --with-cs-quorum --with-acl=no --enable-debug
# make CPPFLAGS="-I/usr/include/ -I/usr/include/glib-2.0/ -I/usr/lib/glib-2.0/include/ -I/usr/include/libxml2/ -I/opt/include $CFLAGS"
# make install

Post install:

# mkdir -p $PREFIX/etc/corosync/uidgid.d
# (
echo "uidgid {"
echo " uid: `id -u ${CLUSTER_USER}`"
echo " gid: `id -g ${CLUSTER_USER}`"
echo "}"
) > $PREFIX/etc/corosync/uidgid.d/uid.conf
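The same file can also be produced with a here-document, which some find easier to read than a group of echo commands. This is just an equivalent sketch; render_uidgid is a hypothetical function name, and it expects the numeric uid and gid as arguments.

```shell
# Print a corosync uidgid block for the given numeric uid and gid.
# (render_uidgid is a hypothetical helper name.)
render_uidgid() {
    cat <<EOF
uidgid {
 uid: $1
 gid: $2
}
EOF
}
```

For example: render_uidgid "$(id -u "$CLUSTER_USER")" "$(id -g "$CLUSTER_USER")" > "$PREFIX/etc/corosync/uidgid.d/uid.conf"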

Now log out and log back in, then source pacemaker.rc to continue from a clean environment.


crmsh

# git clone crmsh
# cd crmsh
# git checkout 0d631cb36655695a67c940cf02c3fabccff705da
# perl -pi -e 's#ps -e -o pid,command#ps -e -o pid,comm#' ./modules/
# perl -pi -e 's#a2x -f manpage#a2x -L -f manpage#' doc/
# ACLOCAL=aclocal-1.14 AUTOMAKE=automake-1.14 ./
# ./configure --prefix=$PREFIX

Edit the file doc/

# sed -i 's/a2x -L -f manpage $</a2x --no-xmllint -f manpage $</g' doc/

Complete the build:

# make
# make install

Post install:

# mkdir -p /root/.config/crm/
# cp /opt/etc/crm/crm.conf /root/.config/crm/

And this should complete the installation part.

NOTE: if crm does not run and complains about a missing readline module, you can use crm with the CSW python (for some reason the module does not appear in /usr/lib/python2.6/lib-dynload).

To do this (only if crm is not working), use CSW python as interpreter and reinstall crm:

# /opt/csw/bin/pkgutil -i libreadline6 libreadline_dev python py_lxml
# cd $BUILDPATH/crmsh
# sed -i 's/\#\!\/usr\/bin\/python/\#\!\/opt\/csw\/bin\/python/' crm
# make
# make install

Corosync configuration

Get a sample corosync configuration file from this gist and put it in place:

# wget
# mv corosync.conf ${PREFIX}/etc/corosync/corosync.conf

Then edit the file and change the following fields:

memberaddr: use the addresses of your members
bindnetaddr: use the address of your network
ring0_addr: set the hostname of each node
nodeid: not really necessary to change it, use values that you prefer
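To illustrate where ring0_addr and nodeid live, here is a rough sketch that prints a corosync 2.x style nodelist section for a list of hostnames. It is an assumption about the general shape of the section, not a copy of the sample file from the gist, which may be organized differently; render_nodelist and the second hostname in the example are hypothetical.

```shell
# Print a corosync 2.x "nodelist" section, one node block per hostname.
# nodeid values are simply 1, 2, 3, ... in argument order.
# (render_nodelist is a hypothetical helper; the sample corosync.conf from
# the gist may use different values or additional sections.)
render_nodelist() (
    id=1
    echo "nodelist {"
    for host in "$@"; do
        printf '    node {\n        ring0_addr: %s\n        nodeid: %d\n    }\n' "$host" "$id"
        id=$((id + 1))
    done
    echo "}"
)
```

For example, render_nodelist ha-test-1 ha-test-2 prints two node blocks that can be compared against the ones in your edited corosync.conf.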

Fixing permissions

All these files should already exist (except for, unless you had already run corosync for some reason). If one of these commands gives an error, do not ignore it!

# chown -R hacluster:haclient ${BUILDPATH}/corosync/exec
# chown -R hacluster:haclient ${BUILDPATH}/corosync/common_lib/.libs
# chown -R hacluster:haclient ${PREFIX}/var/log/cluster
# chown hacluster:haclient ${PREFIX}/var/run
# touch ${PREFIX}/var/run/
# chown hacluster:haclient ${PREFIX}/var/run/
# chown -R hacluster:haclient /opt/var/lib
# chown -R hacluster:haclient /opt/var/run/resource-agents/
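Since an error here should not be ignored, one option when scripting this step is to stop at the first failing path instead of running every command regardless. A minimal sketch, where chown_or_stop is a hypothetical helper name:

```shell
# Run "chown -R owner" on each path and stop at the first failure,
# naming the path that failed on stderr.
# (chown_or_stop is a hypothetical helper name.)
chown_or_stop() (
    owner=$1
    shift
    for path in "$@"; do
        if ! chown -R "$owner" "$path"; then
            echo "chown failed for: $path" >&2
            exit 1
        fi
    done
)
```

For example: chown_or_stop hacluster:haclient "$PREFIX/var/log/cluster" /opt/var/lib /opt/var/run/resource-agents/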

Setting up Corosync in SMF

To run corosync as a service in SMF, you will need the manifest and the executable script. You can find both of them on gist.

Download the manifest and the script and put them in place:

# wget
# wget
# mkdir -p ${PREFIX}/etc/smf
# mv corosyncd ${PREFIX}/etc/smf/
# mv corosync.xml ${PREFIX}/etc/smf/
# chmod u+x ${PREFIX}/etc/smf/corosyncd

Validate the SMF manifest; hopefully you will get no errors.

# svccfg validate ${PREFIX}/etc/smf/corosync.xml

Import and enable the service:

# svccfg import ${PREFIX}/etc/smf/corosync.xml
# svcadm enable corosync

Check if the service started:

# svcs | grep corosync

If everything went well, you should see an output like the following:

online 12:35:55 svc:/application/hacluster/corosync:default

If something went wrong, try to get the output of the SMF script with

# cat `svcs -L corosync`
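If you want to script the check instead of reading the svcs output by eye, a small polling loop is one way to do it. This is a sketch, assuming that svcs -H -o state prints only the state column for the service; svc_state and wait_online are hypothetical helper names, and the retry count is arbitrary.

```shell
# Print the SMF state of a service (for example "online" or "maintenance").
# (svc_state and wait_online are hypothetical helper names.)
svc_state() {
    svcs -H -o state "$1"
}

# Poll once per second until the service reports "online", up to the given
# number of tries (default 10). Fails early if the service is in maintenance.
wait_online() (
    service=$1
    tries=${2:-10}
    state=
    while [ "$tries" -gt 0 ]; do
        state=$(svc_state "$service" 2>/dev/null | tr -d '[:space:]')
        if [ "$state" = "online" ]; then
            exit 0
        fi
        if [ "$state" = "maintenance" ]; then
            echo "$service is in maintenance" >&2
            exit 1
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "$service did not come online (last state: ${state:-unknown})" >&2
    exit 1
)
```

For example: wait_online corosync 30 || cat `svcs -L corosync`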

Now corosync and pacemaker should start at boot. You can disable and enable the service with:

# svcadm disable corosync
# svcadm enable corosync

Running the cluster

You can check the cluster status with:

# crm_mon

(remember to source pacemaker.rc if the command is not available!)

The output should look similar to the following (note that for this run I had 3 nodes, 2 of which offline, and I set the expected_votes to 1 to have a partition with quorum):

You need to have curses available at compile time to enable console mode
Last updated: Thu Nov 6 12:44:26 2014
Last change: Thu Nov 6 12:42:01 2014
Stack: corosync
Current DC: ha-test-1 (80) - partition with quorum
Version: 1.1.12-272814b
3 Nodes configured
0 Resources configured
Node omni-pcm (20): UNCLEAN (offline)
Node omni-pcm-2 (40): UNCLEAN (offline)
Online: [ ha-test-1 ]

Administer the Dummy resource!

If you have now configured two nodes as a pacemaker cluster, the next step is to check that the cluster can administer a resource.

We will use the Dummy resource, which does nothing other than verifying that it is running. When we start the resource, it will run on one of the nodes. If we kill that node in some way, the Dummy resource should fail over and start running on the other node.

This demonstrates that pacemaker operations are working.

First, let’s create some symlinks and set some permissions to be sure that pacemaker will find everything accessible:

# ln -s /usr/lib/ocf/resource.d/pacemaker /opt/usr/lib/ocf/resource.d/pacemaker
# ln -s /opt/usr/lib/ocf/lib/ /usr/lib/ocf/lib
# ln -s /opt/usr/lib/ocf/resource.d/heartbeat /usr/lib/ocf/resource.d/heartbeat
# chown -R hacluster:haclient /usr/lib/ocf
# chown -R hacluster:haclient /opt/usr/lib/ocf

As a basic configuration for our cluster, we disable STONITH and ignore the no-quorum state (with two nodes, we cannot have a quorum if one of them fails!):

# crm configure property stonith-enabled=false
# crm configure property no-quorum-policy=ignore

Verify that our configuration so far is correct with

# crm_verify -L -V

that should print nothing if everything is fine.
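When this check is part of a script, "prints nothing" can be turned into an exit status. The following sketch wraps any command that way; expect_silent is a hypothetical helper name.

```shell
# Run a command and succeed only if it exits 0 and prints nothing
# (stdout and stderr combined). Otherwise show the output and fail.
# (expect_silent is a hypothetical helper name.)
expect_silent() (
    output=$("$@" 2>&1)
    status=$?
    if [ "$status" -ne 0 ] || [ -n "$output" ]; then
        printf '%s\n' "$output"
        exit 1
    fi
)
```

For example: expect_silent crm_verify -L -V && echo "configuration looks fine"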

Now configure the Dummy resource:

# crm configure primitive dummy ocf:pacemaker:Dummy op monitor interval=120s

and then check its status (it should be running on one of the nodes):

# crm resource status dummy
resource dummy is running on: ha-test-1

Now you can verify that if one node stops working (you can simulate this with svcadm disable corosync), the Dummy resource will be started on the other node.
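To compare where the resource runs before and after the simulated failure, the node name can be extracted from the status line shown above. A small sketch, assuming the output keeps the "resource NAME is running on: NODE" form; running_node is a hypothetical helper name.

```shell
# Read "crm resource status <name>" output on stdin and print the node name
# from a line of the form "resource <name> is running on: <node>".
# Prints nothing if no such line is present.
# (running_node is a hypothetical helper name.)
running_node() {
    sed -n 's/^resource .* is running on: *\([^ ]*\).*$/\1/p'
}
```

For example, run crm resource status dummy | running_node on the surviving node before and after disabling corosync on the active one; the printed node name should change.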

You can also check the status with crm_mon, which should show something similar to this:

# crm_mon
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
Last updated: Thu Nov 6 16:39:58 2014
Last change: Thu Nov 6 16:34:43 2014
Stack: corosync
Current DC: ha-test-1 (80) - partition with quorum
Version: 1.1.12-272814b
3 Nodes configured
1 Resources configured

Online: [ ha-test-1 ]
OFFLINE: [ omni-pcm omni-pcm-2 ]

dummy (ocf::pacemaker:Dummy): Started ha-test-1


If everything worked, you should now have a pacemaker cluster running on OmniOS.

If you need different (i.e., useful) resource agents now (e.g., IPaddr), some patching may be needed because RA scripts are run by the hacluster user, which, according to this post, doesn’t have authorization to perform system changes. Also expect problems if the hacluster user lacks the rights to read, write or traverse folders that the RA script wants to access. Check the logs (tail -f /opt/var/log/cluster/corosync.log), debug and fix :). UPDATE: running corosync and pacemaker as root should also work, and this simplifies using RAs. Check the “Cluster user and group” section.

For a better understanding of pacemaker, please refer to the official documentation.


I want to thank Andreas Grüninger for his great support in helping me with this setup and his contributions to the pacemaker tools that made it possible to run them on Illumos. I re-used the SMF manifest and script for corosync that he shared with me (with his permission :)). Also, the general procedure was made available by Andreas at this page.

Many thanks also to Sašo Kiselkov for his in-depth blog post on building a HA ZFS storage appliance.



  1. Hi,
    It’s an awesome guide for installing the HA cluster on OmniOS. By the way, I followed your guide and successfully added the Dummy resource, but when I tried to add the IPaddr2 resource it returned an error in “crm status”.
    I also can’t find the ZFS resource. Can you share the modifications you made to the resource script?

    • piiv

      11. March 2015 at 11:16

      Hi Adhi, I’m very happy to hear that this helped :)!

      One first big recommendation to have RAs working more easily is to run pacemaker and corosync as root.

      Then, I can share my modified IPaddr script, hoping that this would also work for you, otherwise you may need to debug :).

      You can find it here:

      There are still debug lines (commented out) in there :D.

      • Thanks piiv for your response. One more question: how about using stonith?
        Just like in Sašo Kiselkov’s blog post, configuring stonith for an IPMI resource:
        crm(live)configure# primitive head1-stonith stonith:external/ipmi

        With this guide, my crm only shows these options:
        crm(live)configure# primitive test_stonith stonith:fence_
        stonith:fence_legacy stonith:fence_pcmk

        even though I linked the stonith plugins with:
        ln -s /opt/lib/stonith /opt/usr/lib/

        Can you help me?

        • piiv

          16. April 2015 at 14:41

          Hi Adhi,

          I have used resource-level fencing on my setup, as stonith on IPMI was not an option for us.

          • I’m sorry for the next question, but I’m actually not an expert in clustering.

            How do I enable or add “node-level fencing”?

          • piiv

            17. April 2015 at 9:36

            You need to get STONITH to run on your pacemaker setup, you can follow any of the available tutorials.

            I haven’t covered this, so I don’t know what kind of issues you may run into and if more patching will be needed.

  2. Hi,
    Very nice writeup. I have tried to follow all the steps literally but got stuck building pacemaker.
    I get error messages like:
    ../lib/cluster/.libs/ undefined reference to `cl_get_string’
    ../lib/cluster/.libs/ undefined reference to `ha_msg_expand’
    ../lib/cluster/.libs/ undefined reference to `ha_msg_addstruct_compress’
    . etc
    Now, I am not a programmer so troubleshooting this will take me a long time probably. Any tips?
    Since this page has been here for a while, might it be that, by following your instructions, I get a newer version of pacemaker than the one that you are using? I get version 1.1.12. If so, can you please tell me which versions of the tools you have used to accomplish this? I will try to download them then.

    Thanks in advance.

    • piiv

      29. June 2015 at 9:58

      The linker cannot find the libraries that define those symbols.

      Most likely it is one of two possible causes:

      1. You have the libraries, but they are in a path that the linker is not aware of (missing -L option) or the library is not used for linking (missing -l option)
      2. You don’t have the libraries/object files and you need to compile the source code that provides them

      I see that all those functions are defined in the heartbeat package, so that’s what you are missing here.
      Did heartbeat build and install correctly?

      About the version of the tools, the git checkout command on the pacemaker source code should bring your working copy to the same state as I used.

      What version of OmniOS are you using?

      • Hi,
        Thank you very much for your reply!
        I copy-pasted the commands from above when building pacemaker. Do you mean a -L or -l option in the make command of pacemaker? Pardon the question, since I haven’t looked at your link regarding the heartbeat package yet. There was one difference while building heartbeat though. The source does not contain a as stated in your red message. I had to apply the changes to the I am using omnios r12 (same as you stated in this post), though I downloaded one of the first images available. Maybe I should use a more recent r12 if still available? Do you mind if I send you a mail with some questions about your own implementation to your normal mail account? They might clutter your nice page if I did it here 🙂 and are not specific to building the source but more to your experience.
        High regards!

        • piiv

          29. June 2015 at 11:57

          The -L and/or -l should go in the LDFLAGS, with the right parameters (e.g., -L/path/to/libs or -llibname).

          But most likely, something went wrong with the installation of heartbeat.

          I have checked the download link and it actually points to the tip of the branch, so your first hypothesis is correct: that is not exactly the same archive that I used (but a more recent one).

          You can find the package that I used here:

          The OmniOS version that you use should not give problems, I have done this setup there as well.

          For the last question, feel free to write to me at “|my four letters username as you can read it on top of this message|” at zhaw dot ch 🙂

  3. Thank you!
    I must start with my job now but will keep you posted as soon as I can

    • I find that my stonith:external/ipmi resource can’t start, and the error log was: stonith-ng: error: get_agent_metadata: Could not retrieve metadata for fencing agent fence_legacy.
      Could you help shed some light?
