Manage instance startup order in OpenStack Heat Templates

In many applications it is necessary to create virtual resources in a certain order. As an orchestration engine, Heat supports such a requirement, but expressing it in a template can be tricky. Recently I had to write such a Heat template, which seemed pretty easy as there are a number of examples in the OpenStack/heat-templates repository on GitHub. My requirements and the relative lack of explanation of how the templates are written made this a bit more difficult than expected, but after gathering information dispersed over several websites I solved my issues. This post is a summary of my findings.

My application was made of three servers which had to be started and configured in a specific order: each server has to be ready before the next one is started, because each new server automatically connects to the previously started ones. This was really the main concern of the application. In the following examples I will use the names service1, service2 and service3, with the startup order being service1 > service2 > service3. I had three requirements:

  1. I wanted to follow the Heat Orchestration Template (HOT) format, the newer template format meant to replace the Heat CloudFormation-compatible format (CFN) over time as the native format supported by Heat, so that my template remains usable in future Heat versions.
  2. To enforce my startup order I needed to use WaitConditions. These come directly from the CFN format, but HOT normally still supports the use of CFN resources in the new format.
  3. My image did not have the cfn tools installed, so I could not use cfn calls directly from inside the machine during the post-boot phase. This is an issue because all the templates I found on GitHub that use WaitConditions rely on these tools.

The idea of WaitConditions is that a WaitCondition is declared and linked to one resource; when this resource is configured and ready, it sends a signal back to Heat. Another resource depending on this signal can then be started. The template which met my requirements can be found on GitHub; I will explain the relevant parts here:


  service1: 
    type: "OS::Nova::Server"
    properties: 
      flavor: m1.medium
      image: ubuntu_cloud
      key_name: 
        get_param: key_name
      user_data: 
        str_replace: 
          template: |
              #!/bin/bash
              curl -X PUT -H 'Content-Type:application/json' \
                   -d '{"Status" : "SUCCESS","Reason" : "Configuration OK","UniqueId" : "SERVICE1","Data" : "Service1 Configured."}' \
                   "$wait_handle$"
          params: 
            $wait_handle$: 
              get_resource: service1_wait_handle

  service1_wait: 
    type: "AWS::CloudFormation::WaitCondition"
    depends_on: service1
    properties: 
      Handle: 
        get_resource: service1_wait_handle
      Timeout: 1000

  service1_wait_handle: 
    type: "AWS::CloudFormation::WaitConditionHandle"

A first resource, “service1”, is declared, with the WaitCondition and WaitConditionHandle declared as separate resources linked together; the WaitCondition additionally depends on service1. The interesting part is in the post-boot script of service1, the user_data property. Here you can see a curl call that sends a specific JSON data blob (details on CloudFormation’s website) through a PUT to an address retrieved from the WaitConditionHandle, designated as service1_wait_handle. This is what signals success to the wait condition. Now how is it possible to specify that the next virtual instance has to wait for this success signal before being started?
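As an aside on the signal itself: the post-boot script above always reports SUCCESS, whether or not the configuration worked. The following is only a sketch of a more defensive variant, not part of my original template. The function names and `run_configuration` are hypothetical, and it assumes (to my understanding of the CFN signal format) that the handle also accepts a "FAILURE" status. The values are not JSON-escaped, so they must not contain double quotes.

```shell
#!/bin/bash
# Sketch only: build_signal_payload, send_signal and run_configuration
# are hypothetical names, not taken from the original template.

# Print the JSON body sent to the wait condition handle.
# Arguments: status, reason, unique id, data (no JSON escaping is done).
build_signal_payload() {
    local status="$1" reason="$2" unique_id="$3" data="$4"
    printf '{"Status" : "%s","Reason" : "%s","UniqueId" : "%s","Data" : "%s"}' \
        "$status" "$reason" "$unique_id" "$data"
}

# PUT the payload to the handle URL, exactly like the curl call in the template.
send_signal() {
    local handle_url="$1" payload="$2"
    curl -X PUT -H 'Content-Type:application/json' -d "$payload" "$handle_url"
}

# Possible use inside user_data, where Heat replaces $wait_handle$ with the URL
# (left commented out so that this sketch has no side effects):
# if run_configuration; then
#     send_signal "$wait_handle$" "$(build_signal_payload SUCCESS "Configuration OK" SERVICE1 "Service1 Configured.")"
# else
#     send_signal "$wait_handle$" "$(build_signal_payload FAILURE "Configuration failed" SERVICE1 "Service1 not configured.")"
# fi
```

Either way, the handle only records the signal. Making the next instance wait for it is what the next snippet is about.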


  service2: 
    type: "OS::Nova::Server"
    depends_on: service1_wait
    properties: 
      flavor: 
        get_param: instance_type
      image: ubuntu_cloud
      key_name: 
        get_param: key_name
      user_data: 
        str_replace: 
          template: |
              #!/bin/bash
              curl -X PUT -H 'Content-Type:application/json' \
                -d '{"Status" : "SUCCESS","Reason" : "Configuration OK","UniqueId" : "SERVICE2","Data" : "Service2 Configured."}' \
                "$wait_handle$"
          params: 
            $data$: 
              get_attr: 
                - service1_wait
                - Data
            $wait_handle$: 
              get_resource: service2_wait_handle

  service2_wait: 
    type: "AWS::CloudFormation::WaitCondition"
    depends_on: service2
    properties: 
      Handle: 
        get_resource: service2_wait_handle
      Timeout: 1000

  service2_wait_handle: 
    type: "AWS::CloudFormation::WaitConditionHandle"

Here you can see a structure similar to the one shown in the previous code snippet, with a new WaitCondition and Handle. This is because this server will in turn need to be configured before the final server can be started. The service2 resource differs in two points:

depends_on: service1_wait

This specifies that this resource depends on the completion of the service1_wait WaitCondition. Intuitively this should be enough, as one might think that the WaitCondition only completes once the success signal described previously has been sent. Unfortunately it is not sufficient: at least in the Havana release, where this template was tested, the resource did not wait at all and was started as soon as the stack was created. A workaround for this problem is implemented in this code snippet:


  params: 
    $data$: 
      get_attr: 
        - service1_wait
        - Data

This explicitly tells Heat that service2 needs to retrieve the data (in our case, a string) sent through the curl call in the service1 post-boot script. This requirement is what actually makes service2 wait for service1 to be ready, even though the post-boot script of service2 never references this data: it is sufficient to retrieve it in the params section of str_replace without using it in the script itself.

With this template, you can now start and configure your instances in whatever order fits your application’s requirements, and even chain wait conditions so that instance C waits for instance B, which in turn waits for instance A. It is also possible to actually use the data sent through the success signal in other templates, if this makes sense for your application’s configuration scheme.
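To illustrate that last remark, here is a rough sketch of how the service2 post-boot script could read the data instead of ignoring it. It rests on two assumptions that are not part of my template: that the Data attribute of the WaitCondition arrives as a JSON string mapping each UniqueId to its Data value, and that python3 is available in the image. The helper name `signal_data_for` is hypothetical.

```shell
#!/bin/bash
# Sketch only: signal_data_for is a hypothetical helper.
# Assumption: the wait condition data is a JSON object such as
#   {"SERVICE1": "Service1 Configured."}
# Assumption: python3 is installed in the image.

# Print the data that was signalled under the given UniqueId.
# Arguments: JSON string, unique id.
signal_data_for() {
    local json="$1" unique_id="$2"
    python3 -c 'import json, sys
signals = json.loads(sys.argv[1])
print(signals[sys.argv[2]])' "$json" "$unique_id"
}

# Possible use inside user_data, where Heat replaces $data$ before boot
# (left commented out so that this sketch has no side effects):
# SERVICE1_DATA="$(signal_data_for '$data$' SERVICE1)"
# echo "service1 reported: $SERVICE1_DATA"
```

The single quotes around $data$ matter in this sketch, because the substituted JSON itself contains double quotes.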


