Managing ceilometer data in OpenStack

Ceilometer can collect a large amount of data, particularly in a system with many servers and high activity. In such a scenario, the number of meters and samples can grow large, which degrades ceilometer performance and leads to very large databases. In our particular case, we are studying energy consumption in servers and how resource utilization (mainly CPU) relates to overall energy consumption. The energy data is collected through Kwapi and stored in ceilometer every 10 seconds (yes, this is probably too fine-grained!). The database grew too quickly, filling up the root disk partition on the controller and causing significant problems for the system. In this blog post, we describe the approach we now use for managing ceilometer data, which ensures that the resources consumed by ceilometer remain under control.

Advanced Queries to Ceilometer with a mongo backend

We are changing our ceilometer backend from mysql to mongodb in one of our experimental OpenStack installations. The reason for this change is that mongo seems to deliver better ceilometer performance than mysql; furthermore, ceilometer's data structures are a more natural fit with mongo (and indeed, this is largely where they came from). This can be seen in the typical record below, which is clearly hierarchical and contains so-called embedded documents (in mongo terminology):

"_id" : ObjectId("53bbe7ea926fc4597b42aafc"),
"counter_name" : "instance",
"resource_metadata" : {
    "status" : "active",
    "display_name" : "Test",
    "name" : "instance-00000001",
    "image" : {
        "id" : "bdfaab74-6542-4cbb-94f1-5306662208a7",
        "name" : "cirros-0.3.2-x86_64-uec"
    },
    "host" : "7c261f3a33c099538d448be797e5ce0c0d8cf8bf9f75dd59ce04df86",

Mongo natively provides support for queries of data structured in this fashion. More specifically, mongo enables data at different levels of the hierarchy to be queried – something which is difficult in SQL.
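To make the idea of querying at different levels concrete, here is a minimal sketch of how dot notation addresses a nested field. The `lookup` and `matches` helpers are purely illustrative (they are not part of mongo or ceilometer), and the sample document is a trimmed copy of the record above. The filter dictionary has the shape mongo expects for such a query.

```python
from functools import reduce

# Trimmed copy of the sample record shown above.
sample = {
    "counter_name": "instance",
    "resource_metadata": {
        "status": "active",
        "display_name": "Test",
        "image": {"name": "cirros-0.3.2-x86_64-uec"},
    },
}


def lookup(document, dotted_path):
    """Illustrative helper: follow a dotted path into nested dicts.

    Returns None when any step of the path is missing.
    """
    def step(node, key):
        return node.get(key) if isinstance(node, dict) else None
    return reduce(step, dotted_path.split("."), document)


# A mongo-style filter addressing two levels of the hierarchy at once.
mongo_filter = {
    "counter_name": "instance",
    "resource_metadata.status": "active",
}


def matches(document, flt):
    """Illustrative equality-only matcher mimicking the filter above."""
    return all(lookup(document, path) == value for path, value in flt.items())


print(lookup(sample, "resource_metadata.image.name"))
print(matches(sample, mongo_filter))
```

With pymongo, the same dictionary would be passed to `find()` on the collection holding the samples, for example `db.meter.find(mongo_filter)`, assuming the samples are stored in a collection named `meter`.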

Using the ceilometer Python client, a query on a nested metadata field can be written as follows:

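# `ceilometer` is an authenticated python-ceilometerclient instance;
# `date` is an ISO 8601 timestamp string such as '2014-07-09T09:00:00'.
query = [{'field': 'timestamp', 'op': 'gt', 'value': date},
         {'field': 'metadata.status', 'op': 'eq', 'value': 'active'}]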

sample_list = ceilometer.samples.list(meter_name='instance', q=query)
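The snippet above leaves the timestamp value open. As a rough sketch, the filter list could be built from a `datetime` like this; `build_query` is a hypothetical helper name introduced here for illustration, and the reference time is an arbitrary example value.

```python
from datetime import datetime, timedelta


def build_query(since, status="active"):
    """Hypothetical helper: build the filter list used in the snippet above."""
    return [
        {"field": "timestamp", "op": "gt", "value": since.isoformat()},
        {"field": "metadata.status", "op": "eq", "value": status},
    ]


# Example: everything newer than one hour before a fixed reference time.
reference = datetime(2014, 7, 9, 10, 0, 0)
query = build_query(reference - timedelta(hours=1))
print(query[0]["value"])

# With a configured client the list is passed unchanged, e.g.:
#   sample_list = ceilometer.samples.list(meter_name='instance', q=query)
```

Keeping the query construction in a small function makes it easy to reuse the same filter with different time windows or status values.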

The interesting point, which we did not understand clearly until now, is that ceilometer with a mongo backend supports exactly this type of query. Thus, the following command line query returns all samples of instances that were active after a given point in time (here, date stands for a timestamp):

ceilometer sample-list -m instance -q "timestamp>date; metadata.status='active'"

Ceilometer then returns all samples of instances that were active in this time range:

| Resource ID                          | Name      | Type     | Timestamp           |
| 7535b9f6-01a6-410e-980d-338031e7a2c4 | instance  | instance | 2014-07-09T09:30:05 |
| 7535b9f6-01a6-410e-980d-338031e7a2c4 | instance  | instance | 2014-07-09T09:20:05 |
| 7535b9f6-01a6-410e-980d-338031e7a2c4 | instance  | instance | 2014-07-09T09:10:04 |

Querying ceilometer with a mysql database in this fashion results in an error.