For reliable, user-controlled and trustworthy file storage in the cloud, free software prototypes like NubiSave have become great tools to investigate and lower the barrier towards acceptable migration paths. For structured data storage and processing, researchers and developers have proposed several approaches to database-as-a-service (DBaaS), but a clear, practical recommendation on how best to manage rows or records of data in the cloud is still absent. This is partly because the providers' pricing structures and availability guarantees differ and are not trivial to compare. Often, running the database system as a set of replicated or sharded containers that are part of the application appears to be a valid alternative to binding to an existing commercial DBaaS, if done correctly. After all, cloud providers would offer the same technical guarantees for any of their services. An analysis of which configuration works better and is less expensive is thus needed.
Cyclops, ICCLab’s Rating-Charging-Billing solution for cloud providers, which can collect usage records from both OpenStack and CloudStack, has worked with time series data from the very beginning. As a database technology that is highly optimised to handle such data, we chose InfluxDB, which is written in Go.
Time series data is generally a sequence of data points, in our case either Usage Data Records or Charge Data Records. These datasets often have hundreds of millions of rows, including timestamps and large quantities of immutable fields. Entries do not typically change after they are added to the database: new entries are appended rather than existing ones being modified.
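The append-only access pattern described above can be illustrated with a small in-memory sketch. It is not how InfluxDB or Cyclops stores data; the names `UsageRecord` and `AppendOnlySeries` are hypothetical and exist only to show immutable records that are appended in timestamp order and then read back by time range.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be changed after creation
class UsageRecord:
    timestamp: int  # seconds since the epoch (assumed unit)
    meter: str      # hypothetical meter name, e.g. "cpu"
    value: float


class AppendOnlySeries:
    """Illustrative time series: records are only ever appended, never updated."""

    def __init__(self):
        self._timestamps = []  # kept sorted because appends arrive in order
        self._records = []

    def append(self, record):
        # Reject out-of-order writes so the timestamp list stays sorted.
        if self._timestamps and record.timestamp < self._timestamps[-1]:
            raise ValueError("records must arrive in timestamp order")
        self._timestamps.append(record.timestamp)
        self._records.append(record)

    def between(self, start, end):
        """Return all records with start <= timestamp <= end."""
        lo = bisect_left(self._timestamps, start)
        hi = bisect_right(self._timestamps, end)
        return self._records[lo:hi]


series = AppendOnlySeries()
for t in (0, 10, 20):
    series.append(UsageRecord(timestamp=t, meter="cpu", value=1.0))

recent = series.between(5, 20)  # the records stamped 10 and 20
```

Because writes only ever land at the end and reads are by time range, a sorted list with binary search is enough for this sketch; a real time series database applies the same assumption at a much larger scale.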
Ceilometer can collect a large amount of data, particularly in a system with a large number of servers and high activity. In such a scenario, the numbers of meters and samples can be large, which affects Ceilometer performance and gives rise to quite large databases. In our particular case, we are studying energy consumption in servers and how resource utilization (mainly CPU) may relate to overall energy consumption. The energy data is collected through Kwapi and stored in Ceilometer every 10 seconds (yes, this is probably too fine-grained!). We found that the database grew too quickly, filling up the root disk partition on the controller and causing significant problems for the system. In this blog post, we describe the approach we now use for managing Ceilometer data, which ensures that the resources consumed by Ceilometer remain under control.
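To get a feel for how quickly a 10-second sampling interval adds up, here is a rough back-of-the-envelope sketch. The function name and the server count of 20 are assumptions made for illustration only; the 10-second interval is the one mentioned above, and the sketch counts samples rather than bytes because the storage size per sample is not stated here.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(interval_seconds, meters=1, servers=1):
    """Samples stored per day for a given polling interval (whole samples only)."""
    return (SECONDS_PER_DAY // interval_seconds) * meters * servers


# One energy meter on one server, polled every 10 seconds.
per_server = samples_per_day(10)

# The same meter across a hypothetical 20-server deployment.
per_cluster = samples_per_day(10, servers=20)

# For comparison: a coarser 60-second interval on one server.
per_server_coarse = samples_per_day(60)

print(per_server, per_cluster, per_server_coarse)
```

Under these assumptions, a single meter produces 8640 samples per server per day at 10-second granularity, six times as many as at 60 seconds, before any other meters are counted.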