OpenStack Nova Events are now Billable with Cyclops

Nova manages all compute resources in OpenStack. Today, the Cyclops team is announcing support for billing compute events such as VM creation, deletion, and modification in Cyclops. It does this by tapping directly into the OpenStack message bus to process critical events in real time.

OpenStack Cells and nova-network: Enabling floating IP association

In our previous blog post we presented an overview of Nova Cells, describing its architecture and how a basic configuration can be set up. After some further investigation it is clear why Cells is still considered experimental and unstable: some basic operations, such as floating IP association, are not yet supported, and security groups are managed inconsistently between the API and Compute Cells. Here, we focused on using only the key projects in OpenStack, i.e. nova, glance and keystone, and avoided adding extra complexity to the system. For this reason legacy networking (nova-network) was chosen instead of Neutron: Neutron is generally more complex, and we had seen problems reported between Neutron and Cells. In this blog post we describe our experience enabling floating IPs in an OpenStack Cells architecture using nova-network, which required some small modifications to the nova Python libraries.

Initial Experience with OpenStack Nova Cells

In the GEYSER project, we are examining suitable OpenStack architectures for our pilot deployments. In an earlier blog post we described different ways to architect an OpenStack deployment, mostly focusing on AZs (Availability Zones) and Cells (those were the only options available back in 2013). Much has changed since then, and new concepts such as regions and host aggregates have been added. Even though Cells have been available since Grizzly, they are still considered experimental because they are immature and unstable. In this blog post we describe our experience enabling Cells in an experimental OpenStack deployment.

Why Cells?

The documentation says that “Cells functionality enables you to scale an OpenStack Compute cloud in a more distributed fashion without having to use complicated technologies like database and message queue clustering. It supports very large deployments”. Although we don’t have a large deployment, this is pretty much in line with the requirements for our pilot: a distributed system with a single public API exposed. Of the other architectural approaches currently available, the one that comes closest to this design is regions, but even that is not desirable as it exposes a public API for each region.

Managing hosts in a running OpenStack environment

How does one remove a faulty, unprovisioned or re-provisioned physical machine from the list of managed physical nodes in OpenStack nova? Recently we had to remove a compute node from our cluster for management reasons (read: it went dead on us). But nova keeps the host entry indefinitely, hoping that at some point it will come back online and start reporting its willingness to host new jobs.

Normally, things will not break if you simply leave the dead node entry in place. But it will distort the overall view of the cluster if you wish to do some capacity planning. The resources once reported by the dead node will continue to show up in the statistics, and things will look all “blue” when in fact they should be “red”.

There is no straightforward command to fix this problem, so here is a quick and dirty fix.

  1. log on as administrator on the controller node
  2. locate the nova configuration file, typically found at /etc/nova/nova.conf
  3. locate the “connection” parameter; this will tell you which database the nova service uses
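Step 3 can also be scripted. The following is a minimal sketch, assuming the option is called `connection` under a `[database]` section, or `sql_connection` under `[DEFAULT]` as in some older configuration files; check your own nova.conf for the exact location. The sample configuration text, including the credentials in the URL, is made up for illustration.

```python
import configparser


def find_nova_db_connection(conf_text):
    """Return the first database connection URL found in nova.conf-style text, or None."""
    # strict=False tolerates repeated options; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Assumption: these are the places where the option may live.
    candidates = [
        ("database", "connection"),
        ("DEFAULT", "sql_connection"),
        ("DEFAULT", "connection"),
    ]
    for section, option in candidates:
        if parser.has_option(section, option):
            return parser.get(section, option)
    return None


# Hypothetical sample; the user name and password are placeholders.
SAMPLE_CONF = """
[DEFAULT]
verbose = True

[database]
connection = mysql://nova:NOVA_DBPASS@stable-controller/nova
"""

if __name__ == "__main__":
    print(find_nova_db_connection(SAMPLE_CONF))
```

On a real controller you would pass in the contents of /etc/nova/nova.conf instead of the sample string. The scheme at the start of the URL (mysql or sqlite) tells you which kind of endpoint you are dealing with.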

Depending on whether the database endpoint is MySQL or SQLite, adapt your queries accordingly. The ones shown next are for a MySQL endpoint.

# mysql -u root
mysql> use nova;
mysql> show tables;

The tables of interest to us are “compute_nodes” and “services”. Next, find the “host” entry of the dead node in the “services” table.

mysql> select * from services;
+---------------------+---------------------+------------+----+-------------------+------------------+-------------+--------------+----------+---------+-----------------+
| created_at          | updated_at          | deleted_at | id | host              | binary           | topic       | report_count | disabled | deleted | disabled_reason |
+---------------------+---------------------+------------+----+-------------------+------------------+-------------+--------------+----------+---------+-----------------+
| 2013-11-15 14:25:48 | 2014-04-29 06:20:10 | NULL       |  1 | stable-controller | nova-consoleauth | consoleauth |      1421475 |        0 |       0 | NULL            |
| 2013-11-15 14:25:49 | 2014-04-29 06:20:05 | NULL       |  2 | stable-controller | nova-scheduler   | scheduler   |      1421421 |        0 |       0 | NULL            |
| 2013-11-15 14:25:49 | 2014-04-29 06:20:06 | NULL       |  3 | stable-controller | nova-conductor   | conductor   |      1422189 |        0 |       0 | NULL            |
| 2013-11-15 14:25:52 | 2014-04-29 06:20:05 | NULL       |  4 | stable-compute-1  | nova-compute     | compute     |      1393171 |        0 |       0 | NULL            |
| 2013-11-15 14:25:54 | 2014-04-29 06:20:06 | NULL       |  5 | stable-compute-2  | nova-compute     | compute     |      1393167 |        0 |       0 | NULL            |
| 2013-11-15 14:25:56 | 2014-04-29 06:20:05 | NULL       |  6 | stable-compute-4  | nova-compute     | compute     |      1392495 |        0 |       0 | NULL            |
| 2013-11-15 14:26:34 | 2013-11-15 15:06:09 | NULL       |  7 | 002590628c0c      | nova-compute     | compute     |          219 |        0 |       0 | NULL            |
| 2013-11-15 14:27:14 | 2014-04-29 06:20:10 | NULL       |  8 | stable-controller | nova-cert        | cert        |      1421467 |        0 |       0 | NULL            |
| 2013-11-15 15:48:53 | 2014-04-29 06:20:05 | NULL       |  9 | stable-compute-3  | nova-compute     | compute     |      1392736 |        0 |       0 | NULL            |
+---------------------+---------------------+------------+----+-------------------+------------------+-------------+--------------+----------+---------+-----------------+
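If the table is long, a dead service can also be spotted by comparing the updated_at values. The sketch below is only an illustration: the rows are copied from the compute entries in the output above, while the five-minute threshold and the function name are assumptions of ours, not anything nova defines.

```python
from datetime import datetime, timedelta

# (host, binary, updated_at) copied from the services table output above.
SERVICES = [
    ("stable-compute-1", "nova-compute", "2014-04-29 06:20:05"),
    ("stable-compute-4", "nova-compute", "2014-04-29 06:20:05"),
    ("002590628c0c", "nova-compute", "2013-11-15 15:06:09"),
    ("stable-compute-3", "nova-compute", "2014-04-29 06:20:05"),
]


def stale_hosts(rows, now, max_age=timedelta(minutes=5)):
    """Return hosts whose last update is older than max_age relative to now."""
    fmt = "%Y-%m-%d %H:%M:%S"
    return [
        host
        for host, _binary, updated in rows
        if now - datetime.strptime(updated, fmt) > max_age
    ]


if __name__ == "__main__":
    # 'now' is pinned to roughly the time of the query so the example is repeatable.
    print(stale_hosts(SERVICES, now=datetime(2014, 4, 29, 6, 20, 10)))
```

With the pinned time, only the host that stopped reporting in November 2013 is returned.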

The services table output for one of our test clouds is shown above. Clearly, the node that we want to remove is “002590628c0c”: its updated_at value stopped advancing on 2013-11-15, while every other service last reported on 2014-04-29. Note down the corresponding id for the erring host entry. This “id” value is what “service_id” refers to in the “compute_nodes” table. Modify the example case with your own specific data. It is important that you first remove the corresponding entry from the “compute_nodes” table and only then from the “services” table; otherwise the deletion will fail because of the foreign key dependency.

mysql> delete from compute_nodes where service_id=7;
mysql> delete from services where host='002590628c0c';

Replace the values above with the corresponding values for your case. Voila! The erring compute entries are gone from the dashboard view and also from the resource consumption metrics.
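The ordering constraint can be tried out safely before touching a production database. The sketch below uses an in-memory SQLite database with a deliberately simplified, hypothetical two-table schema (the real nova tables have many more columns), and the helper names are ours. It shows that deleting from “services” first is rejected, while deleting the dependent “compute_nodes” row first succeeds.

```python
import sqlite3


def remove_dead_host(conn, host):
    """Delete a host's compute_nodes rows, then its services rows, in one transaction."""
    with conn:  # commits on success, rolls back if either delete raises
        conn.execute(
            "DELETE FROM compute_nodes WHERE service_id IN "
            "(SELECT id FROM services WHERE host = ?)",
            (host,),
        )
        cur = conn.execute("DELETE FROM services WHERE host = ?", (host,))
    return cur.rowcount


def make_demo_db():
    """Build an in-memory stand-in with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce them by default
    conn.execute("CREATE TABLE services (id INTEGER PRIMARY KEY, host TEXT)")
    conn.execute(
        "CREATE TABLE compute_nodes ("
        "id INTEGER PRIMARY KEY, "
        "service_id INTEGER NOT NULL REFERENCES services(id))"
    )
    conn.executemany(
        "INSERT INTO services VALUES (?, ?)",
        [(4, "stable-compute-1"), (7, "002590628c0c")],
    )
    conn.executemany("INSERT INTO compute_nodes VALUES (?, ?)", [(1, 4), (2, 7)])
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    try:
        # Wrong order: the compute_nodes row still references this service.
        conn.execute("DELETE FROM services WHERE host = '002590628c0c'")
    except sqlite3.IntegrityError as exc:
        print("wrong order fails:", exc)
    conn.rollback()
    print("services rows removed:", remove_dead_host(conn, "002590628c0c"))
```

Wrapping both deletes in a single transaction means a failure halfway through leaves the tables unchanged, which is a sensible precaution when editing a live database by hand.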