The Convergence of HPC, AI and Cloud

Tracking resource usage, and charging for it, is a requirement for many cloud deployments. Public clouds obviously need to bill their customers, but private clouds can also use chargeback and showback policies to encourage more efficient use of resources. In the OpenStack world, CloudKitty is the standard rating solution. It works by applying rating rules, which turn metric measurements into rated usage information.

For several years, gathering metrics in OpenStack has been implemented by two separate project teams: Telemetry and, more recently, Monasca. The future of Telemetry, which produces the Ceilometer software, is uncertain: historical contributors have stopped working on the project and its de-facto back end for measurements, Gnocchi, is also seeing low activity. Although Telemetry users have volunteered to maintain the project, the Monasca project appears to be healthier and more active.

Since deploying Monasca is our preferred choice to monitor OpenStack, we asked ourselves: can we use CloudKitty to charge for usage without deploying a full Telemetry software stack?

Ceilometer + Monasca = Ceilosca

Ceilometer is well integrated in OpenStack and can collect usage data from various OpenStack services, either by polling or listening for notifications. Ceilometer is designed to publish this data to the Gnocchi time series database for storage and querying.

In Monasca, metrics collected by the Monasca Agent focus more on monitoring the health and performance of the infrastructure and its services, rather than resource usage from end users (although it can gather instance metrics via the Libvirt plugin). Monasca stores these metrics in a time series database, with support for InfluxDB and Cassandra.

Despite these differences, we do not have to deploy and maintain Gnocchi just to collect usage data via Ceilometer: monasca-ceilometer, also known as Ceilosca, enables Ceilometer to publish data to the Monasca API for storage in its metrics database. Although Ceilosca currently lives in its own repository and must be installed by adding it to the Ceilometer source tree, there is an ongoing effort to integrate it directly into Ceilometer.

By default, Ceilosca pushes several metrics derived from instance details, such as disk.root.size, memory, and vcpus, to Monasca under the service tenant. Each metric is associated with a specific instance ID via the resource_id dimension. Metric dimensions also include user and project IDs. For example, to retrieve metrics associated with the p3 project, we can use the Monasca Python client:

monasca metric-list \
--tenant-id $(openstack project show service -c id -f value) \
--dimensions project_id=$(openstack project show p3 -c id -f value)

Once stored in Monasca, these metrics can be used by CloudKitty, thanks to the inclusion of a Monasca collector since the Queens release.

Let's see how we can apply a charge to the vcpus metric. First, we declare the metric in CloudKitty's metrics.yml file so that CloudKitty knows to collect it:

metrics:
  vcpus:
    unit: vcpus
    groupby:
      - resource_id
    extra_args:
      resource_key: resource_id

Then, we configure the hashmap rating rules to apply a rate to CPU usage. We create a vcpus service and then create a mapping with a cost of 0.5 per CPU hour:

$ cloudkitty hashmap service create vcpus
+-------+--------------------------------------+
| Name  | Service ID                           |
+-------+--------------------------------------+
| vcpus | cb72cd89-43ef-46b9-b047-58e0b5335992 |
+-------+--------------------------------------+
$ cloudkitty hashmap mapping create 0.5 -s cb72cd89-43ef-46b9-b047-58e0b5335992 -t flat
+--------------------------------------+-------+------------+------+----------+--------------------------------------+----------+------------+
| Mapping ID                           | Value | Cost       | Type | Field ID | Service ID                           | Group ID | Project ID |
+--------------------------------------+-------+------------+------+----------+--------------------------------------+----------+------------+
| 68465dad-7c68-4f8e-a256-6a62735c1e3b | None  | 0.50000000 | flat | None     | cb72cd89-43ef-46b9-b047-58e0b5335992 | None     | None       |
+--------------------------------------+-------+------------+------+----------+--------------------------------------+----------+------------+
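As a rough illustration of what this flat mapping implies, the sketch below multiplies a measured quantity by the per-unit rate for one collection period. The function name rate_flat is hypothetical and this is not CloudKitty's implementation; it only mirrors the arithmetic of a flat mapping with no field or threshold attached.

```python
from decimal import Decimal


def rate_flat(quantity, cost_per_unit):
    """Charge for one collection period under a flat mapping (illustrative only)."""
    # Decimal avoids binary floating-point rounding in monetary values.
    return Decimal(str(quantity)) * Decimal(str(cost_per_unit))


# An instance reporting 64 vCPUs, rated at 0.5 per vCPU for the period.
print(rate_flat(64, "0.5"))  # prints 32.0
```

With the default one-hour collection period, this corresponds to 0.5 per CPU hour.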

We then launch an instance. Once the instance becomes active, a notification is processed by Ceilometer and published to Monasca, recording that instance b7d926a8-cd63-4205-8f90-e3c610aeaad5 has 64 vCPUs.

$ monasca metric-statistics --tenant-id $(openstack project show service -c id -f value) vcpus avg "2019-07-30T14:00:00" --merge_metrics --group_by resource_id --period 1
+-------+---------------------------------------------------+----------------------+--------------+
| name  | dimensions                                        | timestamp            | avg          |
+-------+---------------------------------------------------+----------------------+--------------+
| vcpus | resource_id: b7d926a8-cd63-4205-8f90-e3c610aeaad5 | 2019-07-30T14:43:01Z |       64.000 |
+-------+---------------------------------------------------+----------------------+--------------+

With the default Kolla configuration, Nova also sends a report notification every hour, which is also stored in Monasca. Similarly, when an instance is terminated, a notification is published and converted into a final measurement in Monasca. However, with the default CloudKitty configuration, every instance measurement is interpreted as if the associated instance ran for the whole hour. For example, an instance launched at 10:45 and terminated at 11:15 would be charged for two whole hours instead of just 30 minutes.

This can be mitigated by reducing the [collect]/period setting in cloudkitty.conf, for example down to one minute, and adjusting the charge rate to match the new period. For this approach to work, at least one measurement must be stored for each period. This isn't possible with the audit notifications sent by Nova, because one hour is their shortest possible interval. An alternative is to rely on continuously updated metrics collected by Ceilometer, such as CPU utilisation. However, metrics of this kind are unavailable in our bare metal environment.
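The effect of the collection period on the 10:45 to 11:15 example can be sketched with a little arithmetic. The code below is a simplified model, not CloudKitty code: it assumes a measurement exists in every period the instance overlaps, that each such period is charged in full, and that the rate is scaled in proportion to the period length. The function names, the 64 vCPU instance size, and the calendar date are chosen purely for illustration.

```python
from datetime import datetime, timedelta
from fractions import Fraction

EPOCH = datetime(1970, 1, 1)


def billed_periods(start, end, period_seconds):
    """Count the fixed-length periods that overlap the interval [start, end)."""
    period = timedelta(seconds=period_seconds)
    first = (start - EPOCH) // period
    # Subtract one microsecond so an end time on a boundary is not counted.
    last = (end - timedelta(microseconds=1) - EPOCH) // period
    return last - first + 1


def charge(start, end, vcpus, hourly_rate, period_seconds):
    """Total charge when every overlapped period is billed in full."""
    # Scale the hourly rate down to the length of one collection period.
    rate_per_period = Fraction(hourly_rate) * Fraction(period_seconds, 3600)
    return billed_periods(start, end, period_seconds) * vcpus * rate_per_period


start = datetime(2019, 7, 30, 10, 45)  # arbitrary date for the example
end = datetime(2019, 7, 30, 11, 15)

hourly = charge(start, end, 64, "0.5", 3600)  # two full hours billed
minutely = charge(start, end, 64, "0.5", 60)  # thirty one-minute periods billed
print(float(hourly), float(minutely))  # prints 64.0 16.0
```

Under these assumptions, the one-minute period charges for exactly the 30 minutes of use, a quarter of what the hourly period charges.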

Once CloudKitty has analysed usage metrics, we can export the rated data in CSV format. As can be seen below, two whole hours have been charged at 0.5 per vCPU, giving 32.0 per hour for the 64 vCPUs. In this case, the instance had been launched around 14:45 and terminated around 15:20. We also tested pure Ceilometer with Gnocchi in place of Ceilosca and Monasca and observed exactly the same issue.

$ cloudkitty dataframes get -f df-to-csv --format-config-file cloudkitty-csv.yml
Begin,End,Metric Type,Qty,Cost,Project ID,Resource ID,User ID
2019-07-30T14:00:00,2019-07-30T15:00:00,vcpus,64.0,32.0,35be5437552f40cba2aa6e5cb47df613,b7d926a8-cd63-4205-8f90-e3c610aeaad5,53ed408e5a7a4e79baa76803e1df61d6
2019-07-30T15:00:00,2019-07-30T16:00:00,vcpus,64.0,32.0,35be5437552f40cba2aa6e5cb47df613,b7d926a8-cd63-4205-8f90-e3c610aeaad5,53ed408e5a7a4e79baa76803e1df61d6
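For showback or chargeback reports, the exported rows usually need to be aggregated. The following minimal sketch sums the Cost column per project and resource using only the Python standard library. It embeds the two rows shown above as a string so that it is self-contained; in practice the data would be read from the exported file, and the variable names here are illustrative.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# The CSV output shown above, embedded so the example is self-contained.
CSV_TEXT = """\
Begin,End,Metric Type,Qty,Cost,Project ID,Resource ID,User ID
2019-07-30T14:00:00,2019-07-30T15:00:00,vcpus,64.0,32.0,35be5437552f40cba2aa6e5cb47df613,b7d926a8-cd63-4205-8f90-e3c610aeaad5,53ed408e5a7a4e79baa76803e1df61d6
2019-07-30T15:00:00,2019-07-30T16:00:00,vcpus,64.0,32.0,35be5437552f40cba2aa6e5cb47df613,b7d926a8-cd63-4205-8f90-e3c610aeaad5,53ed408e5a7a4e79baa76803e1df61d6
"""

# Accumulate the total cost for each (project, resource) pair.
totals = defaultdict(Decimal)
for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    key = (row["Project ID"], row["Resource ID"])
    totals[key] += Decimal(row["Cost"])

for (project_id, resource_id), cost in totals.items():
    print(project_id, resource_id, cost)
```

For the data above, this reports a total of 64.0 for the single instance, that is, two hourly charges of 32.0.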

A downside of using Ceilosca instead of Ceilometer with Gnocchi is that metadata such as instance flavour is not available for CloudKitty to use for rating by default, at least in the Rocky release that we used. We will update this post if we can develop a configuration for Ceilosca that supports this feature.

OpenStack usage metrics without Ceilometer

Monasca has plans to capture OpenStack notifications and store them with the Monasca Events API, although this is not yet implemented. CloudKitty would require changes to support charging based on these events, since it is currently designed around metrics. It is worth pointing out that an Elasticsearch storage driver has just been proposed in CloudKitty, so these two new designs may line up in the future.

In the meantime, an alternative is to bypass Ceilometer completely and rely on another mechanism to publish metrics to Monasca. As mentioned earlier in this article, Monasca can provide instance metrics via the Libvirt plugin. However, this won't cover other services for which we may want to charge, such as volume usage.

Since the Monasca Agent can scrape metrics from Prometheus exporters, we are exploring whether we can leverage openstack-exporter to provide metrics to be rated by CloudKitty. Stay tuned for the next blog post on this topic!

StackHPC

CloudKitty and Monasca: OpenStack charging without Telemetry
