Integrating Monasca and Ceilometer seemed like a very good idea from the start. It would bring together the notifications and metrics for all OpenStack resources and provide a unified storage layer for Monitoring and Metering, simplifying deployment at scale and opening the door to new solutions that weren't possible before.
So the team set out to make it real. The implementation was carried out over a three-month period, and all the code, unit tests and load simulator are open source and available in the official OpenStack repo at: https://github.com/openstack/monasca-ceilometer
You can also find a replay of the presentation given at the OpenStack Summit in Tokyo: https://www.youtube.com/watch?v=5-IvVwIoCzM
There are at least two aspects that made this "marriage" compelling:
So we embarked on this experiment and found a neat way to integrate the two services. The first part was the ingestion. Ceilometer has two main types of agents:
Our integration with Monasca aimed to cover all of these cases and to support both the push and the pull model. The Compute Agent is probably where there is the most overlap with the Monasca Agent, which can itself poll libvirt or other virtualization layers. We decided to extend the Publisher code in Ceilometer to integrate with the Monasca client and send the "measurements" to the Monasca API.
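To make the publisher idea more concrete, here is a minimal sketch of how a Ceilometer sample might be turned into a Monasca "measurement". It is an illustration only, not the actual Ceilosca mapping: the function name `sample_to_measurement` and the choice of which sample fields become dimensions are assumptions. The output uses the name / dimensions / timestamp (milliseconds) / value shape of a Monasca metric.

```python
from datetime import datetime, timezone


def sample_to_measurement(sample: dict) -> dict:
    """Map a Ceilometer-style sample dict to a Monasca-style metric dict.

    Hypothetical helper: the field-to-dimension mapping is an assumption
    made for illustration, not the mapping used by Ceilosca.
    """
    ts = datetime.fromisoformat(sample["timestamp"])
    if ts.tzinfo is None:
        # Treat naive timestamps as UTC.
        ts = ts.replace(tzinfo=timezone.utc)

    # Carry identifying sample fields over as string dimensions,
    # skipping any that are missing or None.
    dimensions = {
        key: str(sample[key])
        for key in ("resource_id", "project_id", "user_id", "unit", "type")
        if sample.get(key) is not None
    }

    return {
        "name": sample["name"],
        "dimensions": dimensions,
        "timestamp": int(ts.timestamp() * 1000),  # milliseconds since epoch
        "value": float(sample["volume"]),
    }


if __name__ == "__main__":
    example = {
        "name": "cpu_util",
        "type": "gauge",
        "unit": "%",
        "volume": 12.5,
        "resource_id": "instance-0001",
        "project_id": "demo",
        "user_id": None,
        "timestamp": "2015-10-27T09:00:00",
    }
    print(sample_to_measurement(example))
```

In a real publisher the resulting dictionaries would be handed to the Monasca client rather than printed.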
Current Ceilometer architecture:
This brought two distinct advantages to the solution: the first is that we can integrate with any of the Ceilometer agents out of the box, so we can integrate data from all the data sources that Ceilometer supports now and in the future; the second is that we remove the RabbitMQ re-publishing of Samples.
This latter aspect is particularly problematic in large deployments. RabbitMQ clustering has some limitations around the 20M-mark load; this can slow down queue performance and impact other services relying on the queue. In the Ceilosca case the samples are sent as "measurements" directly to the Monasca API, which stores them in Kafka. This lets the publishing rate differ from the storage rate, so each layer can be tuned and optimized independently.
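The decoupling of publishing rate from storage rate can be pictured with a toy simulation. This is only a sketch: a `deque` stands in for the Kafka log, and the function name `simulate` and the rates are made up for illustration. Publishers append in bursts, the storage side drains at a fixed rate per tick, and the backlog absorbs the difference.

```python
from collections import deque


def simulate(published_per_tick, persist_rate):
    """Toy model of a buffered pipeline (hypothetical, for illustration).

    published_per_tick: number of measurements published in each tick.
    persist_rate: maximum number of measurements stored per tick.
    Returns (total_stored, backlog_after_each_tick).
    """
    log = deque()  # stands in for the Kafka log between API and storage
    stored = 0
    backlog = []
    next_id = 0

    for count in published_per_tick:
        # Publishing side: append a burst of measurements.
        for _ in range(count):
            log.append(next_id)
            next_id += 1

        # Storage side: drain at its own, independent rate.
        for _ in range(min(persist_rate, len(log))):
            log.popleft()
            stored += 1

        backlog.append(len(log))

    return stored, backlog


if __name__ == "__main__":
    total, backlog = simulate([5, 0, 0, 1], persist_rate=2)
    print(total, backlog)
```

A burst larger than the storage rate simply shows up as a temporary backlog instead of slowing the publisher down.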
Ceilosca Architecture:
The Monasca Publisher in the Ceilometer agent also takes advantage of another important aspect of publishing to Monasca: batching. The Monasca Publisher for Ceilometer has three parameters that can be set to control the batching behavior and performance:
In several of our tests we found that a batch of 1000 messages and a timeout of 15s, with a polling interval of 5s, is a good compromise for a mix of Central Agent loads and OpenStack Notifications.
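The interplay of batch size, timeout and polling interval can be sketched as follows. This is a minimal illustration of count-or-timeout batching, not the Ceilosca publisher code: the class `BatchingPublisher` and the parameter names `batch_count` and `batch_timeout` are assumptions, and the caller is expected to invoke `poll()` once per polling interval.

```python
import time


class BatchingPublisher:
    """Buffer measurements and send them in batches (illustrative sketch).

    A batch is sent when it reaches `batch_count` items, or when `poll()`
    finds that `batch_timeout` seconds have passed since the last flush.
    All names here are hypothetical.
    """

    def __init__(self, send, batch_count=1000, batch_timeout=15.0,
                 clock=time.monotonic):
        self._send = send                  # callable that receives a list
        self._batch_count = batch_count
        self._batch_timeout = batch_timeout
        self._clock = clock                # injectable for testing
        self._buffer = []
        self._last_flush = clock()

    def publish(self, measurement):
        """Queue one measurement; flush at once if the batch is full."""
        self._buffer.append(measurement)
        if len(self._buffer) >= self._batch_count:
            self.flush()

    def poll(self):
        """Call once per polling interval; flush a batch that has timed out."""
        timed_out = self._clock() - self._last_flush >= self._batch_timeout
        if self._buffer and timed_out:
            self.flush()

    def flush(self):
        """Send whatever is buffered and restart the timeout window."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)
        self._last_flush = self._clock()


if __name__ == "__main__":
    publisher = BatchingPublisher(print, batch_count=3, batch_timeout=15.0)
    for value in range(7):
        publisher.publish(value)
    publisher.flush()  # send the partial batch left at shutdown
```

With the values quoted above, a busy agent would send full batches of 1000, while a quiet one would have its partial batch picked up by a 5s poll once 15s have passed since the last flush.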
We all know that deploying and running OpenStack services is not the easiest thing on Earth. For this reason we wanted to move away from sophisticated deployments and make sure the deployment was well understood and took a single command. We wanted everybody to be able to get Ceilosca running on a single VM (or box), so we decided to leverage DevStack. We know, DevStack is for development and not for scale and performance testing, but guess what: if it scales in DevStack it will scale everywhere else.
What we needed next was a deployment script: a single unified script to install everything and have it running. Fortunately, both Monasca and DevStack already had deployment scripts that we could run and leverage. The only difference? Monasca uses Ansible and DevStack uses bash. So we created a new bash script that installs DevStack and then runs Ansible to deploy Monasca on top of DevStack, and that did the trick. Once you download the repo just go and execute:
/deployer/ceilosca.sh
and (depending on your env) after some time you will get a full DevStack with Ceilosca in it and you are: Ready to GO!
This DevStack + Ceilosca + Monasca setup is the environment where we ran all the tests, and we had it running on both virtual machines and bare metal.
Note: we now have a complete DevStack plugin for Monasca.
As we mentioned before, the tests were run on DevStack. This makes sure that the tests are repeatable by anyone interested in running them. Clearly DevStack brought some restrictions that we had to deal with. Moreover, some of us decided to run these tests in OpenStack VMs and that made it even more challenging ... (hey, we may even try stuff on containers later on, maybe using Kolla...). I will post the results of these tests in 2 separate blogs relating to Private and Public Cloud.
Ceilosca turned out to be a significant improvement over Ceilometer in both data ingestion and querying. The performance gain is quite staggering: from 2x to 4x in ingestion speed and throughput, and from 11x to 18x in querying. These are the main takeaways from the extensive testing we performed:
Cisco: Fabio Giannetti, Ken Owens, Srinivas Sakhamuri, Pauline Yeung, Steven Irvin
HP: Roland Hockmuth, Dan Dyer, Atul Aggarway, Jenny Wei, Putta Challa, Rohit Jaisway