[Hawkular-dev] First round of perf tests with more cassandra nodes - results

Fri Jan 15 10:04:54 EST 2016

Replied inline.

----- Original Message -----
> Thank you for the detailed summary. It is very insightful. I have a few
> comments. First, I am not all that interested in the tests in which clients
> are submitting one data point per request. We have two primary feeds right
> now with which we are concerned - Heapster and the Hawkular agent.
> Neither will be submitting a single data point per request. It will more
> likely be in the hundreds or even thousands.

It's very easy to increase the number of data points per request up to ~700. Bigger messages are currently a problem because of a PerfCake limitation: https://github.com/PerfCake/PerfCake/issues/260
I will update the test configurations to use higher values.
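
As a rough illustration of what a bigger message looks like, here is a minimal Python sketch that builds a request body with N data points in the same shape as the single-datapoint message quoted below. It is not the PerfCake configuration. The function name and its parameters are hypothetical, and sending timestamps as integer milliseconds is an assumption (the test template uses a quoted @{CurrentTimestamp} placeholder).

```python
import json
import time


def build_gauge_payload(metric_id, n_points, start_ms=None, step_ms=1):
    """Build a JSON body carrying n_points data points for one gauge metric."""
    if start_ms is None:
        start_ms = int(time.time() * 1000)
    # Assumption: timestamps are integer milliseconds, one per data point.
    data = [
        {"timestamp": start_ms + i * step_ms, "value": 10.12}
        for i in range(n_points)
    ]
    return json.dumps([{"id": metric_id, "data": data}])


# 700 is the per-request size mentioned above as easy to reach.
body = build_gauge_payload("gaugeID", 700, start_ms=1452870294000)
print(len(json.loads(body)[0]["data"]), "data points,", len(body), "bytes")
```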

> Secondly, we perform an additional write per metric on each request and wait
> for all those writes to finish before replying. Whether a request contains 1
> or 100 data points for a metric, we perform an additional write.
> The overhead of that additional write is probably offset more when there are
> more data points per metric. We could make those additional writes
> completely async, potentially sending a response before they complete, and
> see if it yields any gains.

If you want to experiment with this, please create a PR. We can easily compare results then.
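
To make the idea concrete, here is a small asyncio sketch of the two options: replying only after every write has finished, versus replying once the data points are written and letting the additional per-metric write complete in the background. This is not Hawkular Metrics code. All function names and the in-memory dict standing in for Cassandra are hypothetical.

```python
import asyncio


async def write_data_points(store, metric_id, points):
    # Stand-in for the data point inserts.
    await asyncio.sleep(0)
    store.setdefault("data", {}).setdefault(metric_id, []).extend(points)


async def write_metric_index(store, metric_id):
    # Stand-in for the additional per-metric write; slow on purpose.
    await asyncio.sleep(0.05)
    store.setdefault("index", set()).add(metric_id)


async def handle_blocking(store, metric_id, points):
    # Behaviour as described above: reply only after both writes finish.
    await asyncio.gather(
        write_data_points(store, metric_id, points),
        write_metric_index(store, metric_id),
    )
    return "200 OK"


async def handle_fire_and_forget(store, metric_id, points, pending):
    # Proposed variant: reply once the data points are stored and let the
    # additional write finish in the background.
    await write_data_points(store, metric_id, points)
    task = asyncio.create_task(write_metric_index(store, metric_id))
    pending.add(task)  # keep a reference so the task is not garbage collected
    task.add_done_callback(pending.discard)
    return "200 OK"


async def demo():
    store, pending = {}, set()
    status = await handle_fire_and_forget(store, "gaugeID", [10.12], pending)
    indexed_at_reply = "gaugeID" in store.get("index", set())
    await asyncio.gather(*pending)  # drain background writes before exiting
    indexed_after_drain = "gaugeID" in store["index"]
    return status, indexed_at_reply, indexed_after_drain


result = asyncio.run(demo())
print(result)
```

The trade-off in the second variant is that a failure of the background write can no longer be reported to the client in the response, so it would need to be logged or retried separately.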

> Third, it doesn’t sound like we are even coming close to maxing out resources
> on Cassandra nodes in most of the test configurations. It sounds more like
> H-Metrics is the limiting factor. When the numbers plateau or drop and
> adding more C* nodes doesn’t help, I would be interested to see how things
> look if we add additional H-Metrics nodes. Is it possible with the load
> generator tool to submit requests to multiple hosts in a round robin manner?

This is possible. Added to my todo list.
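
For illustration only, round-robin target selection shared by many client threads could look like the Python sketch below. This is not how PerfCake is configured. The class name and the host names are hypothetical; only the endpoint path is taken from the test description.

```python
import itertools
import threading


class RoundRobinHosts:
    """Hand out target URLs in round-robin order; safe to share across threads."""

    def __init__(self, hosts, path="/hawkular/metrics/gauges/data"):
        self._urls = itertools.cycle([f"http://{h}{path}" for h in hosts])
        self._lock = threading.Lock()

    def next_url(self):
        # The lock keeps the rotation consistent when many threads call this.
        with self._lock:
            return next(self._urls)


# Hypothetical H-Metrics hosts.
targets = RoundRobinHosts(["metrics-1.example:8080", "metrics-2.example:8080"])
urls = [targets.next_url() for _ in range(4)]
print(urls)
```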

Filip

> > On Jan 14, 2016, at 8:48 AM, Filip Brychta <fbrychta at redhat.com> wrote:
> > 
> > Hello,
> > I did a first quick perf test of hawkular metrics STANDALONE with more
> > cassandra nodes and it showed some interesting results.
> > 
> > An important note is that hawkular and the cassandra cluster were running
> > on VMs with shared storage, which is a very poor design for a cassandra
> > cluster, but it still showed some patterns which will be true for every
> > setup. With a proper cassandra cluster (dedicated disks, CommitLog and
> > SSTables on different disks, ...) the results should definitely be better.
> > 
> > Summary of what was found (some of it is obvious even without testing):
> > - small messages (1 datapoint per request) heavily utilize the cpu on the
> > hawkular host, while the cassandra hosts are only lightly utilized
> > - bigger messages (100 datapoints per request) are less demanding on the
> > hawkular host's cpu, and the cassandra hosts are utilized a little bit more
> > - with a weak cpu on the hawkular host, adding more cassandra nodes makes
> > performance even worse
> > - for small messages (1 datapoint per request) even with sufficient cpu on
> > the hawkular host the performance improvement was only ~ 25% when the
> > number of nodes in the cluster was increased from 1 to 2
> > - for bigger messages (100 datapoints per request) with sufficient cpu on
> > the hawkular host the performance improvement was ~ 75% when the number of
> > nodes in the cluster was increased from 1 to 2
> > - for small messages (1 datapoint per request) even with sufficient cpu on
> > the hawkular host the performance does NOT scale up with more cassandra
> > nodes (see results - performance dropped when the 4th node was added)
> > - for bigger messages (100 datapoints per request) with sufficient cpu on
> > the hawkular host the performance scales up but not linearly (this could be
> > caused by the shared storage, and with a proper cassandra cluster the
> > results will be better)
> > 
> > Questions:
> > - why is the performance getting worse when adding the 3rd and 4th storage
> > nodes when sending small messages and having sufficient cpu on the hawkular
> > host?
> > 
> > 
> > About the test:
> > - load generator was hitting following endpoint
> > http://${server.host}:${server.port}/hawkular/metrics/gauges/data
> > - one test run takes 4 minutes
> > - message with one datapoint looks like this
> > [{"id":"gaugeID","data":[{"timestamp": "@{CurrentTimestamp}", "value":
> > 10.12}]}]
> > - load generator was using 300 threads (each thread is acting like single
> > client) and was sending messages containing 1 or 100 datapoints
> > - hawkular metrics is deployed on wildfly-9.0.2.Final
> > - metrics version:
> > "Implementation-Version":"0.12.0-SNAPSHOT","Built-From-Git-SHA1":"c35deda5d6d03429e97f1ed4a6e4ef12cf7f3a00"
> > 
> > 
> > Results:
> > 
> > ===================================================
> > VMs with 2 cores and shared storage, 4GB of memory.
> > ===================================================
> > 300 threads, 1 datapoint, hawkular_metrics |
> > org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"1"
> > ++++++++++++++++++++++++++++++++
> > 1 cassandra node  ~ 3945 req/sec
> > 2 cassandra nodes ~ 3751 req/sec
> > 3 cassandra nodes ~ 3318 req/sec
> > 4 cassandra nodes ~ 2726 req/sec
> > 
> > In this case the cpu on the hawkular VM was fully used and adding more
> > cassandra nodes actually made performance worse.
> > The cpu on the cassandra nodes was never fully used.
> > 
> > 
> > 300 threads, 100 datapoints, hawkular_metrics |
> > org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"1"
> > ++++++++++++++++++++++++++++++++
> > 1 cassandra node  ~ 102 req/sec
> > 2 cassandra nodes ~ 138 req/sec
> > 3 cassandra nodes ~ 188 req/sec
> > 4 cassandra nodes ~ 175 req/sec
> > 
> > 
> > With a weak cpu on the hawkular VM and big messages (100 datapoints in
> > each) there is still some improvement when adding more cassandra nodes.
> > The cpu on the cassandra nodes was never fully used.
> > 
> > ===================================================
> > Hawkular VM with 4 cores, cassandra VMs 2 cores and shared storage, 4GB of
> > memory.
> > ===================================================
> > 300 threads, 1 datapoint, hawkular_metrics |
> > org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"1"
> > ++++++++++++++++++++++++++++++++
> > 1 cassandra node  ~ 5150 req/sec
> > 2 cassandra nodes ~ 5667 req/sec
> > 3 cassandra nodes ~ 5799 req/sec
> > 4 cassandra nodes ~ 5476 req/sec
> > 
> > With a stronger cpu on the hawkular VM, adding more cassandra nodes
> > improves performance, but there is a drop when the 4th node is added.
> > The cpu on the cassandra nodes was never fully used.
> > 
> > 300 threads, 100 datapoints, hawkular_metrics |
> > org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"1"
> > ++++++++++++++++++++++++++++++++
> > 1 cassandra node  ~ 111 req/sec
> > 2 cassandra nodes ~ 173 req/sec
> > 3 cassandra nodes ~ 206 req/sec
> > 4 cassandra nodes ~ 211 req/sec
> > 
> > With a stronger cpu on the hawkular VM, adding more cassandra nodes
> > improves performance.
> > The cpu on the cassandra nodes was never fully used.
> > 
> > ===================================================
> > Hawkular VM with 8 cores, cassandra VMs 2 cores and shared storage, 4GB of
> > memory. Cpu usage on the hawkular machine is 30-40%
> > ===================================================
> > 300 threads, 1 datapoint, hawkular_metrics |
> > org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"1"
> > ++++++++++++++++++++++++++++++++
> > 1 cassandra node  ~ 5424 req/sec
> > 2 cassandra nodes ~ 6810 req/sec
> > 3 cassandra nodes ~ 6576 req/sec
> > 4 cassandra nodes ~ 6094 req/sec
> > 
> > Why is there a drop for the 3rd and 4th nodes?
> > 
> > 300 threads, 100 datapoints, hawkular_metrics |
> > org.apache.cassandra.locator.SimpleStrategy | {"replication_factor":"1"
> > ++++++++++++++++++++++++++++++++
> > 1 cassandra node  ~ 97 req/sec
> > 2 cassandra nodes ~ 168 req/sec
> > 3 cassandra nodes ~ 222 req/sec
> > 4 cassandra nodes ~ 241 req/sec
> > 
> > 
> > Please let me know what you would like to see in next rounds of testing.
> > 
> > Filip
> > _______________________________________________
> > hawkular-dev mailing list
> > hawkular-dev at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/hawkular-dev
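
Since the 1-datapoint and 100-datapoint runs are reported in req/sec, they are easier to compare after converting to data points per second. The Python sketch below does that for the 8-core hawkular VM run and also computes the gain relative to a single cassandra node. The numbers are copied from the results above; the function and field names are only illustrative.

```python
# req/sec from the 8-core hawkular VM run, keyed by data points per request;
# list index 0 is 1 cassandra node, index 3 is 4 cassandra nodes.
REQ_PER_SEC = {
    1: [5424, 6810, 6576, 6094],
    100: [97, 168, 222, 241],
}


def summarize(req_per_sec):
    rows = []
    for points, rates in req_per_sec.items():
        baseline = rates[0]
        for nodes, rate in enumerate(rates, start=1):
            rows.append({
                "points_per_request": points,
                "nodes": nodes,
                # Throughput in stored data points rather than HTTP requests.
                "points_per_sec": rate * points,
                # Percentage gain in req/sec relative to the 1-node run.
                "gain_pct": round(100 * (rate - baseline) / baseline, 1),
            })
    return rows


for row in summarize(REQ_PER_SEC):
    print(row)
```

With these inputs, going from 1 to 2 nodes comes out at about 25.6% for 1-datapoint messages and 73.2% for 100-datapoint messages, consistent with the ~25% and ~75% in the summary. At 4 nodes the 100-datapoint run stores 24100 data points/sec against 6094 for the 1-datapoint run.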