[Hawkular-dev] Metrics performance testing PR#520 - big perf improvement

John Sanda jsanda at redhat.com
Wed Jun 22 07:33:53 EDT 2016


This is awesome. Nice work Thomas!

> On Jun 22, 2016, at 6:56 AM, Filip Brychta <fbrychta at redhat.com> wrote:
> 
> This PR increased throughput by 0-200%, depending on message size!!!
> 
> Throughput increase based on message size:
> 1 datapoint per request - no change
> 10 datapoints per request - 100% throughput increase
> 100 datapoints per request - 200% throughput increase
> 500 datapoints per request - 180% throughput increase
> 5000 datapoints per request - 130% throughput increase
> 
> CPU usage is significantly higher: before, the server was 70% idle; after, it's just 12% idle.
> 
> Filip
> 
> ----- Original Message -----
>> Hi,
>> 
>> Today I've been looking at Metrics insertion performance.
>> 
>> My setup is the following:
>> - on my laptop I run the Gatling load scenario (which is very similar to
>> the perf test job scenario)
>> - I also run a Metrics standalone instance
>> - on another machine connected to my LAN, I run a single-node C* cluster
>> 
>> In order to avoid problems due to memory constraints, I set the min and
>> max heap sizes to 2048m. Below this value the server spends significant
>> time in garbage collection (with a high number of concurrent clients).
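>> 
>> For reference, pinning the heap is done with the usual JVM flags; for a
>> WildFly-based standalone server they would typically go in
>> bin/standalone.conf (adjust to your setup):
>> ----
>> JAVA_OPTS="$JAVA_OPTS -Xms2048m -Xmx2048m"
>> ----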
>> 
>> I ran Gatling 3 times with different numbers of clients (think agents):
>> 100, 1000, and 2000.
>> ----
>> mvn gatling:execute -Dclients=X -Dramp=0 -Dloops=50
>> ----
>> 
>> From 100 to 1000 virtual clients, throughput rose accordingly (x10).
>> From 1000 to 2000, I hit the same kind of plateau Filip was observing:
>> I got barely a 15% throughput increase. None of the machines had
>> reached CPU/IO/memory limits.
>> 
>> So I ran the 2000 virtual client test again and captured a thread dump
>> with jstack (result attached). As you can see, most of the task handler
>> threads are in state WAITING inside the
>> com.datastax.driver.core.HostConnectionPool.awaitAvailableConnection method.
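>> 
>> For anyone who wants to reproduce this, a plain jstack capture of the
>> server JVM is enough, e.g.:
>> ----
>> jstack <metrics-server-pid> > threads.txt
>> ----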
>> 
>> Then, following the "Monitoring and tuning the pool" section of the
>> driver doc [1], I added some code to print the number of open
>> connections, active requests, and maximum capacity. It confirmed that
>> the maximum capacity was reached.
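>> 
>> For reference, the kind of snippet the pooling doc suggests looks like
>> this (a sketch assuming the 3.x Java driver API; class and method names
>> are just for illustration, not the exact code I added):
>> ----
>> import java.util.concurrent.Executors;
>> import java.util.concurrent.TimeUnit;
>> 
>> import com.datastax.driver.core.Cluster;
>> import com.datastax.driver.core.Host;
>> import com.datastax.driver.core.HostDistance;
>> import com.datastax.driver.core.PoolingOptions;
>> import com.datastax.driver.core.Session;
>> 
>> public class PoolMonitor {
>>     // Periodically log open connections, in-flight requests and capacity
>>     public static void start(Cluster cluster, Session session) {
>>         Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {
>>             Session.State state = session.getState();
>>             PoolingOptions pool = cluster.getConfiguration().getPoolingOptions();
>>             for (Host host : state.getConnectedHosts()) {
>>                 int connections = state.getOpenConnections(host);
>>                 int inFlight = state.getInFlightQueries(host);
>>                 // capacity = open connections * max requests per connection
>>                 int capacity = connections
>>                         * pool.getMaxRequestsPerConnection(HostDistance.LOCAL);
>>                 System.out.printf("%s connections=%d inFlight=%d capacity=%d%n",
>>                         host, connections, inFlight, capacity);
>>             }
>>         }, 5, 5, TimeUnit.SECONDS);
>>     }
>> }
>> ----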
>> 
>> Note that by default with the V3 protocol, the driver creates just one
>> connection per host and allows a maximum of 1024 concurrent requests
>> per connection.
>> 
>> After that, I added new options in the code to let the user define the
>> maximum number of connections per host as well as the maximum number of
>> requests per connection.
>> 
>> In my tests, I set the maximum connection limit to 10 and the maximum
>> concurrency to 5000. These are arbitrary values; they are simply bigger
>> than the defaults. The best values depend on the environment and should
>> be determined with testing.
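>> 
>> For illustration, setting those limits directly on the driver would
>> look like this (again a sketch assuming the 3.x driver API; the PR
>> exposes them as configuration options instead):
>> ----
>> import com.datastax.driver.core.Cluster;
>> import com.datastax.driver.core.HostDistance;
>> import com.datastax.driver.core.PoolingOptions;
>> 
>> public class ClusterFactory {
>>     public static Cluster build() {
>>         // Raise the V3 defaults: 1 connection per host, 1024 requests each
>>         PoolingOptions poolingOptions = new PoolingOptions()
>>                 .setMaxConnectionsPerHost(HostDistance.LOCAL, 10)
>>                 .setMaxRequestsPerConnection(HostDistance.LOCAL, 5000);
>>         return Cluster.builder()
>>                 .addContactPoint("127.0.0.1") // replace with your C* node
>>                 .withPoolingOptions(poolingOptions)
>>                 .build();
>>     }
>> }
>> ----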
>> 
>> Then I ran Gatling 3 times again (100, 1000, 2000 virtual clients).
>> This time throughput increased linearly with the number of clients,
>> from 100 to 1000 and then to 2000. I also tried with 3000, but this
>> time throughput did not increase as much. This was almost certainly
>> because both machines were now using more than 80% CPU.
>> 
>> I sent a pull request [2] with the changes needed for the new
>> configuration parameters.
>> 
>> Hopefully it helps with Filip's tests as well.
>> 
>> Regards,
>> 
>> --
>> Thomas Segismont
>> JBoss ON Engineering Team
>> 
>> [1]
>> http://datastax.github.io/java-driver/manual/pooling/#monitoring-and-tuning-the-pool
>> [2] https://github.com/hawkular/hawkular-metrics/pull/520
>> 
>> _______________________________________________
>> hawkular-dev mailing list
>> hawkular-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hawkular-dev
>> 
> _______________________________________________
> hawkular-dev mailing list
> hawkular-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hawkular-dev



