[infinispan-issues] [JBoss JIRA] (ISPN-4841) TopologyAwareConsistentHashFactory is slow for large cluster

Wed Oct 15 03:16:35 EDT 2014

    [ https://issues.jboss.org/browse/ISPN-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011899#comment-13011899 ] 

Takayoshi Kimura commented on ISPN-4841:
----------------------------------------

A perf test for this issue:

https://github.com/nekop/infinispan/blob/e96c8b1071b2ba74606d8b93c9d567da09fd1963/core/src/test/java/org/infinispan/distribution/RebalancePerfTest.java

To execute with hprof:

$ cd core
$ mvn test -Dtest=distribution.RebalancePerfTest#testTopologyAwareConsistentHash -DforkJvmArgs="-agentlib:hprof=cpu=samples" 

> TopologyAwareConsistentHashFactory is slow for large cluster
> ------------------------------------------------------------
>
>                 Key: ISPN-4841
>                 URL: https://issues.jboss.org/browse/ISPN-4841
>             Project: Infinispan
>          Issue Type: Enhancement
>          Components: Core
>    Affects Versions: 7.0.0.CR1
>            Reporter: Takayoshi Kimura
>
> A user observed 100% CPU usage for a long time on the coordinator node when booting 500 nodes with 500 caches defined.
> It looks like TopologyAwareConsistentHashFactory runs in O(n^2) time: it has a double loop over all machines. It takes 50 sec to compute a rebalance for 1 cache with 500 nodes. This calculation is performed for every cache, so it consumes 25000 sec of CPU time with 500 nodes and 500 caches.
> The hprof output shows that 90% of the time is spent in TopologyInfo.computeMaxSegmentsForMachine().
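
To illustrate the cost pattern described above, here is a minimal sketch. It is not the actual TopologyInfo code. The class RebalanceCostSketch, its methods, and the "site" grouping are hypothetical names chosen for illustration. It contrasts a per-machine rescan of all machines (quadratic) with a single aggregation pass followed by lookups (linear), and reproduces the 50 sec x 500 caches arithmetic from the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; names do not come from Infinispan.
public class RebalanceCostSketch {

    /** Quadratic: for every machine, rescan all machines to count those in the same site. */
    public static int[] peersPerMachineQuadratic(List<String> siteOfMachine) {
        int n = siteOfMachine.size();
        int[] peers = new int[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {          // n * n comparisons in total
                if (siteOfMachine.get(i).equals(siteOfMachine.get(j))) {
                    peers[i]++;
                }
            }
        }
        return peers;
    }

    /** Linear: count machines per site once, then look the count up for each machine. */
    public static int[] peersPerMachineLinear(List<String> siteOfMachine) {
        Map<String, Integer> perSite = new HashMap<>();
        for (String site : siteOfMachine) {
            perSite.merge(site, 1, Integer::sum);  // one pass to aggregate
        }
        int[] peers = new int[siteOfMachine.size()];
        for (int i = 0; i < peers.length; i++) {
            peers[i] = perSite.get(siteOfMachine.get(i));
        }
        return peers;
    }

    /** Total CPU time if the same computation is repeated once per cache. */
    public static long totalCpuSeconds(long secondsPerCache, long caches) {
        return secondsPerCache * caches;
    }

    public static void main(String[] args) {
        List<String> sites = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            sites.add("site-" + (i % 5));          // assumed layout: 500 machines, 5 sites
        }
        int[] slow = peersPerMachineQuadratic(sites);
        int[] fast = peersPerMachineLinear(sites);
        System.out.println("same result: " + Arrays.equals(slow, fast));
        System.out.println("inner comparisons (quadratic): " + (500L * 500L));
        System.out.println("total CPU seconds: " + totalCpuSeconds(50, 500));
    }
}
```

Both methods return the same counts. Only the number of steps differs, which is why precomputing or caching per-machine aggregates is a plausible direction for a fix, assuming the real hot spot has a similar shape.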
