[JBoss JIRA] (ISPN-4843) org.infinispan.manager.CacheManagerTest.testCacheManagerRestartReusingConfigurations fails randomly
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4843?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration updated ISPN-4843:
------------------------------------------
Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1102058, https://bugzilla.redhat.com/show_bug.cgi?id=1138572 (was: https://bugzilla.redhat.com/show_bug.cgi?id=1102058)
> org.infinispan.manager.CacheManagerTest.testCacheManagerRestartReusingConfigurations fails randomly
> ---------------------------------------------------------------------------------------------------
>
> Key: ISPN-4843
> URL: https://issues.jboss.org/browse/ISPN-4843
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 5.2.10.Final
> Reporter: Michal Vinkler
> Assignee: Dan Berindei
> Labels: 5.2.x
>
> org.infinispan.manager.CacheManagerTest.testCacheManagerRestartReusingConfigurations fails. So far I've seen it on Windows (where it seems random) and on RHEL7 (where it seems consistent, on every JDK).
> Error Message
> {code}
> Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl
> Stacktrace
> org.infinispan.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl
> at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:205)
> at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:886)
> at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:657)
> at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:646)
> at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:549)
> at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:217)
> at org.infinispan.CacheImpl.start(CacheImpl.java:582)
> at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:686)
> at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:649)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:545)
> at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:518)
> at org.infinispan.manager.CacheManagerTest$4$1.call(CacheManagerTest.java:374)
> at org.infinispan.test.TestingUtil.withCacheManager(TestingUtil.java:1237)
> at org.infinispan.manager.CacheManagerTest$4.call(CacheManagerTest.java:370)
> at org.infinispan.test.TestingUtil.withCacheManagers(TestingUtil.java:1252)
> at org.infinispan.manager.CacheManagerTest.testCacheManagerRestartReusingConfigurations(CacheManagerTest.java:356)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.infinispan.CacheException: Initial state transfer timed out for cache ___defaultcache on CacheManagerTest-NodeA-49570
> at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:216)
> at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203)
> ... 36 more
> ... Removed 19 stack frames
> {code}
> Unrelated to https://issues.jboss.org/browse/ISPN-3963.
> Jenkins job link:
> https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-manu-infinisp...
> Downstream BZ was: https://bugzilla.redhat.com/show_bug.cgi?id=1102058
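A note on reading the trace above: the top-level "Unable to invoke method" exception is only the reflection wrapper, and the actual failure is the nested "Initial state transfer timed out" cause. Below is a minimal sketch of walking the cause chain to reach it. RootCause is a hypothetical helper, not an Infinispan class, and RuntimeException stands in for org.infinispan.CacheException.

```java
// Illustrative only: RuntimeException stands in for org.infinispan.CacheException,
// and RootCause is a hypothetical helper, not part of Infinispan.
final class RootCause {

    /** Follows getCause() links until the innermost exception is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        Throwable timeout = new RuntimeException(
                "Initial state transfer timed out for cache ___defaultcache");
        Throwable wrapper = new RuntimeException(
                "Unable to invoke method waitForInitialStateTransferToComplete()", timeout);

        // Prints the inner timeout message rather than the reflection wrapper's.
        System.out.println(rootCause(wrapper).getMessage());
    }
}
```

When triaging a random failure like this one, grouping runs by the root-cause message rather than the wrapper message may make it easier to tell whether the Windows and RHEL7 failures are the same timeout.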
--
This message was sent by Atlassian JIRA
(v6.3.1#6329)
11 years, 5 months
[JBoss JIRA] (ISPN-4843) org.infinispan.manager.CacheManagerTest.testCacheManagerRestartReusingConfigurations fails randomly
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4843?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4843:
-----------------------------------------------
Michal Vinkler <mvinkler(a)redhat.com> changed the Status of [bug 1102058|https://bugzilla.redhat.com/show_bug.cgi?id=1102058] from NEW to CLOSED
[JBoss JIRA] (ISPN-4843) org.infinispan.manager.CacheManagerTest.testCacheManagerRestartReusingConfigurations fails randomly
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4843?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration updated ISPN-4843:
------------------------------------------
Bugzilla Update: Perform
Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1102058
[JBoss JIRA] (ISPN-4842) Shared consistent hash
by Takayoshi Kimura (JIRA)
[ https://issues.jboss.org/browse/ISPN-4842?page=com.atlassian.jira.plugin.... ]
Takayoshi Kimura updated ISPN-4842:
-----------------------------------
Description:
A user is testing a 500-node cluster with 500 dist caches defined, and plans to expand it to 3000 caches.
Infinispan manages a consistent hash per cache, uses a JGroups channel per cache, and uses several threads per cache. This adds significant overhead in a cluster this large. When tested at this size, Infinispan easily exhausted all threads in the thread pools and deadlocked, and it requires several thousand threads to handle the massive number of JOIN requests: the coordinator receives 499 * 3000 JOIN requests.
It would be great if we could share the consistent hash and resources between caches. For example, define a "master" dist cache and allow other caches to refer to the master cache for resource sharing.
was:
A user is testing a 500-node cluster with 500 dist caches defined, and plans to expand it to 3000 caches.
Infinispan manages a consistent hash per cache, uses a JGroups channel per cache, and uses several threads per cache. This adds significant overhead in a cluster this large. When tested at this size, Infinispan easily exhausted all threads in the thread pools and deadlocked, and it requires several thousand threads to handle the massive number of JOIN requests: the coordinator receives 499 * 3000 JOIN requests.
It would be great if we could share the consistent hash and resources between caches. For example, define a "master" dist cache and refer to that cache to share resources.
> Shared consistent hash
> ----------------------
>
> Key: ISPN-4842
> URL: https://issues.jboss.org/browse/ISPN-4842
> Project: Infinispan
> Issue Type: Feature Request
> Components: Core
> Affects Versions: 7.0.0.CR1
> Reporter: Takayoshi Kimura
>
> A user is testing a 500-node cluster with 500 dist caches defined, and plans to expand it to 3000 caches.
> Infinispan manages a consistent hash per cache, uses a JGroups channel per cache, and uses several threads per cache. This adds significant overhead in a cluster this large. When tested at this size, Infinispan easily exhausted all threads in the thread pools and deadlocked, and it requires several thousand threads to handle the massive number of JOIN requests: the coordinator receives 499 * 3000 JOIN requests.
> It would be great if we could share the consistent hash and resources between caches. For example, define a "master" dist cache and allow other caches to refer to the master cache for resource sharing.
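The "499 * 3000" figure in the description can be reproduced with a short calculation. This is a rough sketch that assumes, as the description implies, that every node other than the coordinator sends one JOIN request per cache. JoinLoadEstimate is a hypothetical helper written for illustration, not Infinispan code.

```java
// Hypothetical back-of-the-envelope helper; not part of Infinispan.
final class JoinLoadEstimate {

    /**
     * Assumes every node except the coordinator sends one JOIN request per cache,
     * which is what the "499 * 3000" figure in the description implies.
     */
    static long joinRequests(int nodes, int caches) {
        if (nodes < 1 || caches < 0) {
            throw new IllegalArgumentException("nodes must be >= 1 and caches >= 0");
        }
        return (long) (nodes - 1) * caches;
    }

    public static void main(String[] args) {
        System.out.println(joinRequests(500, 500));   // tested setup: 249500
        System.out.println(joinRequests(500, 3000));  // planned setup: 1497000
    }
}
```

Under this assumption the load on the coordinator grows linearly with the number of caches, which is why sharing one consistent hash across caches would reduce it.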
[JBoss JIRA] (ISPN-4843) org.infinispan.manager.CacheManagerTest.testCacheManagerRestartReusingConfigurations fails randomly
by Michal Vinkler (JIRA)
Michal Vinkler created ISPN-4843:
------------------------------------
Summary: org.infinispan.manager.CacheManagerTest.testCacheManagerRestartReusingConfigurations fails randomly
Key: ISPN-4843
URL: https://issues.jboss.org/browse/ISPN-4843
Project: Infinispan
Issue Type: Bug
Components: Test Suite - Core
Affects Versions: 5.2.10.Final
Reporter: Michal Vinkler
Assignee: Dan Berindei
org.infinispan.manager.CacheManagerTest.testCacheManagerRestartReusingConfigurations fails. So far I've seen it on Windows (where it seems random) and on RHEL7 (where it seems consistent, on every JDK).
Error Message
{code}
Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl
Stacktrace
org.infinispan.CacheException: Unable to invoke method public void org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete() throws java.lang.InterruptedException on object of type StateTransferManagerImpl
at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:205)
at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:886)
at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:657)
at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:646)
at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:549)
at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:217)
at org.infinispan.CacheImpl.start(CacheImpl.java:582)
at org.infinispan.manager.DefaultCacheManager.wireAndStartCache(DefaultCacheManager.java:686)
at org.infinispan.manager.DefaultCacheManager.createCache(DefaultCacheManager.java:649)
at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:545)
at org.infinispan.manager.DefaultCacheManager.getCache(DefaultCacheManager.java:518)
at org.infinispan.manager.CacheManagerTest$4$1.call(CacheManagerTest.java:374)
at org.infinispan.test.TestingUtil.withCacheManager(TestingUtil.java:1237)
at org.infinispan.manager.CacheManagerTest$4.call(CacheManagerTest.java:370)
at org.infinispan.test.TestingUtil.withCacheManagers(TestingUtil.java:1252)
at org.infinispan.manager.CacheManagerTest.testCacheManagerRestartReusingConfigurations(CacheManagerTest.java:356)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.infinispan.CacheException: Initial state transfer timed out for cache ___defaultcache on CacheManagerTest-NodeA-49570
at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:216)
at org.infinispan.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:203)
... 36 more
... Removed 19 stack frames
{code}
Unrelated to https://issues.jboss.org/browse/ISPN-3963.
Jenkins job link:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-manu-infinisp...
Downstream BZ was: https://bugzilla.redhat.com/show_bug.cgi?id=1102058
[JBoss JIRA] (ISPN-4842) Shared consistent hash
by Takayoshi Kimura (JIRA)
Takayoshi Kimura created ISPN-4842:
--------------------------------------
Summary: Shared consistent hash
Key: ISPN-4842
URL: https://issues.jboss.org/browse/ISPN-4842
Project: Infinispan
Issue Type: Feature Request
Components: Core
Affects Versions: 7.0.0.CR1
Reporter: Takayoshi Kimura
A user is testing a 500-node cluster with 500 dist caches defined, and plans to expand it to 3000 caches.
Infinispan manages a consistent hash per cache, uses a JGroups channel per cache, and uses several threads per cache. This adds significant overhead in a cluster this large. When tested at this size, Infinispan easily exhausted all threads in the thread pools and deadlocked, and it requires several thousand threads to handle the massive number of JOIN requests: the coordinator receives 499 * 3000 JOIN requests.
It would be great if we could share the consistent hash and resources between caches. For example, define a "master" dist cache and refer to that cache to share resources.
[JBoss JIRA] (ISPN-4841) TopologyAwareConsistentHashFactory is slow for large cluster
by Takayoshi Kimura (JIRA)
[ https://issues.jboss.org/browse/ISPN-4841?page=com.atlassian.jira.plugin.... ]
Takayoshi Kimura commented on ISPN-4841:
----------------------------------------
A perf test for this issue:
https://github.com/nekop/infinispan/blob/e96c8b1071b2ba74606d8b93c9d567da...
To execute with hprof:
$ cd core
$ mvn test -Dtest=distribution.RebalancePerfTest#testTopologyAwareConsistentHash -DforkJvmArgs="-agentlib:hprof=cpu=samples"
> TopologyAwareConsistentHashFactory is slow for large cluster
> ------------------------------------------------------------
>
> Key: ISPN-4841
> URL: https://issues.jboss.org/browse/ISPN-4841
> Project: Infinispan
> Issue Type: Enhancement
> Components: Core
> Affects Versions: 7.0.0.CR1
> Reporter: Takayoshi Kimura
>
> A user observed 100% CPU usage for a long time on the coordinator node when booting 500 nodes with 500 caches defined.
> It looks like TopologyAwareConsistentHashFactory runs in O(n^2) time: it has a nested loop over all machines. It takes 50 sec to compute a rebalance for 1 cache on 500 nodes. This calculation is performed for every cache, so it consumes 25000 sec of CPU time with 500 nodes and 500 caches.
> The hprof output shows that 90% of the time is spent in TopologyInfo.computeMaxSegmentsForMachine().
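The numbers in the description can be checked with a small calculation. This is only a sketch: it assumes one machine per node and takes the reported 50 sec per cache at face value. RebalanceCostEstimate is a hypothetical helper and does not reproduce the actual TopologyInfo logic.

```java
// Hypothetical estimate helper; it does not reproduce TopologyInfo's real algorithm.
final class RebalanceCostEstimate {

    /** Iterations of a nested loop that visits every (machine, machine) pair. */
    static long pairwiseSteps(int machines) {
        return (long) machines * machines;
    }

    /** Total CPU time when the same per-cache computation is repeated for every cache. */
    static long totalCpuSeconds(long secondsPerCache, int caches) {
        return secondsPerCache * caches;
    }

    public static void main(String[] args) {
        System.out.println(pairwiseSteps(500));        // 250000 pair visits
        System.out.println(pairwiseSteps(1000));       // 1000000: doubling machines quadruples the work
        System.out.println(totalCpuSeconds(50, 500));  // 25000 sec, matching the report
    }
}
```

25000 sec is close to seven hours of CPU time on the coordinator, so either reducing the per-cache cost below O(n^2) or computing the result once and reusing it across caches would help.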
[JBoss JIRA] (ISPN-4841) TopologyAwareConsistentHashFactory is slow for large cluster
by Takayoshi Kimura (JIRA)
Takayoshi Kimura created ISPN-4841:
--------------------------------------
Summary: TopologyAwareConsistentHashFactory is slow for large cluster
Key: ISPN-4841
URL: https://issues.jboss.org/browse/ISPN-4841
Project: Infinispan
Issue Type: Enhancement
Components: Core
Affects Versions: 7.0.0.CR1
Reporter: Takayoshi Kimura
A user observed 100% CPU usage for a long time on the coordinator node when booting 500 nodes with 500 caches defined.
It looks like TopologyAwareConsistentHashFactory runs in O(n^2) time: it has a nested loop over all machines. It takes 50 sec to compute a rebalance for 1 cache on 500 nodes. This calculation is performed for every cache, so it consumes 25000 sec of CPU time with 500 nodes and 500 caches.
The hprof output shows that 90% of the time is spent in TopologyInfo.computeMaxSegmentsForMachine().