[JBoss JIRA] (ISPN-4480) Messages sent to leavers can clog the JGroups bundler thread
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-4480?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-4480:
-------------------------------
Description:
In a stress test that repeatedly kills nodes while performing read/write operations, the TransferQueueBundler thread seems to spend a lot of time waiting for physical addresses:
{noformat}
06:40:10,316 WARN [org.radargun.utils.Utils] (pool-5-thread-1) Stack for thread TransferQueueBundler,default,apex953-14666:
java.lang.Thread.sleep(Native Method)
org.jgroups.util.Util.sleep(Util.java:1504)
org.jgroups.util.Util.sleepRandom(Util.java:1574)
org.jgroups.protocols.TP.sendToSingleMember(TP.java:1685)
org.jgroups.protocols.TP.doSend(TP.java:1670)
org.jgroups.protocols.TP$TransferQueueBundler.sendBundledMessages(TP.java:2476)
org.jgroups.protocols.TP$TransferQueueBundler.sendMessages(TP.java:2392)
org.jgroups.protocols.TP$TransferQueueBundler.run(TP.java:2383)
java.lang.Thread.run(Thread.java:744)
{noformat}
Two related bugs are already fixed in JGroups 3.5.0.Beta2+: JGRP-1814 and JGRP-1815.
There is also a special case where the physical address could be removed from the cache too soon, exacerbating the effect of JGRP-1815: JGRP-1858
We can work around the problem by changing the JGroups configuration (a configuration sketch follows the list):
* TP.logical_addr_cache_expiration=86400000
** Only expire addresses after 1 day
* TP.physical_addr_max_fetch_attempts=1
** Sleep for only 20ms waiting for the physical address (default: 3 attempts, up to 1500ms)
* UNICAST3.conn_close_timeout=10000
** Drop the pending messages to leavers sooner
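For reference, a minimal sketch of a JGroups stack fragment with these attributes applied. The surrounding protocols are elided and the UDP transport is only an assumption (TCP takes the same attributes):
{code}
<config xmlns="urn:org:jgroups">
    <!-- Transport attributes from the workaround; all other transport settings elided -->
    <UDP logical_addr_cache_expiration="86400000"
         physical_addr_max_fetch_attempts="1"/>
    <!-- ... discovery, failure detection, NAKACK2, etc. ... -->
    <UNICAST3 conn_close_timeout="10000"/>
    <!-- ... remaining protocols ... -->
</config>
{code}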
was:
In a stress test that repeatedly kills nodes while performing write operations, the TransferQueueBundler thread seems to spend a lot of time waiting for physical addresses:
{noformat}
06:40:10,316 WARN [org.radargun.utils.Utils] (pool-5-thread-1) Stack for thread TransferQueueBundler,default,apex953-14666:
java.lang.Thread.sleep(Native Method)
org.jgroups.util.Util.sleep(Util.java:1504)
org.jgroups.util.Util.sleepRandom(Util.java:1574)
org.jgroups.protocols.TP.sendToSingleMember(TP.java:1685)
org.jgroups.protocols.TP.doSend(TP.java:1670)
org.jgroups.protocols.TP$TransferQueueBundler.sendBundledMessages(TP.java:2476)
org.jgroups.protocols.TP$TransferQueueBundler.sendMessages(TP.java:2392)
org.jgroups.protocols.TP$TransferQueueBundler.run(TP.java:2383)
java.lang.Thread.run(Thread.java:744)
{noformat}
Two related bugs are already fixed in JGroups 3.5.0.Beta2+: JGRP-1814 and JGRP-1815.
There is also a special case where the physical address could be removed from the cache too soon, exacerbating the effect of JGRP-1815: JGRP-1858
We can work around the problem by changing the JGroups configuration:
* TP.logical_addr_cache_expiration=86400000
** Only expire addresses after 1 day
* TP.physical_addr_max_fetch_attempts=1
** Sleep for only 20ms waiting for the physical address (default: 3 attempts, up to 1500ms)
* UNICAST3.conn_close_timeout=10000
** Drop the pending messages to leavers sooner
> Messages sent to leavers can clog the JGroups bundler thread
> ------------------------------------------------------------
>
> Key: ISPN-4480
> URL: https://issues.jboss.org/browse/ISPN-4480
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public (Everyone can see)
> Components: Core
> Affects Versions: 6.0.2.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
>
> In a stress test that repeatedly kills nodes while performing read/write operations, the TransferQueueBundler thread seems to spend a lot of time waiting for physical addresses:
> {noformat}
> 06:40:10,316 WARN [org.radargun.utils.Utils] (pool-5-thread-1) Stack for thread TransferQueueBundler,default,apex953-14666:
> java.lang.Thread.sleep(Native Method)
> org.jgroups.util.Util.sleep(Util.java:1504)
> org.jgroups.util.Util.sleepRandom(Util.java:1574)
> org.jgroups.protocols.TP.sendToSingleMember(TP.java:1685)
> org.jgroups.protocols.TP.doSend(TP.java:1670)
> org.jgroups.protocols.TP$TransferQueueBundler.sendBundledMessages(TP.java:2476)
> org.jgroups.protocols.TP$TransferQueueBundler.sendMessages(TP.java:2392)
> org.jgroups.protocols.TP$TransferQueueBundler.run(TP.java:2383)
> java.lang.Thread.run(Thread.java:744)
> {noformat}
> Two related bugs are already fixed in JGroups 3.5.0.Beta2+: JGRP-1814 and JGRP-1815.
> There is also a special case where the physical address could be removed from the cache too soon, exacerbating the effect of JGRP-1815: JGRP-1858
> We can work around the problem by changing the JGroups configuration:
> * TP.logical_addr_cache_expiration=86400000
> ** Only expire addresses after 1 day
> * TP.physical_addr_max_fetch_attempts=1
> ** Sleep for only 20ms waiting for the physical address (default: 3 attempts, up to 1500ms)
> * UNICAST3.conn_close_timeout=10000
> ** Drop the pending messages to leavers sooner
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4480) Messages sent to leavers can clog the JGroups bundler thread
by Dan Berindei (JIRA)
Dan Berindei created ISPN-4480:
----------------------------------
Summary: Messages sent to leavers can clog the JGroups bundler thread
Key: ISPN-4480
URL: https://issues.jboss.org/browse/ISPN-4480
Project: Infinispan
Issue Type: Bug
Security Level: Public (Everyone can see)
Components: Core
Affects Versions: 6.0.2.Final
Reporter: Dan Berindei
Assignee: Dan Berindei
In a stress test that repeatedly kills nodes while performing write operations, the TransferQueueBundler thread seems to spend a lot of time waiting for physical addresses:
{noformat}
06:40:10,316 WARN [org.radargun.utils.Utils] (pool-5-thread-1) Stack for thread TransferQueueBundler,default,apex953-14666:
java.lang.Thread.sleep(Native Method)
org.jgroups.util.Util.sleep(Util.java:1504)
org.jgroups.util.Util.sleepRandom(Util.java:1574)
org.jgroups.protocols.TP.sendToSingleMember(TP.java:1685)
org.jgroups.protocols.TP.doSend(TP.java:1670)
org.jgroups.protocols.TP$TransferQueueBundler.sendBundledMessages(TP.java:2476)
org.jgroups.protocols.TP$TransferQueueBundler.sendMessages(TP.java:2392)
org.jgroups.protocols.TP$TransferQueueBundler.run(TP.java:2383)
java.lang.Thread.run(Thread.java:744)
{noformat}
Two related bugs are already fixed in JGroups 3.5.0.Beta2+: JGRP-1814 and JGRP-1815.
There is also a special case where the physical address could be removed from the cache too soon, exacerbating the effect of JGRP-1815: JGRP-1858
We can work around the problem by changing the JGroups configuration:
* TP.logical_addr_cache_expiration=86400000
** Only expire addresses after 1 day
* TP.physical_addr_max_fetch_attempts=1
** Sleep for only 20ms waiting for the physical address (default: 3 attempts, up to 1500ms)
* UNICAST3.conn_close_timeout=10000
** Drop the pending messages to leavers sooner
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4479) Remote executor thread pool configuration is ignored
by Dan Berindei (JIRA)
Dan Berindei created ISPN-4479:
----------------------------------
Summary: Remote executor thread pool configuration is ignored
Key: ISPN-4479
URL: https://issues.jboss.org/browse/ISPN-4479
Project: Infinispan
Issue Type: Bug
Security Level: Public (Everyone can see)
Components: Core
Affects Versions: 7.0.0.Alpha4
Reporter: Dan Berindei
Assignee: Dan Berindei
Fix For: 7.0.0.Alpha5
Currently NamedExecutorsFactory uses the replication queue executor's configuration to build the remote executor's thread pool.
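The description is terse, so here is an illustration only of the bug pattern; all names below are hypothetical stand-ins, not the actual NamedExecutorsFactory code. The point is that the remote executor is built from the replication queue executor's settings instead of its own:
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class RemoteExecutorFactorySketch {
    // Hypothetical pool sizes standing in for the two executor configurations.
    static final int REPLICATION_QUEUE_THREADS = 1;
    static final int REMOTE_COMMAND_THREADS = 32;

    static ExecutorService buildRemoteExecutor() {
        // Bug pattern: the replication queue executor's setting is used here,
        // so the configured remote executor pool size is silently ignored.
        return Executors.newFixedThreadPool(REPLICATION_QUEUE_THREADS);
    }

    // The fix is to read the remote executor's own configuration instead.
    static ExecutorService buildRemoteExecutorFixed() {
        return Executors.newFixedThreadPool(REMOTE_COMMAND_THREADS);
    }
}
{code}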
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4471) MapReduceTask: memory leak with useIntermediateSharedCache = true
by Vladimir Blagojevic (JIRA)
[ https://issues.jboss.org/browse/ISPN-4471?page=com.atlassian.jira.plugin.... ]
Vladimir Blagojevic updated ISPN-4471:
--------------------------------------
Git Pull Request: https://github.com/infinispan/infinispan/pull/2693/
Workaround Description: Users can clear the tmp cache themselves. The name of the cache is MapReduceTask.DEFAULT_TMP_CACHE_CONFIGURATION_NAME.
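A minimal sketch of the workaround, assuming the application has access to the EmbeddedCacheManager that runs the Map/Reduce tasks and that the constant is visible to the caller:
{code}
import org.infinispan.Cache;
import org.infinispan.distexec.mapreduce.MapReduceTask;
import org.infinispan.manager.EmbeddedCacheManager;

public class TmpCacheCleanup {
    // Workaround only: clears the shared intermediate cache that MapReduceTask
    // leaves behind; the real fix removes the intermediate keys after each task.
    static void clearIntermediateCache(EmbeddedCacheManager cacheManager) {
        Cache<Object, Object> tmpCache =
                cacheManager.getCache(MapReduceTask.DEFAULT_TMP_CACHE_CONFIGURATION_NAME);
        tmpCache.clear();
    }
}
{code}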
> MapReduceTask: memory leak with useIntermediateSharedCache = true
> -----------------------------------------------------------------
>
> Key: ISPN-4471
> URL: https://issues.jboss.org/browse/ISPN-4471
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public (Everyone can see)
> Components: Distributed Execution and Map/Reduce
> Affects Versions: 6.0.2.Final
> Reporter: Rich DiCroce
> Assignee: Vladimir Blagojevic
>
> When using an intermediate shared cache for the reduce phase, MapReduceTask puts the entries into the cache with no expiration and apparently never removes them. This eventually results in OutOfMemoryErrors.
> One workaround is to disable use of the intermediate shared cache, so that a new cache is created and destroyed for every task; that "fixes" the problem of not removing intermediate values, but it causes a ton of log spam:
> {noformat}
> 2014-07-02 11:55:10,014 INFO [org.infinispan.jmx.CacheJmxRegistration] (transport-thread-21) ISPN000031: MBeans were successfully registered to the platform MBean server.
> 2014-07-02 11:55:10,016 INFO [org.jboss.as.clustering.infinispan] (transport-thread-21) JBAS010281: Started e71dddc0-60ce-4cb9-ac8c-615d60866393 cache from GamingPortal container
> 2014-07-02 11:55:10,023 INFO [org.infinispan.jmx.CacheJmxRegistration] (transport-thread-5) ISPN000031: MBeans were successfully registered to the platform MBean server.
> 2014-07-02 11:55:10,024 INFO [org.infinispan.jmx.CacheJmxRegistration] (transport-thread-4) ISPN000031: MBeans were successfully registered to the platform MBean server.
> 2014-07-02 11:55:10,025 INFO [org.jboss.as.clustering.infinispan] (transport-thread-5) JBAS010281: Started 22d387d6-69c6-48b2-9701-ea64c08d66ad cache from GamingPortal container
> 2014-07-02 11:55:10,026 INFO [org.jboss.as.clustering.infinispan] (transport-thread-4) JBAS010281: Started bfaf92a0-a030-4624-93a7-0fee097415d7 cache from NMS container
> 2014-07-02 11:55:10,037 INFO [org.jboss.as.clustering.infinispan] (EJB default - 2) JBAS010282: Stopped 22d387d6-69c6-48b2-9701-ea64c08d66ad cache from GamingPortal container
> 2014-07-02 11:55:10,040 INFO [org.jboss.as.clustering.infinispan] (EJB default - 1) JBAS010282: Stopped bfaf92a0-a030-4624-93a7-0fee097415d7 cache from NMS container
> 2014-07-02 11:55:10,047 INFO [org.jboss.as.clustering.infinispan] (EJB default - 6) JBAS010282: Stopped e71dddc0-60ce-4cb9-ac8c-615d60866393 cache from GamingPortal container
> 2014-07-02 11:55:10,047 INFO [org.infinispan.jmx.CacheJmxRegistration] (transport-thread-0) ISPN000031: MBeans were successfully registered to the platform MBean server.
> 2014-07-02 11:55:10,048 INFO [org.jboss.as.clustering.infinispan] (transport-thread-0) JBAS010281: Started bed74bd3-a227-43e0-b262-62c19dd444a7 cache from GamingPortal container
> 2014-07-02 11:55:10,052 INFO [org.jboss.as.clustering.infinispan] (EJB default - 2) JBAS010282: Stopped bed74bd3-a227-43e0-b262-62c19dd444a7 cache from GamingPortal container
> 2014-07-02 11:55:10,063 INFO [org.infinispan.jmx.CacheJmxRegistration] (transport-thread-7) ISPN000031: MBeans were successfully registered to the platform MBean server.
> 2014-07-02 11:55:10,064 INFO [org.jboss.as.clustering.infinispan] (transport-thread-7) JBAS010281: Started 63cce570-0169-40c2-bc9f-e045c2864702 cache from GamingPortal container
> 2014-07-02 11:55:10,068 INFO [org.jboss.as.clustering.infinispan] (EJB default - 2) JBAS010282: Stopped 63cce570-0169-40c2-bc9f-e045c2864702 cache from GamingPortal container
> 2014-07-02 11:55:10,072 INFO [org.infinispan.jmx.CacheJmxRegistration] (transport-thread-19) ISPN000031: MBeans were successfully registered to the platform MBean server.
> 2014-07-02 11:55:10,073 INFO [org.jboss.as.clustering.infinispan] (transport-thread-19) JBAS010281: Started 83f7b355-d4c6-4a0a-aade-ce2509293d77 cache from GamingPortal container
> 2014-07-02 11:55:10,077 INFO [org.jboss.as.clustering.infinispan] (EJB default - 2) JBAS010282: Stopped 83f7b355-d4c6-4a0a-aade-ce2509293d77 cache from GamingPortal container
> {noformat}
> I also observed one NullPointerException with distributeReducePhase = true and useIntermediateSharedCache = false. This could be related to ISPN-4460, but I'm not sure.
> {noformat}
> Caused by: org.infinispan.commons.CacheException: java.util.concurrent.ExecutionException: org.infinispan.commons.CacheException: java.lang.NullPointerException
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:348) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:634) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$3.call(MapReduceTask.java:652) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapReduceTaskFuture.get(MapReduceTask.java:760) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> ... 63 more
> Caused by: java.util.concurrent.ExecutionException: org.infinispan.commons.CacheException: java.lang.NullPointerException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) [rt.jar:1.7.0_45]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) [rt.jar:1.7.0_45]
> at org.infinispan.distexec.mapreduce.MapReduceTask$TaskPart.get(MapReduceTask.java:845) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask.executeMapPhase(MapReduceTask.java:439) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask.execute(MapReduceTask.java:342) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> ... 66 more
> Caused by: org.infinispan.commons.CacheException: java.lang.NullPointerException
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.mapAndCombineForDistributedReduction(MapReduceManagerImpl.java:100) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart.invokeMapCombineLocally(MapReduceTask.java:967) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart.access$200(MapReduceTask.java:894) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart$1.call(MapReduceTask.java:916) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceTask$MapTaskPart$1.call(MapReduceTask.java:912) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_45]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_45]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_45]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_45]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_45]
> at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_45]
> Caused by: java.lang.NullPointerException
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.mapKeysToNodes(MapReduceManagerImpl.java:355) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.migrateIntermediateKeys(MapReduceManagerImpl.java:264) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.combine(MapReduceManagerImpl.java:258) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.mapAndCombineForDistributedReduction(MapReduceManagerImpl.java:98) [infinispan-core-6.0.2.Final.jar:6.0.2.Final]
> ... 10 more
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4478) ServerFailureRetrySingleOwnerTest can have issues creating cache managers
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-4478?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-4478:
-----------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/2692
> ServerFailureRetrySingleOwnerTest can have issues creating cache managers
> -------------------------------------------------------------------------
>
> Key: ISPN-4478
> URL: https://issues.jboss.org/browse/ISPN-4478
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public (Everyone can see)
> Components: Remote Protocols
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Labels: testsuite_stability
> Fix For: 7.0.0.Alpha5
>
>
> {code}
> java.lang.RuntimeException: Timed out before caches had complete views. Expected 3 members in each view. Views are as follows: [[ServerFailureRetrySingleOwnerTest-NodeD-62958, ServerFailureRetrySingleOwnerTest-NodeE-52896], [ServerFailureRetrySingleOwnerTest-NodeE-52896, ServerFailureRetrySingleOwnerTest-NodeF-14197], [ServerFailureRetrySingleOwnerTest-NodeE-52896, ServerFailureRetrySingleOwnerTest-NodeF-14197]]
> at org.infinispan.test.TestingUtil.viewsTimedOut(TestingUtil.java:268)
> at org.infinispan.test.TestingUtil.viewsTimedOut(TestingUtil.java:258)
> at org.infinispan.test.TestingUtil.blockUntilViewsReceived(TestingUtil.java:250)
> at org.infinispan.test.TestingUtil.blockUntilViewsReceived(TestingUtil.java:291)
> at org.infinispan.test.TestingUtil.blockUntilViewsReceived(TestingUtil.java:922)
> at org.infinispan.test.MultipleCacheManagersTest.waitForClusterToForm(MultipleCacheManagersTest.java:226)
> at org.infinispan.test.MultipleCacheManagersTest.waitForClusterToForm(MultipleCacheManagersTest.java:233)
> at org.infinispan.client.hotrod.retry.AbstractRetryTest.createCacheManagers(AbstractRetryTest.java:63)
> at org.infinispan.test.MultipleCacheManagersTest.callCreateCacheManagers(MultipleCacheManagersTest.java:70)
> at org.infinispan.test.MultipleCacheManagersTest.createBeforeMethod(MultipleCacheManagersTest.java:80)
> at org.infinispan.client.hotrod.HitsAwareCacheManagersTest.createBeforeMethod(HitsAwareCacheManagersTest.java:35)
> {code}
> In the logs, you see messages like this:
> {code}
> [22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 0 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
> [22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 1 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
> [22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 2 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
> [22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 3 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
> [22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 4 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
> {code}
> The fact that a cache manager is created after each method might be causing issues here.
> According to the logs, ServerFailureRetrySingleOwnerTest.testRetryPutIfAbsent runs fine, and the issue appears the next time createBeforeMethod is called.
> By creating the cache managers a single time, on test class startup, we should be able to speed up execution too.
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4478) ServerFailureRetrySingleOwnerTest can have issues creating cache managers
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-4478?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño commented on ISPN-4478:
----------------------------------------
Actually, once I removed clearing after each test method, the bug showed up :)
We add a listener via:
{code}
server.getCacheManager().getCache().addListener(listener);
{code}
But remove it via:
{code}
server.getCacheManager().removeListener(listener);
{code}
which does nothing because the listener is a cache listener, not a cache manager listener.
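A sketch of the matching removal, using the same server and listener references as above, so the listener is removed from the cache it was registered on:
{code}
// Remove the listener from the cache it was added to, not from the cache manager:
server.getCacheManager().getCache().removeListener(listener);
{code}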
> ServerFailureRetrySingleOwnerTest can have issues creating cache managers
> -------------------------------------------------------------------------
>
> Key: ISPN-4478
> URL: https://issues.jboss.org/browse/ISPN-4478
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public (Everyone can see)
> Components: Remote Protocols
> Reporter: Galder Zamarreño
> Assignee: Galder Zamarreño
> Labels: testsuite_stability
> Fix For: 7.0.0.Alpha5
>
>
> {code}
> java.lang.RuntimeException: Timed out before caches had complete views. Expected 3 members in each view. Views are as follows: [[ServerFailureRetrySingleOwnerTest-NodeD-62958, ServerFailureRetrySingleOwnerTest-NodeE-52896], [ServerFailureRetrySingleOwnerTest-NodeE-52896, ServerFailureRetrySingleOwnerTest-NodeF-14197], [ServerFailureRetrySingleOwnerTest-NodeE-52896, ServerFailureRetrySingleOwnerTest-NodeF-14197]]
> at org.infinispan.test.TestingUtil.viewsTimedOut(TestingUtil.java:268)
> at org.infinispan.test.TestingUtil.viewsTimedOut(TestingUtil.java:258)
> at org.infinispan.test.TestingUtil.blockUntilViewsReceived(TestingUtil.java:250)
> at org.infinispan.test.TestingUtil.blockUntilViewsReceived(TestingUtil.java:291)
> at org.infinispan.test.TestingUtil.blockUntilViewsReceived(TestingUtil.java:922)
> at org.infinispan.test.MultipleCacheManagersTest.waitForClusterToForm(MultipleCacheManagersTest.java:226)
> at org.infinispan.test.MultipleCacheManagersTest.waitForClusterToForm(MultipleCacheManagersTest.java:233)
> at org.infinispan.client.hotrod.retry.AbstractRetryTest.createCacheManagers(AbstractRetryTest.java:63)
> at org.infinispan.test.MultipleCacheManagersTest.callCreateCacheManagers(MultipleCacheManagersTest.java:70)
> at org.infinispan.test.MultipleCacheManagersTest.createBeforeMethod(MultipleCacheManagersTest.java:80)
> at org.infinispan.client.hotrod.HitsAwareCacheManagersTest.createBeforeMethod(HitsAwareCacheManagersTest.java:35)
> {code}
> In the logs, you see messages like this:
> {code}
> [22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 0 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
> [22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 1 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
> [22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 2 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
> [22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 3 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
> [22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 4 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
> {code}
> The fact that a cache manager is created after each method might be causing issues here.
> According to the logs, ServerFailureRetrySingleOwnerTest.testRetryPutIfAbsent runs fine, and the issue appears the next time createBeforeMethod is called.
> By creating the cache managers a single time, on test class startup, we should be able to speed up execution too.
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4478) ServerFailureRetrySingleOwnerTest can have issues creating cache managers
by Galder Zamarreño (JIRA)
Galder Zamarreño created ISPN-4478:
--------------------------------------
Summary: ServerFailureRetrySingleOwnerTest can have issues creating cache managers
Key: ISPN-4478
URL: https://issues.jboss.org/browse/ISPN-4478
Project: Infinispan
Issue Type: Bug
Security Level: Public (Everyone can see)
Components: Remote Protocols
Reporter: Galder Zamarreño
Assignee: Galder Zamarreño
Fix For: 7.0.0.Alpha5
{code}
java.lang.RuntimeException: Timed out before caches had complete views. Expected 3 members in each view. Views are as follows: [[ServerFailureRetrySingleOwnerTest-NodeD-62958, ServerFailureRetrySingleOwnerTest-NodeE-52896], [ServerFailureRetrySingleOwnerTest-NodeE-52896, ServerFailureRetrySingleOwnerTest-NodeF-14197], [ServerFailureRetrySingleOwnerTest-NodeE-52896, ServerFailureRetrySingleOwnerTest-NodeF-14197]]
at org.infinispan.test.TestingUtil.viewsTimedOut(TestingUtil.java:268)
at org.infinispan.test.TestingUtil.viewsTimedOut(TestingUtil.java:258)
at org.infinispan.test.TestingUtil.blockUntilViewsReceived(TestingUtil.java:250)
at org.infinispan.test.TestingUtil.blockUntilViewsReceived(TestingUtil.java:291)
at org.infinispan.test.TestingUtil.blockUntilViewsReceived(TestingUtil.java:922)
at org.infinispan.test.MultipleCacheManagersTest.waitForClusterToForm(MultipleCacheManagersTest.java:226)
at org.infinispan.test.MultipleCacheManagersTest.waitForClusterToForm(MultipleCacheManagersTest.java:233)
at org.infinispan.client.hotrod.retry.AbstractRetryTest.createCacheManagers(AbstractRetryTest.java:63)
at org.infinispan.test.MultipleCacheManagersTest.callCreateCacheManagers(MultipleCacheManagersTest.java:70)
at org.infinispan.test.MultipleCacheManagersTest.createBeforeMethod(MultipleCacheManagersTest.java:80)
at org.infinispan.client.hotrod.HitsAwareCacheManagersTest.createBeforeMethod(HitsAwareCacheManagersTest.java:35)
{code}
In the logs, you see messages like this:
{code}
[22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 0 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
[22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 1 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
[22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 2 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
[22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 3 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
[22:43:33] : [org.infinispan:infinispan-client-hotrod] 22:43:33,977 ERROR [StateConsumerImpl] (transport-thread-ServerFailureRetrySingleOwnerTest-NodeE-p848-t1:) ISPN000208: No live owners found for segment 4 of cache __cluster_registry_cache__. Current owners are: [ServerFailureRetrySingleOwnerTest-NodeD-62958]. Faulty owners: [ServerFailureRetrySingleOwnerTest-NodeD-62958]
{code}
The fact that a cache manager is created after each method might be causing issues here.
According to the logs, ServerFailureRetrySingleOwnerTest.testRetryPutIfAbsent runs fine, and the issue appears the next time createBeforeMethod is called.
By creating the cache managers a single time, on test class startup, we should be able to speed up execution too.
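As an illustration only (the real Infinispan test harness has its own setup and cleanup hooks, which this sketch does not use), the TestNG pattern behind that suggestion is to move creation from a per-method hook to a per-class one:
{code}
import org.testng.annotations.AfterClass;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;

public class CreateOncePerClassSketch {
    // Hypothetical stand-in for the cache managers created by the test class.
    private Object cacheManagers;

    @BeforeClass
    public void createCacheManagers() {
        // Created once for the whole class instead of in a @BeforeMethod hook.
        cacheManagers = new Object();
    }

    @AfterClass(alwaysRun = true)
    public void destroyCacheManagers() {
        cacheManagers = null;
    }

    @Test
    public void testSomething() {
        assert cacheManagers != null;
    }
}
{code}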
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)
[JBoss JIRA] (ISPN-4447) RHQ server plugin: dumpkeys operation on cache fails as no operation (old node)
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4447?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4447:
-----------------------------------------------
Tomas Sykora <tsykora(a)redhat.com> changed the Status of [bug 1113647|https://bugzilla.redhat.com/show_bug.cgi?id=1113647] from NEW to CLOSED
> RHQ server plugin: dumpkeys operation on cache fails as no operation (old node)
> --------------------------------------------------------------------------------
>
> Key: ISPN-4447
> URL: https://issues.jboss.org/browse/ISPN-4447
> Project: Infinispan
> Issue Type: Bug
> Security Level: Public (Everyone can see)
> Components: JMX, reporting and management
> Affects Versions: 7.0.0.Alpha4
> Reporter: Tomas Sykora
> Assignee: Mircea Markus
> Labels: rhq
>
> The RHQ server plugin provides the ability to issue 3 vital rolling upgrade operations.
> However, the first operation, record-known-global-keyset, fails when invoked from the RHQ GUI on a particular cache of the old server with this error:
> {noformat}
> java.lang.Exception: JBAS014884: No operation named 'record-known-global-keyset' exists at address [
> ("subsystem" => "infinispan"),
> ("cache-container" => "local"),
> ("local-cache" => "default")
> ], rolled-back=true
> at org.rhq.core.pc.operation.OperationInvocation.run(OperationInvocation.java:278)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:722)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.2.6#6264)