[JBoss JIRA] (ISPN-3035) Members can re-appear by itself in the consistent hash after leaving
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3035?page=com.atlassian.jira.plugin.... ]
Mircea Markus commented on ISPN-3035:
-------------------------------------
[~dan.berindei] What's the impact on the state? I imagine a new state transfer is initiated but fails and is rolled back, as all communication with C is dropped.
> Members can re-appear by itself in the consistent hash after leaving
> --------------------------------------------------------------------
>
> Key: ISPN-3035
> URL: https://issues.jboss.org/browse/ISPN-3035
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.2.5.Final, 5.3.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 6.0.0.Final
>
> Attachments: dret.log, dret2.log
>
>
> Seen as an intermittent failure in DataRehashedEventTest:
> {noformat}
> 2013-04-23 14:07:45,459 DEBUG (testng-DataRehashedEventTest) [org.infinispan.manager.DefaultCacheManager] Stopping cache manager ISPN on NodeC-58711
> 2013-04-23 14:07:45,468 INFO (testng-DataRehashedEventTest) [org.infinispan.remoting.transport.jgroups.JGroupsTransport] ISPN000080: Disconnecting and closing JGroups Channel
> 2013-04-23 14:07:46,469 DEBUG (testng-DataRehashedEventTest) [org.jgroups.protocols.pbcast.GMS] NodeC-58711: sending LEAVE request to NodeA-28008
> 2013-04-23 14:07:46,489 DEBUG (Incoming-2,ISPN,NodeA-28008) [org.jgroups.protocols.pbcast.GMS] NodeA-28008: installing [NodeA-28008|4] [NodeA-28008, NodeB-46156, NodeC-58711]
> 2013-04-23 14:07:46,491 DEBUG (asyncTransportThread-0,NodeA) [org.infinispan.topology.ClusterTopologyManagerImpl] Starting cluster-wide rebalance for cache ___defaultcache, topology = CacheTopology{id=8, currentCH=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeA-28008, NodeB-46156]}, pendingCH=DefaultConsistentHash{numSegments=60, numOwners=2, members=[NodeA-28008, NodeB-46156, NodeC-58711]}}
> 2013-04-23 14:07:49,493 ERROR (testng-DataRehashedEventTest) [org.infinispan.test.fwk.UnitTestTestNGListener] Test testJoinAndLeave(org.infinispan.statetransfer.DataRehashedEventTest) failed.
> java.lang.AssertionError: expected [2] but found [6]
> at org.testng.Assert.fail(Assert.java:94)
> at org.testng.Assert.failNotEquals(Assert.java:494)
> at org.testng.Assert.assertEquals(Assert.java:123)
> at org.testng.Assert.assertEquals(Assert.java:370)
> at org.testng.Assert.assertEquals(Assert.java:380)
> at org.infinispan.statetransfer.DataRehashedEventTest.testJoinAndLeave(DataRehashedEventTest.java:114)
> {noformat}
> The initial cluster has 3 nodes: A, B, C. C is killed, but somehow remains in the ClusterCacheStatus on the coordinator.
> Then C re-appears in the JGroups view (possibly a JGroups issue). The problem in Infinispan is that the coordinator now sees C as a joiner, and it rebalances the cache to include C in the consistent hash again.
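One way to picture the distinction the description draws between "present in the JGroups view" and "member of the cache" is the minimal sketch below. It is an assumption-laden illustration, not Infinispan's ClusterCacheStatus and not the fix applied for this issue: the class CacheMembershipSketch and its methods are hypothetical names. The idea is that only nodes that explicitly joined the cache, and have not left since, are eligible for the pending consistent hash, so a node that merely re-appears in the view is not treated as a joiner.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not Infinispan's ClusterCacheStatus. */
final class CacheMembershipSketch {
    // Nodes that explicitly asked to join this cache and have not left since.
    private final Set<String> joined = new LinkedHashSet<>();

    void join(String node) {
        joined.add(node);
    }

    void leave(String node) {
        joined.remove(node);
    }

    /** Members eligible for the consistent hash: in the view AND explicitly joined. */
    List<String> rebalanceTargets(List<String> viewMembers) {
        List<String> targets = new ArrayList<>();
        for (String node : viewMembers) {
            if (joined.contains(node)) {
                targets.add(node);
            }
        }
        return targets;
    }
}
```

With A, B and C joined and C then removed via leave, a view of [A, B, C] yields [A, B] as rebalance targets; C is included again only after a fresh join call.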
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (ISPN-2802) Cache recovery fails due to missing responses
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2802?page=com.atlassian.jira.plugin.... ]
Mircea Markus resolved ISPN-2802.
---------------------------------
Resolution: Cannot Reproduce Bug
as suggested by Radim.
> Cache recovery fails due to missing responses
> ---------------------------------------------
>
> Key: ISPN-2802
> URL: https://issues.jboss.org/browse/ISPN-2802
> Project: Infinispan
> Issue Type: Bug
> Components: State transfer
> Affects Versions: 5.2.0.CR3
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Fix For: 6.0.0.Alpha1
>
>
> When the cache recovery is started, the new coordinator sends CacheTopologyControlCommand.GET_STATUS to all nodes and waits for responses. However, I have a reproducible test-case where it always times out waiting for the responses.
> Here are the logs (TRACE is not doable here, but I added some byteman traces - see topology.btm in the archive): http://dl.dropbox.com/u/103079234/recovery.zip
> The problematic spot is on node3 at 05:37:57 receiving cluster view 34.
> All nodes (except the one which is killed, in this case node1) respond quickly to the GET_STATUS command (see BYTEMAN Receiving - Received pairs, these are bound to command execution in CommandAwareRpcDispatcher), but some responses are not received on node3 (look for Receiving rsp bound to GroupRequest).
> JGroups tracing could be useful here but it is not available (intensive logging often blocks on internal log4j locks and the node becomes unresponsive).
> As mentioned above, the case is reproducible, therefore if you can suggest any particular BYTEMAN hook, I can try it.
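When diagnosing a timeout like the one described above, it can help to record which nodes' responses never arrived rather than only that the wait timed out. The following is a minimal sketch using plain java.util.concurrent; it does not use JGroups' GroupRequest or any Infinispan API, and StatusCollectorSketch and missingResponders are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; not part of Infinispan or JGroups. */
final class StatusCollectorSketch {
    /** Returns the nodes whose response is absent or failed once the timeout has elapsed. */
    static List<String> missingResponders(Map<String, CompletableFuture<String>> pending,
                                          long timeoutMillis) {
        CompletableFuture<Void> all =
                CompletableFuture.allOf(pending.values().toArray(new CompletableFuture[0]));
        try {
            // Wait for every response, but never longer than the shared timeout.
            all.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Fall through: the per-node check below identifies the culprits.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, CompletableFuture<String>> entry : pending.entrySet()) {
            CompletableFuture<String> response = entry.getValue();
            if (!response.isDone() || response.isCompletedExceptionally()) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }
}
```

Logging the returned list at the point where the wait gives up would narrow the search to the specific senders whose responses are lost, without enabling full JGroups tracing.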
[JBoss JIRA] (ISPN-2735) The global component registry sometimes fails to start components injected on the fly
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2735?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2735:
--------------------------------
Fix Version/s: (was: 6.0.0.Alpha1)
> The global component registry sometimes fails to start components injected on the fly
> -------------------------------------------------------------------------------------
>
> Key: ISPN-2735
> URL: https://issues.jboss.org/browse/ISPN-2735
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency
> Affects Versions: 5.2.0.Beta1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 6.0.0.Final
>
>
> If a global component is used by an incoming command, it could be created before the component registry is in the "running" state, yet after it has started invoking the components' start methods.
> If this happens, the component's start method(s) will never be invoked - neither "inline", in ACR.registerComponentInternal, nor in the regular startup procedure (ACR.internalStart).
> Specifically, the problem occurred with LocalTopologyManager, which is injected in incoming CacheTopologyControlCommands. Since these commands are broadcast to the entire cluster, a node can receive them before it has finished starting its global component registry.
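The race described above can be sketched with a toy registry. This is a simplified illustration under stated assumptions, not the AbstractComponentRegistry code: RegistrySketch, its State enum and its methods are hypothetical. The point it shows is that a component registered while startup is already in progress must still be started, which the sketch achieves by starting inline whenever the registry is past its initial state and by tracking how many components have been started so far.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy registry; not Infinispan's AbstractComponentRegistry. */
final class RegistrySketch {
    interface Component {
        void start();
    }

    private enum State { INSTANTIATED, STARTING, RUNNING }

    private final List<Component> components = new ArrayList<>();
    private int startedCount = 0; // components[0, startedCount) have been started
    private State state = State.INSTANTIATED;

    synchronized void register(Component component) {
        components.add(component);
        // Start inline as soon as startup has begun, not only once RUNNING.
        if (state != State.INSTANTIATED) {
            startPending();
        }
    }

    synchronized void start() {
        state = State.STARTING;
        startPending();
        state = State.RUNNING;
    }

    private void startPending() {
        // Index-based loop: tolerates components registered from within a start() callback.
        while (startedCount < components.size()) {
            components.get(startedCount++).start();
        }
    }
}
```

Holding the lock while invoking start() keeps the sketch short; a real registry would likely need finer-grained coordination to avoid blocking incoming commands for the whole startup.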
[JBoss JIRA] (ISPN-2735) The global component registry sometimes fails to start components injected on the fly
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2735?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2735:
--------------------------------
Fix Version/s: 6.0.0.Final
> The global component registry sometimes fails to start components injected on the fly
> -------------------------------------------------------------------------------------
>
> Key: ISPN-2735
> URL: https://issues.jboss.org/browse/ISPN-2735
> Project: Infinispan
> Issue Type: Bug
> Components: Locking and Concurrency
> Affects Versions: 5.2.0.Beta1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 6.0.0.Alpha1, 6.0.0.Final
>
>
> If a global component is used by an incoming command, it could be created before the component registry is in the "running" state, yet after it has started invoking the components' start methods.
> If this happens, the component's start method(s) will never be invoked - neither "inline", in ACR.registerComponentInternal, nor in the regular startup procedure (ACR.internalStart).
> Specifically, the problem occurred with LocalTopologyManager, which is injected in incoming CacheTopologyControlCommands. Since these commands are broadcast to the entire cluster, a node can receive them before it has finished starting its global component registry.
[JBoss JIRA] (ISPN-2709) Lib dir in distribution archive does not contain the proper versions for some dependencies
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2709?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2709:
--------------------------------
Fix Version/s: 6.0.0.Final
> Lib dir in distribution archive does not contain the proper versions for some dependencies
> -------------------------------------------------------------------------------------------
>
> Key: ISPN-2709
> URL: https://issues.jboss.org/browse/ISPN-2709
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.0.CR1
> Reporter: Adrian Nistor
> Assignee: Adrian Nistor
> Fix For: 6.0.0.Alpha1, 6.0.0.Final
>
> Attachments: actual.txt, should_be.txt
>
>
> Not all the jars referenced by runtime-classpath.txt files of modules are actually present in the lib dir. In some cases the jar is present but not the needed version. Some of the jars are there but are not actually used.
> This happens because the set of dependencies for runtime-classpath.txt is computed for each individual module, while the lib dir in the distro is created by the assembly plugin after 'merging' the dependencies of all modules, which means that only the highest version will be included. Also, the Maven dependency plugin is known to miss some dependencies.
> To avoid the version problem we could define globally a single version for each of these dependencies in parent pom dependencyManagement and also explicitly add the dependency in the respective modules.
[JBoss JIRA] (ISPN-2709) Lib dir in distribution archive does not contain the proper versions for some dependencies
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-2709?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-2709:
--------------------------------
Assignee: Tristan Tarrant (was: Adrian Nistor)
> Lib dir in distribution archive does not contain the proper versions for some dependencies
> -------------------------------------------------------------------------------------------
>
> Key: ISPN-2709
> URL: https://issues.jboss.org/browse/ISPN-2709
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.0.CR1
> Reporter: Adrian Nistor
> Assignee: Tristan Tarrant
> Fix For: 6.0.0.Alpha1, 6.0.0.Final
>
> Attachments: actual.txt, should_be.txt
>
>
> Not all the jars referenced by runtime-classpath.txt files of modules are actually present in the lib dir. In some cases the jar is present but not the needed version. Some of the jars are there but are not actually used.
> This happens because the set of dependencies for runtime-classpath.txt is computed for each individual module, while the lib dir in the distro is created by the assembly plugin after 'merging' the dependencies of all modules, which means that only the highest version will be included. Also, the Maven dependency plugin is known to miss some dependencies.
> To avoid the version problem we could define globally a single version for each of these dependencies in parent pom dependencyManagement and also explicitly add the dependency in the respective modules.