[
https://issues.jboss.org/browse/ISPN-2612?page=com.atlassian.jira.plugin....
]
Dan Berindei commented on ISPN-2612:
------------------------------------
[~mlinhard], can you post some TRACE logs? I have run the test on my machine but I
haven't noticed anything unusual.
[~an1310], I have tried looking at your logs, but you have way too many clusters running
and my eyes just glazed over (also, your truncated logs don't show any
TimeoutException).
Problem broadcasting CH_UPDATE command
--------------------------------------
Key: ISPN-2612
URL:
https://issues.jboss.org/browse/ISPN-2612
Project: Infinispan
Issue Type: Bug
Components: RPC
Affects Versions: 5.2.0.Beta5
Reporter: Michal Linhard
Assignee: Dan Berindei
Priority: Blocker
Fix For: 5.2.0.CR2
Attachments: session-cluster.xml, test.zip
Infinispan 5.2.0.Beta5
JGroups 3.2.4.Final
Steps to reproduce (I'm using two virtual interfaces test1, test2)
1. Start org.jboss.qa.jdg.Test with -Djgroups.udp.bind_addr=test1
-Djava.net.preferIPv4Stack=true
2. wait 10 sec
3. Start org.jboss.qa.jdg.Test with -Djgroups.udp.bind_addr=test2
-Djava.net.preferIPv4Stack=true
After 5 seconds there should be this timeout exception:
{code}
19:42:14,146 WARN [org.infinispan.topology.CacheTopologyControlCommand]
(OOB-2,mlinhard-work-37329) ISPN000071: Caught exception when handling command
CacheTopologyControlCommand{cache=___defaultcache, type=REBALANCE_CONFIRM,
sender=mlinhard-work-47337, joinInfo=null, topologyId=1, currentCH=null, pendingCH=null,
throwable=null, viewId=1}
java.util.concurrent.ExecutionException: org.infinispan.CacheException:
org.jgroups.TimeoutException: TimeoutException
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:232)
at java.util.concurrent.FutureTask.get(FutureTask.java:91)
at
org.infinispan.topology.ClusterTopologyManagerImpl.executeOnClusterSync(ClusterTopologyManagerImpl.java:563)
at
org.infinispan.topology.ClusterTopologyManagerImpl.broadcastConsistentHashUpdate(ClusterTopologyManagerImpl.java:349)
at
org.infinispan.topology.ClusterTopologyManagerImpl.handleRebalanceCompleted(ClusterTopologyManagerImpl.java:213)
at
org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:160)
at
org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:137)
at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:252)
at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:219)
at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:483)
at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:390)
at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:248)
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:598)
at org.jgroups.JChannel.up(JChannel.java:703)
at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020)
at org.jgroups.protocols.RSVP.up(RSVP.java:172)
at org.jgroups.protocols.FRAG2.up(FRAG2.java:181)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:418)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:400)
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896)
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:244)
at org.jgroups.protocols.UNICAST2.handleDataReceived(UNICAST2.java:736)
at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:414)
at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:606)
at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:143)
at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:187)
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288)
at org.jgroups.protocols.MERGE2.up(MERGE2.java:205)
at org.jgroups.protocols.Discovery.up(Discovery.java:359)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1287)
at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1850)
at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1823)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.infinispan.CacheException: org.jgroups.TimeoutException: TimeoutException
at org.infinispan.util.Util.rewrapAsCacheException(Util.java:532)
at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommands(CommandAwareRpcDispatcher.java:152)
at
org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:518)
at
org.infinispan.topology.ClusterTopologyManagerImpl$2.call(ClusterTopologyManagerImpl.java:545)
at
org.infinispan.topology.ClusterTopologyManagerImpl$2.call(ClusterTopologyManagerImpl.java:542)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
... 3 more
Caused by: org.jgroups.TimeoutException: TimeoutException
at org.jgroups.util.Promise._getResultWithTimeout(Promise.java:145)
at org.jgroups.util.Promise.getResultWithTimeout(Promise.java:40)
at org.jgroups.util.AckCollector.waitForAllAcks(AckCollector.java:93)
at org.jgroups.protocols.RSVP$Entry.block(RSVP.java:287)
at org.jgroups.protocols.RSVP.down(RSVP.java:118)
at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:1025)
at org.jgroups.JChannel.down(JChannel.java:718)
at
org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:616)
at org.jgroups.blocks.RequestCorrelator.sendRequest(RequestCorrelator.java:173)
at org.jgroups.blocks.GroupRequest.sendRequest(GroupRequest.java:360)
at org.jgroups.blocks.GroupRequest.sendRequest(GroupRequest.java:103)
at org.jgroups.blocks.Request.execute(Request.java:83)
at org.jgroups.blocks.MessageDispatcher.cast(MessageDispatcher.java:335)
at org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:249)
at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processCalls(CommandAwareRpcDispatcher.java:330)
at
org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommands(CommandAwareRpcDispatcher.java:145)
... 8 more
{code}
Analysis:
These are the messages sent after view change:
{code}
test1 test2
<--- JOIN ----
---- REBALANCE_START --->
<--- StateRequestCommand ----
---- StateResponseCommand --->
<--- REBALANCE_CONFIRM ----
---- CH_UPDATE --->
{code}
The last CH_UPDATE message is broadcast, test2 successfully processes it, but test1 stays
in waiting state, because it for some reason awaits response also from itself - local
variable entry in the method RSVP.down
(
https://github.com/belaban/JGroups/blob/master/src/org/jgroups/protocols/...)
contained local address.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
http://www.atlassian.com/software/jira