[ https://issues.jboss.org/browse/JGRP-1545?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on JGRP-1545:
------------------------------------
This happens in Infinispan when we start a replicated cache on one node, start another
thread that runs transactions on that cache, and start a second node in parallel.
Normally replicated-mode commands are broadcast to the entire cluster, so they would use
NAKACK2. But because of an optimization in Infinispan, when the cluster size is 2 we send
even broadcast requests as unicasts. Thus it's possible that a command is sent to a
joiner via unicast after the view has been updated on the coordinator but before it has
been updated on the joiner.
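A minimal sketch of the optimization described above, only to make the unicast path concrete. This is not Infinispan code: the class `BroadcastTarget`, the method `choose`, and the use of plain strings as member addresses are all assumptions for illustration. It relies on the JGroups convention that a null destination means "send to all members".

```java
import java.util.List;

// Hypothetical sketch, not Infinispan code: picks the destination for a
// replicated-mode command.
final class BroadcastTarget {
    // Returns null for "all members" (a null destination addresses the whole
    // cluster in JGroups), or the single peer when the cluster has exactly
    // two members, which turns the broadcast into a unicast.
    static String choose(List<String> members, String self) {
        if (members.size() == 2) {
            return members.get(0).equals(self) ? members.get(1) : members.get(0);
        }
        return null;
    }
}
```

With a view of [A, B] on the coordinator A, the command goes to B as a unicast, even if B has not installed that view yet.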
UNICAST2: message sent before channel is connected leads to exception
---------------------------------------------------------------------
Key: JGRP-1545
URL: https://issues.jboss.org/browse/JGRP-1545
Project: JGroups
Issue Type: Bug
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 3.3
When a *unicast* request is received before the first view is received (which sets the
channel to connected), and a response to the request is sent, then the response sending
throws an exception as the channel hasn't yet been connected (see stack trace below).
Reason:
What happens when a channel is created is:
- JChannel.connect() is called by the user
- As part of this, start() is called in every protocol
- UDP.start() creates the sockets and starts the socket listener which will receive
messages from now on and pass them up
- connect() now sends a JOIN request to the coordinator and receives the JOIN response or
becomes a singleton member
- In either case, a new view is installed, which sets the channel to connected
If a *unicast* message is received between starting the socket listener and setting the
channel to connected, and a response to it is sent, you'll get the behavior
below.
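The window can be modeled with a small sketch. This is not JGroups code: `StartupWindowModel` and its methods are hypothetical names, and the two flags only stand in for "socket listener started" and "first view installed".

```java
// Hypothetical model of the startup window; names are illustrative and not
// part of the JGroups API.
final class StartupWindowModel {
    private boolean listening;  // socket listener started during start()
    private boolean connected;  // first view installed

    void start() { listening = true; }

    void installFirstView() { connected = true; }

    // Stands in for the check that rejects sends on an unconnected channel.
    void send(String response) {
        if (!connected) {
            throw new IllegalStateException("channel is not connected");
        }
    }

    // A unicast request arrives and the handler replies immediately.
    // Returns true if the reply could be sent.
    boolean receiveRequest(String request) {
        if (!listening) {
            return false; // nothing is received before start()
        }
        try {
            send("reply to " + request);
            return true;
        } catch (IllegalStateException e) {
            return false; // request arrived inside the window
        }
    }
}
```

A request handled after start() but before installFirstView() fails to reply; the same request handled after the first view succeeds.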
In NAKACK2, this doesn't happen, as all messages received in the same time frame are
queued and replayed when the channel becomes connected.
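A rough sketch of the queue-and-replay idea, assuming a single gate in front of message delivery. It is not the NAKACK2 implementation: `PreConnectQueue`, `onMessage` and `onConnected` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of JGroups: buffers messages that arrive
// before the first view and replays them once the channel is connected.
final class PreConnectQueue<M> {
    private final List<M> pending = new ArrayList<>();
    private final Consumer<M> deliver;
    private boolean connected;

    PreConnectQueue(Consumer<M> deliver) {
        this.deliver = deliver;
    }

    // Called for every incoming message. Delivery happens under the lock so
    // that a new message cannot overtake queued ones during the replay.
    synchronized void onMessage(M msg) {
        if (!connected) {
            pending.add(msg);
            return;
        }
        deliver.accept(msg);
    }

    // Called once the first view has been installed.
    synchronized void onConnected() {
        connected = true;
        for (M msg : pending) {
            deliver.accept(msg);
        }
        pending.clear();
    }
}
```

Delivering under the lock keeps arrival order at the cost of serializing delivery; a real protocol would likely want something finer-grained.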
I could implement the same behavior for UNICAST2, but I'd like to know more about how
you got into this situation. What were the steps executed?
The only way I can think of that this could happen is that the coordinator sent a unicast
message to P right after it sent the JOIN response to P, e.g. in the view change which
includes P. Does this happen in Infinispan code?
{quote}
13:29:45,467 ERROR (OOB-1,ISPN,NodeB-50197:) [UNICAST2] couldn't deliver OOB message
[dst: NodeB-50197, src: NodeA-40526 (3 headers), size=118 bytes, flags=OOB|DONT_BUNDLE]
java.lang.IllegalStateException: channel is not connected
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:621)
at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:535)
at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:390)
at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:248)
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:604)
at org.jgroups.JChannel.up(JChannel.java:688)
at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020)
at org.jgroups.protocols.FRAG2.up(FRAG2.java:181)
at org.jgroups.protocols.FC.up(FC.java:479)
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896)
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:244)
at org.jgroups.protocols.UNICAST2.handleDataReceived(UNICAST2.java:736)
at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:414)
at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:606)
at org.jgroups.protocols.MERGE2.up(MERGE2.java:205)
at org.jgroups.protocols.Discovery.up(Discovery.java:359)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1294)
at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1857)
at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1830)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
{quote}