TIMED_WAITING loop prevents message propagation to cluster due to FC blockage
-----------------------------------------------------------------------------
Key: JGRP-1334
URL: https://issues.jboss.org/browse/JGRP-1334
Project: JGroups
Issue Type: Bug
Environment: Sparc Solaris 10
Reporter: Simone Scarduzio
Assignee: Bela Ban
Labels: FC, credits
We cannot send messages to the cluster because a node is waiting for FC credits that
never arrive.
The JGroups sender thread stays constantly in a TIMED_WAITING loop because it does not
have enough credits to send any bytes.
We have been able to reproduce the issue, which occurs after some days of activity, but
the actual triggering event is unclear and no clear test case is available.
The cluster consists of about 25 nodes, and we're using Oracle JDK 1.6.0_24. No GC time
longer than 3.6s was observed on any cluster node while the problem was occurring. Still,
we see a lot of VERIFY_SUSPECT entries in the logs, mentioning many different node
addresses.
The following stack trace shows the blockage:
"QuotaDistributor" daemon prio=3 tid=0x00a9cc00 nid=0x71 waiting on condition [0x63f1f000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x7b8392e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2116)
    at org.jgroups.protocols.FC.handleDownMessage(FC.java:550)
    at org.jgroups.protocols.FC.down(FC.java:424)
    at org.jgroups.protocols.FRAG2.down(FRAG2.java:154)
    at org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:216)
    at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:836)
    at org.jgroups.JChannel.down(JChannel.java:1626)
    at org.jgroups.JChannel.send(JChannel.java:721)
    at org.jgroups.JChannel.send(JChannel.java:737)
    at com.firsthop.common.platform.jgroups.JGroupsManager.send(JGroupsManager.java:193)
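To illustrate the kind of wait the trace above points at, here is a minimal sketch of a
credit-based sender that parks in a timed wait on a Condition until credits are
replenished. It is NOT the JGroups FC implementation: the class CreditSketch and its
methods trySend and replenish are hypothetical names made up for this illustration, and
the sketch gives up after a deadline so that it terminates, whereas the thread in our
report keeps waiting. If the replenishment never arrives, a dump taken during the wait
would show the sender in TIMED_WAITING (parking), as above.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified credit-based sender; not the JGroups FC code. */
public class CreditSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition creditsAvailable = lock.newCondition();
    private long credits; // bytes we are still allowed to send

    public CreditSketch(long initialCredits) {
        this.credits = initialCredits;
    }

    /**
     * Waits (in timed parks) until 'length' credits are available or the
     * deadline passes. Returns true if the credits were consumed.
     */
    public boolean trySend(int length, long maxWaitMillis) throws InterruptedException {
        long remainingNanos = TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        lock.lock();
        try {
            while (credits < length) {
                if (remainingNanos <= 0) {
                    return false; // no replenishment arrived in time
                }
                // The thread is TIMED_WAITING (parking) while inside this call.
                remainingNanos = creditsAvailable.awaitNanos(remainingNanos);
            }
            credits -= length;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Called when a receiver grants more credits; wakes up blocked senders. */
    public void replenish(long amount) {
        lock.lock();
        try {
            credits += amount;
            creditsAvailable.signalAll();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CreditSketch fc = new CreditSketch(100);
        System.out.println(fc.trySend(60, 50)); // true: 100 credits available
        System.out.println(fc.trySend(60, 50)); // false: only 40 left, nothing replenished
        fc.replenish(100);
        System.out.println(fc.trySend(60, 50)); // true: 140 credits available
    }
}
```

In this sketch a sender is only released by a call to replenish, so a replenishment that
is lost or never sent leaves every later send stuck in the wait loop.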
Please advise or give us some pointers on how to proceed with the investigation.