[jboss-jira] [JBoss JIRA] (JGRP-1334) TIMED_WAITING loop prevents message propagation to cluster due to FC blockage

Bela Ban (JIRA) jira-events at lists.jboss.org
Thu Jun 7 00:42:20 EDT 2012


     [ https://issues.jboss.org/browse/JGRP-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bela Ban deleted JGRP-1334:
---------------------------

    
> TIMED_WAITING loop prevents message propagation to cluster due to FC blockage
> -----------------------------------------------------------------------------
>
>                 Key: JGRP-1334
>                 URL: https://issues.jboss.org/browse/JGRP-1334
>             Project: JGroups
>          Issue Type: Bug
>         Environment: Sparc Solaris 10
>            Reporter: Simone Scarduzio
>            Assignee: Bela Ban
>              Labels: FC, credits
>
> Cannot send messages to the cluster because the node is waiting for FC credit that never arrives.
> The JGroups sender thread stays in a TIMED_WAITING loop indefinitely because it never has enough credit to send any bytes.
> We have been able to reproduce the issue, which occurs after some days of activity, but the actual triggering event is unclear and no clear test case is available.
> The cluster consists of about 25 nodes, and we are using Oracle JDK 1.6.0_24. No GC pause longer than 3.6s is observed on any cluster node while the problem is happening. Still, we see a lot of VERIFY_SUSPECT entries in the logs, mentioning many different node addresses.
> See the stack trace below, which shows the blockage:
> "QuotaDistributor" daemon prio=3 tid=0x00a9cc00 nid=0x71 waiting on condition [0x63f1f000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x7b8392e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2116)
>         at org.jgroups.protocols.FC.handleDownMessage(FC.java:550)
>         at org.jgroups.protocols.FC.down(FC.java:424)
>         at org.jgroups.protocols.FRAG2.down(FRAG2.java:154)
>         at org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:216)
>         at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:836)
>         at org.jgroups.JChannel.down(JChannel.java:1626)
>         at org.jgroups.JChannel.send(JChannel.java:721)
>         at org.jgroups.JChannel.send(JChannel.java:737)
>         at com.firsthop.common.platform.jgroups.JGroupsManager.send(JGroupsManager.java:193)
> Please advise or give us some pointers on how to progress the investigation.
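
For readers unfamiliar with credit-based flow control, the blocking pattern visible in the stack trace above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the JGroups FC code: the class CreditGate, its methods, and all numeric values are hypothetical. A sender parks on a Condition in timed slices until enough credit is available, so if no replenishment ever arrives, a thread dump shows it in TIMED_WAITING (parking) every time it is sampled.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical credit gate for illustration; NOT the JGroups FC implementation. */
public class CreditGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition creditsAvailable = lock.newCondition();
    private long credits; // bytes the sender may still send

    public CreditGate(long initialCredits) {
        this.credits = initialCredits;
    }

    /**
     * Waits in timed slices until 'bytes' of credit is available.
     * Returns true if the credit was taken, false if maxWaitMillis elapsed first.
     * (The overall deadline is an assumption of this sketch so that it terminates.)
     */
    public boolean acquire(long bytes, long sliceMillis, long maxWaitMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        lock.lock();
        try {
            while (credits < bytes) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return false; // credit never arrived within the deadline
                }
                long slice = Math.min(remaining, TimeUnit.MILLISECONDS.toNanos(sliceMillis));
                // While parked here the thread state is TIMED_WAITING (parking).
                creditsAvailable.awaitNanos(slice);
            }
            credits -= bytes;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Called when a receiver grants more credit; wakes any waiting senders. */
    public void replenish(long bytes) {
        lock.lock();
        try {
            credits += bytes;
            creditsAvailable.signalAll();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CreditGate gate = new CreditGate(100);
        System.out.println(gate.acquire(80, 50, 200)); // true: 100 credits available
        System.out.println(gate.acquire(80, 50, 200)); // false: only 20 left, nothing replenishes
        gate.replenish(100);
        System.out.println(gate.acquire(80, 50, 200)); // true: 120 credits available
    }
}
```

In this sketch the second call returns only because of the illustrative deadline. A sender that loops without one would keep parking for as long as replenish() is never called, which is the symptom described in the report.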
