[jboss-jira] [JBoss JIRA] Created: (JGRP-1272) FC: all threads are blocked at credits_available.await
Victor N (JIRA)
jira-events at lists.jboss.org
Mon Jan 17 05:21:50 EST 2011
FC: all threads are blocked at credits_available.await
------------------------------------------------------
Key: JGRP-1272
URL: https://issues.jboss.org/browse/JGRP-1272
Project: JGroups
Issue Type: Bug
Affects Versions: 2.11
Environment: Ubuntu 10.04.1
Reporter: Victor N
Assignee: Bela Ban
Attachments: jgroups-tcp.xml
After several days of operation, one node cannot rejoin the cluster because of an FC protocol issue. The problem recurs once or several times per week.
The problematic node is "gate9"; its view is outdated:
[gate10.mydomain|1175] [gate10.mydomain, gate7.mydomain, gate8.mydomain, gate9.mydomain, gate11.mydomain, gate14.mydomain, gate5.mydomain, gate12.mydomain, gate2.mydomain, gate4.mydomain, gate3.mydomain, gate6.mydomain]
The actual view (seen by all other nodes) is:
[gate10.mydomain|1176] [gate10.mydomain, gate7.mydomain, gate8.mydomain, gate11.mydomain, gate14.mydomain, gate5.mydomain, gate12.mydomain, gate2.mydomain, gate4.mydomain, gate3.mydomain, gate6.mydomain]
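To make the difference between the two views explicit, here is a minimal sketch that parses both view strings and lists the members present in the stale view but absent from the current one. The class and method names (`ViewDiff`, `members`, `missingFrom`) are hypothetical helpers written for this illustration; they are not part of the JGroups API and assume the `[creator|id] [member, member, ...]` format shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JGroups: compares two printed views.
class ViewDiff {
    // Extracts the member list from a string of the form "[creator|id] [m1, m2, ...]".
    static List<String> members(String view) {
        int start = view.indexOf('[', view.indexOf(']')); // opening bracket of the member list
        int end = view.lastIndexOf(']');
        String body = view.substring(start + 1, end).trim();
        List<String> result = new ArrayList<>();
        if (body.isEmpty()) return result;
        for (String m : body.split(",")) result.add(m.trim());
        return result;
    }

    // Members that appear in oldView but not in newView.
    static List<String> missingFrom(String oldView, String newView) {
        List<String> gone = new ArrayList<>(members(oldView));
        gone.removeAll(members(newView));
        return gone;
    }

    public static void main(String[] args) {
        String stale = "[gate10.mydomain|1175] [gate10.mydomain, gate7.mydomain, gate8.mydomain, "
                + "gate9.mydomain, gate11.mydomain, gate14.mydomain, gate5.mydomain, gate12.mydomain, "
                + "gate2.mydomain, gate4.mydomain, gate3.mydomain, gate6.mydomain]";
        String current = "[gate10.mydomain|1176] [gate10.mydomain, gate7.mydomain, gate8.mydomain, "
                + "gate11.mydomain, gate14.mydomain, gate5.mydomain, gate12.mydomain, "
                + "gate2.mydomain, gate4.mydomain, gate3.mydomain, gate6.mydomain]";
        System.out.println(missingFrom(stale, current)); // prints [gate9.mydomain]
    }
}
```

For the two views quoted in this report, the only difference is gate9 itself: the rest of the cluster has moved on to view 1176 without it.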
In the log file on gate9 I can see that it constantly sends CREDIT_REQUEST messages:
17/01/2011 09:55:52 UTC| TRACE | org.jgroups.protocols.TP.down(): sending msg to gate10.mydomain, src=gate9.mydomain, headers are FC: CREDIT_REQUEST, UNICAST: DATA, seqno=4308, conn_id=14, TCP: [channel_name=GateCluster]
17/01/2011 09:55:52 UTC| TRACE | org.jgroups.protocols.BasicTCP.sendUnicast(): dest=192.168.1.10:40001 (94 bytes)
17/01/2011 09:55:52 UTC| TRACE | org.jgroups.protocols.UNICAST.retransmit(): gate9.mydomain --> XMIT(gate10.mydomain: #2014)
17/01/2011 09:55:52 UTC| TRACE | org.jgroups.protocols.TP.down(): sending msg to gate10.mydomain, src=gate9.mydomain, headers are FC: CREDIT_REQUEST, UNICAST: DATA, seqno=2014, conn_id=14, TCP: [channel_name=GateCluster]
17/01/2011 09:55:52 UTC| TRACE | org.jgroups.protocols.BasicTCP.sendUnicast(): dest=192.168.1.10:40001 (94 bytes)
17/01/2011 09:55:52 UTC| TRACE | org.jgroups.protocols.UNICAST.retransmit(): gate9.mydomain --> XMIT(gate10.mydomain: #1124)
...
At gate10 (the coordinator) I see that it receives the requests and responds:
17/01/2011 09:55:55 UTC| TRACE | org.jgroups.protocols.TP.passMessageUp(): received [dst: gate10.mydomain, src: gate9.mydomain (3 headers), size=9 bytes], headers are FC: CREDIT_REQUEST, UNICAST: DATA, seqno=2397, conn_id=14, TCP: [channel_name=GateCluster]
17/01/2011 09:55:55 UTC| TRACE | org.jgroups.protocols.UNICAST.handleDataReceived(): gate10.mydomain <-- DATA(gate9.mydomain: #2397, conn_id=14)
17/01/2011 09:55:55 UTC| TRACE | org.jgroups.protocols.UNICAST.sendRequestForFirstSeqno(): gate10.mydomain --> SEND_FIRST_SEQNO(gate9.mydomain)
17/01/2011 09:55:55 UTC| TRACE | org.jgroups.protocols.TP.down(): sending msg to gate9.mydomain, src=gate10.mydomain, headers are UNICAST: SEND_FIRST_SEQNO, seqno=0, TCP: [channel_name=GateCluster]
17/01/2011 09:55:55 UTC| TRACE | org.jgroups.protocols.TP.passMessageUp(): received [dst: gate10.mydomain, src: gate9.mydomain (3 headers), size=9 bytes], headers are FC: CREDIT_REQUEST, UNICAST: DATA, seqno=1202, conn_id=14, TCP: [channel_name=GateCluster]
17/01/2011 09:55:55 UTC| TRACE | org.jgroups.protocols.UNICAST.handleDataReceived(): gate10.mydomain <-- DATA(gate9.mydomain: #1202, conn_id=14)
17/01/2011 09:55:55 UTC| TRACE | org.jgroups.protocols.UNICAST.sendRequestForFirstSeqno(): gate10.mydomain --> SEND_FIRST_SEQNO(gate9.mydomain)
17/01/2011 09:55:55 UTC| TRACE | org.jgroups.protocols.TP.down(): sending msg to gate9.mydomain, src=gate10.mydomain, headers are UNICAST: SEND_FIRST_SEQNO, seqno=0, TCP: [channel_name=GateCluster]
...
but I do not see any answer to this at gate9.
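For readers unfamiliar with credit-based flow control, the following is a minimal sketch of why sender threads can pile up at a condition wait such as the `credits_available.await` named in the title. It is not the JGroups FC implementation: the class `CreditGate` and its methods are hypothetical, and the timeout is added here only so the example terminates. The point it illustrates is that a sender waits until a replenishment arrives; if no replenishment ever reaches the node, every sender stays blocked.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of a credit-based sender gate; not JGroups code.
class CreditGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition creditsAvailable = lock.newCondition();
    private long credits;

    CreditGate(long initialCredits) {
        this.credits = initialCredits;
    }

    // Waits until 'bytes' credits are available, then consumes them.
    // Returns false if the timeout elapses (or the thread is interrupted) first.
    boolean acquire(long bytes, long timeoutMs) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (credits < bytes) {          // loop guards against spurious wakeups
                if (nanos <= 0) return false;  // no replenishment arrived in time
                nanos = creditsAvailable.awaitNanos(nanos);
            }
            credits -= bytes;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } finally {
            lock.unlock();
        }
    }

    // Called when the receiver grants more credits; wakes all waiting senders.
    void replenish(long bytes) {
        lock.lock();
        try {
            credits += bytes;
            creditsAvailable.signalAll();
        } finally {
            lock.unlock();
        }
    }

    long available() {
        lock.lock();
        try {
            return credits;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        CreditGate gate = new CreditGate(100);
        System.out.println(gate.acquire(60, 50)); // true: 100 credits were available
        System.out.println(gate.acquire(60, 50)); // false: only 40 left, nothing replenished
        gate.replenish(50);
        System.out.println(gate.acquire(60, 50)); // true: 90 credits after replenishment
    }
}
```

With an untimed wait in place of `awaitNanos`, the second `acquire` call would never return unless `replenish` were invoked from another thread, which matches the symptom described here.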
My config and stack traces are attached below. I do not know how to reproduce the problem in a test, but it keeps occurring, so I can provide any details you need; just let me know.