Bela Ban commented on JGRP-787:
-------------------------------
Removing FC also makes the problem go away
UNICAST over TCP with xmit_off=true: sending message in synchronized block leads to deadlocks
---------------------------------------------------------------------------------------------
Key: JGRP-787
URL: http://jira.jboss.com/jira/browse/JGRP-787
Project: JGroups
Issue Type: Bug
Reporter: Bela Ban
Assigned To: Bela Ban
Fix For: 2.7, 2.6.3
Same issue as http://jira.jboss.com/jira/browse/JGRP-303, which is why we moved the
send() outside the synchronized block.
The problem with xmit_off, though, is that we need to know the message was passed to TCP/IP
successfully; otherwise we CANNOT increment the sequence number!
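The trade-off described above can be sketched in a few lines. This is only an illustration, not the UNICAST code: the class SeqnoSketch, its methods, and the LongPredicate used as a stand-in for the transport are all hypothetical names. It contrasts sending outside the lock (no lock held during the send, but the sequence number is consumed even when the send fails) with incrementing only after a successful send (which keeps the monitor held across the transport call).

```java
import java.util.function.LongPredicate;

// Hypothetical stand-in for a per-destination sender entry; not the JGroups class.
public class SeqnoSketch {
    private long nextSeqno = 1; // guarded by the monitor of 'this'

    // Shape A: reserve the seqno under the lock, send after releasing it.
    // No lock is held during the transport call, but the seqno is consumed
    // even if the transport reports failure.
    public boolean sendOutsideLock(LongPredicate transport) {
        long seqno;
        synchronized (this) {
            seqno = nextSeqno++;
        }
        return transport.test(seqno);
    }

    // Shape B: increment only after the transport accepted the message.
    // Keeping send-then-increment atomic means the monitor is held across
    // the transport call, which is the deadlock-prone part.
    public synchronized boolean sendInsideLock(LongPredicate transport) {
        boolean accepted = transport.test(nextSeqno);
        if (accepted) {
            nextSeqno++;
        }
        return accepted;
    }

    public synchronized long nextSeqno() {
        return nextSeqno;
    }

    public static void main(String[] args) {
        LongPredicate failing = seqno -> false; // transport that always fails

        SeqnoSketch a = new SeqnoSketch();
        a.sendOutsideLock(failing);
        System.out.println("outside lock, failed send: next seqno = " + a.nextSeqno());

        SeqnoSketch b = new SeqnoSketch();
        b.sendInsideLock(failing);
        System.out.println("inside lock, failed send: next seqno = " + b.nextSeqno());
    }
}
```

With a failing transport, shape A leaves the next sequence number at 2 and shape B leaves it at 1, which is why shape A is only safe when something else recovers the lost message.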
Stack trace:
Found one Java-level deadlock:
=============================
"Incoming-27,UnicastTest-Group,192.168.1.5:7500":
waiting for ownable synchronizer 0x00002aaac0921168, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
which is held by "Incoming-4,UnicastTest-Group,192.168.1.5:7500"
"Incoming-4,UnicastTest-Group,192.168.1.5:7500":
waiting to lock monitor 0x00002aaacc8e9cf0 (object 0x00002aaac09e3a88, a org.jgroups.protocols.UNICAST$Entry),
which is held by "main"
"main":
waiting for ownable synchronizer 0x00002aaac0921168, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
which is held by "Incoming-4,UnicastTest-Group,192.168.1.5:7500"
Java stack information for the threads listed above:
===================================================
"Incoming-27,UnicastTest-Group,192.168.1.5:7500":
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00002aaac0921168> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:635)
at org.jgroups.protocols.UNICAST.up(UNICAST.java:292)
at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:735)
at org.jgroups.protocols.BARRIER.up(BARRIER.java:136)
at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:167)
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:309)
at org.jgroups.protocols.MERGE2.up(MERGE2.java:144)
at org.jgroups.protocols.Discovery.up(Discovery.java:244)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1266)
at org.jgroups.protocols.TP.access$100(TP.java:49)
at org.jgroups.protocols.TP$1.run(TP.java:1169)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)
"Incoming-4,UnicastTest-Group,192.168.1.5:7500":
at org.jgroups.protocols.UNICAST.down(UNICAST.java:357)
- waiting to lock <0x00002aaac09e3a88> (a org.jgroups.protocols.UNICAST$Entry)
at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:316)
at org.jgroups.protocols.VIEW_SYNC.down(VIEW_SYNC.java:204)
at org.jgroups.protocols.pbcast.GMS.down(GMS.java:859)
at org.jgroups.protocols.FC.sendCredit(FC.java:740)
at org.jgroups.protocols.FC.up(FC.java:416)
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:788)
at org.jgroups.protocols.VIEW_SYNC.up(VIEW_SYNC.java:192)
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:233)
at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:645)
at org.jgroups.protocols.UNICAST.up(UNICAST.java:292)
at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:735)
at org.jgroups.protocols.BARRIER.up(BARRIER.java:136)
at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:167)
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:309)
at org.jgroups.protocols.MERGE2.up(MERGE2.java:144)
at org.jgroups.protocols.Discovery.up(Discovery.java:244)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1266)
at org.jgroups.protocols.TP.access$100(TP.java:49)
at org.jgroups.protocols.TP$1.run(TP.java:1169)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)
"main":
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00002aaac0921168> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:635)
at org.jgroups.protocols.UNICAST.up(UNICAST.java:292)
at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:735)
at org.jgroups.protocols.BARRIER.up(BARRIER.java:136)
at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:167)
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:309)
at org.jgroups.protocols.MERGE2.up(MERGE2.java:144)
at org.jgroups.protocols.Discovery.up(Discovery.java:244)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1266)
at org.jgroups.protocols.TP.access$100(TP.java:49)
at org.jgroups.protocols.TP$1.run(TP.java:1169)
at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:1737)
at org.jgroups.util.ShutdownRejectedExecutionHandler.rejectedExecution(ShutdownRejectedExecutionHandler.java:39)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.jgroups.protocols.TP.down(TP.java:1167)
at org.jgroups.protocols.Discovery.down(Discovery.java:349)
at org.jgroups.protocols.MERGE2.down(MERGE2.java:175)
at org.jgroups.protocols.FD_SOCK.down(FD_SOCK.java:373)
at org.jgroups.protocols.VERIFY_SUSPECT.down(VERIFY_SUSPECT.java:95)
at org.jgroups.protocols.BARRIER.down(BARRIER.java:107)
at org.jgroups.protocols.pbcast.NAKACK.down(NAKACK.java:660)
at org.jgroups.protocols.UNICAST.send(UNICAST.java:484)
at org.jgroups.protocols.UNICAST.down(UNICAST.java:373)
- locked <0x00002aaac09e3a88> (a org.jgroups.protocols.UNICAST$Entry)
at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:316)
at org.jgroups.protocols.VIEW_SYNC.down(VIEW_SYNC.java:204)
at org.jgroups.protocols.pbcast.GMS.down(GMS.java:859)
at org.jgroups.protocols.FC.handleDownMessage(FC.java:526)
at org.jgroups.protocols.FC.down(FC.java:365)
at org.jgroups.protocols.FRAG2.down(FRAG2.java:175)
at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.down(STREAMING_STATE_TRANSFER.java:303)
at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:457)
at org.jgroups.JChannel.down(JChannel.java:1443)
at org.jgroups.JChannel.send(JChannel.java:620)
at org.jgroups.tests.UnicastTest.sendMessages(UnicastTest.java:241)
at org.jgroups.tests.UnicastTest.eventLoop(UnicastTest.java:198)
at org.jgroups.tests.UnicastTest.main(UnicastTest.java:355)
Found 1 deadlock.
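The cycle in the dump above pairs an object monitor (UNICAST$Entry, held by "main") with a ReentrantLock (held by "Incoming-4"), each thread waiting for the lock the other one holds. A minimal sketch of that shape follows; it does not use JGroups at all. The class LockCycleSketch, the thread names and the variables entryMonitor and receiveLock are made up for illustration. It relies on ThreadMXBean.findDeadlockedThreads(), which assumes a JVM that supports monitoring of ownable synchronizers (HotSpot does).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Illustration only: reproduces a monitor/ReentrantLock cycle with two daemon threads.
public class LockCycleSketch {

    // Starts the two threads and returns how many threads the JVM reports as deadlocked.
    public static int reproduce() {
        final Object entryMonitor = new Object();              // stands in for the Entry monitor
        final ReentrantLock receiveLock = new ReentrantLock(); // stands in for the ReentrantLock
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Like "main" in the dump: holds the monitor, then wants the ReentrantLock.
        Thread sender = new Thread(() -> {
            synchronized (entryMonitor) {
                rendezvous(bothHoldFirstLock);
                receiveLock.lock(); // blocks: the receiver holds it
                receiveLock.unlock();
            }
        }, "sender");

        // Like "Incoming-4" in the dump: holds the ReentrantLock, then wants the monitor.
        Thread receiver = new Thread(() -> {
            receiveLock.lock();
            try {
                rendezvous(bothHoldFirstLock);
                synchronized (entryMonitor) {
                    // not reached while the sender holds the monitor
                }
            } finally {
                receiveLock.unlock();
            }
        }, "receiver");

        // Daemon threads so the stuck pair does not keep the JVM alive.
        sender.setDaemon(true);
        receiver.setDaemon(true);
        sender.start();
        receiver.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        try {
            for (int attempt = 0; attempt < 100; attempt++) {
                long[] ids = threads.findDeadlockedThreads();
                if (ids != null) {
                    return ids.length;
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 0;
    }

    // Makes sure each thread holds its first lock before either requests the second.
    private static void rendezvous(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + reproduce());
    }
}
```

Taking both locks in the same order on every path, or releasing the monitor before calling into code that may take the other lock, would break the cycle in this sketch.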