[jboss-jira] [JBoss JIRA] Resolved: (JGRP-787) UNICAST over TCP with xmit_off=true: sending message in synchronized block leads to deadlocks
Bela Ban (JIRA)
jira-events at lists.jboss.org
Mon Jun 16 04:40:34 EDT 2008
[ http://jira.jboss.com/jira/browse/JGRP-787?page=all ]
Bela Ban resolved JGRP-787.
---------------------------
Resolution: Done
Moved the sending of the message out of the synchronized block. The drawback: if send() throws an exception, the seqno assigned to that message has been consumed but the message never reaches the receiver, so the receiver has a gap and will not deliver any message with a seqno higher than that of the failed message.
However, if message sending throws an exception, unless the destination crashed (which means we won't send messages to the dest anymore anyway), this would be considered a bug in JGroups.
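
The trade-off described above can be illustrated with a minimal, self-contained sketch. This is not the JGroups code: SeqnoSketch, Transport, Sender and Receiver are hypothetical names invented for illustration, and the Receiver only mimics in-order delivery without retransmission. The sender assigns the seqno under a lock but calls the transport after releasing it; the simulated failure on seqno 2 shows how a later message then stays undelivered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only; none of these types exist in JGroups.
class SeqnoSketch {
    /** Stand-in for passing a message down to the transport. */
    interface Transport {
        void send(long seqno, String payload) throws Exception;
    }

    static class Sender {
        private final Object lock = new Object();
        private long nextSeqno = 1;
        private final Transport transport;

        Sender(Transport transport) { this.transport = transport; }

        long send(String payload) throws Exception {
            long seqno;
            synchronized (lock) {
                seqno = nextSeqno++;      // only the counter update is guarded
            }
            // The lock is released before the (possibly blocking) send, so a
            // thread stuck in the transport never holds 'lock'. The price: if
            // send() throws, 'seqno' is consumed but was never transmitted.
            transport.send(seqno, payload);
            return seqno;
        }
    }

    /** Delivers strictly in seqno order and never asks for retransmission. */
    static class Receiver {
        private long nextToDeliver = 1;
        private final TreeMap<Long, String> buffered = new TreeMap<>();
        final List<String> delivered = new ArrayList<>();

        synchronized void receive(long seqno, String payload) {
            if (seqno < nextToDeliver) return;          // duplicate, ignore
            buffered.put(seqno, payload);
            String next;
            while ((next = buffered.remove(nextToDeliver)) != null) {
                delivered.add(next);
                nextToDeliver++;
            }
        }
    }

    public static void main(String[] args) {
        Receiver receiver = new Receiver();
        Sender sender = new Sender((seqno, payload) -> {
            if (seqno == 2) throw new Exception("simulated send failure");
            receiver.receive(seqno, payload);
        });
        for (String p : new String[] {"a", "b", "c"}) {
            try {
                sender.send(p);
            } catch (Exception e) {
                System.out.println("send failed: " + p);
            }
        }
        // "c" (seqno 3) is buffered behind the gap at seqno 2.
        System.out.println(receiver.delivered);
    }
}
```

Running main reports the failed send of "b" and then prints [a]: message "c" arrived but is held back because seqno 2 never did. That is the gap the comment above refers to, and it is why a send() failure to a live destination is treated as a bug rather than handled.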
> UNICAST over TCP with xmit_off=true: sending message in synchronized block leads to deadlocks
> ---------------------------------------------------------------------------------------------
>
> Key: JGRP-787
> URL: http://jira.jboss.com/jira/browse/JGRP-787
> Project: JGroups
> Issue Type: Bug
> Reporter: Bela Ban
> Assigned To: Bela Ban
> Fix For: 2.6.3, 2.7
>
>
> Same issue as http://jira.jboss.com/jira/browse/JGRP-303: that's why we moved the send() outside the synchronized block.
> The problem with xmit_off, though, is that we need to know that the message was passed to TCP/IP successfully, or else we CANNOT increment the sequence number!
> Stack trace:
> Found one Java-level deadlock:
> =============================
> "Incoming-27,UnicastTest-Group,192.168.1.5:7500":
> waiting for ownable synchronizer 0x00002aaac0921168, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
> which is held by "Incoming-4,UnicastTest-Group,192.168.1.5:7500"
> "Incoming-4,UnicastTest-Group,192.168.1.5:7500":
> waiting to lock monitor 0x00002aaacc8e9cf0 (object 0x00002aaac09e3a88, a org.jgroups.protocols.UNICAST$Entry),
> which is held by "main"
> "main":
> waiting for ownable synchronizer 0x00002aaac0921168, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
> which is held by "Incoming-4,UnicastTest-Group,192.168.1.5:7500"
> Java stack information for the threads listed above:
> ===================================================
> "Incoming-27,UnicastTest-Group,192.168.1.5:7500":
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00002aaac0921168> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
> at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
> at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
> at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:635)
> at org.jgroups.protocols.UNICAST.up(UNICAST.java:292)
> at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:735)
> at org.jgroups.protocols.BARRIER.up(BARRIER.java:136)
> at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:167)
> at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:309)
> at org.jgroups.protocols.MERGE2.up(MERGE2.java:144)
> at org.jgroups.protocols.Discovery.up(Discovery.java:244)
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1266)
> at org.jgroups.protocols.TP.access$100(TP.java:49)
> at org.jgroups.protocols.TP$1.run(TP.java:1169)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> at java.lang.Thread.run(Thread.java:619)
> "Incoming-4,UnicastTest-Group,192.168.1.5:7500":
> at org.jgroups.protocols.UNICAST.down(UNICAST.java:357)
> - waiting to lock <0x00002aaac09e3a88> (a org.jgroups.protocols.UNICAST$Entry)
> at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:316)
> at org.jgroups.protocols.VIEW_SYNC.down(VIEW_SYNC.java:204)
> at org.jgroups.protocols.pbcast.GMS.down(GMS.java:859)
> at org.jgroups.protocols.FC.sendCredit(FC.java:740)
> at org.jgroups.protocols.FC.up(FC.java:416)
> at org.jgroups.protocols.pbcast.GMS.up(GMS.java:788)
> at org.jgroups.protocols.VIEW_SYNC.up(VIEW_SYNC.java:192)
> at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:233)
> at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:645)
> at org.jgroups.protocols.UNICAST.up(UNICAST.java:292)
> at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:735)
> at org.jgroups.protocols.BARRIER.up(BARRIER.java:136)
> at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:167)
> at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:309)
> at org.jgroups.protocols.MERGE2.up(MERGE2.java:144)
> at org.jgroups.protocols.Discovery.up(Discovery.java:244)
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1266)
> at org.jgroups.protocols.TP.access$100(TP.java:49)
> at org.jgroups.protocols.TP$1.run(TP.java:1169)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> at java.lang.Thread.run(Thread.java:619)
> "main":
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00002aaac0921168> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)
> at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
> at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
> at org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:635)
> at org.jgroups.protocols.UNICAST.up(UNICAST.java:292)
> at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:735)
> at org.jgroups.protocols.BARRIER.up(BARRIER.java:136)
> at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:167)
> at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:309)
> at org.jgroups.protocols.MERGE2.up(MERGE2.java:144)
> at org.jgroups.protocols.Discovery.up(Discovery.java:244)
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1266)
> at org.jgroups.protocols.TP.access$100(TP.java:49)
> at org.jgroups.protocols.TP$1.run(TP.java:1169)
> at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:1737)
> at org.jgroups.util.ShutdownRejectedExecutionHandler.rejectedExecution(ShutdownRejectedExecutionHandler.java:39)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
> at org.jgroups.protocols.TP.down(TP.java:1167)
> at org.jgroups.protocols.Discovery.down(Discovery.java:349)
> at org.jgroups.protocols.MERGE2.down(MERGE2.java:175)
> at org.jgroups.protocols.FD_SOCK.down(FD_SOCK.java:373)
> at org.jgroups.protocols.VERIFY_SUSPECT.down(VERIFY_SUSPECT.java:95)
> at org.jgroups.protocols.BARRIER.down(BARRIER.java:107)
> at org.jgroups.protocols.pbcast.NAKACK.down(NAKACK.java:660)
> at org.jgroups.protocols.UNICAST.send(UNICAST.java:484)
> at org.jgroups.protocols.UNICAST.down(UNICAST.java:373)
> - locked <0x00002aaac09e3a88> (a org.jgroups.protocols.UNICAST$Entry)
> at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:316)
> at org.jgroups.protocols.VIEW_SYNC.down(VIEW_SYNC.java:204)
> at org.jgroups.protocols.pbcast.GMS.down(GMS.java:859)
> at org.jgroups.protocols.FC.handleDownMessage(FC.java:526)
> at org.jgroups.protocols.FC.down(FC.java:365)
> at org.jgroups.protocols.FRAG2.down(FRAG2.java:175)
> at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.down(STREAMING_STATE_TRANSFER.java:303)
> at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:457)
> at org.jgroups.JChannel.down(JChannel.java:1443)
> at org.jgroups.JChannel.send(JChannel.java:620)
> at org.jgroups.tests.UnicastTest.sendMessages(UnicastTest.java:241)
> at org.jgroups.tests.UnicastTest.eventLoop(UnicastTest.java:198)
> at org.jgroups.tests.UnicastTest.main(UnicastTest.java:355)
> Found 1 deadlock.