[jboss-jira] [JBoss JIRA] Commented: (JGRP-1003) deadlock in TP.send - socket write hangs inside a Lock, preventing other threads to do TP.send()

Bela Ban (JIRA) jira-events at lists.jboss.org
Fri Apr 30 07:44:06 EDT 2010


    [ https://jira.jboss.org/jira/browse/JGRP-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12528225#action_12528225 ] 

Bela Ban commented on JGRP-1003:
--------------------------------

You've listed many issues here, ranging from TCP send blocking, to merge issues, to unicast issues etc. The underlying issue might, however, still be the hang in the TCP send.

I suggest try out JGroups 2.9 (or even better, 2.10 (Beta1)). 

2.9 got rid of the send lock (you see in the first comment on this case).

2.10 also got rid of the lock when sending message bundles; it now uses a bounded transfer queue.

I'm closing this case, I suggest you open a new one with (ideally) reproduceable steps to re-create the original issue on 2.9 or 2.10...

> deadlock in TP.send - socket write hangs inside a Lock, preventing other threads to do TP.send()
> ------------------------------------------------------------------------------------------------
>
>                 Key: JGRP-1003
>                 URL: https://jira.jboss.org/jira/browse/JGRP-1003
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.7
>         Environment: Linux (Debian), kernel 2.6.x
>            Reporter: Victor N
>            Assignee: Bela Ban
>             Fix For: 2.10
>
>         Attachments: j1__.txt, jgroups_s2.txt, jgroups_stacktrace_09feb2010.txt, stacktrace.txt
>
>
> I am using JGroups 2.7.0 GA with the typical protocols stack based on TCP (taken from tcp.xml).
> Sometimes it occurs that socket write operation hangs inside TP.send() -- this is not so untypical for blocking I/O approach! -- after that JGroups is not working at all because most of threads are blocked inside TP.send() waiting for that first thread releasing "out_stream_lock". Below is a fragment from my stack trace.
> 1) the thread that hangs in socket write until forever (maybe due to a broken socket or something else); this thread owns  "out_stream_lock in TP class":
> "Timer-1,name,IP:port" daemon prio=10 tid=0x084e5c00 nid=0x15f0 runnable [0x211c1000..0x211c2040]                                                             
>    java.lang.Thread.State: RUNNABLE                                                                                                                                              
>         at java.net.SocketOutputStream.socketWrite0(Native Method)                                                                                                               
>         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)                                                                                                   
>         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)                                                                                                        
>         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)                                                                                                
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)                                                                                                     
>         - locked <0x43427300> (a java.io.BufferedOutputStream)                                                                                                                   
>         at java.io.DataOutputStream.flush(DataOutputStream.java:106)                                                                                                             
>         at org.jgroups.blocks.TCPConnectionMap$TCPConnection.doSend(TCPConnectionMap.java:472)                                                                                   
>         at org.jgroups.blocks.TCPConnectionMap$TCPConnection._send(TCPConnectionMap.java:453)                                                                                    
>         at org.jgroups.blocks.TCPConnectionMap$TCPConnection.send(TCPConnectionMap.java:436)                                                                                     
>         at org.jgroups.blocks.TCPConnectionMap$TCPConnection.access$100(TCPConnectionMap.java:341)                                                                               
>         at org.jgroups.blocks.TCPConnectionMap.send(TCPConnectionMap.java:137)                                                                                                   
>         at org.jgroups.protocols.TCP.send(TCP.java:53)                                                                                                                           
>         at org.jgroups.protocols.BasicTCP.sendToSingleMember(BasicTCP.java:141)                                                                                                  
>         at org.jgroups.protocols.TP.doSend(TP.java:1105)                                                                                                                         
>         at org.jgroups.protocols.TP.send(TP.java:1088)                                                                                                                           
>         at org.jgroups.protocols.TP.down(TP.java:907)                                                                                                                            
>         at org.jgroups.protocols.Discovery.down(Discovery.java:363)                                                                                                              
>         at org.jgroups.protocols.MERGE2.down(MERGE2.java:169)                                                                                                                    
>         at org.jgroups.protocols.FD_SOCK.down(FD_SOCK.java:333)                                                                                                                  
>         at org.jgroups.protocols.FD.down(FD.java:327)                                                                                                                            
>         at org.jgroups.protocols.VERIFY_SUSPECT.down(VERIFY_SUSPECT.java:72)                                                                                                     
>         at org.jgroups.protocols.BARRIER.down(BARRIER.java:96)                                                                                                                   
>         at org.jgroups.protocols.pbcast.NAKACK.retransmit(NAKACK.java:1530)                                                                                                      
>         at org.jgroups.protocols.pbcast.NAKACK.retransmit(NAKACK.java:1476)                                                                                                      
>         at org.jgroups.stack.Retransmitter$Task.run(Retransmitter.java:207)                                                                                                      
>         at org.jgroups.util.TimeScheduler$TaskWrapper.run(TimeScheduler.java:218)                                                                                                
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)                                                                                               
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)                                                                                                    
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)                                                                                                              
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)                                                  
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:207)                                                        
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)                                                                                   
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)                                                                                       
>         at java.lang.Thread.run(Thread.java:619)
>    Locked ownable synchronizers:                                                                                                                                                 
>         - <0x2b5522e0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)                                                                                                  
>         - <0x2b5550f8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)                                                                                                  
>         - <0x434273f0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> 2) and there are many threads like this blocked "on out_stream_lock.lock()" call inside TP class:
> "Connection.Receiver [ip:port - ip:port],name,ip:port" prio=10 tid=0x020dc000 nid=0x1fab waiting on condition [0x1365f000..0x1365ffc0]
>    java.lang.Thread.State: WAITING (parking)                                                                                                                                     
>         at sun.misc.Unsafe.park(Native Method)                                                                                                                                   
>         - parking to wait for  <0x2b5550f8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)                                                                             
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)                                                                                                     
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:747)                                                      
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:778)                                                              
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1114)                                                                   
>         at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)                                                                                     
>         at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)                                                                                                 
>         at org.jgroups.protocols.TP.send(TP.java:1082)                                                                                                                           
>         at org.jgroups.protocols.TP.down(TP.java:907)                                                                                                                            
>         at org.jgroups.protocols.Discovery.up(Discovery.java:274)                                                                                                                
>         at org.jgroups.protocols.TP.passMessageUp(TP.java:995)                                                                                                                   
>         at org.jgroups.protocols.TP.access$100(TP.java:52)                                                                                                                       
>         at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1451)                                                                                                 
>         at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1427)                                                                                                             
>         at java.util.concurrent.ThreadPoolExecutor$CallerRunsPolicy.rejectedExecution(ThreadPoolExecutor.java:1738)                                                              
>         at org.jgroups.util.ShutdownRejectedExecutionHandler.rejectedExecution(ShutdownRejectedExecutionHandler.java:34)                                                         
>         at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)                                                                                           
>         at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)                                                                                          
>         at org.jgroups.protocols.TP.dispatchToThreadPool(TP.java:1061)                                                                                                           
>         at org.jgroups.protocols.TP.receive(TP.java:1038)                                                                                                                        
>         at org.jgroups.protocols.BasicTCP.receive(BasicTCP.java:180)                                                                                                             
>         at org.jgroups.blocks.TCPConnectionMap$TCPConnection$ConnectionPeerReceiver.run(TCPConnectionMap.java:553)                                                               
>         at java.lang.Thread.run(Thread.java:619)                                                                                                                                 
>                                                                                                                                                                                  
>    Locked ownable synchronizers:                                                                                                                                                 
>         - None
> (Since all threads from JGroups' ThreadPool are locked, you can see "rejectedExecution" in the last stack trace).
> Maybe we could implement some solution for TCP (blocking I/O)? Maybe it is possible to release the lock before socket write?

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


More information about the jboss-jira mailing list