[jboss-jira] [JBoss JIRA] Commented: (JGRP-770) Concurrent startup of many channels doesn't stabilize

Vladimir Blagojevic (JIRA) jira-events at lists.jboss.org
Tue Jun 10 15:01:16 EDT 2008


    [ http://jira.jboss.com/jira/browse/JGRP-770?page=comments#action_12416332 ] 
            
Vladimir Blagojevic commented on JGRP-770:
------------------------------------------

Here is why there is a deadlock using flush-udp.xml:

An incoming thread T goes up the stack, locks a window in NAKACK or UNICAST, and then tries to acquire a lock L in GroupRequest#viewChange (for example's sake). At the same time, a user thread pushing a message down through castMessage has already acquired L in GroupRequest#doExecute but is blocked in FLUSH#down. FLUSH cannot unblock, because the FLUSH message that would unblock it has to acquire the same NAKACK window that T holds, and T will not release that window until it gets L. Bingo, a deadlock!
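
To make the wait cycle concrete, here is a minimal sketch in plain java.util.concurrent. It does not use JGroups at all: the two ReentrantLocks and the latch named flushGate are hypothetical stand-ins for the NAKACK window, lock L and the block in FLUSH#down, and the class and method names are made up for illustration. Every wait uses a timeout so the sketch reports the cycle instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

final class FlushDeadlockSketch {

    /** A step that may be interrupted; keeps the lambdas below short. */
    interface Step { void run() throws InterruptedException; }

    private static Thread spawn(Step step) {
        Thread t = new Thread(() -> {
            try { step.run(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        t.start();
        return t;
    }

    /** Returns how many of the three parties made progress within timeoutMs. */
    static int partiesThatProgressed(long timeoutMs) {
        ReentrantLock window = new ReentrantLock();           // stand-in for the NAKACK/UNICAST window
        ReentrantLock lockL = new ReentrantLock();            // stand-in for lock L in GroupRequest
        CountDownLatch flushGate = new CountDownLatch(1);     // stand-in for the block in FLUSH#down
        CountDownLatch locksHeld = new CountDownLatch(2);     // T holds the window, user holds L
        CountDownLatch attemptsDone = new CountDownLatch(3);  // nobody releases before all have tried
        AtomicInteger progressed = new AtomicInteger();

        // Incoming thread T: holds the window, then wants L (as in viewChange).
        Thread incoming = spawn(() -> {
            window.lock();
            try {
                locksHeld.countDown();
                locksHeld.await();
                if (lockL.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    progressed.incrementAndGet();
                    lockL.unlock();
                }
                attemptsDone.countDown();
                attemptsDone.await();
            } finally {
                window.unlock();
            }
        });

        // User thread: holds L, then blocks until FLUSH is unblocked.
        Thread user = spawn(() -> {
            lockL.lock();
            try {
                locksHeld.countDown();
                locksHeld.await();
                if (flushGate.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                    progressed.incrementAndGet();
                }
                attemptsDone.countDown();
                attemptsDone.await();
            } finally {
                lockL.unlock();
            }
        });

        // Delivery of the unblocking FLUSH message: needs the window that T holds.
        Thread flushDelivery = spawn(() -> {
            locksHeld.await();
            if (window.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                try {
                    flushGate.countDown();
                    progressed.incrementAndGet();
                } finally {
                    window.unlock();
                }
            }
            attemptsDone.countDown();
        });

        try {
            incoming.join();
            user.join();
            flushDelivery.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return progressed.get();
    }

    public static void main(String[] args) {
        System.out.println("parties that progressed: " + partiesThatProgressed(200));
    }
}
```

In this sketch no party gets anywhere: T waits for L, the user thread waits for the gate, and the gate can only be opened by a thread that needs the window T is holding. With real blocking calls instead of timeouts, that is a permanent hang.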

This is why ChannelConcurrencyTest using a MessageDispatcher works with udp but not with flush-udp. For the same reason, the basic ChannelConcurrencyTest without a MessageDispatcher works with both flush-udp and udp.

As a proof, I reorganized GroupRequest#execute so that the message is sent down without holding lock L. As expected, ChannelConcurrencyTest started to work, consistently!
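
The shape of that change can be sketched as follows. This is not the actual GroupRequest code: UnlockedSendSketch, its fields and the blockingSend callback are hypothetical names, and the latch again only stands in for FLUSH#down. The point is just that state is updated under L, L is released, and only then is the potentially blocking send made.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

final class UnlockedSendSketch {
    private final ReentrantLock lockL = new ReentrantLock(); // plays the role of lock L
    private final Runnable blockingSend;                     // stand-in for passing the message down
    private boolean registered;
    private int viewChanges;

    UnlockedSendSketch(Runnable blockingSend) {
        this.blockingSend = blockingSend;
    }

    /** User thread: touch shared state under L, but send after L is released. */
    void execute() {
        lockL.lock();
        try {
            registered = true;
        } finally {
            lockL.unlock();
        }
        blockingSend.run(); // may block in "FLUSH#down", but L is no longer held
    }

    /** Incoming thread: needs L briefly; returns false if L was not available in time. */
    boolean viewChange(long timeoutMs) {
        try {
            if (!lockL.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            viewChanges++;
            return true;
        } finally {
            lockL.unlock();
        }
    }

    /** Returns true if the incoming side got L while the user thread was blocked in send. */
    static boolean demo(long timeoutMs) {
        CountDownLatch flushGate = new CountDownLatch(1);
        CountDownLatch sendEntered = new CountDownLatch(1);
        UnlockedSendSketch request = new UnlockedSendSketch(() -> {
            sendEntered.countDown();
            try {
                flushGate.await(); // blocked until the unblocking message is delivered
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread user = new Thread(request::execute);
        user.start();
        try {
            sendEntered.await();                       // user thread is now blocked in the send
            boolean gotL = request.viewChange(timeoutMs); // the calling thread acts as thread T
            flushGate.countDown();                     // T is done, so the unblock can be delivered
            user.join(timeoutMs);
            return gotL && !user.isAlive();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            flushGate.countDown();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + demo(2000));
    }
}
```

Because the user thread no longer holds L while it is blocked, the incoming side can finish its work and release whatever it holds, which is what lets the unblocking message through.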


> Concurrent startup of many channels doesn't stabilize
> -----------------------------------------------------
>
>                 Key: JGRP-770
>                 URL: http://jira.jboss.com/jira/browse/JGRP-770
>             Project: JGroups
>          Issue Type: Bug
>            Reporter: Bela Ban
>         Assigned To: Vladimir Blagojevic
>             Fix For: 2.7, 2.6.3
>
>
> [Brian Goose]
> JGroups 2.6.2, jdk1.6u4, and OS "Linux bubba 2.6.24-16-generic #1 SMP
> Thu Apr 10 13:23:42 UTC 2008 i686 GNU/Linux"
> I have 8 machines that all join a channel when they start up.  Usually,
> they eventually merge into one view with 8 machines in it, but sometimes
> I end up with one view of 7 members and one view of 1 member.  It never
> merges, even after waiting for an hour or two.
> I've created a unit test that demonstrates this problem while using the
> flush-udp stack.  About 20% of the time the views never merge into
> one big view, but instead stay as two separate views, continually
> trying to merge and failing.  Do I have something set up wrong here?
> Should this test work reliably?
> I'm running with these java switches:
> 	-ea
> 	-Djava.net.preferIPv4Stack=true
> 	-Dorg.apache.commons.logging.LogFactory=org.apache.commons.logging.impl.LogFactoryImpl
> The test class is attached.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.jboss.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


