[jboss-jira] [JBoss JIRA] Commented: (JGRP-770) Concurrent startup of many channels doesn't stabilize
Vladimir Blagojevic (JIRA)
jira-events at lists.jboss.org
Tue Jun 10 15:01:16 EDT 2008
[ http://jira.jboss.com/jira/browse/JGRP-770?page=comments#action_12416332 ]
Vladimir Blagojevic commented on JGRP-770:
------------------------------------------
Here is why there is a deadlock using flush-udp.xml:
Incoming thread T goes up the stack, locks the window in NAKACK or UNICAST, and then tries to acquire a lock L in GroupRequest#viewChange (for example's sake). At the same time, a user thread pushing a message down through castMessage has already acquired L in GroupRequest#doExecute but gets blocked in FLUSH#down. FLUSH cannot get unblocked, because the FLUSH message that would unblock it has to obtain a lock on that same window in NAKACK, and T already holds it. Bingo, a deadlock!
This is why ChannelConcurrencyTest using a MessageDispatcher works with udp but not with flush-udp. For the same reason, the basic ChannelConcurrencyTest without a MessageDispatcher works on both flush-udp and udp.
As proof, I reorganized GroupRequest#execute so that the message is sent down without holding lock L. As expected, ChannelConcurrencyTest started to work, consistently!
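The wait-for cycle described above can be reproduced in a few lines. This is only a sketch and not JGroups code: the class name, the two ReentrantLocks standing in for the NAKACK window and lock L, and the third thread standing in for delivery of the FLUSH message are all assumptions made for illustration. Non-blocking tryLock() is used in place of real blocking so that the demo terminates and reports which acquisitions failed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class FlushDeadlockSketch {

    /** Returns { T acquired L, FLUSH message got through the window }. */
    public static boolean[] run() throws InterruptedException {
        final ReentrantLock window = new ReentrantLock();      // stand-in for the NAKACK window lock
        final ReentrantLock requestLock = new ReentrantLock(); // stand-in for lock L
        final CountDownLatch tHoldsWindow = new CountDownLatch(1);
        final CountDownLatch userHoldsL = new CountDownLatch(1);
        final CountDownLatch attempts = new CountDownLatch(2); // both probes finished
        final boolean[] result = new boolean[2];

        // Incoming thread T: holds the window, then wants L.
        Thread incoming = new Thread(() -> {
            window.lock();
            try {
                tHoldsWindow.countDown();
                awaitQuietly(userHoldsL);
                if (requestLock.tryLock()) {   // fails: the user thread holds L
                    result[0] = true;
                    requestLock.unlock();
                }
                attempts.countDown();
                awaitQuietly(attempts);        // keep the window locked while others probe
            } finally {
                window.unlock();
            }
        });

        // User thread: holds L and is "blocked in FLUSH#down".
        Thread user = new Thread(() -> {
            requestLock.lock();
            try {
                userHoldsL.countDown();
                awaitQuietly(tHoldsWindow);
                awaitQuietly(attempts);        // stand-in for waiting to be unblocked
            } finally {
                requestLock.unlock();
            }
        });

        // Delivery of the FLUSH message that would unblock the user thread.
        Thread flushDelivery = new Thread(() -> {
            awaitQuietly(tHoldsWindow);
            if (window.tryLock()) {            // fails: T holds the window
                result[1] = true;
                window.unlock();
            }
            attempts.countDown();
        });

        incoming.start(); user.start(); flushDelivery.start();
        incoming.join(); user.join(); flushDelivery.join();
        return result;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] r = run();
        System.out.println("T acquired L: " + r[0] + ", FLUSH message delivered: " + r[1]);
    }
}
```

Both probes fail while the other side holds its lock. With blocking lock() calls instead of tryLock(), none of the three threads could make progress.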
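The shape of that reorganization, sketched outside of JGroups: update the request state under the lock, release it, and only then hand the message to the stack. The class, the Transport interface and the method names below are hypothetical and are not the actual GroupRequest code.

```java
import java.util.concurrent.locks.ReentrantLock;

public class SendOutsideLockSketch {

    /** Hypothetical stand-in for the protocol stack below the request. */
    public interface Transport {
        void down(String msg);
    }

    private final ReentrantLock requestLock = new ReentrantLock(); // plays the role of lock L
    private final Transport transport;
    private boolean started;

    public SendOutsideLockSketch(Transport transport) {
        this.transport = transport;
    }

    public void execute(String msg) {
        // Touch shared request state only while holding the lock.
        requestLock.lock();
        try {
            started = true;
        } finally {
            requestLock.unlock();
        }
        // The send may block (for example in a flush), so it runs without the lock.
        transport.down(msg);
    }

    public boolean lockHeldByCurrentThread() {
        return requestLock.isHeldByCurrentThread();
    }

    public boolean isStarted() {
        requestLock.lock();
        try {
            return started;
        } finally {
            requestLock.unlock();
        }
    }
}
```

With this ordering, a thread that blocks while sending no longer keeps an incoming thread waiting on the request lock, so the incoming thread can finish and release its window.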
> Concurrent startup of many channels doesn't stabilize
> -----------------------------------------------------
>
> Key: JGRP-770
> URL: http://jira.jboss.com/jira/browse/JGRP-770
> Project: JGroups
> Issue Type: Bug
> Reporter: Bela Ban
> Assigned To: Vladimir Blagojevic
> Fix For: 2.7, 2.6.3
>
>
> [Brian Goose]
> Jgroups 2.6.2, jdk1.6u4, and OS "Linux bubba 2.6.24-16-generic #1 SMP
> Thu Apr 10 13:23:42 UTC 2008 i686 GNU/Linux"
> I have 8 machines that all join a channel when they startup. Usually,
> they eventually merge into one view with 8 machines in it, but sometimes
> I end up with one view of 7 members and one view of 1 member. It never
> merges, even after waiting for an hour or two.
> I've created a unit test that demonstrates this problem while using the
> flush-udp stack. About 20% of the time the views will never merge into
> one big view, but instead stay as two separate views, continually
> trying to merge and failing. Do I have something set up wrong here?
> Should this test work reliably?
> I'm running with these java switches:
> -ea
> -Djava.net.preferIPv4Stack=true
> -Dorg.apache.commons.logging.LogFactory=org.apache.commons.logging.impl.LogFactoryImpl
> The test class is attached.