[JBoss JIRA] (JGRP-1610) LockingService and rpc on the same cluster, tryLock() hangs

Tuesday, 26 March 2013

Bela Ban created JGRP-1610:
------------------------------

             Summary: LockingService and rpc on the same cluster, tryLock() hangs
                 Key: JGRP-1610
                 URL: https://issues.jboss.org/browse/JGRP-1610
             Project: JGroups
          Issue Type: Feature Request
            Reporter: Bela Ban
            Assignee: Bela Ban
             Fix For: 3.3
         Attachments: RpcLockingTest.java

Hi,

Yes, the sequence diagram only depicted the second part of my description. 

Anyway, I've attached a test file that reproduces the problem. 

It contains two test cases, one where the coordinator of the lock is the one who 
sends the message first, and a second case where the non-coordinator sends 
the message first.

In the first case the receiver, non-coordinator, will hang in tryLock. In the second 
case though, everything works fine. 
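
A toy model may help picture the FIFO-ordering hypothesis raised in the quoted reply below. This is a sketch under stated assumptions, not JGroups code and not a confirmed diagnosis of this issue. It assumes that messages from one sender are delivered to the receiver on a single in-order lane, and that the lock reply from the coordinator travels on that same lane as the blocking RPC. The class name, method name, and the 500 ms observation timeout are all hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only: plain java.util.concurrent, no JGroups classes.
final class FifoDeadlockSketch {

    /**
     * Simulates receiver B handling a blocking RPC from coordinator A.
     * Returns true if the RPC handler finishes within the observation timeout.
     *
     * @param replyOnSeparateLane false: the lock reply is queued on the same
     *        in-order lane as the RPC; true: it is delivered on another thread
     */
    static boolean handlerCompletes(boolean replyOnSeparateLane) {
        // One thread models in-order (FIFO) delivery of A's messages to B.
        ExecutorService fifoLaneFromA = Executors.newSingleThreadExecutor();
        // A second pool models delivery that bypasses the FIFO lane.
        ExecutorService separateLane = Executors.newCachedThreadPool();
        CompletableFuture<Boolean> lockReply = new CompletableFuture<>();
        try {
            // Message 1 from A: the RPC. Its handler "calls tryLock" and
            // blocks until the coordinator's reply has been delivered.
            Callable<Boolean> rpcHandler = () -> lockReply.get();
            Future<Boolean> rpc = fifoLaneFromA.submit(rpcHandler);

            // Message 2 from A: the lock reply (denied, since A holds the lock).
            Runnable deliverReply = () -> lockReply.complete(false);
            (replyOnSeparateLane ? separateLane : fifoLaneFromA).execute(deliverReply);

            try {
                rpc.get(500, TimeUnit.MILLISECONDS);
                return true;   // handler saw its reply and returned
            } catch (TimeoutException e) {
                return false;  // reply is stuck behind the handler that waits for it
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Interrupts the blocked handler so the simulation can exit.
            fifoLaneFromA.shutdownNow();
            separateLane.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("reply on FIFO lane, handler completes: " + handlerCompletes(false));
        System.out.println("reply on separate lane, handler completes: " + handlerCompletes(true));
    }
}
```

In this model the first call reports false, because the reply can only be delivered after the handler that is waiting for it returns, while the second reports true. Whether the real stack behaves this way depends on how the locking protocol and the RPC messages are actually ordered and flagged.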

Regards,
Daniel Olausson

On 25 March 2013 16:15, Bela Ban <belaban(a)yahoo.com> wrote:

    Hi Daniel,

    the sequence diagram differs from your description, can you submit a
    test case (e.g. copy MessageDispatcherRSVPTest and modify it), so I can
    take a look ?

    I assume your RPCs are blocking (sync) and non-OOB ? Could be a
    recursive invocation, where FIFO order (default) leads to a distributed
    deadlock.

    A test case would clarify what you want to do, and if I can reproduce
    the problem, I can fix it.

    On 3/25/13 1:54 PM, Daniel Olausson wrote:
...
 Hi,

 We are trying to use the same channel for our lockingService and
 rpcDispatcher. But we are noticing some weird behavior.

 The end result is that lock.tryLock(lockName) never returns, which it
 should always do.

 This happens when we do the following:

 On computer A, we lock the lock.
 Do an RPC to a function on computer B; this function tries to take the
 lock (lock.tryLock(lockName)), but it can't because the lock is locked.
 This is correct behavior.
 Computer A unlocks the lock.

 On computer B we now do the same procedure: we lock the lock and do an
 RPC to computer A, but this is where the strange thing happens. Computer
 A tries to take the lock by executing tryLock, but it never returns.

 Here is a sequence diagram:

http://www.websequencediagrams.com/cgi-bin/cdraw?lz=dGl0bGUgQXV0aGVudGljY...

 In this example we use the standard udp.xml with <CENTRAL_LOCK/> added
 at the top of the stack. Everything works if we use PEER_LOCK but then
 we need the messages to arrive in the same order everywhere, e.g. atomic
 broadcast.

 It also works if we use different clusters for locking and rpc, but it
 would be convenient if we could use the same cluster.

 Is it recommended to use the same channel for different services?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
