[jboss-jira] [JBoss JIRA] (JGRP-1610) LockingService and rpc on the same cluster, tryLock() hangs

Tue Mar 26 07:38:42 EDT 2013

Bela Ban created JGRP-1610:
------------------------------

             Summary: LockingService and rpc on the same cluster, tryLock() hangs
                 Key: JGRP-1610
                 URL: https://issues.jboss.org/browse/JGRP-1610
             Project: JGroups
          Issue Type: Feature Request
            Reporter: Bela Ban
            Assignee: Bela Ban
             Fix For: 3.3
         Attachments: RpcLockingTest.java

Hi,

Yes, the sequence diagram only depicted the second part of my description. 

Anyway, I've attached a test file that reproduces the problem.

It contains two test cases, one where the coordinator of the lock is the one who 
sends the message first, and a second case where the non-coordinator sends 
the message first.

In the first case the receiver (the non-coordinator) hangs in tryLock(). In the
second case, though, everything works fine.

Regards,
Daniel Olausson

On 25 March 2013 16:15, Bela Ban <belaban at yahoo.com> wrote:

    Hi Daniel,

    the sequence diagram differs from your description. Can you submit a
    test case (e.g. copy MessageDispatcherRSVPTest and modify it), so I can
    take a look?

    I assume your RPCs are blocking (sync) and non-OOB? This could be a
    recursive invocation, where FIFO order (the default) leads to a
    distributed deadlock.
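
The suspected mechanism can be illustrated with a minimal plain-Java sketch. It does not use JGroups and is not a confirmed diagnosis of this issue. It assumes that incoming messages are delivered one at a time in FIFO order, modelled here by a single-threaded executor, and that the RPC handler blocks until a lock response arrives. All names (FifoDeadlockSketch, handlerGetsLockResponse, sameQueue, separateQueues) are hypothetical. A timeout stands in for the hang so that the program terminates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical model, not JGroups code: each executor stands for one
// ordered (FIFO) delivery path for incoming messages on a node.
class FifoDeadlockSketch {

    // Runs an "RPC handler" on rpcDelivery. The handler requests a lock and
    // then blocks until the lock response, delivered on lockDelivery, arrives.
    // Returns true if the response arrived within waitMillis.
    static boolean handlerGetsLockResponse(ExecutorService rpcDelivery,
                                           ExecutorService lockDelivery,
                                           long waitMillis) throws Exception {
        CompletableFuture<Boolean> lockResponse = new CompletableFuture<>();
        Future<Boolean> handler = rpcDelivery.submit(() -> {
            // The lock response is queued on the lock delivery path.
            lockDelivery.execute(() -> lockResponse.complete(true));
            try {
                // A tryLock() without timeout would wait here indefinitely.
                return lockResponse.get(waitMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                return false;
            }
        });
        return handler.get();
    }

    // RPCs and lock responses share one FIFO path: the response is stuck
    // behind the handler that is waiting for it.
    static boolean sameQueue() throws Exception {
        ExecutorService fifo = Executors.newSingleThreadExecutor();
        try {
            return handlerGetsLockResponse(fifo, fifo, 200);
        } finally {
            fifo.shutdownNow();
        }
    }

    // RPCs and lock responses use independent delivery paths.
    static boolean separateQueues() throws Exception {
        ExecutorService rpc = Executors.newSingleThreadExecutor();
        ExecutorService lock = Executors.newSingleThreadExecutor();
        try {
            return handlerGetsLockResponse(rpc, lock, 2000);
        } finally {
            rpc.shutdownNow();
            lock.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("same FIFO queue: " + sameQueue());
        System.out.println("separate queues: " + separateQueues());
    }
}
```

In this model the shared queue case times out because the response cannot be delivered until the handler returns, while the separate queue case completes. That would be consistent with the observation that using different clusters for locking and RPC works, but whether the same thing happens inside the real protocol stack is only an assumption here.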

    A test case would clarify what you want to do, and if I can reproduce
    the problem, I can fix it.

    On 3/25/13 1:54 PM, Daniel Olausson wrote:
    > Hi,
    >
    > We are trying to use the same channel for our lockingService and
    > rpcDispatcher, but we are noticing some weird behavior.
    >
    > The end result is that lock.tryLock(lockName) never returns, which it
    > should always do.
    >
    > This happens when we do the following:
    >
    > On computer A, we lock the lock.
    > We then do an RPC to a function on computer B. This function tries to
    > take the lock (lock.tryLock(lockName)), but it can't because the lock
    > is already held. This is correct behavior.
    > Computer A unlocks the lock.
    >
    > On computer B we now do the same procedure: we lock the lock and do an
    > RPC to computer A, but this is where the strange thing happens. Computer
    > A tries to take the lock by executing tryLock, but it never returns.
    >
    > Here is a sequence diagram:
    > http://www.websequencediagrams.com/cgi-bin/cdraw?lz=dGl0bGUgQXV0aGVudGljYXRpb24gU2VxdWVuY2UKCkNoYW5uZWwgMSAtPiAABAk6IGNlbnRyYWxMb2NrLnRyeUxvY2soKQAiDS0-KwAoCTI6IHJwY0Rpc3BhdGhlci5jYWxsbWV0aG9kKCJmb28iKQBfCTIAXAwyAFAYbm90ZSByaWdodCBvZiAiAF4JIjogAIEECSBibG9ja3MgZm9yZXZlcgBWDC0-LQCBQAxmb28gcmV0dXJucwoK&s=default
    >
    >
    > In this example we use the standard udp.xml with <CENTRAL_LOCK/> added
    > at the top of the stack. Everything works if we use PEER_LOCK, but then
    > we need the messages to arrive in the same order everywhere, e.g. via
    > atomic broadcast.
    >
    > It also works if we use different clusters for locking and rpc, but it
    > would be convenient if we could use the same cluster.
    >
    >
    > Is it recommended to use the same channel for different services?
    >
