Bela Ban commented on JGRP-1610:
--------------------------------
OK, it is clear why this fails:
- A invokes a *synchronous* (blocking) RPC on B
- UNICAST delivers the RPC from A at B
- The RPC tries to acquire the lock from A (who already holds it)
- A sends back a LOCK-DENIED unicast to B
- However, because B is still processing the RPC from A, the LOCK-DENIED message
won't get processed until the RPC returns, and the RPC will only return once the
lock has been granted or denied
==> A classic deadlock !
SOLUTION:
#1 Invoke the initial RPC as ASYNC RPC, or hold a future to it
#2 JGroups sends all messages which have no ordering constraints as OOBs, e.g. LOCK-GRANTED
/ LOCK-DENIED
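The sequence above can be reproduced without JGroups in a few lines of plain Java. This is only a sketch under stated assumptions: a single-threaded executor stands in for the FIFO delivery of A's unicasts at B, a cached thread pool stands in for OOB delivery, and the names FifoDeadlockSketch, run and responseIsOob are made up for illustration. None of this is JGroups API. A 500 ms timeout replaces the indefinite hang so that the program terminates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FifoDeadlockSketch {

    // Returns true if the simulated tryLock() inside the RPC handler
    // received its response before the timeout, false otherwise.
    static boolean run(boolean responseIsOob) throws Exception {
        // Assumption: models B's ordered (FIFO) delivery of messages from A.
        ExecutorService fifoFromA = Executors.newSingleThreadExecutor();
        // Assumption: models delivery of messages that bypass ordering (OOB).
        ExecutorService oobPool = Executors.newCachedThreadPool();
        CompletableFuture<String> lockResponse = new CompletableFuture<>();
        try {
            // The RPC from A is delivered on B's FIFO thread.
            Future<Boolean> rpc = fifoFromA.submit(() -> {
                // The handler asks A for the lock; A's reply arrives at B
                // either behind the running RPC (FIFO) or on another thread (OOB).
                ExecutorService delivery = responseIsOob ? oobPool : fifoFromA;
                delivery.execute(() -> lockResponse.complete("LOCK-DENIED"));
                try {
                    // The handler blocks until the reply has been processed.
                    lockResponse.get(500, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    // The reply is stuck behind this very handler.
                    return false;
                }
            });
            return rpc.get();
        } finally {
            fifoFromA.shutdownNow();
            oobPool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("regular response delivered: " + run(false));
        System.out.println("OOB response delivered: " + run(true));
    }
}
```

With the reply queued on the same ordered executor, the handler can only time out. Routed through the separate pool, the reply is processed while the handler is still waiting, which is the effect solution #2 aims for.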
LockingService and rpc on the same cluster, tryLock() hangs
-----------------------------------------------------------
Key: JGRP-1610
URL: https://issues.jboss.org/browse/JGRP-1610
Project: JGroups
Issue Type: Feature Request
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 3.3
Attachments: RpcLockingTest.java
Hi,
Yes, the sequence diagram only depicted the second part of my description.
Anyway, I've attached a test file that reproduces the problem.
It contains two test cases, one where the coordinator of the lock is the one who
sends the message first, and a second case where the non-coordinator sends
the message first.
In the first case the receiver, non-coordinator, will hang in tryLock. In the second
case though, everything works fine.
Regards,
Daniel Olausson
On 25 March 2013 16:15, Bela Ban <belaban(a)yahoo.com> wrote:
Hi Daniel,
the sequence diagram differs from your description, can you submit a
test case (e.g. copy MessageDispatcherRSVPTest and modify it), so I can
take a look ?
I assume your RPCs are blocking (sync) and non-OOB ? Could be a
recursive invocation, where FIFO order (default) leads to a distributed
deadlock.
A test case would clarify what you want to do, and if I can reproduce
the problem, I can fix it.
On 3/25/13 1:54 PM, Daniel Olausson wrote:
> Hi,
>
> We are trying to use the same channel for our lockingService and
> rpcDispatcher. But we are noticing some weird behavior.
>
> The end result is that lock.tryLock(lockName) never returns, which it
> should always do.
>
> This happens when we do the following:
>
> On computer A, we lock the lock.
> Do an RPC to a function on computer B; this function tries to take the
> lock (lock.tryLock(lockName)), but it can't because the lock is locked.
> This is correct behavior.
> Computer A unlocks the lock.
>
> On computer B we now do the same procedure: we lock the lock and do an
> RPC to computer A, but this is where the strange thing happens. Computer
> A tries to take the lock by executing tryLock, but it never returns.
>
> Here is a sequence diagram:
>
> http://www.websequencediagrams.com/cgi-bin/cdraw?lz=dGl0bGUgQXV0aGVudGljY...
>
>
> In this example we use the standard udp.xml with <CENTRAL_LOCK/> added
> on top of the stack. Everything works if we use PEER_LOCK, but then
> we need the messages to arrive in the same order everywhere, e.g. atomic
> broadcast.
>
> It also works if we use different clusters for locking and rpc, but it
> would be convenient if we could use the same cluster.
>
>
> Is it recommended to use the same channel for different services?
>
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: