Howard Gao closed JBMESSAGING-1547.
-----------------------------------
Resolution: Done
done
Potential Dead Lock in MessageSucker.stop()
-------------------------------------------
Key: JBMESSAGING-1547
URL: https://jira.jboss.org/jira/browse/JBMESSAGING-1547
Project: JBoss Messaging
Issue Type: Bug
Components: Messaging Core Distributed Support
Affects Versions: 1.4.0.SP3.CP07, 1.4.2.GA.SP1
Reporter: Howard Gao
Assignee: Howard Gao
Fix For: 1.4.0.SP3.CP08, 1.4.4.GA
MessageSucker.stop() is synchronized on the MessageSucker instance. Inside this method,
the sucker tries to unregister itself from the localQueue:
...
localQueue.unregisterSucker(this)
...
unregisterSucker(MessageSucker) in turn requires synchronization on the 'lock' object in
MessagingQueue.
On the other hand, MessagingQueue.deliverInternal() is synchronized on the same 'lock'
object, and it calls informSuckers() while holding it. If there are any suckers,
sucker.setConsuming() is called on the sucker, and setConsuming() requires the lock on
the message sucker.
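The two call paths above can be sketched as follows. This is a simplified, hypothetical
model and not the JBoss Messaging source: the class LockOrderSketch, its nested Queue and
Sucker types, and the trace list are assumptions made for illustration. Only the method
names stop(), unregisterSucker(), deliverInternal() and setConsuming() come from the
description. The sketch runs on a single thread, so it cannot block; it only records the
order in which each path takes the two locks.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the two lock orders; not the actual JBoss Messaging classes. */
final class LockOrderSketch {
    /** Records the lock order observed on each call path. */
    static final List<String> trace = new ArrayList<>();

    static final class Queue {
        final Object lock = new Object();              // models the 'lock' object in MessagingQueue
        final List<Sucker> suckers = new ArrayList<>();

        void registerSucker(Sucker s) {
            synchronized (lock) { suckers.add(s); }
        }

        void unregisterSucker(Sucker s) {
            synchronized (lock) {                      // second lock taken on the stop() path
                if (Thread.holdsLock(s)) trace.add("sucker -> queue.lock");
                suckers.remove(s);
            }
        }

        void deliverInternal() {
            synchronized (lock) {                      // first lock taken on the delivery path
                // Stands in for informSuckers().
                for (Sucker s : suckers) s.setConsuming(true);
            }
        }
    }

    static final class Sucker {
        final Queue localQueue;
        boolean consuming;

        Sucker(Queue localQueue) { this.localQueue = localQueue; }

        synchronized void stop() {                     // first lock: the sucker's own monitor
            localQueue.unregisterSucker(this);
        }

        synchronized void setConsuming(boolean c) {    // second lock taken on the delivery path
            if (Thread.holdsLock(localQueue.lock)) trace.add("queue.lock -> sucker");
            consuming = c;
        }
    }

    /** Runs both paths one after the other on the calling thread and returns the trace. */
    static List<String> demo() {
        trace.clear();
        Queue queue = new Queue();
        Sucker sucker = new Sucker(queue);
        queue.registerSucker(sucker);
        queue.deliverInternal();
        sucker.stop();
        return new ArrayList<>(trace);
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The two recorded orders are the reverse of each other, which is the precondition for a
deadlock once the paths run on different threads.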
So a deadlock may happen when a message sucker is being stopped because its remote queue
is leaving (the hosting node left the cluster) while the corresponding localQueue is
delivering messages and trying to push them to its suckers. Take a two-node cluster as
an example:
1. A node with a distributed queue (say remoteQueue) has left the cluster, causing the
message sucker (remoteQueue-sucker) on the other node to be stopped and unregistered from
localQueue.
2. remoteQueue-sucker.stop() acquires the lock on remoteQueue-sucker and then proceeds
to localQueue.unregisterSucker(this), at which point it tries to acquire the
localQueue.lock object.
3. At that time the thread calling localQueue.deliverInternal() already holds the
localQueue.lock object, which prevents the thread of step 2 from getting the lock. The
delivering thread then proceeds to informSuckers(), where
remoteQueue-sucker.setConsuming() tries to get the lock on remoteQueue-sucker. But this
lock is held by the thread of step 2, so the two threads deadlock.
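The interleaving in steps 2 and 3 can be forced deterministically in a small standalone
program. The sketch below is an assumption-laden illustration, not a test from the JBoss
Messaging code base: the class SuckerDeadlockDemo, the two plain Object monitors and the
thread names are hypothetical stand-ins for the sucker monitor and localQueue.lock. A
latch makes each thread take its first lock before either tries the second, and the
standard ThreadMXBean is polled to confirm that the JVM reports a monitor deadlock. The
two threads are daemons so the stuck threads do not keep the JVM alive.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Hypothetical reproduction of the stop()/deliverInternal() lock cycle. */
final class SuckerDeadlockDemo {

    /** Returns true if both threads end up in a monitor deadlock within about 5 seconds. */
    static boolean reproduce() {
        final Object suckerMonitor = new Object();   // stands in for the remoteQueue-sucker monitor
        final Object queueLock = new Object();       // stands in for localQueue.lock
        final CountDownLatch firstLocksHeld = new CountDownLatch(2);

        // Step 2: stop() holds the sucker, then wants localQueue.lock.
        Thread stopper = new Thread(() -> {
            synchronized (suckerMonitor) {
                awaitBoth(firstLocksHeld);
                synchronized (queueLock) { /* unregisterSucker(this) would run here */ }
            }
        }, "sucker-stop");

        // Step 3: deliverInternal() holds localQueue.lock, then wants the sucker.
        Thread deliverer = new Thread(() -> {
            synchronized (queueLock) {
                awaitBoth(firstLocksHeld);
                synchronized (suckerMonitor) { /* setConsuming() would run here */ }
            }
        }, "queue-deliver");

        stopper.setDaemon(true);
        deliverer.setDaemon(true);
        stopper.start();
        deliverer.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.nanoTime() + 5_000_000_000L;
        while (System.nanoTime() < deadline) {
            long[] ids = mx.findMonitorDeadlockedThreads();  // null when no deadlock is found
            if (ids != null) {
                boolean sawStopper = false, sawDeliverer = false;
                for (long id : ids) {
                    if (id == stopper.getId()) sawStopper = true;
                    if (id == deliverer.getId()) sawDeliverer = true;
                }
                if (sawStopper && sawDeliverer) return true;
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    /** Signals that the caller holds its first lock and waits until the other thread does too. */
    private static void awaitBoth(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlock detected: " + reproduce());
    }
}
```

In the real system the interleaving depends on timing, which is why the issue is
described as a potential deadlock; the latch here only removes that timing dependency.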
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: