[ https://issues.jboss.org/browse/JBMESSAGING-1842?page=com.atlassian.jira.... ]
Yong Hao Gao commented on JBMESSAGING-1842:
-------------------------------------------
The general picture can be described as follows.
We have a cluster of n nodes. At some point, m of those nodes (m < n) leave the
cluster but remain alive. We should guarantee that:
1. Each of the m nodes can continue to work as a standalone server.
2. The remaining (n-m) nodes continue to work as a cluster.
3. No message is delivered twice and no message gets stuck in this situation.
4. Any of the m nodes can rejoin the cluster.
Live node dropping out of cluster can cause duplicate message delivery
----------------------------------------------------------------------
Key: JBMESSAGING-1842
URL: https://issues.jboss.org/browse/JBMESSAGING-1842
Project: JBoss Messaging
Issue Type: Bug
Components: JMS Clustering
Affects Versions: 1.4.0.SP3.CP10
Reporter: Justin Bertram
Assignee: Yong Hao Gao
When a live node is kicked out of the cluster (for whatever reason), its JBoss Messaging
ServerPeer remains active, which means the node is still able to send messages to
clients. However, when the node is kicked out, another node in the cluster performs
fail-over for it and takes ownership of its messages in the database. The "dead" node
may know nothing about this, may still believe it owns those messages, and will
therefore deliver them to clients. After delivery it tries to remove the message from
the database and can't, because it no longer owns that message. When this happens, the
"dead" node logs a WARN like this:
WARN [JDBCPersistenceManager] Failed to remove row for: Reference[23318958991900672]:RELIABLE
Of course, the node that performed the fail-over and now actually owns the message may
also deliver it to a client, so the same message can be delivered twice.
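The sequence described above can be reproduced with a small in-memory model. This is only a sketch of the race, not JBoss Messaging code: SharedStore, Node, Result and simulate() are hypothetical names invented for illustration, the map stands in for the message table in the database, and the warning text merely imitates the log line quoted above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the scenario; none of these types are JBoss Messaging classes. */
public class DuplicateDeliverySketch {

    /** Hypothetical stand-in for the shared database: message id -> owning node id. */
    public static class SharedStore {
        private final Map<Long, Integer> ownerByMessage = new HashMap<>();

        void add(long messageId, int nodeId) {
            ownerByMessage.put(messageId, nodeId);
        }

        /** Fail-over: every row owned by fromNode is reassigned to toNode. */
        void failOver(int fromNode, int toNode) {
            ownerByMessage.replaceAll((id, owner) -> owner == fromNode ? toNode : owner);
        }

        /** Deletes the row only if nodeId still owns it; returns false otherwise. */
        boolean remove(long messageId, int nodeId) {
            return ownerByMessage.remove(Long.valueOf(messageId), Integer.valueOf(nodeId));
        }
    }

    /** What the clients received and what the nodes logged. */
    public static class Result {
        public final List<Long> delivered = new ArrayList<>();
        public final List<String> warnings = new ArrayList<>();
    }

    /** A node that delivers from its in-memory view, then tries to delete the row. */
    public static class Node {
        final int id;
        final SharedStore store;
        final List<Long> inMemory = new ArrayList<>(); // what the node believes it owns

        Node(int id, SharedStore store) {
            this.id = id;
            this.store = store;
        }

        void deliverAll(Result result) {
            for (long messageId : inMemory) {
                result.delivered.add(messageId); // the client receives the message here
                if (!store.remove(messageId, id)) {
                    // Imitates the WARN in the report; the row now belongs to another node.
                    result.warnings.add("Failed to remove row for: Reference[" + messageId + "]");
                }
            }
            inMemory.clear();
        }
    }

    /** Plays the scenario: node 1 is evicted but stays alive, node 2 fails over for it. */
    public static Result simulate() {
        SharedStore store = new SharedStore();
        Node evicted = new Node(1, store);
        Node survivor = new Node(2, store);

        long messageId = 23318958991900672L;
        store.add(messageId, evicted.id);
        evicted.inMemory.add(messageId);

        // Node 1 drops out of the cluster; node 2 takes over its rows and loads them.
        store.failOver(evicted.id, survivor.id);
        survivor.inMemory.add(messageId);

        Result result = new Result();
        evicted.deliverAll(result);  // stale delivery; the delete fails
        survivor.deliverAll(result); // delivery by the new owner; the delete succeeds
        return result;
    }

    public static void main(String[] args) {
        Result result = simulate();
        System.out.println("delivered=" + result.delivered);
        System.out.println("warnings=" + result.warnings);
    }
}
```

In this model the same message id is handed to clients twice and exactly one warning is recorded, by the evicted node. The point of the sketch is that the ownership check happens only at delete time, after the client has already received the message.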