[jboss-jira] [JBoss JIRA] Updated: (JBMESSAGING-1842) Live node dropping out of cluster can cause duplicate message delivery

Justin Bertram (JIRA) jira-events at lists.jboss.org
Sat Jan 15 10:38:49 EST 2011


     [ https://issues.jboss.org/browse/JBMESSAGING-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Bertram updated JBMESSAGING-1842:
----------------------------------------

    Description: 
When a live node is kicked out of the cluster (for whatever reason), its JBoss Messaging ServerPeer remains active, so the node can still send messages to clients.  However, once the node is kicked out, another node in the cluster performs fail-over for it and takes ownership of its messages in the database.  The "dead" node knows nothing about this, believes it still owns those messages, and will therefore deliver them to clients.  After delivery it tries to remove the message from the database and can't, because it no longer owns that message.  When this happens the "dead" node issues a WARN like this:

  WARN  [JDBCPersistenceManager] Failed to remove row for: Reference[23318958991900672]:RELIABLE

Of course, the node that performed the fail-over, and now actually owns the message, may also deliver it to a client, so the same message can be delivered twice.
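
The sequence above can be illustrated with a small in-memory simulation. This is only a sketch: MessageStore, Node, and simulate() are hypothetical names invented for illustration and are not JBoss Messaging classes, and the map stands in for the shared database. It shows why the stale node's removal affects zero rows and why the client ends up with two deliveries.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StaleOwnerSketch {

    /** Hypothetical stand-in for the shared database: message id -> owning node id. */
    static final class MessageStore {
        private final Map<Long, Integer> ownerByMessage = new HashMap<>();

        void insert(long messageId, int nodeId) {
            ownerByMessage.put(messageId, nodeId);
        }

        /** Fail-over: the survivor takes ownership of every row of the failed node. */
        void takeOver(int failedNode, int survivorNode) {
            ownerByMessage.replaceAll((id, owner) -> owner == failedNode ? survivorNode : owner);
        }

        /** Removes the row only if nodeId still owns it; returns the number of rows removed. */
        int remove(long messageId, int nodeId) {
            return ownerByMessage.remove(Long.valueOf(messageId), Integer.valueOf(nodeId)) ? 1 : 0;
        }
    }

    /** Hypothetical node that delivers from its own in-memory view of what it owns. */
    static final class Node {
        final int id;
        final List<Long> inMemory = new ArrayList<>();

        Node(int id) {
            this.id = id;
        }

        /** Delivers every message held in memory, then tries to remove its row. */
        int deliverAll(MessageStore store, Map<Long, Integer> deliveries) {
            int failedRemovals = 0;
            for (long messageId : inMemory) {
                deliveries.merge(messageId, 1, Integer::sum); // the client receives the message
                if (store.remove(messageId, id) == 0) {
                    failedRemovals++; // corresponds to the "Failed to remove row" WARN
                }
            }
            inMemory.clear();
            return failedRemovals;
        }
    }

    /** Runs the scenario and returns how many times the single message reached a client. */
    static int simulate() {
        MessageStore store = new MessageStore();
        Node dead = new Node(1);
        Node survivor = new Node(2);
        long messageId = 23318958991900672L;

        store.insert(messageId, dead.id);
        dead.inMemory.add(messageId);

        // Node 1 drops out of the cluster; node 2 fails over and loads the message.
        store.takeOver(dead.id, survivor.id);
        survivor.inMemory.add(messageId);

        Map<Long, Integer> deliveries = new HashMap<>();
        int failed = dead.deliverAll(store, deliveries); // stale delivery, removal fails
        survivor.deliverAll(store, deliveries);          // legitimate delivery, removal succeeds
        System.out.println("failed removals on the dead node: " + failed);
        return deliveries.get(messageId);
    }

    public static void main(String[] args) {
        System.out.println("deliveries of the message: " + simulate());
    }
}
```

In this sketch the only signal the stale node gets is the zero-row removal, and by then the message has already been handed to the client.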

  was:
When a live node is kicked out of the cluster (for whatever reason), its JBoss Messaging ServerPeer remains active, which means the node is still available to receive messages and, if necessary, persist them to the database.  However, the CHANNEL_ID values used for the persisted messages will be invalid, since the fail-over procedure will have removed the "dead" node's channels from JBM_POSTOFFICE, and new CHANNEL_ID values will be used when the "dead" node is finally restarted.  This essentially orphans any persistent messages sent to the "dead" node after it has been kicked out of the cluster.  Recovering the orphaned messages would require manual intervention in the database, and even then it might not be possible to match the old CHANNEL_ID values with the new ones.

The simplest way to solve this problem is to add a constraint to the database that forces the inserted message's CHANNEL_ID to correspond to a CHANNEL_ID from JBM_POSTOFFICE.  Of course, such a constraint will hurt performance a bit, but it could be optional - customers could trade robustness for speed.

The node would still need to be restarted in order to rejoin the cluster appropriately, but the constraint would avoid orphaned messages.
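
The effect of such a constraint can be sketched in memory. This is a hypothetical illustration only: ChannelConstraintSketch and its methods are invented names, the set stands in for the CHANNEL_ID values present in JBM_POSTOFFICE, and a real deployment would rely on a database-level referential constraint rather than application code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ChannelConstraintSketch {

    // Stand-in for the CHANNEL_ID values currently registered in JBM_POSTOFFICE.
    private final Set<Long> postOfficeChannelIds = new HashSet<>();

    // Stand-in for persisted message rows, stored as {channelId, messageId} pairs.
    private final List<long[]> persistedMessages = new ArrayList<>();

    void registerChannel(long channelId) {
        postOfficeChannelIds.add(channelId);
    }

    /** Models fail-over removing the "dead" node's channel from the post office. */
    void removeChannel(long channelId) {
        postOfficeChannelIds.remove(channelId);
    }

    /** Rejects the insert when the channel is unknown, as a foreign-key style check would. */
    void persistMessage(long channelId, long messageId) {
        if (!postOfficeChannelIds.contains(channelId)) {
            throw new IllegalStateException(
                "CHANNEL_ID " + channelId + " is not registered; refusing to orphan message " + messageId);
        }
        persistedMessages.add(new long[] {channelId, messageId});
    }

    int persistedCount() {
        return persistedMessages.size();
    }

    public static void main(String[] args) {
        ChannelConstraintSketch db = new ChannelConstraintSketch();
        db.registerChannel(7L);
        db.persistMessage(7L, 100L); // accepted: the channel is registered

        db.removeChannel(7L);        // fail-over removes the dead node's channel
        try {
            db.persistMessage(7L, 101L);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The rejected insert surfaces the problem to the sender at write time instead of leaving an orphaned row to be discovered later.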




> Live node dropping out of cluster can cause duplicate message delivery
> ----------------------------------------------------------------------
>
>                 Key: JBMESSAGING-1842
>                 URL: https://issues.jboss.org/browse/JBMESSAGING-1842
>             Project: JBoss Messaging
>          Issue Type: Bug
>          Components: JMS Clustering
>    Affects Versions: 1.4.0.SP3.CP10
>            Reporter: Justin Bertram
>            Assignee: Yong Hao Gao
