[jboss-jira] [JBoss JIRA] Commented: (JBMESSAGING-1842) Live node dropping out of cluster can cause duplicate message delivery

Wednesday, 19 January 2011

    [ https://issues.jboss.org/browse/JBMESSAGING-1842?page=com.atlassian.jira.... ]

Yong Hao Gao commented on JBMESSAGING-1842:
-------------------------------------------

--- The Test Plan ---

I - Verify the old failover works (keepOldFailoverModel = true)

    All existing tests should pass. All examples should work.

    The JBM_CLUSTER table may be created, but no actions (timestamp updates or
state changes) should be performed against it.

II - New failover model (keepOldFailoverModel = false)

0. Verify all examples

1. Timestamp update

• timestamp refresh interval: timestamp should be updated after each interval.
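
One way to check this item is to sample the node's timestamp once per refresh interval and verify that every sample has advanced. The helper below is only a sketch of that check: the class name, method name and tolerance parameter are hypothetical and not part of JBoss Messaging, and reading the value out of JBM_CLUSTER is left to the test.

```java
import java.util.List;

// Hypothetical test helper, not part of JBoss Messaging.
final class TimestampRefreshCheck {
    private TimestampRefreshCheck() {}

    /**
     * Returns true if every gap between consecutive sampled timestamps is
     * positive (the value changed) and no larger than the refresh interval
     * plus a tolerance. At least two samples are required.
     */
    static boolean refreshedEveryInterval(List<Long> sampledMillis,
                                          long intervalMillis,
                                          long toleranceMillis) {
        if (sampledMillis.size() < 2) {
            return false;
        }
        for (int i = 1; i < sampledMillis.size(); i++) {
            long gap = sampledMillis.get(i) - sampledMillis.get(i - 1);
            if (gap <= 0 || gap > intervalMillis + toleranceMillis) {
                return false;
            }
        }
        return true;
    }
}
```

With keepOldFailoverModel = true the same samples should all be identical, so the helper would return false there, which matches the expectation in part I that the timestamp is never updated.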

2. Normal Failover

• A node shutdown or crash should cause failover to happen (if failoverOnNodeLeave is true)
• Multiple nodes shutdown and crash

3. Node leaving the cluster while still alive (use a JGroups utility to simulate)

• one node leaves the cluster: failover should not happen. The node keeps serving as a
standalone server, and the cluster quarantines it as expected.
• if the node later shuts down or crashes: failover should happen (if failoverOnNodeLeave is
true)
• if the node later rejoins the cluster: its state should change from quarantined back to
clustered.

Repeat the above 3 steps with multiple nodes leaving, shutting down or crashing, and rejoining.
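
The expected transitions in sections 2 and 3 can be written down as a small state table to drive the repeated runs. This is a sketch of the test expectations only: the state and event names are hypothetical labels chosen for illustration and are not taken from the JBoss Messaging code or the JBM_CLUSTER schema.

```java
// Hypothetical model of the expected outcomes in the test plan.
final class NodeStateModel {
    enum State { CLUSTERED, QUARANTINED, FAILED_OVER, DOWN }
    enum Event { LEAVES_ALIVE, REJOINS, SHUTDOWN_OR_CRASH }

    private NodeStateModel() {}

    /** Returns the state the plan expects after the given event. */
    static State next(State current, Event event, boolean failoverOnNodeLeave) {
        switch (event) {
            case LEAVES_ALIVE:
                // Node is still alive: quarantine it, no failover.
                return current == State.CLUSTERED ? State.QUARANTINED : current;
            case REJOINS:
                // A quarantined node that rejoins goes back to clustered.
                return current == State.QUARANTINED ? State.CLUSTERED : current;
            case SHUTDOWN_OR_CRASH:
                // Failover happens only if failoverOnNodeLeave is true.
                if (current == State.CLUSTERED || current == State.QUARANTINED) {
                    return failoverOnNodeLeave ? State.FAILED_OVER : State.DOWN;
                }
                return current;
            default:
                throw new IllegalArgumentException("Unknown event: " + event);
        }
    }
}
```

A randomized test could feed arbitrary event sequences through next() for each node and compare the result with the state observed in the cluster after each step.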

All message producers and consumers in the above test should work properly. The following
should be checked:

• if a cluster has some nodes (>1) in the quarantined state, is client-side failover
limited to the normal nodes, or may a client fail over to or from a quarantined node?
What happens if the quarantined node rejoins the cluster, or if it crashes or shuts down?

• does client-side failover work consistently in a dynamic situation where nodes
constantly leave, rejoin, shut down and crash?

...
 Live node dropping out of cluster can cause duplicate message
delivery
 ----------------------------------------------------------------------

                 Key: JBMESSAGING-1842
                 URL: https://issues.jboss.org/browse/JBMESSAGING-1842
             Project: JBoss Messaging
          Issue Type: Bug
          Components: JMS Clustering
    Affects Versions: 1.4.0.SP3.CP10
            Reporter: Justin Bertram
            Assignee: Yong Hao Gao

 When a live node is kicked out of the cluster (for whatever reason) its JBoss Messaging
ServerPeer remains active which means the node is still available to send messages to
clients.  However, when the node is kicked out of the cluster another node in the cluster
performs fail-over for that node and takes ownership of that node's messages in the
database.  The "dead" node may know nothing about this and might believe it
still owns those messages and therefore will deliver those messages to clients.  After
delivery it tries to remove the message from the database and can't (because it
doesn't actually own that message anymore).  When this happens the "dead"
node issues a WARN like this:
   WARN  [JDBCPersistenceManager] Failed to remove row for: Reference[23318958991900672]:RELIABLE
 Of course, the node which performed the fail-over and actually owns the message now may
also deliver the message to a client. 
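
The sequence described above can be reproduced with a small in-memory model. This is an illustrative sketch only: the class and its methods are hypothetical, a map stands in for the database, and nothing here reflects how JDBCPersistenceManager is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the bug; a map stands in for the database.
final class DuplicateDeliveryModel {
    private final Map<Long, Integer> ownerByMessage = new HashMap<>();
    private final Map<Long, Integer> deliveries = new HashMap<>();

    /** Persists a message as owned by the given node. */
    void store(long messageId, int nodeId) {
        ownerByMessage.put(messageId, nodeId);
    }

    /** Another node takes ownership of all messages of fromNode. */
    void failOver(int fromNode, int toNode) {
        ownerByMessage.replaceAll((id, owner) -> owner == fromNode ? toNode : owner);
    }

    /**
     * The node delivers from its own in-memory view, then tries to remove
     * the row. Returns false if the row is not owned by this node, which is
     * the situation that produces the WARN above.
     */
    boolean deliverAndRemove(long messageId, int nodeId) {
        deliveries.merge(messageId, 1, Integer::sum);
        return ownerByMessage.remove(messageId, nodeId);
    }

    /** Number of times the message has been handed to a client. */
    int deliveryCount(long messageId) {
        return deliveries.getOrDefault(messageId, 0);
    }
}
```

Storing a message for node 1, calling failOver(1, 2), and then calling deliverAndRemove for node 1 followed by node 2 leaves deliveryCount at 2: the first removal fails and the second succeeds, which is the duplicate delivery reported in this issue.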
-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
