[jboss-jira] [JBoss JIRA] (WFWIP-122) Clients do not fail over to backup even if backup started and keep retrying connection

Thu Sep 6 08:57:00 EDT 2018

     [ https://issues.jboss.org/browse/WFWIP-122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Miroslav Novak updated WFWIP-122:
---------------------------------
    Description: 
Test Scenario:
* Start 2 servers in colocated topology with shared store
* Start a producer and a consumer sending/receiving messages to/from a queue on the 1st server (the live)
* Kill the 1st server and wait for the backup on the 2nd server to start
* Clients fail over to the backup on the 2nd server; wait for them to finish
* Check that the number of sent and received messages is the same

Result:
Sometimes clients do not fail over to the backup on the 2nd server and instead keep retrying the connection. I can see that the backup on the 2nd server started and formed a cluster with the colocated live server, but the clients did not connect to it.

Clients are retrying connection:
{code}
07:32:16,676 Thread-36 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl:794] Trying reconnection attempt 163/-1
07:32:16,676 Thread-39 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector:446] Connector NettyConnector [host=rhel7-large-42748, port=5445, httpEnabled=false, httpUpgradeEnabled=false, useServlet=false, servletPath=/messaging/ActiveMQServlet, sslEnabled=false, useNio=true] using native epoll
07:32:16,676 Thread-36 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl:1080] Trying to connect with connectorFactory = org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory@6568a014, connectorConfig=TransportConfiguration(name=null, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&useNio=true&host=rhel7-large-42748&useNioGlobalWorkerPool=true&blockOnNonDurableSend=false&retryIntervalMultiplier=1-0&maxRetryInterval=2000&producerMaxRate=-1&blockOnDurableSend=true&connectionTTL=60000&compressLargeMessage=false&reconnectAttempts=-1&ignoreJTA=false&cacheLargeMessagesClient=false&scheduledThreadPoolMaxSize=5&useGlobalPools=true&callFailoverTimeout=60000&initialConnectAttempts=1&clientFailureCheckPeriod=30000&blockOnAcknowledge=true&consumerWindowSize=1048576&minLargeMessageSize=102400&finalizeChecks=false&autoGroup=false&threadPoolMaxSize=30&confirmationWindowSize=-1&transactionBatchSize=1048576&callTimeout=30000&preAcknowledge=false&enable1xPrefixes=true&cacheDestinations=false&connectionLoadBalancingPolicyClassName=org-apache-activemq-artemis-api-core-client-loadbalance-RoundRobinConnectionLoadBalancingPolicy&dupsOKBatchSize=1048576&incomingInterceptorList=&initialMessagePacketSize=1500&consumerMaxRate=-1&enableSharedClientID=true&HA=true&retryInterval=2000&factoryType=0&failoverOnInitialConnection=false&outgoingInterceptorList=&producerWindowSize=65536
07:32:16,676 Thread-39 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector:600] AMQ211002: Started EPOLL Netty Connector version 4.1.16.Final to rhel7-large-42748:5445
07:32:16,676 Thread-36 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector:446] Connector NettyConnector [host=rhel7-large-42748, port=5445, httpEnabled=false, httpUpgradeEnabled=false, useServlet=false, servletPath=/messaging/ActiveMQServlet, sslEnabled=false, useNio=true] using native epoll
07:32:16,676 Thread-38 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl:794] Trying reconnection attempt 163/-1
07:32:16,676 Thread-38 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl:1080] Trying to connect with connectorFactory = org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnectorFactory@43d7f9fb, connectorConfig=TransportConfiguration(name=null, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?port=5445&useNio=true&host=rhel7-large-42748&useNioGlobalWorkerPool=true&blockOnNonDurableSend=false&retryIntervalMultiplier=1-0&maxRetryInterval=2000&producerMaxRate=-1&blockOnDurableSend=true&connectionTTL=60000&compressLargeMessage=false&reconnectAttempts=-1&ignoreJTA=false&cacheLargeMessagesClient=false&scheduledThreadPoolMaxSize=5&useGlobalPools=true&callFailoverTimeout=60000&initialConnectAttempts=1&clientFailureCheckPeriod=30000&blockOnAcknowledge=true&consumerWindowSize=1048576&minLargeMessageSize=102400&finalizeChecks=false&autoGroup=false&threadPoolMaxSize=30&confirmationWindowSize=-1&transactionBatchSize=1048576&callTimeout=30000&preAcknowledge=false&enable1xPrefixes=true&cacheDestinations=false&connectionLoadBalancingPolicyClassName=org-apache-activemq-artemis-api-core-client-loadbalance-RoundRobinConnectionLoadBalancingPolicy&dupsOKBatchSize=1048576&incomingInterceptorList=&initialMessagePacketSize=1500&consumerMaxRate=-1&enableSharedClientID=true&HA=true&retryInterval=2000&factoryType=0&failoverOnInitialConnection=false&outgoingInterceptorList=&producerWindowSize=65536
{code}
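
The connectorConfig string in the log above is hard to scan by eye, so here is a minimal sketch (plain Java, no Artemis dependency) that splits such a parameter string and prints the settings most relevant to reconnection. The class name ConnectorConfigParams and the method parse are hypothetical helpers, not part of Artemis. In Artemis, reconnectAttempts=-1 means unlimited retries, which matches the "163/-1" counter in the log. Reading retryIntervalMultiplier=1-0 as 1.0 (with the dot replaced by a dash in the log output) is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for reading the logged connectorConfig parameters.
final class ConnectorConfigParams {

    /** Splits "?k1=v1&k2=v2" into an ordered map; empty values are kept as "". */
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        String body = query.startsWith("?") ? query.substring(1) : query;
        for (String pair : body.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "&&" or an empty input
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(key, value);
        }
        return params;
    }

    public static void main(String[] args) {
        // Shortened copy of the parameter string from the log; values are unchanged.
        String query = "?port=5445&host=rhel7-large-42748&retryIntervalMultiplier=1-0"
                + "&maxRetryInterval=2000&reconnectAttempts=-1&initialConnectAttempts=1"
                + "&HA=true&retryInterval=2000&failoverOnInitialConnection=false";
        Map<String, String> params = parse(query);
        String[] keys = {"host", "port", "HA", "reconnectAttempts",
                "retryInterval", "maxRetryInterval", "retryIntervalMultiplier"};
        for (String key : keys) {
            System.out.println(key + " = " + params.get(key));
        }
    }
}
```

Every logged attempt targets the same host and port (rhel7-large-42748:5445), so comparing these values with the backup's acceptor address is a quick first check when reading such logs.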

  was:
Test Scenario:
* Start 2 servers in colocated topology with shared store
* Start a producer and a consumer sending/receiving messages to/from a queue on the 1st server (the live)
* Kill the 1st server and wait for the backup on the 2nd server to start
* Clients fail over to the backup on the 2nd server; wait for them to finish
* Check that the number of sent and received messages is the same

Result:
Sometimes clients do not fail over to the backup on the 2nd server and instead keep retrying the connection.

> Clients do not fail over to backup even if backup started and keep retrying connection
> --------------------------------------------------------------------------------------
>
>                 Key: WFWIP-122
>                 URL: https://issues.jboss.org/browse/WFWIP-122
>             Project: WildFly WIP
>          Issue Type: Bug
>          Components: Artemis
>            Reporter: Miroslav Novak
>            Assignee: Martyn Taylor
>            Priority: Critical
>         Attachments: NettyColocatedClusterFailoverTestCase.testFailbackClientAckQueueNIO.zip
>
>
