[jboss-jira] [JBoss JIRA] (WFWIP-27) [Artemis 2.x Upgrade] Failover does not work with HTTP connectors/acceptors

Martyn Taylor (JIRA) issues at jboss.org
Tue Jul 17 09:46:01 EDT 2018


    [ https://issues.jboss.org/browse/WFWIP-27?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606330#comment-13606330 ] 

Martyn Taylor commented on WFWIP-27:
------------------------------------

[~jmesnil] I can't say for sure as I don't know why the original change was made.  I don't believe though that we should be clearing the callbacks on a stop.  We'll revert the change and run the Artemis suite to see if it uncovers the original problem.  [~brian.stansberry] Thanks again for the investigation.

> [Artemis 2.x Upgrade] Failover does not work with HTTP connectors/acceptors
> ---------------------------------------------------------------------------
>
>                 Key: WFWIP-27
>                 URL: https://issues.jboss.org/browse/WFWIP-27
>             Project: WildFly WIP
>          Issue Type: Bug
>          Components: Artemis
>            Reporter: Erich Duda
>            Assignee: Jeff Mesnil
>            Priority: Blocker
>              Labels: activemq, feature-branch-blocker
>         Attachments: clients.log, server1.log, server2.log, standalone-full-ha.xml, standalone-full-ha.xml, test.log
>
>
> Failover does not work with HTTP connectors/acceptors.
> *Scenario:*
> # There are two Wildfly servers configured as Live-Backup pair
> # There is one JMS producer and one JMS receiver which sends/receives messages
> # Live server is several times killed and restarted.
> *Expectation:* Always when the Live server is killed or restarted, clients do failover or failback.
> *Reality*: Sometimes happens that clients don't do failover.
> *Users impact:* One of basics feature of HA, failover, does not work properly.
> *Blocker* priority was set because this is regression against previous releases of Wildfly.
> *Detail description of issue*
> In the trace logs it can be seen that clients send HTTP handshake request to active backup, but the handshake fails. All checks (and logs) say that the backup is active in this time period. I tried to run the test with Netty connectors/acceptors and I didn't see this issue.
> {code:title=Backup log}
> 07:52:47,856 WARN  [org.apache.activemq.artemis.core.server] (Thread-2 (ActiveMQ-IO-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6 at 1f01f347)) AMQ222029: Could not locate page transaction 3 995, ignoring message on position PagePositionImpl [pageNr=2, messageNr=55, recordID=-1] on address=jms.queue.testQueue queue=jms.queue.testQueue
> 07:52:47,856 WARN  [org.apache.activemq.artemis.core.server] (Thread-2 (ActiveMQ-IO-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6 at 1f01f347)) AMQ222029: Could not locate page transaction 3 995, ignoring message on position PagePositionImpl [pageNr=2, messageNr=56, recordID=-1] on address=jms.queue.testQueue queue=jms.queue.testQueue
> 07:52:47,863 INFO  [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for server ActiveMQServerImpl::serverUUID=null) AMQ221005: Deleting pending large message as it was not completed: Pair[a=2147485068, b=2147485067]
> 07:52:47,863 INFO  [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for server ActiveMQServerImpl::serverUUID=null) AMQ221005: Deleting pending large message as it was not completed: Pair[a=2147485076, b=2147485075]
> 07:52:47,864 INFO  [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for server ActiveMQServerImpl::serverUUID=null) AMQ221005: Deleting pending large message as it was not completed: Pair[a=2147484450, b=2147484449]
> 07:52:47,864 INFO  [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for server ActiveMQServerImpl::serverUUID=null) AMQ221005: Deleting pending large message as it was not completed: Pair[a=2147485072, b=2147485071]
> 07:52:47,864 INFO  [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for server ActiveMQServerImpl::serverUUID=null) AMQ221005: Deleting pending large message as it was not completed: Pair[a=2147484454, b=2147484453]
> 07:52:47,866 WARN  [org.apache.activemq.artemis.core.client] (activemq-discovery-group-thread-dg-group1) AMQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=0dee81c9-2d9d-11e8-ba3f-cc3d825f79a4
> 07:52:47,867 INFO  [org.apache.activemq.artemis.core.server] (AMQ119000: Activation for server ActiveMQServerImpl::serverUUID=null) AMQ221007: Server is now live
> {code}
> {code:title=Clients log}
> 07:52:49,274 Thread-80 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector:726] Remote destination: localhost/127.0.0.1:9080
> 07:52:49,274 Thread-79 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl:1040] Connector towards NettyConnector [host=localhost, port=8080, httpEnabled=false, httpUpgradeEnabled=true, useServlet=false, servletPath=/messaging/ActiveMQServlet, sslEnabled=false, useNio=true, activemqServerName=default, httpUpgradeEndpoint=acceptor] failed
> 07:52:49,275 Thread-79 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl:1081] Trying backup config = TransportConfiguration(name=connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=9080&host=localhost
> 07:52:49,275 Thread-79 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector:447] Connector NettyConnector [host=localhost, port=9080, httpEnabled=false, httpUpgradeEnabled=true, useServlet=false, servletPath=/messaging/ActiveMQServlet, sslEnabled=false, useNio=true, activemqServerName=default, httpUpgradeEndpoint=acceptor] using native epoll
> 07:52:49,275 Thread-80 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector:797] Sending HTTP request DefaultHttpRequest(decodeResult: success, version: HTTP/1.1)
> GET  HTTP/1.1
> host: localhost
> upgrade: activemq-remoting
> connection: upgrade
> activemqServerName: default
> httpUpgradeEndpoint: acceptor
> Sec-ActiveMQRemoting-Key: QWV3bCfgh75NjWH3pZV5Ew==
> 07:52:49,275 Thread-79 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector:601] AMQ211002: Started EPOLL Netty Connector version 4.1.16.Final to localhost:9080
> 07:52:49,275 Thread-79 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector:726] Remote destination: localhost/127.0.0.1:9080
> 07:52:49,276 Thread-79 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector:797] Sending HTTP request DefaultHttpRequest(decodeResult: success, version: HTTP/1.1)
> GET  HTTP/1.1
> host: localhost
> upgrade: activemq-remoting
> connection: upgrade
> activemqServerName: default
> httpUpgradeEndpoint: acceptor
> Sec-ActiveMQRemoting-Key: P9xBwRk1eZP5QjDWjqYuIg==
> 07:52:49,276 Thread-2 (ActiveMQ-client-netty-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector$HttpUpgradeHandler:876] Received msg=DefaultHttpResponse(decodeResult: success, version: HTTP/1.1)
> HTTP/1.1 200 OK
> Connection: keep-alive
> Last-Modified: Thu, 22 Mar 2018 06:47:03 GMT
> Content-Length: 2426
> Content-Type: text/html
> Accept-Ranges: bytes
> Date: Thu, 22 Mar 2018 06:52:49 GMT
> 07:52:49,276 Thread-2 (ActiveMQ-client-netty-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector$HttpUpgradeHandler:903] AMQ214023: HTTP Handshake failed, received DefaultHttpResponse(decodeResult: success, version: HTTP/1.1)
> HTTP/1.1 200 OK
> Connection: keep-alive
> Last-Modified: Thu, 22 Mar 2018 06:47:03 GMT
> Content-Length: 2426
> Content-Type: text/html
> Accept-Ranges: bytes
> Date: Thu, 22 Mar 2018 06:52:49 GMT
> 07:52:49,276 Thread-2 (ActiveMQ-client-netty-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector$HttpUpgradeHandler:876] Received msg=DefaultHttpContent(data: PooledSlicedByteBuf(ridx: 0, widx: 829, cap: 829/829, unwrapped: PooledUnsafeDirectByteBuf(ridx: 1024, widx: 1024, cap: 1024)), decoderResult: success)
> 07:52:49,276 Thread-2 (ActiveMQ-client-netty-threads) DEBUG [org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector$HttpUpgradeHandler:903] AMQ214023: HTTP Handshake failed, received DefaultHttpContent(data: PooledSlicedByteBuf(ridx: 0, widx: 829, cap: 829/829, unwrapped: PooledUnsafeDirectByteBuf(ridx: 1024, widx: 1024, cap: 1024)), decoderResult: success)
> 07:52:49,277 Thread-80 (ActiveMQ-client-global-threads) DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl:1040] Connector towards NettyConnector [host=localhost, port=9080, httpEnabled=false, httpUpgradeEnabled=true, useServlet=false, servletPath=/messaging/ActiveMQServlet, sslEnabled=false, useNio=true, activemqServerName=default, httpUpgradeEndpoint=acceptor] failed
> {code}



--
This message was sent by Atlassian JIRA
(v7.5.0#75005)



More information about the jboss-jira mailing list