[jboss-jira] [JBoss JIRA] (WFLY-11292) Legacy EJB Client: High fail rate

tommaso borgato (Jira) issues at jboss.org
Wed Nov 7 06:00:01 EST 2018


     [ https://issues.jboss.org/browse/WFLY-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

tommaso borgato updated WFLY-11292:
-----------------------------------
    Description: 
This bug is being filed as Blocker because we are observing and elevated fail rate: roughly a thousand each run for (about 0.3%).

h2. WildFly Built from master branch on 6 Nov 2018

With this WildFly version (client org.jboss:jboss-ejb-client-legacy:3.0.2.Final-redhat-1) in a scenario with 4 clustered nodes where nodes are failed via jboss shut-down / restart: after node 1 of 4 is shut-down, a a series of errors start on the client side the yield to 1097 errors on a total of 340218 samples;

find [here|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EAP7-Clustering_JJB/view/clustering-jobs-legacy-ejb-ejbremote/job/perflab_eap-7x-failover-legacy-ejb-ejbremote-shutdown-dist-sync/8/] the complete logs;

the start of client errors is here:

{noformat}
2018/11/07 05:28:51:255 EST [INFO ][TestController] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Failing node 0 (perf18)
2018/11/07 05:28:51:270 EST [INFO ][StatsRunner] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Total: Sessions: 2000, active: 2000, samples: 5118, throughput: 511.7 samples/s, bandwidth: 0.0 MB/s, response min: 1 ms, mean: 1 ms, max: 54 ms, sampling errors: 0, unhealthy samples: 0, valid samples: 5118 (100%)
2018/11/07 05:28:51:270 EST [INFO ][StatsRunner] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - unknown-node: Sessions: 2000, active: 2000, samples: 5118, throughput: 511.7 samples/s, bandwidth: 0.0 MB/s, response min: 1 ms, mean: 1 ms, max: 54 ms, sampling errors: 0, unhealthy samples: 0, valid samples: 5118 (100%)
05:28:53,257 INFO  [org.jboss.ejb.client.remoting] (Remoting "config-based-ejb-client-endpoint" task-2) EJBCLIENT000016: Channel Channel ID c540daf7 (outbound) of Remoting connection 4ed5523c to perf18/10.16.90.54:8080 of endpoint "config-based-ejb-client-endpoint" <664ac6d5> can no longer process messages
05:28:53,277 ERROR [org.jboss.ejb.client.remoting.RemotingConnectionEJBReceiver] (Remoting "config-based-ejb-client-endpoint" task-2) Failed to open channel for context EJBReceiverContext{clientContext=org.jboss.ejb.client.EJBClientContext at 3cd51445, receiver=Remoting connection EJB receiver [connection=org.jboss.ejb.client.remoting.ConnectionPool$PooledConnection at 302066a5,channel=jboss.ejb,nodename=perf18]}
org.jboss.remoting3.NotOpenException: Cannot open new channel because close was initiated
	at org.jboss.remoting3.remote.RemoteConnectionHandler.handleOutboundChannelOpen(RemoteConnectionHandler.java:198)
	at org.jboss.remoting3.remote.RemoteConnectionHandler.open(RemoteConnectionHandler.java:335)
	at org.jboss.remoting3.ConnectionImpl.openChannel(ConnectionImpl.java:109)
	at org.jboss.ejb.client.remoting.ConnectionPool$PooledConnection.openChannel(ConnectionPool.java:292)
	at org.jboss.ejb.client.remoting.RemotingConnectionEJBReceiver.associate(RemotingConnectionEJBReceiver.java:180)
	at org.jboss.ejb.client.EJBClientContext.registerEJBReceiver(EJBClientContext.java:399)
	at org.jboss.ejb.client.EJBClientContext.registerEJBReceiver(EJBClientContext.java:349)
	at org.jboss.ejb.client.remoting.EJBClientContextConnectionReconnectHandler.reconnect(EJBClientContextConnectionReconnectHandler.java:67)
	at org.jboss.ejb.client.EJBClientContext$ReconnectAttempt.run(EJBClientContext.java:1474)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{noformat}


  was:
This bug is being filed as Blocker because we are observing and elevated fail rate: roughly a thousand each run for (about 0.3%).

h2. 7.2.0.GA.CR1

With version 7.2.0.GA.CR1 (server jboss-eap-7.2.0.GA.CR1.zip, client org.jboss:jboss-ejb-client-legacy:3.0.2.Final-redhat-1) in a scenario with 4 clustered nodes where nodes are failed via jboss shut-down / restart: after node 3 of 4 is shut-down, a a series of errors start on the client side the yield to 1282 errors on a total of 341408 samples;

find [here|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EAP7-Clustering_JJB/view/clustering-jobs-legacy-ejb-ejbremote/job/perflab_eap-7x-failover-legacy-ejb-ejbremote-shutdown-dist-sync/3/] the complete logs;

the start of client errors is here:

{noformat}
2018/11/06 11:12:27:263 EST [INFO ][TestController] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Failing node 2 (perf20)
11:12:29,078 INFO  [org.jboss.ejb.client.remoting] (Remoting "config-based-ejb-client-endpoint" task-1) EJBCLIENT000016: Channel Channel ID fb008f94 (outbound) of Remoting connection 2a00ed12 to /10.16.90.58:8080 of endpoint "config-based-ejb-client-endpoint" <4b824c61> can no longer process messages
11:12:29,114 INFO  [org.jboss.ejb.client.remoting] (Remoting "config-based-ejb-client-endpoint" task-4) EJBCLIENT000016: Channel Channel ID 8f12f2ec (outbound) of Remoting connection 13f3f7a6 to /10.16.90.72:8080 of endpoint "config-based-ejb-client-endpoint" <4b824c61> can no longer process messages
11:12:29,132 INFO  [org.jboss.ejb.client.remoting] (Remoting "config-based-ejb-client-endpoint" task-4) EJBCLIENT000017: Received server version 3 and marshalling strategies [river]
11:12:29,132 INFO  [org.jboss.ejb.client.remoting] (ejb-client-context-tasks-1-thread-22) EJBCLIENT000013: Successful version handshake completed for receiver context EJBReceiverContext{clientContext=org.jboss.ejb.client.EJBClientContext at 79a74a25, receiver=Remoting connection EJB receiver [connection=org.jboss.ejb.client.remoting.ConnectionPool$PooledConnection at 17bd8cb2,channel=jboss.ejb,nodename=perf21]} on channel Channel ID f6f6b0ba (outbound) of Remoting connection 13f3f7a6 to /10.16.90.72:8080 of endpoint "config-based-ejb-client-endpoint" <4b824c61>
11:12:29,164 INFO  [org.jboss.ejb.client.remoting] (Remoting "config-based-ejb-client-endpoint" task-3) EJBCLIENT000016: Channel Channel ID f6f6b0ba (outbound) of Remoting connection 13f3f7a6 to /10.16.90.72:8080 of endpoint "config-based-ejb-client-endpoint" <4b824c61> can no longer process messages
2018/11/06 11:12:29:180 EST [ERROR][Runner - 1922] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Error getting response. <java.lang.IllegalStateException: EJBCLIENT000032: Cannot retry a request which hasn't previously been completed>
        
2018/11/06 11:12:29:180 EST [ERROR][Runner - 494] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Error getting response. <java.lang.IllegalStateException: EJBCLIENT000032: Cannot retry a request which hasn't previously been completed>
        
2018/11/06 11:12:29:180 EST [ERROR][Runner - 969] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Error getting response. <java.lang.IllegalStateException: EJBCLIENT000032: Cannot retry a request which hasn't previously been completed>
        
2018/11/06 11:12:29:181 EST [INFO ][Runner - 494] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Reached treshold of 1 errors, resetting session.
2018/11/06 11:12:29:180 EST [ERROR][Runner - 1287] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Error getting response. <java.lang.IllegalStateException: EJBCLIENT000032: Cannot retry a request which hasn't previously been completed>
        
2018/11/06 11:12:29:180 EST [ERROR][Runner - 1761] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Error getting response. <javax.ejb.EJBException: java.io.IOException: Channel Channel ID f6f6b0ba (outbound) of Remoting connection 13f3f7a6 to /10.16.90.72:8080 of endpoint "config-based-ejb-client-endpoint" <4b824c61> has been closed>
        
2018/11/06 11:12:29:181 EST [INFO ][Runner - 1287] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Reached treshold of 1 errors, resetting session.
2018/11/06 11:12:29:181 EST [INFO ][Runner - 969] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Reached treshold of 1 errors, resetting session.
2018/11/06 11:12:29:180 EST [INFO ][Runner - 1922] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Reached treshold of 1 errors, resetting session.
2018/11/06 11:12:29:181 EST [WARN ][Runner - 494] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Error sampling data:  <org.jboss.smartfrog.loaddriver.RequestProcessingException: Could not get valid response. Runner: 494.>
        
2018/11/06 11:12:29:189 EST [INFO ][Runner - 1761] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Reached treshold of 1 errors, resetting session.
11:12:29,181 INFO  [org.jboss.ejb.client.remoting] (Remoting "config-based-ejb-client-endpoint" task-1) EJBCLIENT000017: Received server version 3 and marshalling strategies [river]
11:12:29,199 INFO  [org.jboss.ejb.client.remoting] (ejb-client-context-tasks-1-thread-22) EJBCLIENT000013: Successful version handshake completed for receiver context EJBReceiverContext{clientContext=org.jboss.ejb.client.EJBClientContext at 79a74a25, receiver=Remoting connection EJB receiver [connection=org.jboss.ejb.client.remoting.ConnectionPool$PooledConnection at 17bd8cb2,channel=jboss.ejb,nodename=perf21]} on channel Channel ID beeb14c5 (outbound) of Remoting connection 13f3f7a6 to /10.16.90.72:8080 of endpoint "config-based-ejb-client-endpoint" <4b824c61>
2018/11/06 11:12:29:190 EST [WARN ][Runner - 494] SFCORE_LOG - Error sampling data:  <org.jboss.smartfrog.loaddriver.RequestProcessingException: Could not get valid response. Runner: 494.>
        org.jboss.smartfrog.loaddriver.RequestProcessingException: Could not get valid response. Runner: 494.
	at org.jboss.smartfrog.clustering.ejb3.StatefulSBProcessorFactoryImplLegacyClient$EJB3RequestProcessor.processRequest(StatefulSBProcessorFactoryImplLegacyClient.java:109)
	at org.jboss.smartfrog.loaddriver.CompoundRequestProcessorFactoryImpl$CompoundRequestProcessor.processRequest(CompoundRequestProcessorFactoryImpl.java:52)
	at org.jboss.smartfrog.loaddriver.Runner.run(Runner.java:103)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: EJBCLIENT000032: Cannot retry a request which hasn't previously been completed
	at org.jboss.ejb.client.EJBClientInvocationContext.retryRequest(EJBClientInvocationContext.java:203)
	at org.jboss.ejb.client.EJBInvocationHandler.sendRequestWithPossibleRetries(EJBInvocationHandler.java:260)
	at org.jboss.ejb.client.EJBInvocationHandler.sendRequestWithPossibleRetries(EJBInvocationHandler.java:279)
	at org.jboss.ejb.client.EJBInvocationHandler.doInvoke(EJBInvocationHandler.java:202)
	at org.jboss.ejb.client.EJBInvocationHandler.doInvoke(EJBInvocationHandler.java:185)
	at org.jboss.ejb.client.EJBInvocationHandler.invoke(EJBInvocationHandler.java:148)
	at com.sun.proxy.$Proxy2.getSerialAndIncrement(Unknown Source)
	at org.jboss.smartfrog.clustering.ejb3.StatefulSBProcessorFactoryImplLegacyClient$EJB3RequestProcessor.processRequest(StatefulSBProcessorFactoryImplLegacyClient.java:79)
	... 3 more
{noformat}

h2. 7.2.0.CD14.CR2

With version 7.2.0.CD14.CR2 (server jboss-eap-7.2.0.CD14.CR2.zip, client org.jboss:jboss-ejb-client-legacy:3.0.2.Final-redhat-1) in a scenario with 4 clustered nodes where nodes are failed via jboss shut-down / restart: after node 1 of 4 is shut-down, a a series of errors start on the client side the yield to 1307 errors on a total of 340159 samples;

find [here|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EAP7-Clustering_JJB/view/clustering-jobs-legacy-ejb-ejbremote/job/perflab_eap-7x-failover-legacy-ejb-ejbremote-shutdown-dist-sync/7/] the complete logs;

the start of client errors is here:

{noformat}
2018/11/07 04:00:17:108 EST [INFO ][TestController] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Failing node 0 (perf18)
2018/11/07 04:00:17:116 EST [INFO ][StatsRunner] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Total: Sessions: 2000, active: 2000, samples: 5120, throughput: 511.9 samples/s, bandwidth: 0.0 MB/s, response min: 1 ms, mean: 1 ms, max: 29 ms, sampling errors: 0, unhealthy samples: 0, valid samples: 5120 (100%)
2018/11/07 04:00:17:117 EST [INFO ][StatsRunner] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - unknown-node: Sessions: 2000, active: 2000, samples: 5120, throughput: 511.9 samples/s, bandwidth: 0.0 MB/s, response min: 1 ms, mean: 1 ms, max: 29 ms, sampling errors: 0, unhealthy samples: 0, valid samples: 5120 (100%)
04:00:19,053 INFO  [org.jboss.ejb.client.remoting] (Remoting "config-based-ejb-client-endpoint" task-3) EJBCLIENT000016: Channel Channel ID 9687d0a2 (outbound) of Remoting connection 35bf3cbf to perf18/10.16.90.54:8080 of endpoint "config-based-ejb-client-endpoint" <e612b73> can no longer process messages
04:00:19,074 ERROR [org.jboss.ejb.client.remoting.RemotingConnectionEJBReceiver] (Remoting "config-based-ejb-client-endpoint" task-3) Failed to open channel for context EJBReceiverContext{clientContext=org.jboss.ejb.client.EJBClientContext at 6d037274, receiver=Remoting connection EJB receiver [connection=org.jboss.ejb.client.remoting.ConnectionPool$PooledConnection at 7ac140b5,channel=jboss.ejb,nodename=perf18]}
org.jboss.remoting3.NotOpenException: Cannot open new channel because close was initiated
	at org.jboss.remoting3.remote.RemoteConnectionHandler.handleOutboundChannelOpen(RemoteConnectionHandler.java:198)
	at org.jboss.remoting3.remote.RemoteConnectionHandler.open(RemoteConnectionHandler.java:335)
	at org.jboss.remoting3.ConnectionImpl.openChannel(ConnectionImpl.java:109)
	at org.jboss.ejb.client.remoting.ConnectionPool$PooledConnection.openChannel(ConnectionPool.java:292)
	at org.jboss.ejb.client.remoting.RemotingConnectionEJBReceiver.associate(RemotingConnectionEJBReceiver.java:180)
	at org.jboss.ejb.client.EJBClientContext.registerEJBReceiver(EJBClientContext.java:399)
	at org.jboss.ejb.client.EJBClientContext.registerEJBReceiver(EJBClientContext.java:349)
	at org.jboss.ejb.client.remoting.EJBClientContextConnectionReconnectHandler.reconnect(EJBClientContextConnectionReconnectHandler.java:67)
	at org.jboss.ejb.client.EJBClientContext$ReconnectAttempt.run(EJBClientContext.java:1474)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{noformat}




> Legacy EJB Client: High fail rate
> ---------------------------------
>
>                 Key: WFLY-11292
>                 URL: https://issues.jboss.org/browse/WFLY-11292
>             Project: WildFly
>          Issue Type: Bug
>          Components: Clustering, EJB
>    Affects Versions: 15.0.0.Alpha1
>            Reporter: tommaso borgato
>            Assignee: Flavia Rainone
>            Priority: Blocker
>
> This bug is being filed as Blocker because we are observing and elevated fail rate: roughly a thousand each run for (about 0.3%).
> h2. WildFly Built from master branch on 6 Nov 2018
> With this WildFly version (client org.jboss:jboss-ejb-client-legacy:3.0.2.Final-redhat-1) in a scenario with 4 clustered nodes where nodes are failed via jboss shut-down / restart: after node 1 of 4 is shut-down, a a series of errors start on the client side the yield to 1097 errors on a total of 340218 samples;
> find [here|https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/view/EAP7/view/EAP7-Clustering_JJB/view/clustering-jobs-legacy-ejb-ejbremote/job/perflab_eap-7x-failover-legacy-ejb-ejbremote-shutdown-dist-sync/8/] the complete logs;
> the start of client errors is here:
> {noformat}
> 2018/11/07 05:28:51:255 EST [INFO ][TestController] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Failing node 0 (perf18)
> 2018/11/07 05:28:51:270 EST [INFO ][StatsRunner] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Total: Sessions: 2000, active: 2000, samples: 5118, throughput: 511.7 samples/s, bandwidth: 0.0 MB/s, response min: 1 ms, mean: 1 ms, max: 54 ms, sampling errors: 0, unhealthy samples: 0, valid samples: 5118 (100%)
> 2018/11/07 05:28:51:270 EST [INFO ][StatsRunner] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - unknown-node: Sessions: 2000, active: 2000, samples: 5118, throughput: 511.7 samples/s, bandwidth: 0.0 MB/s, response min: 1 ms, mean: 1 ms, max: 54 ms, sampling errors: 0, unhealthy samples: 0, valid samples: 5118 (100%)
> 05:28:53,257 INFO  [org.jboss.ejb.client.remoting] (Remoting "config-based-ejb-client-endpoint" task-2) EJBCLIENT000016: Channel Channel ID c540daf7 (outbound) of Remoting connection 4ed5523c to perf18/10.16.90.54:8080 of endpoint "config-based-ejb-client-endpoint" <664ac6d5> can no longer process messages
> 05:28:53,277 ERROR [org.jboss.ejb.client.remoting.RemotingConnectionEJBReceiver] (Remoting "config-based-ejb-client-endpoint" task-2) Failed to open channel for context EJBReceiverContext{clientContext=org.jboss.ejb.client.EJBClientContext at 3cd51445, receiver=Remoting connection EJB receiver [connection=org.jboss.ejb.client.remoting.ConnectionPool$PooledConnection at 302066a5,channel=jboss.ejb,nodename=perf18]}
> org.jboss.remoting3.NotOpenException: Cannot open new channel because close was initiated
> 	at org.jboss.remoting3.remote.RemoteConnectionHandler.handleOutboundChannelOpen(RemoteConnectionHandler.java:198)
> 	at org.jboss.remoting3.remote.RemoteConnectionHandler.open(RemoteConnectionHandler.java:335)
> 	at org.jboss.remoting3.ConnectionImpl.openChannel(ConnectionImpl.java:109)
> 	at org.jboss.ejb.client.remoting.ConnectionPool$PooledConnection.openChannel(ConnectionPool.java:292)
> 	at org.jboss.ejb.client.remoting.RemotingConnectionEJBReceiver.associate(RemotingConnectionEJBReceiver.java:180)
> 	at org.jboss.ejb.client.EJBClientContext.registerEJBReceiver(EJBClientContext.java:399)
> 	at org.jboss.ejb.client.EJBClientContext.registerEJBReceiver(EJBClientContext.java:349)
> 	at org.jboss.ejb.client.remoting.EJBClientContextConnectionReconnectHandler.reconnect(EJBClientContextConnectionReconnectHandler.java:67)
> 	at org.jboss.ejb.client.EJBClientContext$ReconnectAttempt.run(EJBClientContext.java:1474)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> {noformat}



--
This message was sent by Atlassian Jira
(v7.12.1#712002)


More information about the jboss-jira mailing list