[jboss-jira] [JBoss JIRA] (WFLY-5304) NullPointerException during server graceful shutdown after EJBCLient invocation arrived

Tue Mar 1 15:08:00 EST 2016

    [ https://issues.jboss.org/browse/WFLY-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170383#comment-13170383 ] 

Richard Achmatowicz edited comment on WFLY-5304 at 3/1/16 3:07 PM:
-------------------------------------------------------------------

Briefly, the DeploymentRepository holds a map of deployed modules which is updated on deploy and undeploy. This information is used to lookup EJBs which are referenced in invocations which arrive. On the surface, it would look as though DeployableModules.stop() had already been called by the time this invocation arrived.  The NPE is occurring because an invocation has arrived after DeploymentRepository.stop() has been called, and this should happen only after the graceful shutdown mechanism has decided it is safe to shutdown.

In the case of EJBs, clean shutdown operates via EJB interceptors to prevent newly arriving EJB invocations from being processed. However, before the invocation arrives at the interceptor, there is an EJB lookup which occurs in the EJBClient protocol code (VersionOneProtocolChannelReceiver) and this is where the NPE is occurring. 

In theory, when an EJB application is undeployed, a message gets sent to the EJBClient informing it of this fact, and the EJBClient code should try to find a new EJBReceiver to send the invocation to on the client side. In the case of a clean shutdown, such a message is sent out for all EJB applications on the server and so should stop sending requests to the server which is shutting down. However, due to timing between the notifications being sent and the deployment being undeployed on the server side and the notifications being processed on the client side, some invocations could still be sent to the server post shutdown.

In any case, if an invocation arrives at a server and the DeploymentRepository has been stopped, the server should return an EJBNotAvailableException to the client, as it really is unavailable.

was (Author: rachmato):
Briefly, the DeploymentRepository holds a map of deployed modules which is updated on deploy and undeploy. This information is used to lookup EJBs which are referenced in invocations which arrive. On the surface, it would look as though DeployableModules.stop() had already been called by the time this invocation arrived.  The NPE is occurring because an invocation has arrived after DeploymentRepository.stop() has been called, and this should happen only after the graceful shutdown mechanism has decided it is safe to shutdown.

In the case of EJBs, clean shutdown operates via EJB interceptors to prevent newly arriving EJB invocations from being processed. However, before the invocation arrives at the interceptor, there is an EJB lookup which occurs in the EJBClient protocol code (VersionOneProtocolChannelReceiver) and this is where the NPE is occurring. 

In theory, when an EJB application is undeployed, a message gets sent to the EJBClient informing it of this fact, and the EJBClient code should try to find a new EJBReceiver to send the invocation to on the client side. In the case of a clean shutdown, such a message is sent out for all EJB applications on the server and so should stop sending requests to the server which is shutting down. 

In any case, if an invocation arrives at a server and the DeploymentRepository has been stopped, the server should return an EJBNotAvailableException to the client, as it really is unavailable.

> NullPointerException during server graceful shutdown after EJBCLient invocation arrived
> ---------------------------------------------------------------------------------------
>
>                 Key: WFLY-5304
>                 URL: https://issues.jboss.org/browse/WFLY-5304
>             Project: WildFly
>          Issue Type: Bug
>          Components: Clustering, EJB, Remoting
>    Affects Versions: 10.0.0.Beta1
>            Reporter: Michal Vinkler
>            Assignee: Richard Achmatowicz
>            Priority: Minor
>
> EAP 7.0.0.DR9
> Scenario: ejb-ejbremote-shutdown-repl-async
> Perf21 logged NPE just before it gracefully shut down:
> {code}
> [JBossINF] [0m[31m09:02:25,058 ERROR [org.jboss.as.ejb3.remote] (default task-24) WFLYEJB0148: Exception on channel Channel ID 423b1c54 (inbound) of Remoting connection 41f997cb to /10.16.90.52:54492 from message org.jboss.remoting3.remote.InboundMessage$3 at 45c60b2c: java.lang.NullPointerException
> [JBossINF] 	at org.jboss.as.ejb3.deployment.DeploymentRepository.getStartedModules(DeploymentRepository.java:152)
> [JBossINF] 	at org.jboss.as.ejb3.remote.protocol.versionone.EJBIdentifierBasedMessageHandler.findEJB(EJBIdentifierBasedMessageHandler.java:46)
> [JBossINF] 	at org.jboss.as.ejb3.remote.protocol.versionone.MethodInvocationMessageHandler.processMessage(MethodInvocationMessageHandler.java:124)
> [JBossINF] 	at org.jboss.as.ejb3.remote.protocol.versionone.VersionOneProtocolChannelReceiver.processMessage(VersionOneProtocolChannelReceiver.java:213)
> [JBossINF] 	at org.jboss.as.ejb3.remote.protocol.versiontwo.VersionTwoProtocolChannelReceiver.processMessage(VersionTwoProtocolChannelReceiver.java:76)
> [JBossINF] 	at org.jboss.as.ejb3.remote.protocol.versionone.VersionOneProtocolChannelReceiver.handleMessage(VersionOneProtocolChannelReceiver.java:157)
> [JBossINF] 	at org.jboss.remoting3.remote.RemoteConnectionChannel$5.run(RemoteConnectionChannel.java:463)
> [JBossINF] 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [JBossINF] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [JBossINF] 	at java.lang.Thread.run(Thread.java:745)
> [JBossINF] 
> [JBossINF] [0m[0m09:02:25,104 INFO  [org.infinispan.eviction.impl.PassivationManagerImpl] (ServerService Thread Pool -- 81) ISPN000029: Passivating all entries to disk
> [JBossINF] [0m[0m09:02:25,108 INFO  [org.infinispan.eviction.impl.PassivationManagerImpl] (ServerService Thread Pool -- 81) ISPN000030: Passivated 3 entries in 3 milliseconds
> [JBossINF] [0m[0m09:02:25,131 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-7) WFLYUT0008: Undertow HTTP listener default suspending
> [JBossINF] [0m[0m09:02:25,132 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 81) WFLYCLINF0003: Stopped repl cache from ejb container
> [JBossINF] [0m[0m09:02:25,148 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-4) WFLYUT0008: Undertow AJP listener ajp suspending
> [JBossINF] [0m[0m09:02:25,151 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-7) WFLYUT0007: Undertow HTTP listener default stopped, was bound to perf21/10.16.90.60:8080
> [JBossINF] [0m[0m09:02:25,156 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-5) WFLYSRV0208: Stopped subdeployment (runtime-name: clusterbench-ee7-ejb.jar) in 1024ms
> [JBossINF] [0m[0m09:02:25,156 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-2) WFLYSRV0208: Stopped subdeployment (runtime-name: clusterbench-ee7-web-passivating.war) in 1055ms
> [JBossINF] [0m[0m09:02:25,156 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-3) WFLYSRV0208: Stopped subdeployment (runtime-name: clusterbench-ee7-web-default.war) in 1041ms
> [JBossINF] [0m[0m09:02:25,167 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000080: Disconnecting JGroups channel ejb
> [JBossINF] [0m[0m09:02:25,169 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-4) WFLYUT0007: Undertow AJP listener ajp stopped, was bound to perf21/10.16.90.60:8009
> [JBossINF] [0m[0m09:02:25,169 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-5) WFLYSRV0028: Stopped deployment clusterbench-ee7.ear (runtime-name: clusterbench-ee7.ear) in 1054ms
> [JBossINF] [0m[0m09:02:25,170 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (MSC service thread 1-1) ISPN000082: Stopping the RpcDispatcher for channel ejb
> [JBossINF] [0m[0m09:02:25,171 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-3) WFLYUT0004: Undertow 1.3.0.Beta9 stopping
> [JBossINF] [0m[0m09:02:25,275 INFO  [org.jboss.as] (MSC service thread 1-2) WFLYSRV0050: EAP 7.0.0.Alpha1 (WildFly Core 2.0.0.Beta4) stopped in 1125ms
> {code}
> Graceful shutdown timeout was set to 300 seconds. 
> It does not seem to generate any kind of error on the client.
> The similar issue was already mentioned by Richard Achmatowicz here:
> https://issues.jboss.org/browse/WFLY-4697?focusedCommentId=13076866&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13076866
> Server log:
> http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-7x-failover-ejb-ejbremote-shutdown-repl-async/6/console-perf21/

--
This message was sent by Atlassian JIRA
(v6.4.11#64026)