[Red Hat JIRA] (WFLY-14284) WildFly doesn't stop while waiting for PeriodicRecovery
by Ondrej Chaloupka (Jira)
[ https://issues.redhat.com/browse/WFLY-14284?page=com.atlassian.jira.plugi... ]
Ondrej Chaloupka edited comment on WFLY-14284 at 1/22/21 7:00 AM:
------------------------------------------------------------------
Hi [~adrianots], I'm looking at your issue and I have a few points.
The reason you see different behaviour between WildFly 10 and WildFly 20 is that a new component, the WildFly Transaction Client (https://github.com/wildfly/wildfly-transaction-client), was added between these two releases, and it refactored the way remote EJB calls are made.
Looking at the stack trace, the recovery processing is stuck in the remote {{recover()}} call[1]. The recovery manager is being asked to stop[2], but it cannot stop while a recovery scan is in progress. So it waits, and the app container in turn waits for the {{recover()}} call to finish.
The HTTP client (WFLY 20.0.1/wildfly-http-transaction-client 1.0.21.Final) waits endlessly to read the remote data on recover (https://github.com/wildfly/wildfly-http-client/blob/1.0.21.Final/transact...).
[~flavia.rainone] [~tomekadamski] Could it be that {{HttpRemoteTransactionPeer#recover()}} does not handle the error case where no data, or some exception data, is sent back instead of an {{Xid}}?
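To illustrate the kind of blocking I mean, here is a minimal sketch in plain Java. It is not the actual {{HttpRemoteTransactionPeer}} code; the class name {{BoundedRecoverSketch}}, the method {{awaitRecoverResult}} and the use of strings in place of {{Xid}} objects are assumptions made only for the example. It shows that {{CompletableFuture.get()}} without a timeout never returns when the peer stays silent, while a bounded wait lets the caller report the problem and move on.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedRecoverSketch {

    /**
     * Waits for the remote answer for at most timeoutMillis.
     * Returns the received identifiers, or Optional.empty() when the peer
     * answers with an error or does not answer in time. A real XAResource
     * would report that case to the transaction manager as an XAException.
     */
    public static Optional<List<String>> awaitRecoverResult(
            CompletableFuture<List<String>> remoteAnswer, long timeoutMillis) {
        try {
            // Bounded wait: unlike get() with no arguments, this cannot park forever.
            return Optional.of(remoteAnswer.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // The peer never replied; give up on this answer.
            remoteAnswer.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            // The peer replied with an error instead of data.
            return Optional.empty();
        } catch (InterruptedException e) {
            // Restore the interrupt flag so a shutdown request is not lost.
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A peer that never answers: calling get() without a timeout here would block forever.
        CompletableFuture<List<String>> silentPeer = new CompletableFuture<>();
        System.out.println(awaitRecoverResult(silentPeer, 200).isPresent());

        // A peer that answers normally.
        CompletableFuture<List<String>> healthyPeer =
                CompletableFuture.completedFuture(List.of("xid-1"));
        System.out.println(awaitRecoverResult(healthyPeer, 200).isPresent());
    }
}
```

With a bounded wait like this, the scan that the "Periodic Recovery" thread is running could finish, and the suspend call in [2] would no longer wait indefinitely.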
[~adrianots] The configuration (CLI script) configures the {{http-remoting}} protocol. I think this is what was used in older versions of WildFly and what can be used for interoperability purposes. The new remoting library added a new protocol that runs over HTTP upgrade, {{remote+http}}
(i.e. {{.../subsystem=remoting/remote-outbound-connection=remote-workflow-connection:add(outbound-socket-binding-ref=remote-workflow, protocol=remote+http, ...}})
The handler for this should then be https://github.com/wildfly/wildfly-transaction-client/blob/1.1.11.Final/s..., which may be able to handle the call correctly (i.e. probably by reporting the error and letting the recovery finish). This requires that both of your servers are WFLY 20 (or a similar version that handles 'remote+http', https://access.redhat.com/documentation/en-us/red_hat_jboss_enterprise_ap...).
The question is why the authentication of the remote recovery fails. As you make the call to the same server on the same machine, the $local authentication (https://docs.wildfly.org/22/WildFly_Elytron_Security.html#out-of-the-box-...) is probably disabled. That is done for a reason, but I would just mention that with it enabled the call could work. If it works, then the trouble with recovery of remote EJB calls is that the security configuration has to be declared globally. There is a related issue on this: https://issues.redhat.com/browse/WFCORE-4668. I think you will need to set {{-Dwildfly.config.url}} to point to a {{wildfly-config.xml}} to get authentication of recover working.
The second question is why you make a remote EJB call from the server to the same server, i.e. both applications are on the same server. Why don't you just use a local EJB lookup (with {{@EJB}} injection or a {{java:app}} lookup)? I assume you look up the EJB with {{ejb:/}}, right?
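To make the difference between the two lookups concrete, here is a small sketch that only builds the JNDI name strings. The helper class {{EjbLookupNames}} and the module, bean and interface names are hypothetical and not taken from your deployment; the name patterns themselves are the portable {{java:app}} form and the {{ejb:}} form used by the WildFly EJB client.

```java
public class EjbLookupNames {

    /** Portable name for a bean in another module of the same application (local call). */
    public static String localAppName(String module, String bean, String viewInterface) {
        return "java:app/" + module + "/" + bean + "!" + viewInterface;
    }

    /** EJB client name; app and distinct name are empty strings when they are not used. */
    public static String remoteEjbName(String app, String module, String distinct,
                                       String bean, String viewInterface) {
        return "ejb:" + app + "/" + module + "/" + distinct + "/" + bean + "!" + viewInterface;
    }

    public static void main(String[] args) {
        // Hypothetical module, bean and interface names, for illustration only.
        System.out.println(localAppName("workflow-ejb", "WorkflowBean", "com.example.Workflow"));
        System.out.println(remoteEjbName("", "workflow-ejb", "", "WorkflowBean", "com.example.Workflow"));
    }
}
```

A name of the first kind is resolved inside the server without the EJB client, while the second kind goes through the EJB client and its configured receivers.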
I'm not sure here, but it could be that {{remote+http}} is able to search the local context first and skip the remote call, which would help if both applications are on the same server. That depends on your EJB client descriptor (https://docs.wildfly.org/22/Client_Guide.html#jboss-ejb-client); {{exclude-local-receiver}} should *not* be set to true.
The trouble with the remote invocation here is that the recover calls go over a remote call as well. Maybe it even forces the transaction manager to use 2PC processing. If the EJB inserts some data into a database, then makes a remote EJB call to a different server, and the transaction is propagated, this remote call is considered a transaction participant.
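As a rough illustration of why an extra participant matters, here is a toy coordinator in plain Java. It is not how Narayana is implemented; the class {{CommitProtocolSketch}} and the participant names are made up for the example. It only shows the general rule that a single participant can be committed in one phase, while two or more need a prepare round first.

```java
import java.util.ArrayList;
import java.util.List;

public class CommitProtocolSketch {

    /** Returns the calls a toy coordinator would make to commit the given participants. */
    public static List<String> commitSteps(List<String> participants) {
        List<String> steps = new ArrayList<>();
        if (participants.size() == 1) {
            // One participant: commit directly, with no separate prepare phase.
            steps.add("commit(onePhase=true) " + participants.get(0));
            return steps;
        }
        // Several participants: every one must vote in prepare before any commit.
        for (String participant : participants) {
            steps.add("prepare " + participant);
        }
        for (String participant : participants) {
            steps.add("commit(onePhase=false) " + participant);
        }
        return steps;
    }

    public static void main(String[] args) {
        // Only the database is enlisted.
        System.out.println(commitSteps(List.of("database")));
        // The propagated remote EJB call adds a second participant.
        System.out.println(commitSteps(List.of("database", "remote-ejb-server")));
    }
}
```

Once a prepare round exists, a prepared participant may have to be resolved later by recovery, which is where the remote {{recover()}} call comes into play.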
Another possibility: if you know the transaction does not need to be propagated over the EJB call, you can use the {{@ClientTransaction}} annotation (https://access.redhat.com/documentation/en-us/red_hat_jboss_enterprise_ap...).
[1]
{code}
"Periodic Recovery" - Thread t@115
java.lang.Thread.State: WAITING
at java.base@11.0.8/jdk.internal.misc.Unsafe.park(Native Method)
- parking to wait for <6a00decd> (a java.util.concurrent.CompletableFuture$Signaller)
at java.base@11.0.8/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.base@11.0.8/java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1796)
at java.base@11.0.8/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3128)
at java.base@11.0.8/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1823)
at java.base@11.0.8/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
at org.wildfly.http-client.transaction@1.0.21.Final//org.wildfly.httpclient.transaction.HttpRemoteTransactionPeer.recover(HttpRemoteTransactionPeer.java:98)
at org.wildfly.transaction.client@1.1.11.Final//org.wildfly.transaction.client.SubordinateXAResource.recover(SubordinateXAResource.java:213)
at org.wildfly.transaction.client@1.1.11.Final//org.wildfly.transaction.client.SubordinateXAResource.recover(SubordinateXAResource.java:209)
at org.jboss.jts//com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.xaRecoveryFirstPass(XARecoveryModule.java:659)
at org.jboss.jts//com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkFirstPass(XARecoveryModule.java:240)
- locked <45b75a72> (a com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule)
at org.jboss.jts//com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkFirstPass(XARecoveryModule.java:182)
- locked <45b75a72> (a com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule)
at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:770)
at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:382)
{code}
[2]
{code}
"ServerService Thread Pool -- 37" - Thread t@65
java.lang.Thread.State: WAITING
at java.base@11.0.8/java.lang.Object.wait(Native Method)
- waiting on <15eaeba8> (a java.lang.Object)
at java.base@11.0.8/java.lang.Object.wait(Object.java:328)
at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.suspendScan(PeriodicRecovery.java:247)
at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.RecoveryManagerImple.trySuspendScan(RecoveryManagerImple.java:192)
at org.jboss.jts//com.arjuna.ats.arjuna.recovery.RecoveryManager.trySuspend(RecoveryManager.java:265)
at org.jboss.jts//com.arjuna.ats.arjuna.recovery.RecoveryManager.suspend(RecoveryManager.java:259)
at org.jboss.jts.integration//com.arjuna.ats.jbossatx.jta.RecoveryManagerService.suspend(RecoveryManagerService.java:79)
at org.jboss.as.transactions@20.0.1.Final//org.jboss.as.txn.suspend.RecoverySuspendController.preSuspend(RecoverySuspendController.java:42)
at org.jboss.as.server@12.0.3.Final//org.jboss.as.server.suspend.SuspendController.suspend(SuspendController.java:103)
- locked <2ef5626e> (a org.jboss.as.server.suspend.SuspendController)
at org.jboss.as.server@12.0.3.Final//org.jboss.as.server.operations.ServerDomainProcessShutdownHandler$1$1.handleResult(ServerDomainProcessShutdownHandler.java:100)
at org.jboss.as.controller@12.0.3.Final//org.jboss.as.controller.AbstractOperationContext$Step.invokeResultHandler(AbstractOperationContext.java:1533)
at org.jboss.as.controller@12.0.3.Final//org.jboss.as.controller.AbstractOperationContext$Step.handleResult(AbstractOperationContext.java:1515)
at org.jboss.as.controller@12.0.3.Final//org.jboss.as.controller.AbstractOperationContext$Step.finalizeInternal(AbstractOperationContext.java:1472)
...
{code}
> WildFly doesn't stop while waiting for PeriodicRecovery
> -------------------------------------------------------
>
> Key: WFLY-14284
> URL: https://issues.redhat.com/browse/WFLY-14284
> Project: WildFly
> Issue Type: Bug
> Components: EJB, Transactions
> Affects Versions: 18.0.1.Final, 20.0.1.Final
> Reporter: Adriano Teixeira de Souza
> Assignee: Michael Musgrove
> Priority: Major
> Attachments: ejb-configs.sh, jboss-ejb-client.xml, server(transaction).log, thread-dump-stop-1.txt
>
>
> I'm testing WildFly 20.0.1 (21.0.2 was tested too) to replace our old WildFly 10.
> Frequently the server's stop function does not work and we have to kill the process manually at the OS level.
> It looks like a deadlock.
> I attach the thread dump.
--
This message was sent by Atlassian Jira
(v8.13.1#813001)
[Red Hat JIRA] (WFLY-14284) WildFly doesn't stop while waiting for PeriodicRecovery
by Ondrej Chaloupka (Jira)
[ https://issues.redhat.com/browse/WFLY-14284?page=com.atlassian.jira.plugi... ]
Ondrej Chaloupka edited comment on WFLY-14284 at 1/22/21 6:57 AM:
------------------------------------------------------------------
> WildFly doesn't stop while waiting for PeriodicRecovery
> -------------------------------------------------------
>
> Key: WFLY-14284
> URL: https://issues.redhat.com/browse/WFLY-14284
> Project: WildFly
> Issue Type: Bug
> Components: EJB, Transactions
> Affects Versions: 18.0.1.Final, 20.0.1.Final
> Reporter: Adriano Teixeira de Souza
> Assignee: Michael Musgrove
> Priority: Major
> Attachments: ejb-configs.sh, jboss-ejb-client.xml, server(transaction).log, thread-dump-stop-1.txt
>
>
> I'm testing WildFly 20.0.1 (21.0.2 was tested too) to replace our old WildFly 10 installation.
> We frequently see that the stop function of the server does not work, and we need to kill the process manually at the OS level.
> It looks like a deadlock.
> I attach the thread dump for this.
--
This message was sent by Atlassian Jira
(v8.13.1#813001)
[Red Hat JIRA] (WFLY-14284) WildFly doesn't stop while waiting for PeriodicRecovery
by Ondrej Chaloupka (Jira)
[ https://issues.redhat.com/browse/WFLY-14284?page=com.atlassian.jira.plugi... ]
Ondrej Chaloupka commented on WFLY-14284:
-----------------------------------------
Hi [~adrianots], I'm looking at your issue and I have a few points.
The reason you see different behaviour between WildFly 10 and WildFly 20 is that a new component, the WildFly Transaction Client (https://github.com/wildfly/wildfly-transaction-client), was added between these two releases, and it refactored how remote EJB calls are made.
Looking at the stack trace, the recovery processing is stuck in the remote {{recover()}} call[1]. The recovery manager is being asked to stop[2], but it cannot stop while a recovery scan is in progress. So the recovery manager waits for the {{recover()}} call to finish, and the app container in turn waits for the recovery manager.
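To illustrate the interaction between the two threads, here is a minimal Java model of it. This is not the Narayana code: the class {{SuspendWaitSketch}} and its methods are made up for illustration, and the timeout on the suspend is only an assumption that lets the example terminate.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical model of "suspend has to wait for the scan in progress".
public class SuspendWaitSketch {
    private final Object stateLock = new Object();
    private boolean scanInProgress;

    // Called by the recovery thread when a scan starts.
    public void beginScan() {
        synchronized (stateLock) {
            scanInProgress = true;
        }
    }

    // Called by the recovery thread when a scan ends; wakes up any waiting suspend.
    public void endScan() {
        synchronized (stateLock) {
            scanInProgress = false;
            stateLock.notifyAll();
        }
    }

    // Called by the shutdown thread. Returns true once no scan is running,
    // false if a scan is still running after timeoutMillis.
    public boolean suspend(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (stateLock) {
            while (scanInProgress) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false;
                }
                TimeUnit.NANOSECONDS.timedWait(stateLock, remainingNanos);
            }
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SuspendWaitSketch recovery = new SuspendWaitSketch();
        recovery.beginScan();
        System.out.println("suspended while scanning: " + recovery.suspend(100));
        recovery.endScan();
        System.out.println("suspended after scan end: " + recovery.suspend(100));
    }
}
```

In the thread dump [2] the wait has no timeout, so in this model the suspend would return only after {{endScan()}} is called, which never happens while {{recover()}} is blocked.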
The HTTP client (WFLY 20.0.1/wildfly-http-transaction-client 1.0.21.Final) waits indefinitely to read the remote data on recover (/home/ochaloup/jboss/wildfly-20.0.1.Final/modules/system/layers/base/org/wildfly/http-client/transaction/main/wildfly-http-transaction-client-1.0.21.Final.jar).
[~flavia.rainone] [~tomekadamski], could it be that {{HttpRemoteTransactionPeer#recover()}} does not handle the error case where no data, or some exception data, is sent back instead of an {{Xid}}?
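As a rough sketch of the kind of handling being asked about (not the actual wildfly-http-client code), a bounded wait on the reply future could look like the following. The class {{BoundedRecoverSketch}}, the method {{recoverWithTimeout}} and the choice of {{XAER_RMFAIL}} as the error code are assumptions made for illustration only.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import javax.transaction.xa.XAException;
import javax.transaction.xa.Xid;

// Hypothetical helper: wait for the remote recover reply for a limited time only.
public class BoundedRecoverSketch {

    // Returns the Xids from the reply, or throws XAException when the peer
    // does not answer in time or answers with a failure.
    public static Xid[] recoverWithTimeout(CompletableFuture<Xid[]> reply, long timeoutMillis)
            throws XAException {
        try {
            return reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            throw rmFail(e);
        } catch (InterruptedException e) {
            // Keep the interrupt flag set so the caller can still see it.
            Thread.currentThread().interrupt();
            throw rmFail(e);
        }
    }

    // Wraps the cause in an XAException with an assumed XAER_RMFAIL error code.
    private static XAException rmFail(Exception cause) {
        XAException failure = new XAException(XAException.XAER_RMFAIL);
        failure.initCause(cause);
        return failure;
    }

    public static void main(String[] args) {
        // A reply that never arrives, like the one the "Periodic Recovery" thread parks on.
        CompletableFuture<Xid[]> neverAnswered = new CompletableFuture<>();
        try {
            recoverWithTimeout(neverAnswered, 100);
        } catch (XAException e) {
            System.out.println("recover failed with XA error code " + e.errorCode);
        }
    }
}
```

With an error reported this way, the caller of {{recover()}} gets control back and can decide what to do, instead of parking on the future as in [1].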
[~adrianots] The configuration (CLI script) configures the {{http-remoting}} protocol. I think this is what older versions of WildFly used and what can still be used for interoperability purposes. The new remoting library added a new protocol, {{remote+http}}, which runs over HTTP upgrade
(i.e. {{.../subsystem=remoting/remote-outbound-connection=remote-workflow-connection:add(outbound-socket-binding-ref=remote-workflow, protocol=remote+http, ...}}).
The handler for this should then be https://github.com/wildfly/wildfly-transaction-client/blob/1.1.11.Final/s... which may be able to handle the call correctly (i.e. probably by reporting the error and letting the recovery finish). This requires both of your servers to be WFLY 20 (or a similar version that handles 'remote+http', https://access.redhat.com/documentation/en-us/red_hat_jboss_enterprise_ap...).
The question is why the authentication of the remote recovery call fails. As you make the call to the same server on the same machine, the $local authentication (https://docs.wildfly.org/22/WildFly_Elytron_Security.html#out-of-the-box-...) is probably disabled. That is done for a reason, but I would just mention that with it enabled the call could work. If it works, then the problem with recovery of remote EJB calls is that the security configuration has to be declared globally. There is an issue about this, https://issues.redhat.com/browse/WFCORE-4668, which would then be related. I think you will need to set {{-Dwildfly.config.url}} to point to a {{wildfly-config.xml}} to get authentication of recover working.
The second question is why you make a remote EJB call from the server to the same server, i.e. both applications are on the same server. Why not just use a local EJB lookup (with {{@EJB}} injection or a {{java:app}} lookup)? I assume you look up the EJB with {{ejb:/}}, right?
I'm not sure here, but {{remote+http}} may be able to search the local context first and skip the remote call (which would help if both applications are on the same server; this depends on your EJB client descriptor - https://docs.wildfly.org/22/Client_Guide.html#jboss-ejb-client - where I would expect that {{exclude-local-receiver}} should *not* be set to true).
The trouble with the remote invocation here is that the recover calls go over the remote call as well. It may even force the transaction manager to use 2PC processing. If the EJB inserts some data into a database and then makes a remote EJB call to a different server with the transaction propagated, this remote call is considered a transaction participant.
Another possibility: if you know the transaction does not need to be propagated over the EJB call, you can use https://access.redhat.com/documentation/en-us/red_hat_jboss_enterprise_ap....
[1]
{code}
"Periodic Recovery" - Thread t@115
java.lang.Thread.State: WAITING
at java.base@11.0.8/jdk.internal.misc.Unsafe.park(Native Method)
- parking to wait for <6a00decd> (a java.util.concurrent.CompletableFuture$Signaller)
at java.base@11.0.8/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.base@11.0.8/java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1796)
at java.base@11.0.8/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3128)
at java.base@11.0.8/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1823)
at java.base@11.0.8/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
at org.wildfly.http-client.transaction@1.0.21.Final//org.wildfly.httpclient.transaction.HttpRemoteTransactionPeer.recover(HttpRemoteTransactionPeer.java:98)
at org.wildfly.transaction.client@1.1.11.Final//org.wildfly.transaction.client.SubordinateXAResource.recover(SubordinateXAResource.java:213)
at org.wildfly.transaction.client@1.1.11.Final//org.wildfly.transaction.client.SubordinateXAResource.recover(SubordinateXAResource.java:209)
at org.jboss.jts//com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.xaRecoveryFirstPass(XARecoveryModule.java:659)
at org.jboss.jts//com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkFirstPass(XARecoveryModule.java:240)
- locked <45b75a72> (a com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule)
at org.jboss.jts//com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkFirstPass(XARecoveryModule.java:182)
- locked <45b75a72> (a com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule)
at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:770)
at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:382)
{code}
[2]
{code}
"ServerService Thread Pool -- 37" - Thread t@65
java.lang.Thread.State: WAITING
at java.base@11.0.8/java.lang.Object.wait(Native Method)
- waiting on <15eaeba8> (a java.lang.Object)
at java.base@11.0.8/java.lang.Object.wait(Object.java:328)
at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.suspendScan(PeriodicRecovery.java:247)
at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.RecoveryManagerImple.trySuspendScan(RecoveryManagerImple.java:192)
at org.jboss.jts//com.arjuna.ats.arjuna.recovery.RecoveryManager.trySuspend(RecoveryManager.java:265)
at org.jboss.jts//com.arjuna.ats.arjuna.recovery.RecoveryManager.suspend(RecoveryManager.java:259)
at org.jboss.jts.integration//com.arjuna.ats.jbossatx.jta.RecoveryManagerService.suspend(RecoveryManagerService.java:79)
at org.jboss.as.transactions@20.0.1.Final//org.jboss.as.txn.suspend.RecoverySuspendController.preSuspend(RecoverySuspendController.java:42)
at org.jboss.as.server@12.0.3.Final//org.jboss.as.server.suspend.SuspendController.suspend(SuspendController.java:103)
- locked <2ef5626e> (a org.jboss.as.server.suspend.SuspendController)
at org.jboss.as.server@12.0.3.Final//org.jboss.as.server.operations.ServerDomainProcessShutdownHandler$1$1.handleResult(ServerDomainProcessShutdownHandler.java:100)
at org.jboss.as.controller@12.0.3.Final//org.jboss.as.controller.AbstractOperationContext$Step.invokeResultHandler(AbstractOperationContext.java:1533)
at org.jboss.as.controller@12.0.3.Final//org.jboss.as.controller.AbstractOperationContext$Step.handleResult(AbstractOperationContext.java:1515)
at org.jboss.as.controller@12.0.3.Final//org.jboss.as.controller.AbstractOperationContext$Step.finalizeInternal(AbstractOperationContext.java:1472)
...
{code}
> WildFly doesn't stop while waiting for PeriodicRecovery
> -------------------------------------------------------
>
> Key: WFLY-14284
> URL: https://issues.redhat.com/browse/WFLY-14284
> Project: WildFly
> Issue Type: Bug
> Components: EJB, Transactions
> Affects Versions: 18.0.1.Final, 20.0.1.Final
> Reporter: Adriano Teixeira de Souza
> Assignee: Michael Musgrove
> Priority: Major
> Attachments: ejb-configs.sh, jboss-ejb-client.xml, server(transaction).log, thread-dump-stop-1.txt
>
>
> I'm testing WildFly 20.0.1 (21.0.2 was tested too) to replace our old WildFly 10 installation.
> We frequently see that the stop function of the server does not work, and we need to kill the process manually at the OS level.
> It looks like a deadlock.
> I attach the thread dump for this.