[jboss-jira] [JBoss JIRA] (WFLY-12090) Unknown service name jboss.ejb and txn

Thu May 23 09:00:01 EDT 2019

    [ https://issues.jboss.org/browse/WFLY-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737767#comment-13737767 ] 

Ondrej Chaloupka edited comment on WFLY-12090 at 5/23/19 8:59 AM:
------------------------------------------------------------------

[~flavia.rainone] thanks for the fix for WFTC-62.
I would like to understand how this issue (WFLY-12090) relates to WFTC-64. Is it something the customer is hitting that causes problems?
I think WFTC-64 is not actually an issue. For a fully correct recovery, the remote server is expected to be restarted with the same object store and the same node id it had before the failure.
From the transaction consistency perspective, it's not fully safe to remove the record, because you lose the information that some unfinished transaction exists on the remote side. I have in mind cases like the following (we can discuss which of them are realistic, but still...):

* the remote server fails, the administrator starts a blank new server at the same address (as WFTC-64 mentions), recovery occurs, WFTC removes the record, the administrator realizes he did not port the `/data` directory and restarts the server once again with the txn records, and the data is never recovered (_this situation assumes an administrator error_)
* the remote server fails, the administrator starts a blank new server, recovery occurs, WFTC removes the record, the administrator slowly migrates unfinished transactions to the newly started server; those migrated transactions will never be recovered, as the parent EAP node has lost the information about them (_we were thinking about such an option for txn recovery management on OpenShift; even though the idea was never implemented, I think it's still feasible. It would be similar to migrating messages from a killed broker to an active one._)

My PoV is that the records should be left untouched for some time (i.e. for hours) before they are really deleted. Narayana has the recovery setting `expiryScanInterval`, which defines the time (12h by default) after which similar records are removed.
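
To illustrate what I mean by leaving the records untouched for some time, here is a minimal sketch in Java. It is only an illustration under my own assumptions: the class `DeferredRecordRemoval` and its methods are hypothetical names, not part of the WFTC or Narayana API, and the grace period is simply passed in by the caller.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch (not WFTC or Narayana code): a record that recovery
 * could not resolve is only reported as removable after a grace period.
 */
final class DeferredRecordRemoval {
    private final Duration gracePeriod;
    // record id -> time when a recovery pass first failed to resolve it
    private final Map<String, Instant> unresolvedSince = new ConcurrentHashMap<>();

    DeferredRecordRemoval(Duration gracePeriod) {
        this.gracePeriod = gracePeriod;
    }

    /** A recovery pass found that the remote side knows nothing about this record. */
    void markUnresolved(String recordId, Instant now) {
        // keep the first timestamp, so repeated passes do not restart the clock
        unresolvedSince.putIfAbsent(recordId, now);
    }

    /** A later pass resolved the record (e.g. the original server came back). */
    void markResolved(String recordId) {
        unresolvedSince.remove(recordId);
    }

    /** Returns, and stops tracking, the ids whose grace period has elapsed. */
    List<String> collectExpired(Instant now) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Instant> entry : unresolvedSince.entrySet()) {
            Duration age = Duration.between(entry.getValue(), now);
            if (age.compareTo(gracePeriod) >= 0) {
                expired.add(entry.getKey());
            }
        }
        for (String id : expired) {
            unresolvedSince.remove(id);
        }
        return expired;
    }
}
```

With such a policy, both cases listed above would still be recoverable as long as the administrator fixes the server or migrates the transactions within the grace period, since `markResolved` cancels the pending removal.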

For the sake of completeness: if I remember correctly, these WFTC records (the ones we are talking about in WFTC-64) are for orphan detection, so the transaction outcome was determined to be rollback (WFTC-38, WFLY-10201). The records are not necessary if the information about the participant is saved in the Narayana transaction log and the transaction outcome was determined to be commit.

> Unknown service name jboss.ejb and txn
> --------------------------------------
>
>                 Key: WFLY-12090
>                 URL: https://issues.jboss.org/browse/WFLY-12090
>             Project: WildFly
>          Issue Type: Bug
>          Components: EJB, Transactions
>    Affects Versions: 17.0.0.Alpha1
>            Reporter: Flavia Rainone
>            Assignee: Flavia Rainone
>            Priority: Major
>             Fix For: 17.0.0.Beta1
>
>
> When the EJBRemoting service has not started (it is LAZY) but XA recovery has, we can get the following sort of error:
> {code}
> 16:51:46,785 WARN  [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016027: Local XARecoveryModule.xaRecovery got XA exception XAException.XAER_RMERR: javax.transaction.xa.XAException: WFTXN0034: Failed to acquire a connection for this operation
> 	at org.wildfly.transaction.client@1.1.3.Final//org.wildfly.transaction.client.provider.remoting.RemotingRemoteTransactionPeer.getOperationsXA(RemotingRemoteTransactionPeer.java:139)
> 	at org.wildfly.transaction.client@1.1.3.Final//org.wildfly.transaction.client.provider.remoting.RemotingRemoteTransactionPeer.recover(RemotingRemoteTransactionPeer.java:202)
> 	at org.wildfly.transaction.client@1.1.3.Final//org.wildfly.transaction.client.SubordinateXAResource.recover(SubordinateXAResource.java:201)
> 	at org.wildfly.transaction.client@1.1.3.Final//org.wildfly.transaction.client.SubordinateXAResource.recover(SubordinateXAResource.java:197)
> 	at org.jboss.jts//com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.xaRecoveryFirstPass(XARecoveryModule.java:634)
> 	at org.jboss.jts//com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkFirstPass(XARecoveryModule.java:226)
> 	at org.jboss.jts//com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkFirstPass(XARecoveryModule.java:171)
> 	at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:770)
> 	at org.jboss.jts//com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:382)
> Caused by: org.jboss.remoting3.ServiceOpenException: Unknown service name jboss.ejb
> 	at org.jboss.remoting@5.0.10.Final-SNAPSHOT//org.jboss.remoting3.remote.RemoteReadListener.handleEvent(RemoteReadListener.java:429)
> 	at org.jboss.remoting@5.0.10.Final-SNAPSHOT//org.jboss.remoting3.remote.RemoteReadListener.handleEvent(RemoteReadListener.java:46)
> 	at org.jboss.xnio@3.7.2.Final-SNAPSHOT//org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
> 	at org.jboss.xnio@3.7.2.Final-SNAPSHOT//org.xnio.conduits.ReadReadyHandler$ChannelListenerHandler.readReady(ReadReadyHandler.java:66)
> 	at org.jboss.xnio.nio@3.7.2.Final-SNAPSHOT//org.xnio.nio.NioSocketConduit.handleReady(NioSocketConduit.java:89)
> 	at org.jboss.xnio.nio@3.7.2.Final-SNAPSHOT//org.xnio.nio.WorkerThread.run(WorkerThread.java:591)
> 	Suppressed: org.jboss.remoting3.ServiceOpenException: Unknown service name txn
> 		... 6 more
> {code}
