[jboss-jira] [JBoss JIRA] (WFWIP-118) Transaction remained in prepared state
Ondra Chaloupka (JIRA)
issues at jboss.org
Wed Aug 22 07:27:00 EDT 2018
[ https://issues.jboss.org/browse/WFWIP-118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622575#comment-13622575 ]
Ondra Chaloupka commented on WFWIP-118:
---------------------------------------
I investigated the logs with Narayana tracing. I chose one specific transaction id for which the artemis id is {{AAAAAAAAAAAAAP__ChBkGhsIAclbe72lAAAGqwAAAAIAAAAAAAAAAAAAAAAAAP__ChBkGhsIAclbe72lAAAGgjY3MzgHAgIA}}.
This belongs to Narayana global id {{0:ffff0a10641a:1b0801c9:5b7bbda5:682}} and this branch id is {{0:ffff0a10641a:1b0801c9:5b7bbda5:6ab}}. The snippet from log is
{code}
TRACE [com.arjuna.ats.jta] (Thread-26 (ActiveMQ-client-global-threads)) XAResourceRecord.XAResourceRecord ( < formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:6ab, subordinatenodename=null, eis_name=java:/JmsXA NodeId:c180112b-a512-11e8-856b-0015178e70d2 >, XAResourceWrapperImpl at 3beb35f7[xaResource=org.jboss.activemq.artemis.wildfly.integration.WildFlyActiveMQXAResourceWrapper at efa9a8c pad=false overrideRmValue=null productName=ActiveMQ Artemis productVersion=2.0 jndiName=java:/JmsXA NodeId:c180112b-a512-11e8-856b-0015178e70d2] ), record id=0:ffff0a10641a:1b0801c9:5b7bbda5:6ac
TRACE [org.apache.activemq.artemis.ra] (Thread-26 (ActiveMQ-client-global-threads)) start(< formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:6ab, subordinatenodename=null, eis_name=java:/JmsXA NodeId:c180112b-a512-11e8-856b-0015178e70d2 >, 0)
TRACE [org.apache.activemq.artemis.ra] (Thread-26 (ActiveMQ-client-global-threads)) lock()
TRACE [org.apache.activemq.artemis.core.client.impl.ClientSessionImpl] (Thread-26 (ActiveMQ-client-global-threads)) Calling start:: xid=XidImpl (1453388228 bq:0.0.0.0.0.0.0.0.0.0.-1.-1.10.16.100.26.27.8.1.-55.91.123.-67.-91.0.0.6.-85.0.0.0.2.0.0.0.0 formatID:131077 gtxid:0.0.0.0.0.0.0.0.0.0.-1.-1.10.16.100.26.27.8.1.-55.91.123.-67.-91.0.0.6.-126.54.55.51.56 base64:AAAAAAAAAAAAAP__ChBkGhsIAclbe72lAAAGqwAAAAIAAAAAAAAAAAAAAAAAAP__ChBkGhsIAclbe72lAAAGgjY3MzgHAgIA,clientXID=< formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:6ab, subordinatenodename=null, eis_name=java:/JmsXA NodeId:c180112b-a512-11e8-856b-0015178e70d2 > clientXID=< formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:6ab, subordinatenodename=null, eis_name=java:/JmsXA NodeId:c180112b-a512-11e8-856b-0015178e70d2 > flags = TMNOFLAGS
TRACE [org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl] (Thread-26 (ActiveMQ-client-global-threads)) RemotingConnectionID=f3f899da Sending blocking PACKET(SessionXAStartMessage)[type=51, channelID=11, packetObject=SessionXAStartMessage, xid=< formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:6ab, subordinatenodename=null, eis_name=java:/JmsXA NodeId:c180112b-a512-11e8-856b-0015178e70d2 >]
TRACE [org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl] (Thread-3 (ActiveMQ-client-netty-threads)) RemotingConnectionID=f3f899da handling packet PACKET(SessionXAResponseMessage)[type=55, channelID=11, packetObject=SessionXAResponseMessage, error=false, message=null, responseCode=0]
{code}
The participants of this transacitons are
* in-VM messaging {{java:/LocalJmsXA}}
* remote messaging {{java:/JmsXA}}
* database {{java:/jdbc/lodhDS}}
The scenario for these transactions is that the server is killed at time when prepare phase was not completely finished. It means they are considered by periodic recovery for roll-back. Observing other transactions which were hit by the 'server kill' at time the prepare phase already finished and they should be committed by recovery they are correctly committed.
The issue here seems to be with *not* getting information about in-doubt transactions from the remote JMS server. It seems the [XAResource#recover|https://docs.oracle.com/javaee/7/api/javax/transaction/xa/XAResource.html#recover-int-] returns no records for them being rolled-back.
In case of the remote jms server where issue seems to be {{java:/JmsXA}} the output of the recover call is {{(Periodic Recovery) Found 0 xids in doubt}}
{code}
DEBUG [org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl] (Periodic Recovery) ClientSessionFactoryImpl received backup update for live/backup pair = TransportConfiguration(name=, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=11080&localAddress=127-0-0-1&host=127-0-0-1 / null but it didn't belong to TransportConfiguration(name=, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=11080&localAddress=127-0-0-1&host=127-0-0-1
TRACE [org.apache.activemq.artemis.logs] (Periodic Recovery) Saving string with utfSize=36 stringSize=36
TRACE [org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl] (Periodic Recovery) RemotingConnectionID=d036646a Sending blocking PACKET(CreateSessionMessage)[type=30, channelID=1, packetObject=CreateSessionMessage, autoCommitAcks=false, autoCommitSends=false, defaultAddress=null, minLargeMessageSize=102400, name=6a2ee590-a513-11e8-887d-001b217d6d93, password=****, preAcknowledge=false, sessionChannelID=10, username=null, version=129, windowSize=-1, xa=true]
TRACE [org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl] (Thread-16 (ActiveMQ-client-netty-threads)) RemotingConnectionID=d036646a handling packet PACKET(CreateSessionResponseMessage)[type=31, channelID=1, packetObject=CreateSessionResponseMessage, serverVersion=129]
TRACE [org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl] (Periodic Recovery) RemotingConnectionID=d036646a Setting handler on Channel[id=USER, RemotingConnectionID=d036646a, handler=null] as org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQSessionContext$ClientSessionPacketHandler at 77db1ee4
DEBUG [org.apache.activemq.artemis.service.extensions.xa.recovery] (Periodic Recovery) looking for recover at ClientSessionImpl [name=6a2ee590-a513-11e8-887d-001b217d6d93, username=null, closed=false, factory = org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl at 1e875326, metaData=()]@1e35f8e0 configuration [XARecoveryConfig [transportConfiguration=[TransportConfiguration(name=, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?httpUpgradeEndpoint=acceptor&activemqServerName=default&httpUpgradeEnabled=true&port=11080&localAddress=127-0-0-1&host=127-0-0-1], discoveryConfiguration=null, username=null, password=****, JNDI_NAME=java:/JmsXA]]
TRACE [org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl] (Periodic Recovery) RemotingConnectionID=d036646a Sending blocking PACKET(PacketImpl)[type=61, channelID=10, packetObject=PacketImpl]
TRACE [org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl] (Thread-2 (ActiveMQ-client-global-scheduled-threads)) RemotingConnectionID=d036646a Sending packet nonblocking PACKET(Ping)[type=10, channelID=0, packetObject=Ping, connectionTTL=60000] on channelID=0
TRACE [org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl] (Thread-2 (ActiveMQ-client-global-scheduled-threads)) RemotingConnectionID=d036646a Writing buffer for channelID=0
TRACE [org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl] (Thread-16 (ActiveMQ-client-netty-threads)) RemotingConnectionID=d036646a handling packet PACKET(Ping)[type=10, channelID=0, packetObject=Ping, connectionTTL=60000]
TRACE [org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl] (Thread-16 (ActiveMQ-client-netty-threads)) RemotingConnectionID=d036646a handling packet PACKET(SessionXAGetInDoubtXidsResponseMessage)[type=62, channelID=10, packetObject=SessionXAGetInDoubtXidsResponseMessage, xids=[]]
DEBUG [com.arjuna.ats.jta] (Periodic Recovery) Found 0 xids in doubt
{code}
When checking the other two participants (database and in-VM messaging) they responses with list of the in-doubt transactions to the periodic recovery and the transactions are then correctly rolled-back
{code}
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) ShadowingStore.genPathName(0:ffff0a10641a:1b0801c9:5b7bbda5:682, /StateManager/BasicAction/TwoPhaseCoordinator/AtomicActionConnectable, StateType.OS_ORIGINAL)
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) FileSystemStore.genPathName(0:ffff0a10641a:1b0801c9:5b7bbda5:682, /StateManager/BasicAction/TwoPhaseCoordinator/AtomicActionConnectable, 11)
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) ShadowingStore.currentState(0:ffff0a10641a:1b0801c9:5b7bbda5:682, /StateManager/BasicAction/TwoPhaseCoordinator/AtomicActionConnectable) - returning StateStatus.OS_UNKNOWN
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) ShadowingStore.read_state could not find committed or uncommitted state for 0:ffff0a10641a:1b0801c9:5b7bbda5:682 instead found state StateStatus.OS_UNKNOWN
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) ShadowingStore.genPathName(0:ffff0a10641a:1b0801c9:5b7bbda5:682, /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction, StateType.OS_SHADOW)
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) FileSystemStore.genPathName(0:ffff0a10641a:1b0801c9:5b7bbda5:682, /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction, 10)
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) ShadowingStore.genPathName(0:ffff0a10641a:1b0801c9:5b7bbda5:682, /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction, StateType.OS_ORIGINAL)
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) FileSystemStore.genPathName(0:ffff0a10641a:1b0801c9:5b7bbda5:682, /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction, 11)
TRACE [com.arjuna.ats.arjuna] (Periodic Recovery) ShadowingStore.currentState(0:ffff0a10641a:1b0801c9:5b7bbda5:682, /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction) - returning StateStatus.OS_UNKNOWN
DEBUG [com.arjuna.ats.jta] (Periodic Recovery) No record found for < formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:686, subordinatenodename=null, eis_name=java:/LocalJmsXA NodeId:b5847054-a512-11e8-b703-0015178e70d2 >
DEBUG [com.arjuna.ats.jta] (Periodic Recovery) XAResourceOrphanFilter com.arjuna.ats.internal.jta.recovery.arjunacore.JTATransactionLogXAResourceOrphanFilter voted ABSTAIN
DEBUG [com.arjuna.ats.jta] (Periodic Recovery) node name of < formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:686, subordinatenodename=null, eis_name=java:/LocalJmsXA NodeId:b5847054-a512-11e8-b703-0015178e70d2 > is 6738
DEBUG [com.arjuna.ats.jta] (Periodic Recovery) XAResourceOrphanFilter com.arjuna.ats.internal.jta.recovery.arjunacore.JTANodeNameXAResourceOrphanFilter voted ROLLBACK
DEBUG [com.arjuna.ats.jta] (Periodic Recovery) subordinate node name of < formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:686, subordinatenodename=null, eis_name=java:/LocalJmsXA NodeId:b5847054-a512-11e8-b703-0015178e70d2 > is null
DEBUG [com.arjuna.ats.jta] (Periodic Recovery) XAResourceOrphanFilter com.arjuna.ats.internal.jta.recovery.arjunacore.SubordinateJTAXAResourceOrphanFilter voted ABSTAIN
DEBUG [com.arjuna.ats.jta] (Periodic Recovery) subordinate node name of < formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:686, subordinatenodename=null, eis_name=java:/LocalJmsXA NodeId:b5847054-a512-11e8-b703-0015178e70d2 > is null
DEBUG [com.arjuna.ats.jta] (Periodic Recovery) XAResourceOrphanFilter com.arjuna.ats.internal.jta.recovery.arjunacore.SubordinationManagerXAResourceOrphanFilter voted ABSTAIN
DEBUG [com.arjuna.ats.jta] (Periodic Recovery) node name of < formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:686, subordinatenodename=null, eis_name=java:/LocalJmsXA NodeId:b5847054-a512-11e8-b703-0015178e70d2 > is 6738
DEBUG [com.arjuna.ats.jta] (Periodic Recovery) XAResourceOrphanFilter com.arjuna.ats.internal.jta.recovery.arjunacore.JTAActionStatusServiceXAResourceOrphanFilter voted ABSTAIN
INFO [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016013: Rolling back < formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:686, subordinatenodename=null, eis_name=java:/LocalJmsXA NodeId:b5847054-a512-11e8-b703-0015178e70d2 >
DEBUG [org.apache.activemq.artemis.service.extensions.xa.recovery] (Periodic Recovery) Rollback ClientSessionImpl [name=6a29192d-a513-11e8-887d-001b217d6d93, username=null, closed=false, factory = org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl at 1649da86, metaData=()]@39a1f4be xid
TRACE [org.apache.activemq.artemis.core.client.impl.ClientSessionImpl] (Periodic Recovery) Calling rollback:: xid=XidImpl (766297512 bq:0.0.0.0.0.0.0.0.0.0.-1.-1.10.16.100.26.27.8.1.-55.91.123.-67.-91.0.0.6.-122.0.0.0.1.0.0.0.0 formatID:131077 gtxid:0.0.0.0.0.0.0.0.0.0.-1.-1.10.16.100.26.27.8.1.-55.91.123.-67.-91.0.0.6.-126.54.55.51.56 base64:AAAAAAAAAAAAAP__ChBkGhsIAclbe72lAAAGhgAAAAEAAAAAAAAAAAAAAAAAAP__ChBkGhsIAclbe72lAAAGgjY3MzgHAgIA,clientXID=< formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:686, subordinatenodename=null, eis_name=java:/LocalJmsXA NodeId:b5847054-a512-11e8-b703-0015178e70d2 >
TRACE [org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl] (Periodic Recovery) RemotingConnectionID=6a282ecb-a513-11e8-887d-001b217d6d93 Sending blocking PACKET(SessionXARollbackMessage)[type=56, channelID=10, packetObject=SessionXARollbackMessage, xid=< formatId=131077, gtrid_length=32, bqual_length=36, tx_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:682, node_name=6738, branch_uid=0:ffff0a10641a:1b0801c9:5b7bbda5:686, subordinatenodename=null, eis_name=java:/LocalJmsXA NodeId:b5847054-a512-11e8-b703-0015178e70d2 >]
{code}
> Transaction remained in prepared state
> --------------------------------------
>
> Key: WFWIP-118
> URL: https://issues.jboss.org/browse/WFWIP-118
> Project: WildFly WIP
> Issue Type: Bug
> Reporter: Erich Duda
> Assignee: Kabir Khan
> Priority: Blocker
>
> *Scenario*
> * Start group A of two servers (node-1 and node-3) Servers are not in cluster.
> * Send messages to queue on node-1.
> * Start another group B of two servers(node-2 and node-4). Servers are not in cluster.
> * Deploy mdb on both servers in A group. This mdb reads messages from local queue and perform insert into oracle 11 gr2 database, and also sends messages to remote queue on group B.
> * Mdb deployed on nodes in group B inserts messages from local queue to oracle 11 gr2 database.
> * Kill server node-1. Restart failed node. Process all messages and verify both mdbs performed database insert
> After node-1 is killed and restarted, there are still transactions in prepare state and they are not cleared in 10 minutes. The transactions remains on servers from group B.
> Logs from test can be found in build - https://jenkins.hosts.mwqe.eng.bos.redhat.com/hudson/job/mnovak-verifier-artemis2x/12/
> *Investigation of issue is in progress. It is not known what component causes it.*
> Clebert looked at the logs but there weren't traces from Arjuna. The above build contains also Arjuna traces. Copy paste Clebert's analyses.
> {quote}I looked at the output from the clients, and filtered the processing
> for one XA that's never commited..
> I looked for the XID on
> base64:AAAAAAAAAAAAAP__ChBkGmqipH1benkDAAAJdQAAAAMAAAAAAAAAAAAAAAAAAP__ChBkGmqipH1benkDAAAJaDUyODcHAgIA
> and filtered the onMessage that generated it (attached the log as
> processingOneMessage.txt):
> This XID was never commited simply because the TM never did it.
> Probably because of some issue on the JDBC.. but there are no errors,
> no exceptions.. totally clean on artemis side.
> There are no logging for TM or the MDB itself to correlate other
> failures. With the information I have I am certain there were nothing
> from Artemis side that would have caused it.
> In any case.. this is not replicated at all.. and not related to 87.
> This is simply the TM getting confused for some issue on the JDBC.
> As far as I am concerned I have fixed the replicated shutdown case.
> and i will send PRs upstream now.{quote}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
More information about the jboss-jira
mailing list