[JBoss JIRA] (JBTM-2147) afterCompletion may be called before transaction _completes_ if reaper fires for a transaction and XAResource::end throws RuntimeException
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-2147?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson updated JBTM-2147:
--------------------------------
Summary: afterCompletion may be called before transaction _completes_ if reaper fires for a transaction and XAResource::end throws RuntimeException (was: Transactions are leaked when XAResource misbehaves)
> afterCompletion may be called before transaction _completes_ if reaper fires for a transaction and XAResource::end throws RuntimeException
> ------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: JBTM-2147
> URL: https://issues.jboss.org/browse/JBTM-2147
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: JTS
> Affects Versions: 4.17.7
> Environment: jboss EAP 6.1.1
> Reporter: Koen Janssens
> Assignee: Tom Jenkinson
>
> We have noticed that Arjuna leaks transactions when something unexpected happens while a tx is aborting. The transaction stays in the 'aborting' state even though Arjuna notifies the tx listeners that the tx has ended.
> This can be reproduced as follows:
> * A TX is started and both a DB (last) resource and a HornetQ (XA) resource get enlisted. The tx takes a long time and times out.
> * The Arjuna reaper thread notices the timeout and starts a worker thread to cancel the TX.
> * Before the worker thread can 'abort' the HornetQ XA resource, Arjuna interrupts the worker thread (txReaperCancelWaitPeriod expires).
> * Since the worker thread is interrupted, the HornetQ XA resource throws a 'HornetQInterruptedException'.
> * This unexpected exception causes Arjuna to notify the registered javax.transaction.Synchronization instances (and return the DB connection to the pool) without ending the Tx.
> From then on, jboss logs are full of:
> Trying to start a new transaction when old is not complete: Old: < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3625a31, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:3625bbc, subordinatenodename=null, eis_name=java:/XAOracleDS >, New < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffff40bab98c:1f71a485:5335ece7:3641efc, node_name=1, branch_uid=0:ffff40bab98c:1f71a485:5335ece7:36420a1, subordinatenodename=null, eis_name=java:/XAOracleDS >, Flags 0
> Although the root cause is a misbehaving HornetQ resource, I think Arjuna should be a bit more resilient.
> More background in https://access.redhat.com/support/cases/01061583/
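The interruption step in the reproduction above can be sketched in plain Java. This is only an illustration under assumptions: `SketchResource` and `InterruptSensitiveResource` are hypothetical stand-ins (not `javax.transaction.xa.XAResource` or any HornetQ class), and the worker interrupts itself to stand in for the reaper giving up on it. It shows how a resource that checks the interrupt flag turns an interrupted worker into an unchecked exception from `end()`.

```java
import java.util.ArrayList;
import java.util.List;

public class InterruptedEndSketch {

    /** Hypothetical stand-in for an XA resource; not the real XAResource interface. */
    interface SketchResource {
        void end();
    }

    /** Mimics a resource that refuses to work on an interrupted thread. */
    static final class InterruptSensitiveResource implements SketchResource {
        @Override
        public void end() {
            if (Thread.currentThread().isInterrupted()) {
                throw new IllegalStateException("interrupted while ending branch");
            }
        }
    }

    /** Calls end() on a worker thread whose interrupt flag is set, and returns what was observed. */
    static List<String> run() {
        List<String> events = new ArrayList<>();
        SketchResource resource = new InterruptSensitiveResource();
        Thread worker = new Thread(() -> {
            // Stands in for the reaper interrupting the worker after its wait period.
            Thread.currentThread().interrupt();
            try {
                resource.end();
                events.add("end ok");
            } catch (RuntimeException e) {
                events.add("end failed: " + e.getMessage());
            }
        });
        worker.start();
        try {
            worker.join(); // join() makes the worker's writes to events visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The point of the sketch is that the exception reaches the caller of `end()` as a `RuntimeException`, which is the case the updated issue summary describes.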
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
10 years, 7 months
[JBoss JIRA] (JBTM-2147) afterCompletion may be called before transaction _completes_ if reaper fires for a transaction and XAResource::end throws RuntimeException
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-2147?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson updated JBTM-2147:
--------------------------------
Status: Pull Request Sent (was: Reopened)
Git Pull Request: https://github.com/jbosstm/narayana/pull/636
[JBoss JIRA] (JBTM-2137) JDBCStore recovery is too slow
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-2137?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson updated JBTM-2137:
--------------------------------
Fix Version/s: 4.17.20 (was: 4.17.19)
> JDBCStore recovery is too slow
> ------------------------------
>
> Key: JBTM-2137
> URL: https://issues.jboss.org/browse/JBTM-2137
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Recovery
> Affects Versions: 4.17.18, 5.0.1
> Reporter: Michael Musgrove
> Assignee: Michael Musgrove
> Fix For: 4.17.20, 5.0.2
>
>
> The linked CI job failed because the STATESTOREJBOSSTSTXTABLE had too many entries (>600), so the recovery pass took about 15 minutes, which in turn caused subsequent tests to time out. JBTM-2133 covers why there were so many entries; this JIRA covers the poor recovery performance when there are that many.
[JBoss JIRA] (JBTM-2133) Multiple QA test suite failures on Oracle
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-2133?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson updated JBTM-2133:
--------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> Multiple QA test suite failures on Oracle
> -----------------------------------------
>
> Key: JBTM-2133
> URL: https://issues.jboss.org/browse/JBTM-2133
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Testing
> Reporter: Gytis Trikleris
> Assignee: Michael Musgrove
> Priority: Minor
> Fix For: 4.17.19, 5.0.2
>
>
> http://172.17.131.2/view/Narayana+BlackTie/job/narayana-jdbcobjectstore/6...
> {code}
> txcore_lockrecord LockRecord_Thread_Test032b Fail (0m37.778s)
> txcore_lockrecord LockRecord_Thread_Test035b Fail (0m15.279s)
> txcore_lockrecord LockRecord_Thread_Test036a Fail (0m12.145s)
> txcore_lockrecord LockRecord_Thread_Test036b Fail (0m21.774s)
> txcore_lockrecord LockRecord_Thread_Test043b Fail (0m10.901s)
> txcore_lockrecord LockRecord_Thread_Test044b Fail (0m23.904s)
> txcore_lockrecord LockRecord_Thread_Test048a Fail (0m13.528s)
> txcore_lockrecord LockRecord_Thread_Test048b Fail (0m30.007s)
> crashrecovery12 CrashRecovery12_Test03 Fail (4m10.784s)
> crashrecovery12 CrashRecovery12_Test04 Fail (4m10.153s)
> crashrecovery12 CrashRecovery12_Test05 Fail (4m11.909s)
> crashrecovery12 CrashRecovery12_Test06 Fail (4m11.862s)
> crashrecovery12 CrashRecovery12_Test07 Fail (4m11.435s)
> crashrecovery12 CrashRecovery12_Test02 Fail (4m9.581s)
> {code}
[JBoss JIRA] (JBTM-2147) Transactions are leaked when XAResource misbehaves
by Koen Janssens (JIRA)
[ https://issues.jboss.org/browse/JBTM-2147?page=com.atlassian.jira.plugin.... ]
Koen Janssens commented on JBTM-2147:
-------------------------------------
The HornetQ issue is already being fixed. I just want to make sure Narayana is a bit more robust in case of a misbehaving XA resource.
If you want a Narayana log, I can give you one ;-)
ARJUNA012078: Abort called illegaly on atomic action 0:ffff40bab98c:1f71a485:5335ece7:3625a31)
This happens on the original thread, which has a TX in 'aborting' state (since the reaper tried to abort it and failed in the middle). After this warning, Narayana calls
com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator#afterCompletion(int), which ends up calling javax.transaction.Synchronization.afterCompletion. But according to the JTA spec, that should only be done when the TX is 'finished'. However, while the TX is in 'aborting' state, some resources can still be enlisted, so the TX should not be considered finished.
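The ordering asked for in this comment can be sketched as follows. This is a minimal illustration under assumptions, not Narayana's code and not the fix in the pull request: `SketchCoordinator`, `SketchResource`, `SketchSynchronization` and the `Status` enum are all hypothetical, simplified stand-ins for the real JTA and Arjuna types. The idea is that an unchecked exception from `end()` is recorded, rollback is still attempted, and listeners are notified only once the transaction has left the 'aborting' state.

```java
import java.util.ArrayList;
import java.util.List;

public class CompletionGuardSketch {

    enum Status { ACTIVE, ABORTING, ABORTED }

    /** Hypothetical, simplified stand-ins; not the javax.transaction interfaces. */
    interface SketchResource {
        void end();
        void rollback();
    }

    interface SketchSynchronization {
        void afterCompletion(Status status);
    }

    static final class SketchCoordinator {
        private Status status = Status.ACTIVE;
        private final List<SketchResource> resources = new ArrayList<>();
        private final List<SketchSynchronization> synchronizations = new ArrayList<>();
        final List<String> warnings = new ArrayList<>();

        void enlist(SketchResource resource) { resources.add(resource); }
        void register(SketchSynchronization sync) { synchronizations.add(sync); }

        void abort() {
            status = Status.ABORTING;
            for (SketchResource resource : resources) {
                try {
                    resource.end();
                } catch (RuntimeException e) {
                    // Record the failure but keep going: the branch still needs a rollback.
                    warnings.add("end failed: " + e.getMessage());
                }
                try {
                    resource.rollback();
                } catch (RuntimeException e) {
                    warnings.add("rollback failed: " + e.getMessage());
                }
            }
            // Only now is the transaction finished, so only now are listeners told.
            status = Status.ABORTED;
            for (SketchSynchronization sync : synchronizations) {
                sync.afterCompletion(status);
            }
        }
    }

    /** Aborts a transaction whose only resource throws from end(), and returns the event log. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        SketchCoordinator coordinator = new SketchCoordinator();
        coordinator.enlist(new SketchResource() {
            @Override public void end() { throw new IllegalStateException("interrupted"); }
            @Override public void rollback() { events.add("rollback"); }
        });
        coordinator.register(status -> events.add("afterCompletion:" + status));
        coordinator.abort();
        events.addAll(coordinator.warnings);
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In `demo()`, the listener sees `ABORTED` rather than `ABORTING`, and the rollback has already run by the time it is called. A real coordinator would also have to deal with rollback failures (heuristic outcomes) and concurrent access from the reaper and the application thread, which this sketch leaves out.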