[JBoss JIRA] (JBTM-2255) Do not return StatusCommiting, if transaction was commited by the original transaction manager
by Gytis Trikleris (JIRA)
[ https://issues.jboss.org/browse/JBTM-2255?page=com.atlassian.jira.plugin.... ]
Gytis Trikleris updated JBTM-2255:
----------------------------------
Status: Pull Request Sent (was: Reopened)
Git Pull Request: https://github.com/jbosstm/narayana/pull/732 (was: https://github.com/jbosstm/narayana/pull/727)
> Do not return StatusCommiting, if transaction was commited by the original transaction manager
> ----------------------------------------------------------------------------------------------
>
> Key: JBTM-2255
> URL: https://issues.jboss.org/browse/JBTM-2255
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Components: JTS
> Reporter: Gytis Trikleris
> Assignee: Gytis Trikleris
> Fix For: 4.17.23
>
>
> We need to check whether the transaction is really in flight before returning its status as committing. Previously, the code assumed that if the returned status was StatusCommitted, the only reason the resource invoked replay_completion was that the transaction was still in the process of committing.
> However, as the attached Bugzilla shows, another workflow is also possible. The Oracle database returned XAException.XAER_RMFAIL, but the second resource was committed successfully and the client was told that the transaction had committed. Recovery was left to sort out the issue with the database. When the database resource then invoked replay_completion and the transaction manager saw the status of the transaction as StatusCommitted, it assumed the transaction was still in progress even though no BasicAction was available.
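The check described above can be illustrated with a minimal sketch. It is not Narayana code: the class, the enum, and every method name are hypothetical, the in-memory set stands in for a live BasicAction, and the map stands in for the transaction log. What the real fix returns when no BasicAction exists is an assumption here.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout; this is not Narayana's API.
final class ReplayStatusSketch {
    enum Status { COMMITTING, COMMITTED, NO_TRANSACTION }

    // Transactions that still have a live in-memory coordinator
    // (stand-in for an available BasicAction).
    private final Set<String> inFlight = ConcurrentHashMap.newKeySet();

    // Outcomes recorded in the log (stand-in for the object store).
    private final Map<String, Status> logged = new ConcurrentHashMap<>();

    void begin(String txId) { inFlight.add(txId); }

    void logCommitted(String txId) { logged.put(txId, Status.COMMITTED); }

    void finish(String txId) { inFlight.remove(txId); }

    // Status reported to a resource that asks for replay of completion.
    Status statusForReplay(String txId) {
        Status recorded = logged.getOrDefault(txId, Status.NO_TRANSACTION);
        // Report COMMITTING only when the transaction is really in flight.
        if (recorded == Status.COMMITTED && inFlight.contains(txId)) {
            return Status.COMMITTING;
        }
        // Assumption: otherwise the recorded outcome is returned unchanged.
        return recorded;
    }
}
```

In this sketch a transaction that was logged as committed and then finished reports COMMITTED rather than COMMITTING, which is the distinction the issue asks for.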
--
This message was sent by Atlassian JIRA
(v6.3.1#6329)
10 years, 2 months
[JBoss JIRA] (JBTM-2256) Race condition between recovery manager initialization and expiry scanner
by Mark Little (JIRA)
[ https://issues.jboss.org/browse/JBTM-2256?page=com.atlassian.jira.plugin.... ]
Mark Little commented on JBTM-2256:
-----------------------------------
Sounds like something Byteman could help with here. We should definitely add an automated test for this to the test suite, to ensure that any fix is appropriate and that this case stays covered in the future.
> Race condition between recovery manager initialization and expiry scanner
> -------------------------------------------------------------------------
>
> Key: JBTM-2256
> URL: https://issues.jboss.org/browse/JBTM-2256
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Components: Transaction Core
> Reporter: Gytis Trikleris
> Assignee: Gytis Trikleris
> Fix For: 4.17.23, 5.0.4
>
>
> In the constructor of RecoveryManagerImple, the expiry scanner is started before PeriodicRecovery is initialized. This causes a problem from time to time because, during the initialization of PeriodicRecovery (more exactly, of XARecoveryModule), ExtendedResourceRecord is added to the RecordTypeManager, and it has to be there when the expiry scan executes. Since the expiry scanner runs in a separate thread, it works most of the time, but the race condition still exists. Personally, I couldn't reproduce the problem.
> Swapping these two actions should solve the problem.
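The proposed ordering can be sketched as follows. This is a simplified illustration, not the real RecoveryManagerImple: the class, the map, and the type id are hypothetical stand-ins, and the scanner does nothing except look up the registered type.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins; not the real RecoveryManagerImple.
final class StartupOrderSketch {
    // Arbitrary illustrative record type id.
    static final int RECORD_TYPE = 1;

    // Stand-in for the record type registry.
    private final Map<Integer, String> recordTypes = new ConcurrentHashMap<>();
    private volatile boolean scanFoundType;

    // Registers record types first, then starts the scanner thread.
    boolean startAndScanOnce() {
        // Step 1: stand-in for initializing periodic recovery,
        // which registers the record type.
        recordTypes.put(RECORD_TYPE, "ExtendedResourceRecord");

        // Step 2: only now start the scanner. Thread.start() guarantees the
        // new thread sees everything written before the call.
        Thread scanner = new Thread(
                () -> scanFoundType = recordTypes.containsKey(RECORD_TYPE),
                "expiry-scanner");
        scanner.start();
        try {
            scanner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for scan", e);
        }
        return scanFoundType;
    }
}
```

With the two steps in the opposite order, the scanner thread could run before the type is registered, which is the window the issue describes.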
[JBoss JIRA] (JBTM-2256) Race condition between recovery manager initialization and expiry scanner
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-2256?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson commented on JBTM-2256:
-------------------------------------
Hi Gytis,
The way to reproduce the bug is to put a breakpoint here:
https://github.com/Gytis/narayana/blob/4.17/ArjunaCore/arjuna/classes/com...
If you do that with a scanner configured that needs to create an abstract record of, say, type 172 (as the ExpiredAssumedCompleteScanner in the linked BZ may need to), it will fail.
Here is a reproducer I use to verify it: https://github.com/tomjenkinson/narayana/compare/4.17...JBTM-2256
You need to add the breakpoint in RecoveryManagerImple after it has started the expiry scanner but before it loads the recovery modules (line 113 above). You can also add a breakpoint in the reproducer class on "AbstractRecord.create(172);"
If you let AbstractRecord.create(172); run before you release RecoveryManagerImple, you will see:
{quote}
java.lang.InstantiationException
at sun.reflect.InstantiationExceptionConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at java.lang.Class.newInstance(Unknown Source)
at com.arjuna.ats.arjuna.coordinator.AbstractRecord.create(AbstractRecord.java:446)
at com.hp.mwtests.ts.jta.jts.recovery.RecoveryManagerUnitTest$1.scan(RecoveryManagerUnitTest.java:41)
at com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor.run(ExpiredEntryMonitor.java:171)
{quote}
Swapping the lines prevents this scenario from occurring. I think you should add a comment to your fix explaining this in more detail, so that the lines are not swapped back in the future.
Tom
[JBoss JIRA] (JBTM-2256) Race condition between recovery manager initialization and expiry scanner
by Gytis Trikleris (JIRA)
[ https://issues.jboss.org/browse/JBTM-2256?page=com.atlassian.jira.plugin.... ]
Gytis Trikleris commented on JBTM-2256:
---------------------------------------
Tom and I went through the stack trace and the code, and this was the only cause we could find.
However, I can confirm that not adding ExtendedXAResourceRecordMap to the record types map causes the expiry scanner to throw java.lang.InstantiationException. This is because the AbstractRecord class is used as the default type if the requested type does not exist. In that case the Class.newInstance() call will always fail, since AbstractRecord is an abstract class.
Swapping those two lines ensures that ExtendedXAResourceRecordMap is added to the record types map before the expiry scanner is started.
On the other hand, the main reason for this pull request was to check that the change doesn't cause our QA tests to fail. I will contact the people who raised the Bugzilla to confirm that the issue is fixed, since they can reproduce the error without modifying the code.
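The failure mode described in this comment, falling back to an abstract default type and then instantiating it reflectively, can be reproduced in isolation. The sketch below uses hypothetical classes (Record, ExtendedRecord, and the registry itself) rather than AbstractRecord or the real record types map, and it calls Constructor.newInstance() rather than Class.newInstance().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry; it mirrors only the "abstract default type" behaviour.
final class DefaultTypeSketch {
    // Stand-in for an abstract base record class.
    abstract static class Record { }

    // Stand-in for a concrete record type that must be registered.
    static final class ExtendedRecord extends Record {
        public ExtendedRecord() { }
    }

    private final Map<Integer, Class<? extends Record>> types = new HashMap<>();

    void register(int typeId, Class<? extends Record> type) {
        types.put(typeId, type);
    }

    // Falls back to the abstract base class when the type id is unknown.
    Record create(int typeId) throws ReflectiveOperationException {
        Class<? extends Record> type = types.getOrDefault(typeId, Record.class);
        // Throws InstantiationException when 'type' is abstract.
        return type.getDeclaredConstructor().newInstance();
    }

    // Convenience wrapper: false when creation hits the abstract default.
    boolean canCreate(int typeId) {
        try {
            create(typeId);
            return true;
        } catch (InstantiationException e) {
            return false;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("unexpected reflection failure", e);
        }
    }
}
```

Calling canCreate(172) before register(172, ExtendedRecord.class) returns false in this sketch, and true afterwards, which matches the ordering dependency described above.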
[JBoss JIRA] (JBTM-2255) Do not return StatusCommiting, if transaction was commited by the original transaction manager
by Mark Little (JIRA)
[ https://issues.jboss.org/browse/JBTM-2255?page=com.atlassian.jira.plugin.... ]
Mark Little reopened JBTM-2255:
-------------------------------
[JBoss JIRA] (JBTM-2255) Do not return StatusCommiting, if transaction was commited by the original transaction manager
by Mark Little (JIRA)
[ https://issues.jboss.org/browse/JBTM-2255?page=com.atlassian.jira.plugin.... ]
Mark Little commented on JBTM-2255:
-----------------------------------
TransactionFactoryImple.getOSStatus
[JBoss JIRA] (JBTM-2255) Do not return StatusCommiting, if transaction was commited by the original transaction manager
by Mark Little (JIRA)
[ https://issues.jboss.org/browse/JBTM-2255?page=com.atlassian.jira.plugin.... ]
Mark Little commented on JBTM-2255:
-----------------------------------
StatusChecker.getStatus checks the object store (see the bottom of the method) using getStatus on the OTS factory if the instance isn't running. Make sure that check covers the object store location to which logs may be moved when the assumed complete scanner goes off.
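One way to picture the concern is a status lookup that consults more than one store location. The sketch below is purely illustrative: it does not use StatusChecker or the OTS factory, the class and method are hypothetical, and plain maps stand in for object store locations.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup; plain maps stand in for object store locations.
final class StoreLookupSketch {
    // Returns the first status found for txId, searching the locations in order.
    static Optional<String> findStatus(String txId, List<Map<String, String>> locations) {
        for (Map<String, String> location : locations) {
            String status = location.get(txId);
            if (status != null) {
                return Optional.of(status);
            }
        }
        // The log is in none of the locations that were checked.
        return Optional.empty();
    }
}
```

If the list passed in omits the location that logs are moved to, a moved log looks the same as a missing one, which is the case the comment asks to verify.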