Adrian Brock wrote:
The first mention of this "broken" XAResource looks to be
after the
ha invoker tests, can you have a look at the log to see
if there are any other clues?
This is about 13 minutes before the naming test hangs.
http://hudson.jboss.org/hudson/job/JBoss-AS-5.0.x-TestSuite-sun15/lastBui...
Well, working backwards from the first occurence of the recovery
failures I found this report of a heuristic outcome:
2008-06-26 03:50:58,957 INFO
[org.jboss.test.jca.adapter.TestManagedConnection]
(WorkerThread#39[10.18.94.175:52882]) rollback with
xid=1-a120086:bcb0:486344fa:3ea2 for TestManagedConnection#1{xid=null
destroyed=false}
2008-06-26 03:50:58,969 ERROR [org.jboss.ejb.plugins.LogInterceptor]
(WorkerThread#39[10.18.94.175:52882])
TransactionRolledbackLocalException in method: public abstract void
org.jboss.test.jca.interfaces.XAExceptionSessionLocal.testXAExceptionToTransactionRolledbackException(),
causedBy:
javax.transaction.HeuristicMixedException
at
com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commit(TransactionImple.java:252)
at
org.jboss.ejb.plugins.TxInterceptorCMT.endTransaction(TxInterceptorCMT.java:501)
at
org.jboss.ejb.plugins.TxInterceptorCMT.runWithTransactions(TxInterceptorCMT.java:361)
. . .
2008-06-26 03:50:58,970 INFO
[org.jboss.test.jca.ejb.XAExceptionTestSessionBean]
(WorkerThread#39[10.18.94.175:52882]) Test worked
The next mention of the associated transaction
(a120086:bcb0:486344fa:3ea2) is in the recovery failure message
2008-06-26 03:52:18,342 DEBUG
[com.arjuna.ats.arjuna.logging.arjLoggerI18N]
(Server.Connection:10.18.0.134:34373)
[com.arjuna.ats.arjuna.recovery.ActionStatusService_1] -
transactionType:
/StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction uid:
a120086:bcb0:486344fa:3ea2
Status is 7
2008-06-26 03:52:18,344 WARN [com.arjuna.ats.jta.logging.loggerI18N]
(Thread-19)
[com.arjuna.ats.internal.jta.resources.arjunacore.norecoveryxa]
[com.arjuna.ats.internal.jta.resources.arjunacore.norecoveryxa] Could
not find new XAResource to use for recovering non-serializable
XAResource < 131075, 28, 26,
4945974950484856545898999848585256545152521029758511019750974950484856545898999848585256545152521029758511019755
2008-06-26 03:52:18,344 WARN
[com.arjuna.ats.arjuna.logging.arjLoggerI18N] (Thread-19)
[com.arjuna.ats.internal.arjuna.gandiva.inventory.StaticInventory_1] -
cannot find null implementation.
2008-06-26 03:52:18,344 WARN
[com.arjuna.ats.arjuna.logging.arjLoggerI18N] (Thread-19)
[com.arjuna.ats.arjuna.recovery.RecoverAtomicAction_4] -
RecoverAtomicAction: transaction a120086:bcb0:486344fa:3ea2 not
activated, unable to replay phase 2 commit
Now, when the heuristic outcome happens the failed TX is written to the
object store in order that it can be retrieved for later analysis. The
error happens when the recovery thread tries to scan it and finds a
corrupt entry. So, somewhere between these two messages the object store
appears to have been trashed. The same appears to be the case for the
other TXs which also have commitwhenaborted failures.
In between the two messages there are a whole series of errors with
messages from
org.jboss.test.jca.adapter.TestManagedConnectionFactory.createManagedConnection
Throwable while attempting to get a new connection: null
followed by yet more ResourceExceptions
javax.resource.ResourceException:
at
org.jboss.test.jca.adapter.TestManagedConnectionFactory.matchManagedConnections(TestManagedConnectionFactory.java:150)
and even RuntimeExceptions
java.lang.RuntimeException:
at
org.jboss.test.jca.adapter.TestManagedConnectionFactory.matchManagedConnections(TestManagedConnectionFactory.java:152)
These are also followed by some TX is not active and rollback errors. I
have no idea if these are related. I doubt it though, because none of
them should have any effect on the object store contents for the now
rolled back heuristic TX. It's dead, Jim -- at least until recovery
kicks in.
Is this running clustered? Are the clusters using independent object
stores? Could there be a caching issue here which allows another server
to think the TX is active? This is probably a more fruitful line of
enquiry than anything I can see in the log.
regards,
Andrew Dinn
-----------