XAResource problem was Re: Naming test timeout was Re: [jboss-dev] PUSH

Adrian Brock abrock at redhat.com
Tue Jul 1 08:02:29 EDT 2008


To make sure this doesn't "fall the cracks"
you can reproduce this problem by running the following
in the testsuite in jboss-head.

./build.sh one-test
-Dtest=org.jboss.test.jca.test.XAExceptionUnitTestCase

(and wait for periodic recovery to kick in).

The idea of this test is to make sure that
the server is stable to "random" XAExceptions from
an XAResource, so their's clearly something wrong
somewhere.

NOTE: I don't see a problem running the same
test in jboss-4.2.x

On Thu, 2008-06-26 at 17:55 +0100, Andrew Dinn wrote:
> Adrian Brock wrote:
> > The first mention of this "broken" XAResource looks to be after the
> > ha invoker tests, can you have a look at the log to see
> > if there are any other clues?
> > 
> > This is about 13 minutes before the naming test hangs.
> > 
> > http://hudson.jboss.org/hudson/job/JBoss-AS-5.0.x-TestSuite-sun15/lastBuild/artifact/trunk/build/output/jboss-5.0.0.CR1/server/all/log/server.log
> 
> Well, working backwards from the first occurence of the recovery 
> failures I found this report of a heuristic outcome:
> 
> 
> 2008-06-26 03:50:58,957 INFO 
> [org.jboss.test.jca.adapter.TestManagedConnection] 
> (WorkerThread#39[10.18.94.175:52882]) rollback with 
> xid=1-a120086:bcb0:486344fa:3ea2 for TestManagedConnection#1{xid=null 
> destroyed=false}
> 2008-06-26 03:50:58,969 ERROR [org.jboss.ejb.plugins.LogInterceptor] 
> (WorkerThread#39[10.18.94.175:52882]) 
> TransactionRolledbackLocalException in method: public abstract void 
> org.jboss.test.jca.interfaces.XAExceptionSessionLocal.testXAExceptionToTransactionRolledbackException(), 
> causedBy:
> javax.transaction.HeuristicMixedException
> 	at 
> com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commit(TransactionImple.java:252)
> 	at 
> org.jboss.ejb.plugins.TxInterceptorCMT.endTransaction(TxInterceptorCMT.java:501)
> 	at 
> org.jboss.ejb.plugins.TxInterceptorCMT.runWithTransactions(TxInterceptorCMT.java:361)
> 	. . .
> 2008-06-26 03:50:58,970 INFO 
> [org.jboss.test.jca.ejb.XAExceptionTestSessionBean] 
> (WorkerThread#39[10.18.94.175:52882]) Test worked
> 
> The next mention of the associated transaction 
> (a120086:bcb0:486344fa:3ea2) is in the recovery failure message
> 
> 2008-06-26 03:52:18,342 DEBUG 
> [com.arjuna.ats.arjuna.logging.arjLoggerI18N] 
> (Server.Connection:10.18.0.134:34373) 
> [com.arjuna.ats.arjuna.recovery.ActionStatusService_1] - 
> transactionType: 
> /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction uid: 
> a120086:bcb0:486344fa:3ea2
>   Status is 7
> 2008-06-26 03:52:18,344 WARN  [com.arjuna.ats.jta.logging.loggerI18N] 
> (Thread-19) 
> [com.arjuna.ats.internal.jta.resources.arjunacore.norecoveryxa] 
> [com.arjuna.ats.internal.jta.resources.arjunacore.norecoveryxa] Could 
> not find new XAResource to use for recovering non-serializable 
> XAResource < 131075, 28, 26, 
> 4945974950484856545898999848585256545152521029758511019750974950484856545898999848585256545152521029758511019755 
>  >
> 2008-06-26 03:52:18,344 WARN 
> [com.arjuna.ats.arjuna.logging.arjLoggerI18N] (Thread-19) 
> [com.arjuna.ats.internal.arjuna.gandiva.inventory.StaticInventory_1] - 
> cannot find null implementation.
> 2008-06-26 03:52:18,344 WARN 
> [com.arjuna.ats.arjuna.logging.arjLoggerI18N] (Thread-19) 
> [com.arjuna.ats.arjuna.recovery.RecoverAtomicAction_4] - 
> RecoverAtomicAction: transaction a120086:bcb0:486344fa:3ea2 not 
> activated, unable to replay phase 2 commit
> 
> Now, when the heuristic outcome happens the failed TX is written to the 
> object store in order that it can be retrieved for later analysis. The 
> error happens when the recovery thread tries to scan it and finds a 
> corrupt entry. So, somewhere between these two messages the object store 
> appears to have been trashed. The same appears to be the case for the 
> other TXs which also have commitwhenaborted failures.
> 
> In between the two messages there are a whole series of errors with 
> messages from 
> org.jboss.test.jca.adapter.TestManagedConnectionFactory.createManagedConnection
> Throwable while attempting to get a new connection: null
> 
> followed by yet more ResourceExceptions
> 
> javax.resource.ResourceException:
> 	at 
> org.jboss.test.jca.adapter.TestManagedConnectionFactory.matchManagedConnections(TestManagedConnectionFactory.java:150)
> 
> 
> and even RuntimeExceptions
> 
> java.lang.RuntimeException:
> 	at 
> org.jboss.test.jca.adapter.TestManagedConnectionFactory.matchManagedConnections(TestManagedConnectionFactory.java:152)
> 
> These are also followed by some TX is not active and rollback errors. I 
> have no idea if these are related. I doubt it though, because none of 
> them should have any effect on the object store contents for the now 
> rolled back heuristic TX. It's dead, Jim -- at least until recovery 
> kicks in.
> 
> Is this running clustered? Are the clusters using independent object 
> stores? Could there be a caching issue here which allows another server 
> to think the TX is active? This is probably a more fruitful line of 
> enquiry than anything I can see in the log.
> 
> regards,
> 
> 
> 
> Andrew Dinn
> -----------
> _______________________________________________
> jboss-development mailing list
> jboss-development at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/jboss-development
-- 
xxxxxxxxxxxxxxxxxxxxxxxxxxxx
Adrian Brock
Chief Scientist
JBoss, a division of Red Hat
xxxxxxxxxxxxxxxxxxxxxxxxxxxx




More information about the jboss-development mailing list