[JBoss JIRA] (JBTM-1298) Build timeout on SimpleIsolatedServers test
by Tom Jenkinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-1298?page=com.atlassian.jira.plugin.... ]
Tom Jenkinson updated JBTM-1298:
--------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/jbosstm/narayana/pull/250
> Build timeout on SimpleIsolatedServers test
> -------------------------------------------
>
> Key: JBTM-1298
> URL: https://issues.jboss.org/browse/JBTM-1298
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Testing
> Reporter: Paul Robinson
> Assignee: Tom Jenkinson
> Priority: Minor
> Fix For: 4.17.4, 5.0.0.M3
>
>
> {noformat}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.5-b02 mixed mode):
> "Transaction Reaper Worker 0" daemon prio=10 tid=0x000000001da1b800 nid=0x6ff1 in Object.wait() [0x0000000040141000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000fa3cb850> (a java.util.LinkedList)
> at java.lang.Object.wait(Object.java:503)
> at com.arjuna.ats.arjuna.coordinator.TransactionReaper.waitForCancellations(TransactionReaper.java:321)
> - locked <0x00000000fa3cb850> (a java.util.LinkedList)
> at com.arjuna.ats.internal.arjuna.coordinator.ReaperWorkerThread.run(ReaperWorkerThread.java:65)
> "Transaction Reaper" daemon prio=10 tid=0x000000002334f800 nid=0x6ff0 in Object.wait() [0x0000000042766000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000fa3cb5e8> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)
> at com.arjuna.ats.internal.arjuna.coordinator.ReaperThread.run(ReaperThread.java:90)
> - locked <0x00000000fa3cb5e8> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)
> "Transaction Expired Entry Monitor" daemon prio=10 tid=0x000000001f05a000 nid=0x6fef in Object.wait() [0x00000000414db000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000fa3cedc0> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)
> at com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor.run(ExpiredEntryMonitor.java:190)
> - locked <0x00000000fa3cedc0> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)
> "Transaction Reaper Worker 0" daemon prio=10 tid=0x000000001ecd4000 nid=0x6fee in Object.wait() [0x0000000042867000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000fa3a0ce0> (a java.util.LinkedList)
> at java.lang.Object.wait(Object.java:503)
> at com.arjuna.ats.arjuna.coordinator.TransactionReaper.waitForCancellations(TransactionReaper.java:321)
> - locked <0x00000000fa3a0ce0> (a java.util.LinkedList)
> at com.arjuna.ats.internal.arjuna.coordinator.ReaperWorkerThread.run(ReaperWorkerThread.java:65)
> "Transaction Reaper" daemon prio=10 tid=0x000000001e1c5000 nid=0x6fed in Object.wait() [0x0000000042665000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000fa3a0a78> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)
> at com.arjuna.ats.internal.arjuna.coordinator.ReaperThread.run(ReaperThread.java:90)
> - locked <0x00000000fa3a0a78> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)
> "Transaction Expired Entry Monitor" daemon prio=10 tid=0x000000001eefa800 nid=0x6fec in Object.wait() [0x0000000042463000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000fa3a4250> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)
> at com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor.run(ExpiredEntryMonitor.java:190)
> - locked <0x00000000fa3a4250> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)
> "Transaction Reaper Worker 0" daemon prio=10 tid=0x000000001f49f000 nid=0x6feb in Object.wait() [0x0000000042362000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000fa3ee7d0> (a java.util.LinkedList)
> at java.lang.Object.wait(Object.java:503)
> at com.arjuna.ats.arjuna.coordinator.TransactionReaper.waitForCancellations(TransactionReaper.java:321)
> - locked <0x00000000fa3ee7d0> (a java.util.LinkedList)
> at com.arjuna.ats.internal.arjuna.coordinator.ReaperWorkerThread.run(ReaperWorkerThread.java:65)
> "Transaction Reaper" daemon prio=10 tid=0x000000001ee43000 nid=0x6fea in Object.wait() [0x0000000042564000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000fa3ee568> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)
> at com.arjuna.ats.internal.arjuna.coordinator.ReaperThread.run(ReaperThread.java:90)
> - locked <0x00000000fa3ee568> (a com.arjuna.ats.arjuna.coordinator.TransactionReaper)
> "Transaction Expired Entry Monitor" daemon prio=10 tid=0x000000001deee800 nid=0x6fe9 in Object.wait() [0x000000004049d000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000fa3f1d40> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)
> at com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor.run(ExpiredEntryMonitor.java:190)
> - locked <0x00000000fa3f1d40> (a com.arjuna.ats.internal.arjuna.recovery.ExpiredEntryMonitor)
> "Attach Listener" daemon prio=10 tid=0x000000001e7ae800 nid=0x6f69 runnable [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> "Service Thread" daemon prio=10 tid=0x000000001d7a3000 nid=0x6f24 runnable [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread1" daemon prio=10 tid=0x000000001d7a0800 nid=0x6f23 waiting on condition [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> "C2 CompilerThread0" daemon prio=10 tid=0x000000001d795000 nid=0x6f22 waiting on condition [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> "Signal Dispatcher" daemon prio=10 tid=0x000000001d793000 nid=0x6f21 waiting on condition [0x0000000000000000]
> java.lang.Thread.State: RUNNABLE
> "Finalizer" daemon prio=10 tid=0x000000001d740000 nid=0x6f20 in Object.wait() [0x0000000042160000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000f9c7add8> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
> - locked <0x00000000f9c7add8> (a java.lang.ref.ReferenceQueue$Lock)
> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)
> "Reference Handler" daemon prio=10 tid=0x000000001d73e000 nid=0x6f1f in Object.wait() [0x000000004205f000]
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0x00000000f9c7aa50> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:503)
> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
> - locked <0x00000000f9c7aa50> (a java.lang.ref.Reference$Lock)
> "main" prio=10 tid=0x000000001d6b1800 nid=0x6f1b runnable [0x0000000041245000]
> java.lang.Thread.State: RUNNABLE
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:150)
> at java.net.SocketInputStream.read(SocketInputStream.java:121)
> at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
> at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
> at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
> - locked <0x00000000fa41fbc0> (a java.io.InputStreamReader)
> at java.io.InputStreamReader.read(InputStreamReader.java:184)
> at java.io.BufferedReader.fill(BufferedReader.java:154)
> at java.io.BufferedReader.readLine(BufferedReader.java:317)
> - locked <0x00000000fa41fbc0> (a java.io.InputStreamReader)
> at java.io.BufferedReader.readLine(BufferedReader.java:382)
> at org.jboss.byteman.agent.submit.Submit$Comm.readResponse(Submit.java:931)
> at org.jboss.byteman.agent.submit.Submit.submitRequest(Submit.java:780)
> at org.jboss.byteman.agent.submit.Submit.addScripts(Submit.java:595)
> at org.jboss.byteman.agent.submit.Submit.addRulesFromFiles(Submit.java:547)
> at org.jboss.byteman.contrib.bmunit.BMUnit.loadScriptFile(BMUnit.java:305)
> at org.jboss.byteman.contrib.bmunit.BMUnitRunner$5.evaluate(BMUnitRunner.java:248)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
> at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
> at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
> at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
> at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
> at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:172)
> at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:78)
> at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:70)
> "VM Thread" prio=10 tid=0x000000001d736000 nid=0x6f1e runnable
> "GC task thread#0 (ParallelGC)" prio=10 tid=0x000000001d6bf800 nid=0x6f1c runnable
> "GC task thread#1 (ParallelGC)" prio=10 tid=0x000000001d6c1000 nid=0x6f1d runnable
> "VM Periodic Task Thread" prio=10 tid=0x000000001d7ad800 nid=0x6f25 waiting on condition
> JNI global references: 3087
> Heap
> PSYoungGen total 29376K, used 854K [0x00000000fdeb0000, 0x0000000100000000, 0x0000000100000000)
> eden space 24576K, 3% used [0x00000000fdeb0000,0x00000000fdf85bc0,0x00000000ff6b0000)
> from space 4800K, 0% used [0x00000000ffb50000,0x00000000ffb50000,0x0000000100000000)
> to space 4736K, 0% used [0x00000000ff6b0000,0x00000000ff6b0000,0x00000000ffb50000)
> ParOldGen total 68288K, used 8371K [0x00000000f9c00000, 0x00000000fdeb0000, 0x00000000fdeb0000)
> object space 68288K, 12% used [0x00000000f9c00000,0x00000000fa42cea8,0x00000000fdeb0000)
> PSPermGen total 83968K, used 78621K [0x00000000f4a00000, 0x00000000f9c00000, 0x00000000f9c00000)
> object space 83968K, 93% used [0x00000000f4a00000,0x00000000f96c7478,0x00000000f9c00000)
> {noformat}
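In the dump above, the "main" thread is blocked in SocketInputStream.socketRead0 underneath BufferedReader.readLine, called from Byteman's Submit$Comm.readResponse. The following is a minimal sketch of how a socket read timeout bounds that kind of wait. It is an illustration only and not necessarily what the pull request changes: ReadTimeoutDemo and readTimesOut are hypothetical names, and the silent local server merely stands in for an agent that never answers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it is not part of Byteman or Narayana.
class ReadTimeoutDemo {

    // Returns true if readLine() gave up after timeoutMillis instead of blocking forever.
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        // The server socket lets the connection complete but never writes a reply.
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {
            // Bound every blocking read on this socket.
            client.setSoTimeout(timeoutMillis);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
            try {
                in.readLine(); // without setSoTimeout this call would wait indefinitely
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

With a timeout in place, a hung peer surfaces as a SocketTimeoutException in the test rather than as a build-level timeout.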
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 5 months
[JBoss JIRA] (JBTM-1562) can not run cxx tests
by Amos Feng (JIRA)
[ https://issues.jboss.org/browse/JBTM-1562?page=com.atlassian.jira.plugin.... ]
Amos Feng updated JBTM-1562:
----------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> can not run cxx tests
> ---------------------
>
> Key: JBTM-1562
> URL: https://issues.jboss.org/browse/JBTM-1562
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Testing
> Reporter: Amos Feng
> Assignee: Amos Feng
> Fix For: 5.0.0.M3
>
>
> The cxx tests do not run with the command './build.sh clean install'.
> {code}
> [INFO] --- blacktie-cpp-plugin:5.0.0.M3-SNAPSHOT:test-compile (test-compile) @ blacktie-core ---
> init:
> fileset.test.check:
> [echo] src/test/cpp **/*.c*
> gen-test-runner1:
> [mkdir] Created dir: /home/zhfeng/src/zhfeng/blacktie/core/target/generated-sources
> [mkdir] Created dir: /home/zhfeng/src/zhfeng/blacktie/core/target/generated/proto
> gen-test-runner2:
> gen-test-runner:
> package:
> test-compile:
> [mkdir] Created dir: /home/zhfeng/src/zhfeng/blacktie/core/target/cpp-test-classes
> [copy] Copying 5 files to /home/zhfeng/src/zhfeng/blacktie/core/target/cpp-test-classes
> [cc] 12 total files to be compiled.
> [cc] Starting link
> _test-compile-msvc:
> [INFO]
> [INFO] --- maven-surefire-plugin:2.7.2:test (default-test) @ blacktie-core ---
> [INFO] Surefire report directory: /home/zhfeng/src/zhfeng/blacktie/core/target/surefire-reports
> -------------------------------------------------------
> T E S T S
> -------------------------------------------------------
> There are no tests to run.
> Results :
> Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
> [INFO]
> [INFO] --- blacktie-cpp-plugin:5.0.0.M3-SNAPSHOT:test (test) @ blacktie-core ---
> init:
> fileset.test.check:
> [echo] src/test/cpp **/.c*
> test:
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 25.725s
> [INFO] Finished at: Mon Mar 25 11:33:00 CST 2013
> [INFO] Final Memory: 8M/148M
> [INFO] ------------------------------------------------------------------------
>
> {code}
> The problem appears to be with "test.includes", which is only valid in the test-compile phase: the pattern is echoed as **/*.c* during test-compile but as **/.c* during test. With the following change the tests run.
> {code}
> diff --git a/utils/cpp-plugin/src/main/resources/btcpp.build.xml b/utils/cpp-plugin/src/main/resources
> index 3934343..63b730c 100644
> --- a/utils/cpp-plugin/src/main/resources/btcpp.build.xml
> +++ b/utils/cpp-plugin/src/main/resources/btcpp.build.xml
> @@ -60,7 +60,7 @@
> <property name="src.main" value="src/main/cpp" />
> <property name="src.test" value="src/test/cpp" />
> <property name="src.excludes" value="" />
> - <property name="test.includes" value="" />
> + <property name="test.includes" value="*" />
> <property name="test.excludes" value="" />
> <property name="lib.type" value="shared" />
> <property name="runtime" value="dynamic" />
> {code}
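To see why an empty "test.includes" selects nothing, here is a minimal sketch. It assumes, based only on the echoed patterns in the build output, that the pattern is built as "**/" + test.includes + ".c*". It also uses Java's glob matcher as a stand-in for Ant's pattern matching, which is not identical. TestIncludesDemo and the file name TestFoo.cxx are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical demo class; it is not part of the blacktie cpp-plugin.
class TestIncludesDemo {

    // Assumption: the plugin concatenates the property into the pattern like this.
    static String pattern(String testIncludes) {
        return "**/" + testIncludes + ".c*";
    }

    // True if a file would be selected for the given test.includes value.
    static boolean selects(String testIncludes, String file) {
        PathMatcher matcher =
                FileSystems.getDefault().getPathMatcher("glob:" + pattern(testIncludes));
        return matcher.matches(Paths.get(file));
    }

    public static void main(String[] args) {
        String file = "src/test/cpp/TestFoo.cxx"; // hypothetical test source
        // Empty value: the pattern only matches names that start with ".c".
        System.out.println(pattern("") + " selects " + file + ": " + selects("", file));
        // Value "*": the pattern matches any name with an extension starting with "c".
        System.out.println(pattern("*") + " selects " + file + ": " + selects("*", file));
    }
}
```

Under that assumption, defaulting the property to "*" gives the test phase the same pattern that test-compile already used.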
11 years, 5 months
[JBoss JIRA] (JBTM-1522) "no XTS application recovery module found" during XTS Recovery Tests
by Amos Feng (JIRA)
[ https://issues.jboss.org/browse/JBTM-1522?page=com.atlassian.jira.plugin.... ]
Work on JBTM-1522 started by Amos Feng.
> "no XTS application recovery module found" during XTS Recovery Tests
> --------------------------------------------------------------------
>
> Key: JBTM-1522
> URL: https://issues.jboss.org/browse/JBTM-1522
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Testing, XTS
> Reporter: Paul Robinson
> Assignee: Amos Feng
> Priority: Critical
> Fix For: 5.0.0.M3
>
>
> See: http://172.17.131.2/view/Narayana+BlackTie/job/narayana/211/artifact/XTS/...
> Notice that the following log message is displayed repeatedly until the test gives up waiting for recovery:
> {code}
> WARN [com.arjuna.wsrecovery] (Periodic Recovery) ARJUNA046032: no XTS application recovery module found to help reactivate recovered WS-AT participant org.jboss.jbossts.xts.servicetests.DurableTestParticipant.0
> {code}
> This error comes from org.jboss.jbossts.xts.recovery.participant.at.XTSATRecoveryManagerImple#recoverParticipants(). In particular:
> {code}
> if (!found) {
>     // we failed to find a helper to convert a participant record so log a warning
>     // but leave it in the table for next time
>     RecoveryLogger.i18NLogger.warn_participant_at_XTSATRecoveryModule_4(participantRecoveryRecord.getId());
> }
> {code}
> It looks like the code is unable to restore the participant from the log because restoreParticipant(XTSATRecoveryModule module) returns false. There is a ParticipantRecoveryRecord in the log, as you can see from it being dumped to the console in the log above. Maybe there is a problem with that log, or maybe we are missing another log entry?
> This problem is intermittent, so it's unlikely that you will see this happen when you attach a debugger. However, we could attach a debugger to see what happens in the normal case and also to inspect the log to see if anything is missing in the failing case. But I have a cunning plan...
> h4.Cunning Plan
> We need to get a copy of the failing log, before recovery is attempted. We should then be able to use that log to reproduce the issue on our own machines. Steps to take:
> # Update BaseCrashTest to copy the contents of the tx-object-store to a unique folder location (so we can retrieve it later for a failed run). Make sure you create the folder structure under target/surefire-reports so that CI archives it. Do the copy between controller.kill and controller.start. This way we get the log before the recovery manager has had a chance to tamper with it.
> # Update the "narayana-JBTM-1522" job in CI to use your branch, containing the change above.
> # Configure the job to run @hourly until it fails with this problem.
> # Take a copy of the tx-object-store from the failing test and then put it in place on your AS8 build.
> # Boot the AS and confirm that the issue is reproduced.
> # You can now keep putting the tx-object-store back in place every time you need to reproduce the issue.
> # Attach a debugger to find out what the problem is.
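The first step of the plan above could be sketched roughly as follows. This is an assumption-laden illustration, not the actual BaseCrashTest change: ObjectStoreSnapshot, snapshot and the folder naming scheme are hypothetical, and the caller would pass in the real tx-object-store location and a directory under target/surefire-reports.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper; it is not part of the XTS test suite.
class ObjectStoreSnapshot {

    // Recursively copies objectStore into a uniquely named folder under reportsDir
    // and returns the folder that was created.
    static Path snapshot(Path objectStore, Path reportsDir, String testName) throws IOException {
        // The nanoTime suffix keeps snapshots from repeated runs apart.
        Path target = reportsDir.resolve("tx-object-store-" + testName + "-" + System.nanoTime());
        try (Stream<Path> entries = Files.walk(objectStore)) {
            // Files.walk visits a directory before its contents, so parents exist first.
            entries.forEach(source -> {
                Path dest = target.resolve(objectStore.relativize(source).toString());
                try {
                    if (Files.isDirectory(source)) {
                        Files.createDirectories(dest);
                    } else {
                        Files.copy(source, dest);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Self-contained demo using temporary directories in place of the real locations.
        Path store = Files.createTempDirectory("tx-object-store");
        Files.write(Files.createDirectories(store.resolve("a")).resolve("record"), new byte[] {1, 2, 3});
        Path reports = Files.createTempDirectory("surefire-reports");
        System.out.println("copied to " + snapshot(store, reports, "demo"));
    }
}
```

In the test, the call would sit between controller.kill and controller.start so that the copy is taken before recovery touches the store.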
11 years, 5 months
[JBoss JIRA] (JBTM-1522) "no XTS application recovery module found" during XTS Recovery Tests
by Amos Feng (JIRA)
[ https://issues.jboss.org/browse/JBTM-1522?page=com.atlassian.jira.plugin.... ]
Amos Feng updated JBTM-1522:
----------------------------
Assignee: Amos Feng (was: Paul Robinson)
> "no XTS application recovery module found" during XTS Recovery Tests
> --------------------------------------------------------------------
>
> Key: JBTM-1522
> URL: https://issues.jboss.org/browse/JBTM-1522
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Testing, XTS
> Reporter: Paul Robinson
> Assignee: Amos Feng
> Priority: Critical
> Fix For: 5.0.0.M3
>
>
11 years, 5 months
[JBoss JIRA] (JBTM-1522) "no XTS application recovery module found" during XTS Recovery Tests
by Paul Robinson (JIRA)
[ https://issues.jboss.org/browse/JBTM-1522?page=com.atlassian.jira.plugin.... ]
Paul Robinson updated JBTM-1522:
--------------------------------
Priority: Critical (was: Major)
> "no XTS application recovery module found" during XTS Recovery Tests
> --------------------------------------------------------------------
>
> Key: JBTM-1522
> URL: https://issues.jboss.org/browse/JBTM-1522
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Testing, XTS
> Reporter: Paul Robinson
> Assignee: Paul Robinson
> Priority: Critical
> Fix For: 5.0.0.M3
>
>
11 years, 5 months