Tom Jenkinson closed WFLY-4014.
-------------------------------
Resolution: Partially Completed
The ability for an app to provide its own CheckedAction was implemented in
which is in WFLY now. I don't think the
default behaviour should be to interrupt the drivers, as per the discussion on the forum
post. It's an option for a savvy user to implement their own CheckedAction to do the
interrupt, but they will do it with knowledge of how their set of drivers responds to
Thread.interrupt(). If you need help implementing a CheckedAction to interrupt threads,
please do ask over on:
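
For illustration only, the interrupt logic of such a CheckedAction might look like the
sketch below. It is plain JDK code: InterruptingCheckSketch and interruptActiveThreads are
hypothetical names, and the assumption (not confirmed anywhere in this issue) is that an
override of Narayana's CheckedAction check hook is handed a Hashtable of the threads still
active in the transaction and would delegate to a helper like this one. Verify the real
signature against your Narayana version before relying on it.

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical helper, NOT part of Narayana. Assumption: a CheckedAction
// subclass would call this from its check hook with the active-thread table.
final class InterruptingCheckSketch {

    /**
     * Interrupts every live thread in the table except the calling thread.
     * Returns the number of threads that were interrupted.
     */
    static int interruptActiveThreads(Hashtable<String, Thread> activeThreads) {
        int interrupted = 0;
        Thread self = Thread.currentThread();
        // Hold the table's monitor so it cannot change while we iterate.
        synchronized (activeThreads) {
            for (Map.Entry<String, Thread> entry : activeThreads.entrySet()) {
                Thread t = entry.getValue();
                if (t != self && t.isAlive()) {
                    t.interrupt(); // only helps if the blocked call honours interrupts
                    interrupted++;
                }
            }
        }
        return interrupted;
    }
}
```

Whether the interrupt achieves anything depends entirely on how the driver call the thread
is blocked in reacts to it, which is why this is left to the application.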
TransactionReaper wedged and not responding to interrupts
(ARJUNA012378, ARJUNA012120)
--------------------------------------------------------------------------------------
Key: WFLY-4014
URL:
https://issues.jboss.org/browse/WFLY-4014
Project: WildFly
Issue Type: Bug
Components: Transactions
Affects Versions: 8.1.0.Final
Environment: Darwin Keith-Yarbroughs-MBPro.local 13.4.0 Darwin Kernel Version
13.4.0: Sun Aug 17 19:50:11 PDT 2014; root:xnu-2422.115.4~1/RELEASE_X86_64 x86_64
Reporter: Arcadiy Ivanov
Assignee: Tom Jenkinson
Attachments: cluster_logs.2014-10-23T23-37-03.tar.gz,
cluster_logs.2014-10-24T16-50-50.tar.gz, stuck-shutdown-wedged-reaper-logs.tar.gz
This issue is definitely intermittent and appeared for the first time in several months. It
is severe enough, however (the server node becomes unresponsive and can only be killed with
SIGKILL), that I'm reporting it.
The issue occurred while running an Arquillian test. I don't know how to reproduce it.
The system is as follows:
* There is a multi-host multi-node WildFly domain cluster residing on a single machine
(127.0.0.(1+N) IPs, N > 0).
* There is a multi-node Postgres-XL cluster configured (127.0.1.(1+N) IPs, N > 0).
* There is a HAJDBC module configured. The HAJDBC cluster is configured with datasources
from the WildFly datasources subsystem, which has a datasource for each node of the
Postgres-XL cluster.
There is [another mention on the Internet of the same
problem|https://developer.jboss.org/thread/240172] without such an exotic setup, but
rather with just MySQL 5.6, although information there is scarce.
{noformat}
2014-10-23 23:19:47,127 INFO [org.wildfly.extension.undertow] (MSC service thread 1-16)
JBAS017534: Registered web context: /test
2014-10-23 23:19:47,154 INFO [org.jboss.as.server] (ServerService Thread Pool -- 64)
JBAS018559: Deployed "1208cb8c-2b19-4d9a-a8b9-101f6e9e778f.ear" (runtime-name :
"1208cb8c-2b19-4d9a-a8b9-101f6e9e778f.ear")
2014-10-23 23:24:47,417 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117:
TransactionReaper::check timeout for TX 0:ffffc0a801f4:-475d22cc:5449ccbe:2f in state
RUN
2014-10-23 23:24:47,420 WARN [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0)
ARJUNA012095: Abort of action id 0:ffffc0a801f4:-475d22cc:5449ccbe:2f invoked while
multiple threads active within it.
2014-10-23 23:24:47,420 WARN [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0)
ARJUNA012108: CheckedAction::check - atomic action 0:ffffc0a801f4:-475d22cc:5449ccbe:2f
aborting with 1 threads active!
2014-10-23 23:24:47,918 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117:
TransactionReaper::check timeout for TX 0:ffffc0a801f4:-475d22cc:5449ccbe:2f in state
CANCEL
2014-10-23 23:24:47,920 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012378:
ReaperElement appears to be wedged: sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:229)
java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
org.jboss.jca.adapters.jdbc.BaseWrapperManagedConnection.lock(BaseWrapperManagedConnection.java:373)
org.jboss.jca.adapters.jdbc.local.LocalManagedConnection.rollback(LocalManagedConnection.java:113)
org.jboss.jca.core.tx.jbossts.LocalXAResourceImpl.rollback(LocalXAResourceImpl.java:242)
com.arjuna.ats.internal.jta.resources.arjunacore.XAOnePhaseResource.rollback(XAOnePhaseResource.java:196)
com.arjuna.ats.internal.arjuna.abstractrecords.LastResourceRecord.topLevelAbort(LastResourceRecord.java:126)
com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:2939)
com.arjuna.ats.arjuna.coordinator.BasicAction.doAbort(BasicAction.java:2918)
com.arjuna.ats.arjuna.coordinator.BasicAction.Abort(BasicAction.java:1632)
com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.cancel(TwoPhaseCoordinator.java:116)
com.arjuna.ats.arjuna.AtomicAction.cancel(AtomicAction.java:215)
com.arjuna.ats.arjuna.coordinator.TransactionReaper.doCancellations(TransactionReaper.java:377)
com.arjuna.ats.internal.arjuna.coordinator.ReaperWorkerThread.run(ReaperWorkerThread.java:78)
2014-10-23 23:24:48,421 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117:
TransactionReaper::check timeout for TX 0:ffffc0a801f4:-475d22cc:5449ccbe:2f in state
CANCEL_INTERRUPTED
2014-10-23 23:24:48,422 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012120:
TransactionReaper::check worker Thread[Transaction Reaper Worker 0,5,main] not responding
to interrupt when cancelling TX 0:ffffc0a801f4:-475d22cc:5449ccbe:2f -- worker marked as
zombie and TX scheduled for mark-as-rollback
2014-10-23 23:24:48,422 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012110:
TransactionReaper::check successfuly marked TX 0:ffffc0a801f4:-475d22cc:5449ccbe:2f as
rollback only
{noformat}
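
The stack trace above shows the reaper worker parked inside ReentrantLock.lock(). That
method keeps waiting when the thread is interrupted (the interrupt only sets the thread's
flag), which would be consistent with the ARJUNA012120 "not responding to interrupt"
warning. Below is a minimal, self-contained sketch of that JDK behaviour; WedgedLockDemo
and waiterStopsOnInterrupt are made-up names for illustration and this is not IronJacamar
or Narayana code.

```java
import java.util.concurrent.locks.ReentrantLock;

final class WedgedLockDemo {

    /**
     * Blocks a waiter thread on a lock held by the caller, interrupts it, and
     * reports whether the waiter terminated within 500 ms of the interrupt.
     */
    static boolean waiterStopsOnInterrupt(boolean interruptible) {
        ReentrantLock lock = new ReentrantLock(true); // fair, like FairSync in the trace
        lock.lock(); // the caller holds the lock, so the waiter has to block
        try {
            Thread waiter = new Thread(() -> {
                try {
                    if (interruptible) {
                        lock.lockInterruptibly(); // throws when interrupted
                    } else {
                        lock.lock(); // keeps waiting even when interrupted
                    }
                    lock.unlock();
                } catch (InterruptedException e) {
                    // the waiter gave up on the lock
                }
            }, "waiter");
            waiter.setDaemon(true);
            waiter.start();
            // Wait until the waiter is actually queued on the lock.
            while (!lock.hasQueuedThread(waiter)) {
                Thread.sleep(10);
            }
            waiter.interrupt();
            waiter.join(500);
            return !waiter.isAlive();
        } catch (InterruptedException e) {
            throw new IllegalStateException("demo thread was interrupted", e);
        } finally {
            lock.unlock(); // lets an uninterruptible waiter finally proceed
        }
    }

    public static void main(String[] args) {
        System.out.println("lock(): stopped = " + waiterStopsOnInterrupt(false));
        System.out.println("lockInterruptibly(): stopped = " + waiterStopsOnInterrupt(true));
    }
}
```

In this sketch the lock() waiter only finishes once the holder releases the lock, whereas
the lockInterruptibly() waiter exits as soon as it is interrupted.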