[JBoss JIRA] (JBTM-1350) Deadlock in LockManager
by Mark Little (JIRA)
[ https://issues.jboss.org/browse/JBTM-1350?page=com.atlassian.jira.plugin.... ]
Mark Little commented on JBTM-1350:
-----------------------------------
Code commented out in 2PC methods ...
// ThreadActionData.pushAction(theTransaction); // unnecessary if context goes with all calls.
... can be uncommented for LockManager. I'm unsure when it was commented out, but it was a long time ago, presumably when we were doing less with LockManager or were just lucky enough not to run into these concurrency issues.
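As a rough illustration of why the push matters: the report quoted below hinges on BasicAction.Current() returning null on the committing thread. The sketch assumes that the push associates the transaction with the calling thread. It is not the Narayana implementation of ThreadActionData; the class ThreadActionSketch, its methods, and the use of a String in place of a transaction are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical per-thread stack of "current" actions (not Narayana code).
class ThreadActionSketch {
    // Each thread sees its own, initially empty, stack.
    private static final ThreadLocal<Deque<String>> ACTIONS =
            ThreadLocal.withInitial(ArrayDeque::new);

    // Associate an action with the calling thread.
    static void pushAction(String action) {
        ACTIONS.get().push(action);
    }

    // Disassociate the most recently pushed action; returns null if none.
    static String popAction() {
        return ACTIONS.get().poll();
    }

    // The action the calling thread is currently associated with, or null.
    static String current() {
        return ACTIONS.get().peek();
    }

    public static void main(String[] args) {
        System.out.println(current());   // nothing pushed yet on this thread
        pushAction("theTransaction");
        System.out.println(current());   // the pushed action is now current
        popAction();
    }
}
```

In this model a thread that never pushes sees a null current action, which is the situation the report describes for the committing thread.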
> Deadlock in LockManager
> -----------------------
>
> Key: JBTM-1350
> URL: https://issues.jboss.org/browse/JBTM-1350
> Project: JBoss Transaction Manager
> Issue Type: Bug
> Security Level: Public (Everyone can see)
> Components: Transaction Core
> Affects Versions: 5.0.0.M1, 4.17.2
> Reporter: Michael Musgrove
> Assignee: Michael Musgrove
> Fix For: 5.0.0.M3
>
> Attachments: deadlock_via_intrinsic_lock, deadlock_via_ServerNestedAction, jstack.16785
>
> Original Estimate: 1 day
> Remaining Estimate: 1 day
>
> A deadlock can occur whilst calling LockManager.setlock if another thread tries to commit a transaction that has the same lock manager as a participant. The attached java stack dump, jstack.16785, (from test com.hp.mwtests.ts.jts.remote.hammer.DistributedHammer2) shows an example. It shows two threads interacting with a remote HammerObject:
> - Thread 1 updates the remote object;
> - Thread 2 commits a transaction that has the same HammerObject instance as a participant;
> Thread 1 calls setlock on HammerObject which synchronizes on BasicAction.Current() and LockManager.locksHeldLockObject and then activates the object (which triggers an object load from the object store). The activate call tries to lock StateManager.mutex and this is where thread 1 deadlocks.
> Meanwhile a commit request is issued which results in Thread 2 running at the same time thread 1 is calling activate. The commit asks HammerObject to prepare and commit. The participant commit asks HammerObject to release any locks it has held (LockManager.releaseAll). This call first tries to lock BasicAction.Current() which is null during commit time (since commit disassociates the transaction from the thread before committing the participants). Instead it locks StateManager.mutex which succeeds. Then it tries to lock LockManager.locksHeldLockObject and that is where Thread 2 deadlocks.
> Note that the problem arises because Thread 2 gets null when it calls BasicAction.Current() and instead locks StateManager.mutex. If BasicAction.Current() were not null, Thread 2 would try to lock it but would not obtain the lock until Thread 1 had completed the activate call and released BasicAction.Current(), thus allowing Thread 2 to continue as normal.
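The lock ordering described in the report can be checked with a small sketch. It models each thread only as the ordered list of monitors it acquires, using the names from the report as plain strings, and looks for a pair taken in opposite orders. The class LockOrderCheck and the method hasInversion are hypothetical, and the non-null case assumes both threads would synchronize on the same action object, as the report implies.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical lock-order check (not Narayana code).
class LockOrderCheck {
    // True if some pair of locks is acquired in opposite orders by the two threads.
    static boolean hasInversion(List<String> first, List<String> second) {
        for (int i = 0; i < first.size(); i++) {
            for (int j = i + 1; j < first.size(); j++) {
                int a = second.indexOf(first.get(i));
                int b = second.indexOf(first.get(j));
                // Both locks are taken by the second thread, but in reverse order.
                if (a >= 0 && b >= 0 && b < a) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Thread 1: setlock followed by activate.
        List<String> updater = Arrays.asList(
                "BasicAction.Current()",
                "LockManager.locksHeldLockObject",
                "StateManager.mutex");
        // Thread 2: releaseAll when BasicAction.Current() is null.
        List<String> committerNull = Arrays.asList(
                "StateManager.mutex",
                "LockManager.locksHeldLockObject");
        // Thread 2: releaseAll if BasicAction.Current() were not null (assumed).
        List<String> committerNonNull = Arrays.asList(
                "BasicAction.Current()",
                "LockManager.locksHeldLockObject");

        System.out.println(hasInversion(updater, committerNull));    // true
        System.out.println(hasInversion(updater, committerNonNull)); // false
    }
}
```

The inversion in the null case is between LockManager.locksHeldLockObject and StateManager.mutex, which matches the two points at which the threads are reported to block.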
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 9 months
[JBoss JIRA] (JBTM-1601) Failing qa testcase org.jboss.jbossts.qa.junit.testgroup.TestGroup_crashrecovery02_2 on windows machines with jacorb
by Ondřej Chaloupka (JIRA)
Ondřej Chaloupka created JBTM-1601:
--------------------------------------
Summary: Failing qa testcase org.jboss.jbossts.qa.junit.testgroup.TestGroup_crashrecovery02_2 on windows machines with jacorb
Key: JBTM-1601
URL: https://issues.jboss.org/browse/JBTM-1601
Project: JBoss Transaction Manager
Issue Type: Bug
Security Level: Public (Everyone can see)
Components: Testing
Affects Versions: 5.0.0.M2, 4.17.3
Reporter: Ondřej Chaloupka
Assignee: Tom Jenkinson
I'm hitting an issue with the QA tests on Windows machines. I'm currently testing EAP 6.1.0.ER3.
Testcase org.jboss.jbossts.qa.junit.testgroup.TestGroup_crashrecovery02_2 fails when it is run on Windows machines, regardless of which JDK is used. It fails on the 4.17 branch as well as on master.
This happens with JacORB.
The failures occur consistently in 5 tests of the testcase, from CrashRecovery02_2_Test26 to CrashRecovery02_2_Test30.
All of them fail with the assertion:
{quote}
junit.framework.AssertionFailedError: task client1 printed Failed.
{quote}
These details apply to test CrashRecovery02_2_Test27:
The client implementation is org.jboss.jbossts.qa.CrashRecovery02.Client02a and the failure comes from line 114 (branch 4.17).
{code}
correct = correct && (resourceTrace1 == ResourceTrace.ResourceTraceCommit);
{code}
Here the value of resourceTrace1 is {{ResourceTraceNone}}.
I haven't got any further with the investigation so far.
Steps to reproduce may be handy (using the narayana.sh script):
1. export NARAYANA_BUILD=0
export NARAYANA_TESTS=0
export CP_NARAYANA_AS=0
export AS_BUILD=0
export XTS_AS_TESTS=0
export TXF_TESTS=0
export XTS_TESTS=0
export txbridge=0
export QA_TESTS=1
export SUN_ORB=0
export QA_TARGET=test
export QA_PROFILE="-Dtest=crashrecovery02_2"
export WORKSPACE=$PWD
2. run narayana.sh - I had a problem with paths, so the command ended up looking like this:
{quote}
sh scripts/hudson/narayana.sh -Demma.jar.location=c:\\tmp\\ochaloup\\ext -Demma.enabled=false -Dorson.jar.location=\\tmp\\ochaloup\\ext
{quote}
You can check the whole stack trace in the jobs on Jenkins:
- https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JBossTS/view/JBossT...
- https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JBossTS/view/JBossT...
Do you think that you could check this?
11 years, 9 months
[JBoss JIRA] (JBTM-1600) narayana.sh continually attempts to rebase despite non in progress
by Paul Robinson (JIRA)
Paul Robinson created JBTM-1600:
-----------------------------------
Summary: narayana.sh continually attempts to rebase despite non in progress
Key: JBTM-1600
URL: https://issues.jboss.org/browse/JBTM-1600
Project: JBoss Transaction Manager
Issue Type: Bug
Security Level: Public (Everyone can see)
Components: Build System
Reporter: Paul Robinson
Assignee: Tom Jenkinson
Fix For: 5.0.0.M3
See "No rebase in progress?" lines in:
http://172.17.131.2/job/btny-pulls-narayana/417/
I think this can happen if you merge the last commit. The rebase is then complete, so "rebase --continue" fails, causing the script to continue its loop. This is from memory; I have not had a chance to check properly.
Also, it could be due to an upgrade in the CI nodes bringing a new version of git.
11 years, 9 months