[JBoss JIRA] (WFLY-1982) NPE in ModelControllerLock
by Emanuel Muckenhuber (JIRA)
[ https://issues.jboss.org/browse/WFLY-1982?page=com.atlassian.jira.plugin.... ]
Emanuel Muckenhuber resolved WFLY-1982.
---------------------------------------
Resolution: Done
> NPE in ModelControllerLock
> --------------------------
>
> Key: WFLY-1982
> URL: https://issues.jboss.org/browse/WFLY-1982
> Project: WildFly
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Domain Management
> Reporter: Brian Stansberry
> Assignee: Emanuel Muckenhuber
> Fix For: 8.0.0.CR1
>
>
> Just noticed this in the host-controller.log while looking into a non-progressing RespawnTestCase:
> 22:23:50,552 ERROR [org.jboss.as.controller.management-operation] (proxy-threads - 1) JBAS014612: Operation ("register-server") failed - address: ([]): java.lang.NullPointerException
> at org.jboss.as.controller.ModelControllerLock$Sync.tryAcquire(ModelControllerLock.java:75) [wildfly-controller-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220) [rt.jar:1.7.0_15]
> at org.jboss.as.controller.ModelControllerLock.lockInterruptibly(ModelControllerLock.java:48) [wildfly-controller-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at org.jboss.as.controller.ModelControllerImpl.acquireLock(ModelControllerImpl.java:582) [wildfly-controller-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at org.jboss.as.controller.OperationContextImpl.takeWriteLock(OperationContextImpl.java:403) [wildfly-controller-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at org.jboss.as.controller.OperationContextImpl.acquireControllerLock(OperationContextImpl.java:700) [wildfly-controller-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at org.jboss.as.host.controller.mgmt.ServerToHostProtocolHandler$ServerReconnectRequestHandler$1$1.execute(ServerToHostProtocolHandler.java:268)
> at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:610) [wildfly-controller-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at org.jboss.as.controller.AbstractOperationContext.doCompleteStep(AbstractOperationContext.java:488) [wildfly-controller-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at org.jboss.as.controller.AbstractOperationContext.completeStepInternal(AbstractOperationContext.java:277) [wildfly-controller-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at org.jboss.as.controller.AbstractOperationContext.executeOperation(AbstractOperationContext.java:272) [wildfly-controller-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at org.jboss.as.controller.ModelControllerImpl.internalExecute(ModelControllerImpl.java:257) [wildfly-controller-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at org.jboss.as.controller.AbstractControllerService.internalExecute(AbstractControllerService.java:292) [wildfly-controller-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at org.jboss.as.host.controller.DomainModelControllerService.access$600(DomainModelControllerService.java:148)
> at org.jboss.as.host.controller.DomainModelControllerService$InternalExecutor.execute(DomainModelControllerService.java:899)
> at org.jboss.as.host.controller.mgmt.ServerToHostProtocolHandler$ServerReconnectRequestHandler$1.execute(ServerToHostProtocolHandler.java:282)
> at org.jboss.as.protocol.mgmt.AbstractMessageHandler$2$1.doExecute(AbstractMessageHandler.java:296) [wildfly-protocol-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at org.jboss.as.protocol.mgmt.AbstractMessageHandler$AsyncTaskRunner.run(AbstractMessageHandler.java:518) [wildfly-protocol-8.0.0.Beta1-SNAPSHOT.jar:8.0.0.Beta1-SNAPSHOT]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_15]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_15]
> at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_15]
> at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.1.0.Final.jar:2.1.0.Final]
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] (WFLY-364) a "failure-causes-rollback="false"" attribute for the filesystem scanner
by Emanuel Muckenhuber (JIRA)
[ https://issues.jboss.org/browse/WFLY-364?page=com.atlassian.jira.plugin.s... ]
Emanuel Muckenhuber reassigned WFLY-364:
----------------------------------------
Assignee: Emanuel Muckenhuber (was: Brian Stansberry)
> a "failure-causes-rollback="false"" attribute for the filesystem scanner
> ------------------------------------------------------------------------
>
> Key: WFLY-364
> URL: https://issues.jboss.org/browse/WFLY-364
> Project: WildFly
> Issue Type: Feature Request
> Security Level: Public(Everyone can see)
> Components: Domain Management
> Reporter: Max Rydahl Andersen
> Assignee: Emanuel Muckenhuber
> Fix For: 8.0.0.CR1
>
>
> JBIDE-11509, AS7-783 and TORQUE-576 all describe the same problem: all deployments found at startup are deployed in one operation, and if one deployment fails, all of them are rolled back. This results in some rather bad usability issues, especially at development time, but also at production time for those using file deployments.
> A suggestion on IRC was that there could be an option on the file scanner (possibly false by default?) to say whether a failure causes rollback.
> Individual deployments could then still fail, but at least not everything would be rolled back, and proper interdependent deployments would still work.
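The difference between the two behaviours requested above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: `ScannerRollbackSketch`, `deployAll` and the `failureCausesRollback` flag are hypothetical names chosen for illustration and are not WildFly APIs, and a real scanner would execute management operations rather than fill a list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are WildFly APIs.
final class ScannerRollbackSketch {

    /**
     * Simulates one scan. Each map entry is a deployment name and a flag
     * saying whether that deployment succeeds (true) or fails (false).
     * Returns the names that remain deployed after the scan.
     */
    static List<String> deployAll(Map<String, Boolean> found, boolean failureCausesRollback) {
        List<String> deployed = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : found.entrySet()) {
            if (entry.getValue()) {
                deployed.add(entry.getKey());
            } else if (failureCausesRollback) {
                // All-or-nothing: one failure undoes everything deployed so far.
                deployed.clear();
                return deployed;
            }
            // With failureCausesRollback=false the failed deployment is skipped.
        }
        return deployed;
    }

    public static void main(String[] args) {
        Map<String, Boolean> found = new LinkedHashMap<>();
        found.put("a.war", true);
        found.put("broken.war", false);
        found.put("b.war", true);

        System.out.println(deployAll(found, true));   // prints []
        System.out.println(deployAll(found, false));  // prints [a.war, b.war]
    }
}
```

With the flag set to false, the broken deployment is the only one missing, which is the usability improvement the request asks for.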
[JBoss JIRA] (WFLY-88) Recovery not fully triggered when distributed transaction falls down at prepare phase of 2PC
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/WFLY-88?page=com.atlassian.jira.plugin.sy... ]
RH Bugzilla Integration commented on WFLY-88:
---------------------------------------------
Ondrej Chaloupka <ochaloup(a)redhat.com> changed the Status of [bug 952746|https://bugzilla.redhat.com/show_bug.cgi?id=952746] from ON_QA to ASSIGNED
> Recovery not fully triggered when distributed transaction falls down at prepare phase of 2PC
> --------------------------------------------------------------------------------------------
>
> Key: WFLY-88
> URL: https://issues.jboss.org/browse/WFLY-88
> Project: WildFly
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: EJB, Remoting
> Reporter: Ivo Studensky
> Assignee: jaikiran pai
> Fix For: 8.0.0.Alpha1
>
> Attachments: logs_prepareHaltClient.tgz
>
>
> It looks like the recovery process is not fully triggered for a distributed transaction when the transaction fails at the prepare phase of 2PC. In the new crash recovery tests over propagated transactions, only one of the two servers recovers from the crash; the other keeps an unfinished tx in its tx log.
> This corresponds to the prepareHaltClient and prepareHaltServer test methods of org.jboss.as.test.jbossts.crashrec.txpropagation.TxPropagationCrashRecoveryTestCase; see JBQA-2604 for a general description of the new tests. The prepareHaltClient test crashes the server that initiated the transaction, whereas the prepareHaltServer test crashes the second server.
> The tests are written against the EAP6.x branch, so reproducing this requires a server built from the 7.1 branch of AS7.
> Steps to reproduce.
> 1. git clone -b as7 git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-transactions.git
> 2. cd eap-tests-transactions
> 3. git checkout tx_propag_crashrec_tests
> 4a. mvn clean verify -Dtest=TxPropagationCrashRecoveryTestCase#prepareHaltClient -Djboss.dist=<path to jboss-as-7.1.3.Final-SNAPSHOT>
> or
> 4b. mvn clean verify -Dtest=TxPropagationCrashRecoveryTestCase#prepareHaltServer -Djboss.dist=<path to jboss-as-7.1.3.Final-SNAPSHOT>
> The logs of the prepareHaltClient run are attached to this jira.
[JBoss JIRA] (WFLY-88) Recovery not fully triggered when distributed transaction falls down at prepare phase of 2PC
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/WFLY-88?page=com.atlassian.jira.plugin.sy... ]
RH Bugzilla Integration commented on WFLY-88:
---------------------------------------------
Ondrej Chaloupka <ochaloup(a)redhat.com> made a comment on [bug 952746|https://bugzilla.redhat.com/show_bug.cgi?id=952746]
Hi David,
I've checked the current state of the issue (it has been a while since I last looked at it), and I can say that there is still a problem with waking up the EJB remote connection when the remote server (the server that is called from the client server via an outbound connection) crashes and then comes up again. The client server (which started the tx) then does not know that the remote server is up and that recovery can be done.
This happens only for distributed JTA transactions. JTS transactions manage the distributed communication between nodes themselves, and recovery starts without a problem.
The workaround for the recovery is to call a remote method from the client server to the remote server after the remote server comes back to life. Then the crash recovery will start.
The test scenario in which this problem occurs looks like this:
- a transaction is started on the client server
- the client server calls the remote server via an outbound connection (the tx context is propagated to the remote server)
- the remote server sends a message to a queue (simulating some action done during the transaction)
- the remote call and the bean method finish
- the transaction starts 2PC. The prepare phase completes and the commit phase starts. The remote server crashes at the entry to the commit method
- the client server is still alive
- the remote server comes back to life
- crash recovery should proceed with the commit, as all participants agreed on it
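To make the scenario and the workaround easier to follow, here is a minimal simulation. It is a sketch under assumptions, not the EJB client or transaction manager code: `RecoverySketch`, `RemoteParticipant`, `invoke` and `recoveryPass` are hypothetical names, and the `connected` flag stands in for the client server's knowledge of the outbound connection.

```java
// Hypothetical simulation; not the EJB client or transaction manager API.
final class RecoverySketch {

    /** The remote server as seen from the client server. */
    static final class RemoteParticipant {
        boolean up = true;          // is the remote server process running?
        boolean connected = true;   // does the client server have a live connection?
        boolean committed = false;  // has the prepared tx branch been committed?

        void crash() {
            up = false;
            connected = false;
        }

        /** The server is back, but the client server is not notified. */
        void restart() {
            up = true;
        }

        /** Any remote call from the client server re-establishes the connection. */
        void invoke() {
            if (up) {
                connected = true;
            }
        }
    }

    /** One recovery pass on the client server; returns true once the commit is done. */
    static boolean recoveryPass(RemoteParticipant participant) {
        if (participant.up && participant.connected) {
            participant.committed = true;
        }
        return participant.committed;
    }

    public static void main(String[] args) {
        RemoteParticipant remote = new RemoteParticipant();
        remote.crash();    // crash at the entry to commit, after prepare succeeded
        remote.restart();  // remote server comes back to life

        System.out.println(recoveryPass(remote)); // prints false: no known connection
        remote.invoke();                          // workaround: call a remote method
        System.out.println(recoveryPass(remote)); // prints true: recovery can commit
    }
}
```

In this model recovery stays stuck until something calls `invoke()`, which mirrors the reported behaviour where recovery only starts after the next remote call from the client server.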
Here is the explanation from Jaikiran:
When a connection breaks down between the server and the client, specifically when the client goes down and comes back up again, then the server and the client will not auto communicate with each other.
In other words, the server will have no knowledge (in EJB resource sense) that the client has come back up again. That effectively means that the EJB tx recovery process will have no clue of the EJB nodes to communicate with.
To deal with that, there should be some communication from the client (which is now up) to the server to reestablish that connection.
In a real application, it would be the first invocation from the client to the server.
I've checked that a call from the client server to the remote one really does establish the connection, and recovery then starts.
But the next call from the client to the server could take some time, and meanwhile the transaction could be rolled back because of the timeout.
What do you think about this?
I think that the current behavior is not correct. Jaikiran and I agreed on that earlier as well, but he hasn't had time to fix it (https://bugzilla.redhat.com/show_bug.cgi?id=952746#c15).
Thanks
Ondra