[jboss-jira] [JBoss JIRA] (WFCORE-263) Cancelling management op on slave HC tree is broken

Thu Nov 20 00:18:39 EST 2014

    [ https://issues.jboss.org/browse/WFCORE-263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021403#comment-13021403 ] 

James Livingston commented on WFCORE-263:
-----------------------------------------

Doing that works correctly.

My understanding is that in this setup there are three levels of operation: a composite op running on the DC, a composite op running on each slave HC, and one op per server instance that does the undeployment. Cancelling the DC op works correctly: it tells the server instance to interrupt, and the op rolls back (obviously the web container thread doing the undeploy is still blocked). Cancelling the per-server op also works correctly, as it triggers a rollback of the HC op and then the DC op.

If you cancel the HC op, the cancel reports success and the HC op disappears, but the per-server op still appears to be running. Cancelling that per-server op then gets things back to a working state.

I'm not sure why, but it looks like cancelling the HC-level op (rather than the DC op) does not cancel the per-server op.
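
As a rough sketch, the three cancellation points described above might look like this in the CLI. The operation ids and the server name server-one are placeholders rather than values from this report, and the sketch assumes each active-operation resource exposes a cancel operation and that the per-server resource is addressable through the host:

```
# 1. Cancel the DC-level composite op (reported to work: the server op is interrupted and rolls back)
/host=master/core-service=management/service=management-operations/active-operation=<dc-op-id>:cancel

# 2. Cancel the slave HC-level composite op (the problem case: the HC op disappears,
#    but the per-server op still looks like it is running)
/host=slave/core-service=management/service=management-operations/active-operation=<hc-op-id>:cancel

# 3. Cancel the per-server op (reported to work: the HC op and then the DC op roll back)
/host=slave/server=server-one/core-service=management/service=management-operations/active-operation=<server-op-id>:cancel
```

Only case 2 misbehaves; running case 3 afterwards is what restores a working state.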

> Cancelling management op on slave HC tree is broken
> ---------------------------------------------------
>
>                 Key: WFCORE-263
>                 URL: https://issues.jboss.org/browse/WFCORE-263
>             Project: WildFly Core
>          Issue Type: Bug
>          Components: Domain Management
>    Affects Versions: 1.0.0.Alpha9
>            Reporter: James Livingston
>            Assignee: Brian Stansberry
>
> If you have a DC with a slave HC, and perform a management operation which gets stuck, non-progressing operations will be reported for both the DC and the slave HC via:
> /host=master/core-service=management/service=management-operations:find-non-progressing-operation
> /host=slave/core-service=management/service=management-operations:find-non-progressing-operation
> Cancelling the operation under /host=master works as expected: the cancellation is pushed down to the slave, and the controllers become responsive again.
> If, however, you attempt to cancel the operation under /host=slave, the cancellation does not take effect. The CLI reports { "outcome" => "success", "result" => undefined }, but the controllers are still unresponsive.
> Running :find-non-progressing-operation against the slave then reports { "outcome" => "success", "result" => undefined } rather than saying that no non-progressing operations were found, and active-operation=*:read-resource() shows the operation as not cancelled.
> Once you have attempted to cancel it on the slave, cancelling it under /host=master reports success but leaves the slave op in a weird state, and anything requiring the controller lock (such as the web UI) still does not respond.
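
The sequence in the description can be sketched as a CLI session along these lines. This is an illustration only: <id> is a placeholder for whatever operation id the find step returns, and the cancel operation on the active-operation resource is an assumption, since the report does not say exactly how the cancel was issued:

```
# Locate the stuck operation on each host controller
/host=master/core-service=management/service=management-operations:find-non-progressing-operation
/host=slave/core-service=management/service=management-operations:find-non-progressing-operation

# Failing case: cancel on the slave first (<id> is a placeholder)
/host=slave/core-service=management/service=management-operations/active-operation=<id>:cancel

# Re-check the slave: look for non-progressing ops and inspect the active operations
/host=slave/core-service=management/service=management-operations:find-non-progressing-operation
/host=slave/core-service=management/service=management-operations/active-operation=*:read-resource()
```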

--
This message was sent by Atlassian JIRA
(v6.3.8#6338)