[JBoss JIRA] (WFCORE-488) Failures of undeploy of a partially failed deployment

Monday, 23 February 2015

    [ https://issues.jboss.org/browse/WFCORE-488?page=com.atlassian.jira.plugin... ]

Brian Stansberry commented on WFCORE-488:
-----------------------------------------

Here's where I expect to end up with this (reflected in the linked pull request):

1) It would be good to know, ideally with a reproducer, how the deployment ended up not
installed on one server but installed elsewhere. Unless a rollout plan allowed that, it's
a bug, but we should treat it as a separate bug.

2) Once a server gets out of sync with the domain config (say because you made a change
with a rollout plan that allowed this), you can get failures like the one you report if
you try to make further changes. Options are:

a) stop the server that is inconsistent before making the changes
b) specify a rollout plan that will not roll back if one server fails in the given server
group. See "Rollout plans" in
https://developer.jboss.org/wiki/JBossAS7Command-lineOperationRequestFormat and
https://docs.jboss.org/author/display/WFLY9/Admin+Guide#AdminGuide-Operat...
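
To make option b) more concrete, here is a rough sketch of the shape of a rollout plan header that tolerates one failed server in a group. It uses plain Java maps rather than the real management client API, so the class and method names (RolloutPlanSketch, tolerantPlan) are hypothetical. The key names follow my reading of the "Rollout plans" documentation linked above and should be checked against it.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: models the operation headers as nested maps only.
class RolloutPlanSketch {

    /** Builds headers with a rollout plan for one server group that tolerates maxFailedServers failures. */
    static Map<String, Object> tolerantPlan(String serverGroup, int maxFailedServers) {
        // Per-group policy: apply to all servers at once, allow some to fail.
        Map<String, Object> policy = new LinkedHashMap<>();
        policy.put("rolling-to-servers", false);
        policy.put("max-failed-servers", maxFailedServers);

        Map<String, Object> group = new LinkedHashMap<>();
        group.put(serverGroup, policy);

        // One step in the series, addressing a single server group.
        Map<String, Object> step = new LinkedHashMap<>();
        step.put("server-group", group);

        Map<String, Object> plan = new LinkedHashMap<>();
        plan.put("in-series", Collections.singletonList(step));
        plan.put("rollback-across-groups", false);

        Map<String, Object> headers = new LinkedHashMap<>();
        headers.put("rollout-plan", plan);
        return headers;
    }
}
```

With maxFailedServers set to 1, a single out-of-sync server failing the change should no longer trigger a rollback for the rest of the group.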

The linked PR includes tests of 2)b). It worked as expected.

An alternative to the current behavior is to no longer send config changes to servers that
are out of sync and basically force the user to restart them. That can introduce its
own problems though, and requires a lot of care. I'm not sure it's a good idea.
See WFCORE-368.

3) There was a bug that resulted in "server-groups" data not getting reported to
the client when this failure occurs, contrary to what's specified in
https://docs.jboss.org/author/display/WFLY9/Admin+Guide#AdminGuide-MultiS....
The data was received properly on the master HC, but it misinterpreted what occurred as a
failure applying the change to the slave HCs and assumed no server results were available,
so it did not include them. The linked PR fixes this.

4) The linked PR also changes how the top level "failure-description" for the
response is generated, attempting to extract useful data from the server results and
including it in the top level failure-description. This will help improve the error
reporting from CLI high level commands and similar clients, where the full response may
not be displayed to the user, only the failure-description.
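
As an illustration of the idea in 4), and not the code in the linked PR, here is a minimal sketch that walks the per-server results and folds their failure descriptions into a single top-level message. The response is modelled as nested maps, the class name FailureSummarySketch is hypothetical, and the assumed layout is server-groups, then group, then "host", then host name, then server name, then "response". Check that layout against the Admin Guide section referenced in 3).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the real response is a DMR ModelNode, not a Map.
class FailureSummarySketch {

    /** Returns the top-level failure-description, extended with any per-server failures found. */
    static String summarize(Map<String, Object> response) {
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, Object> group : asMap(response.get("server-groups")).entrySet()) {
            Map<String, Object> hosts = asMap(asMap(group.getValue()).get("host"));
            for (Map.Entry<String, Object> host : hosts.entrySet()) {
                for (Map.Entry<String, Object> server : asMap(host.getValue()).entrySet()) {
                    Map<String, Object> result = asMap(asMap(server.getValue()).get("response"));
                    Object failure = result.get("failure-description");
                    if (failure != null) {
                        parts.add(group.getKey() + "/" + host.getKey() + "/"
                                + server.getKey() + ": " + failure);
                    }
                }
            }
        }
        String top = String.valueOf(response.get("failure-description"));
        // Without server details, fall back to the generic top-level message.
        return parts.isEmpty() ? top : top + " Server failures: " + String.join("; ", parts);
    }

    /** Treats anything that is not a map as an empty map, so missing levels are skipped. */
    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object value) {
        if (value instanceof Map) {
            return (Map<String, Object>) value;
        }
        return Collections.<String, Object>emptyMap();
    }
}
```

A client that only prints the top-level failure-description would then still show which server failed and why.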

5) If I learn of any possibility for another 8.x release, I'll be happy to backport
any commits on this to the 8.x branch.

...
 Failures of undeploy of a partially failed deployment
 -----------------------------------------------------

                 Key: WFCORE-488
                 URL: https://issues.jboss.org/browse/WFCORE-488
             Project: WildFly Core
          Issue Type: Bug
          Components: CLI, Domain Management
            Reporter: Arcadiy Ivanov
            Assignee: Alexey Loubyansky

 {noformat}
 [domain@localhost:9990 /] undeploy 2e3b75f2-88ff-4eb9-a8b1-9107c452309e.ear --all-relevant-server-groups
 Undeploy failed: JBAS010839: Operation failed or was rolled back on all servers.
 {noformat}
 There are several issues at play here:
 # Insufficient information is displayed along with JBAS010839
 # No actual failure is logged, either in the DC's log or in the individual hosts' logs
 # Failure to find a deployment unit on a particular host should not result in a failure
of a domain-wide undeploy in DeploymentUndeployHandler
 I'll start in reverse order of causation:
 In DeploymentUndeployHandler.execute
 {noformat}
     public void execute(OperationContext context, ModelNode operation) throws OperationFailedException {
         ModelNode model = context.readResourceForUpdate(PathAddress.EMPTY_ADDRESS).getModel();
 ...
 {noformat}
 OperationContextImpl.readResourceForUpdate calls OperationContextImpl.requireChild:
 {noformat}
     private static Resource requireChild(final Resource resource, final PathElement childPath, final PathAddress fullAddress) {
         if (resource.hasChild(childPath)) {
             return resource.requireChild(childPath);
         } else {
             PathAddress missing = PathAddress.EMPTY_ADDRESS;
             for (PathElement search : fullAddress) {
                 missing = missing.append(search);
                 if (search.equals(childPath)) {
                     break;
                 }
             }
             throw ControllerMessages.MESSAGES.managementResourceNotFound(missing);
         }
     }
 {noformat}
 The exception generated by {{throw ControllerMessages.MESSAGES.managementResourceNotFound(missing);}}:
 # does not propagate from the host to DC and to CLI
 # is not logged with any level (not even TRACE) either on host or DC
 # should be handled within DeploymentUndeployHandler so that the undeploy operation does
not fail in case of a partially failed deployment (e.g. EAR with unsatisfied CDI dependencies).
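
The last two points could look roughly like the following sketch. It is not WildFly code: TolerantUndeploySketch and its methods are hypothetical stand-ins for a host's deployment resources, and NoSuchElementException stands in for the "management resource not found" exception. It only shows the proposed behavior, where a missing deployment is logged and treated as already undeployed instead of failing the operation.

```java
import java.util.HashSet;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for one host's view of its deployments.
class TolerantUndeploySketch {
    private static final Logger LOG = Logger.getLogger(TolerantUndeploySketch.class.getName());

    private final Set<String> deployments = new HashSet<>();

    void deploy(String name) {
        deployments.add(name);
    }

    /** Strict lookup, similar in spirit to requireChild: throws if the resource is absent. */
    private void requireDeployment(String name) {
        if (!deployments.contains(name)) {
            throw new NoSuchElementException("Management resource not found: deployment=" + name);
        }
    }

    /** Returns true if the deployment was removed, false if there was nothing to remove. */
    boolean undeploy(String name) {
        try {
            requireDeployment(name);
        } catch (NoSuchElementException e) {
            // Log the failure so it can be diagnosed, but do not fail the undeploy.
            LOG.log(Level.FINE, "Deployment " + name + " not present, nothing to undeploy", e);
            return false;
        }
        deployments.remove(name);
        return true;
    }
}
```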

--
This message was sent by Atlassian JIRA
(v6.3.11#6341)
