Re: [wildfly-dev] Error reporting on deployment failure

Wednesday, 14 February 2018

On Tue, Feb 13, 2018 at 8:24 PM, Stuart Douglas <stuart.w.douglas(a)gmail.com>
wrote:

...
 Hi Everyone,

 I have been thinking a bit about the way we report errors in WildFly, and
 I think this is something that we can improve on. At the moment I think we
 are way too liberal with what we report, which results in a ton of services
 being listed in the error report that have nothing to do with the actual
 failure.

 As an example to work from I have created [1], which is a simple EJB
 application. This consists of 10 EJBs, one of which has a reference to a
 non-existent data source, the rest are simply empty no-op EJBs (just
 @Stateless on an empty class).

 This app fails to deploy because the java:global/NonExistant data source
 is missing, which gives the failure description in [2]. This is ~120 lines
 long and lists multiple services for every single component in the
 application (part of the reason this is so long is because the failures are
 reported twice, once when the deployment fails and once when the server
 starts).

 I think we can improve on this. I think in every failure case there will
 be some root causes that are all the end user cares about, and we should
 limit our reporting to just these cases, rather than listing every internal
 service that can no longer start due to missing transitive deps.

 In particular these root causes are:
 1) A service threw an exception in its start() method and failed to start
 2) A dependency is actually missing (i.e. not installed, not just not
 started)

 I think that one or both of these two cases will be the root cause of any
 failure, and as such that is all we should be reporting on.

 We already do an OK job of handling case 1), services that have failed, as
 they get their own line item in the error report, however case 2) results
 in a huge report that lists every service that has not come up, no matter
 how far removed they are from the actual problem.

If the 2) case can be correctly determined, then +1 to reporting some new
section and not reporting the current "WFLYCTL0180: Services with
missing/unavailable dependencies" section. The WFLYCTL0180 section could
only be reported as a fallback if for some reason the 1) and 2) stuff is
empty.
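
To make the proposal concrete, here is a minimal sketch of the selection
logic discussed above: report services that failed in start() and
dependencies that were never installed, and fall back to the full list of
unavailable services only when both of those sets are empty. The Service and
State types, the service names, and the message strings are hypothetical
stand-ins for illustration; this is not the JBoss MSC API or the actual
WildFly report format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model for illustration only; these are not JBoss MSC types.
final class RootCauseReport {

    enum State { UP, FAILED, WAITING }

    /** An installed service: its current state and the names it depends on. */
    static final class Service {
        final State state;
        final List<String> dependencies;

        Service(State state, List<String> dependencies) {
            this.state = state;
            this.dependencies = dependencies;
        }
    }

    /** Builds report lines from a map of installed services keyed by name. */
    static List<String> report(Map<String, Service> installed) {
        Set<String> failed = new TreeSet<>();       // case 1: start() threw
        Set<String> notInstalled = new TreeSet<>(); // case 2: dependency never installed
        Set<String> notUp = new TreeSet<>();        // every service that did not start

        for (Map.Entry<String, Service> entry : installed.entrySet()) {
            Service service = entry.getValue();
            if (service.state == State.FAILED) {
                failed.add(entry.getKey());
            }
            if (service.state != State.UP) {
                notUp.add(entry.getKey());
            }
            for (String dependency : service.dependencies) {
                // "Missing" here means absent from the registry, not merely down.
                if (!installed.containsKey(dependency)) {
                    notInstalled.add(dependency);
                }
            }
        }

        List<String> lines = new ArrayList<>();
        for (String name : failed) {
            lines.add("Failed to start: " + name);
        }
        for (String name : notInstalled) {
            lines.add("Not installed: " + name);
        }
        if (lines.isEmpty()) {
            // Fallback: no root cause identified, so list everything that is not up.
            for (String name : notUp) {
                lines.add("Unavailable: " + name);
            }
        }
        return lines;
    }
}
```

In this model, a deployment where one component depends on a data source
that was never installed produces a single "Not installed" line, rather than
one entry for every service that is transitively blocked behind it.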

...

 I think we could make a change to the way this is reported so that only
 direct problems are reported [3], so the error report would look something
 like [4] (note that this commit only changes the operation report, the
 container state reporting after boot is still quite verbose).

I think the container state reporting is ok. IMHO the proper fix to the
container state reporting is to roll back and fail boot if Stage.RUNTIME
failures occur. Configurable, but rollback by default. If we did that there
would be no container state reporting. If you deploy your broken app
post-boot you shouldn't see the container state reporting because by the
time the report is written the op should have rolled back and the services
are no longer "missing". It's only because we don't roll back on boot that
this is reported.

...

 I am guessing that this is not as simple as it sounds, otherwise it would
 have already been addressed, but I think we can do better than the current
 state of affairs so I thought I would get a discussion started.

It sounds pretty simple. Any "problem" ServiceController exposes its
ServiceContainer, and if relying on that registry to check if a missing
dependency is installed is not correct for some reason, the
ModelControllerImpl exposes its ServiceRegistry via a package protected
getter. So AbstractOperationContext can provide that to the SVH.

...
 Stuart

 [1] https://github.com/stuartwdouglas/errorreporting
 [2] https://gist.github.com/stuartwdouglas/b52a85813913f3304301eeb1f389fae8
 [3] https://github.com/stuartwdouglas/wildfly-core/commit/a1fbc831edf290971d54c13dd1c5d15719454f85
 [4] https://gist.github.com/stuartwdouglas/14040534da8d07f937d02f2f08099e8d

 _______________________________________________
 wildfly-dev mailing list
 wildfly-dev(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/wildfly-dev

-- 
Brian Stansberry
Manager, Senior Principal Software Engineer
Red Hat
