<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Yes, we have had some BZs/JIRAs opened about this before. We get cases
where a customer's application is failing with pages upon pages of
dependency errors, and the customer cannot easily determine the
issue. Even support has difficulty: we usually try searching
for common things like datasources or other JNDI references that
might be missing, but I have seen several cases where it was not a
datasource and it took a while of tearing the apps apart to resolve.
It looks like there was some improvement in EAP 7.1 [2], but it
sounds like Stuart's PR may be even better.<br>
<br>
I found one example deployment on [1] that we could try and see what
the logging looks like with the new PR.<br>
I figure the service dump would show all of the failed dependencies
in case there was a need to look at the others?<br>
<br>
[1] <a class="moz-txt-link-freetext" href="https://bugzilla.redhat.com/show_bug.cgi?id=1283294">https://bugzilla.redhat.com/show_bug.cgi?id=1283294</a><br>
[2] <a class="moz-txt-link-freetext" href="https://issues.jboss.org/browse/JBEAP-5311">https://issues.jboss.org/browse/JBEAP-5311</a><br>
<br>
<div class="moz-cite-prefix">On 2/15/18 5:15 PM, Stuart Douglas
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAAoo=c6zt2wX22tKHCLsEE+S3StMXWvWXLSp8o5PhpGY9_TvOQ@mail.gmail.com">
<div dir="ltr">I have opened <a
href="https://github.com/wildfly/wildfly-core/pull/3114"
moz-do-not-send="true">https://github.com/wildfly/wildfly-core/pull/3114</a>
to allow for testing/further review.
<div><br>
</div>
<div>Stuart</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Feb 15, 2018 at 11:32 PM,
Stuart Douglas <span dir="ltr"><<a
href="mailto:stuart.w.douglas@gmail.com" target="_blank"
moz-do-not-send="true">stuart.w.douglas@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">
<div>
<div class="h5">On Thu, Feb 15, 2018 at 6:51 PM,
Brian Stansberry <span dir="ltr"><<a
href="mailto:brian.stansberry@redhat.com"
target="_blank" moz-do-not-send="true">brian.stansberry@redhat.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px
0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote"><span
class="m_-8471000774253265312gmail-">On
Wed, Feb 14, 2018 at 9:37 PM, Stuart
Douglas <span dir="ltr"><<a
href="mailto:stuart.w.douglas@gmail.com"
target="_blank"
moz-do-not-send="true">stuart.w.douglas@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote"><span>On
Wed, Feb 14, 2018 at 4:43 PM,
Brian Stansberry <span
dir="ltr"><<a
href="mailto:brian.stansberry@redhat.com"
target="_blank"
moz-do-not-send="true">brian.stansberry@redhat.com</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote"><span
class="m_-8471000774253265312gmail-m_1451192004882818368m_273302915848364358gmail-">On
Tue, Feb 13, 2018 at
8:24 PM, Stuart
Douglas <span
dir="ltr"><<a
href="mailto:stuart.w.douglas@gmail.com"
target="_blank"
moz-do-not-send="true">stuart.w.douglas@gmail.com</a>></span> wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr">Hi
Everyone,
<div><br>
</div>
<div>I have been
thinking a bit
about the way
we report
errors in
WildFly, and I
think this is
something that
we can improve
on. At the
moment I think
we are way too
liberal with
what we
report, which
results in a
ton of
services being
listed in the
error report
that have
nothing to do
with the
actual
failure.</div>
<div><br>
</div>
<div>As an
example to
work from I
have created
[1], which is
a simple EJB
application.
This consists
of 10 EJBs,
one of which
has a
reference to a
non-existent
data source;
the rest are
simply empty
no-op EJBs
(just
@Stateless on
an empty
class).</div>
<div><br>
</div>
<div>This app
fails to
deploy because
the
java:global/NonExistant
data source is
missing, which
gives the
failure
description in
[2]. This is
~120 lines
long and lists
multiple
services for
every single
component in
the
application
(part of the
reason this is
so long is
because the
failures are
reported
twice, once
when the
deployment
fails and once
when the
server
starts).</div>
<div><br>
</div>
<div>I think we
can improve on
this. I think
in every
failure case
there will be
some root
causes that
are all the
end user cares
about, and we
should limit
our reporting
to just these
cases, rather
than listing
every internal
service that
can no longer
start due to
missing
transitive
deps.</div>
<div><br>
</div>
<div>In
particular
these root
causes are:</div>
<div>1) A
service threw
an exception
in its start()
method and
failed to
start</div>
<div>2) A
dependency is
actually
missing (i.e.
not installed,
not just not
started)</div>
<div><br>
</div>
<div>I think
that one or
both of these
two cases will
be the root
cause of any
failure, and
as such that
is all we
should be
reporting on.</div>
<div><br>
</div>
<div>We already
do an OK job
of handling
case 1),
services that
have failed,
as they get
their own line
item in the
error report,
however case
2) results in
a huge report
that lists
every service
that has not
come up, no
matter how far
removed they
are from the
actual
problem.</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>If the 2) case
can be correctly
determined, then +1
to reporting some
new section and not
reporting the
current "WFLYCTL0180:
Services with
missing/unavailable
dependencies"
section. The
WFLYCTL0180 section
could only be
reported as a
fallback if for some
reason the 1) and 2)
stuff is empty.</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>I have adjusted this a bit
so a service with mode NEVER
is treated the same as if it
is missing. I am pretty sure
that with this change 1) and
2) will cover 100% of cases.</div>
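<div><br>
</div>
<div>To make the classification concrete, here is a minimal sketch.
The Status enum and the isRootCause helper are hypothetical names
made up for illustration; this is not the MSC API and not the code
in the branch. It only encodes the rule that a failed start, a
missing service, or a service with mode NEVER counts as a root
cause, while a service that is merely waiting on something else
does not.</div>
<pre wrap="">
```java
class RootCauseCheck {
    // Hypothetical, simplified service states for illustration only.
    enum Status { UP, START_FAILED, NOT_INSTALLED, MODE_NEVER, WAITING_ON_DEPENDENCY }

    // Returns true when the status is itself a root cause of a failure.
    static boolean isRootCause(Status status) {
        switch (status) {
            case START_FAILED:   // case 1: start() threw an exception
            case NOT_INSTALLED:  // case 2: the dependency is actually missing
            case MODE_NEVER:     // treated the same as a missing service
                return true;
            default:             // up, or only down because of something else
                return false;
        }
    }

    public static void main(String[] args) {
        for (Status status : Status.values()) {
            System.out.println(status + " root cause: " + isRootCause(status));
        }
    }
}
```
</pre>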
<span>
<div><br>
</div>
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote"><span
class="m_-8471000774253265312gmail-m_1451192004882818368m_273302915848364358gmail-">
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div><br>
</div>
<div>I think we
could make a
change to the
way this is
reported so
that only
direct
problems are
reported [3],
so the error
report would
look something
like [4] (note
that this
commit only
changes the
operation
report, the
container
state
reporting
after boot is
still quite
verbose).</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>I think the
container state
reporting is ok.
IMHO the proper fix
to the container
state reporting is
to rollback and fail
boot if
Stage.RUNTIME
failures occur.
Configurable, but
rollback by default.
If we did that there
would be no
container state
reporting. If you
deploy your broken
app post-boot you
shouldn't see the
container state
reporting because by
the time the report
is written the op
should have rolled
back and the
services are no
longer "missing".
It's only because we
don't rollback on
boot that this is
reported.</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>I don't think it is
necessary to report on
services that are only down
because their dependencies are
down. It basically just adds
noise, as they are not really
related to the underlying
issue. I have expanded my
branch to also do this:</div>
<div><br>
</div>
<div><a
href="https://github.com/wildfly/wildfly-core/compare/master...stuartwdouglas:error-reporting?expand=1"
target="_blank"
moz-do-not-send="true">https://github.com/wildfly/wil<wbr>dfly-core/compare/master...stu<wbr>artwdouglas:error-reporting?ex<wbr>pand=1</a><br>
</div>
<div> </div>
<div>This ends up with very
concise reports that just
detail the services that are
the root cause of the
problem: <a
href="https://gist.github.com/stuartwdouglas/42a68aaaa130ceee38ca5f66d0040de3"
target="_blank"
moz-do-not-send="true">https://gist.github.c<wbr>om/stuartwdouglas/42a68aaaa130<wbr>ceee38ca5f66d0040de3</a></div>
<div><br>
</div>
<div>Does this approach seem
reasonable? If a user really
does want a complete dump of
all services that are down
that information is still
available directly from MSC
anyway.</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>It seems reasonable.</div>
<div><br>
</div>
<div>I'm going to get all lawyerly now.
This is because while we don't treat our
failure messages as "API" requiring
compatibility, for these particular ones
I think we should be as careful as
possible.</div>
<div><br>
</div>
<div>1) "WFLYCTL0180: Services with
missing/unavailable dependencies" =>
["jboss.naming.context.java.comp.\"error-reporting-1.0-SNAPSHOT\".\"error-reporting-1.0-SNAPSHOT\".ErrorEjb.env.\"com.stuartdouglas.ErrorEjb\".nonExistant
is missing
[jboss.naming.context.java.global.NonExistant]"]</div>
<div><br>
</div>
<div>Here you've somewhat repurposed an
existing message. That can be quite ok
IMHO so long as what's gone is just
noise and the English meaning of the
message is still correct. Basically,
what did "missing/unavailable
dependencies" mean before, what does it
mean now, and is there a clear story
behind the shift from A to B. The
"missing" part is pretty clear -- not
installed or NEVER is "missing". For
"unavailable" now we've dropped the
installed but unstarted ones. If we're
including the ones that directly depend
on *failed* services then that's a
coherent definition of "unavailable". If
we're not then "unavailable" is
misleading. Sorry, I'm juggling stuff so
I haven't checked the code. :(</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
</div>
</div>
<div>Previously this section would display every
service that was down due to its dependencies being
down. This would include services that were many
levels away from the actual problem (e.g. if A
depends on B which depends on C which depends on D
which is down, A, B and C would all be listed in
this section). This change displays the same
information, but only for direct dependents, so in
the example above only C would be listed in this
section.</div>
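<div><br>
</div>
<div>The A/B/C/D example can be sketched roughly as follows. This
is only an illustration under simplifying assumptions: services are
plain strings, each edge is a {dependent, dependency} pair, and the
class and method names are hypothetical. It is not the code in the
PR and does not use the MSC API.</div>
<pre wrap="">
```java
import java.util.Arrays;
import java.util.Set;

class RootCauseReport {
    // Each edge is {dependent, dependency}: the first service requires the second.
    // Returns the services that depend directly on one of the problem services.
    static String[] directlyAffected(String[][] edges, String[] problems) {
        var problemSet = Set.of(problems);
        return Arrays.stream(edges)
                .filter(edge -> problemSet.contains(edge[1]))  // direct dependency is a problem
                .map(edge -> edge[0])
                .filter(name -> !problemSet.contains(name))    // problems are reported in their own right
                .distinct()
                .sorted()
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // A depends on B, B depends on C, C depends on D, and D is down.
        String[][] edges = {{"A", "B"}, {"B", "C"}, {"C", "D"}};
        String[] problems = {"D"};
        // Prints "C": A and B are only transitively affected, so they are omitted.
        System.out.println(String.join(", ", directlyAffected(edges, problems)));
    }
}
```
</pre>
<div>The old behaviour corresponds to walking the edges transitively,
which is what pulls A and B into the report.</div>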
<div><br>
</div>
<div>The 'New missing/unsatisfied dependencies:'
section in the container state report is similar.
Previously it would list every service that had
failed to come up, now it will only list services
that are directly affected by a problem.</div>
<span class="">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div>2) I think "38 additional services are
down due to their dependencies being
missing or failed" should have a message
code, not NONE. It's a separate message
that may or may not appear. Plus it's new.
And I think we're better off in these
complex message structures to be precise
vs trying to avoid codes for cosmetic
reasons.</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>Ok.</div>
<span class="HOEnZb"><font color="#888888">
<div><br>
</div>
<div>Stuart</div>
</font></span><span class="">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote"><span
class="m_-8471000774253265312gmail-">
<div><br>
</div>
<div><br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote"><span
class="m_-8471000774253265312gmail-m_1451192004882818368HOEnZb"><font
color="#888888">
<div><br>
</div>
<div>Stuart</div>
</font></span><span>
<div><br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote"><span
class="m_-8471000774253265312gmail-m_1451192004882818368m_273302915848364358gmail-">
<div> </div>
<blockquote
class="gmail_quote"
style="margin:0px
0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div><br>
</div>
<div>I am guessing
that this is not
as simple as it
sounds,
otherwise it
would have
already been
addressed, but I
think we can do
better than the
current state of
affairs so I
thought I would
get a discussion
started.</div>
</div>
</blockquote>
<div><br>
</div>
</span>
<div>It sounds pretty
simple. Any "problem"
ServiceController
exposes its
ServiceContainer, and
if relying on that
registry to check if a
missing dependency is
installed is not
correct for some
reason, the
ModelControllerImpl
exposes its
ServiceRegistry via a
package protected
getter. So
AbstractOperationContext
can provide that to
the SVH.</div>
<div><br>
</div>
<blockquote
class="gmail_quote"
style="margin:0px 0px
0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><span
class="m_-8471000774253265312gmail-m_1451192004882818368m_273302915848364358gmail-">
<div dir="ltr">
<div><br>
</div>
<div>Stuart</div>
<div><br>
</div>
<div>[1] <a
href="https://github.com/stuartwdouglas/errorreporting"
target="_blank" moz-do-not-send="true">https://github.com/stuartwdoug<wbr>las/errorreporting</a></div>
<div>[2] <a
href="https://gist.github.com/stuartwdouglas/b52a85813913f3304301eeb1f389fae8"
target="_blank" moz-do-not-send="true">https://gist.github.com/stuart<wbr>wdouglas/b52a85813913f3304301e<wbr>eb1f389fae8</a> </div>
<div>[3] <a
href="https://github.com/stuartwdouglas/wildfly-core/commit/a1fbc831edf290971d54c13dd1c5d15719454f85"
target="_blank" moz-do-not-send="true">https://github.com/stuartw<wbr>douglas/wildfly-core/commit/a1<wbr>fbc831edf290971d54c13dd1c5d157<wbr>19454f85</a></div>
<div>[4] <a
href="https://gist.github.com/stuartwdouglas/14040534da8d07f937d02f2f08099e8d"
target="_blank" moz-do-not-send="true">https://gist.github.com/st<wbr>uartwdouglas/14040534da8d07f93<wbr>7d02f2f08099e8d</a></div>
</div>
<br>
</span>______________________________<wbr>_________________<br>
wildfly-dev mailing
list<br>
<a
href="mailto:wildfly-dev@lists.jboss.org"
target="_blank"
moz-do-not-send="true">wildfly-dev@lists.jboss.org</a><br>
<a
href="https://lists.jboss.org/mailman/listinfo/wildfly-dev"
rel="noreferrer"
target="_blank"
moz-do-not-send="true">https://lists.jboss.org/mailma<wbr>n/listinfo/wildfly-dev</a><span
class="m_-8471000774253265312gmail-m_1451192004882818368m_273302915848364358gmail-HOEnZb"><font
color="#888888"><br>
</font></span></blockquote>
</div>
<span
class="m_-8471000774253265312gmail-m_1451192004882818368m_273302915848364358gmail-HOEnZb"><font
color="#888888"><br>
<br clear="all">
<div><br>
</div>
-- <br>
<div
class="m_-8471000774253265312gmail-m_1451192004882818368m_273302915848364358gmail-m_7631090312607809384gmail_signature">
<div dir="ltr">Brian
Stansberry
<div>Manager,
Senior Principal
Software
Engineer</div>
<div>Red Hat</div>
</div>
</div>
</font></span></div>
</div>
</blockquote>
</span></div>
<br>
</div>
</div>
</blockquote>
</span></div>
<span class="m_-8471000774253265312gmail-"><br>
<br clear="all">
<div><br>
</div>
-- <br>
<div
class="m_-8471000774253265312gmail-m_1451192004882818368gmail_signature">
<div dir="ltr">Brian Stansberry
<div>Manager, Senior Principal Software
Engineer</div>
<div>Red Hat</div>
</div>
</div>
</span></div>
</div>
</blockquote>
</span></div>
<br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</body>
</html>