[jboss-jira] [JBoss JIRA] (WFLY-12952) MP Health returns UP when checks are expected but not installed yet.

Jeff Mesnil (Jira) issues at jboss.org
Fri Jan 10 04:51:51 EST 2020


    [ https://issues.redhat.com/browse/WFLY-12952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945530#comment-13945530 ] 

Jeff Mesnil commented on WFLY-12952:
------------------------------------

[~istraka] The culprit of this issue is the notion of "expected" health check procedures.

WildFly allows deployment of Jakarta EE applications without any a priori knowledge. The application server does not know when an application will be deployed or whether that application contains health check procedures.

That's why the default for the readiness endpoint is to return UP. It means that the app server does not expect any health check procedures from the deployments.
We also want to handle the case where the user knows that there will be health check procedures in the deployment.
In that case, WildFly needs a hint: the user must set the env var MP_HEALTH_EMPTY_READINESS_CHECKS_STATUS=DOWN so that WildFly returns DOWN until application readiness probes are installed and can be taken into account to determine the overall readiness of the app server + application.
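
As a rough illustration of the decision described above, here is a minimal shell model. It is only a sketch, not WildFly's actual implementation: the function name overall_readiness and the idea of passing check statuses as arguments are made up for this example. Only the env var name and its UP default come from the behavior described in this comment.

```shell
# Hypothetical model of the readiness decision; this is not WildFly code.
# Usage: overall_readiness [CHECK_STATUS...]
overall_readiness() {
  # No readiness checks installed: fall back to the configured status,
  # which is assumed to default to UP when the env var is not set.
  if [ "$#" -eq 0 ]; then
    echo "${MP_HEALTH_EMPTY_READINESS_CHECKS_STATUS:-UP}"
    return
  fi
  # At least one check is installed: any DOWN check makes the overall status DOWN.
  for status in "$@"; do
    if [ "$status" = "DOWN" ]; then
      echo "DOWN"
      return
    fi
  done
  echo "UP"
}
```

With the variable unset, calling overall_readiness with no arguments prints UP; with MP_HEALTH_EMPTY_READINESS_CHECKS_STATUS=DOWN it prints DOWN until at least one check status is passed in, which mirrors the "expected but not yet installed" case.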

My understanding is that we respect the spec. We need a hint (or configuration) from the user but this is not against the spec.


> MP Health returns UP when checks are expected but not installed yet.
> --------------------------------------------------------------------
>
>                 Key: WFLY-12952
>                 URL: https://issues.redhat.com/browse/WFLY-12952
>             Project: WildFly
>          Issue Type: Bug
>          Components: MP Health
>    Affects Versions: 18.0.0.Final
>            Reporter: Ivan Straka
>            Assignee: Jeff Mesnil
>            Priority: Blocker
>
> MicroProfile Health 2.0 specification [link|https://github.com/eclipse/microprofile-health/blob/2.0/spec/src/main/asciidoc/protocol-wireformat.adoc] says:
> * A producer MUST support custom, application level health check procedures
> * A producer SHOULD support reasonable out-of-the-box procedures
> * A producer with no health check procedures expected or installed MUST return positive overall status (i.e. HTTP 200)
> * A producer with health check procedures expected but not yet installed MUST return negative overall status (i.e. HTTP 503)
> When I deploy an application with a readiness probe before WildFly is started, from my and namely OpenShift's POV the health check procedure is expected from the very beginning.
> _Let me note that on OpenShift starting the server should mean starting the service._
> Hence I expect a negative overall status until the probe is ready and able to provide a response.
> However, WildFly with the default setting responds with status UP:
> {code:bash}
> while true; do echo $(date +"%T.%3N") ;  curl   localhost:9990/health/ready; echo ""; done
> 17:17:56.438 curl: (7) Failed to connect to localhost port 9990: Connection refused
> 17:17:56.452 curl: (7) Failed to connect to localhost port 9990: Connection refused
> 17:17:56.466 {"status":"UP","checks":[]}
> ...
> 17:18:01.121 {"status":"UP","checks":[]}
> 17:18:01.133 {"status":"DOWN","checks":[{"name":"delayed-readiness","status":"DOWN"}]}
> {code}
> This violates bullet (4) in the specification.
> WildFly provides an option to set a global status when probes are not defined ([documentation|https://doc-stage.usersys.redhat.com/documentation/en-us/red_hat_jboss_enterprise_application_platform/7.3/html-single/configuration_guide/index#global_statuses_undefined_probes]),
> which would make the scenario behave according to the specification. Yet the default state violates it.
> If the default value were _DOWN_ we would run into an issue if WildFly were used without a deployment (for example as backup for AMQ). The status would be just DOWN:
> {noformat}
> 17:22:41.719
> {"status":"DOWN","checks":[]}
> {noformat}
> And that would violate (3) in the specification.
> TCK tests do not cover the scenario well.
> Is there a way to return _DOWN_ until WildFly scans a deployment and, if no health check is found (thus not expected), then start to return _UP_? The scan should happen during MP Health initialization.
> If there is no deployment, then _UP_ it is.
> Setting the priority to blocker since WildFly 19 shall be EAP 7.3.0.CD19, which is supposed to run on OpenShift. With this behavior the health check is not very useful because:
> * OpenShift starts the service, waits some time, and starts asking for the health status
> * WildFly responds _UP_, yet the application health check is not installed yet
> * With health status _UP_, OpenShift considers the pod ready
> * At this point the application is installed and the health status is _DOWN_ (a DB is down); however, OpenShift has already moved on



--
This message was sent by Atlassian Jira
(v7.13.8#713008)

