Re: [wildfly-dev] A look at Eclipse MicroProfile Healthcheck

Thursday, 6 July 2017

...
 Hi,

 I had a look at the Eclipse MicroProfile Healthcheck spec[1] and wanted to
 share some thoughts and experiments about it, how it relates to WildFly and
 its use in containers (such as OpenShift).

 # Eclipse MicroProfile Healthcheck

 The Eclipse MicroProfile Healthcheck (MPHC for short) is a specification to
 determine the healthiness of an application.
 It defines a Health Check Procedure (HCP for short) interface that can be
 implemented by an application to determine its healthiness. It’s a single
 method that returns a Health Status: either UP or DOWN (+ some metadata).
 Typically, an application would provide one or more HCPs to check the
 healthiness of its parts.
 The overall healthiness of the application is determined by aggregating
 all the HCPs provided by the application. If any HCP is DOWN, the overall
 outcome is DOWN. Otherwise the application is considered UP.
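 As a rough illustration of that aggregation rule, here is a minimal sketch.
 The names Outcome and HealthAggregator are hypothetical stand-ins, not the
 types defined by the spec, and each procedure is modelled as a plain Supplier:

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for the spec's health status; illustrative only. */
enum Outcome { UP, DOWN }

final class HealthAggregator {
    /** Overall outcome is DOWN if any procedure reports DOWN, otherwise UP. */
    static Outcome aggregate(List<Supplier<Outcome>> procedures) {
        Outcome overall = Outcome.UP;
        // Every procedure is invoked (no short-circuit) so that a real
        // implementation could also report each individual result.
        for (Supplier<Outcome> procedure : procedures) {
            if (procedure.get() == Outcome.DOWN) {
                overall = Outcome.DOWN;
            }
        }
        return overall;
    }
}
```

 Note that with this sketch an application without any procedure is reported
 as UP; whether that is the right default is a design choice.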

 The MPHC spec has a companion document[2] that specifies an HTTP format to
 check the healthiness of an application.

 Heiko is leading the spec and Swarm is the sample implementation for it
 (MicroProfile does not have the notion of reference implementation).
 The spec is still in flux and we have a good opportunity to contribute to it
 to ensure that it meets our requirements and use cases.

 # Use case

 Using the HTTP endpoint, a container can ask an application whether it is
 healthy. If it is not healthy, the container could stop the application and
 spin up a new instance.
 For example, OpenShift/Kubernetes can configure liveness probes[3][4].

 Supporting MPHC in WildFly would allow a better integration with containers
 and ensure that any unhealthy WildFly process is restarted promptly.

 # Prototype

 I’ve written a prototype of a WildFly extension to support MPHC for
 applications deployed in WildFly *and* add health check procedures inside
 WildFly:

 https://github.com/jmesnil/wildfly-microprofile-health

 and it passes the MPHC TCK :)

 The microprofile-health subsystem supports an operation to check the health
 of the app server:

 [standalone@localhost:9990 /] /subsystem=microprofile-health:check
 {
     "outcome" => "success",
     "result" => {
         "checks" => [{
             "id" => "heap-memory",
             "result" => "UP",
             "data" => {
                 "max" => "477626368",
                 "used" => "156216336"
             }
         }],
         "outcome" => "UP"
     }
 }

 It also exposes an (unauthenticated) HTTP endpoint:

 $ curl http://localhost:8080/health/
 {
    "checks":[
       {
          "id":"heap-memory",
          "result":"UP",
          "data":{
             "max":"477626368",
             "used":"160137128"
          }
       }
    ],
    "outcome":"UP"
 }

 This HTTP endpoint can be used by OpenShift for its liveness probe. 
Regarding the probes, three states would be best, if you can swing it, as OpenShift
defines two probe types: liveness and readiness. Live means the process is running
but may not be able to handle requests, while ready means it's running and able to
handle requests. For example, while the server is initializing, it's alive but not
ready. Something to think about.
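To make the idea concrete, here is a minimal sketch of such a three-state model.
The ServerState enum and its constants are hypothetical; they are not part of the
MPHC spec or of the prototype, and how each state maps to a probe is an assumption:

```java
/** Hypothetical three-state model; not part of the MPHC spec or the prototype. */
enum ServerState {
    STARTING(true, false),  // process is running but still initializing
    READY(true, true),      // process is running and able to handle requests
    FAILED(false, false);   // process is unhealthy and should be restarted

    private final boolean live;
    private final boolean ready;

    ServerState(boolean live, boolean ready) {
        this.live = live;
        this.ready = ready;
    }

    /** Answer for a liveness probe: false means the process should be restarted. */
    boolean isLive() {
        return live;
    }

    /** Answer for a readiness probe: false means no traffic should be routed yet. */
    boolean isReady() {
        return ready;
    }
}
```

With only UP and DOWN there is no way to express STARTING, which is why a third
state (or two separate endpoints) would be needed to serve both probe types.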

...

 Any deployment that defines Health Check Procedures will have them registered
 to determine the overall healthiness of the process.

 # WildFly health check procedures

 The MPHC specification mainly targets user applications that can apply
 application logic to determine their healthiness.
 However I wonder if we could reuse the concepts *inside* WildFly. There are
 things that we could check to determine if the App server runtime is
 healthy, e.g.:
 * Used heap memory is close to the max
 * Some deployments have failed
 * Excessive GC
 * Running out of disk space
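 For instance, the disk space item could be checked along these lines. This
 is only a sketch: the DiskSpaceCheck class, its method names, and the idea of
 a minimum free ratio are assumptions made for illustration, not part of the
 prototype. It relies on java.io.File.getUsableSpace() and getTotalSpace():

```java
import java.io.File;

/** Hypothetical helper; the name and the threshold semantics are assumptions. */
final class DiskSpaceCheck {
    /** Returns true when the usable fraction of the partition is at least minFreeRatio. */
    static boolean hasEnoughSpace(long usableBytes, long totalBytes, double minFreeRatio) {
        if (totalBytes <= 0) {
            // File.getTotalSpace() returns 0 when the path does not name a
            // partition; treat that as unhealthy rather than dividing by zero.
            return false;
        }
        return (double) usableBytes / totalBytes >= minFreeRatio;
    }

    /** Checks the partition that holds the given directory. */
    static boolean check(File dir, double minFreeRatio) {
        return hasEnoughSpace(dir.getUsableSpace(), dir.getTotalSpace(), minFreeRatio);
    }
}
```

 A call such as DiskSpaceCheck.check(new File("."), 0.1) would then report
 healthy while at least 10% of the partition is still usable.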

 Subsystems inside WildFly could provide Health check procedures that would be
 queried to check the overall healthiness.
 We could for example provide a health check that the used heap memory is less
 than 90% of the max:

         HealthCheck.install(context, "heap-memory", () -> {
             // Uses java.lang.management.ManagementFactory and MemoryMXBean.
             MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();
             long memUsed = memoryBean.getHeapMemoryUsage().getUsed();
             // Note: getMax() returns -1 when the maximum is undefined.
             long memMax = memoryBean.getHeapMemoryUsage().getMax();
             HealthResponse response = HealthResponse.named("heap-memory")
                     .withAttribute("used", memUsed)
                     .withAttribute("max", memMax);
             // Status is DOWN if used memory is at least 90% of max memory.
             HealthStatus status = (memUsed < memMax * 0.9) ? response.up()
                     : response.down();
             return status;
         });

 HealthCheck.install creates an MSC service and makes sure that it is
 registered with the health monitor that queries all the procedures.
 A subsystem would just have to call HealthCheck.install/uninstall with a
 health check procedure to help determine the healthiness of the app server.
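 To illustrate the install/uninstall lifecycle without MSC, here is a
 simplified sketch. HealthRegistry is a hypothetical class: the prototype
 registers MSC services rather than entries in a map, and a procedure is
 reduced here to a BooleanSupplier that returns true for UP:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

/** Hypothetical registry; the prototype uses MSC services instead of a plain map. */
final class HealthRegistry {
    private final Map<String, BooleanSupplier> procedures = new ConcurrentHashMap<>();

    /** Registers (or replaces) the procedure known under the given name. */
    void install(String name, BooleanSupplier procedure) {
        procedures.put(name, procedure);
    }

    /** Removes the procedure, e.g. when the subsystem or deployment is stopped. */
    void uninstall(String name) {
        procedures.remove(name);
    }

    /** Queries every installed procedure; true means all of them report UP. */
    boolean isUp() {
        boolean up = true;
        for (BooleanSupplier procedure : procedures.values()) {
            up &= procedure.getAsBoolean();
        }
        return up;
    }
}
```

 The point of the sketch is only that a procedure contributes to the overall
 outcome for as long as it is installed, and stops doing so once uninstalled.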

 What do you think about this use case?

 I even wonder if this is something that should be instead provided by our
 core-management subsystem with a private API (1 interface and some data
 structures).
 The microprofile-health extension would then map our private API to the MPHC
 spec and handle health check procedures coming from deployments.

 # Summary

 To better integrate WildFly with OpenShift, we should provide a way to let
 OpenShift check the healthiness of WildFly. The MPHC spec is a good
 candidate to provide such a feature.
 It is worth exploring how we could leverage it for user deployments and also
 for WildFly internals (when that makes sense).
 Swarm is providing an implementation of the MPHC, so we also need to see how
 WildFly and Swarm can collaborate to avoid duplicating code and effort while
 providing the same feature to our users.

 jeff

 [1] https://github.com/eclipse/microprofile-evolution-process/blob/master/pro...
 [2] https://github.com/eclipse/microprofile-evolution-process/blob/master/pro...
 [3] https://docs.openshift.com/enterprise/3.0/dev_guide/application_health.html
 [4] https://kubernetes.io/v1.0/docs/user-guide/walkthrough/k8s201.html#health...
 --
 Jeff Mesnil
 JBoss, a division of Red Hat
 http://jmesnil.net/

 _______________________________________________
 wildfly-dev mailing list
 wildfly-dev(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/wildfly-dev 
