On 6 Jul 2017, at 16:13, Rob Cernich <rcernich@redhat.com> wrote:
> Hi,
>
> I had a look at the Eclipse MicroProfile Healthcheck spec[1] and wanted to
> share some thoughts and experiments about it: how it relates to WildFly and
> its use in containers (such as OpenShift).
>
> # Eclipse MicroProfile Healthcheck
>
> The Eclipse MicroProfile Healthcheck (MPHC for short) is a specification to
> determine the healthiness of an application.
> It defines a Health Check Procedure (HCP for short) interface that can be
> implemented by an application to determine its healthiness. It is a single
> method that returns a Health Status: either UP or DOWN (+ some metadata).
> Typically, an application would provide one or more HCPs to check the
> healthiness of its parts.
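>
> To give an idea of the shape of the API, here is a minimal sketch of an
> HCP based on the current draft of the spec (the exact names, such as
> HealthCheck and HealthCheckResponse, may still change while the spec is
> in flux):
>
> import javax.enterprise.context.ApplicationScoped;
> import org.eclipse.microprofile.health.Health;
> import org.eclipse.microprofile.health.HealthCheck;
> import org.eclipse.microprofile.health.HealthCheckResponse;
>
> // A health check procedure reporting heap usage, similar to the
> // "heap-memory" check in the prototype output below.
> @Health
> @ApplicationScoped
> public class HeapMemoryCheck implements HealthCheck {
>     @Override
>     public HealthCheckResponse call() {
>         Runtime runtime = Runtime.getRuntime();
>         long used = runtime.totalMemory() - runtime.freeMemory();
>         long max = runtime.maxMemory();
>         return HealthCheckResponse.named("heap-memory")
>                 .withData("used", used)
>                 .withData("max", max)
>                 // UP while less than 90% of the heap is used.
>                 .state(used < max * 0.9)
>                 .build();
>     }
> }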
> The overall healthiness of the application is determined by aggregating
> all the HCPs provided by the application: if any HCP is DOWN, the overall
> outcome is DOWN; otherwise, the application is considered UP.
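>
> The aggregation itself is done by the runtime, not the application; the
> logic boils down to something like this (a sketch, assuming responses
> holds the HealthCheckResponse returned by every procedure):
>
> // Overall outcome is UP only if every single procedure reports UP.
> boolean overallUp = responses.stream()
>         .allMatch(r -> r.getState() == HealthCheckResponse.State.UP);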
>
> The MPHC spec has a companion document[2] that specifies an HTTP format to
> check the healthiness of an application.
>
> Heiko is leading the spec and Swarm is the sample implementation for it
> (MicroProfile does not have the notion of a reference implementation).
> The spec is still in flux and we have a good opportunity to contribute to it
> to ensure that it meets our requirements and use cases.
>
> # Use case
>
> Using the HTTP endpoint, a container can ask an application whether it is
> healthy. If it is not, the container can stop the application and spin up
> a new instance.
> For example, OpenShift/Kubernetes can configure liveness probes[3][4].
>
> Supporting MPHC in WildFly would allow better integration with containers
> and ensure that any unhealthy WildFly process is restarted promptly.
>
> # Prototype
>
> I’ve written a prototype of a WildFly extension to support MPHC for
> applications deployed in WildFly *and* to add health check procedures
> inside WildFly itself:
>
>
> https://github.com/jmesnil/wildfly-microprofile-health
>
> and it passes the MPHC TCK :)
>
> The microprofile-health subsystem supports an operation to check the health
> of the app server:
>
> [standalone@localhost:9990 /] /subsystem=microprofile-health:check
> {
>     "outcome" => "success",
>     "result" => {
>         "checks" => [{
>             "id" => "heap-memory",
>             "result" => "UP",
>             "data" => {
>                 "max" => "477626368",
>                 "used" => "156216336"
>             }
>         }],
>         "outcome" => "UP"
>     }
> }
>
> It also exposes an (unauthenticated) HTTP endpoint:
>
> $ curl http://localhost:8080/health/
> {
>     "checks": [
>         {
>             "id": "heap-memory",
>             "result": "UP",
>             "data": {
>                 "max": "477626368",
>                 "used": "160137128"
>             }
>         }
>     ],
>     "outcome": "UP"
> }
>
> This HTTP endpoint can be used by OpenShift for its liveness probe.
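>
> For illustration, the liveness probe in a pod spec could then point at
> this endpoint (a sketch; the port and the delay values are assumptions,
> not something mandated by the prototype):
>
> livenessProbe:
>   httpGet:
>     path: /health
>     port: 8080
>   initialDelaySeconds: 60
>   periodSeconds: 10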
> Regarding the probes, three states would be best, if you can swing it, as
> OpenShift defines two probe types: liveness and readiness. Live means the
> server is running, though possibly unable to handle requests, while ready
> means it is running and able to handle requests. For example, while the
> server is initializing, it is alive but not ready. Something to think about.
Three states (red/orange/green) were discussed when the healthcheck API was
proposed. The idea was rejected because it puts the burden of determining the
overall healthiness on the consumer.
Besides, Kubernetes expects a binary response from its probes: if the HTTP
status code is between 200 and 400, the probe is successful[1]; anything else
is considered a failure.
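For instance, the status code Kubernetes would see can be checked with curl
against the prototype's endpoint; it should print 200 while the aggregated
outcome is UP, and 503 when it is DOWN:

    $ curl -s -o /dev/null -w "%{http_code}" http://localhost:8080/health/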
Kubernetes distinguishes between readiness and liveness. As currently defined,
the healthcheck API deals mainly with liveness.
However, it could be possible to provide an annotation to specify that some
health check procedures determine when an application is ready.
For example, WildFly could be considered ready (i.e. Kubernetes would start to
route requests to it) when:
* its status health check reports “up and running”
* its deployment health check verifies that all its deployments are enabled.
We could then provide a second HTTP endpoint that Kubernetes could query to
check that the server is ready to serve requests.
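To make this concrete, here is a rough sketch of what such a procedure could
look like. The @Ready qualifier is hypothetical (it is not part of the spec),
and deploymentsEnabled() merely stands in for querying the server's management
model:

import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import javax.enterprise.context.ApplicationScoped;
import org.eclipse.microprofile.health.HealthCheck;
import org.eclipse.microprofile.health.HealthCheckResponse;

// Hypothetical marker: flags a procedure as a *readiness* check, to be
// aggregated by the second HTTP endpoint instead of /health.
@Retention(RetentionPolicy.RUNTIME)
@interface Ready {}

@Ready
@ApplicationScoped
public class DeploymentsCheck implements HealthCheck {
    @Override
    public HealthCheckResponse call() {
        return HealthCheckResponse.named("deployments")
                .state(deploymentsEnabled())
                .build();
    }

    private boolean deploymentsEnabled() {
        // Placeholder: the real check would read the state of every
        // deployment from the server's management model.
        return true;
    }
}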
jeff
[1]