[infinispan-issues] [JBoss JIRA] (ISPN-7494) Prevent Kubernetes from killing 2 nodes at the same time

Wed Feb 22 08:16:00 EST 2017

    [ https://issues.jboss.org/browse/ISPN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13367502#comment-13367502 ] 

Sebastian Łaskawiec edited comment on ISPN-7494 at 2/22/17 8:15 AM:
--------------------------------------------------------------------

I found the solution. It was a matter of misconfigured liveness and readiness probes.

The main idea is to run each check several times and mark the probe as failed only if several consecutive checks fail. Here's an example:
{code}
          livenessProbe:
            exec:
              command:
              - /usr/local/bin/is_running.sh
            initialDelaySeconds: 10
            timeoutSeconds: 80
            periodSeconds: 60
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            exec:
              command:
              - /usr/local/bin/is_healthy.sh
            initialDelaySeconds: 10
            timeoutSeconds: 40
            periodSeconds: 30
            successThreshold: 2
            failureThreshold: 5
{code}
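
To make the effect of these thresholds concrete, here is a rough back-of-the-envelope sketch. It assumes Kubernetes acts on a probe only after {{failureThreshold}} consecutive failures spaced {{periodSeconds}} apart, and it ignores {{initialDelaySeconds}} and {{timeoutSeconds}}, so the numbers are approximations. The {{Probe}} class and its method names are illustrative only and are not part of any Kubernetes API.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """Illustrative holder for the probe settings from the example above."""
    initial_delay_seconds: int
    timeout_seconds: int
    period_seconds: int
    success_threshold: int
    failure_threshold: int

    def failure_window_seconds(self) -> int:
        # Approximate time of continuous failure before the probe counts
        # as failed (assumption: one failed check per period).
        return self.failure_threshold * self.period_seconds

    def recovery_window_seconds(self) -> int:
        # Approximate time of continuous success before the probe counts
        # as passing again.
        return self.success_threshold * self.period_seconds


# Values copied from the livenessProbe and readinessProbe example above.
liveness = Probe(10, 80, 60, 1, 5)
readiness = Probe(10, 40, 30, 2, 5)

print("liveness failure window:", liveness.failure_window_seconds(), "s")
print("readiness failure window:", readiness.failure_window_seconds(), "s")
print("readiness recovery window:", readiness.recovery_window_seconds(), "s")
```

Under these assumptions a node has to fail its liveness check for roughly 300 seconds before it is restarted, and its readiness check for roughly 150 seconds before it stops receiving traffic, so a short hiccup during a rolling update should no longer take a node down.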

was (Author: sebastian.laskawiec):
I found the solution. It was a matter of misconfigured liveness and readiness probes.

The main idea is to run each check several times and mark the probe as failed only if several consecutive checks fail. Here's an example:
{code}
          livenessProbe:
            exec:
              command:
              - /usr/local/bin/is_running.sh
            initialDelaySeconds: 10
            timeoutSeconds: 80
            periodSeconds: 60
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            exec:
              command:
              - /usr/local/bin/is_healthy.sh
{code}

> Prevent Kubernetes from killing 2 nodes at the same time
> --------------------------------------------------------
>
>                 Key: ISPN-7494
>                 URL: https://issues.jboss.org/browse/ISPN-7494
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Cloud Integrations
>    Affects Versions: 9.0.0.Beta2
>         Environment: * OpenShift {{v1.5.0-alpha.2+e4b43ee}}
> * Infinispan Server 9.0.0.Beta2
>            Reporter: Sebastian Łaskawiec
>            Assignee: Sebastian Łaskawiec
>            Priority: Blocker
>
> When I was performing [Spring Session and Kubernetes Rolling Update demo|https://bluejeans.com/s/pYKUg/] I encountered a couple of problems.
> One of them is this:
> {noformat}
> [transactions-repository-1-hqz3v] *** JBossAS process (83) received TERM signal ***
> [transactions-repository-1-dwl81] 09:52:09,522 INFO  [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested via an OS signal
> [transactions-repository-1-dwl81] *** JBossAS process (80) received TERM signal ***
> [transactions-repository-1-hqz3v] 09:52:09,526 INFO  [org.jboss.as.server] (Thread-2) WFLYSRV0220: Server shutdown has been requested via an OS signal
> {noformat}
> Full logs from the Rolling Update process can be found here: https://gist.github.com/slaskawi/2308b4c5e9bbf523fb3e02a7cc45fa24
> Steps to reproduce:
> * Start local OpenShift Cluster
> * Invoke `./init_infrastructure.sh` from https://github.com/slaskawi/presentations/tree/ISPN-7487-reproducer
> * Invoke `cd transaction-creator && mvn fabric8:run`
> * Start Spring Session Demo `cd session-demo && mvn fabric8:run`
> * Create a client which inserts data (`watch -n 0.5 curl http://<spring-session-demo-pod-ip>/sessions`) and at the same time invoke the rolling update: `oc deploy transactions-repository --latest -n myproject`
> * Observe the logs: `kubetail -l environment=infrastructure`

--
This message was sent by Atlassian JIRA
(v7.2.3#72005)