[jboss-jira] [JBoss JIRA] (WFWIP-176) Pod restarted because of failing liveness/readiness Probe

Brian Stansberry (Jira) issues at jboss.org
Mon Aug 19 20:07:00 EDT 2019


    [ https://issues.jboss.org/browse/WFWIP-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772734#comment-13772734 ] 

Brian Stansberry commented on WFWIP-176:
----------------------------------------

[~mchoma] Is it possible that in CD 16 the test simply completed and exited before the probe could fail?

Before CLOUD-2742, the Python code would internally loop up to 30 times trying to execute the request to the server, sleeping 5 seconds each time. On top of that, the YAML configures 3 tries before failure, with more sleep between them. So that adds up to a long time.

Now the Python code fails immediately.
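
A rough back-of-the-envelope sketch of that difference, using the numbers above and the probe settings from the YAML quoted below. The function name and the timing model are assumptions made for illustration only: it assumes each probe run blocks for the full retry loop, that runs do not overlap, and that timeoutSeconds does not cut the script short.

```python
def seconds_until_threshold(attempts, sleep_s, failure_threshold, period_s):
    """Rough time from the start of the first failing probe run until the
    run that reaches failureThreshold has finished (hypothetical model)."""
    # Time one probe run spends retrying internally (request time ignored).
    probe_duration = attempts * sleep_s
    # Next run starts after periodSeconds, or after the previous run ends,
    # whichever is later (assumption: runs never overlap).
    gap = max(period_s, probe_duration)
    return (failure_threshold - 1) * gap + probe_duration


# Before CLOUD-2742: up to 30 attempts with a 5 second sleep each.
old = seconds_until_threshold(attempts=30, sleep_s=5, failure_threshold=3, period_s=10)
# After CLOUD-2742: the script fails immediately, modelled as zero duration.
new = seconds_until_threshold(attempts=1, sleep_s=0, failure_threshold=3, period_s=10)

print(old, new)  # 450 20
```

Under those assumptions the old behaviour needs several minutes of continuous failure before a restart, while the new behaviour needs well under a minute, which would leave a short test plenty of room to finish first on CD 16.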

> Pod restarted because of failing liveness/readiness Probe
> ---------------------------------------------------------
>
>                 Key: WFWIP-176
>                 URL: https://issues.jboss.org/browse/WFWIP-176
>             Project: WildFly WIP
>          Issue Type: Bug
>          Components: OpenShift
>            Reporter: Martin Choma
>            Assignee: Ken Wills
>            Priority: Major
>
> While testing the 73 image I came across a case where a real corner case is tested [0].
> The test does not use templates for deployment.
> In the tested scenario the liveness/readiness probe fails. In CD 17 and EAP 73 the pod is restarted. In CD 16, however, there were no liveness/readiness failures in the events, and the pod was not restarted.
> I don't see any differences in the pod YAML for the CD 16 case
> {code}
>       livenessProbe:
>         exec:
>           command:
>             - /bin/bash
>             - '-c'
>             - /opt/eap/bin/livenessProbe.sh
>         failureThreshold: 3
>         periodSeconds: 10
>         successThreshold: 1
>         timeoutSeconds: 1
>       name: weirdusername
>       readinessProbe:
>         exec:
>           command:
>             - /bin/bash
>             - '-c'
>             - /opt/eap/bin/readinessProbe.sh
>         failureThreshold: 3
>         periodSeconds: 10
>         successThreshold: 1
>         timeoutSeconds: 1
> {code}
> and the CD 17 case
> {code}
>       livenessProbe:
>         exec:
>           command:
>             - /bin/bash
>             - '-c'
>             - /opt/eap/bin/livenessProbe.sh
>         failureThreshold: 3
>         periodSeconds: 10
>         successThreshold: 1
>         timeoutSeconds: 1
>       name: weirdusername
>       readinessProbe:
>         exec:
>           command:
>             - /bin/bash
>             - '-c'
>             - /opt/eap/bin/readinessProbe.sh
>         failureThreshold: 3
>         periodSeconds: 10
>         successThreshold: 1
>         timeoutSeconds: 1
> {code}
> What could have caused this change in behaviour?
> [0] https://issues.jboss.org/browse/CLOUD-1988



--
This message was sent by Atlassian Jira
(v7.12.1#712002)

