[https://issues.jboss.org/browse/ISPN-6673?page=com.atlassian.jira.plugin....]
Sebastian Łaskawiec edited comment on ISPN-6673 at 7/27/16 8:55 AM:
--------------------------------------------------------------------
The rolling update procedure for Kubernetes and OpenShift looks as follows:
# Create a new app for Infinispan - I'm using my own image, {{slaskawi/infinispan-ru-1}}, with additional health and readiness check scripts (both shown below).
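The app can be created from that image with a plain {{oc new-app}} (a minimal sketch, assuming the image is pullable from the Docker Hub):
{code}
oc new-app slaskawi/infinispan-ru-1 -n myproject
{code}
The two probe scripts baked into the image: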
{code}
#!/bin/bash
# /opt/jboss/infinispan-server/bin/is_ready.sh
# Poll the management interface for up to 10 seconds and exit 0 as soon
# as a cache rebalancing status other than PENDING, IN_PROGRESS or
# SUSPENDED is reported.
for i in `seq 1 10`; do
  sleep 1s
  /opt/jboss/infinispan-server/bin/ispn-cli.sh -c --controller=$(hostname -i):9990 \
    '/subsystem=datagrid-infinispan/cache-container=clustered/distributed-cache=*:read-attribute(name=cache-rebalancing-status)' \
    | awk '/result/{gsub("\"", "", $3); print $3}' \
    | awk '{if(NR>1)print}' \
    | grep -v 'PENDING\|IN_PROGRESS\|SUSPENDED'
  if [ $? -eq 0 ]; then
    exit 0
  fi
done
exit 1
{code}
{code}
#!/bin/bash
# /opt/jboss/infinispan-server/bin/is_healthy.sh
# Poll the management interface for up to 10 seconds and exit 0 as soon
# as the server reports the "running" state.
for i in `seq 1 10`; do
  sleep 1s
  /opt/jboss/infinispan-server/bin/ispn-cli.sh -c --controller=$(hostname -i):9990 \
    '/:read-attribute(name=server-state)' \
    | awk '/result/{gsub("\"", "", $3); print $3}' \
    | grep running
  if [ $? -eq 0 ]; then
    exit 0
  fi
done
exit 1
{code}
Since the rebalance status might vary from run to run (imagine a node joining the cluster), there are two ways to deal with it: either retry in a loop as the scripts above do, or set {{successThreshold}} to a number larger than 1 in the deployment configuration.
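For the second option, the readiness stanza would look roughly like this (a sketch; the value 3 is illustrative):
{code}
readinessProbe:
  exec:
    command: [/opt/jboss/infinispan-server/bin/is_ready.sh]
  periodSeconds: 10
  # require several consecutive passes before the pod is marked Ready
  successThreshold: 3
  failureThreshold: 3
{code}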
# Update the deployment configuration:
{code}
apiVersion: v1
kind: DeploymentConfig
metadata:
  name: infinispan-ru-1
  namespace: myproject
  selfLink: /oapi/v1/namespaces/myproject/deploymentconfigs/infinispan-ru-1
  uid: 6def5411-53e2-11e6-97aa-54ee751d46e3
  resourceVersion: '6570'
  generation: 28
  creationTimestamp: '2016-07-27T10:11:05Z'
  labels:
    app: infinispan-ru-1
  annotations:
    openshift.io/deployment.instantiated: 'true'
    openshift.io/generated-by: OpenShiftNewApp
spec:
  strategy:
    type: Rolling
    rollingParams:
      updatePeriodSeconds: 1
      intervalSeconds: 1
      timeoutSeconds: 600
      maxUnavailable: 0%
      maxSurge: 25%
    resources: {}
  triggers:
    - type: ConfigChange
    - type: ImageChange
      imageChangeParams:
        automatic: true
        containerNames:
          - infinispan-ru-1
        from:
          kind: ImageStreamTag
          namespace: myproject
          name: 'infinispan-ru-1:latest'
        lastTriggeredImage: 'slaskawi/infinispan-ru-1@sha256:6d2de3cad2970fcb1207df2b7f947a74c990f5be2e02bc9aaf9671098547bc82'
  replicas: 5
  test: false
  selector:
    app: infinispan-ru-1
    deploymentconfig: infinispan-ru-1
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: infinispan-ru-1
        deploymentconfig: infinispan-ru-1
      annotations:
        openshift.io/container.infinispan-ru-1.image.entrypoint: '["/bin/sh","-c","/opt/jboss/infinispan-server/bin/standalone.sh -c cloud.xml -Djboss.default.jgroups.stack=kubernetes \t-b `hostname -i` \t-bmanagement `hostname -i` --debug"]'
        openshift.io/generated-by: OpenShiftNewApp
    spec:
      containers:
        - name: infinispan-ru-1
          image: 'slaskawi/infinispan-ru-1@sha256:6d2de3cad2970fcb1207df2b7f947a74c990f5be2e02bc9aaf9671098547bc82'
          ports:
            - containerPort: 8080
              protocol: TCP
            - containerPort: 8888
              protocol: TCP
            - containerPort: 8181
              protocol: TCP
            - containerPort: 9990
              protocol: TCP
            - containerPort: 11211
              protocol: TCP
            - containerPort: 11222
              protocol: TCP
          env:
            - name: OPENSHIFT_KUBE_PING_NAMESPACE
              value: myproject
          resources: {}
          livenessProbe:
            exec:
              command: [/opt/jboss/infinispan-server/bin/is_healthy.sh]
            initialDelaySeconds: 60
            timeoutSeconds: 180
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            exec:
              command: [/opt/jboss/infinispan-server/bin/is_ready.sh]
            initialDelaySeconds: 60
            timeoutSeconds: 180
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          imagePullPolicy: Always
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      securityContext: {}
status:
  latestVersion: 18
  observedGeneration: 28
  replicas: 5
  updatedReplicas: 5
  availableReplicas: 5
  details:
    causes:
      - type: ConfigChange
{code}
Key features:
** Use a proper configuration for the liveness and readiness probes ({{is_healthy.sh}} for liveness, {{is_ready.sh}} for readiness).
** Use {{maxUnavailable: 0%}} and {{maxSurge: 25%}}, which means that OpenShift will first create some new nodes, wait for the rebalance to finish, and only then start destroying the existing ones.
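** With {{replicas: 5}} this works out to at most 2 surge pods (assuming the usual Kubernetes rounding: {{maxSurge}} rounds up, so ceil(5 * 0.25) = 2) and 0 unavailable ones, i.e. the rollout runs with up to 7 pods and never drops below 5 ready ones.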
# Redeploy the application (update the config, the image, whatever):
{code}
oc deploy infinispan-ru-1 --latest -n myproject
{code}
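The rollout can be watched while it happens, e.g. with a plain watch on the pods:
{code}
# pods are replaced one batch at a time; no old pod should go away
# before a new one turns Ready (maxUnavailable: 0%)
oc get pods -n myproject -w
{code}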
# Check that the number of entries is the same at the end of the procedure, for example as sketched below.
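A quick way to do that is the same CLI pattern the probe scripts use (a sketch - the cache name {{default}} and the {{number-of-entries}} attribute are assumptions, adjust them to your setup):
{code}
# read the entry count of a single cache before and after the rollout;
# cache name and attribute name are assumptions - adjust as needed
/opt/jboss/infinispan-server/bin/ispn-cli.sh -c --controller=$(hostname -i):9990 \
  '/subsystem=datagrid-infinispan/cache-container=clustered/distributed-cache=default:read-attribute(name=number-of-entries)'
{code}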
# Observations:
** It takes some time for a node to properly join the cluster. The readiness probe should probably have to pass more than once in a production configuration.
** Even though the readiness probe passes, it doesn't necessarily mean that the node joined the cluster. During testing I once had a split brain (4 nodes vs 1 node). This is a very dangerous situation. A readiness and health check should always validate that the number of nodes in the cluster is correct (see the sketch after this list).
** The nodes are currently not killed properly (they should always perform a graceful shutdown).
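A sketch of such a cluster-size check, reusing the CLI pattern from the probe scripts (the {{cluster-size}} attribute name and the {{EXPECTED_NODES}} variable are assumptions, not a confirmed API):
{code}
#!/bin/bash
# Fail unless the cluster has the expected number of members.
# NOTE: the cluster-size attribute name is an assumption - verify it
# against your server's management model; EXPECTED_NODES is hypothetical.
SIZE=$(/opt/jboss/infinispan-server/bin/ispn-cli.sh -c --controller=$(hostname -i):9990 \
  '/subsystem=datagrid-infinispan/cache-container=clustered:read-attribute(name=cluster-size)' \
  | awk '/result/{gsub("\"", "", $3); print $3}')
[ "$SIZE" -eq "${EXPECTED_NODES:-5}" ]
{code}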
Implement Rolling Upgrades with Kubernetes
------------------------------------------
Key: ISPN-6673
URL: https://issues.jboss.org/browse/ISPN-6673
Project: Infinispan
Issue Type: Feature Request
Components: Cloud Integrations
Reporter: Sebastian Łaskawiec
Assignee: Sebastian Łaskawiec
There are two mechanisms which seem to do the same thing but are totally different:
* [Kubernetes Rolling Update|http://kubernetes.io/docs/user-guide/rolling-updates/] - replaces Pods in a controllable fashion
* [Infinispan Rolling Upgrade|http://infinispan.org/docs/stable/user_guide/user_guide.html#_Ro...] - a procedure for upgrading Infinispan or changing the configuration
Kubernetes Rolling Updates can be used very easily for changing the configuration; however, if the changes are not runtime-compatible, one might lose data. A potential way to avoid this is to use a Cache Store. All other changes must be propagated using the Infinispan Rolling Upgrade procedure.