[https://issues.jboss.org/browse/ISPN-6673?page=com.atlassian.jira.plugin....]
Sebastian Łaskawiec edited comment on ISPN-6673 at 7/27/16 8:55 AM:
--------------------------------------------------------------------
The rolling update procedure for Kubernetes and OpenShift looks as follows:
# Create a new app for Infinispan - I'm using my own image, {{slaskawi/infinispan-ru-1}}, with additional health and readiness check scripts (both shown below).
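The app can be created from that image with a plain {{oc new-app}} (a minimal sketch, assuming the image is pullable from the Docker Hub):
{code}
oc new-app slaskawi/infinispan-ru-1 -n myproject
{code}
The two probe scripts baked into the image: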
{code}
#!/bin/bash
# /opt/jboss/infinispan-server/bin/is_ready.sh
# Poll the management interface for up to 10 seconds and exit 0 as soon
# as a cache rebalancing status other than PENDING, IN_PROGRESS or
# SUSPENDED is reported.
for i in `seq 1 10`; do
  sleep 1s
  /opt/jboss/infinispan-server/bin/ispn-cli.sh -c --controller=$(hostname -i):9990 \
    '/subsystem=datagrid-infinispan/cache-container=clustered/distributed-cache=*:read-attribute(name=cache-rebalancing-status)' \
    | awk '/result/{gsub("\"", "", $3); print $3}' \
    | awk '{if(NR>1)print}' \
    | grep -v 'PENDING\|IN_PROGRESS\|SUSPENDED'
  if [ $? -eq 0 ]; then
    exit 0
  fi
done
exit 1
{code}
{code}
#!/bin/bash
# /opt/jboss/infinispan-server/bin/is_healthy.sh
# Poll the management interface for up to 10 seconds and exit 0 as soon
# as the server reports the "running" state.
for i in `seq 1 10`; do
  sleep 1s
  /opt/jboss/infinispan-server/bin/ispn-cli.sh -c --controller=$(hostname -i):9990 \
    '/:read-attribute(name=server-state)' \
    | awk '/result/{gsub("\"", "", $3); print $3}' \
    | grep running
  if [ $? -eq 0 ]; then
    exit 0
  fi
done
exit 1
{code}
Since the rebalance status might vary from run to run (imagine a node joining the cluster), there are two ways to deal with it: either retry in a loop as the scripts above do, or set {{successThreshold}} to a number larger than 1 in the deployment configuration.
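For the second option, the readiness stanza would look roughly like this (a sketch; the value 3 is illustrative):
{code}
readinessProbe:
  exec:
    command: [/opt/jboss/infinispan-server/bin/is_ready.sh]
  periodSeconds: 10
  # require several consecutive passes before the pod is marked Ready
  successThreshold: 3
  failureThreshold: 3
{code}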
# Update the deployment configuration:
{code}
apiVersion: v1
kind: DeploymentConfig
metadata:
  name: infinispan-ru-1
  namespace: myproject
  selfLink: /oapi/v1/namespaces/myproject/deploymentconfigs/infinispan-ru-1
  uid: 6def5411-53e2-11e6-97aa-54ee751d46e3
  resourceVersion: '6570'
  generation: 28
  creationTimestamp: '2016-07-27T10:11:05Z'
  labels:
    app: infinispan-ru-1
  annotations:
    openshift.io/deployment.instantiated: 'true'
    openshift.io/generated-by: OpenShiftNewApp
spec:
  strategy:
    type: Rolling
    rollingParams:
      updatePeriodSeconds: 1
      intervalSeconds: 1
      timeoutSeconds: 600
      maxUnavailable: 0%
      maxSurge: 25%
    resources: {}
  triggers:
    - type: ConfigChange
    - type: ImageChange
      imageChangeParams:
        automatic: true
        containerNames:
          - infinispan-ru-1
        from:
          kind: ImageStreamTag
          namespace: myproject
          name: 'infinispan-ru-1:latest'
        lastTriggeredImage: 'slaskawi/infinispan-ru-1@sha256:6d2de3cad2970fcb1207df2b7f947a74c990f5be2e02bc9aaf9671098547bc82'
  replicas: 5
  test: false
  selector:
    app: infinispan-ru-1
    deploymentconfig: infinispan-ru-1
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: infinispan-ru-1
        deploymentconfig: infinispan-ru-1
      annotations:
        openshift.io/container.infinispan-ru-1.image.entrypoint: '["/bin/sh","-c","/opt/jboss/infinispan-server/bin/standalone.sh -c cloud.xml -Djboss.default.jgroups.stack=kubernetes \t-b `hostname -i` \t-bmanagement `hostname -i` --debug"]'
        openshift.io/generated-by: OpenShiftNewApp
    spec:
      containers:
        - name: infinispan-ru-1
          image: 'slaskawi/infinispan-ru-1@sha256:6d2de3cad2970fcb1207df2b7f947a74c990f5be2e02bc9aaf9671098547bc82'
          ports:
            - containerPort: 8080
              protocol: TCP
            - containerPort: 8888
              protocol: TCP
            - containerPort: 8181
              protocol: TCP
            - containerPort: 9990
              protocol: TCP
            - containerPort: 11211
              protocol: TCP
            - containerPort: 11222
              protocol: TCP
          env:
            - name: OPENSHIFT_KUBE_PING_NAMESPACE
              value: myproject
          resources: {}
          livenessProbe:
            exec:
              command: [/opt/jboss/infinispan-server/bin/is_healthy.sh]
            initialDelaySeconds: 60
            timeoutSeconds: 180
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            exec:
              command: [/opt/jboss/infinispan-server/bin/is_ready.sh]
            initialDelaySeconds: 60
            timeoutSeconds: 180
            periodSeconds: 10
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          imagePullPolicy: Always
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      securityContext: {}
status:
  latestVersion: 18
  observedGeneration: 28
  replicas: 5
  updatedReplicas: 5
  availableReplicas: 5
  details:
    causes:
      - type: ConfigChange
{code}
Key features:
** Use a proper configuration for the liveness and readiness probes ({{is_healthy.sh}} for liveness, {{is_ready.sh}} for readiness).
** Use {{maxUnavailable: 0%}} and {{maxSurge: 25%}}, which means that OpenShift will first create some new nodes, wait for the rebalance to finish, and only then start destroying the existing ones.
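** With {{replicas: 5}} this works out to at most 2 surge pods (assuming the usual Kubernetes rounding: {{maxSurge}} rounds up, so ceil(5 * 0.25) = 2) and 0 unavailable ones, i.e. the rollout runs with up to 7 pods and never drops below 5 ready ones.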
# Redeploy the application (update the config, the image, whatever):
{code}
oc deploy infinispan-ru-1 --latest -n myproject
{code}
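The rollout can be watched while it happens, e.g. with a plain watch on the pods:
{code}
# pods are replaced one batch at a time; no old pod should go away
# before a new one turns Ready (maxUnavailable: 0%)
oc get pods -n myproject -w
{code}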
# Check that the number of entries is the same at the end of the procedure, for example as sketched below.
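A quick way to do that is the same CLI pattern the probe scripts use (a sketch - the cache name {{default}} and the {{number-of-entries}} attribute are assumptions, adjust them to your setup):
{code}
# read the entry count of a single cache before and after the rollout;
# cache name and attribute name are assumptions - adjust as needed
/opt/jboss/infinispan-server/bin/ispn-cli.sh -c --controller=$(hostname -i):9990 \
  '/subsystem=datagrid-infinispan/cache-container=clustered/distributed-cache=default:read-attribute(name=number-of-entries)'
{code}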
# Observations:
** It takes some time for a node to properly join the cluster. The readiness probe should probably have to pass more than once in a production configuration.
** Even though the readiness probe passes, it doesn't necessarily mean that the node joined the cluster. During testing I once had a split brain (4 nodes vs 1 node). This is a very dangerous situation. A readiness and health check should always validate that the number of nodes in the cluster is correct (see the sketch after this list).
** The nodes are currently not killed properly (they should always perform a graceful shutdown).
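A sketch of such a cluster-size check, reusing the CLI pattern from the probe scripts (the {{cluster-size}} attribute name and the {{EXPECTED_NODES}} variable are assumptions, not a confirmed API):
{code}
#!/bin/bash
# Fail unless the cluster has the expected number of members.
# NOTE: the cluster-size attribute name is an assumption - verify it
# against your server's management model; EXPECTED_NODES is hypothetical.
SIZE=$(/opt/jboss/infinispan-server/bin/ispn-cli.sh -c --controller=$(hostname -i):9990 \
  '/subsystem=datagrid-infinispan/cache-container=clustered:read-attribute(name=cluster-size)' \
  | awk '/result/{gsub("\"", "", $3); print $3}')
[ "$SIZE" -eq "${EXPECTED_NODES:-5}" ]
{code}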
Implement Rolling Upgrades with Kubernetes
------------------------------------------
Key: ISPN-6673
URL: https://issues.jboss.org/browse/ISPN-6673
Project: Infinispan
Issue Type: Feature Request
Components: Cloud Integrations
Reporter: Sebastian Łaskawiec
Assignee: Sebastian Łaskawiec
There are two mechanisms which seem to do the same thing but are totally different:
* [Kubernetes Rolling Update|http://kubernetes.io/docs/user-guide/rolling-updates/] - replaces Pods in a controllable fashion
* [Infinispan Rolling Upgrade|http://infinispan.org/docs/stable/user_guide/user_guide.html#_Ro...] - a procedure for upgrading Infinispan or changing the configuration
Kubernetes Rolling Updates can be used very easily for changing the configuration; however, if the changes are not runtime-compatible, one might lose data. A potential way to avoid this is to use a Cache Store. All other changes must be propagated using the Infinispan Rolling Upgrade procedure.