In a previous article we showed how OpenTracing instrumentation can be used to collect
application metrics, in addition to (but independently of) the reported tracing data, from
services deployed within a cloud environment (e.g. Kubernetes or OpenShift).
In this article we will show how this information can be used to aid a Canary deployment
strategy within OpenShift.
Figure 1: Error ratio per service and version
The updated example application
We will be using the same example as in the previous article. However, since that article
was written, the configuration of the tracer and of the Prometheus metrics support has been
simplified. Neither now requires explicit configuration; the only remaining setup is the
declaration of some MetricLabel beans, which identify custom labels to be added to the
Prometheus metrics, e.g.
Metrics configuration used in both services:
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// The MetricLabel implementations come from the opentracing-contrib java-metrics
// library; the package names below are assumed from that project's layout.
import io.opentracing.contrib.metrics.MetricLabel;
import io.opentracing.contrib.metrics.label.BaggageMetricLabel;
import io.opentracing.contrib.metrics.label.ConstMetricLabel;

@Configuration
public class MetricsConfiguration {

    @Bean
    public MetricLabel transactionLabel() {
        return new BaggageMetricLabel("transaction", "n/a"); // (1)
    }

    @Bean
    public MetricLabel versionLabel() {
        return new ConstMetricLabel("version", System.getenv("VERSION")); // (2)
    }
}
(1) This metric label identifies the business transaction associated with the metrics. It can be used to isolate the number of requests, the duration, and the errors that occurred when the service was used within that particular business transaction.
(2) This metric label identifies the service version, which is especially useful in the Canary deployment use case discussed in this article.
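For the "transaction" label to carry a meaningful value at runtime, a service at the start of the call chain needs to set the corresponding baggage item on the active span, since baggage propagates downstream with the trace context. A minimal sketch using the OpenTracing API (the helper class and the transaction name "buy" are assumptions for illustration, not taken from the example):

import io.opentracing.Span;
import io.opentracing.Tracer;

public class TransactionBaggage {

    // Hypothetical helper: tags the current trace with a business
    // transaction name ("buy" is an illustrative value).
    public static void tagTransaction(Tracer tracer) {
        Span span = tracer.activeSpan();
        if (span != null) {
            // Baggage travels with the trace context, so every downstream
            // service reports the same "transaction" metric label.
            span.setBaggageItem("transaction", "buy");
        }
    }
}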
The first step is to follow the instructions in the example for deploying and using the
services within OpenShift.
Once the ./genorders.sh script has been running for a while, generating plenty of metrics
for version 0.0.1 of the services, deploy the new version of the services. This is
achieved by (the full command sequence is sketched after this list):

- updating the versions in the pom.xml files, within the simple/accountmgr and simple/ordermgr folders, from 0.0.1 to 0.0.2
- re-running the mvn clean install docker:build command from the simple folder
- deploying the canary versions of the services using the command oc create -f services-canary-kubernetes.yml
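Collected as shell commands, the sequence looks roughly like this (the sed line is an illustrative shortcut for the manual version edit; review its effect before relying on it):

# Bump the service versions from 0.0.1 to 0.0.2 (or edit the pom.xml files by hand)
sed -i 's/0\.0\.1/0\.0\.2/' simple/accountmgr/pom.xml simple/ordermgr/pom.xml

# Rebuild the services and their Docker images
cd simple
mvn clean install docker:build

# Deploy the canary versions alongside the existing deployments
oc create -f services-canary-kubernetes.yml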
Because our accountmgr and ordermgr services select their backing deployments via the
labels app: accountmgr and app: ordermgr respectively, simply having a second deployment
that carries these labels makes it serve requests alongside the original in a round-robin
manner.
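This is standard Kubernetes label selection. A minimal sketch of what such a service definition might look like (the port and exact layout are assumptions; the authoritative definitions are in the example repository's YAML files):

apiVersion: v1
kind: Service
metadata:
  name: ordermgr
spec:
  selector:
    app: ordermgr    # matches pods from both the 0.0.1 and 0.0.2 deployments
  ports:
  - port: 8080       # assumed port, for illustration only
    targetPort: 8080

Because the selector matches only on the app label, pods from any deployment carrying that label receive a share of the traffic.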
The canary deployment script (services-canary-kubernetes.yml) has been pre-configured with
the 0.0.2 version, and starts only a single instance of each new service version. That may
be desirable if you want to monitor the behaviour of the new versions over a reasonable
time period, but as we want to see results faster we will scale them up to generate more
activity. You can do this by expanding the deployment area for each service in the
OpenShift web console and selecting the up arrow to scale up each service:
Figure 2: Scaling up canary deployment
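If you prefer the command line, the same scaling can be done with oc scale. The deployment names below are assumptions; list the actual ones with oc get deployment:

# Deployment names are illustrative; check them with: oc get deployment
oc scale deployment/accountmgr-canary --replicas=2
oc scale deployment/ordermgr-canary --replicas=2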
Now we can monitor the Prometheus dashboard, using the following query, to see the error
ratio per service and version:
sum(increase(span_count{error="true",span_kind="server"}[1m]))
  without (pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind)
/
sum(increase(span_count{span_kind="server"}[1m]))
  without (pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind)
This query divides the one-minute increase in error spans by the one-minute increase in
all server-side spans, aggregating away every label except the service and its version.
The result can be seen in Figure 1 at the top of the article. The chart shows that version
0.0.2 of the accountmgr service has not generated any errors, while version 0.0.2 of the
ordermgr service appears to be less error prone than version 0.0.1.
Based on these metrics, we could decide that the new versions of these services are better
than the previous ones, and therefore update the main service deployments to use the new
versions. In the OpenShift web console you can do this by clicking the three vertical dots
in the upper right hand side of the deployment region and selecting Edit YAML from the
menu. This displays an editor window where you can change the version from 0.0.1 to 0.0.2
in the YAML file.
Figure 3: Update the service version
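The exact contents of the editor window depend on the example's deployment definitions, but the fields to update are the version references. A hypothetical fragment (the image name and layout are assumptions; the VERSION environment variable is the one surfaced as the version metric label by the configuration shown earlier):

spec:
  template:
    spec:
      containers:
      - name: ordermgr
        image: example/ordermgr:0.0.2   # previously 0.0.1
        env:
        - name: VERSION
          value: "0.0.2"                # previously 0.0.1; becomes the 'version' metric label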
After you save the YAML configuration file, in the web console you can watch the services
go through a "rolling update" as OpenShift incrementally switches each service instance
over to the new version.
Figure 4: Rolling update
After the rolling update has completed for both the ordermgr and accountmgr services, you
can scale down or completely remove the canary version of each deployment.
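The command-line equivalent is to scale the canary deployments down to zero, or to delete them using the same file they were created from (deployment names are again assumptions):

oc scale deployment/accountmgr-canary --replicas=0
oc scale deployment/ordermgr-canary --replicas=0
# or remove the canary resources entirely
oc delete -f services-canary-kubernetes.yml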
An alternative to performing the rolling update would be to give the canary deployment a
name specific to the version being tested, and when it comes time to switch over, simply
scale down the previous deployment version. This would be more straightforward, but
wouldn't show off the cool rolling update approach in the OpenShift web console :-)
Figure 5: Scaling down canary deployment
Although we have updated both services at the same time, this is not necessary. Generally
microservices would be managed by separate teams and subject to their own deployment
lifecycles.
Conclusion
This article has shown how application metrics, captured by instrumenting services using
the OpenTracing API, can be used to support a simple Canary deployment strategy.
These metrics can similarly be used with other deployment strategies, such as A/B testing,
which can be achieved using a weighted load balancing capability within OpenShift.
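For reference, OpenShift routes support this kind of weighted split directly via alternate backends. A sketch, in which the second service name and the weights are assumptions:

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: ordermgr
spec:
  to:
    kind: Service
    name: ordermgr          # receives 90% of the traffic
    weight: 90
  alternateBackends:
  - kind: Service
    name: ordermgr-canary   # hypothetical second service fronting the canary pods
    weight: 10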
Links
OpenTracing:
http://opentracing.io
Github repository with demo:
http://ift.tt/2rX5MoK
OpenTracing java metrics:
http://ift.tt/2rWUNvF
Kubernetes:
https://kubernetes.io
OpenShift:
https://openshift.io
Jaeger:
http://ift.tt/2eOSqHE
Prometheus:
https://prometheus.io