On Sat, Aug 20, 2016 at 9:19 AM, Bela Ban <bban@redhat.com> wrote:

If we add a DNS discovery protocol, it would only be another discovery protocol among many, and customers can choose which one to use.
I'm also thinking of adding to JGroups the ability to use multiple discovery protocols in the same stack and combine their result sets into one. I'm not sure, though, whether it makes sense to use KUBE_PING and DNS in the same stack...

On 20/08/16 00:04, Rob Cernich wrote:

A couple of things...

re. volumes:
We also need to consider the mounting behavior for scale down scenarios
and for overage scenarios when doing upgrades. For the latter,
OpenShift can spin up pods of the new version before the older version
pods have terminated. This may mean that some volumes from the old pods
are orphaned. We did see this when testing A-MQ during upgrades. With
a single pod, the upgrade process caused the new version to have a new
mount and the original mount was left orphaned (another upgrade would
cause the newer pod to pick up the orphaned mount, leaving the new mount
orphaned). I believe we worked around this by specifying an overage of
0% during upgrades. This ensured the new pods would pick up the volumes
left behind by the old pods. (Actually, we were using subdirectories in
the mount, since all pods shared the same volume.)

re. dns:
DNS should work fine as-is, but there are a couple of things that you
need to consider.
1. Service endpoints are only available in DNS after the pod becomes
ready (SRV records on the service name). Because Infinispan attaches
itself to the cluster, this meant pods were all started as a cluster of
one, then merged once they noticed the other pods. This had a
significant impact on startup. Since then, OpenShift has added the
ability to query the endpoints associated with a service as soon as the
pod is created, which would allow initialization to work correctly. To
make this work, we'd have to change the form of the DNS query to pick up
the service endpoints (I forget the naming scheme).
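
For illustration, DNS SRV records can be queried with the JNDI DNS provider that ships with the JDK. The following is only a minimal sketch: the class and method names (SrvLookup, toHostPort, lookup) are made up for this example, and the record name to query has to be supplied by the caller, since the exact naming scheme is not spelled out here.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Hypothetical helper, not part of JGroups or KUBE_PING.
final class SrvLookup {

    /** Turns SRV record data ("priority weight port target") into "target:port". */
    static String toHostPort(String srvData) {
        String[] parts = srvData.trim().split("\\s+");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not SRV record data: " + srvData);
        }
        String target = parts[3];
        // DNS answers usually carry a trailing dot on the target; drop it.
        if (target.endsWith(".")) {
            target = target.substring(0, target.length() - 1);
        }
        return target + ":" + parts[2];
    }

    /** Queries the SRV records of the given name using the OS-configured DNS servers. */
    static List<String> lookup(String name) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        DirContext ctx = new InitialDirContext(env);
        try {
            Attribute srv = ctx.getAttributes(name, new String[] {"SRV"}).get("SRV");
            List<String> result = new ArrayList<>();
            if (srv != null) {
                NamingEnumeration<?> values = srv.getAll();
                while (values.hasMore()) {
                    result.add(toHostPort(values.next().toString()));
                }
            }
            return result;
        } finally {
            ctx.close();
        }
    }

    public static void main(String[] args) throws NamingException {
        if (args.length != 1) {
            System.err.println("usage: java SrvLookup.java <srv-record-name>");
            return;
        }
        lookup(args[0]).forEach(System.out::println);
    }
}
```

Each returned entry is a candidate member address; a discovery protocol would still have to ping those members to learn their logical addresses.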

Another thing to keep in mind is that looking up pods by labels allows
any pod with the specified label to be added to the cluster. I'm not
sure of a use case for this, but it would allow other deployments to be
included in the cluster. (You could also argue that the service is the
authority for this and any pod with said label would be added as a
service endpoint, thus achieving the same behavior...probably more
simply too.)

Lastly, DNS was a little flaky when we first implemented this, which was
part of the reason we went straight to the Kubernetes API. Users were
using dnsmasq with wildcards that worked well for routes, but ended up
routing services to the router IP instead of the pod IP. Needless to say,
there were a lot of complications trying to use DNS and debug user
problems with service resolution.

Hope that helps,
Rob

------------------------------------------------------------------------

Hey Bela!

No no, the resolution can be done with pure JDK.
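
As a rough sketch of what "pure JDK" could look like: java.net.InetAddress returns every A record behind a name, which is what a headless service exposes. The class and method names below (DnsAddresses, resolveAll) are hypothetical, and the host name is taken from the command line rather than assumed.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an existing JGroups class.
final class DnsAddresses {

    /** Returns the IP addresses of all address records found for the host. */
    static List<String> resolveAll(String host) {
        List<String> ips = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                ips.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Name not (yet) registered in DNS: report no members instead of failing.
        }
        return ips;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java DnsAddresses.java <service-host-name>");
            return;
        }
        resolveAll(args[0]).forEach(System.out::println);
    }
}
```

Note that the JVM caches successful lookups, so a real protocol would probably need to account for the networkaddress.cache.ttl security property when pods come and go.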

Thanks
Sebastian

On Fri, Aug 19, 2016 at 11:18 AM, Bela Ban <bban@redhat.com> wrote:

Hi Sebastian

the usual restrictions apply: if DNS discovery depends on
external libs, then it should be hosted in jgroups-extras,
otherwise we can add it to JGroups itself.

On 19/08/16 11:00, Sebastian Laskawiec wrote:

Hey!

I've been playing with Kubernetes PetSets [1] for a while and I'd like
to share some thoughts. Before I dig in, let me give you some PetSets
highlights:

* PetSets are alpha resources for managing stateful apps in Kubernetes
  1.3 (and OpenShift Origin 1.3).
* Since this is an alpha resource, there are no guarantees about
  backwards compatibility. Alpha resources can also be disabled in
  some public cloud providers (you can control which API versions are
  accessible [2]).
* PetSets allow starting pods in sequence (not relevant for us, but
  this is a killer feature for master-slave systems).
* Each Pod has its own unique entry in DNS, which makes discovery
  very simple (I'll dig into that a bit later).
* Volumes are always mounted to the same Pods, which is very important
  in Cache Store scenarios when we restart pods (e.g. Rolling Upgrades
  [3]).

Thoughts and ideas after spending some time playing with this feature:

* PetSets make discovery a lot easier. It's a combination of two
  things: Headless Services [4], which create multiple A records in
  DNS, and predictable host names. Each Pod has its own unique DNS
  entry following the pattern {PetSetName}-{PodIndex}.{ServiceName} [5].
  Here's an example of an Infinispan PetSet deployed on my local
  cluster [6]. As you can see, we get all domain names and IPs from a
  single DNS query.
* Maybe we could perform discovery using this mechanism? I'm aware of
  the DNS discovery implemented in KUBE_PING [7][8], but the code looks
  trivial [9], so maybe it should be implemented inside JGroups? @Bela -
  WDYT?
* PetSets do not integrate well with the OpenShift 'new-app' command. In
  other words, our users will need to use the provided yaml (or json)
  files to create an Infinispan cluster. It's not a show-stopper but
  it's a bit less convenient than 'oc new-app'.
* Since PetSets are alpha resources, they need to be considered a
  secondary way to deploy Infinispan on Kubernetes and OpenShift.
* Finally, the persistent volumes: since a Pod always gets the same
  volume, it would be safe to use any file-based cache store.
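
To make the {PetSetName}-{PodIndex}.{ServiceName} pattern concrete, here is a small sketch that lists the host names a PetSet of a given size would get. It assumes pod indexes start at 0 and run up to replicas - 1; the class name and the "infinispan" / "infinispan-svc" values are invented for the example and are not taken from the yaml files.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the PetSet naming pattern.
final class PetSetPeers {

    /** Builds {PetSetName}-{PodIndex}.{ServiceName} for every pod index in [0, replicas). */
    static List<String> peerHostNames(String petSetName, String serviceName, int replicas) {
        List<String> names = new ArrayList<>();
        for (int podIndex = 0; podIndex < replicas; podIndex++) {
            names.add(petSetName + "-" + podIndex + "." + serviceName);
        }
        return names;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        peerHostNames("infinispan", "infinispan-svc", 3).forEach(System.out::println);
    }
}
```

Because the names are predictable, a static member list could in principle be generated up front, although querying the headless service has the advantage of returning only pods that actually exist.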

If you'd like to play with PetSets in your local environment, here are
the necessary yaml files [10].

Thanks
Sebastian

[1] http://kubernetes.io/docs/user-guide/petset/
[2] For checking which APIs are accessible, use 'kubectl api-versions'
[3] http://infinispan.org/docs/stable/user_guide/user_guide.html#_Rolling_chapter
[4] http://kubernetes.io/docs/user-guide/services/#headless-services
[5] http://kubernetes.io/docs/user-guide/petset/#peer-discovery
[6] https://gist.github.com/slaskawi/0866e63a39276f8ab66376229716a676
[7] https://github.com/jboss-openshift/openshift-ping/tree/master/dns
[8] https://github.com/jgroups-extras/jgroups-kubernetes/tree/master/dns
[9] http://stackoverflow.com/a/12405896/562699
[10] You might need to adjust ImageStream:
     https://gist.github.com/slaskawi/7cffb5588dabb770f654557579c5f2d0

--
Bela Ban, JGroups lead (http://www.jgroups.org)
