[infinispan-dev] Infinispan and OpenShift/Kubernetes PetSets
Sebastian Laskawiec
slaskawi at redhat.com
Mon Aug 22 02:02:23 EDT 2016
+1, I think using multiple discovery protocols could be a good idea.
I think it makes sense to use KUBE_PING and DNS together. The same
configuration could then serve both PetSets (which use DNS) and
normal Kubernetes/OpenShift deployments (which use KUBE_PING). But that
is something that would need to be tested carefully.
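For illustration only, a combined stack might look roughly like this. Note that DNS_PING is a made-up protocol name here (JGroups ships no DNS discovery protocol today), and running two discovery protocols side by side is exactly the capability Bela describes below as not yet implemented, so none of this is guaranteed to work as written:

```xml
<!-- Hypothetical sketch: neither DNS_PING nor stacking two discovery
     protocols exists in JGroups at the time of writing. -->
<config xmlns="urn:org:jgroups">
    <TCP bind_port="7800"/>
    <!-- Label-based discovery for regular Kubernetes/OpenShift deployments -->
    <kubernetes.KUBE_PING namespace="myproject"/>
    <!-- Headless-service A-record discovery for PetSet deployments -->
    <DNS_PING dns_query="infinispan-cluster.myproject.svc.cluster.local"/>
    <MERGE3/>
    <FD_ALL/>
    <VERIFY_SUSPECT/>
    <pbcast.NAKACK2/>
    <pbcast.GMS/>
</config>
```

The interesting open question is how the two result sets would be merged into one discovery response, which is the part Bela says JGroups would need to grow.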
Thanks
Sebastian
On Sat, Aug 20, 2016 at 9:19 AM, Bela Ban <bban at redhat.com> wrote:
> If we add a DNS discovery protocol, it would only be another discovery
> protocol among many, and customers can choose which one to use.
> I'm also thinking of adding the ability to JGroups to use multiple
> discovery protocols in the same stack and combine their result sets into
> one. Not sure though if it makes sense to use KUBE_PING and DNS in the same
> stack...
>
>
> On 20/08/16 00:04, Rob Cernich wrote:
>
>> A couple of things...
>>
>> re. volumes:
>> We also need to consider the mounting behavior for scale-down
>> scenarios and for overage scenarios when doing upgrades. For the
>> latter, OpenShift can spin up pods of the new version before the
>> older version's pods have terminated. This may mean that some volumes
>> from the old pods are orphaned. We saw this when testing A-MQ during
>> upgrades. With a single pod, the upgrade process caused the new
>> version to get a new mount while the original mount was left orphaned
>> (another upgrade would cause the newer pod to pick up the orphaned
>> mount, leaving the new mount orphaned). I believe we worked around
>> this by specifying an overage of 0% during upgrades, which ensured
>> the new pods would pick up the volumes left behind by the old pods.
>> (Actually, we were using subdirectories in the mount, since all pods
>> shared the same volume.)
>>
>> re. dns:
>> DNS should work fine as-is, but there are a couple things that you need
>> to consider.
>> 1. Service endpoints are only available in DNS after the pod becomes
>> ready (SRV records on the service name). Because Infinispan attaches
>> itself to the cluster, this meant pods all started as a cluster of
>> one, then merged once they noticed the other pods. This had a
>> significant impact on startup time. Since then, OpenShift has added
>> the ability to query the endpoints associated with a service as soon
>> as the pod is created, which would allow initialization to work
>> correctly. To make this work, we'd have to change the form of the DNS
>> query to pick up the service endpoints (I forget the naming scheme).
>>
>> Another thing to keep in mind is that looking up pods by labels allows
>> any pod with the specified label to be added to the cluster. I'm not
>> sure of a use case for this, but it would allow other deployments to be
>> included in the cluster. (You could also argue that the service is the
>> authority for this and any pod with said label would be added as a
>> service endpoint, thus achieving the same behavior...probably more
>> simply too.)
>>
>> Lastly, DNS was a little flaky when we first implemented this, which
>> was part of the reason we went straight to the Kubernetes API. Users
>> were running dnsmasq with wildcards that worked well for routes, but
>> ended up resolving services to the router IP instead of the pod IP.
>> Needless to say, there were a lot of complications trying to use DNS
>> and debugging user problems with service resolution.
>>
>> Hope that helps,
>> Rob
>>
>> ------------------------------------------------------------------------
>>
>> Hey Bela!
>>
>> No no, the resolution can be done with pure JDK.
>>
>> Thanks
>> Sebastian
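A minimal sketch of what that pure-JDK resolution could look like, using the JDK's built-in JNDI DNS provider (the approach from [9]). The service name and SRV record below are invented for illustration:

```java
import java.util.Hashtable;
import javax.naming.directory.Attribute;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Sketch of DNS discovery with nothing but the JDK. The service name
// and record values are made up; inside a cluster the query target
// would be the headless service's DNS name.
public class DnsDiscovery {

    // An SRV record value looks like: "10 100 8888 pod-host.example.com."
    // (priority, weight, port, target). Parse out the host and port.
    static String[] parseSrv(String record) {
        String[] parts = record.trim().split("\\s+");
        String host = parts[3].endsWith(".")
                ? parts[3].substring(0, parts[3].length() - 1)
                : parts[3];
        return new String[] { host, parts[2] };
    }

    // Query SRV records via JNDI; only works where cluster DNS is reachable.
    static Attribute lookupSrv(String name) throws Exception {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial",
                "com.sun.jndi.dns.DnsContextFactory");
        DirContext ctx = new InitialDirContext(env);
        return ctx.getAttributes(name, new String[] { "SRV" }).get("SRV");
    }

    public static void main(String[] args) {
        String[] hostPort =
                parseSrv("10 100 8888 infinispan-0.infinispan-cluster.");
        // prints infinispan-0.infinispan-cluster:8888
        System.out.println(hostPort[0] + ":" + hostPort[1]);
        // lookupSrv("_tcp.infinispan-cluster.myproject.svc.cluster.local");
    }
}
```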
>>
>> On Fri, Aug 19, 2016 at 11:18 AM, Bela Ban <bban at redhat.com
>> <mailto:bban at redhat.com>> wrote:
>>
>> Hi Sebastian
>>
>> the usual restrictions apply: if DNS discovery depends on
>> external libs, then it should be hosted in jgroups-extras,
>> otherwise we can add it to JGroups itself.
>>
>> On 19/08/16 11:00, Sebastian Laskawiec wrote:
>>
>> Hey!
>>
>> I've been playing with Kubernetes PetSets [1] for a while and I'd
>> like to share some thoughts. Before I dig in, let me give you some
>> PetSets highlights:
>>
>> * PetSets are alpha resources for managing stateful apps in
>>   Kubernetes 1.3 (and OpenShift Origin 1.3).
>> * Since this is an alpha resource, there are no guarantees about
>>   backwards compatibility. Alpha resources can also be disabled by
>>   some public cloud providers (you can control which API versions
>>   are accessible [2]).
>> * PetSets allow starting pods in sequence (not relevant for us, but
>>   this is a killer feature for master-slave systems).
>> * Each Pod has its own unique entry in DNS, which makes discovery
>>   very simple (I'll dig into that a bit later).
>> * Volumes are always mounted to the same Pods, which is very
>>   important in Cache Store scenarios when we restart pods (e.g.
>>   Rolling Upgrades [3]).
>>
>> Thoughts and ideas after spending some time playing with this
>> feature:
>>
>> * PetSets make discovery a lot easier. It's a combination of two
>>   things - Headless Services [4], which create multiple A records in
>>   DNS, and predictable host names. Each Pod has its own unique DNS
>>   entry following the pattern
>>   {PetSetName}-{PodIndex}.{ServiceName} [5]. Here's an example of an
>>   Infinispan PetSet deployed on my local cluster [6]. As you can
>>   see, we get all domain names and IPs from a single DNS query.
>> * Maybe we could perform discovery using this mechanism? I'm aware
>>   of the DNS discovery implemented in KUBE_PING [7][8], but the code
>>   looks trivial [9], so maybe it should be implemented inside
>>   JGroups? @Bela - WDYT?
>> * PetSets do not integrate well with the OpenShift 'new-app'
>>   command. In other words, our users will need to use the provided
>>   yaml (or json) files to create an Infinispan cluster. It's not a
>>   show-stopper, but it's a bit less convenient than 'oc new-app'.
>> * Since PetSets are alpha resources, they need to be considered a
>>   secondary way to deploy Infinispan on Kubernetes and OpenShift.
>> * Finally, the persistent volumes - since a Pod always gets the same
>>   volume, it would be safe to use any file-based cache store.
>>
>> If you'd like to play with PetSets on your local environment, here
>> are the necessary yaml files [10].
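The predictable naming pattern above can be sketched with plain JDK calls. A hypothetical example (the PetSet and service names are made up):

```java
import java.net.InetAddress;
import java.util.ArrayList;
import java.util.List;

// Sketch: derive the predictable per-pod DNS names of a PetSet and
// resolve them with nothing but the JDK. "infinispan",
// "infinispan-cluster" and the replica count are invented values.
public class PetSetDns {

    // Pattern from the PetSet docs [5]: {PetSetName}-{PodIndex}.{ServiceName}
    static List<String> podNames(String petSetName, String serviceName,
                                 int replicas) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < replicas; i++) {
            names.add(petSetName + "-" + i + "." + serviceName);
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        for (String name : podNames("infinispan", "infinispan-cluster", 3)) {
            System.out.println(name);
            // Inside the cluster each name resolves to a single pod IP;
            // plain InetAddress.getAllByName(name) suffices - no extra libs.
            // InetAddress[] addrs = InetAddress.getAllByName(name);
        }
    }
}
```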
>>
>> Thanks
>> Sebastian
>>
>>
>> [1] http://kubernetes.io/docs/user-guide/petset/
>> [2] For checking which APIs are accessible, use 'kubectl api-versions'
>> [3] http://infinispan.org/docs/stable/user_guide/user_guide.html#_Rolling_chapter
>> [4] http://kubernetes.io/docs/user-guide/services/#headless-services
>> [5] http://kubernetes.io/docs/user-guide/petset/#peer-discovery
>> [6] https://gist.github.com/slaskawi/0866e63a39276f8ab66376229716a676
>> [7] https://github.com/jboss-openshift/openshift-ping/tree/master/dns
>> [8] https://github.com/jgroups-extras/jgroups-kubernetes/tree/master/dns
>> [9] http://stackoverflow.com/a/12405896/562699
>> [10] You might need to adjust the ImageStream.
>> https://gist.github.com/slaskawi/7cffb5588dabb770f654557579c5f2d0
>>
>>
>> --
>> Bela Ban, JGroups lead (http://www.jgroups.org)
>>
>>
>>
>>
> --
> Bela Ban, JGroups lead (http://www.jgroups.org)
>
>