[keycloak-dev] Infinispan error during update in ha configuration

Мартынов Илья imartynovsp at gmail.com
Mon Jul 8 07:35:10 EDT 2019


I've tried the split_clusters_during_rolling_update option, but KUBE_PING
starts failing with an NPE:
Caused by: java.lang.NullPointerException
        at java.util.Objects.requireNonNull(Objects.java:203)
        at java.util.Optional.<init>(Optional.java:96)
        at java.util.Optional.of(Optional.java:108)
        at java.util.stream.FindOps$FindSink$OfRef.get(FindOps.java:193)
        at java.util.stream.FindOps$FindSink$OfRef.get(FindOps.java:190)
        at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:464)
        at org.jgroups.protocols.kubernetes.KUBE_PING.findMembers(KUBE_PING.java:240)
        at org.jgroups.protocols.Discovery.findMembers(Discovery.java:211)
        at org.jgroups.protocols.Discovery.down(Discovery.java:350)

While debugging, I see the code in KUBE_PING that tries to find the pod's
parent deployment:
                String senderParentDeployment = hosts.stream()
                      .filter(pod -> senderIp.contains(pod.getIp()))
                      .map(Pod::getParentDeployment)
                      .findFirst().orElse(null);
This code fails with an NPE, as Pod::getParentDeployment returns null. The
parent deployment is not set because KUBE_PING cannot read the deployment
name from the JSON describing the pod.
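The failure is easy to reproduce in isolation: Stream.findFirst() wraps the
first element in Optional.of(), which rejects null. Below is a minimal sketch
(the Pod record is a hypothetical stand-in for the jgroups-kubernetes class,
not the real one), together with a null-safe variant that filters out null
deployments before findFirst():

```java
import java.util.List;
import java.util.Objects;
import java.util.Optional;

public class FindFirstNpeDemo {
    // Hypothetical stand-in for the jgroups-kubernetes Pod class.
    record Pod(String ip, String parentDeployment) { }

    // Null-safe variant of the lookup: drop null parent deployments
    // before findFirst(), so Optional.of() never sees a null element.
    static Optional<String> findParentDeployment(List<Pod> hosts, String senderIp) {
        return hosts.stream()
                .filter(pod -> senderIp.contains(pod.ip()))
                .map(Pod::parentDeployment)
                .filter(Objects::nonNull)
                .findFirst();
    }

    public static void main(String[] args) {
        // A pod whose parentDeployment could not be parsed from the JSON
        List<Pod> hosts = List.of(new Pod("10.1.2.3", null));

        // The original pattern throws: findFirst() calls Optional.of(null)
        try {
            hosts.stream()
                 .filter(pod -> "10.1.2.3".contains(pod.ip()))
                 .map(Pod::parentDeployment)
                 .findFirst();
            System.out.println("no NPE");
        } catch (NullPointerException e) {
            System.out.println("NPE, as in KUBE_PING.findMembers");
        }

        // The filtered variant returns an empty Optional instead
        System.out.println(findParentDeployment(hosts, "10.1.2.3").isEmpty());
    }
}
```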
There is already a bug on KUBE_PING for it:
https://github.com/jgroups-extras/jgroups-kubernetes/issues/50


On Mon, Jul 8, 2019 at 10:25, Sebastian Laskawiec <slaskawi at redhat.com> wrote:

> Ok, I'm glad it worked.
>
> Just for the future - with KUBE_PING, there's an additional option:
> "split_clusters_during_rolling_update" set to true. See the documentation:
> https://github.com/jgroups-extras/jgroups-kubernetes#kube_ping-configuration
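> For reference, a sketch of where that option lives in a plain JGroups
> stack XML (attribute names per the jgroups-kubernetes README; the
> namespace value is illustrative):
>
> ```xml
> <kubernetes.KUBE_PING
>     namespace="keycloak"
>     split_clusters_during_rolling_update="true"/>
> ```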
>
> On Wed, Jul 3, 2019 at 4:32 PM Мартынов Илья <imartynovsp at gmail.com>
> wrote:
>
>> Thanks Sebastian,
>> I use the KUBE_PING JGroups discovery plugin; it seems to be more reliable
>> than IP multicasting.
>> In the end I went with choice #1 by simply changing the Deployment strategy
>> in Kubernetes from the default "RollingUpdate" to "Recreate". Recreate does
>> exactly what we need: it first drops all existing pods and only then creates
>> pods with the new version.
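>> For reference, the switch is a single field in the Deployment spec (the
>> metadata name is illustrative):
>>
>> ```yaml
>> apiVersion: apps/v1
>> kind: Deployment
>> metadata:
>>   name: keycloak          # illustrative
>> spec:
>>   strategy:
>>     type: Recreate        # default is RollingUpdate
>> ```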
>>
>>
>>
>> On Wed, Jul 3, 2019 at 15:15, Sebastian Laskawiec <slaskawi at redhat.com> wrote:
>>
>>> If you're using standalone-ha.xml without any extra parameters, you're
>>> using UDP JGroups stack with IP Multicasting discovery. You also suggested
>>> that you're using Pods, so I'm assuming you're using Kubernetes with
>>> Flannel network plugin (as far as I know, other plugins do not support IP
>>> Multicasting out of the box).
>>>
>>> So effectively you have 2 choices:
>>> - Turn everything off, do the migration, and start it again. Note that
>>> you need to shut everything down (this is not a rolling update procedure).
>>> - Reconfigure JGroups not to join the same cluster. The easiest thing to
>>> do here is to modify the cluster attribute and make sure it's different for
>>> the old and new cluster.
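>>> In WildFly CLI terms, that would be something along these lines (the
>>> channel name "ee" is the WildFly default and the new cluster name is an
>>> illustrative value, so check your standalone-ha.xml first):
>>>
>>> ```
>>> /subsystem=jgroups/channel=ee:write-attribute(name=cluster, value=keycloak-v2)
>>> ```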
>>>
>>> Thanks,
>>> Sebastian
>>>
>>> On Wed, Jul 3, 2019 at 11:10 AM Мартынов Илья <imartynovsp at gmail.com>
>>> wrote:
>>>
>>>> Just another question. Actually, I don't even need a zero-downtime
>>>> upgrade, I need just any upgrade. Right now the upgrade is blocked
>>>> because the new pod cannot start.
>>>> Can we somehow make the new pod ignore non-readable messages from the old
>>>> pod and continue loading?
>>>> Yes, session state won't be replicated, but it's a more automated approach
>>>> than manually stopping the old pod and starting a new one afterwards.
>>>>
>>>>
>>>>
>>>> On Wed, Jul 3, 2019 at 09:39, Мартынов Илья <imartynovsp at gmail.com> wrote:
>>>>
>>>> > Hi Marek,
>>>> >
>>>> > Ok, I got it, thank you for response!
>>>> >
>>>> > On Tue, Jul 2, 2019 at 19:25, Marek Posolda <mposolda at redhat.com> wrote:
>>>> >
>>>> >> I think you can't mix old and new Keycloak servers in the same
>>>> >> cluster. And rolling upgrade (zero-downtime upgrade) is not yet
>>>> >> supported. We plan to add support for it, but it won't be in the very
>>>> >> near future, as it will likely require quite a lot of work...
>>>> >>
>>>> >> In short, the recommendation is to stop all pods with 4.5.0 and then
>>>> >> start pods with 6.0.1.
>>>> >>
>>>> >> Marek
>>>> >>
>>>> >>
>>>> >> On 02/07/2019 16:25, Мартынов Илья wrote:
>>>> >> > Hello!
>>>> >> > I have Keycloak 4.5.0.Final deployed in a standalone-ha configuration
>>>> >> > in a k8s cluster. When I try to update Keycloak to version 6.0.1, the
>>>> >> > following happens:
>>>> >> > 1. K8s starts a new pod with version 6.0.1
>>>> >> > 2. The old pod is still running; it will be terminated on a successful
>>>> >> > readiness probe of the new pod
>>>> >> > 3. The new pod discovers the old pod via JGroups, and cache
>>>> >> > synchronization starts
>>>> >> > 4. Exception in the new pod:
>>>> >> > 02-07-2019;13:34:29,220 WARN [stateTransferExecutor-thread--p25-t1]
>>>> >> > org.infinispan.statetransfer.InboundTransferTask ISPN000210: Failed to
>>>> >> > request state of cache work from node idp-6569c544b-hsd6g,
>>>> >> > segments {0-255}: org.infinispan.remoting.RemoteException:
>>>> >> > ISPN000217: Received exception from idp-6569c544b-hsd6g, see cause for
>>>> >> > remote stack trace
>>>> >> >          at org.infinispan at 9.4.8.Final
>>>> >> > //org.infinispan.remoting.transport.ResponseCollectors.wrapRemoteException(ResponseCollectors.java:28)
>>>> >> > ...
>>>> >> > Caused by: java.io.IOException: Unknown type: 132
>>>> >> >          at org.infinispan.marshall.core.GlobalMarshaller.readNonNullableObject(GlobalMarshaller.java:681)
>>>> >> >          at org.infinispan.marshall.core.GlobalMarshaller.readNullableObject(GlobalMarshaller.java:355)
>>>> >> >          at org.infinispan.marshall.core.BytesObjectInput.readObject(BytesObjectInput.java:40)
>>>> >> >
>>>> >> > This exception seems to block further Keycloak startup, because
>>>> >> > nothing happens in the logs afterwards. Also, my REST service,
>>>> >> > deployed as a JAX-RS bean, doesn't respond, so the pod is not treated
>>>> >> > as alive by Kubernetes.
>>>> >> > Please help.
>>>> >> > _______________________________________________
>>>> >> > keycloak-dev mailing list
>>>> >> > keycloak-dev at lists.jboss.org
>>>> >> > https://lists.jboss.org/mailman/listinfo/keycloak-dev
>>>> >>
>>>> >>
>>>> >>
>>>
>>>

