[keycloak-dev] Infinispan error during update in ha configuration

Mon Jul 8 08:27:46 EDT 2019

oooh right, I guess you hit this issue:
https://github.com/jgroups-extras/jgroups-kubernetes/issues/50

Sorry to hear that. Perhaps you'd be interested in contributing a fix to
KUBE_PING?
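
For anyone picking this up, below is a minimal sketch of why the snippet quoted further down throws, reproduced outside of KUBE_PING. The Pod class here is a hypothetical stand-in with only the two getters the quoted code uses, and safeLookup is just one possible null-safe rewrite (an assumption on my part, not the actual upstream fix). The underlying behaviour is standard Java: Stream.findFirst() throws a NullPointerException when the selected element is null, while Optional.map() turns a null result into an empty Optional.

```java
import java.util.Collections;
import java.util.List;

class FindFirstNullDemo {

    // Hypothetical stand-in for KUBE_PING's Pod; only the getters used in the quoted snippet.
    static final class Pod {
        private final String ip;
        private final String parentDeployment;

        Pod(String ip, String parentDeployment) {
            this.ip = ip;
            this.parentDeployment = parentDeployment;
        }

        String getIp() { return ip; }

        String getParentDeployment() { return parentDeployment; }
    }

    // Same shape as the quoted code: map() may yield null, and findFirst()
    // rejects a null element, so orElse(null) is never reached.
    static String unsafeLookup(List<Pod> hosts, String senderIp) {
        return hosts.stream()
                .filter(pod -> senderIp.contains(pod.getIp()))
                .map(Pod::getParentDeployment)
                .findFirst().orElse(null);
    }

    // One possible rewrite: pick the (non-null) Pod first, then map on the
    // Optional, which converts a null parent deployment into an empty Optional.
    static String safeLookup(List<Pod> hosts, String senderIp) {
        return hosts.stream()
                .filter(pod -> senderIp.contains(pod.getIp()))
                .findFirst()
                .map(Pod::getParentDeployment)
                .orElse(null);
    }

    public static void main(String[] args) {
        // A pod whose parent deployment could not be read, as described in the thread.
        List<Pod> hosts = Collections.singletonList(new Pod("10.0.0.5", null));

        try {
            unsafeLookup(hosts, "10.0.0.5");
        } catch (NullPointerException e) {
            System.out.println("unsafeLookup threw NullPointerException");
        }

        System.out.println("safeLookup returned: " + safeLookup(hosts, "10.0.0.5"));
    }
}
```

Whether returning null is the right behaviour when the parent deployment is unknown is a separate design question for the KUBE_PING maintainers; the sketch only shows how to avoid the exception.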

On Mon, Jul 8, 2019 at 1:35 PM Мартынов Илья <imartynovsp at gmail.com> wrote:

> I've tried the split_clusters_during_rolling_update option, but KUBE_PING
> starts failing with an NPE:
> Caused by: java.lang.NullPointerException
>         at java.util.Objects.requireNonNull(Objects.java:203)
>         at java.util.Optional.<init>(Optional.java:96)
>         at java.util.Optional.of(Optional.java:108)
>         at java.util.stream.FindOps$FindSink$OfRef.get(FindOps.java:193)
>         at java.util.stream.FindOps$FindSink$OfRef.get(FindOps.java:190)
>         at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
>         at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>         at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:464)
>         at org.jgroups.protocols.kubernetes.KUBE_PING.findMembers(KUBE_PING.java:240)
>         at org.jgroups.protocols.Discovery.findMembers(Discovery.java:211)
>         at org.jgroups.protocols.Discovery.down(Discovery.java:350)
>
> In the debugger I see code in KUBE_PING that tries to find the pod's
> parent deployment:
>                 String senderParentDeployment = hosts.stream()
>                       .filter(pod -> senderIp.contains(pod.getIp()))
>                       .map(Pod::getParentDeployment)
>                       .findFirst().orElse(null);
> This code fails with an NPE because Pod::getParentDeployment returns null
> for the matching pod, and findFirst() does not accept a null element. The
> parent deployment is not set because KUBE_PING cannot read the deployment
> from the JSON describing the pod.
> There is already a bug filed against KUBE_PING for this:
> https://github.com/jgroups-extras/jgroups-kubernetes/issues/50
>
>
> On Mon, Jul 8, 2019 at 10:25, Sebastian Laskawiec <slaskawi at redhat.com> wrote:
>
>> Ok, I'm glad it worked.
>>
>> For future reference: KUBE_PING has an additional option,
>> "split_clusters_during_rolling_update", which you can set to true. See the documentation:
>> https://github.com/jgroups-extras/jgroups-kubernetes#kube_ping-configuration
>>
>> On Wed, Jul 3, 2019 at 4:32 PM Мартынов Илья <imartynovsp at gmail.com>
>> wrote:
>>
>>> Thanks Sebastian,
>>> I use the KUBE_PING JGroups discovery plugin; it seems to be more reliable
>>> than IP multicasting.
>>> In the end I went with choice #1 by simply changing the Deployment strategy
>>> in Kubernetes from the default "RollingUpdate" to "Recreate". Recreate does
>>> exactly what we need: it first drops all existing pods and only then creates
>>> pods with the new version.
>>>
>>>
>>>
>>> On Wed, Jul 3, 2019 at 15:15, Sebastian Laskawiec <slaskawi at redhat.com> wrote:
>>>
>>>> If you're using standalone-ha.xml without any extra parameters, you're
>>>> using the UDP JGroups stack with IP multicast discovery. You also mentioned
>>>> that you're using Pods, so I'm assuming you're running Kubernetes with the
>>>> Flannel network plugin (as far as I know, other plugins do not support IP
>>>> multicasting out of the box).
>>>>
>>>> So effectively you have 2 choices:
>>>> - Turn everything off, do the migration and start it again. Note that
>>>> you need to shut everything down (this is not a rolling update procedure).
>>>> - Reconfigure JGroups so that the new nodes do not join the same cluster.
>>>> The easiest thing to do here is to modify the cluster attribute and make
>>>> sure it's different for the old and new cluster.
>>>>
>>>> Thanks,
>>>> Sebastian
>>>>
>>>> On Wed, Jul 3, 2019 at 11:10 AM Мартынов Илья <imartynovsp at gmail.com>
>>>> wrote:
>>>>
>>>>> Just another question. Actually, I don't even need a zero-downtime
>>>>> upgrade, I just need any upgrade. Right now the upgrade is blocked because
>>>>> the new pod cannot start.
>>>>> Can we somehow make the new pod ignore unreadable messages from the old
>>>>> pod and continue loading?
>>>>> Yes, session state won't be replicated, but it's a more automated way
>>>>> than manually stopping the old pod and starting a new pod after that.
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jul 3, 2019 at 09:39, Мартынов Илья <imartynovsp at gmail.com> wrote:
>>>>>
>>>>> > Hi Marek,
>>>>> >
>>>>> > Ok, I got it, thank you for the response!
>>>>> >
>>>>> > On Tue, Jul 2, 2019 at 19:25, Marek Posolda <mposolda at redhat.com> wrote:
>>>>> >
>>>>> >> I think you can't mix old and new Keycloak servers in the same cluster,
>>>>> >> and rolling upgrade (zero-downtime upgrade) is not yet supported. We plan
>>>>> >> to add support for it, but it won't be in the very near future as it will
>>>>> >> likely require quite a lot of work...
>>>>> >>
>>>>> >> In short, it is recommended to stop all pods with 4.5.0 and then
>>>>> >> start pods with 6.0.1.
>>>>> >>
>>>>> >> Marek
>>>>> >>
>>>>> >>
>>>>> >> On 02/07/2019 16:25, Мартынов Илья wrote:
>>>>> >> > Hello!
>>>>> >> > I have Keycloak 4.5.0.Final deployed in standalone-ha configuration in
>>>>> >> > a k8s cluster. When I try to update Keycloak to version 6.0.1, the
>>>>> >> > following happens:
>>>>> >> > 1. K8s starts a new pod with version 6.0.1
>>>>> >> > 2. The old pod is still running; it will be terminated on a successful
>>>>> >> > readiness probe of the new pod
>>>>> >> > 3. The new pod discovers the old pod via JGroups, and cache
>>>>> >> > synchronization starts
>>>>> >> > 4. Exception in the new pod:
>>>>> >> > 02-07-2019;13:34:29,220 WARN [stateTransferExecutor-thread--p25-t1]
>>>>> >> > org.infinispan.statetransfer.InboundTransferTask ISPN000210: Failed to
>>>>> >> > request state of cache work from node idp-6569c544b-hsd6g, segments {0-255}:
>>>>> >> > org.infinispan.remoting.RemoteException:
>>>>> >> > ISPN000217: Received exception from idp-6569c544b-hsd6g, see cause for
>>>>> >> > remote stack trace
>>>>> >> >          at org.infinispan@9.4.8.Final//org.infinispan.remoting.transport.ResponseCollectors.wrapRemoteException(ResponseCollectors.java:28)
>>>>> >> > ...
>>>>> >> > Caused by: java.io.IOException: Unknown type: 132
>>>>> >> >          at org.infinispan.marshall.core.GlobalMarshaller.readNonNullableObject(GlobalMarshaller.java:681)
>>>>> >> >          at org.infinispan.marshall.core.GlobalMarshaller.readNullableObject(GlobalMarshaller.java:355)
>>>>> >> >          at org.infinispan.marshall.core.BytesObjectInput.readObject(BytesObjectInput.java:40)
>>>>> >> >
>>>>> >> > It looks like this exception blocks further Keycloak startup, because
>>>>> >> > nothing appears in the logs afterwards. My REST service, deployed as a
>>>>> >> > JAX-RS bean, doesn't respond either, so the pod is not treated as alive
>>>>> >> > by Kubernetes.
>>>>> >> > Please help.
>>>>> >> > _______________________________________________
>>>>> >> > keycloak-dev mailing list
>>>>> >> > keycloak-dev at lists.jboss.org
>>>>> >> > https://lists.jboss.org/mailman/listinfo/keycloak-dev