[infinispan-issues] [JBoss JIRA] (ISPN-9345) TimeoutException involving the org.infinispan.CONFIG cache

Dan Berindei (JIRA) issues at jboss.org
Tue Jul 3 10:31:02 EDT 2018


    [ https://issues.jboss.org/browse/ISPN-9345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600253#comment-13600253 ] 

Dan Berindei edited comment on ISPN-9345 at 7/3/18 10:30 AM:
-------------------------------------------------------------

Separate JVMs aren't needed: starting 2 nodes in the same JVM also reproduces the problem.

I've tracked it down to the fact that JGroups uses IPv6 addresses by default on Linux. Even if {{UDP.mcast_addr}} is set to an IPv4 address, JGroups converts it to an IPv6 address. [IPv6 routers never fragment packets in transit|https://labs.ripe.net/Members/gih/fragmenting-ipv6], so packets bigger than the MTU are dropped:

{noformat}16:33:40.584803 IP6 denulu-tp3 > ff0e::e406:708: frag (0|1448) 40689 > 46655: UDP, bad length 1453 > 1440{noformat}

(This happens on a wifi network with {{mtu=1500}}, according to ifconfig, and {{FRAG3.frag_size="1391"}})
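
For reference, the destination in the tcpdump line can be decoded to see which IPv4 group it corresponds to. This is only a minimal sketch: it assumes the conversion keeps the IPv4 bytes in the low 32 bits of the IPv6 address, which is what the capture suggests but which I have not checked against the JGroups code. The class and method names are made up for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class McastAddrDecode {
    // Returns the last 4 bytes of an IPv6 literal in dotted-quad form.
    // Assumption: the IPv4 group address is embedded in the low 32 bits.
    static String low32AsIpv4(String ipv6Literal) {
        byte[] raw;
        try {
            // A literal address is parsed directly, no DNS lookup is involved.
            raw = InetAddress.getByName(ipv6Literal).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address: " + ipv6Literal, e);
        }
        if (raw.length != 16) {
            throw new IllegalArgumentException("Not an IPv6 address: " + ipv6Literal);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 12; i < 16; i++) {
            if (i > 12) {
                sb.append('.');
            }
            sb.append(raw[i] & 0xff); // bytes are signed in Java, mask to 0..255
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Destination address taken from the tcpdump output above.
        System.out.println(low32AsIpv4("ff0e::e406:708")); // prints 228.6.7.8
    }
}
```

So the {{UDP.mcast_addr}} in use here was presumably {{228.6.7.8}}, sent under the {{ff0e::}} multicast prefix.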

It would be great if we could report this kind of misconfiguration to the user, but I don't think we can discover this without native code. WDYT [~belaban], maybe we should broadcast a huge message on startup and fail if we don't get a reply back from the coordinator?

As a workaround we can switch to the IPv4 stack with {{-Djava.net.preferIPv4Stack=true}}, switch to the TCP stack, or reduce the fragment size. Apparently 1280 is the minimum supported MTU in IPv6, so I suggest we change the default fragment size to {{1200}} (1280-(1453-1391)=1218) *and* use the TCP stack by default. We can add a paragraph to the user guide suggesting UDP with a higher fragment size as a riskier and possibly faster alternative.
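
The arithmetic behind the suggested default can be written out as a small sketch. It mirrors the formula above and assumes the per-packet overhead equals the difference between the observed 1453-byte UDP length and the configured {{frag_size}} of 1391; it does not model IP or UDP headers separately. The class and method names are hypothetical.

```java
public class FragSizeEstimate {
    // Bytes added on top of frag_size in the observed packet (assumed constant).
    static int overhead(int observedUdpLength, int fragSize) {
        return observedUdpLength - fragSize;
    }

    // Largest frag_size that stays within the given MTU, using the same
    // subtraction as in the comment: mtu - (observedUdpLength - fragSize).
    static int maxFragSize(int mtu, int observedUdpLength, int fragSize) {
        return mtu - overhead(observedUdpLength, fragSize);
    }

    public static void main(String[] args) {
        int ipv6MinMtu = 1280;   // minimum supported MTU in IPv6
        int observed = 1453;     // UDP length reported by tcpdump
        int fragSize = 1391;     // FRAG3.frag_size in the failing setup

        System.out.println("overhead = " + overhead(observed, fragSize));                   // 62
        System.out.println("max frag_size = " + maxFragSize(ipv6MinMtu, observed, fragSize)); // 1218
    }
}
```

Rounding 1218 down to 1200 leaves a small margin in case the overhead varies between messages.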


> TimeoutException involving the org.infinispan.CONFIG cache
> ----------------------------------------------------------
>
>                 Key: ISPN-9345
>                 URL: https://issues.jboss.org/browse/ISPN-9345
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 9.3.0.Final
>            Reporter: Gustavo Fernandes
>
> {noformat}
> Caused by: org.infinispan.commons.CacheException: Initial state transfer timed out for cache org.infinispan.CONFIG on jedha-64980
>     at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:233)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly$0(SecurityActions.java:79)
> {noformat}


