[infinispan-issues] [JBoss JIRA] (ISPN-9345) TimeoutException involving the org.infinispan.CONFIG cache
Dan Berindei (JIRA)
issues at jboss.org
Tue Jul 3 10:31:02 EDT 2018
[ https://issues.jboss.org/browse/ISPN-9345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600253#comment-13600253 ]
Dan Berindei edited comment on ISPN-9345 at 7/3/18 10:30 AM:
-------------------------------------------------------------
Separate JVMs aren't needed: starting 2 nodes in the same JVM also reproduces the problem.
I've tracked it down to the fact that JGroups uses IPv6 addresses by default on Linux. Even if {{UDP.mcast_addr}} is set to an IPv4 address, JGroups converts it to an IPv6 address. [IPv6 routers never fragment packets in transit|https://labs.ripe.net/Members/gih/fragmenting-ipv6], so packets bigger than the MTU are dropped:
{noformat}16:33:40.584803 IP6 denulu-tp3 > ff0e::e406:708: frag (0|1448) 40689 > 46655: UDP, bad length 1453 > 1440{noformat}
(This happens on a wifi network with {{mtu=1500}}, according to ifconfig, and {{FRAG3.frag_size="1391"}})
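To make the address conversion visible, here is a minimal sketch that decodes the low 32 bits of the destination in the capture back into dotted-quad form. The class and method names are made up for illustration, and it only assumes that the IPv4 address is embedded unchanged in the last four bytes, which is what the capture suggests; it is not JGroups code.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of JGroups or Infinispan.
class McastAddressDecoder {

    /** Returns the last four bytes of an IPv6 literal as a dotted-quad string. */
    static String embeddedIpv4(String ipv6Literal) {
        byte[] raw;
        try {
            // A literal address is parsed locally, no DNS lookup is involved.
            raw = InetAddress.getByName(ipv6Literal).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a valid address: " + ipv6Literal, e);
        }
        if (raw.length != 16) {
            throw new IllegalArgumentException("not an IPv6 address: " + ipv6Literal);
        }
        // Mask with 0xff because Java bytes are signed.
        return (raw[12] & 0xff) + "." + (raw[13] & 0xff) + "."
                + (raw[14] & 0xff) + "." + (raw[15] & 0xff);
    }

    public static void main(String[] args) {
        // Destination address taken from the tcpdump line above.
        System.out.println(embeddedIpv4("ff0e::e406:708")); // prints 228.6.7.8
    }
}
```

So the multicast group in the capture would correspond to an IPv4 {{UDP.mcast_addr}} of 228.6.7.8 carried under an {{ff0e::}} prefix.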
It would be great if we could report this kind of misconfiguration to the user, but I don't think we can discover this without native code. WDYT [~belaban], maybe we should broadcast a huge message on startup and fail if we don't get a reply back from the coordinator?
As a workaround we can switch to the IPv4 stack with {{-Djava.net.preferIPv4Stack=true}}, switch to the TCP stack, or reduce the fragment size. Apparently 1280 is the minimum supported MTU in IPv6, so I suggest we change the default fragment size to {{1200}} (1280-(1453-1391)=1218) *and* use the TCP stack by default. We can add a paragraph to the user guide suggesting UDP with a higher fragment size as a riskier and possibly faster alternative.
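The fragment size arithmetic can be spelled out as a small sketch. The class and method names are hypothetical; the numbers 1453, 1391 and 1280 come from the capture and the comment above. The second calculation additionally assumes that 1280 is the link MTU, so the fixed IPv6 header (40 bytes) and UDP header (8 bytes) also have to fit; that stricter reading is an assumption of this sketch, not something the comment states.

```java
// Hypothetical helper, not part of JGroups or Infinispan.
class FragSizeBudget {
    static final int IPV6_HEADER_BYTES = 40; // fixed IPv6 header size
    static final int UDP_HEADER_BYTES = 8;   // UDP header size

    /** Bytes added on top of frag_size, derived from an observed UDP payload. */
    static int overhead(int observedUdpPayload, int fragSize) {
        return observedUdpPayload - fragSize;
    }

    /** Largest frag_size that still fits in the given byte budget. */
    static int maxFragSize(int budget, int overhead) {
        return budget - overhead;
    }

    public static void main(String[] args) {
        int overhead = overhead(1453, 1391);                 // 62
        // Arithmetic as written in the comment: 1280 - 62.
        System.out.println(maxFragSize(1280, overhead));     // prints 1218
        // Stricter reading (assumption): 1280 is the link MTU.
        int payloadBudget = 1280 - IPV6_HEADER_BYTES - UDP_HEADER_BYTES;
        System.out.println(maxFragSize(payloadBudget, overhead)); // prints 1170
    }
}
```

Under the stricter reading a default of {{1200}} would still exceed a 1280-byte MTU, so the exact default may be worth double-checking against a capture.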
> TimeoutException involving the org.infinispan.CONFIG cache
> ----------------------------------------------------------
>
> Key: ISPN-9345
> URL: https://issues.jboss.org/browse/ISPN-9345
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.3.0.Final
> Reporter: Gustavo Fernandes
>
> {noformat}
> Caused by: org.infinispan.commons.CacheException: Initial state transfer timed out for cache org.infinispan.CONFIG on jedha-64980
> at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:233)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly$0(SecurityActions.java:79)
> {noformat}