Dan Berindei created ISPN-12235:
-----------------------------------
Summary: Perform multicast test on startup
Key: ISPN-12235
URL: https://issues.redhat.com/browse/ISPN-12235
Project: Infinispan
Issue Type: Feature Request
Components: Core
Affects Versions: 11.0.3.Final
Reporter: Dan Berindei
Fix For: 12.0.0.Final
UDP, and IP multicast in particular, is not reliable in some environments. E.g.
* In our test environment datagrams bigger than 9KB are sometimes dropped, causing poor
performance.
* In some IPv6 environments datagrams bigger than the MTU are dropped instead of being
fragmented when receiving an ICMP 4 "The datagram is too big. Packet fragmentation is
required but the 'don't fragment' (DF) flag is on." packet.
* Multicast groups sometimes disappear [with IGMP snooping
enabled|https://access.redhat.com/solutions/22169].
Users can diagnose these problems by running JGroups' {{McastSenderTest}} and
{{McastReceiverTest}} on all the nodes, but it's a manual process, and users first have to
suspect that the network is the problem. This kind of issue first
appears in the log as a generic timeout error, e.g.
{noformat}
org.infinispan.commons.CacheException: Initial state transfer timed out for cache org.infinispan.CONFIG on Node
    at org.infinispan.statetransfer.StateTransferManagerImpl.waitForInitialStateTransferToComplete(StateTransferManagerImpl.java:246)
{noformat}
We should try to help the user by sending a big multicast message at startup and
failing if we don't get responses from all other members.
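The bookkeeping for such a startup check could look roughly like the sketch below. It is only an illustration, not Infinispan or JGroups code: the class name {{MulticastStartupCheck}}, its methods, and the 16KB probe size are all hypothetical, and the actual sending and receiving over the transport is left out. It shows how to build a probe payload larger than the sizes that get dropped, and how to fail when some member did not answer.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Infinispan or JGroups.
final class MulticastStartupCheck {
    // Assumed probe size: comfortably above the 9KB datagrams mentioned in this issue.
    static final int PROBE_SIZE = 16 * 1024;

    // Builds a payload of exactly 'size' bytes: the sender name, a newline, then padding.
    static byte[] buildProbe(String sender, int size) {
        byte[] header = (sender + "\n").getBytes(StandardCharsets.UTF_8);
        if (header.length > size) {
            throw new IllegalArgumentException("Probe size too small for sender name");
        }
        byte[] probe = Arrays.copyOf(header, size);
        Arrays.fill(probe, header.length, size, (byte) 'x');
        return probe;
    }

    // Returns the members (other than ourselves) that did not respond to the probe.
    static Set<String> missingResponders(Set<String> members, String self, Set<String> responders) {
        Set<String> missing = new TreeSet<>(members);
        missing.remove(self);
        missing.removeAll(responders);
        return missing;
    }

    // Fails startup with a network-specific message instead of a generic timeout.
    static void failIfIncomplete(Set<String> missing) {
        if (!missing.isEmpty()) {
            throw new IllegalStateException(
                "Multicast test failed, no response from: " + missing);
        }
    }
}
```

With members {{NodeA}}, {{NodeB}}, {{NodeC}}, a local node {{NodeA}} and a single response from {{NodeB}}, {{missingResponders}} returns a set containing only {{NodeC}}, and {{failIfIncomplete}} throws with that node named in the message.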
An alternative would be to expose a multicast test as a Console/CLI operation and invoke
it automatically when a diagnostic report is generated.