Tristan Tarrant updated ISPN-4846:
----------------------------------
Fix Version/s: 9.4.2.Final
(was: 9.4.1.Final)
State transfer keeps trying to fetch transaction data after the cache
was stopped
---------------------------------------------------------------------------------
Key: ISPN-4846
URL: https://issues.jboss.org/browse/ISPN-4846
Project: Infinispan
Issue Type: Bug
Components: Core, State Transfer
Affects Versions: 7.0.0.CR1
Reporter: Dan Berindei
Priority: Major
Fix For: 9.4.2.Final
StateConsumerImpl doesn't check whether the cache is stopped while fetching transaction
data; it only stops when it can no longer find providers for the transactions.
However, JGroupsTransport throws a generic CacheException when the channel is stopped.
As a result, the state transfer thread can enter a busy-wait loop, retrying the request
for transaction data and immediately getting the CacheException again, filling the log
with messages like this:
{noformat}
19:32:28,237 WARN (remote-thread-NodeN-p42592-t1:) [StateConsumerImpl] ISPN000209:
Failed to retrieve transactions for segments [10, 11, 12, 13, 14, 15, 17, 16, 19, 18, 21,
20, 23, 22, 25, 24, 27, 26, 29, 28, 42, 43, 40, 41, 46, 47, 44, 45, 51, 50, 49, 48, 55,
54, 53, 52, 59, 58, 57, 56] of cache testCache from node NodeM-53416
org.infinispan.commons.CacheException: java.lang.IllegalStateException: channel is not connected
    at org.infinispan.commons.util.Util.rewrapAsCacheException(Util.java:655)
    at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:176)
    at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:536)
    at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:290)
    at org.infinispan.statetransfer.StateConsumerImpl.getTransactions(StateConsumerImpl.java:766)
    at org.infinispan.statetransfer.StateConsumerImpl.requestTransactions(StateConsumerImpl.java:685)
    at org.infinispan.statetransfer.StateConsumerImpl.addTransfers(StateConsumerImpl.java:629)
    at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:331)
    at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:195)
    at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:43)
    at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:116)
{noformat}
We should check whether the cache is stopped before retrying in
StateConsumerImpl.requestTransactions. I also think we should change the stop order: it
would make sense to stop the remote executor threads and the RpcDispatcher before we stop
the channel.
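
To illustrate the first suggestion, here is a minimal, self-contained sketch of a retry loop that checks a stopped flag before each attempt. It is not the actual StateConsumerImpl code: the class TransactionRequestSketch, the Transport interface, and the stop() and attempts() methods are all hypothetical names made up for this example, and the real loop also has to select providers per segment.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch only; none of these types are Infinispan APIs.
final class TransactionRequestSketch {

    /** Stand-in for the remote call that fetches transaction data. */
    interface Transport {
        void fetchTransactions() throws Exception;
    }

    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private int attempts;

    /** Marks the (hypothetical) cache as stopped. */
    void stop() {
        stopped.set(true);
    }

    int attempts() {
        return attempts;
    }

    /**
     * Retries the fetch until it succeeds or the cache is stopped.
     * Returns true if the data was fetched, false if we gave up
     * because the cache was stopped.
     */
    boolean requestTransactions(Transport transport) {
        // Checking the flag before every attempt is what breaks the busy-wait loop.
        while (!stopped.get()) {
            attempts++;
            try {
                transport.fetchTransactions();
                return true;
            } catch (Exception e) {
                // The real code would log a warning here and try again.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        TransactionRequestSketch consumer = new TransactionRequestSketch();
        // Simulates a closed channel: every call fails immediately,
        // and the cache is stopped during the third attempt.
        Transport closedChannel = () -> {
            if (consumer.attempts() == 3) {
                consumer.stop();
            }
            throw new IllegalStateException("channel is not connected");
        };
        boolean fetched = consumer.requestTransactions(closedChannel);
        System.out.println("fetched=" + fetched + ", attempts=" + consumer.attempts());
    }
}
```

Without the stopped check in the loop condition, the simulated transport above would fail forever, which is the behaviour described in this issue. With the check, the loop exits on the first iteration after stop() is called.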