[infinispan-issues] [JBoss JIRA] (ISPN-4846) State transfer keeps trying to fetch transaction data after the cache was stopped

Tue May 17 08:22:13 EDT 2016

     [ https://issues.jboss.org/browse/ISPN-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tristan Tarrant updated ISPN-4846:
----------------------------------
    Fix Version/s: 9.0.0.Alpha3
                       (was: 9.0.0.Alpha2)


> State transfer keeps trying to fetch transaction data after the cache was stopped
> ---------------------------------------------------------------------------------
>
>                 Key: ISPN-4846
>                 URL: https://issues.jboss.org/browse/ISPN-4846
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Core, State Transfer
>    Affects Versions: 7.0.0.CR1
>            Reporter: Dan Berindei
>             Fix For: 9.0.0.Alpha3
>
>
> StateConsumerImpl doesn't check whether the cache is stopped while fetching transaction data; it only stops when it can no longer find providers for the transactions.
> However, JGroupsTransport throws a generic CacheException when the channel is stopped. As a result, the state transfer thread can enter a busy-wait loop: each retry to get the transaction data fails immediately with the same CacheException, filling the log with messages like this:
> {noformat}
> 19:32:28,237 WARN  (remote-thread-NodeN-p42592-t1:) [StateConsumerImpl] ISPN000209: Failed to retrieve transactions for segments [10, 11, 12, 13, 14, 15, 17, 16, 19, 18, 21, 20, 23, 22, 25, 24, 27, 26, 29, 28, 42, 43, 40, 41, 46, 47, 44, 45, 51, 50, 49, 48, 55, 54, 53, 52, 59, 58, 57, 56] of cache testCache from node NodeM-53416
> org.infinispan.commons.CacheException: java.lang.IllegalStateException: channel is not connected
> 	at org.infinispan.commons.util.Util.rewrapAsCacheException(Util.java:655)
> 	at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:176)
> 	at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:536)
> 	at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:290)
> 	at org.infinispan.statetransfer.StateConsumerImpl.getTransactions(StateConsumerImpl.java:766)
> 	at org.infinispan.statetransfer.StateConsumerImpl.requestTransactions(StateConsumerImpl.java:685)
> 	at org.infinispan.statetransfer.StateConsumerImpl.addTransfers(StateConsumerImpl.java:629)
> 	at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:331)
> 	at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:195)
> 	at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:43)
> 	at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:116)
> {noformat}
> We should check whether the cache is stopped before retrying in StateConsumerImpl.requestTransactions. I also think we should change the stop order: it would make sense to stop the remote executor threads and the RpcDispatcher before we stop the channel.
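
The proposed stop check can be illustrated with a minimal, self-contained sketch. This is not the actual StateConsumerImpl code: the class TransactionFetchSketch, the method fetchWithStopCheck, the exception FetchFailedException and the three functional parameters are all hypothetical stand-ins, assumed here only to show the shape of a retry loop that gives up once the cache is stopping instead of spinning on the same failure.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from Infinispan.
final class TransactionFetchSketch {

    /** Stand-in for the generic exception raised once the channel is down. */
    static class FetchFailedException extends RuntimeException {
        FetchFailedException(String message) {
            super(message);
        }
    }

    /**
     * Retries the fetch against whatever provider is currently available.
     * The loop ends when the fetch succeeds, when no provider is left,
     * or (the proposed addition) when the cache is stopping.
     *
     * @return the fetched data, or null if the fetch was abandoned
     */
    static <T> T fetchWithStopCheck(Supplier<String> nextProvider,
                                    Function<String, T> fetch,
                                    BooleanSupplier cacheStopping) {
        while (true) {
            // Proposed check: do not retry once the cache is stopping.
            if (cacheStopping.getAsBoolean()) {
                return null;
            }
            String provider = nextProvider.get();
            if (provider == null) {
                // Existing exit condition: no provider can be found.
                return null;
            }
            try {
                return fetch.apply(provider);
            } catch (FetchFailedException e) {
                // Without the stop check above, a provider that keeps being
                // offered while the channel is down would make this loop spin.
            }
        }
    }
}
```

In this sketch the stop check runs before every attempt, so a stopped cache costs at most one extra failed fetch rather than an unbounded stream of warnings. How the real code would learn that the cache is stopping is left open here.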

