[JBoss JIRA] (ISPN-4846) State transfer keeps trying to fetch transaction data after the cache was stopped
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4846?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration updated ISPN-4846:
------------------------------------------
Bugzilla Update: Perform
Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1152458
> State transfer keeps trying to fetch transaction data after the cache was stopped
> ---------------------------------------------------------------------------------
>
> Key: ISPN-4846
> URL: https://issues.jboss.org/browse/ISPN-4846
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.0.0.CR1
> Reporter: Dan Berindei
> Fix For: 7.1.0.Alpha1
>
>
> StateConsumerImpl doesn't check whether the cache is stopped while fetching transaction data; it only stops when it can no longer find providers for the transactions.
> However, JGroupsTransport throws a generic CacheException when the channel is stopped. The state transfer thread can then enter a busy-wait loop, repeatedly retrying the request for transaction data and immediately getting the CacheException, which fills the log with messages like this:
> {noformat}
> 19:32:28,237 WARN (remote-thread-NodeN-p42592-t1:) [StateConsumerImpl] ISPN000209: Failed to retrieve transactions for segments [10, 11, 12, 13, 14, 15, 17, 16, 19, 18, 21, 20, 23, 22, 25, 24, 27, 26, 29, 28, 42, 43, 40, 41, 46, 47, 44, 45, 51, 50, 49, 48, 55, 54, 53, 52, 59, 58, 57, 56] of cache testCache from node NodeM-53416
> org.infinispan.commons.CacheException: java.lang.IllegalStateException: channel is not connected
> at org.infinispan.commons.util.Util.rewrapAsCacheException(Util.java:655)
> at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:176)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:536)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:290)
> at org.infinispan.statetransfer.StateConsumerImpl.getTransactions(StateConsumerImpl.java:766)
> at org.infinispan.statetransfer.StateConsumerImpl.requestTransactions(StateConsumerImpl.java:685)
> at org.infinispan.statetransfer.StateConsumerImpl.addTransfers(StateConsumerImpl.java:629)
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:331)
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:195)
> at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:43)
> at org.infinispan.statetransfer.StateTransferManagerImpl$1.rebalance(StateTransferManagerImpl.java:116)
> {noformat}
> We should check if the cache is stopped before retrying in StateConsumerImpl.requestTransactions. I also think we should change the stop order: it would make sense to stop the remote executor threads and the RpcDispatcher before we stop the channel.
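
The proposed check can be illustrated with a minimal sketch. This is not the actual StateConsumerImpl code: StoppableFetcher, fetchWithRetry, the Supplier-based fetch, and the maxAttempts bound are all hypothetical names introduced here, and a plain RuntimeException stands in for the CacheException from the report.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Hypothetical stand-in for a component with a stop flag; not Infinispan API. */
final class StoppableFetcher {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private int attempts = 0;

    /** Marks the owning cache as stopped (assumed to be called on shutdown). */
    void stop() {
        stopped.set(true);
    }

    int attempts() {
        return attempts;
    }

    /**
     * Retries the fetch until it succeeds, the stop flag is set,
     * or maxAttempts is reached. Returns true only on success.
     */
    boolean fetchWithRetry(Supplier<Boolean> fetch, int maxAttempts) {
        while (attempts < maxAttempts) {
            // The check suggested in the report: give up once stopped,
            // instead of retrying against a closed channel.
            if (stopped.get()) {
                return false;
            }
            attempts++;
            try {
                if (fetch.get()) {
                    return true;
                }
            } catch (RuntimeException e) {
                // Stand-in for the generic CacheException; fall through and retry.
            }
        }
        return false;
    }
}
```

Without the stop check at the top of the loop, a fetch that fails immediately would be retried until the attempt bound is exhausted, which corresponds to the busy-wait described above.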
--
This message was sent by Atlassian JIRA
(v6.3.8#6338)
[JBoss JIRA] (ISPN-4444) After state transfer, a node is able to read keys it no longer owns from its data container
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4444?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration updated ISPN-4444:
------------------------------------------
Bugzilla Update: Perform
Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1163665
> After state transfer, a node is able to read keys it no longer owns from its data container
> -------------------------------------------------------------------------------------------
>
> Key: ISPN-4444
> URL: https://issues.jboss.org/browse/ISPN-4444
> Project: Infinispan
> Issue Type: Bug
> Components: Core, State Transfer
> Affects Versions: 7.0.0.Alpha4
> Reporter: Dan Berindei
> Priority: Critical
> Fix For: 7.1.0.Alpha1
>
>
> When state transfer ends and each node receives a CH_UPDATE command from the coordinator, it first installs the new topology and then it starts invalidating entries it no longer owns.
> However, there are two cases when the node can still read its stale values:
> 1. If L1 is enabled, it will look in the local DataContainer first, regardless of the key's location.
> 2. If L1 is disabled, but the key was removed on the new owners, the node will still look up the key in the local DataContainer after receiving a null response.
> The problem can be reproduced with {{TxReadAfterLosingOwnershipTest}} and its subclasses, by replacing the {{operation.update(cache(1));}} line with {{operation.update(cache(0));}}
[JBoss JIRA] (ISPN-4969) Stopping a cache will stop all KeyAffinityServices created for other caches in the cache manager
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-4969?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-4969:
-----------------------------------
Fix Version/s: 6.0.3.Final
> Stopping a cache will stop all KeyAffinityServices created for other caches in the cache manager
> ------------------------------------------------------------------------------------------------
>
> Key: ISPN-4969
> URL: https://issues.jboss.org/browse/ISPN-4969
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 5.2.8.Final, 6.0.2.Final, 7.0.0.Final
> Reporter: Paul Ferraro
> Assignee: Paul Ferraro
> Priority: Critical
> Fix For: 7.0.1.Final, 5.2.9.Final, 6.0.3.Final
>
>
> We've had several reports in the WildFly forums of application runtime failures following undeployment of a separate application.
> WF creates a cache instance for each web application within the same cache container. However, KeyAffinityServiceImpl registers a cache manager listener that calls stop() on a @CacheStoppedEvent, and this event might be triggered by any cache, not necessarily the cache with which the KeyAffinityService is associated.
> The KeyAffinityServiceImpl.handleCacheStopped(CacheStoppedEvent) should only call stop() if the event.getCacheName() equals the name of the cache to which the affinity service is associated.
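
A minimal sketch of the suggested name check follows. It does not use the real Infinispan classes: StoppedEvent and AffinityServiceSketch are hypothetical stand-ins for CacheStoppedEvent and KeyAffinityServiceImpl, and only the cache name comparison described above is modelled.

```java
import java.util.Objects;

/** Hypothetical stand-in for CacheStoppedEvent; only the cache name is modelled. */
final class StoppedEvent {
    private final String cacheName;

    StoppedEvent(String cacheName) {
        this.cacheName = cacheName;
    }

    String getCacheName() {
        return cacheName;
    }
}

/** Hypothetical stand-in for an affinity service bound to a single cache. */
final class AffinityServiceSketch {
    private final String cacheName;
    private boolean started = true;

    AffinityServiceSketch(String cacheName) {
        this.cacheName = Objects.requireNonNull(cacheName);
    }

    /** Stops the service only when the stopped cache is the one it serves. */
    void handleCacheStopped(StoppedEvent event) {
        if (cacheName.equals(event.getCacheName())) {
            stop();
        }
    }

    void stop() {
        started = false;
    }

    boolean isStarted() {
        return started;
    }
}
```

With this guard, a service created for one web application's cache stays running when a different application's cache in the same container is stopped.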
[JBoss JIRA] (ISPN-4969) Stopping a cache will stop all KeyAffinityServices created for other caches in the cache manager
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-4969?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño resolved ISPN-4969.
------------------------------------
Resolution: Done
> Stopping a cache will stop all KeyAffinityServices created for other caches in the cache manager
> ------------------------------------------------------------------------------------------------
>
> Key: ISPN-4969
> URL: https://issues.jboss.org/browse/ISPN-4969
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 5.2.8.Final, 6.0.2.Final, 7.0.0.Final
> Reporter: Paul Ferraro
> Assignee: Paul Ferraro
> Priority: Critical
> Fix For: 7.0.1.Final, 5.2.9.Final, 6.0.3.Final
>
>
> We've had several reports in the WildFly forums of application runtime failures following undeployment of a separate application.
> WF creates a cache instance for each web application within the same cache container. However, KeyAffinityServiceImpl registers a cache manager listener that calls stop() on a @CacheStoppedEvent, and this event might be triggered by any cache, not necessarily the cache with which the KeyAffinityService is associated.
> The KeyAffinityServiceImpl.handleCacheStopped(CacheStoppedEvent) should only call stop() if the event.getCacheName() equals the name of the cache to which the affinity service is associated.
[JBoss JIRA] (ISPN-4969) Stopping a cache will stop all KeyAffinityServices created for other caches in the cache manager
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-4969?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño updated ISPN-4969:
-----------------------------------
Fix Version/s: 5.2.9.Final
> Stopping a cache will stop all KeyAffinityServices created for other caches in the cache manager
> ------------------------------------------------------------------------------------------------
>
> Key: ISPN-4969
> URL: https://issues.jboss.org/browse/ISPN-4969
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 5.2.8.Final, 6.0.2.Final, 7.0.0.Final
> Reporter: Paul Ferraro
> Assignee: Paul Ferraro
> Priority: Critical
> Fix For: 7.0.1.Final, 5.2.9.Final
>
>
> We've had several reports in the WildFly forums of application runtime failures following undeployment of a separate application.
> WF creates a cache instance for each web application within the same cache container. However, KeyAffinityServiceImpl registers a cache manager listener that calls stop() on a @CacheStoppedEvent, and this event might be triggered by any cache, not necessarily the cache with which the KeyAffinityService is associated.
> The KeyAffinityServiceImpl.handleCacheStopped(CacheStoppedEvent) should only call stop() if the event.getCacheName() equals the name of the cache to which the affinity service is associated.
[JBoss JIRA] (ISPN-4975) Cross site state transfer - status of push gets stuck at "SENDING" after being cancelled
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4975?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration updated ISPN-4975:
------------------------------------------
Bugzilla Update: Perform
Bugzilla References: https://bugzilla.redhat.com/show_bug.cgi?id=1163337
> Cross site state transfer - status of push gets stuck at "SENDING" after being cancelled
> ----------------------------------------------------------------------------------------
>
> Key: ISPN-4975
> URL: https://issues.jboss.org/browse/ISPN-4975
> Project: Infinispan
> Issue Type: Bug
> Components: State Transfer
> Affects Versions: 7.0.0.Final
> Reporter: Matej Čimbora
> Assignee: Pedro Ruivo
>
> After invoking site --cancelpush backupSite on the producer site, the status of the push operation seems to get stuck at the "SENDING" value (checked with site --pushstatus), even when no state transfer is in progress.
> Invoking site --cancelreceive mainSite on the consumer site works correctly. A new invocation of site --push backupsite leads to "X-Site state transfer to '%s' already started!" being displayed. The issue seems to be caused by XSiteStateTransferManagerImpl.siteCollector not being cleared.
> Used configuration:
> distributed caches, site A: 2 nodes, site B: 3 nodes, B is a backup for A.
> Scenario
> - Start A,B
> - Take B offline using takeSiteOffline
> - Load data into A
> - Push state into B
> - CancelPushState B
> -- PushStateStatus remains stuck at SENDING & new push is not possible
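
The suspected cause can be sketched as follows, assuming the per-site push state lives in a map that a cancel has to clear. PushTrackerSketch, its methods, and the "IDLE" status are hypothetical; only the "SENDING" status and the error message come from the report, and the real XSiteStateTransferManagerImpl may be structured differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical tracker of per-site push state; not the Infinispan implementation. */
final class PushTrackerSketch {
    private final Map<String, String> statusBySite = new ConcurrentHashMap<>();

    /** Starts a push, rejecting it if an entry for the site is still present. */
    void startPush(String site) {
        if (statusBySite.putIfAbsent(site, "SENDING") != null) {
            throw new IllegalStateException(
                    "X-Site state transfer to '" + site + "' already started!");
        }
    }

    /** Cancels a push; removing the entry is the step the report suspects is missing. */
    void cancelPush(String site) {
        statusBySite.remove(site);
    }

    /** Returns the tracked status, or the assumed "IDLE" when nothing is tracked. */
    String status(String site) {
        return statusBySite.getOrDefault(site, "IDLE");
    }
}
```

If cancelPush left the entry in place, status would keep reporting "SENDING" and a second startPush would fail with the "already started" message, matching the scenario above.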