[infinispan-issues] [JBoss JIRA] (ISPN-6665) StateTransferManager should be the first component to stop

Brad Maxwell (JIRA) issues at jboss.org
Wed May 18 20:16:00 EDT 2016


Brad Maxwell created ISPN-6665:
----------------------------------

             Summary: StateTransferManager should be the first component to stop
                 Key: ISPN-6665
                 URL: https://issues.jboss.org/browse/ISPN-6665
             Project: Infinispan
          Issue Type: Bug
          Components: Core
    Affects Versions: 8.2.0.CR1
            Reporter: Brad Maxwell
            Assignee: Dan Berindei
             Fix For: 8.2.1.Final, 9.0.0.Alpha1, 8.1.4.Final


When a cache stops, it first removes its component registry from the {{GlobalComponentRegistry}}'s {{namedComponents}} map, which means the stopping node (let's call it {{A}}) will reply with a {{CacheNotFoundResponse}} to any remote command.

Another node {{B}} trying to execute a write/transactional command will receive the {{CacheNotFoundResponse}}, assume that a new cache topology with id {{current topology id + 1}} is coming soon, and wait for that new topology before retrying.

Normally this is not a problem, because {{StateTransferManagerImpl.stop()}} sends a {{CacheTopologyControlCommand(LEAVE)}} to the coordinator quickly enough, then {{B}} receives the {{current topology id + 1}} topology and retries the command.
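
To make the waiting behaviour on {{B}} concrete, here is a minimal, self-contained sketch. It is not Infinispan code: the class {{TopologyWaitSketch}} and its methods are hypothetical names used only to model "wait for topology {{current topology id + 1}}, then retry", with a timeout standing in for whatever bound the real caller applies.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of node B's view of the cache topology (not an Infinispan class).
public class TopologyWaitSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition topologyChanged = lock.newCondition();
    private int topologyId;

    public TopologyWaitSketch(int initialTopologyId) {
        this.topologyId = initialTopologyId;
    }

    // Called when a new topology arrives, e.g. after the coordinator processed A's LEAVE.
    public void installTopology(int newTopologyId) {
        lock.lock();
        try {
            if (newTopologyId > topologyId) {
                topologyId = newTopologyId;
                topologyChanged.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    // Blocks until the topology id reaches expectedId or the timeout expires.
    // Returns true if the caller may retry the command, false on timeout or interrupt.
    public boolean awaitTopology(int expectedId, long timeout, TimeUnit unit) {
        long nanosLeft = unit.toNanos(timeout);
        lock.lock();
        try {
            while (topologyId < expectedId) {
                if (nanosLeft <= 0) {
                    return false;
                }
                nanosLeft = topologyChanged.awaitNanos(nanosLeft);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    public int currentTopologyId() {
        lock.lock();
        try {
            return topologyId;
        } finally {
            lock.unlock();
        }
    }
}
```

In this model, a command that got {{CacheNotFoundResponse}} at topology {{5}} would call {{awaitTopology(6, ...)}}. If {{A}} delays its {{LEAVE}}, nothing calls {{installTopology(6)}} and the caller stays blocked until the timeout.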

But in some cases, the cache components that stop before {{StateTransferManagerImpl}} can take a long time to do so. In particular, because of {{ISPN-5507}}, {{TransactionTable}} can block for {{cacheStopTimeout}} if there are remote transactions in progress, even though the cache can no longer process remote commands.

We should give {{StateTransferManagerImpl.stop()}} a priority of {{0}}, so that the {{CacheTopologyControlCommand(LEAVE)}} command is sent as soon as possible.
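
The effect of the proposed priority change can be illustrated with a small standalone sketch. It assumes stop methods run in ascending priority order, as the proposal implies. The {{Component}} class and the priority value {{10}} for {{TransactionTable}} are made up for the example and are not the actual Infinispan values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class StopOrderSketch {
    // Hypothetical stand-in for a cache component and the priority of its stop method.
    public static final class Component {
        final String name;
        final int stopPriority;

        public Component(String name, int stopPriority) {
            this.name = name;
            this.stopPriority = stopPriority;
        }
    }

    // Returns the component names in the order their stop methods would run:
    // lowest priority value first (the sort is stable for equal priorities).
    public static List<String> stopOrder(List<Component> components) {
        List<Component> sorted = new ArrayList<>(components);
        sorted.sort(Comparator.comparingInt((Component c) -> c.stopPriority));
        List<String> names = new ArrayList<>();
        for (Component c : sorted) {
            names.add(c.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // Illustrative priorities only; with priority 0 the state transfer
        // manager stops (and sends LEAVE) before TransactionTable can block.
        List<Component> components = Arrays.asList(
                new Component("TransactionTable", 10),
                new Component("StateTransferManagerImpl", 0));
        System.out.println(stopOrder(components));
    }
}
```

With this ordering, a slow {{TransactionTable}} stop can no longer delay the {{LEAVE}} command, so {{B}} gets the new topology without waiting for {{cacheStopTimeout}}.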


