[JBoss JIRA] (ISPN-6341) StateTransferManager should be the first component to stop
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-6341?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant reopened ISPN-6341:
-----------------------------------
> StateTransferManager should be the first component to stop
> ----------------------------------------------------------
>
> Key: ISPN-6341
> URL: https://issues.jboss.org/browse/ISPN-6341
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 8.2.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 8.2.1.Final, 9.0.0.Alpha1
>
>
> When a cache stops, it first removes the component registry from the {{GlobalComponentsRegistry}}'s {{namedComponents}} map, which means the node (let's call it {{A}}) will reply with a {{CacheNotFoundResponse}} to any remote command.
> Another node {{B}} trying to execute a write/transactional command will receive the {{CacheNotFoundResponse}}, assume that a new cache topology with id {{current topology id + 1}} is coming soon, and wait for that new topology before retrying.
> Normally this is not a problem, because {{StateTransferManagerImpl.stop()}} sends a {{CacheTopologyControlCommand(LEAVE)}} to the coordinator quickly enough, then {{B}} receives the {{current topology id + 1}} topology and retries the command.
> But in some cases, the cache components that stop before {{StateTransferManagerImpl}} can take a long time to do so. In particular, because of {{ISPN-5507}}, {{TransactionTable}} can block for {{cacheStopTimeout}} if there are remote transactions in progress, even though the cache can no longer process remote commands.
> We should give {{StateTransferManagerImpl.stop()}} a priority of {{0}}, so that the {{CacheTopologyControlCommand(LEAVE)}} comand is sent as soon as possible.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 11 months
[JBoss JIRA] (ISPN-6357) Deadlock during server start
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-6357?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-6357:
----------------------------------
Fix Version/s: 8.1.4.Final
> Deadlock during server start
> ----------------------------
>
> Key: ISPN-6357
> URL: https://issues.jboss.org/browse/ISPN-6357
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Server
> Affects Versions: 8.2.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 8.2.1.Final, 9.0.0.Alpha1, 8.1.4.Final
>
> Attachments: s0.txt, s1.txt, server1.txt, server2.txt
>
>
> This happens frequently when starting servers in parallel, the more servers, the easier to reproduce.
> Attached the stack trace of server1 and server2 after hanging.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 11 months
[JBoss JIRA] (ISPN-6357) Deadlock during server start
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-6357?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant resolved ISPN-6357.
-----------------------------------
Resolution: Done
> Deadlock during server start
> ----------------------------
>
> Key: ISPN-6357
> URL: https://issues.jboss.org/browse/ISPN-6357
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Server
> Affects Versions: 8.2.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 8.1.4.Final, 9.0.0.Alpha1, 8.2.1.Final
>
> Attachments: s0.txt, s1.txt, server1.txt, server2.txt
>
>
> This happens frequently when starting servers in parallel, the more servers, the easier to reproduce.
> Attached the stack trace of server1 and server2 after hanging.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 11 months
[JBoss JIRA] (ISPN-5507) Transactions committed immediately before cache stop can block shutdown
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5507?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant resolved ISPN-5507.
-----------------------------------
Resolution: Done
> Transactions committed immediately before cache stop can block shutdown
> -----------------------------------------------------------------------
>
> Key: ISPN-5507
> URL: https://issues.jboss.org/browse/ISPN-5507
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 7.2.1.Final, 8.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 8.1.4.Final, 9.0.0.Alpha1, 8.2.1.Final
>
>
> This is causing random failures in {{DistributedEntryRetrieverTxTest.verifyNodeLeavesBeforeGettingData}}.
> The test inserts some values into the cache, starts an iteration, and then kills one of the nodes. In rare instances, the killed cache only receives the TxCompletionCommand for one of the writes after it started the shutdown, and ignores it. That leaves the remote tx on-going, and {{TransactionTable.shutDownGracefully()}} blocks for 30 seconds - causing a {{TimeoutException}} elsewhere in the test.
> {noformat}
> 10:52:18,129 TRACE (remote-thread-NodeAM-p12133-t6:) [CommandAwareRpcDispatcher] About to send back response SuccessfulResponse{responseValue=null} for command CommitCommand {gtx=GlobalTransaction:<NodeAL-45757>:22325:remote, cacheName='org.infinispan.iteration.DistributedEntryRetrieverTxTest', topologyId=4}
> 10:52:18,129 TRACE (testng-DistributedEntryRetrieverTxTest:) [JGroupsTransport] dests=[NodeAM-45518, NodeAL-45757], command=TxCompletionNotificationCommand{ xid=null, internalId=0, topologyId=4, gtx=GlobalTransaction:<NodeAL-45757>:22325:local, cacheName=org.infinispan.iteration.DistributedEntryRetrieverTxTest} , mode=ASYNCHRONOUS, timeout=15000
> 10:52:18,133 DEBUG (testng-DistributedEntryRetrieverTxTest:) [CacheImpl] Stopping cache org.infinispan.iteration.DistributedEntryRetrieverTxTest on NodeAM-45518
> 10:52:18,133 TRACE (OOB-2,NodeAM-45518:) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: TxCompletionNotificationCommand{ xid=null, internalId=0, topologyId=4, gtx=GlobalTransaction:<NodeAL-45757>:22325:local, cacheName=org.infinispan.iteration.DistributedEntryRetrieverTxTest} [sender=NodeAL-45757]
> 10:52:18,133 TRACE (OOB-2,NodeAM-45518:) [GlobalInboundInvocationHandler] Silently ignoring that org.infinispan.iteration.DistributedEntryRetrieverTxTest cache is not defined
> 10:52:18,133 DEBUG (testng-DistributedEntryRetrieverTxTest:) [TransactionTable] Wait for on-going transactions to finish for 30 seconds.
> 10:52:48,139 WARN (testng-DistributedEntryRetrieverTxTest:) [TransactionTable] ISPN000100: Stopping, but there are 0 local transactions and 1 remote transactions that did not finish in time.
> 10:52:48,386 ERROR (testng-DistributedEntryRetrieverTxTest:) [UnitTestTestNGListener] Test verifyNodeLeavesBeforeGettingData(org.infinispan.iteration.DistributedEntryRetrieverTxTest) failed.
> java.lang.IllegalStateException: Thread already timed out waiting for event pre_send_response_released
> at org.infinispan.test.fwk.CheckPoint.trigger(CheckPoint.java:131)
> at org.infinispan.test.fwk.CheckPoint.trigger(CheckPoint.java:116)
> at org.infinispan.iteration.DistributedEntryRetrieverTest.verifyNodeLeavesBeforeGettingData(DistributedEntryRetrieverTest.java:105)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 11 months
[JBoss JIRA] (ISPN-6357) Deadlock during server start
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-6357?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant reopened ISPN-6357:
-----------------------------------
> Deadlock during server start
> ----------------------------
>
> Key: ISPN-6357
> URL: https://issues.jboss.org/browse/ISPN-6357
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Server
> Affects Versions: 8.2.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 8.2.1.Final, 9.0.0.Alpha1
>
> Attachments: s0.txt, s1.txt, server1.txt, server2.txt
>
>
> This happens frequently when starting servers in parallel, the more servers, the easier to reproduce.
> Attached the stack trace of server1 and server2 after hanging.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 11 months
[JBoss JIRA] (ISPN-5507) Transactions committed immediately before cache stop can block shutdown
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5507?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant reopened ISPN-5507:
-----------------------------------
> Transactions committed immediately before cache stop can block shutdown
> -----------------------------------------------------------------------
>
> Key: ISPN-5507
> URL: https://issues.jboss.org/browse/ISPN-5507
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 7.2.1.Final, 8.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 8.2.1.Final, 9.0.0.Alpha1
>
>
> This is causing random failures in {{DistributedEntryRetrieverTxTest.verifyNodeLeavesBeforeGettingData}}.
> The test inserts some values into the cache, starts an iteration, and then kills one of the nodes. In rare instances, the killed cache only receives the TxCompletionCommand for one of the writes after it started the shutdown, and ignores it. That leaves the remote tx on-going, and {{TransactionTable.shutDownGracefully()}} blocks for 30 seconds - causing a {{TimeoutException}} elsewhere in the test.
> {noformat}
> 10:52:18,129 TRACE (remote-thread-NodeAM-p12133-t6:) [CommandAwareRpcDispatcher] About to send back response SuccessfulResponse{responseValue=null} for command CommitCommand {gtx=GlobalTransaction:<NodeAL-45757>:22325:remote, cacheName='org.infinispan.iteration.DistributedEntryRetrieverTxTest', topologyId=4}
> 10:52:18,129 TRACE (testng-DistributedEntryRetrieverTxTest:) [JGroupsTransport] dests=[NodeAM-45518, NodeAL-45757], command=TxCompletionNotificationCommand{ xid=null, internalId=0, topologyId=4, gtx=GlobalTransaction:<NodeAL-45757>:22325:local, cacheName=org.infinispan.iteration.DistributedEntryRetrieverTxTest} , mode=ASYNCHRONOUS, timeout=15000
> 10:52:18,133 DEBUG (testng-DistributedEntryRetrieverTxTest:) [CacheImpl] Stopping cache org.infinispan.iteration.DistributedEntryRetrieverTxTest on NodeAM-45518
> 10:52:18,133 TRACE (OOB-2,NodeAM-45518:) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: TxCompletionNotificationCommand{ xid=null, internalId=0, topologyId=4, gtx=GlobalTransaction:<NodeAL-45757>:22325:local, cacheName=org.infinispan.iteration.DistributedEntryRetrieverTxTest} [sender=NodeAL-45757]
> 10:52:18,133 TRACE (OOB-2,NodeAM-45518:) [GlobalInboundInvocationHandler] Silently ignoring that org.infinispan.iteration.DistributedEntryRetrieverTxTest cache is not defined
> 10:52:18,133 DEBUG (testng-DistributedEntryRetrieverTxTest:) [TransactionTable] Wait for on-going transactions to finish for 30 seconds.
> 10:52:48,139 WARN (testng-DistributedEntryRetrieverTxTest:) [TransactionTable] ISPN000100: Stopping, but there are 0 local transactions and 1 remote transactions that did not finish in time.
> 10:52:48,386 ERROR (testng-DistributedEntryRetrieverTxTest:) [UnitTestTestNGListener] Test verifyNodeLeavesBeforeGettingData(org.infinispan.iteration.DistributedEntryRetrieverTxTest) failed.
> java.lang.IllegalStateException: Thread already timed out waiting for event pre_send_response_released
> at org.infinispan.test.fwk.CheckPoint.trigger(CheckPoint.java:131)
> at org.infinispan.test.fwk.CheckPoint.trigger(CheckPoint.java:116)
> at org.infinispan.iteration.DistributedEntryRetrieverTest.verifyNodeLeavesBeforeGettingData(DistributedEntryRetrieverTest.java:105)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 11 months
[JBoss JIRA] (ISPN-5507) Transactions committed immediately before cache stop can block shutdown
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-5507?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-5507:
----------------------------------
Fix Version/s: 8.1.4.Final
> Transactions committed immediately before cache stop can block shutdown
> -----------------------------------------------------------------------
>
> Key: ISPN-5507
> URL: https://issues.jboss.org/browse/ISPN-5507
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 7.2.1.Final, 8.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 8.2.1.Final, 9.0.0.Alpha1, 8.1.4.Final
>
>
> This is causing random failures in {{DistributedEntryRetrieverTxTest.verifyNodeLeavesBeforeGettingData}}.
> The test inserts some values into the cache, starts an iteration, and then kills one of the nodes. In rare instances, the killed cache only receives the TxCompletionCommand for one of the writes after it started the shutdown, and ignores it. That leaves the remote tx on-going, and {{TransactionTable.shutDownGracefully()}} blocks for 30 seconds - causing a {{TimeoutException}} elsewhere in the test.
> {noformat}
> 10:52:18,129 TRACE (remote-thread-NodeAM-p12133-t6:) [CommandAwareRpcDispatcher] About to send back response SuccessfulResponse{responseValue=null} for command CommitCommand {gtx=GlobalTransaction:<NodeAL-45757>:22325:remote, cacheName='org.infinispan.iteration.DistributedEntryRetrieverTxTest', topologyId=4}
> 10:52:18,129 TRACE (testng-DistributedEntryRetrieverTxTest:) [JGroupsTransport] dests=[NodeAM-45518, NodeAL-45757], command=TxCompletionNotificationCommand{ xid=null, internalId=0, topologyId=4, gtx=GlobalTransaction:<NodeAL-45757>:22325:local, cacheName=org.infinispan.iteration.DistributedEntryRetrieverTxTest} , mode=ASYNCHRONOUS, timeout=15000
> 10:52:18,133 DEBUG (testng-DistributedEntryRetrieverTxTest:) [CacheImpl] Stopping cache org.infinispan.iteration.DistributedEntryRetrieverTxTest on NodeAM-45518
> 10:52:18,133 TRACE (OOB-2,NodeAM-45518:) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: TxCompletionNotificationCommand{ xid=null, internalId=0, topologyId=4, gtx=GlobalTransaction:<NodeAL-45757>:22325:local, cacheName=org.infinispan.iteration.DistributedEntryRetrieverTxTest} [sender=NodeAL-45757]
> 10:52:18,133 TRACE (OOB-2,NodeAM-45518:) [GlobalInboundInvocationHandler] Silently ignoring that org.infinispan.iteration.DistributedEntryRetrieverTxTest cache is not defined
> 10:52:18,133 DEBUG (testng-DistributedEntryRetrieverTxTest:) [TransactionTable] Wait for on-going transactions to finish for 30 seconds.
> 10:52:48,139 WARN (testng-DistributedEntryRetrieverTxTest:) [TransactionTable] ISPN000100: Stopping, but there are 0 local transactions and 1 remote transactions that did not finish in time.
> 10:52:48,386 ERROR (testng-DistributedEntryRetrieverTxTest:) [UnitTestTestNGListener] Test verifyNodeLeavesBeforeGettingData(org.infinispan.iteration.DistributedEntryRetrieverTxTest) failed.
> java.lang.IllegalStateException: Thread already timed out waiting for event pre_send_response_released
> at org.infinispan.test.fwk.CheckPoint.trigger(CheckPoint.java:131)
> at org.infinispan.test.fwk.CheckPoint.trigger(CheckPoint.java:116)
> at org.infinispan.iteration.DistributedEntryRetrieverTest.verifyNodeLeavesBeforeGettingData(DistributedEntryRetrieverTest.java:105)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 11 months
[JBoss JIRA] (ISPN-6235) ClusterTopologyManagerImpl join during cluster status recovery
by Tristan Tarrant (JIRA)
[ https://issues.jboss.org/browse/ISPN-6235?page=com.atlassian.jira.plugin.... ]
Tristan Tarrant updated ISPN-6235:
----------------------------------
Fix Version/s: 8.1.4.Final
> ClusterTopologyManagerImpl join during cluster status recovery
> --------------------------------------------------------------
>
> Key: ISPN-6235
> URL: https://issues.jboss.org/browse/ISPN-6235
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 8.1.3.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 8.2.0.CR1, 8.1.4.Final
>
>
> If the joiner has the correct view id, but the current status is
> RECOVERING_CLUSTER, we should wait for the cluster status recovery to
> finish before adding the new member.
> We are currently not doing that, so the new member could be erased by the status recovery process that's in progress. This can happen if the coordinator joiner already had been a member of the JGroups cluster for some time, and there's no view change when they actually start their caches (exactly the scenario in {{ConcurrentStartTest}}).
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 11 months
[JBoss JIRA] (ISPN-6467) Management console - can access console after logging off
by Vladimir Blagojevic (JIRA)
[ https://issues.jboss.org/browse/ISPN-6467?page=com.atlassian.jira.plugin.... ]
Vladimir Blagojevic updated ISPN-6467:
--------------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> Management console - can access console after logging off
> ---------------------------------------------------------
>
> Key: ISPN-6467
> URL: https://issues.jboss.org/browse/ISPN-6467
> Project: Infinispan
> Issue Type: Bug
> Components: Console
> Affects Versions: 8.2.1.Final
> Reporter: Jiří Holuša
> Assignee: Vladimir Blagojevic
> Fix For: 9.0.0.Alpha2, 8.2.2.Final
>
> Attachments: screenshot.png
>
>
> Steps to reproduce:
> * log in
> * log out
> * hit "Back" button in browser
> * The top menu bar appears and you can still access the whole application, except the background is now broken, see the screenshot.
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
9 years, 11 months