[JBoss JIRA] (ISPN-6341) StateTransferManager should be the first component to stop
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-6341?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-6341:
-----------------------------------------------
Sat6QE Jenkins <sat6-jenkins(a)redhat.com> changed the Status of [bug 1315393|https://bugzilla.redhat.com/show_bug.cgi?id=1315393] from POST to MODIFIED
> StateTransferManager should be the first component to stop
> ----------------------------------------------------------
>
> Key: ISPN-6341
> URL: https://issues.jboss.org/browse/ISPN-6341
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 8.2.0.CR1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 9.0.0.Alpha1, 8.2.1.Final
>
>
> When a cache stops, it first removes the component registry from the {{GlobalComponentRegistry}}'s {{namedComponents}} map, which means the node (let's call it {{A}}) will reply with a {{CacheNotFoundResponse}} to any remote command.
> Another node {{B}} trying to execute a write/transactional command will receive the {{CacheNotFoundResponse}}, assume that a new cache topology with id {{current topology id + 1}} is coming soon, and wait for that new topology before retrying.
> Normally this is not a problem, because {{StateTransferManagerImpl.stop()}} sends a {{CacheTopologyControlCommand(LEAVE)}} to the coordinator quickly enough, then {{B}} receives the {{current topology id + 1}} topology and retries the command.
> But in some cases, the cache components that stop before {{StateTransferManagerImpl}} can take a long time to do so. In particular, because of {{ISPN-5507}}, {{TransactionTable}} can block for {{cacheStopTimeout}} if there are remote transactions in progress, even though the cache can no longer process remote commands.
> We should give {{StateTransferManagerImpl.stop()}} a priority of {{0}}, so that the {{CacheTopologyControlCommand(LEAVE)}} is sent as soon as possible.
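The effect of the proposed priority can be sketched with a small model of priority-ordered shutdown. This is only an illustration: the {{Component}} class, the {{stopOrder}} helper, and the priority values other than {{0}} are assumptions made for the example and are not Infinispan's actual component registry code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of priority-ordered shutdown; not Infinispan's registry code.
final class StopOrderSketch {

    // A component name paired with its stop priority (lower value stops earlier).
    static final class Component {
        final String name;
        final int stopPriority;

        Component(String name, int stopPriority) {
            this.name = name;
            this.stopPriority = stopPriority;
        }
    }

    // Returns the component names in the order they would be stopped.
    static List<String> stopOrder(List<Component> components) {
        List<Component> sorted = new ArrayList<>(components);
        // List.sort is stable, so equal priorities keep their registration order.
        sorted.sort(Comparator.comparingInt((Component c) -> c.stopPriority));
        List<String> names = new ArrayList<>();
        for (Component c : sorted) {
            names.add(c.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Component> components = Arrays.asList(
                new Component("TransactionTable", 10),     // may block for cacheStopTimeout
                new Component("StateTransferManager", 0),  // proposed: send LEAVE first
                new Component("PersistenceManager", 10));  // illustrative priority
        // prints [StateTransferManager, TransactionTable, PersistenceManager]
        System.out.println(stopOrder(components));
    }
}
```

With the state transfer component sorted first, the LEAVE request would go out before any slower component gets a chance to block.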
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
[JBoss JIRA] (ISPN-5507) Transactions committed immediately before cache stop can block shutdown
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5507?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-5507:
-----------------------------------------------
Sat6QE Jenkins <sat6-jenkins(a)redhat.com> changed the Status of [bug 1315393|https://bugzilla.redhat.com/show_bug.cgi?id=1315393] from POST to MODIFIED
> Transactions committed immediately before cache stop can block shutdown
> -----------------------------------------------------------------------
>
> Key: ISPN-5507
> URL: https://issues.jboss.org/browse/ISPN-5507
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Test Suite - Core
> Affects Versions: 7.2.1.Final, 8.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Blocker
> Labels: testsuite_stability
> Fix For: 9.0.0.Alpha1, 8.2.1.Final
>
>
> This is causing random failures in {{DistributedEntryRetrieverTxTest.verifyNodeLeavesBeforeGettingData}}.
> The test inserts some values into the cache, starts an iteration, and then kills one of the nodes. In rare instances, the killed cache receives the {{TxCompletionNotificationCommand}} for one of the writes only after it has started shutting down, and ignores it. That leaves the remote transaction ongoing, and {{TransactionTable.shutDownGracefully()}} blocks for 30 seconds, causing a {{TimeoutException}} elsewhere in the test.
> {noformat}
> 10:52:18,129 TRACE (remote-thread-NodeAM-p12133-t6:) [CommandAwareRpcDispatcher] About to send back response SuccessfulResponse{responseValue=null} for command CommitCommand {gtx=GlobalTransaction:<NodeAL-45757>:22325:remote, cacheName='org.infinispan.iteration.DistributedEntryRetrieverTxTest', topologyId=4}
> 10:52:18,129 TRACE (testng-DistributedEntryRetrieverTxTest:) [JGroupsTransport] dests=[NodeAM-45518, NodeAL-45757], command=TxCompletionNotificationCommand{ xid=null, internalId=0, topologyId=4, gtx=GlobalTransaction:<NodeAL-45757>:22325:local, cacheName=org.infinispan.iteration.DistributedEntryRetrieverTxTest} , mode=ASYNCHRONOUS, timeout=15000
> 10:52:18,133 DEBUG (testng-DistributedEntryRetrieverTxTest:) [CacheImpl] Stopping cache org.infinispan.iteration.DistributedEntryRetrieverTxTest on NodeAM-45518
> 10:52:18,133 TRACE (OOB-2,NodeAM-45518:) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: TxCompletionNotificationCommand{ xid=null, internalId=0, topologyId=4, gtx=GlobalTransaction:<NodeAL-45757>:22325:local, cacheName=org.infinispan.iteration.DistributedEntryRetrieverTxTest} [sender=NodeAL-45757]
> 10:52:18,133 TRACE (OOB-2,NodeAM-45518:) [GlobalInboundInvocationHandler] Silently ignoring that org.infinispan.iteration.DistributedEntryRetrieverTxTest cache is not defined
> 10:52:18,133 DEBUG (testng-DistributedEntryRetrieverTxTest:) [TransactionTable] Wait for on-going transactions to finish for 30 seconds.
> 10:52:48,139 WARN (testng-DistributedEntryRetrieverTxTest:) [TransactionTable] ISPN000100: Stopping, but there are 0 local transactions and 1 remote transactions that did not finish in time.
> 10:52:48,386 ERROR (testng-DistributedEntryRetrieverTxTest:) [UnitTestTestNGListener] Test verifyNodeLeavesBeforeGettingData(org.infinispan.iteration.DistributedEntryRetrieverTxTest) failed.
> java.lang.IllegalStateException: Thread already timed out waiting for event pre_send_response_released
> at org.infinispan.test.fwk.CheckPoint.trigger(CheckPoint.java:131)
> at org.infinispan.test.fwk.CheckPoint.trigger(CheckPoint.java:116)
> at org.infinispan.iteration.DistributedEntryRetrieverTest.verifyNodeLeavesBeforeGettingData(DistributedEntryRetrieverTest.java:105)
> {noformat}
[JBoss JIRA] (ISPN-6043) TransactionTable should ignore view changes during shutdown
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-6043?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-6043:
-----------------------------------------------
Sat6QE Jenkins <sat6-jenkins(a)redhat.com> changed the Status of [bug 1310583|https://bugzilla.redhat.com/show_bug.cgi?id=1310583] from POST to MODIFIED
> TransactionTable should ignore view changes during shutdown
> -----------------------------------------------------------
>
> Key: ISPN-6043
> URL: https://issues.jboss.org/browse/ISPN-6043
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 8.1.0.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 8.2.0.Beta1, 8.2.0.Final, 8.1.1.Final
>
>
> During shutdown, {{TransactionTable}} unregisters itself as a view change listener, but it can still receive view change notifications after it stopped the executor service. When that happens, it causes a {{RejectedExecutionException}} that is eventually logged by JGroups:
> {noformat}
> pbcast.GMS - JGRP000027: failed passing message up
> java.lang.RuntimeException: org.infinispan.commons.CacheListenerException: ISPN000280: Caught exception [java.util.concurrent.RejectedExecutionException] while invoking method [public void org.infinispan.transaction.TransactionTable.onViewChange(org.infinispan.notifications.cachemanagerlistener.event.ViewChangedEvent)] on listener instance: org.infinispan.transaction.TransactionTable@3d5ab0ba
> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:682)
> at org.jgroups.JChannel.up(JChannel.java:733)
> at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1029)
> at org.jgroups.protocols.RSVP.up(RSVP.java:201)
> at org.jgroups.protocols.FRAG2.up(FRAG2.java:165)
> at org.jgroups.protocols.FlowControl.up(FlowControl.java:394)
> at org.jgroups.protocols.pbcast.GMS.installView(GMS.java:732)
> at org.jgroups.protocols.pbcast.ParticipantGmsImpl.handleViewChange(ParticipantGmsImpl.java:146)
> at org.jgroups.protocols.pbcast.GMS.up(GMS.java:922)
> at org.jgroups.stack.Protocol.up(Protocol.java:412)
> at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:294)
> at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:474)
> at org.jgroups.protocols.pbcast.NAKACK2.deliverBatch(NAKACK2.java:982)
> at org.jgroups.protocols.pbcast.NAKACK2.removeAndPassUp(NAKACK2.java:912)
> at org.jgroups.protocols.pbcast.NAKACK2.handleMessage(NAKACK2.java:846)
> at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:618)
> at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:155)
> at org.jgroups.protocols.FD.up(FD.java:255)
> at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:297)
> at org.jgroups.protocols.MERGE3.up(MERGE3.java:288)
> at org.jgroups.protocols.Discovery.up(Discovery.java:291)
> at org.jgroups.protocols.TP.passMessageUp(TP.java:1572)
> at org.jgroups.protocols.TP$MyHandler.run(TP.java:1791)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:722)
> Caused by: org.infinispan.commons.CacheListenerException: ISPN000280: Caught exception [java.util.concurrent.RejectedExecutionException] while invoking method [public void org.infinispan.transaction.TransactionTable.onViewChange(org.infinispan.notifications.cachemanagerlistener.event.ViewChangedEvent)] on listener instance: org.infinispan.transaction.TransactionTable@3d5ab0ba
> at org.infinispan.notifications.AbstractListenerImpl$ListenerInvocationImpl$1.run(AbstractListenerImpl.java:287)
> at org.infinispan.util.concurrent.WithinThreadExecutor.execute(WithinThreadExecutor.java:22)
> at org.infinispan.notifications.AbstractListenerImpl$ListenerInvocationImpl.invoke(AbstractListenerImpl.java:305)
> at org.infinispan.notifications.cachemanagerlistener.CacheManagerNotifierImpl.notifyViewChange(CacheManagerNotifierImpl.java:88)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport$NotifyViewChange.emitNotification(JGroupsTransport.java:638)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.viewAccepted(JGroupsTransport.java:708)
> at org.jgroups.blocks.MessageDispatcher.handleUpEvent(MessageDispatcher.java:602)
> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:679)
> ... 25 more
> Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@1f5986a3 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@2e964769[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1696]
> at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
> at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325)
> at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:546)
> at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:646)
> at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:641)
> at org.infinispan.transaction.TransactionTable.onViewChange(TransactionTable.java:491)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at org.infinispan.notifications.AbstractListenerImpl$ListenerInvocationImpl$1.run(AbstractListenerImpl.java:282)
> ... 32 more
> {noformat}
> The exception is harmless for the stopping cache; the problem is that the view change listeners that follow it are also skipped. We should fix both {{TransactionTable}}, so that it no longer throws the exception, and {{CacheManagerNotifier}}, so that it ignores any exceptions thrown during view changes.
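A minimal sketch of the two fixes described above, assuming simplified stand-ins for the real classes: {{GuardedListener}} plays the role of a listener that tolerates a stopped executor, and {{notifyAll}} plays the role of a notifier that isolates failures. All names here are hypothetical and the real Infinispan code may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical stand-ins for TransactionTable and CacheManagerNotifier.
final class ViewChangeSketch {

    interface ViewListener {
        void onViewChange(int viewId);
    }

    // Listener that submits work to an executor but tolerates it being stopped.
    static final class GuardedListener implements ViewListener {
        private final ExecutorService executor;
        final List<Integer> skippedViews = new ArrayList<>();

        GuardedListener(ExecutorService executor) {
            this.executor = executor;
        }

        @Override
        public void onViewChange(int viewId) {
            try {
                executor.submit(() -> { /* clean up transactions of departed nodes */ });
            } catch (RejectedExecutionException e) {
                // The executor is already stopped, so there is nothing left to clean up.
                skippedViews.add(viewId);
            }
        }
    }

    // Notifies every listener; one failing listener does not skip the rest.
    static int notifyAll(List<ViewListener> listeners, int viewId) {
        int failures = 0;
        for (ViewListener listener : listeners) {
            try {
                listener.onViewChange(viewId);
            } catch (RuntimeException e) {
                failures++; // a real notifier would log the exception here
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.shutdownNow(); // simulate the stopped TransactionTable executor

        List<Integer> seen = new ArrayList<>();
        List<ViewListener> listeners = new ArrayList<>();
        listeners.add(viewId -> { throw new RejectedExecutionException("unguarded"); });
        listeners.add(new GuardedListener(executor));
        listeners.add(seen::add);

        int failures = notifyAll(listeners, 5);
        System.out.println(failures + " failure(s), last listener saw " + seen);
    }
}
```

Either guard alone would stop later listeners from being skipped in this model; applying both matches the proposal in the description.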
[JBoss JIRA] (ISPN-6276) Non-threadsafe use of HashSet in AdvancedAsyncCacheLoader
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-6276?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-6276:
-----------------------------------------------
Sat6QE Jenkins <sat6-jenkins(a)redhat.com> changed the Status of [bug 1312186|https://bugzilla.redhat.com/show_bug.cgi?id=1312186] from POST to MODIFIED
> Non-threadsafe use of HashSet in AdvancedAsyncCacheLoader
> ----------------------------------------------------------
>
> Key: ISPN-6276
> URL: https://issues.jboss.org/browse/ISPN-6276
> Project: Infinispan
> Issue Type: Bug
> Components: Loaders and Stores
> Affects Versions: 6.0.2.Final
> Reporter: Dennis Reed
> Assignee: Sebastian Łaskawiec
>
> org.infinispan.persistence.async.AdvancedAsyncCacheLoader#process creates a HashSet, and passes it to loadAllKeys().
> loadAllKeys() creates a task to get each key and add it to the HashSet.
> This task is run by org.infinispan.persistence.file.SingleFileStore#process, which runs it in multiple threads at once (one thread per key).
> There is no synchronization on that HashSet that is shared by the multiple threads.
> HashSet is not thread safe. One known side effect of unsynchronized access by multiple threads is an infinite loop, which has been observed here.
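One possible remedy, shown only as a sketch: collect the keys into a concurrent set instead of a plain HashSet. The {{collectKeys}} helper below is hypothetical and merely mimics the "one task per key" pattern from the description; it is not the {{AdvancedAsyncCacheLoader}} code, and the actual fix may use a different approach.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical reproduction of the key-collection pattern with a thread-safe set.
final class ConcurrentKeyCollectionSketch {

    // Collects keys 0..keyCount-1 from several worker threads into one shared set.
    static Set<Integer> collectKeys(int keyCount, int threads) {
        // ConcurrentHashMap.newKeySet() supports concurrent add(); HashSet does not.
        Set<Integer> keys = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < keyCount; i++) {
            final int key = i;
            pool.execute(() -> keys.add(key)); // one task per key, as in the report
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("key collection did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while collecting keys", e);
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(collectKeys(10_000, 8).size()); // prints 10000
    }
}
```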
[JBoss JIRA] (ISPN-3938) AdvancedAsyncCacheLoader.process() concurrency issues
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-3938?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-3938:
-----------------------------------------------
Sat6QE Jenkins <sat6-jenkins(a)redhat.com> changed the Status of [bug 1312186|https://bugzilla.redhat.com/show_bug.cgi?id=1312186] from POST to MODIFIED
> AdvancedAsyncCacheLoader.process() concurrency issues
> -----------------------------------------------------
>
> Key: ISPN-3938
> URL: https://issues.jboss.org/browse/ISPN-3938
> Project: Infinispan
> Issue Type: Bug
> Components: Core, Loaders and Stores
> Affects Versions: 6.0.0.Final
> Reporter: Dan Berindei
> Assignee: Sebastian Łaskawiec
> Fix For: 8.2.0.CR1
>
>
> {{AdvancedAsyncCacheLoader.process()}} calls {{advancedLoader().process()}} to collect all the keys in the store, but the HashSet used to collect the keys is not thread-safe. This can cause problems, e.g. during state transfer:
> {noformat}
> WARN cheTopologyControlCommand | ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=sessions, type=CH_UPDATE, sender=alfie-lt-46127, joinInfo=null, topologyId=3, currentCH=DefaultConsistentHash{numSegments=60, numOwners=1, members=[alfie-lt-46127]}, pendingCH=null, throwable=null, viewId=1}
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:926)
> at java.util.HashMap$KeyIterator.next(HashMap.java:960)
> at org.infinispan.persistence.async.AdvancedAsyncCacheLoader.process(AdvancedAsyncCacheLoader.java:80)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.processOnAllStores(PersistenceManagerImpl.java:414)
> at org.infinispan.statetransfer.StateConsumerImpl.invalidateSegments(StateConsumerImpl.java:910)
> at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:393)
> at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:178)
> at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:38)
> at org.infinispan.statetransfer.StateTransferManagerImpl$1.updateConsistentHash(StateTransferManagerImpl.java:100)
> at org.infinispan.topology.LocalTopologyManagerImpl.handleConsistentHashUpdate(LocalTopologyManagerImpl.java:191)
> at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:152)
> at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:124)
> at org.infinispan.topology.ClusterTopologyManagerImpl$3.run(ClusterTopologyManagerImpl.java:606)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
> {noformat}
[JBoss JIRA] (ISPN-5857) HotRod client should use the consistent hash in replicated mode
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5857?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-5857:
-----------------------------------------------
Sat6QE Jenkins <sat6-jenkins(a)redhat.com> changed the Status of [bug 1272389|https://bugzilla.redhat.com/show_bug.cgi?id=1272389] from POST to MODIFIED
> HotRod client should use the consistent hash in replicated mode
> ---------------------------------------------------------------
>
> Key: ISPN-5857
> URL: https://issues.jboss.org/browse/ISPN-5857
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 8.0.1.Final, 8.1.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Critical
> Labels: hotrod
> Fix For: 8.1.0.Alpha2, 8.1.0.Final
>
>
> The HotRod server assumes that in replicated mode the server accessed by the client doesn't matter, so it doesn't send a consistent hash in the topology updates.
> However, since replicated mode is now implemented on top of distributed mode, hitting the primary owner or another node makes a big difference. We should change the server to send the consistent hash for replicated caches as well.
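Why the consistent hash matters to the client can be sketched as follows. This is a simplified illustration under stated assumptions: the segment table layout, the {{primaryOwner}} helper, and the use of {{hashCode()}} are all hypothetical and do not reflect the Hot Rod client's real hash function or wire format.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical segment-based routing; not the Hot Rod client's actual code.
final class PrimaryOwnerRoutingSketch {

    // segmentOwners.get(s) lists the owners of segment s, primary owner first.
    static String primaryOwner(Object key, List<List<String>> segmentOwners) {
        // Assumption: hashCode() stands in for the real consistent hash function.
        int segment = Math.floorMod(key.hashCode(), segmentOwners.size());
        return segmentOwners.get(segment).get(0);
    }

    public static void main(String[] args) {
        // In replicated mode every node owns every segment,
        // but the primary owner still differs from segment to segment.
        List<List<String>> segmentOwners = Arrays.asList(
                Arrays.asList("A", "B"),
                Arrays.asList("B", "A"));
        System.out.println(primaryOwner(0, segmentOwners)); // prints A
        System.out.println(primaryOwner(1, segmentOwners)); // prints B
    }
}
```

Without the hash in the topology update, the client in this model has no way to tell which server is the primary owner for a given key.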