[JBoss JIRA] (ISPN-9762) Cache hangs during rebalancing
by Dan Berindei (Jira)
[ https://issues.jboss.org/browse/ISPN-9762?page=com.atlassian.jira.plugin.... ]
Dan Berindei reassigned ISPN-9762:
----------------------------------
Assignee: Ryan Emerson
> Cache hangs during rebalancing
> ------------------------------
>
> Key: ISPN-9762
> URL: https://issues.jboss.org/browse/ISPN-9762
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 9.4.2.Final
> Reporter: Sergey Chernolyas
> Assignee: Ryan Emerson
> Priority: Blocker
> Attachments: hang_node.txt, normal_node.txt, stat_bad_node.png, stat_good_node.png
>
>
> I have a cluster with two nodes. The first node starts without problems, but the second node hangs while rebalancing the DEVICES cache.
> Configuration of the cache:
> {code:xml}
> <distributed-cache name="DEVICES" owners="2" segments="256" mode="SYNC">
>   <state-transfer await-initial-transfer="true" enabled="true" timeout="2400000" chunk-size="2048"/>
>   <partition-handling when-split="ALLOW_READ_WRITES" merge-policy="PREFERRED_ALWAYS"/>
>   <memory>
>     <object size="300000" strategy="REMOVE"/>
>   </memory>
>   <rocksdb-store preload="true" path="/data/rocksdb/devices/data">
>     <expiration path="/data/rocksdb/devices/expired"/>
>   </rocksdb-store>
>   <indexing index="LOCAL">
>     <property name="default.indexmanager">org.infinispan.query.indexmanager.InfinispanIndexManager</property>
>     <property name="default.directory_provider">infinispan</property>
>     <property name="default.worker.execution">async</property>
>     <property name="default.index_flush_interval">500</property>
>     <property name="default.indexwriter.merge_factor">30</property>
>     <property name="default.indexwriter.merge_max_size">1024</property>
>     <property name="default.indexwriter.ram_buffer_size">256</property>
>     <property name="default.locking_cachename">LuceneIndexesLocking_devices</property>
>     <property name="default.data_cachename">LuceneIndexesData_devices</property>
>     <property name="default.metadata_cachename">LuceneIndexesMetadata_devices</property>
>   </indexing>
>   <expiration max-idle="172800000"/>
> </distributed-cache>
> {code}
> The cache contains 70 000 elements.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
7 years, 4 months
[JBoss JIRA] (ISPN-9701) TransactionTable does not shutdown gracefully
by Radoslav Husar (Jira)
[ https://issues.jboss.org/browse/ISPN-9701?page=com.atlassian.jira.plugin.... ]
Radoslav Husar updated ISPN-9701:
---------------------------------
Status: Resolved (was: Pull Request Sent)
Fix Version/s: 10.0.0.Alpha2, 9.4.3.Final
Resolution: Done
Merged into 10.x and 9.4.x.
9.3.x and 8.2.x are pending.
> TransactionTable does not shutdown gracefully
> ---------------------------------------------
>
> Key: ISPN-9701
> URL: https://issues.jboss.org/browse/ISPN-9701
> Project: Infinispan
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 9.2.4.Final, 9.3.5.Final, 9.4.1.Final
> Reporter: Paul Ferraro
> Assignee: Paul Ferraro
> Priority: Critical
> Fix For: 10.0.0.Alpha2, 9.4.3.Final
>
>
> Here's a sample stacktrace during shutdown:
> {noformat}
> 16:54:15,033 WARN [org.wildfly.clustering.web.undertow] (default task-1) ISPN000472: Cache manager is stopping: org.infinispan.IllegalLifecycleStateException: ISPN000472: Cache manager is stopping
> at org.infinispan.marshall.core.GlobalMarshaller.getExternalizer(GlobalMarshaller.java:420)
> at org.infinispan.marshall.core.GlobalMarshaller.writeNonNullableObject(GlobalMarshaller.java:400)
> at org.infinispan.marshall.core.GlobalMarshaller.writeNullableObject(GlobalMarshaller.java:355)
> at org.infinispan.marshall.core.GlobalMarshaller.writeObjectOutput(GlobalMarshaller.java:183)
> at org.infinispan.marshall.core.GlobalMarshaller.writeObjectOutput(GlobalMarshaller.java:176)
> at org.infinispan.marshall.core.GlobalMarshaller.objectToBuffer(GlobalMarshaller.java:305)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.marshallRequest(JGroupsTransport.java:1009)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.sendCommand(JGroupsTransport.java:1209)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.performAsyncRemoteInvocation(JGroupsTransport.java:1105)
> at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotelyAsync(JGroupsTransport.java:246)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotelyAsync(RpcManagerImpl.java:291)
> at org.infinispan.remoting.rpc.RpcManagerImpl.invokeRemotely(RpcManagerImpl.java:323)
> at org.infinispan.transaction.impl.TransactionTable.removeTransactionInfoRemotely(TransactionTable.java:900)
> at org.infinispan.transaction.impl.TransactionTable.releaseLocksForCompletedTransaction(TransactionTable.java:886)
> at org.infinispan.transaction.xa.XaTransactionTable.forgetSuccessfullyCompletedTransaction(XaTransactionTable.java:195)
> at org.infinispan.transaction.xa.XaTransactionTable.commit(XaTransactionTable.java:128)
> at org.infinispan.transaction.xa.TransactionXaAdapter.commit(TransactionXaAdapter.java:68)
> at org.infinispan.commons.tx.TransactionImpl.finishResource(TransactionImpl.java:419)
> at org.infinispan.commons.tx.TransactionImpl.commitResources(TransactionImpl.java:466)
> at org.infinispan.commons.tx.TransactionImpl.runCommit(TransactionImpl.java:335)
> at org.infinispan.commons.tx.TransactionImpl.commit(TransactionImpl.java:110)
> {noformat}
> The problem seems to be that shutDownGracefully() first waits for the localTransactions map to be empty. However, when the cache is clustered, releaseLocksForCompletedTransaction(...) removes the transaction from the localTransactions map *before* invoking removeTransactionInfoRemotely(...), which means that the subsequent TxCompletionNotificationCommand can fail to marshal (see above), or the transport might close before this command is sent.
> A naive fix would simply reorder the removeLocalTransaction(...) to happen after the call to removeTransactionInfoRemotely(...) within the releaseLocksForCompletedTransaction(...) method, but I'm sure there's more to it.
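The ordering problem described above can be illustrated with a minimal, single-threaded sketch. This is not Infinispan code: the class ShutdownOrderingSketch, its map, and the no-op sendRemoteNotification are hypothetical stand-ins for TransactionTable, localTransactions, and removeTransactionInfoRemotely(...), and shutdownMayProceed() assumes that graceful shutdown only waits for the map to drain, as the description suggests.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the ordering issue; none of these names are Infinispan API.
final class ShutdownOrderingSketch {
    // Stand-in for the localTransactions map that graceful shutdown waits on.
    private static final Map<String, String> localTransactions = new ConcurrentHashMap<>();

    // Assumption: shutdown is free to continue as soon as the map is empty.
    static boolean shutdownMayProceed() {
        return localTransactions.isEmpty();
    }

    // Placeholder for the remote completion notification; does nothing here.
    static void sendRemoteNotification(String txId) {
    }

    /**
     * Registers a transaction, then completes it in one of the two orders.
     * Returns true if shutdown could have proceeded while the remote
     * notification was still unsent.
     */
    static boolean releaseLocks(String txId, boolean removeLocalFirst) {
        localTransactions.put(txId, "completed");
        boolean shutdownWindowOpen;
        if (removeLocalFirst) {
            // Order described in the report: remove locally, then notify.
            localTransactions.remove(txId);
            shutdownWindowOpen = shutdownMayProceed();
            sendRemoteNotification(txId);
        } else {
            // Reordered variant: notify while the transaction is still registered.
            shutdownWindowOpen = shutdownMayProceed();
            sendRemoteNotification(txId);
            localTransactions.remove(txId);
        }
        return shutdownWindowOpen;
    }

    public static void main(String[] args) {
        System.out.println("remove first: " + releaseLocks("tx-1", true));   // prints "remove first: true"
        System.out.println("notify first: " + releaseLocks("tx-2", false));  // prints "notify first: false"
    }
}
```

With the local removal first, nothing holds shutdown back at the moment the notification is sent. The sketch only models the reordering that the reporter calls naive and says nothing about other side effects of it in the real code.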
[JBoss JIRA] (ISPN-9415) Client topology is not updated after cache becomes degraded
by Diego Lovison (Jira)
[ https://issues.jboss.org/browse/ISPN-9415?page=com.atlassian.jira.plugin.... ]
Diego Lovison updated ISPN-9415:
--------------------------------
Labels: (was: on-hold)
> Client topology is not updated after cache becomes degraded
> -----------------------------------------------------------
>
> Key: ISPN-9415
> URL: https://issues.jboss.org/browse/ISPN-9415
> Project: Infinispan
> Issue Type: Bug
> Components: Server
> Affects Versions: 9.4.0.Beta1, 9.3.1.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Major
> Fix For: 9.4.0.CR3
>
>
> When a new server is started, or after a merge, the other servers may see it as an owner in the consistent hash before they see its server address in the address cache ({{___hotRodTopologyCache}}). When a server needs to send a topology update but some of the servers are missing from the address cache, it can't send the full topology update, so it tries to send a "partial update" that excludes the missing servers from the segment owners. In order to send the full topology update once the address cache is populated, the partial topology update has to be sent with a smaller topology id, and that means it is only sent if {{serverTopologyId >= clientTopologyId + 2}}.
> When the cluster splits and the cache becomes degraded, the servers in the other partition are removed from the address cache, but the list of segment owners is not updated, and the topology id is only incremented by 1. The address cache is incomplete, but a partial update cannot be sent, so the client keeps the old topology and keeps trying to connect to the servers in the other partition.
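The decision described above can be sketched as a small function. This is a simplified model, not the server's actual code: the class TopologyUpdateSketch, the Update enum, and the decide method are hypothetical names, and the rule that no update is sent when the client is already at the server's topology id is an assumption. Only the {{serverTopologyId >= clientTopologyId + 2}} condition comes from the description.

```java
// Hypothetical model of the update decision; not the Hot Rod server implementation.
final class TopologyUpdateSketch {
    enum Update { NONE, PARTIAL, FULL }

    static Update decide(int serverTopologyId, int clientTopologyId, boolean addressCacheComplete) {
        // Assumption: a client that is already up to date gets no update.
        if (serverTopologyId <= clientTopologyId) {
            return Update.NONE;
        }
        // All owners have an address, so the full topology can be sent.
        if (addressCacheComplete) {
            return Update.FULL;
        }
        // A partial update carries a smaller topology id, so it is only
        // useful if the server is at least two ids ahead of the client.
        return serverTopologyId >= clientTopologyId + 2 ? Update.PARTIAL : Update.NONE;
    }

    public static void main(String[] args) {
        // Degraded-cache case from the report: id incremented by 1, address cache incomplete.
        System.out.println(decide(6, 5, false)); // prints "NONE"
        System.out.println(decide(7, 5, false)); // prints "PARTIAL"
        System.out.println(decide(6, 5, true));  // prints "FULL"
    }
}
```

The first call corresponds to the bug: the address cache is incomplete and the id gap is only 1, so the client receives nothing and keeps its old topology.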
[JBoss JIRA] (ISPN-9475) AbstractInfinispanTest mistakenly detects some test with params as duplicates
by Diego Lovison (Jira)
[ https://issues.jboss.org/browse/ISPN-9475?page=com.atlassian.jira.plugin.... ]
Diego Lovison closed ISPN-9475.
-------------------------------
> AbstractInfinispanTest mistakenly detects some test with params as duplicates
> -----------------------------------------------------------------------------
>
> Key: ISPN-9475
> URL: https://issues.jboss.org/browse/ISPN-9475
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.4.0.CR1
> Reporter: Adrian Nistor
> Assignee: Adrian Nistor
> Priority: Major
> Labels: on-hold
> Fix For: 9.4.0.CR3, 9.3.3.Final
>
>
> Tests with undeclared params will not have the test name properly generated, so AbstractInfinispanTest will see all tests created by a factory as having the same name and will try to mark them as failed. Unfortunately, the exception thrown in the method interceptor does not fail the test; it only causes the test to be ignored.
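The name collision can be illustrated with a small sketch. It does not reproduce the real AbstractInfinispanTest logic: the class TestNameSketch, the name format, and the helper methods are hypothetical, and the sketch assumes that a test name is derived only from the parameters a test declares.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model of name generation and duplicate detection.
final class TestNameSketch {
    // Builds "Class[k=v, ...]" from declared parameters; undeclared ones never reach this method.
    static String testName(String className, Map<String, String> declaredParams) {
        if (declaredParams.isEmpty()) {
            return className;
        }
        List<String> parts = new ArrayList<>();
        // TreeMap gives a stable, sorted parameter order.
        for (Map.Entry<String, String> e : new TreeMap<>(declaredParams).entrySet()) {
            parts.add(e.getKey() + "=" + e.getValue());
        }
        return className + "[" + String.join(", ", parts) + "]";
    }

    // Returns every name that occurs more than once.
    static Set<String> duplicates(List<String> names) {
        Set<String> seen = new HashSet<>();
        Set<String> dups = new TreeSet<>();
        for (String name : names) {
            if (!seen.add(name)) {
                dups.add(name);
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        // Two factory-created instances whose parameter is declared: distinct names.
        List<String> declared = Arrays.asList(
            testName("MyTest", Collections.singletonMap("cacheMode", "DIST")),
            testName("MyTest", Collections.singletonMap("cacheMode", "REPL")));
        // The same two instances with the parameter undeclared: identical names.
        List<String> undeclared = Arrays.asList(
            testName("MyTest", Collections.<String, String>emptyMap()),
            testName("MyTest", Collections.<String, String>emptyMap()));
        System.out.println(duplicates(declared));   // prints "[]"
        System.out.println(duplicates(undeclared)); // prints "[MyTest]"
    }
}
```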
[JBoss JIRA] (ISPN-9761) Cannot wire or start components while the registry is not running
by Radoslav Husar (Jira)
[ https://issues.jboss.org/browse/ISPN-9761?page=com.atlassian.jira.plugin.... ]
Radoslav Husar commented on ISPN-9761:
--------------------------------------
This is actually most likely caused by ISPN-9701.
> Cannot wire or start components while the registry is not running
> ------------------------------------------------------------------
>
> Key: ISPN-9761
> URL: https://issues.jboss.org/browse/ISPN-9761
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.4.1.Final
> Reporter: Radoslav Husar
> Priority: Major
>
> Looks like yet another clean shutdown issue surfaced after we changed WFLY-11324 to avoid some graceful shutdown bugs with the XA transaction table.
> {noformat}
> 16:28:38,874 ERROR [org.infinispan.commons.tx.TransactionImpl] (default task-2) ISPN000926: afterCompletion() failed for SynchronizationAdapter{localTransaction=LocalTransaction{remoteLockedNodes=[node-1, node-2], isMarkedForRollback=false, lockedKeys=[], backupKeyLocks=[SessionAccessMetaDataKey(2sYbjnFh2n-m_gkaOP2iz53e0ms4cbuARoeIdYJ5), SessionCreationMetaDataKey(2sYbjnFh2n-m_gkaOP2iz53e0ms4cbuARoeIdYJ5), SessionAttributesKey(2sYbjnFh2n-m_gkaOP2iz53e0ms4cbuARoeIdYJ5)], topologyId=5, stateTransferFlag=null} org.infinispan.transaction.synchronization.SyncLocalTransaction@72} org.infinispan.transaction.synchronization.SynchronizationAdapter@91: org.infinispan.IllegalLifecycleStateException: Cannot wire or start components while the registry is not running
> at org.infinispan.factories.impl.BasicComponentRegistryImpl.prepareWrapperChange(BasicComponentRegistryImpl.java:610)
> at org.infinispan.factories.impl.BasicComponentRegistryImpl.wireWrapper(BasicComponentRegistryImpl.java:158)
> at org.infinispan.factories.impl.BasicComponentRegistryImpl$ComponentWrapper.wire(BasicComponentRegistryImpl.java:736)
> at org.infinispan.factories.impl.BasicComponentRegistryImpl$ComponentWrapper.running(BasicComponentRegistryImpl.java:712)
> at org.infinispan.transaction.impl.TransactionCoordinator.commit(TransactionCoordinator.java:148)
> at org.infinispan.transaction.impl.TransactionTable.afterCompletion(TransactionTable.java:861)
> at org.infinispan.transaction.synchronization.SynchronizationAdapter.afterCompletion(SynchronizationAdapter.java:33)
> at org.infinispan.commons.tx.TransactionImpl.notifyAfterCompletion(TransactionImpl.java:506)
> at org.infinispan.commons.tx.TransactionImpl.runCommit(TransactionImpl.java:338)
> at org.infinispan.commons.tx.TransactionImpl.commit(TransactionImpl.java:110)
> at org.wildfly.clustering.ee.infinispan.InfinispanBatch.close(InfinispanBatch.java:97)
> at org.wildfly.clustering.web.undertow.session.DistributableSession.requestDone(DistributableSession.java:87)
> at io.undertow.servlet.spec.ServletContextImpl.updateSessionAccessTime(ServletContextImpl.java:945)
> at io.undertow.servlet.spec.HttpServletResponseImpl.responseDone(HttpServletResponseImpl.java:579)
> at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:346)
> at io.undertow.servlet.handlers.ServletInitialHandler.access$100(ServletInitialHandler.java:81)
> at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:138)
> at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:135)
> at io.undertow.servlet.core.ServletRequestContextThreadSetupAction$1.call(ServletRequestContextThreadSetupAction.java:48)
> at io.undertow.servlet.core.ContextClassLoaderSetupAction$1.call(ContextClassLoaderSetupAction.java:43)
> at org.wildfly.extension.undertow.security.SecurityContextThreadSetupAction.lambda$create$0(SecurityContextThreadSetupAction.java:105)
> at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
> at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
> at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
> at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
> at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
> at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:272)
> at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:81)
> at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:104)
> at io.undertow.server.Connectors.executeRootHandler(Connectors.java:360)
> at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
> at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
> at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1985)
> at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1487)
> at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1378)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}