[infinispan-issues] [JBoss JIRA] (ISPN-9762) Cache hangs during rebalancing

Thu Nov 22 07:52:00 EST 2018

    [ https://issues.jboss.org/browse/ISPN-9762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665147#comment-13665147 ] 

Sergey Chernolyas commented on ISPN-9762:
-----------------------------------------

2018-11-22 15:50:23,671 WARN  [org.infinispan.remoting.inboundhandler.NonTotalOrderPerCacheInboundInvocationHandler] (remote-thread--p2-t7) ISPN000071: Caught exception when handling command StateResponseCommand{cache=DEVICES, pushTransfer=false, stateChunks=[StateChunk{segmentId=0, cacheEntries=10, isLastChunk=false}, StateChunk{segmentId=1, cacheEntries=7, isLastChunk=false}, 
 org.infinispan.util.concurrent.TimeoutException: Timed out applying state
        at org.infinispan.statetransfer.StateConsumerImpl.applyState(StateConsumerImpl.java:583)
        at org.infinispan.statetransfer.StateResponseCommand.invokeAsync(StateResponseCommand.java:88)
        at org.infinispan.remoting.inboundhandler.BasePerCacheInboundInvocationHandler.invokeCommand(BasePerCacheInboundInvocationHandler.java:117)
        at org.infinispan.remoting.inboundhandler.BaseBlockingRunnable.invoke(BaseBlockingRunnable.java:99)
        at org.infinispan.remoting.inboundhandler.BaseBlockingRunnable.runAsync(BaseBlockingRunnable.java:71)
        at org.infinispan.remoting.inboundhandler.BaseBlockingRunnable.run(BaseBlockingRunnable.java:40)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)


> Cache hangs during rebalancing
> ------------------------------
>
>                 Key: ISPN-9762
>                 URL: https://issues.jboss.org/browse/ISPN-9762
>             Project: Infinispan
>          Issue Type: Bug
>    Affects Versions: 9.4.2.Final
>            Reporter: Sergey Chernolyas
>            Priority: Blocker
>         Attachments: hang_node.txt, normal_node.txt, stat_bad_node.png, stat_good_node.png
>
>
> I have a cluster with two nodes. The first node starts without problems. The second node hangs while rebalancing the DEVICES cache.
> Cache configuration:
> {code:xml}
> <distributed-cache name="DEVICES" owners="2" segments="256" mode="SYNC">
>   <state-transfer await-initial-transfer="true" enabled="true" timeout="2400000" chunk-size="2048"/>
>   <partition-handling when-split="ALLOW_READ_WRITES" merge-policy="PREFERRED_ALWAYS"/>
>   <memory>
>     <object size="300000" strategy="REMOVE"/>
>   </memory>
>   <rocksdb-store preload="true" path="/data/rocksdb/devices/data">
>     <expiration path="/data/rocksdb/devices/expired"/>
>   </rocksdb-store>
>   <indexing index="LOCAL">
>     <property name="default.indexmanager">org.infinispan.query.indexmanager.InfinispanIndexManager</property>
>     <property name="default.directory_provider">infinispan</property>
>     <property name="default.worker.execution">async</property>
>     <property name="default.index_flush_interval">500</property>
>     <property name="default.indexwriter.merge_factor">30</property>
>     <property name="default.indexwriter.merge_max_size">1024</property>
>     <property name="default.indexwriter.ram_buffer_size">256</property>
>     <property name="default.locking_cachename">LuceneIndexesLocking_devices</property>
>     <property name="default.data_cachename">LuceneIndexesData_devices</property>
>     <property name="default.metadata_cachename">LuceneIndexesMetadata_devices</property>
>   </indexing>
>   <expiration max-idle="172800000"/>
> </distributed-cache>
> {code}
> The cache contains 70,000 entries.

