[infinispan-issues] [JBoss JIRA] (ISPN-11099) Improve rebalance phase time spent during the conflict resolution

Tue Dec 17 14:06:00 EST 2019

    [ https://issues.redhat.com/browse/ISPN-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936525#comment-13936525 ] 

Diego Lovison commented on ISPN-11099:
--------------------------------------

Once this issue is fixed, we could rerun the scalability tests.

> Improve rebalance phase time spent during the conflict resolution
> -----------------------------------------------------------------
>
>                 Key: ISPN-11099
>                 URL: https://issues.redhat.com/browse/ISPN-11099
>             Project: Infinispan
>          Issue Type: Enhancement
>          Components: Core
>    Affects Versions: 10.1.0.CR1
>            Reporter: Diego Lovison
>            Priority: Critical
>         Attachments: log.txt
>
>
> Given a synchronous replicated cache with 3 nodes:
> Introduce a network partition
> Change the cache value
> Fix the network partition
> Check the new cache value
> reproducer: https://github.com/diegolovison/jgroups-chaos
> command: mvn test -Dtest=InfinispanPartitionHandlingTest -Dlog4j.configurationFile=log4j2-ispn-conflict-manager.xml
> config:
> {code:java}
>          cacheConfigurationBuilder.clustering().cacheMode(CacheMode.REPL_SYNC)
>             .partitionHandling().mergePolicy(MergePolicy.REMOVE_ALL);
> {code}
> logs:
> {noformat}
> 15:44:31,827 INFO  (stateTransferExecutor-thread--p22-t1) [CLUSTER] [Context=org.infinispan.CONFIG] ISPN100002: Starting rebalance with members [Diegos-MacBook-Pro-40446, Diegos-MacBook-Pro-6797, Diegos-MacBook-Pro-52614], phase READ_OLD_WRITE_ALL, topology id 14
> 15:44:56,725 TRACE (stateTransferExecutor-thread--p22-t1) [StateReceiverImpl] Cache fooCache attempting to receive replicas for segment 83 from [Diegos-MacBook-Pro-40446, Diegos-MacBook-Pro-6797, Diegos-MacBook-Pro-52614] with topologyId=12, timeout=215112
> {noformat}
> expected result: with only one key and the whole cluster running on the same machine, the test should not take more than 10 seconds to finish. I think that we can improve the business rules to achieve that.
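
The two quoted log lines already show how far the run is from the 10-second expectation. As a minimal sketch, the gap between them can be computed from their timestamps. It assumes both lines were logged on the same day and use the `HH:mm:ss,SSS` layout seen above; the class and method names are hypothetical and not part of Infinispan or the reproducer.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

public class RebalanceGap {
    // Timestamp layout of the quoted log lines, e.g. 15:44:31,827
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    // Milliseconds elapsed between two same-day log timestamps.
    public static long elapsedMillis(String start, String end) {
        LocalTime from = LocalTime.parse(start, LOG_TIME);
        LocalTime to = LocalTime.parse(end, LOG_TIME);
        return Duration.between(from, to).toMillis();
    }

    public static void main(String[] args) {
        // "Starting rebalance" line vs. the later StateReceiverImpl trace line
        long gap = elapsedMillis("15:44:31,827", "15:44:56,725");
        System.out.println("gap between the two log lines: " + gap + " ms");
    }
}
```

For the timestamps above this prints a gap of 24898 ms, so roughly 25 seconds pass between the start of the rebalance and the request for segment 83.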


--
This message was sent by Atlassian Jira
(v7.13.8#713008)