[infinispan-issues] [JBoss JIRA] (ISPN-9044) In Cluster - Infinispan - SingleFileStore - fetchPersistentState/StateTransfer not transferring complete data to Joining Node
Debashish Bharali (JIRA)
issues at jboss.org
Mon Apr 9 03:59:00 EDT 2018
[ https://issues.jboss.org/browse/ISPN-9044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557814#comment-13557814 ]
Debashish Bharali edited comment on ISPN-9044 at 4/9/18 3:58 AM:
-----------------------------------------------------------------
{color:red}*---ReplStateTransferCacheLoaderTest.java---*{color}
@Test(groups = "functional", testName = "statetransfer.ReplStateTransferCacheLoaderTest")
@CleanupAfterMethod
public class ReplStateTransferCacheLoaderTest extends MultipleCacheManagersTest implements Serializable {

    private static final long serialVersionUID = 1L;
    private static final Log log = LogFactory.getLog(ReplStateTransferCacheLoaderTest.class);

    private File tmpDir;
    private ConfigurationBuilder builder;

    @Override
    protected void createCacheManagers() {
        tmpDir = new File(TestingUtil.tmpDirectory(this.getClass()));
        Util.recursiveFileRemove(tmpDir);

        // reproduce the MODE-1754 config as closely as possible
        // Settings emphasised in this edit of the comment: transactionMode, lockingMode, maxEntries,
        // strategy, concurrencyLevel, replTimeout, stateTransfer().timeout, fetchInMemoryState,
        // awaitInitialTransfer, fetchPersistentState (plus testStateTransfer and numKeys below).
        builder = getDefaultClusteredCacheConfig(CacheMode.REPL_SYNC, true, true);
        builder.transaction().transactionMode(TransactionMode.NON_TRANSACTIONAL).lockingMode(LockingMode.OPTIMISTIC)
                .transactionManagerLookup(new DummyTransactionManagerLookup())
            .eviction().maxEntries(100).strategy(EvictionStrategy.LRU)
            .locking().lockAcquisitionTimeout(20000)
                .concurrencyLevel(1000) // lowering this to 50 makes the test pass also on 5.2 but it's just a temporary workaround
                .useLockStriping(false).writeSkewCheck(false).isolationLevel(IsolationLevel.READ_COMMITTED)
            .dataContainer().storeAsBinary()
            .clustering().sync().replTimeout(120000)
                .stateTransfer().timeout(480000).fetchInMemoryState(true).chunkSize(10000).awaitInitialTransfer(true)
            .persistence().passivation(false)
                .addSingleFileStore().location(new File(tmpDir, "store0").getAbsolutePath()).shared(false).preload(false)
                .fetchPersistentState(true)
                .ignoreModifications(false)
                .purgeOnStartup(false);

        createCluster(builder, 1);
        waitForClusterToForm();
    }

    @AfterClass
    protected void clearTempDir() {
        // Util.recursiveFileRemove(tmpDir);
    }

    public void testStateTransfer() throws Exception {
        final Long numKeys = 100000L;

        // Populate the first node.
        for (Long i = 0L; i < numKeys; i++) {
            TestEntity testEntity = new TestEntity(i, "DEBA_" + i);
            cache(0).put(i, testEntity);
        }
        log.info("Finished putting keys");
        System.out.println("Debashish -- " + "Finished putting keys");

        // Every key must be readable on the first node.
        for (Long i = 0L; i < numKeys; i++) {
            assertEquals(i, ((TestEntity) cache(0).get(i)).getId());
        }

        log.info("Adding a new node ..");
        System.out.println("Debashish -- " + "Adding a new node ..");

        builder.persistence().clearStores()
            .addSingleFileStore().location(new File(tmpDir, "store1").getAbsolutePath()) // make sure this node writes in a different location
            .fetchPersistentState(true)
            .ignoreModifications(false)
            .purgeOnStartup(false);

        addClusterEnabledCacheManager(builder);
        log.info("Added a new node");
        System.out.println("Debashish -- " + "Added a new node");

        // Every key must also be readable on the joining node.
        for (Long i = 0L; i < numKeys; i++) {
            assertEquals(i, ((TestEntity) cache(1).get(i)).getId());
            // assertEquals(i, cache(1).get(i)); // some keys are lost in 5.2
        }
    }
}
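The final loop of the test stops at the first key the joining node cannot return, so it does not show how much data is missing. A minimal sketch of a helper that collects every missing id instead is given below. `StateTransferCheck` and `missingIds` are hypothetical names, not part of Infinispan or its test framework, and the plain `HashMap` in `main` only stands in for the joiner's cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical helper, not part of Infinispan: lists the ids a node cannot resolve.
final class StateTransferCheck {

    private StateTransferCheck() {
    }

    /**
     * Probes ids 0 (inclusive) to numKeys (exclusive) through the given lookup
     * and returns, in ascending order, those for which the lookup yields null.
     */
    static List<Long> missingIds(long numKeys, LongFunction<?> lookup) {
        List<Long> missing = new ArrayList<>();
        for (long id = 0; id < numKeys; id++) {
            if (lookup.apply(id) == null) {
                missing.add(id);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stand-in for the joining node: every third id was "lost".
        Map<Long, String> joiner = new HashMap<>();
        for (long id = 0; id < 10; id++) {
            if (id % 3 != 0) {
                joiner.put(id, "DEBA_" + id);
            }
        }
        // Prints: missing on joiner: [0, 3, 6, 9]
        System.out.println("missing on joiner: " + missingIds(10, id -> joiner.get(id)));
    }
}
```

Inside the test above, the same helper could presumably be called as `missingIds(numKeys, i -> cache(1).get(i))`, followed by a single assertion that the returned list is empty.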
> In Cluster - Infinispan - SingleFileStore - fetchPersistentState/StateTransfer not transferring complete data to Joining Node
> -----------------------------------------------------------------------------------------------------------------------------
>
> Key: ISPN-9044
> URL: https://issues.jboss.org/browse/ISPN-9044
> Project: Infinispan
> Issue Type: Bug
> Components: Lucene Directory
> Affects Versions: 8.2.5.Final
> Reporter: Debashish Bharali
> Priority: Critical
> Attachments: neutrino-hibernatesearch-infinispan.xml
>
>
> Infinispan - SingleFileStore - fetchPersistentState/StateTransfer not transferring complete data to Joining Node.
> Related to ISPN-8980 (https://issues.jboss.org/browse/ISPN-8980).
> We are using Hibernate Search Indexes - Lucene indexes being stored on Infinispan with SingleFileStore.
> With more than one node (for example, 4 nodes), we observe the behaviour below.
> Steps:
> # We start the first node *'N1'* in maintenance mode, with MassIndexer creating the initial indexes.
> # After all the MassIndexer/EntityLoader threads end (after 1-2 hours), i.e. once MassIndexing has completed, we start the other 3 nodes *'N2', 'N3' and 'N4'* without MassIndexer.
> # Under moderate to heavy application usage (concurrency), we again get the same exception: *Exception occurred java.io.FileNotFoundException: Error loading metadata for index file. This indicates that {color:red}some entries are not present in the cache.{color}*
> # *But this exception occurs only on the other 3 nodes (N2, N3 and N4), not on the first node N1.*
> # On checking the sizes of the cache stores on all nodes, the 3 nodes (N2, N3 and N4) have almost equal sizes (600 MB), which is 50%-70% of the size of N1's cache stores (1.2 GB).
> # We have repeated these steps multiple times, moved MassIndexing to each of the other 3 nodes in turn, and even reduced the number of nodes to 2.
> # *But the behaviour is exactly the same, i.e. the exception occurs on all nodes except the initial node doing MassIndexing.*
> # {color:red}It seems that *'N1's* cache store's persistent state is not being fetched by *'N2', 'N3' and 'N4'* when these nodes join.{color}
> # This is indicated by the fact that the FileNotFoundException does not occur on 'N1'; it occurs only on the nodes that joined later (N2, N3 and N4), and the *'.DAT'* files of their cache stores are smaller than *'N1's*.
> Urgent support is required.
> Attaching the corresponding Infinispan config file (neutrino-hibernatesearch-infinispan.xml).
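The store-size comparison in the steps above (joiners at roughly 600 MB against 1.2 GB on N1) can be scripted rather than checked by hand. The following is only a sketch under stated assumptions: `StoreSizeCheck`, `datBytes` and `ratio` are hypothetical names, the two directories are supplied by the caller, and the only thing taken from the report is that the store files end in `.dat` (matched case-insensitively here).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.stream.Stream;

// Hypothetical diagnostic, not part of Infinispan: compares on-disk store sizes.
final class StoreSizeCheck {

    private StoreSizeCheck() {
    }

    /** Sums the sizes in bytes of all regular files ending in ".dat" under storeDir. */
    static long datBytes(Path storeDir) throws IOException {
        try (Stream<Path> files = Files.walk(storeDir)) {
            return files
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().toLowerCase(Locale.ROOT).endsWith(".dat"))
                .mapToLong(p -> {
                    try {
                        return Files.size(p);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                })
                .sum();
        }
    }

    /** Joiner size as a fraction of the source size, e.g. 0.5 for 600 MB against 1.2 GB. */
    static double ratio(long joinerBytes, long sourceBytes) {
        if (sourceBytes <= 0) {
            throw new IllegalArgumentException("source store is empty or missing");
        }
        return (double) joinerBytes / sourceBytes;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: StoreSizeCheck <source-store-dir> <joiner-store-dir>");
            return;
        }
        long source = datBytes(Paths.get(args[0]));
        long joiner = datBytes(Paths.get(args[1]));
        System.out.printf(Locale.ROOT, "source=%d bytes, joiner=%d bytes, ratio=%.2f%n",
            source, joiner, ratio(joiner, source));
    }
}
```

A smaller file on a joiner is only a hint, not proof of lost entries, since on-disk size also depends on how each store was written; a per-key check on the joining node is the stronger test.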
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)