[
https://issues.jboss.org/browse/ISPN-9044?page=com.atlassian.jira.plugin....
]
Debashish Bharali commented on ISPN-9044:
-----------------------------------------
[~dan.berindei] I have done multiple iterations of the suggested test *successfully*.
Corresponding files are attached.
We did not face any issue during state transfer.
We tried with 100000 objects, with a total CacheStoreFile size of approx. 1.1 GB.
We did hit an issue when we increased the individual object size, but in that case it
failed with a proper StateTransferTimeOut exception.
From these initial tests, it seems the issue is not related to StateTransfer or to
fetching the persistent state of the cache store for normal objects.
*But somehow, we are facing this issue when storing Lucene indexes.*
Please comment.
In Cluster - Infinispan - SingleFileStore -
fetchPersistentState/StateTransfer not transferring complete data to Joining Node
-----------------------------------------------------------------------------------------------------------------------------
Key: ISPN-9044
URL:
https://issues.jboss.org/browse/ISPN-9044
Project: Infinispan
Issue Type: Bug
Components: Lucene Directory
Affects Versions: 8.2.5.Final
Reporter: Debashish Bharali
Priority: Critical
Attachments: neutrino-hibernatesearch-infinispan.xml
Infinispan - SingleFileStore - fetchPersistentState/StateTransfer not transferring
complete data to Joining Node.
Related to ISPN-8980 (https://issues.jboss.org/browse/ISPN-8980).
We are using Hibernate Search, with its Lucene indexes stored on Infinispan with a
SingleFileStore.
With more than one node (for example, 4 nodes), we observe the behaviour below.
The steps are:
# We start up the first node *'N1'* in maintenance mode, with the MassIndexer
creating the initial indexes.
# After all the MassIndexer/EntityLoader threads end (after 1-2 hours), i.e. once
MassIndexing has completed, we start up the other 3 nodes *'N2', 'N3'
and 'N4'*, without the MassIndexer.
# Under moderate to heavy application usage (concurrency), we again get the same
exception: *Exception occurred java.io.FileNotFoundException: Error loading metadata for
index file.* This indicates that {color:red}some entries are not present in the cache.{color}
# *But this exception occurs only on the other 3 nodes (N2, N3 and N4), not on the first
node N1.*
# On checking the sizes of the cache stores on all the nodes, the 3 nodes (N2, N3 and N4)
have almost equal sizes (600 MB), which is 50%-70% of the size of the cache store of N1
(1.2 GB).
# We have repeated these steps multiple times. We even moved MassIndexing to each of the
other 3 nodes in turn, and reduced the number of nodes to 2.
# *But the behaviour is exactly the same, i.e. the exception occurs on all the nodes except
the initial node doing the MassIndexing.*
# {color:red}It seems that the persistent state of *'N1'*'s cache store is not
being fetched by *'N2', 'N3' and 'N4'* when these nodes join.{color}
# This is indicated by the fact that the FileNotFoundException does not occur on
'N1'. It occurs only on the nodes that joined later (N2, N3 and N4), and the
sizes of their cache stores' *'.DAT'* files are smaller than *'N1'*'s.
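Comparing '.DAT' file sizes, as in the steps above, only hints that entries are missing on the joining nodes. A minimal sketch of a more direct check, assuming the key sets of N1 and of a joiner have already been collected by some means (how they are obtained is not shown here). The class {{CacheKeyDiff}}, the method {{missingOnJoiner}} and the sample key names are hypothetical and are not part of Infinispan or Hibernate Search:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not an Infinispan or Hibernate Search API.
class CacheKeyDiff {

    // Returns the keys present on the reference node but absent on the joiner, in sorted order.
    static Set<String> missingOnJoiner(Set<String> referenceKeys, Set<String> joinerKeys) {
        Set<String> missing = new TreeSet<>(referenceKeys);
        missing.removeAll(joinerKeys);
        return missing;
    }

    public static void main(String[] args) {
        // Illustrative key names only; the real keys of a Lucene directory cache look different.
        Set<String> n1Keys = new HashSet<>(Arrays.asList("segments_1", "_0.cfs", "_0.si"));
        Set<String> n2Keys = new HashSet<>(Arrays.asList("segments_1", "_0.si"));
        System.out.println("Missing on N2: " + missingOnJoiner(n1Keys, n2Keys));
    }
}
```

A non-empty result for a joiner, taken right after it has finished joining and before any application load, would point at state transfer rather than at later index writes.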
We require urgent support.
Attaching the corresponding Infinispan config file
(neutrino-hibernatesearch-infinispan.xml).
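For readers without access to the attachment, here is a rough programmatic sketch of the settings this report revolves around: a SingleFileStore with fetchPersistentState enabled, plus the state transfer timeout. It is not a reproduction of the attached XML file. The class name, the cache mode, the timeout value and the store location are all assumptions made for illustration, and it needs infinispan-core on the classpath:

```java
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

// Hypothetical class for illustration; the values below are assumptions, not the reporter's settings.
class JoinerStoreConfigSketch {

    static Configuration build(String storeLocation) {
        ConfigurationBuilder builder = new ConfigurationBuilder();

        builder.clustering()
               .cacheMode(CacheMode.REPL_SYNC)   // assumption: the real cache mode is in the attached XML
               .stateTransfer()
                   .fetchInMemoryState(true)     // joiners request in-memory entries from existing members
                   .timeout(240_000L);           // milliseconds; arbitrary value chosen for this sketch

        builder.persistence()
               .addSingleFileStore()
                   .location(storeLocation)      // directory that holds the store's .dat file
                   .fetchPersistentState(true);  // joiners also request entries held only in the store

        return builder.build();
    }

    public static void main(String[] args) {
        Configuration cfg = build("/tmp/index-store"); // hypothetical path
        System.out.println("State transfer timeout (ms): " + cfg.clustering().stateTransfer().timeout());
    }
}
```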
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)