[JBoss JIRA] (ISPN-9028) Write-only segments should be invalidated during the READ_NEW phase
by Dan Berindei (JIRA)
Dan Berindei created ISPN-9028:
Summary: Write-only segments should be invalidated during the READ_NEW phase
Key: ISPN-9028
URL: https://issues.jboss.org/browse/ISPN-9028
Project: Infinispan
Issue Type: Bug
Components: Core
Affects Versions: 9.2.1.Final
Reporter: Dan Berindei
Fix For: 9.3.0.Alpha1
When a rebalance removes a segment X from node A, node A keeps updating entries in segment X until the rebalance finishes, and only deletes the entries of segment X after entering the NO_REBALANCE phase.
This is problematic for tests that work with the data container directly, because {{waitForNoRebalance()}} doesn't wait for the removal of stale entries. The test will work without an explicit wait most of the time, so this is a recipe for random test failures (e.g. ISPN-8728).
As described in ISPN-5021, we can prevent any writes to segment X at the start of the READ_NEW_WRITE_ALL phase, send the phase confirmation to the coordinator, and then remove the entries asynchronously. We just need to keep track of the removal task and only install/confirm the NO_REBALANCE phase once all the entries that we don't own have been removed.
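The tracking described above could look roughly like the following sketch. It is only an illustration under stated assumptions: {{StaleSegmentTracker}} and all of its methods are hypothetical names, not Infinispan classes, the data container is modelled as a plain map, and in-flight writes that race with the removal task are ignored.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not an Infinispan class: tracks the asynchronous
// removal of segments that this node no longer owns.
class StaleSegmentTracker {
    // segment -> (key -> value); stands in for the data container
    private final Map<Integer, Map<String, String>> container = new ConcurrentHashMap<>();
    private final Set<Integer> writeBlocked = ConcurrentHashMap.newKeySet();
    private volatile CompletableFuture<Void> removal = CompletableFuture.completedFuture(null);

    // Returns false when the segment has already been invalidated on this node.
    boolean write(int segment, String key, String value) {
        if (writeBlocked.contains(segment)) {
            return false;
        }
        container.computeIfAbsent(segment, s -> new ConcurrentHashMap<>()).put(key, value);
        return true;
    }

    // Start of READ_NEW_WRITE_ALL: block writes first, then remove the entries
    // in the background so the phase can be confirmed without waiting.
    void onReadNewWriteAll(Set<Integer> removedSegments) {
        writeBlocked.addAll(removedSegments);
        removal = CompletableFuture.runAsync(() -> removedSegments.forEach(container::remove));
    }

    // NO_REBALANCE is confirmed only after the removal task has finished.
    CompletableFuture<String> confirmNoRebalance() {
        return removal.thenApply(ignored -> "NO_REBALANCE");
    }

    int size() {
        return container.values().stream().mapToInt(Map::size).sum();
    }

    public static void main(String[] args) {
        StaleSegmentTracker tracker = new StaleSegmentTracker();
        tracker.write(1, "a", "x");
        tracker.write(2, "b", "y");
        tracker.onReadNewWriteAll(Set.of(2));
        System.out.println(tracker.confirmNoRebalance().join() + ", entries left: " + tracker.size());
    }
}
```

With this shape, a test that waits for the NO_REBALANCE confirmation would no longer see stale entries in the container.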
This message was sent by Atlassian JIRA
6 years, 10 months
[JBoss JIRA] (ISPN-9027) Distinguishing different Cache Store configurations is impossible
by Sebastian Łaskawiec (JIRA)
[ https://issues.jboss.org/browse/ISPN-9027?page=com.atlassian.jira.plugin.... ]
Sebastian Łaskawiec updated ISPN-9027:
Attachment: clustered.xml
> Distinguishing different Cache Store configurations is impossible
> -----------------------------------------------------------------
> Key: ISPN-9027
> URL: https://issues.jboss.org/browse/ISPN-9027
> Project: Infinispan
> Issue Type: Bug
> Components: Loaders and Stores
> Affects Versions: 9.2.1.Final
> Reporter: Sebastian Łaskawiec
> Attachments: clustered.xml
> h3. Problem description
> One of our users reported that when using two Deployed Cache Stores, their configurations get overridden and both behave as if they had the same configuration. Here's an example:
> {code}
> <distributed-cache name="cassandracache1" owners="2" segments="256" mode="SYNC">
> <store name="store1" class="org.infinispan.persistence.cassandra.CassandraStore" preload="true">
> <property name="autoCreateKeyspace">
> true
> </property>
> <property name="keyspace">
> Infinispan
> </property>
> <property name="entryTable">
> InfinispanEntries
> </property>
> <property name="servers">
> [9042]
> </property>
> </store>
> </distributed-cache>
> <distributed-cache name="cassandracache2" owners="2" segments="256" mode="SYNC">
> <store name="store2" class="org.infinispan.persistence.cassandra.CassandraStore" preload="true">
> <property name="autoCreateKeyspace">
> true
> </property>
> <property name="keyspace">
> Infinispan
> </property>
> <property name="entryTable">
> InfinispanEntries1
> </property>
> <property name="servers">
> [9042]
> </property>
> </store>
> </distributed-cache>
> {code}
> Both caches ({{cassandracache1}} and {{cassandracache2}}) use the same {{entryTable}} which is set to {{InfinispanEntries1}}.
> h3. Investigation
> {{CacheStoreFactory}} implementations were created to fabricate Loader/Writer instances based on the parsed configuration (via the {{createInstance}} method). This method receives an instance of {{AbstractStoreConfiguration}} from {{PersistenceManagerImpl}} (here's an example (1)) twice, once for each parsed configuration. The parsing part seems OK, but we do not parse the Cache Store name, which makes it impossible to tell the two configurations apart.
> h3. Proposed fix
> * Add {{name}} attribute to {{StoreConfiguration}}.
> * Either add an explicit parameter to {{CacheStoreFactory#createInstance(StoreConfiguration cfg, String cacheStoreName)}} or scan for the Cache Store name in both implementations ({{DeployedCacheStoreFactory}} and {{LocalClassLoaderCacheStoreFactory}}).
> {code}
> (1) AbstractStoreConfiguration [attributes=DeployedStoreConfiguration = [fetchPersistentState=false, purgeOnStartup=false, ignoreModifications=false, preload=true, shared=false, transactional=false, maxBatchSize=100, properties={keyspace=Infinispan, connectionPool.poolTimeoutMillis=5, entryTable=InfinispanEntries, connectionPool.idleTimeoutSeconds=120, connectionPool.heartbeatIntervalSeconds=30, autoCreateKeyspace=true, servers=[9042]}, customStoreClassName=org.infinispan.persistence.cassandra.CassandraStore], async=AsyncStoreConfiguration [attributes=AsyncStoreConfiguration = [enabled=false, modificationQueueSize=1024, threadPoolSize=1]], singletonStore=SingletonStoreConfiguration [attributes=SingletonStoreConfiguration = [enabled=false, push-state-timeout=10000, push-state-when-coordinator=true]]]
> {code}
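To illustrate why a store name helps, here is a minimal sketch. The ticket does not say how the override actually happens, so this assumes, purely for illustration, a lookup keyed by the store class name; {{StoreConfig}} and {{StoreConfigRegistry}} are hypothetical types, not Infinispan classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a parsed store configuration; not an Infinispan class.
final class StoreConfig {
    final String name;        // the proposed new attribute, e.g. "store1"
    final String storeClass;  // identical for both stores in the report
    final String entryTable;

    StoreConfig(String name, String storeClass, String entryTable) {
        this.name = name;
        this.storeClass = storeClass;
        this.entryTable = entryTable;
    }
}

// Hypothetical registry that keeps the same configurations under two different keys.
class StoreConfigRegistry {
    private final Map<String, StoreConfig> byClass = new HashMap<>();
    private final Map<String, StoreConfig> byName = new HashMap<>();

    void register(StoreConfig config) {
        byClass.put(config.storeClass, config); // a later store overwrites an earlier one
        byName.put(config.name, config);        // each store keeps its own entry
    }

    String entryTableByClass(String storeClass) {
        return byClass.get(storeClass).entryTable;
    }

    String entryTableByName(String name) {
        return byName.get(name).entryTable;
    }

    public static void main(String[] args) {
        String cls = "org.infinispan.persistence.cassandra.CassandraStore";
        StoreConfigRegistry registry = new StoreConfigRegistry();
        registry.register(new StoreConfig("store1", cls, "InfinispanEntries"));
        registry.register(new StoreConfig("store2", cls, "InfinispanEntries1"));
        System.out.println(registry.entryTableByClass(cls));      // the last registered table
        System.out.println(registry.entryTableByName("store1"));  // the table of store1
    }
}
```

Under that assumption, both caches end up with {{InfinispanEntries1}} when the class name is the only key, while a per-store name keeps the two configurations distinct.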
[JBoss JIRA] (ISPN-9027) Distinguishing different Cache Store configurations is impossible
by Sebastian Łaskawiec (JIRA)
Sebastian Łaskawiec created ISPN-9027:
Summary: Distinguishing different Cache Store configurations is impossible
Key: ISPN-9027
URL: https://issues.jboss.org/browse/ISPN-9027
Project: Infinispan
Issue Type: Bug
Components: Loaders and Stores
Affects Versions: 9.2.1.Final
Reporter: Sebastian Łaskawiec
h3. Problem description
One of our users reported that when using two Deployed Cache Stores, their configurations get overridden and both behave as if they had the same configuration. Here's an example:
<distributed-cache name="cassandracache1" owners="2" segments="256" mode="SYNC">
<store name="store1" class="org.infinispan.persistence.cassandra.CassandraStore" preload="true">
<property name="autoCreateKeyspace">true</property>
<property name="keyspace">Infinispan</property>
<property name="entryTable">InfinispanEntries</property>
<property name="servers">[9042]</property>
</store>
</distributed-cache>
<distributed-cache name="cassandracache2" owners="2" segments="256" mode="SYNC">
<store name="store2" class="org.infinispan.persistence.cassandra.CassandraStore" preload="true">
<property name="autoCreateKeyspace">true</property>
<property name="keyspace">Infinispan</property>
<property name="entryTable">InfinispanEntries1</property>
<property name="servers">[9042]</property>
</store>
</distributed-cache>
Both caches ({{cassandracache1}} and {{cassandracache2}}) use the same {{entryTable}} which is set to {{InfinispanEntries1}}.
h3. Investigation
{{CacheStoreFactory}} implementations were created to fabricate Loader/Writer instances based on the parsed configuration (via the {{createInstance}} method). This method receives an instance of {{AbstractStoreConfiguration}} from {{PersistenceManagerImpl}} (here's an example (1)) twice, once for each parsed configuration. The parsing part seems OK, but we do not parse the Cache Store name, which makes it impossible to tell the two configurations apart.
h3. Proposed fix
* Add {{name}} attribute to {{StoreConfiguration}}.
* Either add an explicit parameter to {{CacheStoreFactory#createInstance(StoreConfiguration cfg, String cacheStoreName)}} or scan for the Cache Store name in both implementations ({{DeployedCacheStoreFactory}} and {{LocalClassLoaderCacheStoreFactory}}).
(1) AbstractStoreConfiguration [attributes=DeployedStoreConfiguration = [fetchPersistentState=false, purgeOnStartup=false, ignoreModifications=false, preload=true, shared=false, transactional=false, maxBatchSize=100, properties={keyspace=Infinispan, connectionPool.poolTimeoutMillis=5, entryTable=InfinispanEntries, connectionPool.idleTimeoutSeconds=120, connectionPool.heartbeatIntervalSeconds=30, autoCreateKeyspace=true, servers=[9042]}, customStoreClassName=org.infinispan.persistence.cassandra.CassandraStore], async=AsyncStoreConfiguration [attributes=AsyncStoreConfiguration = [enabled=false, modificationQueueSize=1024, threadPoolSize=1]], singletonStore=SingletonStoreConfiguration [attributes=SingletonStoreConfiguration = [enabled=false, push-state-timeout=10000, push-state-when-coordinator=true]]]
[JBoss JIRA] (ISPN-9021) Remote query: add option to disable default indexing per schema file
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-9021?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-9021:
Description:
All types that are not annotated are currently fully indexed for backward compat with first version of remote query (which did not have protobuf annotations to control indexing).
This behaviour is very inefficient and confusing for users that do not intend to use indexing (non-indexed query works anyway).
Given the history of this behaviour we cannot remove it until next major version. Until then we can offer the choice of disabling it per each schema file via a boolean protobuf file-level option named 'enable_legacy_indexing', which when absent is considered to default to true. Setting it to false will disable indexing of types that do not have indexing annotations.
Whenever an entry is indexed using default indexing a warning will be logged in order to motivate people to switch to using proper annotations.
(was:
All types that are not annotated are currently fully indexed for backward compat with first version of remote query (which did not have protobuf annotations to control indexing).
This behaviour is very inefficient and confusing for users that do not intend to use indexing (non-indexed query works anyway).
Given the history of this behaviour we cannot remove it until next major version. Until then we can offer the choice of disabling it per each schema file via a boolean protobuf file-level option named 'enable_default_indexing', which when absent is considered to default to true. Setting it to false will disable indexing of types that do not have indexing annotations.
Whenever an entry is indexed using default indexing a warning will be logged in order to motivate people to switch to using proper annotations.)
> Remote query: add option to disable default indexing per schema file
> --------------------------------------------------------------------
> Key: ISPN-9021
> URL: https://issues.jboss.org/browse/ISPN-9021
> Project: Infinispan
> Issue Type: Enhancement
> Reporter: Adrian Nistor
> Assignee: Adrian Nistor
> Fix For: 9.3.0.Alpha1, 9.3.0.Final
> All types that are not annotated are currently fully indexed for backward compat with first version of remote query (which did not have protobuf annotations to control indexing).
> This behaviour is very inefficient and confusing for users that do not intend to use indexing (non-indexed query works anyway).
> Given the history of this behaviour we cannot remove it until next major version. Until then we can offer the choice of disabling it per each schema file via a boolean protobuf file-level option named 'enable_legacy_indexing', which when absent is considered to default to true. Setting it to false will disable indexing of types that do not have indexing annotations.
> Whenever an entry is indexed using default indexing a warning will be logged in order to motivate people to switch to using proper annotations.
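The rule described above can be summarised in a small sketch. {{LegacyIndexingPolicy}} and its method are hypothetical names used only to spell out the decision; how the option is read from the schema file is not shown and not assumed. A {{null}} argument models the option being absent.

```java
// Hypothetical decision helper, not part of Infinispan: spells out when a type
// would fall back to the legacy "index everything" behaviour.
class LegacyIndexingPolicy {
    // enableLegacyIndexing is the value of the file-level option
    // 'enable_legacy_indexing', or null when the option is absent.
    static boolean usesLegacyIndexing(boolean hasIndexingAnnotations, Boolean enableLegacyIndexing) {
        if (hasIndexingAnnotations) {
            return false; // the annotations control indexing; the legacy default does not apply
        }
        // An absent option defaults to true, for backward compatibility.
        return enableLegacyIndexing == null || enableLegacyIndexing;
    }

    public static void main(String[] args) {
        System.out.println(usesLegacyIndexing(false, null));   // option absent
        System.out.println(usesLegacyIndexing(false, false));  // option set to false
        System.out.println(usesLegacyIndexing(true, null));    // annotated type
    }
}
```

In the proposal, a {{true}} result is also the case in which the warning would be logged.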
[JBoss JIRA] (ISPN-9021) Remote query: add option to disable default/legacy indexing per schema file
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-9021?page=com.atlassian.jira.plugin.... ]
Adrian Nistor updated ISPN-9021:
Summary: Remote query: add option to disable default/legacy indexing per schema file (was: Remote query: add option to disable default indexing per schema file)
> Remote query: add option to disable default/legacy indexing per schema file
> ---------------------------------------------------------------------------
> Key: ISPN-9021
> URL: https://issues.jboss.org/browse/ISPN-9021
> Project: Infinispan
> Issue Type: Enhancement
> Reporter: Adrian Nistor
> Assignee: Adrian Nistor
> Fix For: 9.3.0.Alpha1, 9.3.0.Final
> All types that are not annotated are currently fully indexed for backward compat with first version of remote query (which did not have protobuf annotations to control indexing).
> This behaviour is very inefficient and confusing for users that do not intend to use indexing (non-indexed query works anyway).
> Given the history of this behaviour we cannot remove it until next major version. Until then we can offer the choice of disabling it per each schema file via a boolean protobuf file-level option named 'enable_legacy_indexing', which when absent is considered to default to true. Setting it to false will disable indexing of types that do not have indexing annotations.
> Whenever an entry is indexed using default indexing a warning will be logged in order to motivate people to switch to using proper annotations.
[JBoss JIRA] (ISPN-8728) ExceptionEvictionTest.testSizeCorrectWithStateTransfer random failures
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-8728?page=com.atlassian.jira.plugin.... ]
William Burns commented on ISPN-8728:
[~dan.berindei] It being the smallest is fine; it just means the entries were not distributed evenly among all the nodes. It sounds like, instead of waiting for the topology to be stable, the test needs to wait until all nodes have removed all their old segments. I will try to take a look at this in the next few days.
> ExceptionEvictionTest.testSizeCorrectWithStateTransfer random failures
> ----------------------------------------------------------------------
> Key: ISPN-8728
> URL: https://issues.jboss.org/browse/ISPN-8728
> Project: Infinispan
> Issue Type: Bug
> Components: Test Suite - Core
> Affects Versions: 9.2.0.CR1
> Reporter: Dan Berindei
> Assignee: William Burns
> Labels: testsuite_stability
> Fix For: 9.3.0.Final
> Attachments: ExceptionEvictionTest_20180129.log.gz, ExceptionEvictionTest_ISPN-8962_preferavailabilitystrategy_20180328.log.gz
> {noformat}
> 15:10:01,610 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.eviction.impl.ExceptionEvictionTest.testSizeCorrectWithStateTransfer[DIST_SYNC, nodeCount=3, storageType=BINARY, optimisticTransaction=true]
> java.lang.AssertionError: expected:<1920> but was:<1984>
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.8.8.jar:?]
> at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364) ~[testng-6.8.8.jar:?]
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80) ~[testng-6.8.8.jar:?]
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:170) ~[testng-6.8.8.jar:?]
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:177) ~[testng-6.8.8.jar:?]
> at org.infinispan.eviction.impl.ExceptionEvictionTest.assertInterceptorCount(ExceptionEvictionTest.java:252) ~[test-classes/:?]
> at org.infinispan.eviction.impl.ExceptionEvictionTest.testSizeCorrectWithStateTransfer(ExceptionEvictionTest.java:600) ~[test-classes/:?]
> {noformat}
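The kind of wait suggested in the comment above could be sketched as a generic polling helper. This is an assumption-laden illustration: {{Await.until}} is a hypothetical helper, and the condition a real test would pass in (for example, "no node still holds entries for segments it does not own") is left to the caller.

```java
import java.util.function.BooleanSupplier;

// Hypothetical polling helper, not part of the Infinispan test suite.
class Await {
    // Polls the condition until it holds or the timeout expires.
    // Returns true if the condition became true in time.
    static boolean until(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] polls = {0};
        // Stand-in condition: becomes true on the third poll.
        System.out.println(until(() -> ++polls[0] >= 3, 1000));
    }
}
```

A test would then assert on the entry counts only after {{until}} has returned true.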
[JBoss JIRA] (ISPN-8691) Infinispan rejects to read cache file bigger than 2147483647 (Integer.MAX_VALUE)
by William Burns (JIRA)
[ https://issues.jboss.org/browse/ISPN-8691?page=com.atlassian.jira.plugin.... ]
William Burns commented on ISPN-8691:
Hrmm, that may be helpful. The field you pointed at is supposed to be the total length of a single entry (including key/value/metadata). So I wonder if the file got corrupted somehow? Or is it possible that the file was created by a different version of Infinispan than the one you are now reading it with?
> Infinispan rejects to read cache file bigger than 2147483647 (Integer.MAX_VALUE)
> --------------------------------------------------------------------------------
> Key: ISPN-8691
> URL: https://issues.jboss.org/browse/ISPN-8691
> Project: Infinispan
> Issue Type: Enhancement
> Components: Loaders and Stores
> Affects Versions: 9.1.1.Final
> Reporter: Dmitry Katsubo
> Priority: Minor
> In my scenario the cache file size created by {{SingleFileStore}} is bytes. When Infinispan tries to load this file, it fails with the following exception:
> {code}
> Caused by: org.infinispan.persistence.spi.PersistenceException: ISPN000279: Failed to read stored entries from file. Error in file /work/search-service-layer_data/infinispan/cache_test_dk83146/markupCache.dat at offset 4
> at org.infinispan.persistence.file.SingleFileStore.rebuildIndex(SingleFileStore.java:182)
> at org.infinispan.persistence.file.SingleFileStore.start(SingleFileStore.java:127)
> ... 155 more
> {code}
> Cache file content:
> {code}
> 0000000000: 46 43 53 31 80 B1 89 47 │ 00 00 00 00 00 00 00 00 FCS1?+%G
> 0000000010: 00 00 00 00 FF FF FF FF │ FF FF FF FF 02 15 4E 06 yyyyyyyy☻§N♠
> 0000000020: 05 03 04 09 00 00 00 2F │ 6F 72 67 2E 73 70 72 69 ♣♥♦○ /org.spri
> 0000000030: 6E 67 66 72 61 6D 65 77 │ 6F 72 6B 2E 63 61 63 68 ngframework.cach
> 0000000040: 65 2E 69 6E 74 65 72 63 │ 65 70 74 6F 72 2E 53 69 e.interceptor.Si
> 0000000050: 6D 70 6C 65 4B 65 79 4C │ 0A 57 03 6B 6D 93 D8 00 mpleKeyL◙W♥km"O
> 0000000060: 00 00 02 00 00 00 08 68 │ 61 73 68 43 6F 64 65 23 ☻ ◘hashCode#
> 0000000070: 00 00 00 00 06 70 61 72 │ 61 6D 73 16 00 16 15 E6 ♠params▬ ▬§?
> {code}
> The problem is that integer value 0x80B18947 is treated as signed integer in line {{SingleFileStore:181}}, hence in expression
> {code}
> if (fe.size < KEY_POS + fe.keyLen + fe.dataLen + fe.metadataLen) {
> throw log.errorReadingFileStore(file.getPath(), filePos);
> }
> {code}
> {{fe.size}} is negative and equal to -2135848633.
> I have tried to configure the persistence storage so that it gets purged on start:
> {code}
> <persistence passivation="true">
> <file-store path="/var/cache/infinispan" purge="true">
> <write-behind thread-pool-size="5" />
> </file-store>
> </persistence>
> {code}
> however this does not help as storage is first read and then purged (see also ISPN-7186).
> It is expected that {{SingleFileStore}} either does not allow to write such big entries to the cache, or handles them correctly.
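The arithmetic in the report can be checked with a short sketch. It only shows how the four bytes at offset 4 of the dump ({{80 B1 89 47}}) read as a signed versus an unsigned 32-bit value; {{EntrySizeCheck}} is a hypothetical class, and the sketch says nothing about whether the file is corrupted or how {{SingleFileStore}} should treat the field.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class: reads the size field from the dump as a Java int.
class EntrySizeCheck {
    // Bytes at offsets 4..7 of the dump, read big-endian (the ByteBuffer default).
    static int readSize() {
        byte[] field = {(byte) 0x80, (byte) 0xB1, (byte) 0x89, 0x47};
        return ByteBuffer.wrap(field).getInt();
    }

    public static void main(String[] args) {
        int size = readSize();
        System.out.println(size);                          // prints -2135848633
        System.out.println(Integer.toUnsignedLong(size));  // prints 2159118663
    }
}
```

Because the high bit is set, the signed value is negative, so any comparison of the form {{fe.size < ...}} against a positive sum succeeds.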
[JBoss JIRA] (ISPN-8691) Infinispan rejects to read cache file bigger than 2147483647 (Integer.MAX_VALUE)
by Dmitry Katsubo (JIRA)
[ https://issues.jboss.org/browse/ISPN-8691?page=com.atlassian.jira.plugin.... ]
Dmitry Katsubo commented on ISPN-8691:
If by "entry" you mean an object which is stored in the cache, then I assure you that we don't have objects even close to the integer maximum. I can upload the cache file for further investigation, if that helps.
[JBoss JIRA] (ISPN-8980) High concurrency : Infinispan Directory Provider: Lucene : Error loading metadata for index file
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-8980?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes edited comment on ISPN-8980 at 3/29/18 12:37 PM:
[~debashish.bharali] Were you able to collect TRACE logs from the nodes I asked before?
Things to look for in the TRACE: is there more than one node writing to the index at the same time? In any case, having a local lock in a clustered Infinispan Directory is not recommended, since any node can acquire the lock at any time and the chance of index corruption is very high.
Regarding the JGroups backend setup, either [~sannegrinovero] or the Hibernate Search team can help you with that.
From my understanding, the JGroups backend can use a static or an automatic master. Have you tried using a static master? I can see from the Hibernate Search docs that automatic master election is [experimental and has some drawbacks|https://docs.jboss.org/hibernate/stable/search/reference/en-US/...]
was (Author: gustavonalle):
[~debashish.bharali] Were you able to collect TRACE logs from the nodes I asked before?
Things to look for in the TRACE, Is there more than 1 node writing to the index at the same time? In any case, having a local lock in a clustered Infinispan Directory is not recommended since any node can acquire the lock at anytime, and the change of indexing corruption is very high.
Regarding the JGroups backend setup, either [~sannegrinovero] or the Hibernate Search team can help you with.
From my understanding, the JGroups backend can use static or automatic master, have you tried to use static master? I can see from Hibernate Search docs that automatic master election is [experimental and has some drawbacks|https://docs.jboss.org/hibernate/stable/search/reference/en-US/...]
> High concurrency : Infinispan Directory Provider: Lucene : Error loading metadata for index file
> ------------------------------------------------------------------------------------------------
> Key: ISPN-8980
> URL: https://issues.jboss.org/browse/ISPN-8980
> Project: Infinispan
> Issue Type: Bug
> Components: Lucene Directory
> Affects Versions: 8.2.5.Final
> Reporter: Debashish Bharali
> Assignee: Gustavo Fernandes
> Priority: Critical
> Attachments: SysOutLogs.txt, neutrino-hibernate-search-worker-jgroups.xml, neutrino-hibernatesearch-infinispan.xml
> During high concurrency of action, we are getting *{color:red}'Error loading metadata for index file'{color}* even in *{color:red}Non-Clustered{color}* env.
> *Hibernate Search Indexes (Lucene Indexes) - 5.7.0.Final*
> *Infinispan - 8.2.5.Final*
> *infinispan-directory-provider-8.2.5.Final*
> *jgroups-3.6.7.Final*
> *Worker Backend : JGroups*
> *Worker Execution: Sync*
> *write_metadata_async: false (implicitly)*
> *Note:* Currently we are on a non-clustered env. We are moving to a clustered env within a few days.
> After analyzing the code and adding some SYSOUT logging to the FileListOperations and DirectoryImplementor classes, we have established the following points:
> # This happens during high concurrency in a non-clustered env.
> # One thread, *'T1'*, is deleting a segment and its segment name *'SEG1'* from the *'FileListCacheKey'* list stored in *MetadataCache*.
> # At the same time, another thread, *'T2'*, is looping through the file list: a 'copy list' of the *FileListCacheKey* entry in MetadataCache, provided by the toArray method of *FileListOperations* (while T1 is changing the corresponding original list).
> # *'T2'* calls the open input method on each segment name, getting the corresponding metadata segment from *MetadataCache*.
> # However, for *'T2'*, the *'copy list'* still contains the segment name *'SEG1'*.
> # So while looping through the list, *'T2'* tries to get the segment for segment name *'SEG1'* from MetadataCache.
> # But at this instant, the *segment* corresponding to segment name *'SEG1'* has already been removed from *MetadataCache* by *'T1'*.
> # This results in *'java.io.FileNotFoundException: Error loading metadata for index file'* for segment name *'SEG1'*.
> # As mentioned earlier, this happens more often during high concurrency.
> *{color:red}On a standalone server (non-clustered), we are getting below error intermittently:{color}*
> Full Stack trace:
> 2018-03-19 17:29:11,938 ERROR [Hibernate Search sync consumer thread for index com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer] o.h.s.e.i.LogErrorHandler [LogErrorHandler.java:69]
> *{color:red}HSEARCH000058: Exception occurred java.io.FileNotFoundException: Error loading metadata for index file{color}*: M|segments_w6|com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer|-1
> Primary Failure:
> Entity com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer Id 1649990024999813056 Work Type org.hibernate.search.backend.AddLuceneWork
> java.io.FileNotFoundException: Error loading metadata for index file: M|segments_w6|com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer|-1
> at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:138) ~[infinispan-lucene-directory-8.2.5.Final.jar:8.2.5.Final]
> at org.infinispan.lucene.impl.DirectoryLucene.openInput(DirectoryLucene.java:102) ~[infinispan-lucene-directory-8.2.5.Final.jar:8.2.5.Final]
> at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:294) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:171) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:949) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:126) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:92) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.AbstractCommitPolicy.getIndexWriter(AbstractCommitPolicy.java:33) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SharedIndexCommitPolicy.getIndexWriter(SharedIndexCommitPolicy.java:77) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SharedIndexWorkspaceImpl.getIndexWriter(SharedIndexWorkspaceImpl.java:36) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriterDelegate(AbstractWorkspaceImpl.java:203) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:81) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:46) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.applyChangesets(SyncWorkProcessor.java:165) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.run(SyncWorkProcessor.java:151) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at java.lang.Thread.run(Thread.java:785) [na:1.8.0-internal]
> *As per our understanding, this issue should not come in {color:red}'non-clustered'{color} env. Also it should not arise when worker execution is {color:red}'sync'{color}.*
> *We have debugged the code, and confirmed that the value for {color:red}'write_metadata_async'{color} is coming as 'false' only (as expected).*
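The interleaving listed in the numbered steps above can be replayed deterministically in a small model. This is a sketch under stated assumptions: {{FileListRace}} is a hypothetical class that models the file list and the metadata cache with plain collections; it does not use or reproduce the Infinispan Lucene directory code.

```java
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical model of the file list and the metadata cache.
class FileListRace {
    private final List<String> fileList = new CopyOnWriteArrayList<>();
    private final Map<String, Long> metadata = new ConcurrentHashMap<>();

    void add(String name, long length) {
        metadata.put(name, length);
        fileList.add(name);
    }

    // T1: removes the metadata and the file-list entry.
    void delete(String name) {
        metadata.remove(name);
        fileList.remove(name);
    }

    // T2, first step: take a copy of the file list.
    List<String> snapshot() {
        return new ArrayList<>(fileList);
    }

    // T2, second step: open a file named in the (possibly stale) copy.
    long openInput(String name) throws FileNotFoundException {
        Long length = metadata.get(name);
        if (length == null) {
            throw new FileNotFoundException("Error loading metadata for index file: " + name);
        }
        return length;
    }

    // Replays the reported interleaving on one thread; returns true if the
    // reader hit the missing metadata.
    static boolean replay() {
        FileListRace dir = new FileListRace();
        dir.add("SEG1", 42L);
        List<String> copy = dir.snapshot(); // T2 copies the list
        dir.delete("SEG1");                 // T1 deletes in between
        try {
            for (String name : copy) {      // T2 iterates the stale copy
                dir.openInput(name);
            }
            return false;
        } catch (FileNotFoundException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("reader failed: " + replay());
    }
}
```

The model fails even with a single process, which is consistent with the report that the error appears in a non-clustered environment.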
[JBoss JIRA] (ISPN-8980) High concurrency : Infinispan Directory Provider: Lucene : Error loading metadata for index file
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-8980?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes edited comment on ISPN-8980 at 3/29/18 12:37 PM:
[~debashish.bharali] Were you able to collect TRACE logs from the nodes I asked before?
Things to look for in the TRACE: is there more than one node writing to the index at the same time? In any case, having a local lock in a clustered Infinispan Directory is not recommended, since any node can acquire the lock at any time and the chance of index corruption is very high.
Regarding the JGroups backend setup, either [~sannegrinovero] or the Hibernate Search team can help you with that.
From my understanding, the JGroups backend can use a static or an automatic master. Have you tried using a static master? I can see from the Hibernate Search docs that automatic master election is [experimental and has some drawbacks|https://docs.jboss.org/hibernate/stable/search/reference/en-US/...]
was (Author: gustavonalle):
[~debashish.bharali] Were you able to collect TRACE logs from the nodes I asked before?
Things to lock for in the TRACE, Is there more than 1 node writing to the index at the same time? In any case, having a local lock in a clustered Infinispan Directory is not recommended since any node can acquire the lock at anytime, and the change of indexing corruption is very high.
Regarding the JGroups backend setup, either [~sannegrinovero] or the Hibernate Search team can help you with.
From my understanding, the JGroups backend can use static or automatic master, have you tried to use static master? I can see from Hibernate Search docs that automatic master election is [experimental and has some drawbacks|https://docs.jboss.org/hibernate/stable/search/reference/en-US/...]
> High concurrency : Infinispan Directory Provider: Lucene : Error loading metadata for index file
> ------------------------------------------------------------------------------------------------
> Key: ISPN-8980
> URL: https://issues.jboss.org/browse/ISPN-8980
> Project: Infinispan
> Issue Type: Bug
> Components: Lucene Directory
> Affects Versions: 8.2.5.Final
> Reporter: Debashish Bharali
> Assignee: Gustavo Fernandes
> Priority: Critical
> Attachments: SysOutLogs.txt, neutrino-hibernate-search-worker-jgroups.xml, neutrino-hibernatesearch-infinispan.xml
> During high concurrency of action, we are getting *{color:red}'Error loading metadata for index file'{color}* even in *{color:red}Non-Clustered{color}* env.
> *Hibernate Search Indexes (Lucene Indexes) - 5.7.0.Final*
> *Infinispan - 8.2.5.Final*
> *infinispan-directory-provider-8.2.5.Final*
> *jgroups-3.6.7.Final*
> *Worker Backend : JGroups*
> *Worker Execution: Sync*
> *write_metadata_async: false (implicitly)*
> *Note:* We are currently on a non-clustered environment and will move to a clustered environment within a few days.
> After analyzing the code and adding some extra SYSOUT logging to the FileListOperations and DirectoryImplementor classes, we have established the following points:
> # This happens under high concurrency in a non-clustered environment.
> # One thread, *'T1'*, is deleting a segment and its segment name *'SEG1'* from the *'FileListCacheKey'* list stored in *MetadataCache*.
> # At the same time, another thread, *'T2'*, is looping through the file list [a 'copy list' from MetadataCache for FileListCacheKey, provided by the toArray method of *FileListOperations*; meanwhile T1 is changing the corresponding original list].
> # *'T2'* calls the open input method on each segment name, getting the corresponding metadata segment from *MetadataCache*.
> # However, for *'T2'*, the *'copy list'* still contains the segment name *'SEG1'*.
> # So while looping through the list, *'T2'* tries to get the segment for segment name *'SEG1'* from MetadataCache.
> # But at this instant, the *segment* corresponding to segment name *'SEG1'* has already been removed from *MetadataCache* by *'T1'*.
> # This results in *'java.io.FileNotFoundException: Error loading metadata for index file'* for segment name *'SEG1'*.
> # As mentioned earlier, this happens more often under high concurrency.
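The interleaving in the steps above can be replayed in a minimal sketch. Everything here is a hypothetical stand-in: {{FileListRaceSketch}}, {{fileList}}, {{metadataCache}} and {{openInput}} only mimic the roles of the FileListCacheKey list, MetadataCache and the open input method, and are not the Infinispan Lucene Directory API. The two threads' steps run in a fixed order on one thread so that the outcome is deterministic.

```java
import java.io.FileNotFoundException;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in classes; this is not the Infinispan Lucene Directory API.
class FileListRaceSketch {
    // Stand-in for the file list stored under FileListCacheKey.
    static final Set<String> fileList = ConcurrentHashMap.newKeySet();
    // Stand-in for the per-file metadata entries held in the metadata cache.
    static final Map<String, Long> metadataCache = new ConcurrentHashMap<>();

    // Fails when a name is requested whose metadata entry no longer exists.
    static long openInput(String name) throws FileNotFoundException {
        Long size = metadataCache.get(name);
        if (size == null) {
            throw new FileNotFoundException("Error loading metadata for index file: " + name);
        }
        return size;
    }

    // Replays the T1/T2 steps in a fixed order and returns the resulting error message.
    static String replayRace() {
        fileList.clear();
        metadataCache.clear();
        fileList.add("SEG1");
        metadataCache.put("SEG1", 42L);

        // T2: takes a snapshot of the file list (the 'copy list').
        String[] snapshot = fileList.toArray(new String[0]);

        // T1: deletes SEG1 from the list and from the metadata cache.
        fileList.remove("SEG1");
        metadataCache.remove("SEG1");

        // T2: iterates its now stale snapshot and tries to open each file.
        for (String name : snapshot) {
            try {
                openInput(name);
            } catch (FileNotFoundException e) {
                return e.getMessage();
            }
        }
        return "no error";
    }

    public static void main(String[] args) {
        System.out.println(replayRace());
    }
}
```

The sketch only shows why a snapshot of the file list can name a file whose metadata has already been removed. It does not say where the real code should synchronize, or whether the reader should tolerate the missing entry instead.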
> *{color:red}On a standalone (non-clustered) server, we intermittently get the error below:{color}*
> Full stack trace:
> 2018-03-19 17:29:11,938 ERROR [Hibernate Search sync consumer thread for index com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer] o.h.s.e.i.LogErrorHandler [LogErrorHandler.java:69]
> *{color:red}HSEARCH000058: Exception occurred java.io.FileNotFoundException: Error loading metadata for index file{color}*: M|segments_w6|com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer|-1
> Primary Failure:
> Entity com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer Id 1649990024999813056 Work Type org.hibernate.search.backend.AddLuceneWork
> java.io.FileNotFoundException: Error loading metadata for index file: M|segments_w6|com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer|-1
> at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:138) ~[infinispan-lucene-directory-8.2.5.Final.jar:8.2.5.Final]
> at org.infinispan.lucene.impl.DirectoryLucene.openInput(DirectoryLucene.java:102) ~[infinispan-lucene-directory-8.2.5.Final.jar:8.2.5.Final]
> at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:294) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:171) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:949) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:126) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:92) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.AbstractCommitPolicy.getIndexWriter(AbstractCommitPolicy.java:33) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SharedIndexCommitPolicy.getIndexWriter(SharedIndexCommitPolicy.java:77) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SharedIndexWorkspaceImpl.getIndexWriter(SharedIndexWorkspaceImpl.java:36) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriterDelegate(AbstractWorkspaceImpl.java:203) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:81) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:46) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.applyChangesets(SyncWorkProcessor.java:165) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.run(SyncWorkProcessor.java:151) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at java.lang.Thread.run(Thread.java:785) [na:1.8.0-internal]
> *As per our understanding, this issue should not occur in a {color:red}'non-clustered'{color} environment. It should also not arise when worker execution is {color:red}'sync'{color}.*
> *We have debugged the code and confirmed that the value of {color:red}'write_metadata_async'{color} is 'false' (as expected).*