[JBoss JIRA] (ISPN-8980) High concurrency : Infinispan Directory Provider: Lucene : Error loading metadata for index file
by Debashish Bharali (JIRA)
[ https://issues.jboss.org/browse/ISPN-8980?page=com.atlassian.jira.plugin.... ]
Debashish Bharali updated ISPN-8980:
------------------------------------
Attachment: TestResult-StoreIndexReplTest.txt
> High concurrency : Infinispan Directory Provider: Lucene : Error loading metadata for index file
> ------------------------------------------------------------------------------------------------
>
> Key: ISPN-8980
> URL: https://issues.jboss.org/browse/ISPN-8980
> Project: Infinispan
> Issue Type: Bug
> Components: Lucene Directory
> Affects Versions: 8.2.5.Final
> Reporter: Debashish Bharali
> Assignee: Gustavo Fernandes
> Priority: Critical
> Attachments: JoiningNode_N2.zip, OriginalNode_N1.zip, SysOutLogs.txt, TestResult-StoreIndexReplTest.txt, neutrino-hibernate-search-worker-jgroups.xml, neutrino-hibernatesearch-infinispan.xml
>
>
> Under high concurrency of operations, we are getting *{color:red}'Error loading metadata for index file'{color}* even in a *{color:red}Non-Clustered{color}* env.
> *Hibernate Search Indexes (Lucene Indexes) - 5.7.0.Final*
> *Infinispan - 8.2.5.Final*
> *infinispan-directory-provider-8.2.5.Final*
> *jgroups-3.6.7.Final*
> *Worker Backend : JGroups*
> *Worker Execution: Sync*
> *write_metadata_async: false (implicitly)*
> *Note:* Currently we are on a non-clustered env; we will be moving to a clustered env within a few days.
> On analyzing the code and adding some additional SYSOUT loggers to the FileListOperations and DirectoryImplementor classes, we have established the following points:
> # This is happening during high concurrency on a non-clustered env.
> # One thread *'T1'* is deleting a segment and its name *'SEG1'* from the *'FileListCacheKey'* list *stored in the MetadataCache*.
> # Concurrently, another thread *'T2'* is looping through the file list [a 'copy list' of the FileListCacheKey entry from the MetadataCache, provided by the toArray method of *FileListOperations*, while T1 keeps modifying the original list].
> # *'T2'* is calling the openInput method on each segment name, fetching the corresponding metadata segment from the *MetadataCache*.
> # However, for *'T2'*, the *'copy list'* still contains the name of segment *'SEG1'*.
> # So while looping through the list, *'T2'* tries to get the segment from the MetadataCache for segment name *'SEG1'*.
> # But at that instant, the *segment* corresponding to segment name *'SEG1'* has already been removed from the *MetadataCache* by *'T1'*.
> # This results in *'java.io.FileNotFoundException: Error loading metadata for index file'* for segment name *'SEG1'* (see the sketch after this list).
> # As mentioned earlier, this happens more often during high concurrency.
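> A minimal, self-contained sketch of the race we suspect (the map and set below are simplified stand-ins for the MetadataCache and the file list under FileListCacheKey, not the actual Infinispan Lucene Directory code; whether the message prints depends on thread timing):
> {code:java}
> import java.io.FileNotFoundException;
> import java.util.Set;
> import java.util.concurrent.ConcurrentHashMap;
>
> public class FileListRaceSketch {
>     // Stand-in for the MetadataCache: segment name -> metadata placeholder.
>     static final ConcurrentHashMap<String, Object> metadataCache = new ConcurrentHashMap<>();
>     // Stand-in for the file list stored under FileListCacheKey.
>     static final Set<String> fileList = ConcurrentHashMap.newKeySet();
>
>     public static void main(String[] args) throws InterruptedException {
>         metadataCache.put("SEG1", new Object());
>         fileList.add("SEG1");
>
>         Thread t2 = new Thread(() -> {
>             // toArray() returns a snapshot ('copy list'); "SEG1" may still be in it.
>             for (Object name : fileList.toArray()) {
>                 if (metadataCache.get(name) == null) {
>                     // Corresponds to DirectoryImplementor.openInput failing with
>                     // "Error loading metadata for index file".
>                     System.out.println(new FileNotFoundException(
>                             "Error loading metadata for index file: " + name));
>                 }
>             }
>         });
>         Thread t1 = new Thread(() -> {
>             // T1 deletes the segment: the metadata removal and the list removal
>             // are not atomic with respect to T2's loop.
>             metadataCache.remove("SEG1");
>             fileList.remove("SEG1");
>         });
>
>         t2.start();
>         t1.start();
>         t1.join();
>         t2.join();
>     }
> }
> {code}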
> *{color:red}On a standalone server (non-clustered), we are getting the below error intermittently:{color}*
> Full Stack trace:
> 2018-03-19 17:29:11,938 ERROR [Hibernate Search sync consumer thread for index com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer] o.h.s.e.i.LogErrorHandler [LogErrorHandler.java:69]
> *{color:red}HSEARCH000058: Exception occurred java.io.FileNotFoundException: Error loading metadata for index file{color}*: M|segments_w6|com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer|-1
> Primary Failure:
> Entity com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer Id 1649990024999813056 Work Type org.hibernate.search.backend.AddLuceneWork
> java.io.FileNotFoundException: Error loading metadata for index file: M|segments_w6|com.nucleus.integration.ws.server.globalcustomer.entity.GlobalCustomer|-1
> at org.infinispan.lucene.impl.DirectoryImplementor.openInput(DirectoryImplementor.java:138) ~[infinispan-lucene-directory-8.2.5.Final.jar:8.2.5.Final]
> at org.infinispan.lucene.impl.DirectoryLucene.openInput(DirectoryLucene.java:102) ~[infinispan-lucene-directory-8.2.5.Final.jar:8.2.5.Final]
> at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:109) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:294) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:171) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:949) ~[lucene-core-5.5.4.jar:5.5.4 31012120ebbd93744753eb37f1dbc5e654628291 - jpountz - 2017-02-08 19:08:03]
> at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.createNewIndexWriter(IndexWriterHolder.java:126) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.IndexWriterHolder.getIndexWriter(IndexWriterHolder.java:92) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.AbstractCommitPolicy.getIndexWriter(AbstractCommitPolicy.java:33) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SharedIndexCommitPolicy.getIndexWriter(SharedIndexCommitPolicy.java:77) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SharedIndexWorkspaceImpl.getIndexWriter(SharedIndexWorkspaceImpl.java:36) ~[hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.AbstractWorkspaceImpl.getIndexWriterDelegate(AbstractWorkspaceImpl.java:203) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.applyUpdates(LuceneBackendQueueTask.java:81) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.LuceneBackendQueueTask.run(LuceneBackendQueueTask.java:46) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.applyChangesets(SyncWorkProcessor.java:165) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at org.hibernate.search.backend.impl.lucene.SyncWorkProcessor$Consumer.run(SyncWorkProcessor.java:151) [hibernate-search-engine-5.7.0.Final.jar:5.7.0.Final]
> at java.lang.Thread.run(Thread.java:785) [na:1.8.0-internal]
> *As per our understanding, this issue should not occur in a {color:red}'non-clustered'{color} env, nor should it arise when worker execution is {color:red}'sync'{color}.*
> *We have debugged the code and confirmed that the value of {color:red}'write_metadata_async'{color} is indeed 'false' (as expected).*
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-8980) High concurrency : Infinispan Directory Provider: Lucene : Error loading metadata for index file
by Debashish Bharali (JIRA)
[ https://issues.jboss.org/browse/ISPN-8980?page=com.atlassian.jira.plugin.... ]
Debashish Bharali commented on ISPN-8980:
-----------------------------------------
Hi [~gustavonalle],
I have run multiple executions of the test case you provided and have attached one of the outcomes.
The tests completed successfully, so the issue is not being reproduced by the shared test case.
One point to note in the outcome: the size of the *'.DAT'* files for the joining node 'NodeB' is much smaller than for the original node 'NodeA' (possibly due to fragmentation and compaction).
I am making further changes to the test case to try to reproduce the issue.
Please share some input on how to proceed with this.
Additionally, I really appreciate your great support and effort. :)
(You even created a dedicated test case for our scenario.)
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-9104) Majority partition nodes can process minority topology updates after merge
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-9104?page=com.atlassian.jira.plugin.... ]
Ryan Emerson resolved ISPN-9104.
--------------------------------
Resolution: Done
> Majority partition nodes can process minority topology updates after merge
> --------------------------------------------------------------------------
>
> Key: ISPN-9104
> URL: https://issues.jboss.org/browse/ISPN-9104
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.2.1.Final, 9.3.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.3.0.Beta1
>
>
> After a merge, NAKACK2 resends some broadcast messages that were originally sent in a partition to the members of the merged cluster that weren't in that partition.
> We have a check in LocalTopologyManagerImpl to ignore topology updates from the wrong coordinator, but unfortunately that check only happens after calling resetLocalTopologyBeforeRebalance(). If the incoming topology id is higher than the current topology id, resetLocalTopologyBeforeRebalance() can install a "reset" topology to prepare for the rebalance.
> The reset topology has all the owners owned by the minority partition nodes, so the majority partition nodes installing this topology will invalidate all their entries before conflict resolution even starts.
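> A simplified, hypothetical sketch of the ordering problem (the method and field names only mirror the description above; this is not the real LocalTopologyManagerImpl code):
> {code:java}
> public class TopologyUpdateOrderSketch {
>
>     private int currentTopologyId = 20;
>     private String currentCoordinator = "Test-NodeA";
>
>     // Buggy order: the destructive "reset" runs before the sender check, so a
>     // NAKACK2-resent update from the minority coordinator (with a higher
>     // topology id) can still wipe a majority node's entries.
>     void handleTopologyUpdateBuggy(int topologyId, String sender) {
>         if (topologyId > currentTopologyId) {
>             resetLocalTopologyBeforeRebalance(); // invalidates owned entries
>         }
>         if (!sender.equals(currentCoordinator)) {
>             return; // check happens too late
>         }
>         currentTopologyId = topologyId;
>     }
>
>     // Intended order: ignore updates from the wrong coordinator before doing
>     // anything destructive in preparation for the rebalance.
>     void handleTopologyUpdateFixed(int topologyId, String sender) {
>         if (!sender.equals(currentCoordinator)) {
>             return;
>         }
>         if (topologyId > currentTopologyId) {
>             resetLocalTopologyBeforeRebalance();
>         }
>         currentTopologyId = topologyId;
>     }
>
>     private void resetLocalTopologyBeforeRebalance() {
>         // placeholder: installs the "reset" topology / invalidates entries
>     }
> }
> {code}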
> This causes random failures in the conflict resolution tests, e.g. {{MergePolicyPreferredAlwaysTest}}:
> {noformat}
> 19:19:28,448 DEBUG (testng-Test:[]) [GMS] Test-NodeA-50368: installing view MergeView::[Test-NodeA-50368|10] (5) [Test-NodeA-50368, Test-NodeB-27290, Test-NodeC-9368, Test-NodeD-49504, Test-NodeE-55304], 2 subgroups: [Test-NodeA-50368|8] (3) [Test-NodeA-50368, Test-NodeB-27290, Test-NodeC-9368], [Test-NodeD-49504|9] (2) [Test-NodeD-49504, Test-NodeE-55304]
> 19:19:28,740 TRACE (jgroups-10,Test-NodeA-50368:[]) [GlobalInboundInvocationHandler] Attempting to execute non-CacheRpcCommand: CacheTopologyControlCommand{cache=___defaultcache, type=CH_UPDATE, sender=Test-NodeD-49504, joinInfo=null, topologyId=21, rebalanceId=7, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeD-49504: 132, Test-NodeE-55304: 124]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeD-49504: 134, Test-NodeE-55304: 122]}, availabilityMode=null, phase=READ_ALL_WRITE_ALL, actualMembers=[Test-NodeD-49504, Test-NodeE-55304], throwable=null, viewId=7} [sender=Test-NodeD-49504]
> 19:19:28,741 DEBUG (transport-thread-Test-NodeA-p66802-t6:[Topology-___defaultcache]) [LocalTopologyManagerImpl] Installing fake cache topology CacheTopology{id=20, phase=NO_REBALANCE, rebalanceId=6, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeD-49504: 132, Test-NodeE-55304: 124]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeD-49504, Test-NodeE-55304], persistentUUIDs=[6f22a4be-bf94-42a7-9ea1-4128944351a2, 59c315d5-c7d2-4121-b939-01d62ba9af4f]} for cache ___defaultcache
> 19:19:28,742 TRACE (transport-thread-Test-NodeA-p66802-t6:[Topology-___defaultcache]) [StateConsumerImpl] On cache ___defaultcache we have: new segments: []; old segments: RangeSet(256)
> 19:19:28,744 TRACE (transport-thread-Test-NodeA-p66802-t6:[Topology-___defaultcache]) [StateConsumerImpl] On cache ___defaultcache we have: added segments: {}; removed segments: {0-255}
> 19:19:28,745 DEBUG (transport-thread-Test-NodeA-p66802-t6:[Topology-___defaultcache]) [StateConsumerImpl] Removing no longer owned entries for cache ___defaultcache
> 19:19:28,745 TRACE (transport-thread-Test-NodeA-p66802-t6:[Topology-___defaultcache]) [InvocationContextInterceptor] Invoked with command InvalidateCommand{keys=[MagicKey{1AD3/F92B3173/51@Test-NodeA-50368}]} and InvocationContext [org.infinispan.context.impl.NonTxInvocationContext@48a2a9d5]
> 19:19:30,152 TRACE (stateTransferExecutor-thread-Test-NodeA-p66803-t4:[]) [JGroupsTransport] Test-NodeA-50368 sending request 232 to all: SingleRpcCommand{cacheName='___defaultcache', command=RemoveCommand{key=MagicKey{1AD3/F92B3173/51@Test-NodeA-50368}, value=null, metadata=null, flags=[SKIP_REMOTE_LOOKUP, PUT_FOR_STATE_TRANSFER, IGNORE_RETURN_VALUES], commandInvocationId=CommandInvocation:Test-NodeA-50368:109014, valueMatcher=MATCH_ALWAYS, topologyId=24}}
> 19:19:28,748 TRACE (transport-thread-Test-NodeA-p66802-t6:[Topology-___defaultcache]) [DefaultDataContainer] Removed ImmortalCacheEntry{key=MagicKey{1AD3/F92B3173/51@Test-NodeA-50368}, value=DURING SPLIT} from container
> 19:19:30,096 TRACE (stateTransferExecutor-thread-Test-NodeA-p66803-t4:[]) [DefaultConflictManager] Cache ___defaultcache conflict detected {Test-NodeA-50368=NullCacheEntry{}, Test-NodeE-55304=ImmortalCacheEntry{key=MagicKey{1AD3/F92B3173/51@Test-NodeA-50368}, value=BEFORE SPLIT}, Test-NodeC-9368=NullCacheEntry{}, Test-NodeD-49504=ImmortalCacheEntry{key=MagicKey{1AD3/F92B3173/51@Test-NodeA-50368}, value=BEFORE SPLIT}, Test-NodeB-27290=NullCacheEntry{}}
> 19:19:30,132 TRACE (stateTransferExecutor-thread-Test-NodeA-p66803-t4:[]) [DefaultConflictManager] Cache ___defaultcache applying EntryMergePolicy org.infinispan.conflict.MergePolicy to PreferredEntry NullCacheEntry{}, otherEntries [ImmortalCacheEntry{key=MagicKey{1AD3/F92B3173/51@Test-NodeA-50368}, value=BEFORE SPLIT}, NullCacheEntry{}, ImmortalCacheEntry{key=MagicKey{1AD3/F92B3173/51@Test-NodeA-50368}, value=BEFORE SPLIT}, NullCacheEntry{}]
> 19:19:30,132 TRACE (stateTransferExecutor-thread-Test-NodeA-p66803-t4:[]) [DefaultConflictManager] Cache ___defaultcache executing remove on conflict: key MagicKey{1AD3/F92B3173/51@Test-NodeA-50368}
> 19:19:35,274 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.conflict.impl.MergePolicyPreferredAlwaysTest.testPartitionMergePolicy[REPL_SYNC, 5N]
> java.lang.AssertionError: Key=MagicKey{1AD3/F92B3173/51@Test-NodeA-50368}. VersionMap: {Test-NodeA-50368=null, Test-NodeE-55304=null, Test-NodeC-9368=null, Test-NodeD-49504=null, Test-NodeB-27290=null}
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59) ~[testng-6.9.9.jar:?]
> at org.testng.AssertJUnit.assertTrue(AssertJUnit.java:24) ~[testng-6.9.9.jar:?]
> at org.testng.AssertJUnit.assertNotNull(AssertJUnit.java:267) ~[testng-6.9.9.jar:?]
> at org.infinispan.conflict.impl.BaseMergePolicyTest.afterConflictResolutionAndMerge(BaseMergePolicyTest.java:113) ~[test-classes/:?]
> at org.infinispan.conflict.impl.BaseMergePolicyTest.testPartitionMergePolicy(BaseMergePolicyTest.java:138) ~[test-classes/:?]
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-8852) StackOverflowError when requesting data in case cache is in degraded mode
by Michal Stehlik (JIRA)
[ https://issues.jboss.org/browse/ISPN-8852?page=com.atlassian.jira.plugin.... ]
Michal Stehlik commented on ISPN-8852:
--------------------------------------
Thanks. Since I reported this ticket, I have been getting this error more often than described in the original report. Sometimes it crashes so badly that the whole cache manager stops working. And yes, pessimistic locking, sometimes with the flag Flag.FORCE_WRITE_LOCK and sometimes with Flag.SKIP_REMOTE_LOOKUP and Flag.SKIP_CACHE_LOAD.
> StackOverflowError when requesting data in case cache is in degraded mode
> -------------------------------------------------------------------------
>
> Key: ISPN-8852
> URL: https://issues.jboss.org/browse/ISPN-8852
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.1.4.Final
> Reporter: Michal Stehlik
> Assignee: Ryan Emerson
> Fix For: 9.3.0.Beta1
>
> Attachments: stackowerflow.log
>
>
> Found a StackOverflowError in the logs when the network was disconnected, the caches were in degraded mode, and the system attempted to read and update the cache.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-8852) StackOverflowError when requesting data in case cache is in degraded mode
by Ryan Emerson (JIRA)
[ https://issues.jboss.org/browse/ISPN-8852?page=com.atlassian.jira.plugin.... ]
Ryan Emerson commented on ISPN-8852:
------------------------------------
[~dan.berindei] That makes sense; I didn't take FORCE_WRITE_LOCK into account when adding ALLOW_READS. The fix should be simple, but I'll try to create a test case as well.
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (ISPN-8852) StackOverflowError when requesting data in case cache is in degraded mode
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-8852?page=com.atlassian.jira.plugin.... ]
Dan Berindei commented on ISPN-8852:
------------------------------------
[~stehlik.michal] from the stack trace it appears that the cache uses pessimistic locking and the application is doing a {{cache.getAdvancedCache().withFlags(Flag.FORCE_WRITE_LOCK).get(key)}}. Is that correct?
[~ryanemerson] I believe in this scenario the read should fail with an {{AvailabilityException}}, because the cache is in DEGRADED mode and lock acquisition needs to contact all the owners.
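A minimal sketch of the access pattern in question; the AdvancedCache/Flag calls are real Infinispan APIs, but the surrounding setup is illustrative and the {{AvailabilityException}} outcome is the behaviour I would expect in DEGRADED mode, not something this snippet verifies:
{code:java}
import org.infinispan.AdvancedCache;
import org.infinispan.Cache;
import org.infinispan.context.Flag;
import org.infinispan.partitionhandling.AvailabilityException;

public class DegradedModeReadSketch {

    // Pessimistic-locking read: FORCE_WRITE_LOCK acquires the write lock up
    // front, which in DEGRADED mode needs all owners to be reachable.
    public static Object lockingGet(Cache<String, Object> cache, String key) {
        AdvancedCache<String, Object> advanced = cache.getAdvancedCache();
        try {
            return advanced.withFlags(Flag.FORCE_WRITE_LOCK).get(key);
        } catch (AvailabilityException e) {
            // Expected failure mode while the cache is DEGRADED, rather than
            // the StackOverflowError reported in this issue.
            return null;
        }
    }
}
{code}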
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)