[JBoss JIRA] (ISPN-4949) Split brain: inconsistent data after merge
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-4949?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-4949:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/3062
> Split brain: inconsistent data after merge
> ------------------------------------------
>
> Key: ISPN-4949
> URL: https://issues.jboss.org/browse/ISPN-4949
> Project: Infinispan
> Issue Type: Bug
> Components: State Transfer
> Affects Versions: 7.0.0.Final
> Reporter: Radim Vansa
> Assignee: Dan Berindei
> Priority: Critical
>
> 1) Cluster A, B, C, D splits into two partitions:
> A, B (coord A) detects the split immediately and enters degraded mode with CH [A, B, C, D].
> C, D (coord D) first detects only that B is lost, gets view A, C, D and starts a rebalance with CH [A, C, D]. Segment X is primary-owned by C (its backup was on B, which is now lost).
> 2) D detects that A is lost as well, and therefore enters degraded mode with CH [A, C, D].
> 3) C inserts an entry into X: all owners (only C) are present, so the modification is allowed.
> 4) The cluster is merged and the coordinator finds that the max stable topology has CH [A, B, C, D] (it is the older of the two partitions' topologies, obtained from A, B). It logs 'No active or unavailable partitions, so all the partitions must be in degraded mode' (yes, all partitions are in degraded mode, but a write has happened in the meantime).
> 5) The old CH is broadcast in the newest topology, and no rebalance happens.
> 6) Inconsistency: a read in X may miss the update.
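The scenario above can be replayed with a toy model. This is only a sketch under stated assumptions, not Infinispan code: the class SplitBrainMergeSketch and its methods are hypothetical, each node is reduced to a plain map, and the owners of segment X after the merge are assumed to be C and B, as in the original CH [A, B, C, D].

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of the report; none of these names exist in Infinispan.
final class SplitBrainMergeSketch {
    // One local data container per node, keyed by node name.
    private final Map<String, Map<String, String>> stores = new HashMap<>();

    SplitBrainMergeSketch(String... nodes) {
        for (String node : nodes) {
            stores.put(node, new HashMap<>());
        }
    }

    // Apply a write on every owner listed in the writer's current CH.
    void write(List<String> owners, String key, String value) {
        for (String owner : owners) {
            stores.get(owner).put(key, value);
        }
    }

    // Read from one specific owner, as a remote get routed to that node would.
    String readFrom(String owner, String key) {
        return stores.get(owner).get(key);
    }

    public static void main(String[] args) {
        SplitBrainMergeSketch cluster = new SplitBrainMergeSketch("A", "B", "C", "D");

        // Step 3: under CH [A, C, D] the only owner of segment X is C.
        cluster.write(Arrays.asList("C"), "k", "v1");

        // Steps 4-5: CH [A, B, C, D] is restored without a rebalance,
        // so B is an owner again but never received the entry.
        for (String owner : Arrays.asList("C", "B")) {
            System.out.println(owner + " -> " + cluster.readFrom(owner, "k"));
        }
        // prints "C -> v1" and then "B -> null"
    }
}
```

In this model the stale answer comes from B: whether a real read misses the update depends on which owner serves it, which matches the "may miss" wording in step 6.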
--
This message was sent by Atlassian JIRA
(v6.3.8#6338)
[JBoss JIRA] (ISPN-4692) Optimize externalizer for FileListCacheValue
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4692?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4692:
-----------------------------------------------
Sebastian Łaskawiec <slaskawi(a)redhat.com> changed the Status of [bug 1166028|https://bugzilla.redhat.com/show_bug.cgi?id=1166028] from NEW to MODIFIED
> Optimize externalizer for FileListCacheValue
> --------------------------------------------
>
> Key: ISPN-4692
> URL: https://issues.jboss.org/browse/ISPN-4692
> Project: Infinispan
> Issue Type: Enhancement
> Components: Lucene Directory
> Reporter: Sanne Grinovero
> Assignee: Gustavo Fernandes
> Fix For: 7.0.0.CR1
>
>
> There are two possible improvements to the Externalizer strategy used by FileListCacheValue.
> - Each String is encoded (and decoded) in UTF-8, which is expensive. We should explore alternative encodings for String, at least for the wire format.
> - This is an ideal case for Delta operations: on each modification just one entry of the map is added / removed, but the whole HashMap is being transferred at each write.
> I'm not sure how we can combine the Delta interface with custom Externalizers, so that will need to be explored.
> We might want to avoid storing it as a value and resort to custom RPC commands to transfer the needed bits only, but we don't want to reimplement state transfer and CacheStore storage.
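To make the Delta idea concrete, here is a minimal sketch of shipping only the changed file names instead of the whole set. It does not use Infinispan's Delta or DeltaAware interfaces; FileListDeltaSketch, Change, takeDelta and apply are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: records changes so that only they need to be transferred.
final class FileListDeltaSketch {
    enum Op { ADD, REMOVE }

    // One recorded modification of the file list.
    static final class Change {
        final Op op;
        final String name;

        Change(Op op, String name) {
            this.op = op;
            this.name = name;
        }
    }

    private final Set<String> names = new HashSet<>();
    private final List<Change> pending = new ArrayList<>();

    // Local modifications are recorded only when they actually change the set.
    void add(String name) {
        if (names.add(name)) {
            pending.add(new Change(Op.ADD, name));
        }
    }

    void remove(String name) {
        if (names.remove(name)) {
            pending.add(new Change(Op.REMOVE, name));
        }
    }

    // Returns the changes recorded since the previous call and resets the log.
    List<Change> takeDelta() {
        List<Change> delta = new ArrayList<>(pending);
        pending.clear();
        return delta;
    }

    // Replays a delta received from another copy, without recording it again.
    void apply(List<Change> delta) {
        for (Change change : delta) {
            if (change.op == Op.ADD) {
                names.add(change.name);
            } else {
                names.remove(change.name);
            }
        }
    }

    // Sorted copy of the current contents, for inspection.
    Set<String> snapshot() {
        return new TreeSet<>(names);
    }
}
```

With this shape, a write that adds one file transfers a single Change rather than the full set. The sketch is not thread-safe and says nothing about how such a delta would be combined with a custom Externalizer, which is the open question raised above.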
[JBoss JIRA] (ISPN-4801) ConcurrentModificationException on the FileListCacheValue
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4801?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4801:
-----------------------------------------------
Sebastian Łaskawiec <slaskawi(a)redhat.com> changed the Status of [bug 1166028|https://bugzilla.redhat.com/show_bug.cgi?id=1166028] from NEW to MODIFIED
> ConcurrentModificationException on the FileListCacheValue
> ---------------------------------------------------------
>
> Key: ISPN-4801
> URL: https://issues.jboss.org/browse/ISPN-4801
> Project: Infinispan
> Issue Type: Bug
> Components: Embedded Querying
> Affects Versions: 7.0.0.Beta2
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Priority: Critical
> Fix For: 7.0.0.CR1
>
>
> Since ISPN-4692 that made FileListCacheValue DeltaAware, the following is happening when running {{org.infinispan.lucene.profiling.PerformanceCompareStressTest}}:
> {code}
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:922)
> at java.util.HashMap$KeyIterator.next(HashMap.java:956)
> at java.util.AbstractCollection.toArray(AbstractCollection.java:195)
> at org.infinispan.lucene.impl.FileListCacheValue.toArray(FileListCacheValue.java:109)
> at org.infinispan.lucene.impl.FileListOperations.listFilenames(FileListOperations.java:101)
> at org.infinispan.lucene.impl.DirectoryImplementor.list(DirectoryImplementor.java:56)
> at org.infinispan.lucene.impl.DirectoryLuceneV4.listAll(DirectoryLuceneV4.java:123)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:759)
> {code}
> The problem is that the deltas are not being applied in a thread-safe manner.
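One possible way to avoid this kind of exception is to guard both the modifications and the array copy with the same lock. The sketch below is an assumption about the general technique, not the fix that was applied in Infinispan; GuardedFileListSketch is a hypothetical class.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: a HashSet whose writes and array copies share one lock.
final class GuardedFileListSketch {
    private final Set<String> names = new HashSet<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers take the exclusive lock, so no copy can observe a half-applied change.
    boolean add(String name) {
        lock.writeLock().lock();
        try {
            return names.add(name);
        } finally {
            lock.writeLock().unlock();
        }
    }

    boolean remove(String name) {
        lock.writeLock().lock();
        try {
            return names.remove(name);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers copy under the shared lock; the HashSet iterator behind toArray
    // therefore never runs concurrently with a modification.
    String[] toArray() {
        lock.readLock().lock();
        try {
            return names.toArray(new String[0]);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

A read-write lock is used here because listing is likely far more frequent than modification, but that is a guess about the workload; a concurrent set would be another option.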
[JBoss JIRA] (ISPN-5001) NPE on preload with tx caches containing DeltaAware values
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-5001?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-5001:
-----------------------------------------------
Sebastian Łaskawiec <slaskawi(a)redhat.com> changed the Status of [bug 1166028|https://bugzilla.redhat.com/show_bug.cgi?id=1166028] from NEW to MODIFIED
> NPE on preload with tx caches containing DeltaAware values
> ----------------------------------------------------------
>
> Key: ISPN-5001
> URL: https://issues.jboss.org/browse/ISPN-5001
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 7.0.0.Final, 7.0.2.Final
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Fix For: 7.1.0.Alpha1
>
>
> A similar bug was fixed for non-tx caches in ISPN-4746.
> To reproduce the issue, change the {{DeltaAwarePreloadTest}} to use a transactional cache.
> Error:
> {code}
> org.infinispan.commons.CacheException: Unable to invoke method public void org.infinispan.persistence.manager.PersistenceManagerImpl.preload() on object of type PersistenceManagerImpl
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:170)
> at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
> at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638)
> at org.infinispan.factories.AbstractComponentRegistry.internalStart(AbstractComponentRegistry.java:627)
> at org.infinispan.factories.AbstractComponentRegistry.start(AbstractComponentRegistry.java:530)
> at org.infinispan.factories.ComponentRegistry.start(ComponentRegistry.java:216)
> at org.infinispan.cache.impl.CacheImpl.start(CacheImpl.java:813)
> at org.infinispan.distribution.DeltaAwarePreloadTest.testPreloadOnStart(DeltaAwarePreloadTest.java:38)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
> at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
> at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
> at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
> at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
> at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
> at org.testng.TestRunner.privateRun(TestRunner.java:767)
> at org.testng.TestRunner.run(TestRunner.java:617)
> at org.testng.SuiteRunner.runTest(SuiteRunner.java:348)
> at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:343)
> at org.testng.SuiteRunner.privateRun(SuiteRunner.java:305)
> at org.testng.SuiteRunner.run(SuiteRunner.java:254)
> at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
> at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
> at org.testng.TestNG.runSuitesSequentially(TestNG.java:1224)
> at org.testng.TestNG.runSuitesLocally(TestNG.java:1149)
> at org.testng.TestNG.run(TestNG.java:1057)
> at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111)
> at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204)
> at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175)
> at org.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:125)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
> Caused by: org.infinispan.persistence.spi.PersistenceException: Unable to preload!
> at org.infinispan.persistence.manager.PersistenceManagerImpl.preloadKey(PersistenceManagerImpl.java:633)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.access$000(PersistenceManagerImpl.java:70)
> at org.infinispan.persistence.manager.PersistenceManagerImpl$1.processEntry(PersistenceManagerImpl.java:232)
> at org.infinispan.persistence.dummy.DummyInMemoryStore.process(DummyInMemoryStore.java:165)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.preload(PersistenceManagerImpl.java:224)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
> ... 37 more
> Caused by: java.lang.NullPointerException
> at org.infinispan.distribution.impl.DistributionManagerImpl.getReadConsistentHash(DistributionManagerImpl.java:110)
> at org.infinispan.interceptors.distribution.TxDistributionInterceptor.remoteGet(TxDistributionInterceptor.java:319)
> at org.infinispan.interceptors.distribution.TxDistributionInterceptor.remoteGetBeforeWrite(TxDistributionInterceptor.java:311)
> at org.infinispan.interceptors.distribution.TxDistributionInterceptor.handleTxWriteCommand(TxDistributionInterceptor.java:269)
> at org.infinispan.interceptors.distribution.TxDistributionInterceptor.visitPutKeyValueCommand(TxDistributionInterceptor.java:105)
> {code}
[JBoss JIRA] (ISPN-4746) NPE when preloading cache with DeltaAware values
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4746?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4746:
-----------------------------------------------
Sebastian Łaskawiec <slaskawi(a)redhat.com> changed the Status of [bug 1166028|https://bugzilla.redhat.com/show_bug.cgi?id=1166028] from NEW to MODIFIED
> NPE when preloading cache with DeltaAware values
> ------------------------------------------------
>
> Key: ISPN-4746
> URL: https://issues.jboss.org/browse/ISPN-4746
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 6.0.2.Final, 7.0.0.Beta2
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Fix For: 7.0.0.CR1, 7.0.0.Final
>
>
> {code:java}
> public class DeltaAwarePreloadTest extends MultipleCacheManagersTest {
>    private static final int CLUSTER_SIZE = 1;
>
>    @Override
>    protected void createCacheManagers() throws Throwable {
>       ConfigurationBuilder c = getDefaultClusteredCacheConfig(CacheMode.DIST_SYNC, false);
>       c.persistence().addStore(new DummyInMemoryStoreConfigurationBuilder(c.persistence()).storeName(getClass().getSimpleName())).preload(true);
>       createCluster(c, CLUSTER_SIZE);
>       waitForClusterToForm();
>    }
>
>    @Test
>    public void testPreloadOnStart() throws PersistenceException {
>       Cache<Object, Object> cache = caches().get(0);
>       cache.put(1, new TestDeltaAware());
>       cache.stop();
>       cache.start();
>    }
> }
> {code}
> During preload, the {{NonTxDistributionInterceptor}} decides that it requires values from previous owners and tries to check whether the local node is the primary owner. At this point, the WriteCH is null since no topology was ever updated. Here's the stacktrace:
> {code}
> Caused by: org.infinispan.persistence.spi.PersistenceException: Unable to preload!
> at org.infinispan.persistence.manager.PersistenceManagerImpl.preloadKey(PersistenceManagerImpl.java:620)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.access$000(PersistenceManagerImpl.java:70)
> at org.infinispan.persistence.manager.PersistenceManagerImpl$1.processEntry(PersistenceManagerImpl.java:228)
> at org.infinispan.persistence.dummy.DummyInMemoryStore.process(DummyInMemoryStore.java:165)
> at org.infinispan.persistence.manager.PersistenceManagerImpl.preload(PersistenceManagerImpl.java:220)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
> ... 37 more
> Caused by: java.lang.NullPointerException
> at org.infinispan.distribution.impl.DistributionManagerImpl.getWriteConsistentHash(DistributionManagerImpl.java:115)
> at org.infinispan.distribution.impl.DistributionManagerImpl.getConsistentHash(DistributionManagerImpl.java:105)
> at org.infinispan.distribution.impl.DistributionManagerImpl.getPrimaryLocation(DistributionManagerImpl.java:95)
> at org.infinispan.interceptors.locking.ClusteringDependentLogic$DistributionLogic.localNodeIsPrimaryOwner(ClusteringDependentLogic.java:395)
> at org.infinispan.interceptors.distribution.NonTxDistributionInterceptor.remoteGetBeforeWrite(NonTxDistributionInterceptor.java:131)
> at org.infinispan.interceptors.distribution.BaseDistributionInterceptor.handleNonTxWriteCommand(BaseDistributionInterceptor.java:195)
> at org.infinispan.interceptors.distribution.NonTxDistributionInterceptor.visitPutKeyValueCommand(NonTxDistributionInterceptor.java:72)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
> at org.infinispan.interceptors.DistCacheWriterInterceptor.visitPutKeyValueCommand(DistCacheWriterInterceptor.java:72)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
> at org.infinispan.interceptors.CacheLoaderInterceptor.visitPutKeyValueCommand(CacheLoaderInterceptor.java:113)
> at org.infinispan.commands.write.PutKeyValueCommand.acceptVisitor(PutKeyValueCommand.java:71)
> at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:98)
> at org.infinispan.interceptors.EntryWrappingInterceptor.invokeNextAndApplyChanges(EntryWrappingInterceptor.java:376)
> at org.infinispan.interceptors.EntryWrappingInterceptor.setSkipRemoteGetsAndInvokeNextForDataCommand(EntryWrappingInterceptor.java:464)
> {code}
[JBoss JIRA] (ISPN-4497) Race condition in LocalLockMergingSegmentReadLocker results in file content being deleted
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4497?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4497:
-----------------------------------------------
Sebastian Łaskawiec <slaskawi(a)redhat.com> changed the Status of [bug 1166865|https://bugzilla.redhat.com/show_bug.cgi?id=1166865] from NEW to MODIFIED
> Race condition in LocalLockMergingSegmentReadLocker results in file content being deleted
> -----------------------------------------------------------------------------------------
>
> Key: ISPN-4497
> URL: https://issues.jboss.org/browse/ISPN-4497
> Project: Infinispan
> Issue Type: Bug
> Components: Lucene Directory
> Affects Versions: 5.2.6.Final
> Reporter: Anuj Shah
> Assignee: Sanne Grinovero
> Fix For: 7.0.0.Alpha5
>
>
> There is a race condition in LocalLockMergingSegmentReadLocker that can lead to too many delete calls on the underlying DistributedSegmentReadLocker, which results in the file being removed from the caches.
> This happens with three or more threads acquiring and releasing locks on the same file simultaneously:
> # Thread 1 (T1) acquires a lock and creates a {{LocalReadLock}}, call it L1 - the underlying lock is acquired
> # T2 starts to acquire and holds a reference to L1
> # T3 starts to acquire and holds a reference to L1
> # T1 releases - L1 at this stage has a value of only 1 - so the underlying lock is released, and L1 is removed from the map
> # T2 continues - finds L1 with value 0 and acquires the underlying lock
> # T3 continues - increments L1 value to 2
> # T2 releases - creates a new {{LocalReadLock}}, L2 - this has zero value so the underlying lock is released, and L2 is removed from the map
> # T3 releases - creates a new {{LocalReadLock}}, L3 - this has zero value so the underlying lock is released, and L3 is removed from the map
> # The final step triggers a real file delete, as the underlying lock is released one time too many
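The interleaving above is possible because looking up the local counter and changing it are separate steps. A minimal sketch of one way to close that window, assuming the counter update and the call to the underlying lock can be made one atomic step per file name, is shown below. RefCountedReadLockSketch is hypothetical and is not the fix that went into Infinispan; the two AtomicInteger fields merely stand in for calls to the underlying locker.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: per-file reference counts updated atomically with compute().
final class RefCountedReadLockSketch {
    private final ConcurrentHashMap<String, Integer> localCounts = new ConcurrentHashMap<>();

    // Stand-ins for acquire/release calls on the underlying (clustered) lock.
    final AtomicInteger underlyingAcquires = new AtomicInteger();
    final AtomicInteger underlyingReleases = new AtomicInteger();

    void acquire(String file) {
        // compute() runs atomically per key, so no other thread can see or
        // change the counter between the check and the update.
        localCounts.compute(file, (name, count) -> {
            if (count == null) {
                underlyingAcquires.incrementAndGet(); // first local holder
                return 1;
            }
            return count + 1;
        });
    }

    void release(String file) {
        localCounts.compute(file, (name, count) -> {
            if (count == null) {
                throw new IllegalStateException("release without acquire: " + name);
            }
            if (count == 1) {
                underlyingReleases.incrementAndGet(); // last local holder
                return null; // returning null removes the mapping
            }
            return count - 1;
        });
    }

    public static void main(String[] args) {
        RefCountedReadLockSketch locker = new RefCountedReadLockSketch();
        locker.acquire("segments_1");
        locker.acquire("segments_1");
        locker.acquire("segments_1");
        locker.release("segments_1");
        locker.release("segments_1");
        locker.release("segments_1");
        System.out.println(locker.underlyingAcquires.get() + " acquire, "
                + locker.underlyingReleases.get() + " release");
        // prints "1 acquire, 1 release"
    }
}
```

The trade-off is that the underlying call happens while the map entry is locked, so other threads working on the same file name wait for it. That may or may not be acceptable when the underlying lock involves a synchronous clustered cache.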
[JBoss JIRA] (ISPN-4710) DistributedSegmentReadLocker should be allowed to skip ReadLocks on small files
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-4710?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-4710:
-----------------------------------------------
Sebastian Łaskawiec <slaskawi(a)redhat.com> changed the Status of [bug 1166865|https://bugzilla.redhat.com/show_bug.cgi?id=1166865] from NEW to MODIFIED
> DistributedSegmentReadLocker should be allowed to skip ReadLocks on small files
> -------------------------------------------------------------------------------
>
> Key: ISPN-4710
> URL: https://issues.jboss.org/browse/ISPN-4710
> Project: Infinispan
> Issue Type: Enhancement
> Components: Lucene Directory
> Reporter: Sanne Grinovero
> Assignee: Gustavo Fernandes
> Fix For: 7.0.0.CR1
>
>
> Both of these methods:
> - {{org.infinispan.lucene.readlocks.DistributedSegmentReadLocker.deleteOrReleaseReadLock(String)}}
> - {{org.infinispan.lucene.readlocks.DistributedSegmentReadLocker.realFileDelete(FileReadLockKey, AdvancedCache<Object, Integer>, AdvancedCache<?, ?>, AdvancedCache<?, ?>, boolean)}}
> perform a lot of unnecessary operations, potentially on synchronous clustered caches: we know in advance that files which are not chunked don't need a read lock, and are not split into smaller pieces (which affects how we delete things).
> The determining factor between the two styles is defined in:
> {{org.infinispan.lucene.impl.DirectoryLuceneV4.openInput(String, IOContext)}}
> {code}
> @Override
> public IndexInput openInput(final String name, final IOContext context) throws IOException {
>    final IndexInputContext indexInputContext = impl.openInput(name);
>    if ( indexInputContext.readLocks == null ) {
>       return new SingleChunkIndexInput(indexInputContext);
>    }
>    else {
>       return new InfinispanIndexInput(indexInputContext);
>    }
> }
> {code}