[JBoss JIRA] (ISPN-6350) Data race in the ShardIndexManager under topology changes
by Gustavo Fernandes (JIRA)
[ https://issues.jboss.org/browse/ISPN-6350?page=com.atlassian.jira.plugin.... ]
Gustavo Fernandes updated ISPN-6350:
------------------------------------
Status: Open (was: New)
> Data race in the ShardIndexManager under topology changes
> ---------------------------------------------------------
>
> Key: ISPN-6350
> URL: https://issues.jboss.org/browse/ISPN-6350
> Project: Infinispan
> Issue Type: Bug
> Components: Embedded Querying
> Affects Versions: 8.2.0.Final
> Reporter: Gustavo Fernandes
> Assignee: Gustavo Fernandes
> Labels: affinity
>
> The following example data race can cause unrecoverable errors during indexing:
> \[node1\] cache.put(key) // key maps to segment 48, owned by node1
> \[node1\] starts shard 48
> \[node1\] acquires lock on shard 48
> \[node1\] starts writing to the index
> \[node1\] is notified of a topology change and releases the lock on shard 48
> \[node1\] lock reacquired (still writing to the index)
> \[node1\] commit on shard 48
> \[node1\] shard still locked
> \[node2\] cache.put(key) // Node2 now owns segment 48
> \[node2\] starts shard 48
> \[node2\] tries to acquire the lock on shard 48
> \[node2\] fails (lock still held by node1)
> The current mechanism employed by the {{ShardIndexManager}} during topology changes involves using a listener and closing the IndexWriter on all nodes upon ownership changes, so that the lock is released and can be reacquired by the new owner (1 segment maps to 1 shard).
> Since writing to a shard can take some time, the listener can be triggered in the middle of an index operation. In that case, closing the index writer releases the lock only very briefly: the in-flight write immediately reacquires it, and the lock is never released afterwards.
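The interleaving above can be replayed deterministically with a toy lock. This is only a sketch under assumptions: {{ShardLock}} and {{ShardRaceSketch}} are hypothetical names made up for illustration and do not correspond to the {{ShardIndexManager}} code or to the Lucene write lock; the point is just to show why node2 can never acquire the shard once node1 has reacquired it.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a per-shard index lock; not an Infinispan or Lucene API. */
final class ShardLock {
    private final AtomicReference<String> owner = new AtomicReference<>();

    /** Returns true if the node holds the lock after the call (idempotent for the holder). */
    boolean tryAcquire(String node) {
        return owner.compareAndSet(null, node) || node.equals(owner.get());
    }

    /** Releases the lock only if the given node currently holds it. */
    void release(String node) {
        owner.compareAndSet(node, null);
    }

    String owner() {
        return owner.get();
    }
}

final class ShardRaceSketch {
    /** Replays the reported interleaving; returns whether node2 obtained the lock. */
    static boolean replay(ShardLock shard48) {
        shard48.tryAcquire("node1");  // node1 starts writing to the index
        shard48.release("node1");     // topology listener closes the writer
        shard48.tryAcquire("node1");  // in-flight write reopens the writer
        // node1 commits here, but nothing releases the lock a second time
        return shard48.tryAcquire("node2"); // new owner of segment 48 tries to start
    }

    public static void main(String[] args) {
        ShardLock shard48 = new ShardLock();
        System.out.println("node2 acquired: " + replay(shard48));
        System.out.println("holder: " + shard48.owner());
    }
}
```

In this toy model the release triggered by the listener is a one-shot event, so any write that is still in flight when it fires ends up holding the lock indefinitely.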
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
8 years, 6 months
[JBoss JIRA] (ISPN-6790) Distribution interceptors re-compute key location after remote get
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-6790?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-6790:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/4418
> Distribution interceptors re-compute key location after remote get
> ------------------------------------------------------------------
>
> Key: ISPN-6790
> URL: https://issues.jboss.org/browse/ISPN-6790
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.0.0.Alpha2, 8.2.2.Final
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Priority: Minor
> Fix For: 9.0.0.Alpha3
>
>
> If the distribution interceptors don't find a key remotely, they try again to retrieve it from the local data container. Before reading the local value, though, they compute the key's ownership again, to make sure they don't read a stale value that was previously owned by the local node.
> Most of the time, the topology doesn't change during the invocation through the interceptor chain, so there's no need to look up the value locally again, or to re-compute the key's location.
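One possible way to express the reasoning above is to remember the topology id the command started with and to skip both the ownership re-computation and the local lookup when it has not changed. The sketch below is illustrative only: {{LocalRetrySketch}}, its fields and {{afterRemoteMiss}} are hypothetical names, not the interceptor code, and it does not claim to be the change made in the pull request.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntSupplier;
import java.util.function.Predicate;

/** Hypothetical sketch; none of these names exist in Infinispan. */
final class LocalRetrySketch {
    private final Map<String, String> localData = new ConcurrentHashMap<>();
    private final IntSupplier currentTopologyId;   // assumed source of the current topology id
    private final Predicate<String> isLocalOwner;  // assumed ownership check for a key
    int ownershipChecks = 0;                       // counts how often ownership is re-computed

    LocalRetrySketch(IntSupplier currentTopologyId, Predicate<String> isLocalOwner) {
        this.currentTopologyId = currentTopologyId;
        this.isLocalOwner = isLocalOwner;
    }

    void putLocal(String key, String value) {
        localData.put(key, value);
    }

    /** Called after a remote get found nothing; commandTopologyId is the id the command started with. */
    String afterRemoteMiss(String key, int commandTopologyId) {
        if (currentTopologyId.getAsInt() == commandTopologyId) {
            // Topology unchanged: the original routing decision still holds, nothing to retry.
            return null;
        }
        // Topology changed: re-check ownership before trusting the local container.
        ownershipChecks++;
        return isLocalOwner.test(key) ? localData.get(key) : null;
    }
}
```

With this shape, the extra work is only paid on the rare invocations that overlap a topology change.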
[JBoss JIRA] (ISPN-6790) Distribution interceptors re-compute key location after remote get
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-6790?page=com.atlassian.jira.plugin.... ]
Dan Berindei reassigned ISPN-6790:
----------------------------------
Assignee: Dan Berindei
[JBoss JIRA] (ISPN-6790) Distribution interceptors re-compute key location after remote get
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-6790?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-6790:
-------------------------------
Status: Open (was: New)
[JBoss JIRA] (ISPN-6792) NonTxDistributionInterceptor doesn't always change the matcher for retry
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-6792?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-6792:
-------------------------------
Status: Open (was: New)
> NonTxDistributionInterceptor doesn't always change the matcher for retry
> ------------------------------------------------------------------------
>
> Key: ISPN-6792
> URL: https://issues.jboss.org/browse/ISPN-6792
> Project: Infinispan
> Issue Type: Bug
> Components: Core
> Affects Versions: 9.0.0.Alpha2, 8.2.2.Final
> Reporter: Dan Berindei
> Labels: testsuite_stability
> Fix For: 9.0.0.Alpha3
>
>
> When invoking a non-transactional write command remotely, {{BaseDistributionInterceptor}} expects to receive an {{OutdatedTopologyException}} wrapped in a {{RemoteException}} if the remote node saw a newer cache topology.
> However, {{OutdatedTopologyException}} is handled differently by {{JGroupsTransport}}, and it is *not* wrapped in a {{RemoteException}}. Because of this, {{BaseDistributionInterceptor}} doesn't update the value matcher, and the retried command may fail when it sees its own value.
> This makes {{NonTxPutIfAbsentDuringLeaveStressTest}} fail randomly with
> {noformat}
> java.lang.AssertionError: expected:<null> but was:<value_7_0>
> at org.testng.AssertJUnit.fail(AssertJUnit.java:59)
> at org.testng.AssertJUnit.failNotEquals(AssertJUnit.java:364)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:80)
> at org.testng.AssertJUnit.assertEquals(AssertJUnit.java:88)
> at org.infinispan.distribution.rehash.NonTxPutIfAbsentDuringLeaveStressTest.testNodeLeavingDuringPutIfAbsent(NonTxPutIfAbsentDuringLeaveStressTest.java:101)
> {noformat}
> Note that this is different from ISPN-6451: there, the assertion message is {{AssertionError: expected:<value_48_1> but was:<null>}}, and it is most likely caused by ISPN-3918.
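The mismatch described above comes down to a check that only recognises the wrapped form of the exception. A minimal sketch of a check that accepts both forms follows. The two exception classes are local stand-ins declared only so the example is self-contained (they are not the Infinispan classes), and the {{Matcher}} enum with its two constants is hypothetical, standing in for whatever value matcher the interceptor switches to on retry.

```java
/** Hypothetical sketch; the nested types are local stand-ins, not Infinispan classes. */
final class RetryDetectionSketch {
    /** Stand-in for the exception signalling that the remote node saw a newer topology. */
    static class OutdatedTopologyException extends RuntimeException {
        OutdatedTopologyException(String message) {
            super(message);
        }
    }

    /** Stand-in for the wrapper used for exceptions thrown on a remote node. */
    static class RemoteException extends RuntimeException {
        RemoteException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical matcher modes: strict for the first attempt, tolerant for retries. */
    enum Matcher { STRICT, RETRY_TOLERANT }

    /** True for the bare exception as well as for the wrapped one. */
    static boolean isOutdatedTopology(Throwable t) {
        if (t instanceof OutdatedTopologyException) {
            return true;
        }
        return t instanceof RemoteException
                && t.getCause() instanceof OutdatedTopologyException;
    }

    /** Picks the matcher for the next attempt based on the failure that was seen. */
    static Matcher matcherForRetry(Throwable failure, Matcher current) {
        return isOutdatedTopology(failure) ? Matcher.RETRY_TOLERANT : current;
    }
}
```

A check that only tests for the wrapper would leave the matcher at its strict setting when the bare exception arrives, which is the situation in which a retried command can fail on seeing its own value.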