[infinispan-issues] [JBoss JIRA] Commented: (ISPN-682) return value of Remove command is unreliable in DIST without L1
Trustin Lee (JIRA)
jira-events at lists.jboss.org
Wed Oct 27 14:42:54 EDT 2010
[ https://jira.jboss.org/browse/ISPN-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559767#action_12559767 ]
Trustin Lee commented on ISPN-682:
----------------------------------
In the following scenario (modified from the patch's testSkipLookupOnRemove()):
MagicKey k1 = new MagicKey(c1);
final String value = "SomethingToSayHere";
c1.put(k1, value);
...
assert c3.containsKey(k1); // (1)
assert value.equals(c3.remove(k1)); // (2)
If L1 cache is disabled, assertion (2) fails while (1) passes.
The root cause lies in how remote operations work. containsKey() is actually implemented as 'get(key) != null'. Because the L1 cache is disabled, c3.datacontainer.get(k1) returns null. DistributionInterceptor.visitGetKeyValueCommand() then calls remoteGetAndStoreInL1() to retrieve the missing entry from c1 (or maybe c2). The retrieval of k1 succeeds and visitGetKeyValueCommand() returns non-null. Note that the retrieved entry is not stored in c3 because the L1 cache is disabled, as you can see in DistributionInterceptor.realRemoteGet().
So, now we know we can get an entry from other nodes even if the current node doesn't have it. However, modifying an entry is more complicated.
When c3.remove(k1) is called, DistributionInterceptor.visitRemoveCommand() again ends up calling remoteGetAndStoreInL1() and realRemoteGet() to load the missing entry into c3, so that the subsequent modification can access it directly from c3. This would have been fine if the L1 cache were enabled, but because it is disabled, realRemoteGet() does nothing but waste bandwidth: the fetched entry is never stored in c3. RemoveCommand.perform() therefore finds no local entry and the call returns null.
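To make the failure mode concrete, here is a minimal toy model of the two code paths described above. It is not Infinispan code: ToyNode, remoteGet() and the single-owner layout are hypothetical simplifications, and the sketch assumes only what the analysis states, namely that a remotely fetched entry is kept locally only when L1 is enabled and that remove() takes its return value from the local container.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only. Every name here is hypothetical; none of this is Infinispan API. */
public class RemoveWithoutL1Sketch {

    /** A node with a local data container and, for non-owners, a link to the key's owner. */
    static final class ToyNode {
        private final Map<String, String> dataContainer = new HashMap<>();
        private final boolean l1Enabled;
        private final ToyNode owner; // null when this node is itself the owner

        ToyNode(boolean l1Enabled, ToyNode owner) {
            this.l1Enabled = l1Enabled;
            this.owner = owner;
        }

        /** Stores directly in the local container (used on the owner in this sketch). */
        void putLocal(String key, String value) {
            dataContainer.put(key, value);
        }

        /** Fetches from the owner; a local copy is kept only when L1 is enabled. */
        private String remoteGet(String key) {
            if (owner == null) {
                return null;
            }
            String value = owner.dataContainer.get(key);
            if (value != null && l1Enabled) {
                dataContainer.put(key, value);
            }
            return value;
        }

        /** Read path: falls back to a remote fetch and returns the fetched value. */
        String get(String key) {
            String local = dataContainer.get(key);
            return local != null ? local : remoteGet(key);
        }

        boolean containsKey(String key) {
            return get(key) != null;
        }

        /** Write path: the remote fetch result is discarded; only the local container is consulted. */
        String remove(String key) {
            if (!dataContainer.containsKey(key)) {
                remoteGet(key); // wasted round trip when L1 is disabled
            }
            String previous = dataContainer.remove(key);
            if (owner != null) {
                owner.dataContainer.remove(key); // assumed: the removal itself still reaches the owner
            }
            return previous;
        }
    }

    /** Runs the scenario from the comment and returns what c3.remove(k1) reports. */
    static String removeFromNonOwner(boolean l1Enabled) {
        ToyNode c1 = new ToyNode(l1Enabled, null);
        c1.putLocal("k1", "SomethingToSayHere");
        ToyNode c3 = new ToyNode(l1Enabled, c1);
        if (!c3.containsKey("k1")) {
            throw new IllegalStateException("assertion (1) should always pass");
        }
        return c3.remove("k1");
    }

    public static void main(String[] args) {
        System.out.println("L1 enabled:  remove returned " + removeFromNonOwner(true));
        System.out.println("L1 disabled: remove returned " + removeFromNonOwner(false));
    }
}
```

In this model the read path hands back the remotely fetched value, while the write path depends on the fetch having populated the local container, which is why only the remove is affected when L1 is off.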
One solution is to modify DistributionInterceptor.visitRemoveCommand() (and all other modification operations?) to call DistributionManager.applyRemoteTxLog() if the affected entry does not exist in the current cache, at the cost of a potential performance drop (maybe not much if consistent hashing works well). However, it will make things much more complicated if the operation is part of a transaction (i.e. should we do XA?)
The other solution is to just load the entries for write operations even if the L1 cache is disabled. However, I doubt this will ever work because it would make it impossible to track exactly which node has which entry.
Since I'm not sure what the best solution is, I'd like to wait for some comments before implementing anything.
> return value of Remove command is unreliable in DIST without L1
> ---------------------------------------------------------------
>
> Key: ISPN-682
> URL: https://jira.jboss.org/browse/ISPN-682
> Project: Infinispan
> Issue Type: Bug
> Components: Core API, Distributed Cache
> Affects Versions: 4.2.0.ALPHA2
> Reporter: Sanne Grinovero
> Assignee: Trustin Lee
> Fix For: 4.2.0.BETA1
>
> Attachments: ISPN-682-unittest.patch
>
>
> cache.remove(key) returns null in DIST mode when L1 is disabled (unreliableReturnValues was not enabled), while the expected value is !=null.
> providing a unit test highlighting the difference between L1 on/off: it seems to work fine when L1 is enabled.