[ https://issues.jboss.org/browse/ISPN-3357?page=com.atlassian.jira.plugin.... ]
Mircea Markus commented on ISPN-3357:
-------------------------------------
[~dan.berindei] What about this for a solution?
ATM NonTxDistributionInterceptor.remoteGetBeforeWrite does a remote get if there's
a rehash going on (our situation) and uses the value it obtains (incorrect in this
situation, as described by Takayoshi) to determine whether the conditional operation
should succeed. A nicer approach would be to have this information forwarded from an
existing owner, which would also set the PutKeyValueCommand.ignorePreviousValue flag to
true. In other words, the solution would be for the conditional commands to never rely on
the forwarding, i.e. we don't ever send the write to the *new* owner directly from the
operation originator.
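
To make the proposal concrete, here is a minimal sketch of how the flag could change the outcome on a new owner. It is an illustrative model only, not Infinispan code: the PutCommand class, the apply and commitOnNewOwner methods, and the use of plain maps as node storage are all hypothetical. Only the ignorePreviousValue name comes from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

final class ForwardedConditionalWriteSketch {
    // Hypothetical stand-in for a conditional put command; not the Infinispan class.
    static final class PutCommand {
        final String key;
        final String value;
        final boolean ignorePreviousValue;

        PutCommand(String key, String value, boolean ignorePreviousValue) {
            this.key = key;
            this.value = value;
            this.ignorePreviousValue = ignorePreviousValue;
        }
    }

    // Applies a putIfAbsent-style command on a node that is still receiving state.
    // 'local' is that node's data; 'existingOwner' is the node answering the remote get.
    // Returns true if the entry was committed locally.
    static boolean apply(Map<String, String> local, Map<String, String> existingOwner, PutCommand cmd) {
        if (!cmd.ignorePreviousValue) {
            // Remote get before write: fall back to the existing owner when there is no local copy.
            String previous = local.containsKey(cmd.key) ? local.get(cmd.key) : existingOwner.get(cmd.key);
            if (previous != null) {
                return false; // the condition looks unmet, so nothing is committed
            }
        }
        // With the flag set, the existing owner has already decided; just store the value.
        local.put(cmd.key, cmd.value);
        return true;
    }

    // Returns whether the new owner ends up holding the entry.
    static boolean commitOnNewOwner(boolean ignorePreviousValue) {
        Map<String, String> existingOwner = new HashMap<>();
        Map<String, String> newOwner = new HashMap<>();
        existingOwner.put("thread01key151", "v"); // the existing owner has already committed
        apply(newOwner, existingOwner, new PutCommand("thread01key151", "v", ignorePreviousValue));
        return newOwner.containsKey("thread01key151");
    }

    public static void main(String[] args) {
        System.out.println("flag=false -> stored on new owner: " + commitOnNewOwner(false));
        System.out.println("flag=true  -> stored on new owner: " + commitOnNewOwner(true));
    }
}
```

In this model the new owner skips the write when it evaluates the condition itself, and stores the entry when the decision arrives with the flag set.
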
Insufficient owners with putIfAbsent during node join rebalance
---------------------------------------------------------------
Key: ISPN-3357
URL: https://issues.jboss.org/browse/ISPN-3357
Project: Infinispan
Issue Type: Bug
Components: Distributed Cache
Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
Reporter: Takayoshi Kimura
Assignee: Dan Berindei
Priority: Critical
Attachments: 7c29bccb.log
Here is the test scenario:
* DIST numOwners=2, start with a 3-node cluster, then join 1 node during load
* HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000
entries total
After the test run, the numberOfEntries on each node is:
* node1: 20074
* node2: 19888
* node3: 20114
* node4: 18885
The total is 78961, so 1039 entries are missing. There were no errors on the HotRod client
side, so 80000 entries (40000 keys x numOwners=2) should be there.
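
The arithmetic behind these numbers can be checked with a short sketch. The class and method names are made up for illustration; the per-node counts, the key count, and numOwners are the values reported above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EntryCountCheck {
    // Each distinct key should be stored on numOwners nodes.
    static int expectedCopies(int distinctKeys, int numOwners) {
        return distinctKeys * numOwners;
    }

    // Sums the numberOfEntries values reported by every node.
    static int observedCopies(Map<String, Integer> entriesPerNode) {
        int total = 0;
        for (int count : entriesPerNode.values()) {
            total += count;
        }
        return total;
    }

    // Per-node counts as reported in this issue.
    static Map<String, Integer> reportedCounts() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("node1", 20074);
        counts.put("node2", 19888);
        counts.put("node3", 20114);
        counts.put("node4", 18885);
        return counts;
    }

    public static void main(String[] args) {
        int expected = expectedCopies(40000, 2);
        int observed = observedCopies(reportedCounts());
        // prints: expected=80000 observed=78961 missing=1039
        System.out.println("expected=" + expected + " observed=" + observed
                + " missing=" + (expected - observed));
    }
}
```
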
Let's take a look at an example missing entry, hash(thread01key151) = 7c29bccb.
Current CH: owners(7c29bccb) are [node1, node2]
Pending CH: owners(7c29bccb) are [node1, node2, node4]
Balanced CH: owners(7c29bccb) are [node1, node4]
The event sequence is:
* hotrod -> node1
* node1 -> node2, node4
* node2 committed entry
* node4 performed a clustered get before the write, got a value from node2, and will not
commit the entry because it thinks the entry was not changed/created
* node1 committed entry
* node2 invalidates the entry because it's no longer an owner
As a result, only node1 holds the entry for 7c29bccb; the copy on node4 is missing. This
entry may be lost completely by further rebalances when node4 is the donor for this segment.
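
The event sequence above can be replayed with a small in-memory model. This is a simplified sketch under stated assumptions, not Infinispan code: the Node class and replay method are hypothetical, each node's data container is a plain map, and the remote get is modelled as a direct read from node2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PutIfAbsentRebalanceSketch {
    // Hypothetical model of a node: a name plus a local data container.
    static final class Node {
        final String name;
        final Map<String, String> data = new HashMap<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Replays the reported sequence and returns the nodes that still hold the key.
    static List<String> replay() {
        String key = "thread01key151";
        String value = "v";
        Node node1 = new Node("node1");
        Node node2 = new Node("node2");
        Node node4 = new Node("node4");

        // hotrod -> node1, then node1 -> node2, node4.
        // node2 commits the entry first.
        node2.data.putIfAbsent(key, value);

        // node4 has no local copy yet, so it does a clustered get before the write.
        String previous = node4.data.get(key);
        if (previous == null) {
            previous = node2.data.get(key); // remote get answered by node2
        }
        if (previous == null) {
            node4.data.put(key, value); // not reached: the key already looks present
        }

        // node1 commits the entry.
        node1.data.putIfAbsent(key, value);

        // node2 is not an owner in the balanced CH, so it invalidates the entry.
        node2.data.remove(key);

        List<String> holders = new ArrayList<>();
        for (Node node : Arrays.asList(node1, node2, node4)) {
            if (node.data.containsKey(key)) {
                holders.add(node.name);
            }
        }
        return holders;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints: [node1]
    }
}
```
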