[
https://issues.jboss.org/browse/ISPN-3366?page=com.atlassian.jira.plugin....
]
Dan Berindei commented on ISPN-3366:
------------------------------------
[~tkimura], could you try to build again from branch
https://github.com/danberindei/infinispan/tree/t_3366_m and see if you can still reproduce
the issue?
The fix there is still incomplete: it breaks some putForExternalRead tests. It should
fix the data loss, though, because it retries the command when the topology changes on the
primary owner. BTW, this is the approach that I said wouldn't work in my previous
comment. I have since realized that the topology id is set on the originator, so checking
for changes on the primary owner should work. The challenge now is to limit the cases
where we retry the command, because putForExternalRead is asynchronous and retrying
doesn't really work in that case.
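A minimal sketch of the retry idea described above, for illustration only. It is not
the code on the t_3366_m branch and does not use Infinispan's API: the class, the
exception, the attempt limit and the "sync" flag are all hypothetical names. It assumes
the originator stamps the command with its topology id, the primary owner rejects a
command whose id differs from its own, and only synchronous commands are retried.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative simulation only; every name here is hypothetical.
final class TopologyRetrySketch {
    static final int MAX_ATTEMPTS = 3; // assumed limit, not taken from the issue

    /** Hypothetical signal that a command carried an outdated topology id. */
    static final class OutdatedTopologyException extends RuntimeException {
        final int currentTopologyId;
        OutdatedTopologyException(int currentTopologyId) {
            this.currentTopologyId = currentTopologyId;
        }
    }

    /** Stand-in for the primary owner of a key. */
    static final class PrimaryOwner {
        int topologyId;
        final Map<String, String> store = new HashMap<>();

        PrimaryOwner(int topologyId) {
            this.topologyId = topologyId;
        }

        // Reject the write if the topology changed since the originator sent it.
        void putIfAbsent(int commandTopologyId, String key, String value) {
            if (commandTopologyId != topologyId) {
                throw new OutdatedTopologyException(topologyId);
            }
            store.putIfAbsent(key, value);
        }
    }

    /** Returns true if the write was applied, false if it was dropped. */
    static boolean write(PrimaryOwner owner, int originatorTopologyId,
                         String key, String value, boolean sync) {
        int topologyId = originatorTopologyId;
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            try {
                owner.putIfAbsent(topologyId, key, value);
                return true;
            } catch (OutdatedTopologyException e) {
                if (!sync) {
                    return false; // asynchronous command: nobody is waiting, so no retry
                }
                topologyId = e.currentTopologyId; // retry with the newer topology id
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PrimaryOwner owner = new PrimaryOwner(2); // owner already saw topology 2
        System.out.println(write(owner, 1, "thread16key59", "v", true));  // sync, retried
        System.out.println(write(owner, 1, "otherKey", "v", false));      // async, dropped
    }
}
```

In this simplified model the retry goes to the same owner object. In a real cluster the
originator would presumably have to look up the owners again in the new topology before
retrying.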
Data loss when entry forwarding to primary owner and primary owner
shutdown
---------------------------------------------------------------------------
Key: ISPN-3366
URL: https://issues.jboss.org/browse/ISPN-3366
Project: Infinispan
Issue Type: Bug
Components: Distributed Cache
Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
Reporter: Takayoshi Kimura
Assignee: Dan Berindei
Priority: Critical
Fix For: 5.2.8.Final, 6.0.0.Alpha3, 6.0.0.Final
Attachments: ISPN-3366-full-logs-3rd.zip, ISPN-3366-logs.zip
This looks like a problem in entry forwarding.
Here is the test scenario:
* DIST, numOwners=2; start with a 4-node cluster, then shut down 1 node normally during load
* HotRod putIfAbsent calls from 40 threads (1 process, 1 remote cache instance), 40000
entries total
After the test run, the numberOfEntries on each node is:
* node1: 26608
* node2: 26622
* node3: 26746
* node4: 0
The total is 79976 and the HotRod client received 11 errors, so 79976 + (11 * 2) = 79998
of the expected 40000 * 2 = 80000 copies are accounted for. That means 1 entry is
completely missing.
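The arithmetic can be checked with a short sketch. It assumes that each of the 11 failed
putIfAbsent calls stored no copy at all and that a lost entry loses all of its
numOwners copies. The class and method names are made up for this example.

```java
// Illustrative check of the entry accounting; names are hypothetical.
final class EntryAccounting {
    /** Copies expected in the cluster when every entry is stored on numOwners nodes. */
    static long expectedCopies(long entries, int numOwners) {
        return entries * numOwners;
    }

    /** Entries that are neither stored nor explained by a client-side error. */
    static long missingEntries(long entries, int numOwners,
                               long storedCopies, long failedWrites) {
        long accounted = storedCopies + failedWrites * numOwners;
        long missingCopies = expectedCopies(entries, numOwners) - accounted;
        return missingCopies / numOwners;
    }

    public static void main(String[] args) {
        long stored = 26608 + 26622 + 26746 + 0; // node1..node4
        System.out.println(stored);                                   // prints 79976
        System.out.println(missingEntries(40000, 2, stored, 11));     // prints 1
    }
}
```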
Let's take a look at the missing entry, hash(thread16key59) = 574ff563.
Current CH: owners(574ff563) are [node4, node1]
The sequence of events is:
* hotrod -> node1
* node1 forwards it to the primary owner, node4
* node4 shuts down without processing the forwarded entry
Result: owners(7c29bccb) is [] (empty). The entry is completely lost without any errors.