[JBoss JIRA] (ISPN-3366) Data loss when entry forwarding to primary owner and primary owner shutdown
by Takayoshi Kimura (JIRA)
[ https://issues.jboss.org/browse/ISPN-3366?page=com.atlassian.jira.plugin.... ]
Takayoshi Kimura commented on ISPN-3366:
----------------------------------------
Looks like the problem symptom is different, it retries (this fixes the original problem) but doesn't commit the entry at the last try. There is no "About to commit entry" log. Probably we need to tweak EntryWrappingInterceptor as well?
> Data loss when entry forwarding to primary owner and primary owner shutdown
> ---------------------------------------------------------------------------
>
> Key: ISPN-3366
> URL: https://issues.jboss.org/browse/ISPN-3366
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
> Reporter: Takayoshi Kimura
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 5.2.8.Final, 6.0.0.Alpha2, 6.0.0.Final
>
> Attachments: ISPN-3366-full-logs-3rd.zip, ISPN-3366-logs.zip
>
>
> Looks like a problem in entry forwarding.
> Here is test scenario:
> * DIST numOwners=2, start with 4 nodes cluster then normal shutdown 1 node during load
> * HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
> After the test run, the numberOfEntries on each node are:
> * node1: 26608
> * node2: 26622
> * node3: 26746
> * node4: 0
> Total is 79976 and HotRod client received 11 errors, so 79976 + (11 * 2) = 79998. It means 1 entry is completely missing.
> Let's take a look at the missing entry, hash(thread16key59) = 574ff563.
> Current CH: owners(574ff563) are [node4, node1]
> The events sequence is:
> * hotrod -> node1
> * node1 forwarding it to primary owner node4
> * node4 doesn't process the forwarded entry, shutdown
> Result owners(7c29bccb) is [] empty. This entry is completely lost without any errors.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months
[JBoss JIRA] (ISPN-3366) Data loss when entry forwarding to primary owner and primary owner shutdown
by Takayoshi Kimura (JIRA)
[ https://issues.jboss.org/browse/ISPN-3366?page=com.atlassian.jira.plugin.... ]
Takayoshi Kimura commented on ISPN-3366:
----------------------------------------
Note that the possibility of the data loss is lowered, it's now 10% or so. I reproduced it 2 times out of 20 times. Previously it was over 50%.
> Data loss when entry forwarding to primary owner and primary owner shutdown
> ---------------------------------------------------------------------------
>
> Key: ISPN-3366
> URL: https://issues.jboss.org/browse/ISPN-3366
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
> Reporter: Takayoshi Kimura
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 5.2.8.Final, 6.0.0.Alpha2, 6.0.0.Final
>
> Attachments: ISPN-3366-full-logs-3rd.zip, ISPN-3366-logs.zip
>
>
> Looks like a problem in entry forwarding.
> Here is test scenario:
> * DIST numOwners=2, start with 4 nodes cluster then normal shutdown 1 node during load
> * HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
> After the test run, the numberOfEntries on each node are:
> * node1: 26608
> * node2: 26622
> * node3: 26746
> * node4: 0
> Total is 79976 and HotRod client received 11 errors, so 79976 + (11 * 2) = 79998. It means 1 entry is completely missing.
> Let's take a look at the missing entry, hash(thread16key59) = 574ff563.
> Current CH: owners(574ff563) are [node4, node1]
> The events sequence is:
> * hotrod -> node1
> * node1 forwarding it to primary owner node4
> * node4 doesn't process the forwarded entry, shutdown
> Result owners(7c29bccb) is [] empty. This entry is completely lost without any errors.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months
[JBoss JIRA] (ISPN-3366) Data loss when entry forwarding to primary owner and primary owner shutdown
by Takayoshi Kimura (JIRA)
[ https://issues.jboss.org/browse/ISPN-3366?page=com.atlassian.jira.plugin.... ]
Takayoshi Kimura reopened ISPN-3366:
------------------------------------
Reopen, we still see data loss.
> Data loss when entry forwarding to primary owner and primary owner shutdown
> ---------------------------------------------------------------------------
>
> Key: ISPN-3366
> URL: https://issues.jboss.org/browse/ISPN-3366
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
> Reporter: Takayoshi Kimura
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 5.2.8.Final, 6.0.0.Alpha2, 6.0.0.Final
>
> Attachments: ISPN-3366-full-logs-3rd.zip, ISPN-3366-logs.zip
>
>
> Looks like a problem in entry forwarding.
> Here is test scenario:
> * DIST numOwners=2, start with 4 nodes cluster then normal shutdown 1 node during load
> * HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
> After the test run, the numberOfEntries on each node are:
> * node1: 26608
> * node2: 26622
> * node3: 26746
> * node4: 0
> Total is 79976 and HotRod client received 11 errors, so 79976 + (11 * 2) = 79998. It means 1 entry is completely missing.
> Let's take a look at the missing entry, hash(thread16key59) = 574ff563.
> Current CH: owners(574ff563) are [node4, node1]
> The events sequence is:
> * hotrod -> node1
> * node1 forwarding it to primary owner node4
> * node4 doesn't process the forwarded entry, shutdown
> Result owners(7c29bccb) is [] empty. This entry is completely lost without any errors.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months
[JBoss JIRA] (ISPN-3366) Data loss when entry forwarding to primary owner and primary owner shutdown
by Takayoshi Kimura (JIRA)
[ https://issues.jboss.org/browse/ISPN-3366?page=com.atlassian.jira.plugin.... ]
Takayoshi Kimura updated ISPN-3366:
-----------------------------------
Attachment: ISPN-3366-full-logs-3rd.zip
Tested with 5.2.8-SNAPSHOT and data loss happened. (In logs it says 5.2.6-SNAPSHOT because Version.java is not updated)
3 entries are missing, first one is hash(thread10key59)=4d12e1a9.
Attached ISPN-3366-full-logs-3rd.zip, from this time total entries 4000, 1/10 of previous size. Also it contains JGroups TRACE logging.
> Data loss when entry forwarding to primary owner and primary owner shutdown
> ---------------------------------------------------------------------------
>
> Key: ISPN-3366
> URL: https://issues.jboss.org/browse/ISPN-3366
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
> Reporter: Takayoshi Kimura
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 5.2.8.Final, 6.0.0.Alpha2, 6.0.0.Final
>
> Attachments: ISPN-3366-full-logs-3rd.zip, ISPN-3366-logs.zip
>
>
> Looks like a problem in entry forwarding.
> Here is test scenario:
> * DIST numOwners=2, start with 4 nodes cluster then normal shutdown 1 node during load
> * HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
> After the test run, the numberOfEntries on each node are:
> * node1: 26608
> * node2: 26622
> * node3: 26746
> * node4: 0
> Total is 79976 and HotRod client received 11 errors, so 79976 + (11 * 2) = 79998. It means 1 entry is completely missing.
> Let's take a look at the missing entry, hash(thread16key59) = 574ff563.
> Current CH: owners(574ff563) are [node4, node1]
> The events sequence is:
> * hotrod -> node1
> * node1 forwarding it to primary owner node4
> * node4 doesn't process the forwarded entry, shutdown
> Result owners(7c29bccb) is [] empty. This entry is completely lost without any errors.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months
[JBoss JIRA] (ISPN-3366) Data loss when entry forwarding to primary owner and primary owner shutdown
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3366?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3366:
--------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> Data loss when entry forwarding to primary owner and primary owner shutdown
> ---------------------------------------------------------------------------
>
> Key: ISPN-3366
> URL: https://issues.jboss.org/browse/ISPN-3366
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
> Reporter: Takayoshi Kimura
> Assignee: Dan Berindei
> Priority: Critical
> Attachments: ISPN-3366-logs.zip
>
>
> Looks like a problem in entry forwarding.
> Here is test scenario:
> * DIST numOwners=2, start with 4 nodes cluster then normal shutdown 1 node during load
> * HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
> After the test run, the numberOfEntries on each node are:
> * node1: 26608
> * node2: 26622
> * node3: 26746
> * node4: 0
> Total is 79976 and HotRod client received 11 errors, so 79976 + (11 * 2) = 79998. It means 1 entry is completely missing.
> Let's take a look at the missing entry, hash(thread16key59) = 574ff563.
> Current CH: owners(574ff563) are [node4, node1]
> The events sequence is:
> * hotrod -> node1
> * node1 forwarding it to primary owner node4
> * node4 doesn't process the forwarded entry, shutdown
> Result owners(7c29bccb) is [] empty. This entry is completely lost without any errors.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months
[JBoss JIRA] (ISPN-3366) Data loss when entry forwarding to primary owner and primary owner shutdown
by Mircea Markus (JIRA)
[ https://issues.jboss.org/browse/ISPN-3366?page=com.atlassian.jira.plugin.... ]
Mircea Markus updated ISPN-3366:
--------------------------------
Fix Version/s: 5.2.8.Final
6.0.0.Alpha2
6.0.0.Final
> Data loss when entry forwarding to primary owner and primary owner shutdown
> ---------------------------------------------------------------------------
>
> Key: ISPN-3366
> URL: https://issues.jboss.org/browse/ISPN-3366
> Project: Infinispan
> Issue Type: Bug
> Components: Distributed Cache
> Affects Versions: 5.2.4.Final, 6.0.0.Alpha1
> Reporter: Takayoshi Kimura
> Assignee: Dan Berindei
> Priority: Critical
> Fix For: 5.2.8.Final, 6.0.0.Alpha2, 6.0.0.Final
>
> Attachments: ISPN-3366-logs.zip
>
>
> Looks like a problem in entry forwarding.
> Here is test scenario:
> * DIST numOwners=2, start with 4 nodes cluster then normal shutdown 1 node during load
> * HotRod putIfAbsent accesses from 40 threads (1 process, 1 remote cache instance), 40000 entries total
> After the test run, the numberOfEntries on each node are:
> * node1: 26608
> * node2: 26622
> * node3: 26746
> * node4: 0
> Total is 79976 and HotRod client received 11 errors, so 79976 + (11 * 2) = 79998. It means 1 entry is completely missing.
> Let's take a look at the missing entry, hash(thread16key59) = 574ff563.
> Current CH: owners(574ff563) are [node4, node1]
> The events sequence is:
> * hotrod -> node1
> * node1 forwarding it to primary owner node4
> * node4 doesn't process the forwarded entry, shutdown
> Result owners(7c29bccb) is [] empty. This entry is completely lost without any errors.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months
[JBoss JIRA] (ISPN-3391) Upgrade to JBoss Marshalling 2.0.0.Beta1
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-3391?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-3391:
-------------------------------
Status: Pull Request Sent (was: Open)
> Upgrade to JBoss Marshalling 2.0.0.Beta1
> ----------------------------------------
>
> Key: ISPN-3391
> URL: https://issues.jboss.org/browse/ISPN-3391
> Project: Infinispan
> Issue Type: Component Upgrade
> Components: Build process, Marshalling
> Affects Versions: 6.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 6.0.0.Alpha2
>
>
> Quoting David Lloyd:
> The following things should be known about this update:
> * There are some API incompatibilities (in particular, Creators are gone, and Externalizers changed a little bit)
> * The binary protocol is 100% identical
> 2.x can interoperate with 1.x over the wire with no problems
> * The changes to Infinispan seem pretty minimal
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months
[JBoss JIRA] (ISPN-3391) Upgrade to JBoss Marshalling 2.0.0.Beta1
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-3391?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-3391:
-------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done
> Upgrade to JBoss Marshalling 2.0.0.Beta1
> ----------------------------------------
>
> Key: ISPN-3391
> URL: https://issues.jboss.org/browse/ISPN-3391
> Project: Infinispan
> Issue Type: Component Upgrade
> Components: Build process, Marshalling
> Affects Versions: 6.0.0.Alpha1
> Reporter: Dan Berindei
> Assignee: Dan Berindei
> Fix For: 6.0.0.Alpha2
>
>
> Quoting David Lloyd:
> The following things should be known about this update:
> * There are some API incompatibilities (in particular, Creators are gone, and Externalizers changed a little bit)
> * The binary protocol is 100% identical
> 2.x can interoperate with 1.x over the wire with no problems
> * The changes to Infinispan seem pretty minimal
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months