[JBoss JIRA] Created: (ISPN-1365) Data left in inconsistent state on rehash.

Erik Salter (JIRA)

Thursday, 1 September 2011

1:52 p.m.

Data left in inconsistent state on rehash.
------------------------------------------

Key: ISPN-1365
URL: https://issues.jboss.org/browse/ISPN-1365
Project: Infinispan
Issue Type: Bug
Components: Core API
Affects Versions: 5.0.0.FINAL
Reporter: Erik Salter
Assignee: Manik Surtani

I'm seeing a lot of data inconsistencies on a rehash, especially if there is a lot of lock contention for keys on caches participating in a transaction.

Attached is a unit test that can reproduce the problem quite readily. This uses the grouping API, eager locking of a single node, and the distribution framework to effect "local" transactions.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Erik Salter (JIRA)

Thursday, 1 September

1:56 p.m.

New subject: [JBoss JIRA] Updated: (ISPN-1365) Data left in inconsistent state on rehash.

[ https://issues.jboss.org/browse/ISPN-1365?page=com.atlassian.jira.plugin.... ]

Erik Salter updated ISPN-1365:
------------------------------
Attachment: cacheTest_rehash.zip

The test in question is at: net.beaumaris.cachetest.group.GroupingRehashTest

Manik Surtani (JIRA)

Friday, 2 September

10:04 a.m.

New subject: [JBoss JIRA] Updated: (ISPN-1365) Data left in inconsistent state on rehash.

[ https://issues.jboss.org/browse/ISPN-1365?page=com.atlassian.jira.plugin.... ]

Manik Surtani updated ISPN-1365:
--------------------------------
Assignee: Dan Berindei (was: Manik Surtani)
Fix Version/s: 5.1.0.ALPHA1
               5.1.0.FINAL

Dan Berindei (JIRA)

Sunday, 4 September

1:22 p.m.

New subject: [JBoss JIRA] Commented: (ISPN-1365) Data left in inconsistent state on rehash.

[ https://issues.jboss.org/browse/ISPN-1365?page=com.atlassian.jira.plugin.... ]

Dan Berindei commented on ISPN-1365:
------------------------------------
I've run the test a few times and I got two kinds of errors:

1. The test didn't create all caches when starting up a node, so sometimes the alloc cache is not created at all on a node and the cluster is never properly formed.

2. Most other times the cluster is formed through a succession of merges. The test doesn't wait for all the nodes to join the cluster, so it is possible for two transactions to write to the same key on two different partitions, and then one of those values will be lost on merge.

Erik Salter (JIRA)

3:43 p.m.

New subject: [JBoss JIRA] Commented: (ISPN-1365) Data left in inconsistent state on rehash.

[ https://issues.jboss.org/browse/ISPN-1365?page=com.atlassian.jira.plugin.... ]

Erik Salter commented on ISPN-1365:
-----------------------------------
For the second case, how do we manage this in production?

Manik Surtani (JIRA)

Monday, 5 September

4:35 a.m.

New subject: [JBoss JIRA] Commented: (ISPN-1365) Data left in inconsistent state on rehash.

[ https://issues.jboss.org/browse/ISPN-1365?page=com.atlassian.jira.plugin.... ]

Manik Surtani commented on ISPN-1365:
-------------------------------------
You can register a listener for a merge view and reconcile (re-write) the data on your own. Infinispan does not automatically deal with eventual consistency (partition tolerance) yet.
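
To make the "reconcile on your own" step concrete, here is a minimal, library-free sketch of what such reconciliation could look like once a merge has been detected. It does not use the Infinispan API: the class name, `VersionedValue`, `reconcile`, and the "highest version wins" policy are all assumptions made for illustration. In a real deployment this logic would be triggered from the merge-view listener, and the conflict policy would be application specific.

```java
import java.util.HashMap;
import java.util.Map;

/** Library-free sketch: reconciling the data of two partitions after a merge. */
final class MergeReconcileSketch {

    /** Hypothetical wrapper: a value plus an application-level version. */
    static final class VersionedValue {
        final String value;
        final long version;

        VersionedValue(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    /**
     * Combines the contents of two partitions key by key.
     * Assumed policy: the entry with the higher version wins; on a tie the
     * entry from the left partition is kept.
     */
    static Map<String, VersionedValue> reconcile(Map<String, VersionedValue> left,
                                                 Map<String, VersionedValue> right) {
        Map<String, VersionedValue> merged = new HashMap<>(left);
        for (Map.Entry<String, VersionedValue> e : right.entrySet()) {
            merged.merge(e.getKey(), e.getValue(),
                    (a, b) -> b.version > a.version ? b : a);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two partitions wrote to the same key while the cluster was split.
        Map<String, VersionedValue> partitionA = new HashMap<>();
        partitionA.put("k", new VersionedValue("written-in-A", 1));

        Map<String, VersionedValue> partitionB = new HashMap<>();
        partitionB.put("k", new VersionedValue("written-in-B", 2));

        // After the merge, the reconciled data would be re-written to the cache.
        Map<String, VersionedValue> merged = reconcile(partitionA, partitionB);
        System.out.println("k = " + merged.get("k").value);
    }
}
```

A version-based policy only suits data where a later write may safely replace an earlier one. Where both writes must survive, or where a key models exclusive ownership of an external resource, the reconciliation step needs application knowledge that a generic rule like this cannot supply.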

Dan Berindei (JIRA)

Friday, 9 September

11:18 a.m.

New subject: [JBoss JIRA] Updated: (ISPN-1365) Data left in inconsistent state on rehash.

[ https://issues.jboss.org/browse/ISPN-1365?page=com.atlassian.jira.plugin.... ]

Dan Berindei updated ISPN-1365:
-------------------------------
Status: Pull Request Sent (was: Open)
Git Pull Request: https://github.com/infinispan/infinispan/pull/522

Distributed tasks were allowed to execute on a joining node before it was finished starting. That meant a distributed task could modify a key before the rehash, but then during the rehash another node could push an old value and overwrite it, resulting in data loss.
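
The race described above can be replayed in a few lines of plain Java. This is a single-threaded sketch under stated assumptions, not the code from the pull request: `JoiningNodeSketch`, `executeTask`, `applyPushedState`, and the `gateTasks` flag are hypothetical names, and "rejecting a task until the node has started" is just one plausible way to close the window.

```java
import java.util.HashMap;
import java.util.Map;

/** Library-free sketch of a task racing with the rehash on a joining node. */
final class JoiningNodeSketch {
    private final Map<String, String> store = new HashMap<>();
    private final boolean gateTasks;  // hypothetical switch: refuse tasks until started
    private boolean started = false;

    JoiningNodeSketch(boolean gateTasks) {
        this.gateTasks = gateTasks;
    }

    /** A distributed task tries to write a key; returns false if it was refused. */
    boolean executeTask(String key, String value) {
        if (gateTasks && !started) {
            return false; // the caller is expected to retry once the node is up
        }
        store.put(key, value);
        return true;
    }

    /** Rehash: another node pushes its (possibly stale) entries, then the join completes. */
    void applyPushedState(Map<String, String> pushed) {
        store.putAll(pushed);
        started = true;
    }

    String get(String key) {
        return store.get(key);
    }

    /** Replays the scenario and returns the final value stored under "k". */
    static String run(boolean gateTasks) {
        JoiningNodeSketch joiner = new JoiningNodeSketch(gateTasks);
        Map<String, String> staleState = new HashMap<>();
        staleState.put("k", "old");

        boolean accepted = joiner.executeTask("k", "new"); // arrives before the rehash
        joiner.applyPushedState(staleState);               // rehash pushes the old value
        if (!accepted) {
            joiner.executeTask("k", "new");                // retry after the join
        }
        return joiner.get("k");
    }

    public static void main(String[] args) {
        System.out.println("without gate: " + run(false));
        System.out.println("with gate:    " + run(true));
    }
}
```

Without the gate the early write is overwritten by the pushed state and the key ends up holding the old value. With the gate the task is refused until the join has completed, so the retried write lands after the rehash and is kept.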

Galder Zamarreño (JIRA)

11:36 a.m.

New subject: [JBoss JIRA] Updated: (ISPN-1365) Data left in inconsistent state on rehash.

[ https://issues.jboss.org/browse/ISPN-1365?page=com.atlassian.jira.plugin.... ]

Galder Zamarreño updated ISPN-1365:
-----------------------------------
Status: Resolved (was: Pull Request Sent)
Resolution: Done

Dan Berindei (JIRA)

Monday, 12 September

4:15 a.m.

New subject: [JBoss JIRA] Commented: (ISPN-1365) Data left in inconsistent state on rehash.

[ https://issues.jboss.org/browse/ISPN-1365?page=com.atlassian.jira.plugin.... ]

Dan Berindei commented on ISPN-1365:
------------------------------------
It turns out Erik's problem is not caused by the merge, but by the partitioning itself. He is using ISPN locks to ensure that external resources can only have a single owner, so any partitioning will allow multiple owners and break his model.

His conclusion was that in production the load balancer will only connect to the nodes that are part of the cluster, so when a node starts up in a cluster by itself it will not receive any client requests until it has finished merging with the initial cluster. This will ensure the integrity of the model. A subsequent partition in the cluster can still break his model, but the rehashing process can't help there.

Note that the attached pull request solves a related problem when the node joins the cluster normally, *not* the 2) scenario in my Sep 4 comment. Somehow my environment changed and I'm not seeing any merge views, although my JGroups configuration is the same.

Dan Berindei (JIRA)

4:15 a.m.

New subject: [JBoss JIRA] Updated: (ISPN-1365) Data left in inconsistent state on rehash.

[ https://issues.jboss.org/browse/ISPN-1365?page=com.atlassian.jira.plugin.... ]

Dan Berindei updated ISPN-1365:
-------------------------------
Fix Version/s: 5.0.1.FINAL
Git Pull Request: https://github.com/infinispan/infinispan/pull/522 (was: https://github.com/infinispan/infinispan/pull/522)

infinispan-issues@lists.jboss.org
