[JBoss JIRA] (ISPN-2712) Initial state transfer doesn't appear to all be persisted when using eviction in a replicated cluster
by Sanne Grinovero (JIRA)
[ https://issues.jboss.org/browse/ISPN-2712?page=com.atlassian.jira.plugin.... ]
Sanne Grinovero commented on ISPN-2712:
---------------------------------------
[~anistor] yes, you might want to use passivation with eviction when you have large indexes; it's often not practical to keep them all in memory.
Also, even if you have small indexes, you want to be able to persist them so that grid nodes can be restarted without losing the index.
> Initial state transfer doesn't appear to all be persisted when using eviction in a replicated cluster
> ------------------------------------------------------------------------------------------------------
>
> Key: ISPN-2712
> URL: https://issues.jboss.org/browse/ISPN-2712
> Project: Infinispan
> Issue Type: Bug
> Affects Versions: 5.2.0.CR1
> Reporter: Randall Hauch
> Assignee: Adrian Nistor
> Priority: Critical
> Fix For: 5.2.0.Final
>
> Attachments: ispn_eviction_log.txt, spectrum-repository-infinispan.xml
>
>
> Using a clustered cache with 2 nodes, where the cache in each node is configured identically with replication, eviction and (non-shared) file cache store. (See attached configuration.)
> The first (coordinator) process in the cluster is started and populated with 293 entries. Then the first process continually adds a few entries every 5 seconds. After a short delay, the second process is started, at which point it joins the cluster and starts the state-transfer process; logging shows in the first process that all 293 entries are transferred to the new cluster member, and the second log shows that they are all received. The second process then attempts to look for a specific entry that was created during initial population in the first process. This fails to find the existing entry.
> By enabling trace logging and "IDE breakpoint output messages" around state transfer, it's visible that of the 293 keys, only 218 are placed into the cache; the others are lost.
> (This problem was originally discovered when clustering ModeShape, which behaves roughly in the manner described above. The initial entries that are populated upon initialization are content created when a new repository is started. The second process looks for this content, and if it finds the content it knows not to create all of this initial content. However, if it doesn't find it, it thinks the repository has not yet been initialized and that it should create the initial content. The problem described by this bug then manifests itself in ModeShape through dozens of exceptions because the repository has been corrupted. See MODE-1745 for details on this problem. ModeShape's corresponding known issue for this issue, ISPN-2712, is MODE-1754.)
> The eviction is configured like this:
> {code:xml}
> <eviction strategy="LIRS" maxEntries="1000"/>
> {code}
> The attached log file is from the second process (the "receiver" node) and it contains the following key points:
> * line 40 - the total number of keys & entries to be transferred = 293
> * line 1352 and from there onwards 1358 / 1364 / i + 6 - the data container's size stops growing at 218, while the other entries are being sent. This means that in effect, they are ignored.
> * line 1797 - the loop from {{org.infinispan.statetransfer.StateConsumerImpl#doApplyState}} finishes
> Disabling eviction fixes the problem and all 293 nodes are placed in Node2's cache.
> (I initially marked this as CRITICAL priority, though it is a blocker for our use of Infinispan 5.2.)
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
11 years, 11 months
[JBoss JIRA] (ISPN-2632) Uneven request balancing after node crash
by RH Bugzilla Integration (JIRA)
[ https://issues.jboss.org/browse/ISPN-2632?page=com.atlassian.jira.plugin.... ]
RH Bugzilla Integration commented on ISPN-2632:
-----------------------------------------------
Michal Linhard <mlinhard(a)redhat.com> made a comment on [bug 886549|https://bugzilla.redhat.com/show_bug.cgi?id=886549]
I measured how entry counts and throughput differ between nodes during the test by calculating how far each node's value deviates from the average (counting only active nodes). These are the maximum deviations in entry count and throughput on individual nodes:
32-28-32 test:
entries: 4.8 % (ignoring 6 extreme values)
throughput: 17.6 % (ignoring 6 extreme values)
32-31-32 test:
entries: 3.6 % (ignoring one extreme value)
throughput: 18.6 % (ignoring one extreme value)
8-6-8 test:
entries: 7.9 % (ignoring 10 extreme values)
throughput: 17.1 % (ignoring 3 extreme values)
Other than this the tests are fine, no unusual client or server errors in logs.
So this depends on what we're able to tolerate.
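The deviation-from-average measure described above can be sketched roughly as follows. This is only an illustration: the class and method names are hypothetical, the sample numbers are made up, and the step of discarding "extreme values" is left out because the comment does not say how those were chosen.

```java
final class DeviationSketch {

    /**
     * Returns the largest relative distance of any value from the mean,
     * in percent. The caller is assumed to pass values for active nodes only.
     */
    static double maxDeviationPercent(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.length;

        double max = 0;
        for (double v : values) {
            max = Math.max(max, 100.0 * Math.abs(v - mean) / mean);
        }
        return max;
    }

    public static void main(String[] args) {
        // Hypothetical per-node entry counts, not taken from the test runs.
        double[] entriesPerNode = {90, 100, 110};
        System.out.println(maxDeviationPercent(entriesPerNode));
    }
}
```

The same function would apply to per-node throughput figures.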
> Uneven request balancing after node crash
> -----------------------------------------
>
> Key: ISPN-2632
> URL: https://issues.jboss.org/browse/ISPN-2632
> Project: Infinispan
> Issue Type: Bug
> Components: Remote protocols
> Affects Versions: 5.2.0.CR1
> Reporter: Michal Linhard
> Assignee: Dan Berindei
> Priority: Blocker
> Fix For: 5.2.0.CR2, 5.2.0.Final
>
>
> This is a new manifestation of ISPN-1995, but in this case it happens after killing only one node: the Hot Rod requests aren't very well balanced.
> These runs also still manifest ISPN-2550, which may be the cause of this bug.
> The uneven balancing of requests can be seen here:
> https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EDG6/view/EDG-REPOR...
[JBoss JIRA] (ISPN-2712) Initial state transfer doesn't appear to all be persisted when using eviction in a replicated cluster
by Randall Hauch (JIRA)
[ https://issues.jboss.org/browse/ISPN-2712?page=com.atlassian.jira.plugin.... ]
Randall Hauch commented on ISPN-2712:
-------------------------------------
I noticed that the workaround involved changing the concurrency level to a much smaller value. This might be fine initially, when the cache has a relatively small number of entries, but doesn't this value become impractical when the cache grows to a size that is orders of magnitude larger than the size originally used to set the value?
[JBoss JIRA] (ISPN-2712) Initial state transfer doesn't appear to all be persisted when using eviction in a replicated cluster
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-2712?page=com.atlassian.jira.plugin.... ]
Adrian Nistor commented on ISPN-2712:
-------------------------------------
[~sannegrinovero] I'm curious, do indexes stored in grid use eviction?
[JBoss JIRA] (ISPN-2712) Initial state transfer doesn't appear to all be persisted when using eviction in a replicated cluster
by Adrian Nistor (JIRA)
[ https://issues.jboss.org/browse/ISPN-2712?page=com.atlassian.jira.plugin.... ]
Adrian Nistor commented on ISPN-2712:
-------------------------------------
The logs indicate the entries are properly transferred, so this is not really a state transfer issue. The problem is eviction, or more precisely the counter-intuitive way concurrencyLevel and maxEntries interact, which can be avoided as detailed by Dan.
I have created this small eviction test that uses your settings, both with BoundedConcurrentHashMap directly and through a cache. Both tests behave identically on 5.1.x and 5.2, but you said the problem is not reproducible on 5.1.x. I'm puzzled.
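{code:java}
import static org.testng.AssertJUnit.assertTrue;

import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.eviction.EvictionStrategy;
import org.infinispan.manager.EmbeddedCacheManager;
import org.infinispan.test.SingleCacheManagerTest;
import org.infinispan.test.fwk.TestCacheManagerFactory;
import org.infinispan.transaction.LockingMode;
import org.infinispan.transaction.TransactionMode;
import org.infinispan.util.concurrent.BoundedConcurrentHashMap;
import org.infinispan.util.concurrent.IsolationLevel;
import org.testng.annotations.Test;

// Imports assume the 5.2-era package layout of Infinispan and its test framework.
@Test(groups = "functional", testName = "EvictionTest")
public class EvictionTest extends SingleCacheManagerTest {

   @Override
   protected EmbeddedCacheManager createCacheManager() throws Exception {
      // build a cache with locking and eviction configuration identical to the failing scenario
      ConfigurationBuilder builder = TestCacheManagerFactory.getDefaultCacheConfiguration(true);
      builder.transaction().transactionMode(TransactionMode.TRANSACTIONAL).lockingMode(LockingMode.PESSIMISTIC)
            .eviction().maxEntries(1000).strategy(EvictionStrategy.LIRS)
            .locking().lockAcquisitionTimeout(20000)
            .concurrencyLevel(5000).useLockStriping(false).writeSkewCheck(false).isolationLevel(IsolationLevel.READ_COMMITTED);
      EmbeddedCacheManager cm = TestCacheManagerFactory.createCacheManager(builder);
      cache = cm.getCache();
      return cm;
   }

   public void testCacheEviction() throws Exception {
      // insert far fewer entries than maxEntries (1000)
      for (int i = 0; i < 300; i++) {
         cache.put(i, i);
      }
      System.out.println("cache size: " + cache.size()); // should be roughly 200, the rest was evicted
      assertTrue(1 < cache.size() && cache.size() < 300);
   }

   public void testBoundedConcurrentHashMapEviction() {
      // same settings applied to the map directly: capacity 1000, concurrency level 5000, LIRS eviction
      BoundedConcurrentHashMap<Integer, Integer> map = new BoundedConcurrentHashMap<Integer, Integer>(1000, 5000, BoundedConcurrentHashMap.Eviction.LIRS);
      for (int i = 0; i < 300; i++) {
         map.put(i, i);
      }
      System.out.println("map size: " + map.size()); // should be roughly 200, the rest was evicted
      assertTrue(1 < map.size() && map.size() < 300);
   }
}
{code}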
[JBoss JIRA] (ISPN-2697) HotRodServer startup fails when its record cannot be inserted into topology cache
by Galder Zamarreño (JIRA)
[ https://issues.jboss.org/browse/ISPN-2697?page=com.atlassian.jira.plugin.... ]
Galder Zamarreño commented on ISPN-2697:
----------------------------------------
Bela, it's not a message that needs to be tagged as RSVP, but a particular cache.put call, to be more precise:
https://github.com/infinispan/infinispan/blob/master/server/hotrod/src/ma...
> HotRodServer startup fails when its record cannot be inserted into topology cache
> ---------------------------------------------------------------------------------
>
> Key: ISPN-2697
> URL: https://issues.jboss.org/browse/ISPN-2697
> Project: Infinispan
> Issue Type: Bug
> Components: Remote protocols
> Affects Versions: 5.2.0.Beta6
> Reporter: Radim Vansa
> Assignee: Galder Zamarreño
> Priority: Critical
> Fix For: 5.2.0.Final
>
>
> When the HotRodServer starts it inserts its record to __hotRodTopologyCache ({{HotRodServer.addSelfToTopologyView(...)}}).
> However, this put may very easily fail: the command is broadcast using the NAKACK2 protocol, so if the message gets lost and no further broadcast message follows, the message will not be retransmitted and the put operation times out (Replication timeout). This fails the whole HotRodServer startup, all because of one lost UDP message.
[JBoss JIRA] (ISPN-2712) Initial state transfer doesn't appear to all be persisted when using eviction in a replicated cluster
by Sanne Grinovero (JIRA)
[ https://issues.jboss.org/browse/ISPN-2712?page=com.atlassian.jira.plugin.... ]
Sanne Grinovero commented on ISPN-2712:
---------------------------------------
It's good to see there is a workaround, but I hope we'll see a proper solution before Final. This is a bomb for indexes stored in the grid.
[JBoss JIRA] (ISPN-2712) Initial state transfer doesn't appear to all be persisted when using eviction in a replicated cluster
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-2712?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-2712:
-------------------------------
Workaround Description: Change concurrencyLevel from 5000 (effective value 500) to 50. This increases the maximum size of each data container segment from 2 to 20. (was: Change concurrencyLevel to 50 (which increases the maximum size of each data container segment from 2 to 20).)
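The arithmetic behind this workaround can be illustrated with a small, self-contained sketch. It is a toy model, not the real BoundedConcurrentHashMap: it assumes the effective concurrency level is capped at maxEntries / 2 (which is consistent with 5000 becoming 500 for maxEntries = 1000), that each segment holds maxEntries / effectiveConcurrencyLevel entries, that keys are assigned to segments by simple modulo, and that a full segment evicts its oldest entry instead of applying LIRS. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SegmentedEvictionSketch {

    /** Assumption: the effective concurrency level is capped at maxEntries / 2. */
    static int effectiveConcurrencyLevel(int maxEntries, int concurrencyLevel) {
        return Math.max(1, Math.min(concurrencyLevel, maxEntries / 2));
    }

    /** Assumption: capacity is split evenly across segments. */
    static int perSegmentCapacity(int maxEntries, int concurrencyLevel) {
        return maxEntries / effectiveConcurrencyLevel(maxEntries, concurrencyLevel);
    }

    /** Toy map in which every segment enforces its own bound independently. */
    static final class SegmentedMap {
        private final List<Map<Integer, Integer>> segments = new ArrayList<>();

        SegmentedMap(int maxEntries, int concurrencyLevel) {
            int count = effectiveConcurrencyLevel(maxEntries, concurrencyLevel);
            final int capacity = maxEntries / count;
            for (int i = 0; i < count; i++) {
                // Insertion-ordered map that drops its oldest entry on overflow.
                segments.add(new LinkedHashMap<Integer, Integer>() {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                        return size() > capacity;
                    }
                });
            }
        }

        void put(int key, int value) {
            // Simplified segment selection; a real implementation would hash the key.
            segments.get(Math.floorMod(key, segments.size())).put(key, value);
        }

        int size() {
            int total = 0;
            for (Map<Integer, Integer> segment : segments) {
                total += segment.size();
            }
            return total;
        }
    }

    public static void main(String[] args) {
        System.out.println(perSegmentCapacity(1000, 5000)); // 2
        System.out.println(perSegmentCapacity(1000, 50));   // 20

        // Three keys that land in the same segment under both settings.
        SegmentedMap narrow = new SegmentedMap(1000, 5000);
        SegmentedMap wide = new SegmentedMap(1000, 50);
        for (int key : new int[] {0, 500, 1000}) {
            narrow.put(key, key);
            wide.put(key, key);
        }
        System.out.println(narrow.size()); // 2: one entry evicted although the map is nearly empty
        System.out.println(wide.size());   // 3
    }
}
```

Under these assumptions, an entry can be evicted as soon as three keys fall into one segment, long before the map as a whole approaches maxEntries. A lower concurrencyLevel makes each segment larger and such early evictions less likely, but does not rule them out.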
[JBoss JIRA] (ISPN-2712) Initial state transfer doesn't appear to all be persisted when using eviction in a replicated cluster
by Dan Berindei (JIRA)
[ https://issues.jboss.org/browse/ISPN-2712?page=com.atlassian.jira.plugin.... ]
Dan Berindei updated ISPN-2712:
-------------------------------
Workaround Description: Change concurrencyLevel to 50 (which increases the maximum size of each data container segment from 2 to 20). (was: Do not use eviction (though this is not practical for real-world uses).)