[
https://issues.jboss.org/browse/ISPN-3455?page=com.atlassian.jira.plugin....
]
Lukasz Szelag commented on ISPN-3455:
-------------------------------------
In our system (6 clustered nodes), the replication counts continue to grow rapidly even
when there is *no caching activity at all* (the caches are populated only initially, while
data is being processed). For example, there are only 18 entries in the cache, whereas the
replication count is close to 1 million. This eventually causes the system to run out of
memory with a 20 GB heap.
Cache replication not warranted under load
------------------------------------------
Key: ISPN-3455
URL: https://issues.jboss.org/browse/ISPN-3455
Project: Infinispan
Issue Type: Feature Request
Affects Versions: 5.3.0.Final
Environment: JSE 1.6.0_45; Windows 7
Reporter: Lukasz Szelag
Assignee: Mircea Markus
Attachments: infinispan.zip
Problem:
When running a replicated cache and repeatedly calling a cacheable method (using Spring
cache abstraction), Infinispan enters an infinite replication loop. This can be confirmed
by observing replication counts that keep growing over time even though there are no cache
misses.
Expected behavior:
A cache hit should not trigger replication.
Test case:
- 3 cluster members; asynchronous replication with a replication queue
- a cacheable method is executed repeatedly using 2 different keys
Notes:
- for some reason, this issue only occurs when using Enum arguments for a cache key; I
was not able to reproduce it when using int or String types (see
com.designamus.infinispan.Main.works())
- the behavior is not deterministic (random), which points to a race condition
- the problem does not seem to be related to Spring's default cache key generator; I was
able to reproduce the same behavior with a custom, thread-safe cache key generator
- the cacheable method is executed only twice (once per key, after which both values are
stored in the cache); subsequent invocations retrieve the stored values from the cache;
this can be confirmed by inspecting the log file
- the cache doesn't expire and entries are not evicted
- the memory usage grows over time, eventually causing OOM on a heavily loaded system
- since the issue is random in nature, it may take 3-4 attempts to reproduce it; I was
successful in reproducing this behavior numerous times
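
The expected behavior and test case described above can be modeled in plain Java, without
Spring or Infinispan. This is only a minimal sketch, not the attached test project: the
names CacheHitSketch, CountingCache and Key are hypothetical, and a local map stands in
for the replicated cache. It shows that with 2 Enum keys only the first call per key
writes to the cache, so only those 2 writes should ever need to be replicated.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheHitSketch {

    // Hypothetical Enum cache key; the report only states that Enum keys trigger the issue.
    enum Key { FIRST, SECOND }

    // Local stand-in for a cache: counts writes (misses) and hits separately.
    static class CountingCache {
        private final Map<Key, String> store = new ConcurrentHashMap<Key, String>();
        final AtomicInteger writes = new AtomicInteger();
        final AtomicInteger hits = new AtomicInteger();

        String get(Key key) {
            String cached = store.get(key);
            if (cached != null) {
                // Cache hit: nothing changes, so nothing should be replicated.
                hits.incrementAndGet();
                return cached;
            }
            // Cache miss: the "cacheable method" runs and its result is stored.
            String value = compute(key);
            store.put(key, value);
            writes.incrementAndGet();
            return value;
        }

        private String compute(Key key) {
            return "value-for-" + key.name();
        }
    }

    // Calls the cache repeatedly, alternating between the two keys.
    // Returns {writes, hits}.
    static int[] run(int calls) {
        CountingCache cache = new CountingCache();
        Key[] keys = Key.values();
        for (int i = 0; i < calls; i++) {
            cache.get(keys[i % keys.length]);
        }
        return new int[] { cache.writes.get(), cache.hits.get() };
    }

    public static void main(String[] args) {
        int[] result = run(1000);
        System.out.println("writes=" + result[0] + " hits=" + result[1]);
    }
}
```

In this model the number of writes stays at 2 no matter how many calls are made, which is
the behavior the replication count would be expected to mirror.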
Steps to reproduce:
1. Build a test project (mvn clean compile)
2. Execute /run.sh (this will spawn 3 JVMs)
3. Start JConsole to monitor 3 cluster members (jconsole localhost:17001 localhost:17002
localhost:17003)
4. Monitor "replicationCount" attribute under RpcManager for cache
"MyCache" for all JVMs (see /replication-counts.png)
5. Observe that replication counts grow over time
6. Observe that all caches are of size 2 and there are no cache misses (see
/cache-statistics.png)
If the issue cannot be reproduced (replication counts stay at the same level):
- Terminate all 3 JVM processes (as a convenience you can execute /stop.sh)
- Repeat steps 2 through 5 above
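
As an alternative to watching JConsole by hand, the monitoring in steps 3 and 4 could be
scripted with the standard javax.management API. The following is a rough sketch under
some assumptions: the class name ReplicationCountPoller is hypothetical, the JMX service
URL uses the usual RMI form for a host:port pair, and the ObjectName pattern
"*:component=RpcManager,*" is an assumption about how Infinispan names its RpcManager
MBeans (it may need adjusting, and it matches every cache, not just "MyCache"). Only the
attribute name "replicationCount" and the three ports are taken from the steps above.

```java
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ReplicationCountPoller {

    // Reads one attribute from every MBean whose name matches the pattern.
    // Returns a map from canonical MBean name to attribute value.
    static Map<String, Object> readAttribute(MBeanServerConnection conn,
                                             String pattern, String attribute) {
        Map<String, Object> values = new TreeMap<String, Object>();
        try {
            for (ObjectName name : conn.queryNames(new ObjectName(pattern), null)) {
                values.put(name.getCanonicalName(), conn.getAttribute(name, attribute));
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not read " + attribute, e);
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Ports from the reproduction steps; one JVM per cluster member.
        String[] members = { "localhost:17001", "localhost:17002", "localhost:17003" };
        // Assumed naming scheme for the RpcManager component; adjust if needed.
        String pattern = "*:component=RpcManager,*";

        for (String member : members) {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + member + "/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                System.out.println(member + " -> "
                        + readAttribute(conn, pattern, "replicationCount"));
            } finally {
                connector.close();
            }
        }
    }
}
```

Running it several times a few seconds apart and comparing the printed values would show
whether the counts keep growing while the cache size stays at 2.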
When testing the above scenario in distributed mode, I observed some other anomalies
(i.e. the cacheable method was executed multiple times, as if the value was not there).
While this may be related, it deserves a separate JIRA issue.