[jbosscache-issues] [JBoss JIRA] Updated: (JBCACHE-1316) recycleQueue.put() could block forever halting evictions and eventually any replications
Manik Surtani (JIRA)
jira-events at lists.jboss.org
Sun Jan 4 04:12:54 EST 2009
[ https://jira.jboss.org/jira/browse/JBCACHE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Manik Surtani updated JBCACHE-1316:
-----------------------------------
Fix Version/s: 1.4.1.SP10
(was: 1.4.X)
> recycleQueue.put() could block forever halting evictions and eventually any replications
> ----------------------------------------------------------------------------------------
>
> Key: JBCACHE-1316
> URL: https://jira.jboss.org/jira/browse/JBCACHE-1316
> Project: JBoss Cache
> Issue Type: Bug
> Security Level: Public(Everyone can see)
> Components: Eviction
> Affects Versions: 1.4.0.SP1, 2.0.0.GA, 1.4.1.SP8, 1.4.1.SP9, 2.1.0.GA
> Reporter: Galder Zamarreno
> Assignee: Galder Zamarreno
> Fix For: 1.4.1.SP10, 2.1.1.GA, 2.2.0.GA
>
>
> In BaseEvictionAlgorithm, whenever we can't evict a node, we add it to the
> recycle queue:
>    protected void evict(NodeEntry ne)
>    {
>       // NodeEntry ne = evictionQueue.getNodeEntry(fqn);
>       if (ne != null)
>       {
>          evictionQueue.removeNodeEntry(ne);
>          if (!this.evictCacheNode(ne.getFqn()))
>          {
>             try
>             {
>                recycleQueue.put(ne.getFqn());
>             }
>             catch (InterruptedException e)
>             {
>                log.debug("InterruptedException", e);
>             }
>          }
>       }
>    }
> For that to happen, an Exception must have been thrown in:
>    protected boolean evictCacheNode(Fqn fqn)
>    {
>       if (log.isTraceEnabled())
>       {
>          log.trace("Attempting to evict cache node with fqn of " + fqn);
>       }
>       EvictionPolicy policy = region.getEvictionPolicy();
>       // Do an eviction of this node
>       try
>       {
>          policy.evict(fqn);
>       }
>       catch (Exception e)
>       {
>          if (e instanceof TimeoutException)
>          {
>             log.warn("eviction of " + fqn + " timed out. Will retry later.");
>             return false;
>          }
>          e.printStackTrace();
>          return false;
>       }
>       if (log.isTraceEnabled())
>       {
>          log.trace("Eviction of cache node with fqn of " + fqn + " successful");
>       }
>       return true;
>    }
> In a production environment, especially if the user ignores warnings or exceptions,
> we could end up blocking indefinitely in recycleQueue.put(ne.getFqn()) if the recycle
> queue has no more room. Once that call blocks, evictions stall, leading to either an OOME
> or replication exceptions because no more events can be put in the eviction queue.
> Rather than calling put() on the EDU.oswego.cs.dl.util.concurrent.BoundedBuffer, we
> should call offer(Object x, long msecs) so that we never run the risk of blocking
> forever here again.
> The same thing happens in 2.x, which uses a LinkedBlockingQueue and calls put() on it;
> put() waits if necessary for space to become available. The chances of this issue
> occurring in 2.x are lower, however, because the LinkedBlockingQueue is initialised with
> a capacity of 500000.
> In 1.4.x, the BoundedBuffer was initialised with the default capacity, which was 1024.
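
As an illustration of the proposed change, here is a minimal sketch of a bounded, timed enqueue using java.util.concurrent (the 2.x case). It is not the actual JBoss Cache patch: the class name RecycleQueueSketch, the helper recycle(), the String element type, and the timeout values are all assumptions made for the example. On 1.4.x the analogous call would be BoundedBuffer.offer(Object x, long msecs), as mentioned in the issue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class RecycleQueueSketch {

    // Hypothetical helper, not part of JBoss Cache: wait at most
    // timeoutMillis for space, then give up instead of blocking forever.
    static boolean recycle(BlockingQueue<String> recycleQueue, String fqn, long timeoutMillis) {
        try {
            // offer(e, timeout, unit) returns false if no space became available in time
            return recycleQueue.offer(fqn, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe the interruption
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Tiny capacity (assumed for the demo) so the queue fills up immediately
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(2);
        System.out.println(recycle(queue, "/a", 50)); // true
        System.out.println(recycle(queue, "/b", 50)); // true
        // The queue is full and nothing drains it: put() would block here forever,
        // whereas the timed offer returns false after roughly 50 ms.
        System.out.println(recycle(queue, "/c", 50)); // false
    }
}
```

With a timed offer the caller has to decide what to do when the enqueue fails, for example log a warning and drop the entry; that policy is left open in this sketch.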