[jbosscache-issues] [JBoss JIRA] Updated: (JBCACHE-1316) recycleQueue.put() could block forever halting evictions and eventually any replications

Manik Surtani (JIRA) jira-events at lists.jboss.org
Sun Jan 4 04:12:54 EST 2009


     [ https://jira.jboss.org/jira/browse/JBCACHE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manik Surtani updated JBCACHE-1316:
-----------------------------------

    Fix Version/s: 1.4.1.SP10
                       (was: 1.4.X)


> recycleQueue.put() could block forever halting evictions and eventually any replications
> ----------------------------------------------------------------------------------------
>
>                 Key: JBCACHE-1316
>                 URL: https://jira.jboss.org/jira/browse/JBCACHE-1316
>             Project: JBoss Cache
>          Issue Type: Bug
>      Security Level: Public(Everyone can see) 
>          Components: Eviction
>    Affects Versions: 1.4.0.SP1, 2.0.0.GA, 1.4.1.SP8, 1.4.1.SP9, 2.1.0.GA
>            Reporter: Galder Zamarreno
>            Assignee: Galder Zamarreno
>             Fix For: 1.4.1.SP10, 2.1.1.GA, 2.2.0.GA
>
>
> In BaseEvictionAlgorithm, whenever we can't evict a node, we add it to the 
> recycle queue:
>    protected void evict(NodeEntry ne)
>    {
> //      NodeEntry ne = evictionQueue.getNodeEntry(fqn);
>       if (ne != null)
>       {
>          evictionQueue.removeNodeEntry(ne);
>          if (!this.evictCacheNode(ne.getFqn()))
>          {
>             try
>             {
>                recycleQueue.put(ne.getFqn());
>             }
>             catch (InterruptedException e)
>             {
>                log.debug("InterruptedException", e);
>             }
>          }
>       }
>    }
> For that to happen, an Exception must have been thrown in:
>    protected boolean evictCacheNode(Fqn fqn)
>    {
>       if (log.isTraceEnabled())
>       {
>          log.trace("Attempting to evict cache node with fqn of " + fqn);
>       }
>       EvictionPolicy policy = region.getEvictionPolicy();
>       // Do an eviction of this node
>       try
>       {
>          policy.evict(fqn);
>       }
>       catch (Exception e)
>       {
>          if (e instanceof TimeoutException)
>          {
>             log.warn("eviction of " + fqn + " timed out. Will retry later.");
>             return false;
>          }
>          e.printStackTrace();
>          return false;
>       }
>       if (log.isTraceEnabled())
>       {
>          log.trace("Eviction of cache node with fqn of " + fqn + " successful");
>       }
>       return true;
>    }
> In a production environment, especially if the user ignores warnings or exceptions,
> we could end up blocking indefinitely in recycleQueue.put(ne.getFqn()) if the queue
> has no more room. Once that call blocks, evictions stop, leading to either an OOME or
> replication exceptions because no more events can be put in the eviction queue.
> Rather than calling put() on the EDU.oswego.cs.dl.util.concurrent.BoundedBuffer, we
> should call offer(Object x, long msecs) so that we never run the risk of blocking
> forever here again.
> The same thing happens in 2.x: it uses LinkedBlockingQueue and calls put() on it, which
> waits if necessary for space to become available. However, the chances of this issue
> occurring in 2.x are lower because the LinkedBlockingQueue is initialised with a
> capacity of 500000.
> In 1.4.x, the BoundedBuffer was initialised with the default capacity, which was 1024.
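
The proposed change can be sketched roughly as follows for the 2.x case, using the timed offer() of java.util.concurrent.LinkedBlockingQueue in place of put(). This is only an illustration under stated assumptions, not the actual patch: the class name RecycleQueueSketch, the helper recycle(), the use of String in place of Fqn, and the timeout values are all hypothetical. In 1.4.x the equivalent call would be the offer(Object x, long msecs) method named in the description above.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a bounded recycle queue that never blocks forever.
class RecycleQueueSketch {

    // Tries to queue the fqn for a later eviction retry.
    // Returns false if the queue stayed full for the whole timeout
    // or if the calling thread was interrupted while waiting.
    static boolean recycle(BlockingQueue<String> recycleQueue, String fqn, long timeoutMillis) {
        try {
            // Unlike put(), the timed offer() gives up after the timeout.
            return recycleQueue.offer(fqn, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Tiny capacity (assumption) so the queue fills up immediately.
        BlockingQueue<String> recycleQueue = new LinkedBlockingQueue<String>(1);
        System.out.println(recycle(recycleQueue, "/a/b", 100)); // queue has room
        System.out.println(recycle(recycleQueue, "/a/c", 100)); // queue is full, returns after the timeout
    }
}
```

When recycle() returns false, the caller would need to decide what to do with the node, for example log a warning and drop the retry; that policy is not specified in the report.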
