[jboss-jira] [JBoss JIRA] Created: (JBCACHE-1316) recycleQueue.put() could block forever halting evictions and eventually any replications

Galder Zamarreno (JIRA) jira-events at lists.jboss.org
Fri Mar 28 06:51:40 EDT 2008


recycleQueue.put() could block forever halting evictions and eventually any replications
----------------------------------------------------------------------------------------

                 Key: JBCACHE-1316
                 URL: http://jira.jboss.com/jira/browse/JBCACHE-1316
             Project: JBoss Cache
          Issue Type: Bug
      Security Level: Public (Everyone can see)
          Components: Eviction
    Affects Versions: 2.1.0.GA, 1.4.1.SP8, 2.0.0.GA, 1.4.0.SP1
            Reporter: Galder Zamarreno
         Assigned To: Galder Zamarreno


In BaseEvictionAlgorithm, whenever we can't evict a node, we add it to the 
recycle queue:

   protected void evict(NodeEntry ne)
   {
//      NodeEntry ne = evictionQueue.getNodeEntry(fqn);
      if (ne != null)
      {
         evictionQueue.removeNodeEntry(ne);
         if (!this.evictCacheNode(ne.getFqn()))
         {
            try
            {
               recycleQueue.put(ne.getFqn());
            }
            catch (InterruptedException e)
            {
               log.debug("InterruptedException", e);
            }
         }
      }
   }


For that to happen, evictCacheNode() must return false, which it does only when 
policy.evict() throws an Exception:

   protected boolean evictCacheNode(Fqn fqn)
   {
      if (log.isTraceEnabled())
      {
         log.trace("Attempting to evict cache node with fqn of " + fqn);
      }
      EvictionPolicy policy = region.getEvictionPolicy();
      // Do an eviction of this node

      try
      {
         policy.evict(fqn);
      }
      catch (Exception e)
      {
         if (e instanceof TimeoutException)
         {
            log.warn("eviction of " + fqn + " timed out. Will retry later.");
            return false;
         }
         e.printStackTrace();
         return false;
      }

      if (log.isTraceEnabled())
      {
         log.trace("Eviction of cache node with fqn of " + fqn + " successful");
      }

      return true;
   }

In a production environment, especially if the user ignores warnings or Exceptions,
we could end up blocking indefinitely in recycleQueue.put(ne.getFqn()) once the 
recycle queue is full. If this call blocks, evictions halt, leading to either OOME or 
replication exceptions because no more events can be put in the eviction queue.

Rather than calling put() on the EDU.oswego.cs.dl.util.concurrent.BoundedBuffer, we 
should call offer(Object x, long msecs) so that we never run the risk of blocking 
forever here again.
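
As a rough sketch of the proposed change, the example below uses the java.util.concurrent 
equivalent, LinkedBlockingQueue.offer(E, long, TimeUnit), rather than the 1.4.x 
BoundedBuffer. The class name RecycleQueueSketch, the recycle() method, the use of String 
in place of Fqn, and the timeout value are all assumptions made for illustration. They are 
not the actual JBoss Cache code or the eventual fix.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the recycle queue handling in BaseEvictionAlgorithm.
final class RecycleQueueSketch {
    private final LinkedBlockingQueue<String> recycleQueue;
    private final long offerTimeoutMillis;

    RecycleQueueSketch(int capacity, long offerTimeoutMillis) {
        this.recycleQueue = new LinkedBlockingQueue<>(capacity);
        this.offerTimeoutMillis = offerTimeoutMillis;
    }

    // Tries to queue the fqn for a later eviction retry.
    // Waits at most offerTimeoutMillis and returns false instead of blocking forever.
    boolean recycle(String fqn) {
        try {
            boolean queued = recycleQueue.offer(fqn, offerTimeoutMillis, TimeUnit.MILLISECONDS);
            if (!queued) {
                // Assumed policy: report and give up on this node for now.
                System.err.println("Recycle queue full, could not re-queue " + fqn);
            }
            return queued;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int size() {
        return recycleQueue.size();
    }

    public static void main(String[] args) {
        RecycleQueueSketch sketch = new RecycleQueueSketch(2, 50);
        System.out.println(sketch.recycle("/a/b"));
        System.out.println(sketch.recycle("/a/c"));
        // The queue is full now, so this call returns after about 50 ms.
        System.out.println(sketch.recycle("/a/d"));
    }
}
```

What to do with a node that cannot be re-queued (drop it, log it, or retry on the next 
eviction pass) is a separate design decision that this sketch does not settle.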

The same thing happens in 2.x: it uses LinkedBlockingQueue and calls put() on it, which 
waits if necessary for space to become available. The chances of this issue occurring in 
2.x are lower, though, because the LinkedBlockingQueue is initialised with a capacity of 
500000.

In 1.4.x, the BoundedBuffer was initialised with the default capacity, which was 1024.
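
To illustrate the blocking behaviour described in this report, the following minimal 
sketch fills a bounded LinkedBlockingQueue to capacity and then checks whether a further 
put() is still waiting after a short delay. The class and method names are hypothetical, 
String stands in for Fqn, and nothing drains the queue, which mirrors the situation where 
recycled nodes accumulate faster than they are consumed.

```java
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo, not JBoss Cache code.
final class BlockingPutDemo {

    // Returns true if put() on a full queue is still blocked after waitMillis (must be > 0).
    static boolean putBlocksWhenFull(int capacity, long waitMillis) {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(capacity);
        for (int i = 0; i < capacity; i++) {
            queue.offer("/node/" + i); // fills the queue up to its capacity
        }

        Thread producer = new Thread(() -> {
            try {
                queue.put("/node/overflow"); // waits for space that never becomes available
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.setDaemon(true);
        producer.start();

        try {
            producer.join(waitMillis); // give put() a chance to complete
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        boolean stillBlocked = producer.isAlive();
        producer.interrupt(); // release the blocked thread
        return stillBlocked;
    }

    public static void main(String[] args) {
        // 1024 is the 1.4.x default capacity mentioned in the report.
        System.out.println("put() blocked on full queue: " + putBlocksWhenFull(1024, 200));
    }
}
```

A larger capacity, as in 2.x, only delays the point at which put() starts to wait. It does 
not remove the possibility.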
