[hibernate-issues] [Hibernate-JIRA] Updated: (HHH-5300) Configurable strong and soft reference QueryPlanCache sizes

Friday, 18 June 2010

     [ http://opensource.atlassian.com/projects/hibernate/browse/HHH-5300?page=c... ]

Manuel Dominguez Sarmiento updated HHH-5300:
--------------------------------------------

    Attachment: diffs.zip

Hi Steve, I've attached the diffs. Please note that LRUMap.java is a new class (so
there is no diff, just the full source). SoftLimitMRUCache.java has changed quite a bit,
so I'm attaching its full source as well.

...
 Configurable strong and soft reference QueryPlanCache sizes
 -----------------------------------------------------------

                 Key: HHH-5300
                 URL: http://opensource.atlassian.com/projects/hibernate/browse/HHH-5300
             Project: Hibernate Core
          Issue Type: Patch
          Components: core
    Affects Versions: 3.5.0-Final, 3.5.1, 3.5.2
         Environment: N/A
            Reporter: Manuel Dominguez Sarmiento
         Attachments: diffs.zip, Environment.java, LRUMap.java, QueryPlanCache.java,
SessionFactoryImpl.java, SimpleMRUCache.java, SoftLimitMRUCache.java

   Original Estimate: 2h
  Remaining Estimate: 2h

 Some of our production servers (Hibernate-based apps) have been hanging sporadically
during full GC, usually after running normally for a few days and sometimes for over a
week.
 We suspected a memory leak. We used the Eclipse MAT tool to analyze a live heap dump and
found that most of the heap was being used by QueryPlanCache, more specifically by the
soft references held by SoftLimitMRUCache.
 We use very large heaps (up to 30 GB in some cases). Since memory is plentiful and the
SoftLimitMRUCache is unbounded, the heap eventually fills up until a major stop-the-world
GC is necessary to clear the SoftLimitMRUCache soft references. We performed several
live tests configuring the Concurrent Mark Sweep (CMS) collector in order to avoid the
full GC caused by concurrent mode failures. We experimented with the following settings
available in the Sun JVM:
 -XX:+CMSIncrementalMode
 -XX:+CMSIncrementalPacing
 -XX:CMSIncrementalDutyCycle=<PCT>
 -XX:CMSIncrementalDutyCycleMin=<PCT>
 -XX:CMSInitiatingOccupancyFraction=<PCT>
 -XX:CMSMarkStackSize=<SIZE>
 -XX:CMSMarkStackSizeMax=<SIZE>
 -XX:SoftRefLRUPolicyMSPerMB=<MSECS>
 -XX:+ParallelRefProcEnabled
 Most of these options helped somewhat by allowing soft references to be collected in
parallel and ahead of time, before the heap fills up and a full GC is required. However,
this did not avoid all problems, and the servers still periodically hang on concurrent mode
failures. These are high-load web servers which process hundreds of hits per second, so
a full GC is disastrous: garbage cannot be collected fast enough. A full GC would sometimes
take over 15 minutes, and sometimes it would not finish at all, requiring a manual
app restart.
 Before anyone cries out "well, it's probably the application's fault, why do
you have so many different queries? Aren't you using parameterized queries /
PreparedStatements?" - the application does in fact produce many, many different
queries, but most of them are not reused. Even if the QueryPlanCache is highly effective
for the queries that are reused, in absolute numbers most queries are issued only once.
Some use cases are the following:
 - Our system allows ad-hoc reporting and searching capabilities. Each query is typically
issued once and never reused.
 - Many of our parameterized queries use IN clauses with variable-length collection/array
parameters. I'm unsure whether this affects the cache hit ratio for HQL query plans,
but it certainly affects native SQL queries, since each collection length yields a
different number of "?" placeholders in the query string, and thus many similar but
slightly different queries polluting the corresponding plan cache.
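
 To illustrate the IN-clause case, here is a minimal sketch of how one logical query turns
into many distinct query strings. The class name, the table and column names, and the
helper methods are hypothetical and are not taken from our application or from Hibernate;
the sketch only assumes that a cache keyed on the query string sees each placeholder count
as a separate entry.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only: not part of the patch or of Hibernate.
public class InClauseVariants {

    // Builds a native SQL string with one "?" placeholder per collection element.
    static String buildInQuery(int paramCount) {
        String placeholders = String.join(", ", Collections.nCopies(paramCount, "?"));
        return "SELECT * FROM orders WHERE id IN (" + placeholders + ")";
    }

    // Collects the distinct query strings produced for collection sizes 1..maxParams.
    // Each distinct string would be a separate key in a cache keyed on query text.
    static Set<String> distinctQueryStrings(int maxParams) {
        Set<String> keys = new LinkedHashSet<>();
        for (int n = 1; n <= maxParams; n++) {
            keys.add(buildInQuery(n));
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(buildInQuery(3));
        System.out.println(distinctQueryStrings(100).size() + " distinct query strings");
    }
}
```

 With collection sizes from 1 to 100, the same logical query already accounts for 100
different strings.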
 I'm sure there are other legitimate cases in which an unbounded query plan cache is
also a problem; I'm just listing the ones we've run into.
 This issue is not new. The same problem is described, in one way or another, by HHH-2431,
HHH-3191, and HHH-4627. I created a new issue because we've produced a working patch, and
a separate issue makes it more visible than comments on the previous ones would.
 The solution involves giving up the unbounded soft-reference-based cache. We introduced
two new configuration options:
 - hibernate.query.plan_cache_max_strong_references -> defaults to 128
 - hibernate.query.plan_cache_max_soft_references -> defaults to 2048
 Entries are evicted using an LRU policy, or by memory pressure from the GC in the case of
soft references. We used 2048 as the default for the soft size since it seems
reasonable, but of course it can be tuned to suit the user's needs. Users looking to
emulate the previous behaviour (we don't see the point, but who knows ...) can set this
option to Integer.MAX_VALUE.
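
 As a minimal sketch of how the two options might be set programmatically: the property
names and default values are the ones listed above, and they only have an effect with the
patch applied. The class and method names are hypothetical. The resulting Properties
object could then be handed to Hibernate's Configuration in the usual way (for example
via addProperties), or the same keys could be placed in hibernate.properties.

```java
import java.util.Properties;

// Hypothetical helper: only the two property names come from the patch.
public class PlanCacheSettings {

    static final String MAX_STRONG = "hibernate.query.plan_cache_max_strong_references";
    static final String MAX_SOFT = "hibernate.query.plan_cache_max_soft_references";

    // Returns a Properties object carrying the two plan cache limits.
    static Properties withPlanCacheLimits(int maxStrong, int maxSoft) {
        Properties props = new Properties();
        props.setProperty(MAX_STRONG, Integer.toString(maxStrong));
        props.setProperty(MAX_SOFT, Integer.toString(maxSoft));
        return props;
    }

    public static void main(String[] args) {
        // The defaults proposed in the patch.
        Properties defaults = withPlanCacheLimits(128, 2048);
        // Emulates the previous, effectively unbounded soft cache.
        Properties unbounded = withPlanCacheLimits(128, Integer.MAX_VALUE);
        System.out.println(defaults.getProperty(MAX_SOFT));
        System.out.println(unbounded.getProperty(MAX_SOFT));
    }
}
```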
 The patch introduces no new dependencies. It uses Apache Commons Collections' LRUMap
(just as the released version does), and does away with the ReferenceMap (which does not
support LRU eviction) in order to manage soft references manually on top of an LRUMap.
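
 The general idea of managing soft references manually on top of an LRU map can be sketched
as follows. This is not the code from the attached SoftLimitMRUCache.java: the class name
BoundedSoftCache is hypothetical, and the sketch swaps in java.util.LinkedHashMap in access
order for the LRUMap so that it is self-contained. It only shows the technique: a small
strongly referenced LRU map plus a larger LRU map of SoftReference values, both bounded.

```java
import java.lang.ref.SoftReference;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a cache with bounded strong and soft reference maps.
public class BoundedSoftCache<K, V> {

    private final Map<K, V> strong;                 // most recently used, never GC-cleared
    private final Map<K, SoftReference<V>> soft;    // larger, may be cleared by the GC

    public BoundedSoftCache(int maxStrong, int maxSoft) {
        this.strong = lruMap(maxStrong);
        this.soft = lruMap(maxSoft);
    }

    // An access-ordered LinkedHashMap that drops its eldest entry once over capacity.
    private static <A, B> Map<A, B> lruMap(final int maxEntries) {
        return new LinkedHashMap<A, B>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<A, B> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public synchronized V get(K key) {
        SoftReference<V> ref = soft.get(key);
        if (ref == null) {
            return null;                            // never cached, or evicted by LRU
        }
        V value = ref.get();
        if (value == null) {
            soft.remove(key);                       // referent was cleared by the GC
            return null;
        }
        strong.put(key, value);                     // promote to the strong map
        return value;
    }

    public synchronized void put(K key, V value) {
        soft.put(key, new SoftReference<>(value));
        strong.put(key, value);
    }

    public synchronized int strongSize() { return strong.size(); }

    public synchronized int softSize() { return soft.size(); }
}
```

 Because both maps are bounded, the number of cached entries can never exceed the soft
limit, regardless of how much free heap is available.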
 We see this issue as a top priority and believe the patch should be applied to the trunk
ASAP. SoftLimitMRUCache has seen the most "radical" changes. SimpleMRUCache,
QueryPlanCache and Environment only contain minor changes.
-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://opensource.atlassian.com/projects/hibernate/secure/Administrators....
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
