[hibernate-dev] Re: About HSEARCH-310

Sanne Grinovero sanne.grinovero at gmail.com
Tue Jan 13 18:35:32 EST 2009


Thank you, you helped me a lot; still, I would like to propose an
evolution to cover all use cases correctly.
I've already implemented and tested this, in roughly three steps:

1)
When moving the cursor outside of the loaded window boundaries I
extract the new EntityInfos from the index, then verify whether any
matching entity is already contained in the persistence context: if it
is, I discard that EntityInfo and "batch load" only the entities
identified by the remaining EntityInfos
(I do this by building the EntityKey and checking
session.getPersistenceContext().containsEntity( key ) ).
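
Just to make the check concrete, the filtering step looks roughly like
this; buildEntityKey() is a placeholder for the actual EntityKey
construction, and newWindow/toLoad are just illustrative names:

  // step 1 sketch: skip EntityInfos whose entity is already managed
  List<EntityInfo> toLoad = new ArrayList<EntityInfo>();
  PersistenceContext context = session.getPersistenceContext();
  for ( EntityInfo info : newWindow ) {
      EntityKey key = buildEntityKey( info, session ); // hypothetical helper
      if ( ! context.containsEntity( key ) ) {
          toLoad.add( info ); // only these will be batch loaded
      }
  }
  // toLoad is then handed to the batch loader (see step 2)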

2)
Immediately after batch loading I evict them all and discard the
references: this may look useless, but the goal is to put the data in
the second level cache, where it is useful later and probably lives in
a more efficient data structure than a simple soft map.
Please note that I'm evicting only those objects which were not
attached before the loading, so nothing changes in the persistence
context from a client-code point of view.
The original implementation also did a one-by-one load on each entity
as a last initialization step; this is not needed here as I load each
entity later on, when I'm sure the client actually needs that data.
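
In code the eviction step is little more than the following (again a
sketch, with batchLoad() standing in for whatever the loader call ends
up being):

  // step 2 sketch: batch load to warm the second level cache, then
  // evict the instances we just attached; entities that were already
  // managed before the load were filtered out in step 1 and are
  // therefore never evicted
  List<?> loaded = batchLoad( toLoad ); // placeholder for the batch loader
  for ( Object entity : loaded ) {
      session.evict( entity );
  }
  // no references are kept: get() goes through load() later (step 3)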

3)
After that, when get() is called, I use load( entityInfo ) to obtain a
managed instance to return to the user.
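
So get() ends up doing roughly this (the EntityInfo field names here
are illustrative, not necessarily the exact ones):

  // step 3 sketch: return a managed instance, loading it on demand;
  // with the second level cache warmed in step 2 this is usually a
  // cache hit, and the returned object is attached to the session
  Object entity = session.load( entityInfo.clazz, entityInfo.id );
  return entity;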

The end result is:
* I don't put stuff in the persistence context the user doesn't expect
* I don't evict stuff that was attached before
* no memory leaks

It works correctly even without a second level cache, but of course
you only get good performance when using one.

Do you think this looks reasonable? Apart from the EntityKey creation
the code is much simplified, and the number of lines is reduced.
Would you prefer me to keep a soft reference to the loaded entities and
re-attach them to the session somehow?
I'd prefer to avoid the soft map for the entities: IMHO having an
additional (inefficient) caching layer on top of object loading, which
probably already has lots of other caching layers, just makes the code
more complex.

Sanne

2009/1/13 Emmanuel Bernard <emmanuel at hibernate.org>:
> Clearly we cannot solve all use cases, so let's focus on the most useful
> one.
> People use a scrollable resultset to walk through all objects in a window one
> after the other without skipping them.
>
> int i = 0;
> while ( rs.next() ) {
>  Address a = (Address) rs.get(0);
>  doSomething(a);
>  if (i++ % 100 == 0) s.clear();
> }
>
> So we should consider that:
>  - eviction is done by the user
>  - eviction can happen behind the impl's back
>
> =>
> place objects and EntityInfo in a soft reference (if we need to keep
> EntityInfo, I don't remember).
> When we give out an entity, make sure session.contains(entity) == true.
> Otherwise, discard the detached one and reload it.
>
> That should help for most cases.
>
>
> On  Jan 11, 2009, at 18:21, Sanne Grinovero wrote:
>
>> Hello,
>> I'm in need of some help about how to fix this memory leak at HSEARCH-310
>> in ScrollableResultsImpl of Search; currently there are:
>>
>> * an internal cache of all loaded EntityInfo
>> * an internal cache of all loaded managed Entities
>>
>> Both caches are never emptied during the scrolling; each time an
>> entity which was not loaded before is requested, the implementation
>> loads a batch-sized list and adds the new objects to the caches.
>>
>> We had said to use some soft ReferenceMap to fix this, but actually we
>> should evict the loaded entities which were not requested by the
>> client code, or I'll still be leaking memory in Hibernate's
>> persistence context.
>>
>> I expect the most frequent use case for a ScrollableResults to be
>> iterating at will and periodically evicting objects in the client code;
>> still, the client could skip some of the entities we loaded by batching.
>> At first I thought to keep track of which references were used and
>> evict the other ones, but this isn't a good solution as the client
>> could be loading some entities by other means (e.g. another query) and
>> won't be happy if I'm evicting some random entities: I don't have
>> enough information to know for sure what to evict and what is going to
>> be used later.
>>
>> Also, a bug that is a consequence of the current implementation: if
>> you get an object, then evict it, and then call scrollable.get() again,
>> you don't get an attached object but the evicted instance once more; as
>> a client of the Hibernate library I expect all objects I ask for to be
>> attached by default.
>>
>> My idea to fix this should be something like:
>> 1) load the batch
>> 2) evict them all
>> 3) and then load one by one as requested, relying on the second level
>> cache (I suppose point 1 should have warmed it)
>>
>> But this has other flaws... for example I should not evict (2) those
>> objects already existing in the persistence context.
>> Ideally I should avoid preloading objects already present in the
>> persistence context; I am considering looking at the
>> session.persistenceContext.entitiesByKey map to only load+evict by
>> batch those entities not already managed.
>>
>> Is there some way to populate the second level cache skipping the
>> actual load in the persistence context?
>> Should I keep my own "soft map cache" in the scroll implementation
>> anyway, and somehow detach and reattach them as needed?
>> Am I missing some trivial better solution, or is my solution not going
>> to work at all?
>>
>> thanks,
>> Sanne
>
>


