[hibernate-dev] Re: About HSEARCH-310

Sanne Grinovero sanne.grinovero at gmail.com
Wed Jan 14 10:08:38 EST 2009


ok, the difference between the eager fetch and evict cascade policies
you mention is a good enough reason to throw my patch in the garbage;
I'll fix it for the 80% case, hoping people use session.clear() and not
session.evict() to manage memory.

thanks,
Sanne

2009/1/14 Emmanuel Bernard <emmanuel at hibernate.org>:
>
> On  Jan 13, 2009, at 18:35, Sanne Grinovero wrote:
>
>> thank you, you helped me a lot; still, I would like to propose an
>> evolution to cover all use cases correctly.
>> I've already implemented and tested this; it works in roughly three steps:
>>
>> 1)
>> when moving the cursor outside the loaded window boundaries, I extract
>> the new EntityInfos from the index, then verify whether any matching
>> entity is already contained in the persistence context: if it is, I
>> discard its EntityInfo, and I "batch load" the entities identified by
>> the remaining EntityInfos
>> (I do this by building the EntityKey and checking
>> session.getPersistenceContext().containsEntity( key ) ).
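>>
>> A minimal sketch of that check against the Hibernate 3.x SPI (the
>> EntityInfo field names here are illustrative):
>>
>> // 'session' is the underlying SessionImplementor
>> EntityPersister persister = session.getFactory()
>>         .getEntityPersister( entityInfo.clazz.getName() );
>> EntityKey key = new EntityKey(
>>         entityInfo.id, persister, session.getEntityMode() );
>> boolean alreadyManaged =
>>         session.getPersistenceContext().containsEntity( key );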
>
>>
>>
>> 2)
>> Immediately after batch loading, I evict them all right away and
>> discard the references:
>> this may look useless, but the goal is to put the data in the second
>> level cache, useful for later and possibly living in a more efficient
>> data structure than a simple soft map.
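>>
>> Roughly, as a sketch ('loader' and 'infosToLoad' stand in for the
>> actual Search internals):
>>
>> List entities = loader.load( infosToLoad ); // warms the 2nd level cache
>> for ( Object entity : entities ) {
>>     // detach again; the data stays in the 2nd level cache
>>     session.evict( entity );
>> }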
>
> In a scrollable usage, the 2nd level cache is typically a bad idea (i.e. we
> want to load more than what fits in memory). Plus, you don't know whether
> the entity is marked for caching.
>
>>
>> Please note that I'm evicting only those objects which were not
>> attached before the loading, so nothing changes in the persistence
>> context from the client code's point of view.
>> The original implementation also did a one-by-one load of each entity
>> as a last initialization step; this is not needed here, as I load each
>> entity later on, when I'm sure the client actually needs that data.
>>
>> 3)
>> After that, when get() is called, I use load( entityInfo ) to obtain a
>> managed instance to return to the user.
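>>
>> Something like this sketch ('current' and 'entityInfos' are
>> illustrative names for the scroll position and the loaded window):
>>
>> public Object[] get() {
>>     // load() hands back a managed instance, served from the
>>     // persistence context or the (warmed) 2nd level cache
>>     return new Object[] { loader.load( entityInfos[current] ) };
>> }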
>>
>> The end result is:
>> * I don't put stuff in the persistence context the user doesn't expect
>
> well, you do if the EVICT cascade policy is different from the EAGER
> fetching policy (e.g. an association fetched eagerly but not covered by the
> evict cascade stays in the persistence context after the evict)
>
>>
>> * I don't evict stuff that was attached before
>
> well, you do if the EVICT cascade policy is different from the EAGER
> fetching policy
>
>>
>> * no memory leaks
>>
>> it works correctly even without a second level cache, but of course
>> you only get good performance when using one.
>
> I really don't like this idea. I would rather make the 80% scenario work
> well (as described in the previous email) and not rely on the 2nd level
> cache. You can attach the patch in JIRA, but I think we should go with the
> previous version.
>
>>
>>
>> Do you think this looks reasonable? Besides the EntityKey creation,
>> the code is much simpler and the line count is reduced.
>> Would you prefer me to keep soft references to the loaded entities and
>> re-attach them to the session somehow?
>> I'd prefer to avoid the soft map for the entities: IMHO adding an
>> additional (inefficient) caching layer on top of object loading, which
>> probably already involves lots of other caching layers, just makes the
>> code more complex.
>>
>> Sanne
>>
>> 2009/1/13 Emmanuel Bernard <emmanuel at hibernate.org>:
>>>
>>> Clearly we cannot solve all use cases, so let's focus on the most useful
>>> one.
>>> People use a scrollable resultset to walk through all objects in a
>>> window, one after the other, without skipping any.
>>>
>>> int i = 0;
>>> while ( rs.next() ) {
>>>     Address a = (Address) rs.get()[0];
>>>     doSomething( a );
>>>     if ( i++ % 100 == 0 ) s.clear();
>>> }
>>>
>>> So we should consider that:
>>> - eviction is done by the user
>>> - eviction can happen behind the impl's back
>>>
>>> =>
>>> place the objects and EntityInfos in soft references (if we even need
>>> to keep the EntityInfos; I don't remember).
>>> When we hand out an entity, make sure session.contains(entity) ==
>>> true. Otherwise, discard the detached one and reload it.
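>>>
>>> Something like this (a sketch; 'ref' is the soft reference kept per
>>> row):
>>>
>>> Object entity = ( ref == null ) ? null : ref.get();
>>> if ( entity == null || !session.contains( entity ) ) {
>>>     // cleared by the GC or evicted behind our back: reload it
>>>     entity = session.get( entityInfo.clazz, entityInfo.id );
>>>     ref = new SoftReference( entity );
>>> }
>>> return entity;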
>>>
>>> That should help for most cases.
>>>
>>>
>>> On  Jan 11, 2009, at 18:21, Sanne Grinovero wrote:
>>>
>>>> Hello,
>>>> I need some help with fixing the memory leak described in
>>>> HSEARCH-310. In Search's ScrollableResultsImpl there are currently:
>>>>
>>>> * an internal cache of all loaded EntityInfo
>>>> * an internal cache of all loaded managed Entities
>>>>
>>>> Neither cache is ever emptied during scrolling; each time an entity
>>>> that was not loaded before is requested, the implementation loads a
>>>> batch-sized list and adds the new objects to the caches.
>>>>
>>>> We had said to use some soft ReferenceMap to fix this, but actually
>>>> we should also evict the loaded entities which were not requested by
>>>> the client code, or I'll still be leaking memory in Hibernate's
>>>> persistence context.
>>>>
>>>> I expect the most frequent use case for a ScrollableResults to be
>>>> iterating at will and periodically evicting objects in the client
>>>> code; still, the client could skip some of the entities we loaded by
>>>> batching.
>>>> At first I thought of keeping track of which references were used
>>>> and evicting the other ones, but this isn't a good solution, as the
>>>> client could be loading some entities by other means (e.g. another
>>>> query) and won't be happy if I evict seemingly random entities: I
>>>> don't have enough information to know for sure what to evict and
>>>> what is going to be used later.
>>>>
>>>> Another bug, a consequence of the current implementation, is that if
>>>> you get an object, evict it, and then call scrollable.get() again,
>>>> you don't get an attached object but the evicted instance once more;
>>>> as a client of the Hibernate library I expect all objects I ask for
>>>> to be attached by default.
>>>>
>>>> My idea to fix this should be something like:
>>>> 1) load the batch
>>>> 2) evict them all
>>>> 3) and then load one by one as requested, relying on the second level
>>>> cache (I suppose point 1 should have warmed it)
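>>>>
>>>> In pseudo-code, just to sketch the idea ('loader' and 'info' stand
>>>> in for the batch loader and a single EntityInfo):
>>>>
>>>> List batch = loader.load( batchOfEntityInfos ); // 1) warms the 2LC
>>>> for ( Object e : batch ) {
>>>>     session.evict( e );                         // 2) see flaw below
>>>> }
>>>> // 3) later, on each get():
>>>> Object entity = session.get( info.clazz, info.id ); // hits the 2LC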
>>>>
>>>> But this has other flaws, e.g. in step 2 I should not evict those
>>>> objects already existing in the persistence context.
>>>> Ideally I should avoid preloading objects already present in the
>>>> persistence context; I am considering looking into the
>>>> session.persistenceContext.entitiesByKey map to batch load+evict
>>>> only those entities not already managed.
>>>>
>>>> Is there some way to populate the second level cache while skipping
>>>> the actual load into the persistence context?
>>>> Should I keep my own "soft map cache" in the scroll implementation
>>>> anyway, and somehow detach and reattach entities as needed?
>>>> Am I missing some trivial, better solution, or is my solution not
>>>> going to work at all?
>>>>
>>>> thanks,
>>>> Sanne
>>>
>>>
>
>


