[hibernate-dev] Bytecode enhancement

Emmanuel Bernard emmanuel at hibernate.org
Wed Oct 10 03:26:00 EDT 2012


Yes my proposal would imply we pass along the cache data structure through all of our internal methods. 
My concerns about byte code enhancement are a bit diffuse but relate to:

- does it add an extra build step to users?
- how much configuration is needed in SE?
- has JBoss AS finally implemented the temporary class loaders required for runtime byte code enhancement (JPA contract)?
- how does that all play when entities are serialized?
- can I deserialize entities on a different VM?

But more importantly, I had the feeling that the cache approach would be less intrusive than the byte code enhancement approach, but your email made me change my mind. Propagating the cache would be more intrusive assuming we don't want thread locals, and we probably don't. 

Would using a service that stores cache data structures by session work? I am thinking out loud here, trying to push the cache idea before discarding it. That would require a weak-key concurrent map. The one I know is the one Jason wrote as a proposal for SE. I cannot remember where we use it, but it must be in Search or in Validator. 
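For illustration, a minimal sketch of such a session-keyed service using only JDK classes (the class and method names here are made up, and this is not Jason's map: WeakHashMap compares keys with equals() where we would want identity semantics, and synchronizedMap is a coarse lock rather than truly concurrent):

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical sketch, not actual Hibernate API.
class SessionKeyedCacheService {
    // The session key is held weakly, so the cache slot disappears once
    // the session is garbage collected; no explicit cleanup call needed.
    private final Map<Object, Object[]> cachesBySession =
            Collections.synchronizedMap(new WeakHashMap<Object, Object[]>());

    // Returns the cache array for the given session, creating it lazily.
    Object[] cacheFor(Object session, int size) {
        return cachesBySession.computeIfAbsent(session, s -> new Object[size]);
    }
}
```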

Emmanuel

On 9 Oct 2012, at 19:39, Steve Ebersole <steve at hibernate.org> wrote:

> The trouble is that you are glossing over where this "cache" comes from and how it comes into play.
> 
> My original code was part of PersistenceContext (which is owned by Session).  You explicitly say yours is not; or more correctly you say that in your code "cache" is call stack specific.  So to me that either means (1) a ThreadLocal that is referenced inside StatefulPersistenceContext#getEntityEntry or (2) a whole new Object/method.  Both are tricky given how we recurse.
> 
> You obviously have some concern about bytecode enhancement.  What is it?
> 
> On Tue 09 Oct 2012 12:28:33 PM CDT, Emmanuel Bernard wrote:
>> If you have both the bytecode enhancement and the cache logic I am
>> talking about the code would look like this.
>> 
>> public EntityEntry getEntityEntry(Object entity) {
>>    if ( isEnhanced( entity ) ) {
>>        return ( (Enhanced) entity ).getEntityEntry();
>>    }
>>    int cacheIndex = cache.length - 1;
>>    if ( cache[cache.length - 1] == null ) {
>>       //the array cache is not maxed out, try it first;
>>       //cache is an array of Map.Entry<Object,EntityEntry> (with no
>>       //actual map behind it though)
>>       for ( cacheIndex = 0 ; cacheIndex < cache.length ; cacheIndex++ ) {
>>          Map.Entry<Object,EntityEntry> tuple = cache[cacheIndex];
>>          if ( tuple == null ) {
>>             break;
>>          }
>>          if ( entity == tuple.getKey() ) {
>>             //we did an instance equality check:
>>             //no hashcode involved, no map lookup
>>             return tuple.getValue();
>>          }
>>       }
>>    }
>>    // this is what hotspots
>>    EntityEntry result = entityEntries.get( entity );
>>    if ( cacheIndex < cache.length - 1 ) {
>>       //room left in the cache, add to it
>>       cache[cacheIndex] = new AbstractMap.SimpleEntry<Object,EntityEntry>( entity, result );
>>    }
>>    return result;
>> }
>> 
>> 
>> The cache array would not be per Session but per call stack.
>> If either the bytecode or the cache hits, we will avoid the hotspots
>> entirely (no hashcode generation and no map lookup).
>> 
>> Emmanuel
>> 
>> On Tue 2012-10-09 11:56, Steve Ebersole wrote:
>>> Well it does not keep track of EntityEntry specifically.  It keeps
>>> track of the replacements.  But that is the point.  We have quite a
>>> few "context caches" already.
>>> 
>>> PersistenceContext was intended for this lookup.  It just suffers
>>> from a bad implementation of how that works atm.    If I understand
>>> what you are suggesting (though to be honest, it's extremely vague,
>>> at least to me) then the whole point is to no longer do
>>> entity->EntityEntry resolutions via PersistenceContext (through this
>>> IdentityMap).  But that's actually the same net effect as bytecode
>>> enhancement.  On StatefulPersistenceContext:
>>> 
>>> public EntityEntry getEntityEntry(Object entity) {
>>>    if ( isEnhanced( entity ) ) {
>>>        return ( (Enhanced) entity ).getEntityEntry();
>>>    }
>>>    else {
>>>        // this is what hotspots
>>>        return entityEntries.get( entity );
>>>   }
>>> }
>>> 
>>> On Tue 09 Oct 2012 11:08:42 AM CDT, Emmanuel Bernard wrote:
>>>> On Tue 2012-10-09 10:30, Steve Ebersole wrote:
>>>>> On Tue 09 Oct 2012 09:57:12 AM CDT, Emmanuel Bernard wrote:
>>>>>> On Thu 2012-10-04 10:00, Steve Ebersole wrote:
>>>>>>> See https://hibernate.onjira.com/browse/HHH-7667
>>>>>>> 
>>>>>>> I want to investigate expanding Hibernate's support for bytecode
>>>>>>> enhancement.  I can see 3 main fronts to this:
>>>>>>> 
>>>>>>> 1) offloading map lookups from the PersistenceContext .  Currently we
>>>>>>> keep state associated with each entity in a map (a special type of map
>>>>>>> called IdentityMap) on the PersistenceContext, keyed by the entity.
>>>>>>> Profiling consistently shows these maps as a hot-spot.  The main thing
>>>>>>> we keep in the PersistenceContext in regards to the entity is called an
>>>>>>> EntityEntry, which is kept in an IdentityMap<Object,EntityEntry> (again,
>>>>>>> the entity is the key).  So the thought here is to invert this from
>>>>>>> `map.get( entity )` to something like `( (Enhanced) entity
>>>>>>> ).getEntityEntry()`
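[The inversion described above can be sketched roughly as follows; the `Enhanced` interface name appears in the thread, but the entity class, field name, and setter here are illustrative, not Hibernate's actual enhancement contract:]

```java
// Stand-in for org.hibernate.engine.spi.EntityEntry.
class EntityEntry { }

// Contract the bytecode enhancer would make entities implement.
interface Enhanced {
    EntityEntry getEntityEntry();
    void setEntityEntry(EntityEntry entry);
}

// What the enhancer would conceptually weave into each entity class:
// the entry rides on the entity itself, so no map lookup (and no
// identityHashCode call) is needed to find it.
class OrderEntity implements Enhanced {
    private transient EntityEntry entityEntry; // field injected by enhancement

    public EntityEntry getEntityEntry() { return entityEntry; }
    public void setEntityEntry(EntityEntry entry) { this.entityEntry = entry; }
}
```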
>>>>>> 
>>>>>> I was discussing this with Sanne. One "workaround" to the heavy
>>>>>> IdentityMap price is to keep a contextual cache. I am pretty sure these
>>>>>> IdentityMap lookups are done over and over for a given operation.
>>>>>> 
>>>>>> We might be able to:
>>>>>> 
>>>>>> - pass along the EntityEntry instead of relying on the lookup several
>>>>>>   times
>>>>>> - use a cache of lookups for either the most recent or for the
>>>>>>   contextual operation.
>>>>>> 
>>>>>> The cache would be array based and thus not involve hashcode or lookup.
>>>>>> 
>>>>>> EntityEntry entityEntry = null;
>>>>>> for ( int index = 0 ; index < cache.length ; index++ ) {
>>>>>>   if ( searchedO == cache[index] ) {
>>>>>>     entityEntry = entityEntryCache[index];
>>>>>>     break;
>>>>>>   }
>>>>>> }
>>>>>> if ( entityEntry == null ) {
>>>>>>   entityEntry = identityMap.get( searchedO );
>>>>>>   //add to cache
>>>>>> }
>>>>>> 
>>>>>> Alternatively, I don't remember what is costly, the actual lookup or the
>>>>>> computation of the identity hashcode.
>>>>>> If that's the former, we can take the bet and use the entity hashcode
>>>>>> instead but still use == instead of equals. The risk is more collisions
>>>>>> or in case of a rogue implementation of hashCode not finding the entity.
>>>>>> 
>>>>>> These options seem more lightweight than relying on bytecode
>>>>>> enhancement to keep a placeholder.
>>>>> 
>>>>> Well technically the hotspot is the calculation of the hashcode.
>>>>> However, the same is true no matter how we calculate the hashcode.
>>>>> Both Object#hashCode and System#identityHashCode have the same
>>>>> characteristic (in fact Object#hashCode simply calls
>>>>> System#identityHashCode, as best we can tell since they are both
>>>>> defined as native calls).  However, the fact that this call stack
>>>>> shows up as a hotspot at all is more due to the poor choice of Map
>>>>> impl here and then the fact that we do lots of lookups.  What
>>>>> happens is that we wrap the keys (the entity instances) in a wrapper
>>>>> object that applies the "hashCode as System#identityHashCode" and
>>>>> "equality as ==" semantics.  The problem with the number of calls
>>>>> then is the number of times we do lookups and the fact that we need
>>>>> to wrap the incoming keys and call identityHashCode to be able to do
>>>>> a map lookup.
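[The key wrapper Steve describes can be sketched like this; the class name is illustrative and Hibernate's actual IdentityMap differs in detail. The point is that every lookup must allocate a wrapper and call System#identityHashCode:]

```java
// Sketch of a key wrapper applying "hashCode as System#identityHashCode"
// and "equality as ==" semantics, so a plain HashMap behaves like an
// identity map. Hypothetical name; not Hibernate's real implementation.
final class IdentityKey {
    final Object key;

    IdentityKey(Object key) { this.key = key; }

    @Override
    public int hashCode() {
        // this native call on every wrap is what shows up in profiles
        return System.identityHashCode(key);
    }

    @Override
    public boolean equals(Object other) {
        // reference equality, ignoring the wrapped object's own equals()
        return other instanceof IdentityKey && ((IdentityKey) other).key == key;
    }
}
```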
>>>>> 
>>>>> I do not see how the answer can be "contextual" caches, beyond the
>>>>> set of contextual caches we leverage already (see "merge map", etc).
>>>> 
>>>> I am not familiar with this part of the code but if merge maps are used
>>>> even beyond the merge operations and if the merge map walks through an
>>>> underlying array of data without needing the hashcode (i.e. not map.get()) then you are
>>>> correct, you already have a cache system to prevent hashCode
>>>> computation and my assumption that it would reduce the pressure on
>>>> IdentityMap is not correct.
>>> 
>>> --
>>> steve at hibernate.org
>>> http://hibernate.org
> 
> --
> steve at hibernate.org
> http://hibernate.org
