[infinispan-dev] Re: JProfiler snapshots for Infinispan+JBMAR

Fri May 8 15:47:50 EDT 2009

Already answered these in PM, but for the benefit of the list...

1. The caches aren't there for performance (though that is an obvious 
side-effect); these are the backreference caches for written data.  In 
other words, we don't write the same instance twice in the same message;
instead we write a numerical reference to the object at an earlier stream
position.  So in order to reuse this cache (or avoid discarding it), you
would have to keep writing to the same stream and guarantee that the remote
side reads messages in exactly the order they were written by the
corresponding marshaller.  Put simply, these caches *have* to be cleared on
a per-message basis, otherwise the backreferences won't match up and
unmarshalling will fail.
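
As a rough illustration of the backreference idea, here is a minimal sketch
using plain java.io serialization as a stand-in (this is not the
RiverMarshaller code; the class and method names are made up for the
example).  Writing the same instance twice to one stream emits a short
handle the second time; calling reset() discards the handle table, so the
second write carries the full data again and can be read without any
earlier context.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Hypothetical demo class; an analogy built on java.io, not JBoss Marshalling.
final class BackrefDemo {

    /**
     * Writes the same instance twice to one stream and returns the total
     * number of bytes produced. With resetBetween=true the stream forgets
     * every object it has written before the second write.
     */
    static int writeTwice(Object payload, boolean resetBetween) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(payload);   // full data, assigned a stream handle
            if (resetBetween) {
                out.reset();            // drop the handle table ("clear the cache")
            }
            out.writeObject(payload);   // backreference, or full data after reset
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        String payload = "a reasonably long payload string for the demo";
        System.out.println("shared handle table: " + writeTwice(payload, false) + " bytes");
        System.out.println("reset between:       " + writeTwice(payload, true) + " bytes");
    }
}
```

The smaller output is only decodable by a reader that has already seen the
first write on the same stream, which is the ordering guarantee described
above.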

2. The for(;;) loop actually only executes either one or two times (usually 
just once).  According to the serialization spec[1], the various checks 
have to be done in a certain order, otherwise Serializable objects which 
depend on certain features such as write replacement and custom write 
methods might malfunction.  That said, if we can prove that this is a 
performance issue there are some things we can try, like optionally 
disabling object substitution via a configuration object; also it's 
possible that things can be rearranged without breaking the semantics.

For the example of String, we (you and I) might know that such a class can 
never support substitution, but short of doing an "instanceof String" 
there's no way that the marshaller can know that the instance is in fact a 
String in time to avoid doing the check.  And adding in a bunch of early 
"instanceof" may well impose more of an overhead than just checking if the 
method is there.
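
To make the "one or two times" point concrete, here is a hedged sketch of
what such a replacement loop can look like.  It is an assumption-laden
illustration, not the actual RiverMarshaller source: the names
ReplaceLoopSketch, Resolved and Temperature are invented, and the lookup
ignores the spec's finer access rules for inherited writeReplace methods.
The per-class lookup is cached in a ClassValue, so "checking if the method
is there" costs one cached lookup per object.

```java
import java.io.Serializable;
import java.lang.reflect.Method;
import java.util.Optional;

// Hypothetical sketch; not the JBoss Marshalling implementation.
final class ReplaceLoopSketch {

    /** Per-class cache: the writeReplace() method, if the class has one. */
    private static final ClassValue<Optional<Method>> WRITE_REPLACE =
            new ClassValue<Optional<Method>>() {
                @Override
                protected Optional<Method> computeValue(Class<?> type) {
                    if (!Serializable.class.isAssignableFrom(type)) {
                        return Optional.empty();
                    }
                    // Simplified: walks superclasses without the spec's access checks.
                    for (Class<?> c = type; c != null; c = c.getSuperclass()) {
                        try {
                            Method m = c.getDeclaredMethod("writeReplace");
                            m.setAccessible(true);
                            return Optional.of(m);
                        } catch (NoSuchMethodException e) {
                            // not declared here, keep walking up
                        }
                    }
                    return Optional.empty();
                }
            };

    /** Result of resolution: the object to write and how many passes it took. */
    static final class Resolved {
        final Object value;
        final int passes;
        Resolved(Object value, int passes) { this.value = value; this.passes = passes; }
    }

    /** Example Serializable class that substitutes itself with a String. */
    static final class Temperature implements Serializable {
        private static final long serialVersionUID = 1L;
        final double celsius;
        Temperature(double celsius) { this.celsius = celsius; }
        private Object writeReplace() { return celsius + "C"; }
    }

    /** Applies writeReplace() repeatedly until the class stops changing. */
    static Resolved resolve(Object obj) throws ReflectiveOperationException {
        int passes = 0;
        for (;;) {
            passes++;
            Optional<Method> m = WRITE_REPLACE.get(obj.getClass());
            if (!m.isPresent()) {
                return new Resolved(obj, passes);   // common case: single pass
            }
            Object rep = m.get().invoke(obj);
            if (rep == null || rep.getClass() == obj.getClass()) {
                return new Resolved(rep, passes);
            }
            obj = rep;                              // new class: check it once more
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(resolve("plain").passes);
        System.out.println(resolve(new Temperature(21.5)).passes);
    }
}
```

In this sketch a String exits after a single pass, and an object that
substitutes itself needs a second pass only to confirm that the replacement
has no writeReplace() of its own.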

If you read the spec (chapter 2), actually any object type (including 
non-serializable objects) can have a replacement.  So disabling this has to 
be an optional feature, if we even opt to do it (proving where perf. issues 
are is pretty tricky even with profilers, as we're discovering).
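
For reference, the stream-level substitution hook looks like this in plain
java.io (a minimal sketch; Handle, HandleRef and ReplacingStream are made-up
names, and this shows the JDK mechanism rather than anything in JBMAR).  A
subclass of ObjectOutputStream that calls enableReplaceObject(true) gets to
substitute any object, including one that is not Serializable, before the
stream decides how to write it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

// Hypothetical demo of the java.io replaceObject() hook.
final class ReplaceDemo {

    /** Deliberately NOT Serializable. */
    static final class Handle {
        final int id;
        Handle(int id) { this.id = id; }
    }

    /** Serializable stand-in written in place of a Handle. */
    static final class HandleRef implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        HandleRef(int id) { this.id = id; }
    }

    /** Stream that substitutes every Handle with a HandleRef. */
    static final class ReplacingStream extends ObjectOutputStream {
        ReplacingStream(OutputStream out) throws IOException {
            super(out);
            enableReplaceObject(true);  // without this, replaceObject() is never called
        }

        @Override
        protected Object replaceObject(Object obj) {
            return obj instanceof Handle ? new HandleRef(((Handle) obj).id) : obj;
        }
    }

    /** Writes obj through the replacing stream and reads back what arrived. */
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ReplacingStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object result = roundTrip(new Handle(7));
        System.out.println(result.getClass().getSimpleName());
    }
}
```

Because the hook sees every object, a marshaller cannot know up front which
types will be substituted, which is why switching it off would have to be
an explicit opt-in.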

[1] http://java.sun.com/javase/6/docs/platform/serialization/spec/serialTOC.html

On 05/08/2009 12:50 PM, Galder Zamarreno wrote:
> David,
> 
> A few questions about RiverMarshaller:
> 
> 1. Why not make classCache and externalizers in RM live longer than the
> start/finish period? For example, if you had a create/destroy lifecycle
> that spanned, say, Cache.start()/stop(), then classCache and
> externalizers would be shared by all writes, and I think this could
> enhance performance. I wouldn't do this for instancesCache since it would
> blow out.
> 
> 2. Looking at the profiling data, I'm slightly concerned that the
> writeReplace() code is in the for(;;) loop in RM. I mean, writeReplace
> only makes sense for Serializable classes, but this section of code is
> being executed a lot, much more often than the number of Serializable
> classes in the tests would suggest.
> 
> I just ran a quick test, and for a String this section of the code is
> executed. Why do so when you know for sure that String, which is final,
> does not have and cannot have writeReplace()?
> 
> I think this section should go right to the bottom, into the obj
> instanceof Serializable section, or somewhere less on the critical path.
> 
> Regards,
> 
> Galder Zamarreno wrote:
>> Profiler data with JBMAR r174
>>
>> Galder Zamarreno wrote:
>>> Hi David,
>>>
>>> Please find attached graphs from two runs that compare:
>>>
>>> infinispan-4.0.0(repl-sync) - home grown marshalling layer
>>> infinispan-4.0.0(repl-sync-jbmar) - infinispan + jbmar 1.1.2
>>> infinispan-4.0.0(repl-sync-jbmar)-rXYZ - infinispan + jbmar at 
>>> revision XYZ
>>>
>>> Not sure what the conclusion is here, tbh. The results for 1.1.2 look 
>>> almost opposite in each test.
>>>
>>> I've also attached some information from previously run profiling 
>>> sessions with a couple of local machines we have in Neuchatel. I 
>>> profiled the faster of the two machines.
>>>
>>> Actually, looking at this profiled data, these tests are for 
>>> synchronous replicated caches but I see no traces of actually reading 
>>> the stream, only writing to it, hmmmm.
>>>
>>> I'm adding Externalizers to class table and implementing 
>>> marshaller/unmarshaller pooling as my next tasks.
>>>
>>> Regards,
>>>
>>> David M. Lloyd wrote:
>>>> OK, I tried out a few things.  You might want to try introducing 
>>>> these one at a time (i.e. update up to rev 173, then 174, then 175 
>>>> and see how each one does).  In particular, I think 175 has just as 
>>>> much chance of slowing things down as speeding them up - either 
>>>> you're getting tons of collisions in the hash table or the profiler 
>>>> is skewing the results there (maybe try filtering out 
>>>> org.jboss.marshalling.util.IdentityIntMap and java.lang.System to 
>>>> see if that gives a different picture).
>>>>
>>>> I feel pretty good about 173 and 174 though I think the profiler 
>>>> will skew 173 unless you have that UTFUtils filter installed.  If 
>>>> 175 slows things down (outside of the profiler), let me know and 
>>>> I'll revert it.  None of my tests showed much difference but I don't 
>>>> have any good benchmarks that really exercise that code right now.
>>>>
>>>> There's a couple things left to try yet, like looking at replacing 
>>>> ConcurrentReferenceHashMap (assuming that isn't the profiler again).
>>>>
>>>> ------------------------------------------------------------------------ 
>>>>
>>>> r175 | david.lloyd at jboss.com | 2009-05-08 00:17:46 -0500 (Fri, 08 
>>>> May 2009) | 1 line
>>>>
>>>> Try a trick to decrease the likelihood of collisions
>>>> ------------------------------------------------------------------------ 
>>>>
>>>> r174 | david.lloyd at jboss.com | 2009-05-08 00:04:39 -0500 (Fri, 08 
>>>> May 2009) | 1 line
>>>>
>>>> Replacement caching is not economical; the cost is one extra hash 
>>>> table get for non-replaced objects, two hash table gets (total) for 
>>>> replaced objects.  Removing the cache gets rid of the cost for 
>>>> non-replaced objects, while replaced objects now have to be replaced 
>>>> again before the single hash table hit.
>>>> ------------------------------------------------------------------------ 
>>>>
>>>> r173 | david.lloyd at jboss.com | 2009-05-07 23:44:52 -0500 (Thu, 07 
>>>> May 2009) | 1 line
>>>>
>>>> JBMAR-52 - Avoid extra copy of char array (1.5 of 2)
>>>> ------------------------------------------------------------------------ 
>>>>
>>>>
>>>> - DML
>>>
>>
>