Already answered these in PM, but for the benefit of the list...
1. The caches aren't there for performance (though that is an obvious
side-effect); these are the backreference caches for written data. In
other words, we don't write the same instance twice in the same message;
instead there's a numerical reference to an object at an earlier stream
position. To reuse this cache (or avoid discarding it), you would have to
keep writing to the same stream and guarantee that the remote side reads
messages in exactly the order the corresponding marshaller wrote them. Put
simply, these caches *have* to be cleared on a per-message basis; otherwise
the backreferences won't match up and unmarshalling will fail.
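To make that concrete, here is a minimal sketch of the idea -- purely
illustrative, not the actual RiverMarshaller code (the class name and the
string-based "stream" are made up):

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Illustrative sketch of a backreference cache: each instance written
// gets an id, and a repeat occurrence of the same instance is written
// as a small reference to that id instead of the full object.  The ids
// are only meaningful within one message, so the cache must be cleared
// per message or writer and reader will disagree.
final class BackrefSketch {
    private final Map<Object, Integer> instanceCache = new IdentityHashMap<>();
    private int nextId = 0;
    // Stand-in for the real output stream.
    private final StringBuilder stream = new StringBuilder();

    void writeObject(Object obj) {
        Integer id = instanceCache.get(obj);
        if (id != null) {
            // Already written in this message: emit a backreference.
            stream.append("BACKREF(").append(id).append(')');
            return;
        }
        instanceCache.put(obj, nextId++);
        stream.append("FULL(").append(obj).append(')');
    }

    void finishMessage() {
        // Clearing per message keeps the writer's ids in sync with what
        // the reader will assign when it unmarshals the next message.
        instanceCache.clear();
        nextId = 0;
    }

    String streamContents() {
        return stream.toString();
    }
}
```

If the cache survived across messages, a reader starting at a later message
would see BACKREF ids pointing at objects it never read -- exactly the
mismatch described above.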
2. The for(;;) loop actually executes only once or twice (usually just
once). According to the serialization spec[1], the various checks
have to be done in a certain order, otherwise Serializable objects which
depend on certain features such as write replacement and custom write
methods might malfunction. That said, if we can prove that this is a
performance issue there are some things we can try, like optionally
disabling object substitution via a configuration object; also it's
possible that things can be rearranged without breaking the semantics.
For the example of String, we (you and I) might know that such a class can
never support substitution, but short of doing an "instanceof String"
there's no way that the marshaller can know that the instance is in fact a
String in time to avoid doing the check. And adding a bunch of early
"instanceof" checks may well impose more overhead than simply checking
whether the method is there.
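Roughly, the loop has this shape -- a simplified illustration of the
spec-mandated check order, not the real RiverMarshaller internals
(ReplaceSketch and Token are invented names):

```java
import java.io.Serializable;
import java.lang.reflect.Method;

// Simplified sketch of the replacement check: look for writeReplace(),
// and if it yields a different object, run the checks again on the
// replacement.  The common case (no writeReplace method) falls out of
// the loop on the first pass, which is why it usually runs just once.
final class ReplaceSketch {
    static Object resolveReplacement(Object obj) {
        for (;;) {
            Method writeReplace = findWriteReplace(obj.getClass());
            if (writeReplace == null) {
                return obj; // common case: one pass, no replacement
            }
            final Object replacement;
            try {
                writeReplace.setAccessible(true);
                replacement = writeReplace.invoke(obj);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
            if (replacement == obj) {
                return obj; // object replaced itself: stop
            }
            obj = replacement; // re-run the checks on the replacement
        }
    }

    private static Method findWriteReplace(Class<?> clazz) {
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredMethod("writeReplace");
            } catch (NoSuchMethodException ignored) {
                // keep looking up the hierarchy
            }
        }
        return null;
    }

    // Example class (made up) whose writeReplace substitutes a String.
    static final class Token implements Serializable {
        Object writeReplace() {
            return "token-replaced";
        }
    }
}
```

Note that even a String has to go through one findWriteReplace() pass
before the loop can exit; that lookup is the unavoidable per-object cost
discussed above.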
If you read the spec (chapter 2), you'll see that any object type
(including non-serializable objects) can have a replacement. So disabling
this would have to be an optional feature, if we even opt to do it (pinning
down where the perf. issues are is pretty tricky even with profilers, as
we're discovering).
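As a small illustration of that chapter-2 point (class names invented;
this only shows that the method can exist anywhere, not how any particular
marshaller treats it): writeReplace() is located by reflection, so nothing
in the language restricts it to Serializable classes.

```java
// A class that is NOT Serializable can still declare writeReplace();
// the marshaller can only find out by looking, which is why the check
// cannot simply be skipped for non-Serializable instances.
final class ReplacementAnywhere {
    static final class PlainBox { // note: no "implements Serializable"
        Object writeReplace() {
            return "replacement";
        }
    }

    static boolean hasWriteReplace(Class<?> clazz) {
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            try {
                c.getDeclaredMethod("writeReplace");
                return true;
            } catch (NoSuchMethodException ignored) {
                // keep looking up the hierarchy
            }
        }
        return false;
    }
}
```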
[1]
http://java.sun.com/javase/6/docs/platform/serialization/spec/serialTOC.html
On 05/08/2009 12:50 PM, Galder Zamarreno wrote:
David,
A few questions about RiverMarshaller:
1. Why not make classCache and externalizers in RM live longer than the
start/finish period? For example, if there were create/destroy lifecycles
that spanned, say, Cache.start()/stop(), then classCache and externalizers
would be shared by all writes, which I think could enhance performance. I
wouldn't do this for instancesCache since it would blow up.
2. Looking at the profiling data, I'm slightly concerned that the
writeReplace() code is in the for(;;) loop in RM. I mean, writeReplace
only makes sense for Serializable classes, but this section of code is
being executed a lot, much more often than the number of Serializable
classes in the tests would suggest.
I just ran a quick test and, for a String, this section of the code is
executed. Why do so when you know for sure that String, being final, does
not and cannot have writeReplace()?
I think this section should move right down to the obj instanceof
Serializable part, or somewhere less in the critical path.
Regards,
Galder Zamarreno wrote:
> Profiler data with JBMAR r174
>
> Galder Zamarreno wrote:
>> Hi David,
>>
>> Please find attach graphs belonging to two runs that compare:
>>
>> infinispan-4.0.0(repl-sync) - home grown marshalling layer
>> infinispan-4.0.0(repl-sync-jbmar) - infinispan + jbmar 1.1.2
>> infinispan-4.0.0(repl-sync-jbmar)-rXYZ - infinispan + jbmar with
>> revision
>>
>> Not sure what the conclusion is here, tbh. The results for 1.1.2 look
>> almost opposite in each test.
>>
>> I've also attached some information from previously run profiling
>> sessions with a couple of local machines we have in Neuchatel. I
>> profiled the faster of the two machines.
>>
>> Actually, looking at this profiled data: these tests are for
>> synchronous replicated caches, but I see no trace of actually reading
>> the stream, only writing to it, hmmmm.
>>
>> I'm adding Externalizers to the class table and implementing
>> marshaller/unmarshaller pooling as my next tasks.
>>
>> Regards,
>>
>> David M. Lloyd wrote:
>>> OK, I tried out a few things. You might want to try introducing
>>> these one at a time (i.e. update up to rev 173, then 174, then 175
>>> and see how each one does). In particular, I think 175 has just as
>>> much chance of slowing things down as speeding them up - either
>>> you're getting tons of collisions in the hash table or the profiler
>>> is skewing the results there (maybe try filtering out
>>> org.jboss.marshalling.util.IdentityIntMap and java.lang.System to
>>> see if that gives a different picture).
>>>
>>> I feel pretty good about 173 and 174 though I think the profiler
>>> will skew 173 unless you have that UTFUtils filter installed. If
>>> 175 slows things down (outside of the profiler), let me know and
>>> I'll revert it. None of my tests showed much difference but I don't
>>> have any good benchmarks that really exercise that code right now.
>>>
>>> There are a couple of things left to try yet, like looking at replacing
>>> ConcurrentReferenceHashMap (assuming that isn't the profiler again).
>>>
>>> ------------------------------------------------------------------------
>>>
>>> r175 | david.lloyd(a)jboss.com | 2009-05-08 00:17:46 -0500 (Fri, 08
>>> May 2009) | 1 line
>>>
>>> Try a trick to decrease the likelihood of collisions
>>> ------------------------------------------------------------------------
>>>
>>> r174 | david.lloyd(a)jboss.com | 2009-05-08 00:04:39 -0500 (Fri, 08
>>> May 2009) | 1 line
>>>
>>> Replacement caching is not economical; the cost is one extra hash
>>> table get for non-replaced objects, two hash table gets (total) for
>>> replaced objects. Removing the cache gets rid of the cost for
>>> non-replaced objects, while replaced objects now have to be replaced
>>> again before the single hash table hit.
>>> ------------------------------------------------------------------------
>>>
>>> r173 | david.lloyd(a)jboss.com | 2009-05-07 23:44:52 -0500 (Thu, 07
>>> May 2009) | 1 line
>>>
>>> JBMAR-52 - Avoid extra copy of char array (1.5 of 2)
>>> ------------------------------------------------------------------------
>>>
>>>
>>> - DML
>>
>