[infinispan-dev] Double+ buffering during value marshalling

Fri Aug 31 15:12:56 EDT 2012

JBoss Marshalling is written at a level now where surprising things 
cause performance issues.  It relies, at this point, heavily on method 
inlining (which is touchy business) and specific HotSpot optimization 
behaviors for performance.  So it's really hard to say at this point 
what will or will not cause a performance problem without actually 
benchmarking it, but I think I'm not too far off base to say that adding 
an invokevirtual into the hot path will very likely create a measurable 
slowdown.
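
To make the concern concrete, here is a minimal sketch of the two styles
being compared. All names in it (HotPathSketch, BufferSink, ArraySink) are
made up for illustration and are not JBoss Marshalling classes. The direct
version stores into a byte[] it holds a reference to; the other version
goes through a method call per byte (invokeinterface here, invokevirtual
if it were an abstract class), which the JIT has to inline to get the same
code. Both produce identical bytes; whether the second is actually slower
can only be settled by benchmarking.

```java
import java.util.Arrays;

final class HotPathSketch {
    /** Hypothetical buffer abstraction; NOT part of JBoss Marshalling. */
    interface BufferSink {
        void put(byte b);
    }

    /** Array-backed sink: every put() is a call the JIT must inline or devirtualize. */
    static final class ArraySink implements BufferSink {
        final byte[] buf;
        int pos;

        ArraySink(int size) {
            buf = new byte[size];
        }

        @Override
        public void put(byte b) {
            buf[pos++] = b;
        }
    }

    /** Direct style: big-endian int stored straight into the array. */
    static int writeIntDirect(byte[] buf, int pos, int v) {
        buf[pos] = (byte) (v >>> 24);
        buf[pos + 1] = (byte) (v >>> 16);
        buf[pos + 2] = (byte) (v >>> 8);
        buf[pos + 3] = (byte) v;
        return pos + 4; // new write position
    }

    /** Abstracted style: same bytes, but one method call per byte. */
    static void writeIntViaSink(BufferSink sink, int v) {
        sink.put((byte) (v >>> 24));
        sink.put((byte) (v >>> 16));
        sink.put((byte) (v >>> 8));
        sink.put((byte) v);
    }

    public static void main(String[] args) {
        byte[] direct = new byte[4];
        writeIntDirect(direct, 0, 0x01020304);

        ArraySink sink = new ArraySink(4);
        writeIntViaSink(sink, 0x01020304);

        // Both paths produce the same encoding.
        System.out.println(Arrays.equals(direct, sink.buf));
    }
}
```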

On 08/31/2012 01:46 PM, Sanne Grinovero wrote:
> Hi David,
> by looking into SimpleDataOutput I thought you could move the details
> of the buffer management into a separate class, and define an
> interface which supports some kind of random access to seek back to
> where you need to store the length.
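
For illustration, the "seek back and store the length" idea could look
roughly like the sketch below. It uses java.nio.ByteBuffer as a stand-in
for the random-access interface; BackPatchSketch and writeFramed are
hypothetical names, and the sketch assumes a fixed-width 4-byte length
field and a buffer that is already large enough.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class BackPatchSketch {
    /** Writes <length> <content> into buf, patching the length in afterwards. */
    static ByteBuffer writeFramed(ByteBuffer buf, String... parts) {
        int lengthSlot = buf.position();
        buf.putInt(0); // placeholder for the length, filled in below
        int start = buf.position();

        for (String part : parts) {
            buf.put(part.getBytes(StandardCharsets.UTF_8));
        }

        // Absolute put: stores the length at lengthSlot without moving the position.
        buf.putInt(lengthSlot, buf.position() - start);
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = writeFramed(ByteBuffer.allocate(64), "ab", "cde");
        System.out.println(buf.getInt(0)); // length of the content in bytes
    }
}
```

This avoids a second buffer, but only because the length field has a
known size up front.
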
>
> I see however that you crafted it with great care to have local
> reference to the buffer and write directly in the local array.. I
> guess introducing any interface at that level could have a terrible
> impact on performance? Is that the reason you prefer to avoid this
> abstraction?
>
> @Galder no I didn't measure anything, I only noticed the double
> buffering while debugging. Thought I'd raise it here, as from the
> profiling months ago we noticed lots of array copies generally and
> would like to remove some if we find some easy win, but it seems this
> is not such a case.
>
> I guess I'm a dreamer when I think we should allocate a single buffer
> when needing to transfer several objects, wrap it in an RPC, prefix
> with JGroups headers all with a single allocation - and possibly an
> out-of-heap buffer to feed directly to a NIO2 network channel?
>
> Sanne
>
> On 28 August 2012 16:56, David M. Lloyd <david.lloyd at redhat.com> wrote:
>> All I can contribute is that you cannot really avoid buffering in the
>> marshaller, because for user data it uses constructs of the form:
>>
>>     <length> <content>
>>
>> You generally cannot know <length> without some form of buffering.
>> There may be some optimizations which are possible that I haven't done
>> yet, but I think as it is right now is pretty much how it'll stay for
>> the near future at least.
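
A minimal sketch of why the <length> <content> form forces buffering,
using only java.io classes. The encoding here (one byte for small values,
a 0x80 marker plus four bytes otherwise) is invented for the example and
is not the River wire format; LengthPrefixSketch and frame are
hypothetical names. The point is only that the size of the content is not
known until it has been encoded, so it has to be staged somewhere before
the prefix can be written.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class LengthPrefixSketch {
    /** Encodes values into a staging buffer, then emits <length> <content>. */
    static byte[] frame(int[] values) {
        try {
            // Stage 1: encode the content; its size depends on the values.
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            DataOutputStream bodyOut = new DataOutputStream(body);
            for (int v : values) {
                if (v >= 0 && v < 0x80) {
                    bodyOut.writeByte(v);    // small value: 1 byte
                } else {
                    bodyOut.writeByte(0x80); // marker byte (made-up format)
                    bodyOut.writeInt(v);     // followed by 4 bytes
                }
            }
            bodyOut.flush();

            // Stage 2: the length is known now, so write prefix then content.
            ByteArrayOutputStream out = new ByteArrayOutputStream(4 + body.size());
            DataOutputStream framed = new DataOutputStream(out);
            framed.writeInt(body.size());
            body.writeTo(framed);
            framed.flush();
            return out.toByteArray();
        } catch (IOException e) {
            // In-memory streams are not expected to fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] framed = frame(new int[] {1, 300});
        System.out.println(framed.length); // 4-byte prefix + 6 bytes of content = 10
    }
}
```
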
>>
>> On 08/20/2012 09:51 AM, Galder Zamarreño wrote:
>>> I don't really know tbh. More of a question for David.
>>>
>>> To provide more background info, we have MarshalledValue in:
>>> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/marshall/MarshalledValue.java
>>>
>>> It used to keep a byte[], but now we do some buffering in
>>> ExpandableMarshalledValueByteStream
>>> (https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/io/ExpandableMarshalledValueByteStream.java)
>>> to be more efficient and avoid throwing away byte arrays; see
>>> https://issues.jboss.org/browse/ISPN-2032
>>>
>>> What Sanne suggests is that we avoid double buffering in ExpandableMarshalledValueByteStream and SimpleDataOutput (RiverMarshaller).
>>>
>>> There's a 1 to 1 mapping between the RiverMarshaller and the stream/buffer passed to Marshalling.createByteOutput(), so I think there could be an option where the stream passed is the actual buffer.
>>>
>>> @Sanne, did you see any particular performance impact in the profiler? Or just noting it?
>>>
>>> Cheers,
>>>
>>> On Aug 13, 2012, at 6:22 PM, Manik Surtani wrote:
>>>
>>>> Probably a question for David or Galder… but good detective work as always, Sanne.
>>>>
>>>> On 11 Aug 2012, at 21:00, Sanne Grinovero <sanne at infinispan.org> wrote:
>>>>
>>>>> While debugging around the Infinispan/JBMar integration I noticed that
>>>>> Infinispan defines in-memory buffers wrapped as byte streams, and uses
>>>>> JBoss Marshaller to write to these.
>>>>>
>>>>> JBoss Marshaller also buffers writes, and flushes when needed or on
>>>>> demand; I did already know of both, but just realized that these two
>>>>> levels of buffering are both applied when serializing instances in a
>>>>> org.infinispan.marshall.MarshalledValue; I guess we could do better
>>>>> avoiding buffering and have JBMAR write straight away? It would also
>>>>> mean less array resizing, as we are often able to allocate the right
>>>>> size in one shot.
>>>>>
>>>>> I gave a look into RiverMarshaller, and this didn't look like a simple
>>>>> task as _byte[] buffer_ is the main element around which most of the
>>>>> code revolves (in its superclass SimpleDataOutput).
>>>>>
>>>>> I'm wondering if SimpleDataOutput wouldn't be simpler by just writing
>>>>> to a org.jboss.marshalling.ByteOutput (and have an optional buffering
>>>>> implementation), or if Infinispan should use River in a different way.
>>>>>
>>>>> Regards,
>>>>> Sanne
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> infinispan-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>
>>>> --
>>>> Manik Surtani
>>>> manik at jboss.org
>>>> twitter.com/maniksurtani
>>>>
>>>> Founder and Project Lead, Infinispan
>>>> http://www.infinispan.org
>>>>
>>>> Platform Architect, JBoss Data Grid
>>>> http://red.ht/data-grid
>>>>
>>>
>>> --
>>> Galder Zamarreño
>>> Sr. Software Engineer
>>> Infinispan, JBoss Cache
>>>
>>
>>
>> --
>> - DML

-- 
- DML