[infinispan-dev] Double+ buffering during value marshalling

Tue Aug 28 10:56:04 EDT 2012

All I can contribute is that you cannot really avoid buffering in the 
marshaller, because for user data it uses constructs of the form:

   <length> <content>

You generally cannot know <length> without some form of buffering, since
the content has to be fully serialized before its size is known.
There may be some optimizations which are possible that I haven't done
yet, but I think the current behavior is pretty much how it will stay
for the near future at least.
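
To illustrate the point, here is a minimal sketch of why a
<length> <content> layout forces a buffer. This is not the actual
JBoss Marshalling code; the class LengthPrefixedSketch, the
ContentWriter callback and the 4-byte length prefix are all
assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class LengthPrefixedSketch {

    /** Hypothetical callback that serializes some user data to a stream. */
    interface ContentWriter {
        void writeTo(DataOutputStream out) throws IOException;
    }

    /**
     * Writes <length> <content> to the target. The content is serialized
     * into a scratch buffer first, because its length is only known once
     * serialization has finished.
     */
    static void writeLengthPrefixed(DataOutputStream target, ContentWriter writer)
            throws IOException {
        ByteArrayOutputStream scratch = new ByteArrayOutputStream();
        DataOutputStream scratchOut = new DataOutputStream(scratch);
        writer.writeTo(scratchOut);
        scratchOut.flush();

        target.writeInt(scratch.size()); // assumed 4-byte big-endian length
        scratch.writeTo(target);         // then the buffered content
    }

    /** Encodes a string as a length-prefixed UTF-8 byte sequence. */
    static byte[] encode(String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            writeLengthPrefixed(out,
                    o -> o.write(text.getBytes(StandardCharsets.UTF_8)));
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            // Not expected for in-memory streams.
            throw new UncheckedIOException(e);
        }
    }
}
```

The scratch buffer could only be dropped if the writer were able to
report its size up front, which arbitrary user data generally cannot.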

On 08/20/2012 09:51 AM, Galder Zamarreño wrote:
> I don't really know tbh. More of a question for David.
>
> To provide more background info, we have MarshalledValue in:
> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/marshall/MarshalledValue.java
>
> It used to keep a byte[], but now we do some buffering in ExpandableMarshalledValueByteStream (https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/io/ExpandableMarshalledValueByteStream.java) to be more efficient and avoid throwing away byte arrays; see https://issues.jboss.org/browse/ISPN-2032
>
> What Sanne suggests is that we avoid double buffering in ExpandableMarshalledValueByteStream and SimpleDataOutput (RiverMarshaller).
>
> There's a 1 to 1 mapping between the RiverMarshaller and the stream/buffer passed to Marshalling.createByteOutput(), so I think there could be an option where the stream passed is the actual buffer.
>
> @Sanne, did you see any particular performance impact in the profiler? Or are you just noting it?
>
> Cheers,
>
> On Aug 13, 2012, at 6:22 PM, Manik Surtani wrote:
>
>> Probably a question for David or Galder… but good detective work as always, Sanne.
>>
>> On 11 Aug 2012, at 21:00, Sanne Grinovero <sanne at infinispan.org> wrote:
>>
>>> While debugging around the Infinispan/JBMar integration I noticed that
>>> Infinispan defines in-memory buffers wrapped as byte streams, and uses
>>> JBoss Marshaller to write to these.
>>>
>>> JBoss Marshaller also buffers writes, and flushes when needed or on
>>> demand; I did already know of both, but just realized that these two
>>> levels of buffering are both applied when serializing instances in a
>>> org.infinispan.marshall.MarshalledValue; I guess we could do better by
>>> avoiding buffering and having JBMAR write straight away? It would also
>>> mean less array resizing, as we are often able to allocate the right
>>> size in one shot.
>>>
>>> I had a look into RiverMarshaller, and this didn't look like a simple
>>> task as _byte[] buffer_ is the main element around which most of the
>>> code revolves (in its superclass SimpleDataOutput).
>>>
>>> I'm wondering if SimpleDataOutput wouldn't be simpler by just writing
>>> to a org.jboss.marshalling.ByteOutput (and have an optional buffering
>>> implementation), or if Infinispan should use River in a different way.
>>>
>>> Regards,
>>> Sanne
>>
>> --
>> Manik Surtani
>> manik at jboss.org
>> twitter.com/maniksurtani
>>
>> Founder and Project Lead, Infinispan
>> http://www.infinispan.org
>>
>> Platform Architect, JBoss Data Grid
>> http://red.ht/data-grid
>>
>
> --
> Galder Zamarreño
> Sr. Software Engineer
> Infinispan, JBoss Cache
>

-- 
- DML