I like the idea, but as Manik hinted I wonder how many people are
going to go and configure this, unless Infinispan is explicit enough to tell users
that their configuration is not optimal.
We also need to consider the importance of the problem, which is that STABLE keeps the
whole buffer reference around.
Before doing anything further, we should repeat the tests and see what GC looks like on
the sender side with the current adaptive buffer sizing.
My understanding is that by the time the buffer reaches JGroups, and
so enters the STABLE "long lifecycle", we already know the exact
size, so we can do a final resize.
The issue still looks to me to be about minimizing the number of resizes
needed until we reach that point.
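For illustration, that final resize could look like the minimal sketch below (the names are hypothetical, not Infinispan's actual API): copy the payload into an exact-size array only when the unused tail is large enough to be worth reclaiming, so STABLE doesn't pin an oversized buffer.

```java
import java.util.Arrays;

// Hypothetical sketch: before handing a marshalled payload to JGroups (where
// STABLE may keep the reference alive for a while), trim the backing array to
// its exact used length so the unused tail is not pinned in memory.
public class BufferTrimmer {
    /** Copy only when the wasted tail is worth reclaiming. */
    static byte[] trimForLongLifecycle(byte[] buf, int length, int slackThreshold) {
        if (buf.length - length <= slackThreshold) {
            return buf; // slack is small; avoid the copy
        }
        return Arrays.copyOf(buf, length); // exact-size copy; old buffer becomes garbage
    }

    public static void main(String[] args) {
        byte[] oversized = new byte[16 * 1024];
        byte[] trimmed = trimForLongLifecycle(oversized, 100, 256);
        System.out.println(trimmed.length); // 100
    }
}
```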
Cheers,
Sanne
On Jun 9, 2011, at 9:06 PM, Manik Surtani wrote:
> Hi guys
>
> This is an excellent and fun discussion - very entertaining read for me. :-) So a
> quick summary based on everyone's ideas:
>
> I think we can't have a one-size-fits-all solution here. I think simple array
> copies work well as long as the serialized forms are generally small. And while I agree
> with Bela that in some cases (HTTP session replication) it can be hard to determine the
> size of payloads, in others (Hibernate 2LC) it can be determined with a fair degree of
> certainty.
>
> Either way, I think this can be a bottleneck (both in terms of memory and CPU
> performance) if the serialized forms are large (over 100K? That's a guess...) and
> buffers are sub-optimally sized.
>
> I think this should be pluggable - I haven't looked at the code paths in detail
> to see where the impact is, but perhaps different marshaller implementations (maybe all
> extending a generic JBoss Marshalling based marshaller) with different buffer/arraycopy
> logic? So here are the options I see:
>
> 1) Simple arraycopy (could be the default)
> 2) Static buffer size like we have now - but it should be configurable in XML
> 3) Adaptive buffer (the current Netty-like policy Galder has implemented, maybe a separate one with reservoir sampling)
> 4) Per-Externalizer static buffer size - the Externalizer either provides a deterministic buffer size, or a starting buffer size and growth factor.
>
> Option (4) would clearly be an "advanced option", reserved for use by very
> experienced developers who want to squeeze every drop of performance, have intimate
> knowledge of their object graphs, and know what this demands of their system in terms of
> serialization.
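To make the options above concrete, here is a hedged sketch of what the pluggable contract might look like; all names are illustrative, not Infinispan's actual API. Option (2), the static policy, is the simplest possible implementation.

```java
// Hypothetical sketch of a pluggable buffer-size policy (illustrative names).
public class BufferPolicyDemo {
    interface BufferSizePolicy {
        int nextSize(Object toMarshall);   // suggested initial buffer size
        void recordSize(int actualSize);   // feedback after marshalling, so the policy can adapt
    }

    // Option (2): a fixed size taken from configuration; never adapts.
    static final class StaticPolicy implements BufferSizePolicy {
        private final int size;
        StaticPolicy(int size) { this.size = size; }
        public int nextSize(Object toMarshall) { return size; }
        public void recordSize(int actualSize) { /* no-op */ }
    }

    public static void main(String[] args) {
        BufferSizePolicy policy = new StaticPolicy(512);
        System.out.println(policy.nextSize("any object")); // 512
    }
}
```

An adaptive or per-Externalizer policy would only differ in how `nextSize`/`recordSize` are implemented, which is what makes each variant separately unit-testable.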
>
> But further, we should also have some logging in the marshaller - probably TRACE
> level, maybe JMX, disabled by default - to monitor samples and gather statistics on
> inefficiently configured buffer sizes and policies, perhaps even log marshalled types and
> resulting sizes. This could be run during a stress test in a staging environment to help
> determine how to tune marshalling based on the policies above.
>
> WDYT? I think the benefit of making this pluggable is that (a) it can be done
> piecemeal - one policy at a time - and (b) each one is easier to unit test, so fewer bugs
> in, say, a reservoir sampling impl.
>
> Cheers
> Manik
>
>
>
>
> On 25 May 2011, at 08:45, Galder Zamarreño wrote:
>
>>
>> On May 24, 2011, at 1:08 PM, Dan Berindei wrote:
>>
>>> On Tue, May 24, 2011 at 11:57 AM, Sanne Grinovero
>>> <sanne.grinovero(a)gmail.com> wrote:
>>>> 2011/5/24 Galder Zamarreño <galder(a)redhat.com>:
>>>>> Guys,
>>>>>
>>>>> Some interesting discussions here, keep them coming! Let me summarise
>>>>> what I submitted yesterday as a pull request for
>>>>> https://issues.jboss.org/browse/ISPN-1102
>>>>>
>>>>> - I don't think users can really provide such accurate
>>>>> predictions of object sizes, because first, Java does not give you an easy way of
>>>>> figuring out how much space your object takes up, and second, most people don't have
>>>>> that knowledge. What I think could be more interesting is having a buffer predictor
>>>>> that predicts sizes per type, so rather than calculating the next buffer size taking all
>>>>> objects into account, do it per object type. To enable this in the future, I'm
>>>>> going to add the object to be marshalled as a parameter to
>>>>> https://github.com/infinispan/infinispan/pull/338/files#diff-2 - this enhancement allows
>>>>> your suggestion of externalizers providing an estimated size to be implemented, but
>>>>> I'm not keen on that.
>>>>>
>>>>> - For a solution to ISPN-1102, I've gone for the simple adaptive
>>>>> buffer sizing algorithm that Netty uses for determining the receive buffer size. The use
>>>>> cases are different, but I liked the simplicity of the algorithm: calculating the next
>>>>> buffer size is an O(1) operation, and the buffer can grow or shrink very easily. I agree
>>>>> that it might not be as exact as reservoir sampling plus a percentile, but at least it's
>>>>> cheaper to compute, and it resolves the immediate problem of senders keeping too much
>>>>> memory for sent buffers before STABLE comes around.
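As a rough illustration of the Netty-style algorithm described here (the constants and names below are my own, not Netty's or Infinispan's actual code): a precomputed table of sizes is indexed by a cursor that jumps up quickly when the observed size exceeds the prediction, and steps down only after two consecutive smaller observations, so predictions shrink cautiously.

```java
// Simplified sketch of a Netty-style adaptive buffer size predictor.
public class AdaptivePredictorDemo {
    static final int[] SIZE_TABLE = {64, 128, 256, 512, 1024, 2048, 4096, 8192};

    static final class AdaptivePredictor {
        private int index = 3;            // start at 512
        private boolean decreasedOnce;

        int nextSize() { return SIZE_TABLE[index]; }

        void record(int actualSize) {
            if (actualSize <= SIZE_TABLE[Math.max(0, index - 1)]) {
                if (decreasedOnce) {       // shrink only on the 2nd small sample
                    index = Math.max(0, index - 1);
                    decreasedOnce = false;
                } else {
                    decreasedOnce = true;
                }
            } else if (actualSize >= SIZE_TABLE[index]) {
                index = Math.min(SIZE_TABLE.length - 1, index + 2); // grow fast
                decreasedOnce = false;
            }
        }
    }

    public static void main(String[] args) {
        AdaptivePredictor p = new AdaptivePredictor();
        p.record(2000);                    // bigger than 512 -> jump up
        System.out.println(p.nextSize()); // 2048
        p.record(100);
        p.record(100);                     // two small samples -> step down once
        System.out.println(p.nextSize()); // 1024
    }
}
```

Both `nextSize` and `record` are O(1), which is the property being traded off against the accuracy of a sampling approach.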
>>>>>
>>>>> - The next step would be to go and test this and compare it with what Bela/Dan
>>>>> were seeing (+1 to another interactive debugging session), and if we are still not too
>>>>> happy about the memory consumption, maybe we can look into providing a different
>>>>> implementation of BufferSizePredictor that uses reservoir sampling.
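In case it helps the comparison, the reservoir-sampling alternative could be sketched roughly like this (illustrative names; the fixed seed is only for a reproducible demo): keep a fixed-size uniform sample of observed serialized sizes and predict the next buffer size as a high percentile of the sample.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch of a reservoir-sampling buffer size predictor.
public class ReservoirPredictorDemo {
    static final class ReservoirPredictor {
        private final int[] reservoir;
        private final Random random = new Random(42); // fixed seed for the demo
        private long seen;

        ReservoirPredictor(int capacity) { this.reservoir = new int[capacity]; }

        void record(int size) {
            if (seen < reservoir.length) {
                reservoir[(int) seen] = size;          // fill phase
            } else {
                long j = (long) (random.nextDouble() * (seen + 1));
                if (j < reservoir.length) {
                    reservoir[(int) j] = size;         // replace with probability k/n
                }
            }
            seen++;
        }

        /** Predict using the given percentile (e.g. 0.9) of the sampled sizes. */
        int nextSize(double percentile) {
            int n = (int) Math.min(seen, reservoir.length);
            int[] sorted = Arrays.copyOf(reservoir, n);
            Arrays.sort(sorted);
            return sorted[Math.min(n - 1, (int) (percentile * n))];
        }
    }

    public static void main(String[] args) {
        ReservoirPredictor p = new ReservoirPredictor(64);
        for (int i = 1; i <= 50; i++) p.record(i * 10); // sizes 10..500
        System.out.println(p.nextSize(0.9)); // 460
    }
}
```

The per-record cost stays O(1), but `nextSize` pays an O(k log k) sort over the sample, which is the extra compute Galder mentions.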
>>>>>
>>>>> - Finally, I think once ISPN-1102 is in, we should make the
>>>>> BufferSizePredictor implementation configurable programmatically and via XML - I'll
>>>>> create a separate JIRA for this.
>>>>
>>>> great wrap up, +1 on all points.
>>>> BTW I definitely don't expect every user to be able to figure out the
>>>> proper size, just that some of them might want (need?) to provide
>>>> hints.
>>>>
>>>
>>> Looks great Galder, although I could use some comments on how the
>>> possible buffer sizes are chosen in your algorithm :-)
>>
>> I'll ping you on IRC.
>>
>>> I guess we were thinking of different things with the externalizer
>>> extension. I was imagining something like an ObjectOutput
>>> implementation that doesn't really write anything but instead it just
>>> records the size of the object that would be written. That way the
>>> size estimate would always be accurate, but of course the performance
>>> wouldn't be very good for complex object graphs.
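A quick sketch of that idea, using plain Java serialization as a stand-in for the marshaller (class names are hypothetical): stream the object through an OutputStream that discards the bytes and only counts them, yielding an exact size without allocating a buffer, though the walk over the object graph is still paid in CPU.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

// Size-counting stream: write nothing, just tally the bytes.
public class SizeCountingDemo {
    static final class CountingOutputStream extends OutputStream {
        long count;
        @Override public void write(int b) { count++; }
        @Override public void write(byte[] b, int off, int len) { count += len; }
    }

    static long serializedSize(Serializable obj) throws IOException {
        CountingOutputStream counter = new CountingOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(counter)) {
            oos.writeObject(obj);
        }
        return counter.count;
    }

    public static void main(String[] args) throws IOException {
        long small = serializedSize("hi");
        long big = serializedSize(new int[1000]);
        System.out.println(small < big); // the larger graph serializes larger
    }
}
```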
>>>
>>> Still, I'd like to play with something like this to see if we can
>>> estimate the memory usage of the cache and base eviction on the
>>> (estimated) memory usage instead of a fixed number of entries; it
>>> seems to me that's the first question people ask when they start
>>> using Infinispan.
>>
>> Sure, this is something we have considered in the past, and a cache that stores
>> everything as binary is the easiest of the use cases for providing this type of calculation.
>>
>> In the case where store-as-binary is off, doing this is more complicated, because
>> even if you can marshall things at some point (i.e. at replication time), the space taken
>> by the object in memory differs from that taken by its binary form.
>>
>>>
>>> Cheers
>>> Dan
>>>
>>>
>>>> Cheers,
>>>> Sanne
>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> On May 24, 2011, at 8:12 AM, Bela Ban wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On 5/23/11 11:09 PM, Dan Berindei wrote:
>>>>>>
>>>>>>>> No need to expose the ExposedByteArrayOutputStream; a byte[]
>>>>>>>> buffer, offset and length will do, and we already use this
>>>>>>>> today.
>>>>>>>>
>>>>>>>>
>>>>>>>>> In case the value is not stored in binary form, the expected life of
>>>>>>>>> the stream is very short anyway; after being pushed directly to
>>>>>>>>> network buffers we don't need it anymore... couldn't we pass the
>>>>>>>>> non-truncated stream directly to JGroups without this final size
>>>>>>>>> adjustment?
>>>>>>>>
>>>>>>>
>>>>>>> The problem is that the byte[] first has to be copied to another
>>>>>>> buffer together with the rest of the ReplicableCommand before
>>>>>>> getting to JGroups. AFAIK in JGroups you must have one buffer per
>>>>>>> message.
>>>>>>
>>>>>>
>>>>>> If you use ExposedByteArrayOutputStream, you should have access to
>>>>>> the underlying buffer, so you don't need to copy it.
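For illustration, such an "exposed" stream could be sketched as a trivial ByteArrayOutputStream subclass (hypothetical, not the actual Infinispan class): the standard class keeps its backing array private behind a copying `toByteArray()`, while a subclass can expose the raw buffer plus the valid length so the caller can pass (buffer, 0, length) onward without a copy.

```java
import java.io.ByteArrayOutputStream;

// Expose the protected buf/count fields that ByteArrayOutputStream already has.
public class ExposedBufferDemo {
    static final class ExposedByteArrayStream extends ByteArrayOutputStream {
        ExposedByteArrayStream(int initialSize) { super(initialSize); }
        byte[] rawBuffer() { return buf; }   // the backing array; may be oversized
        int length() { return count; }       // number of valid bytes
    }

    public static void main(String[] args) {
        ExposedByteArrayStream out = new ExposedByteArrayStream(128);
        out.write(1); out.write(2); out.write(3);
        // backing array stays 128 bytes; only 3 are valid
        System.out.println(out.rawBuffer().length + " " + out.length()); // 128 3
    }
}
```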
>>>>>>
>>>>>>
>>>>>>>> You do that, yes.
>>>>>>>>
>>>>>>>> However, AFAIR, the issue is not on the *sending* but on the
>>>>>>>> *receiving* side. That's where the larger-than-needed buffer
>>>>>>>> sticks around. On the sending side, as you mentioned, Infinispan
>>>>>>>> passes a buffer/offset/length to JGroups and JGroups passes this
>>>>>>>> right on to the network layer, which copies that data into a
>>>>>>>> buffer.
>>>>>>>>
>>>>>>>
>>>>>>> I don't think so... on the receiving side the buffer size is
>>>>>>> controlled exclusively by JGroups; the unmarshaller doesn't create
>>>>>>> any buffers. The only buffers on the receiving side are those
>>>>>>> created by JGroups, and JGroups knows the message size before
>>>>>>> creating the buffer, so it doesn't have to worry about predicting
>>>>>>> buffer sizes.
>>>>>>>
>>>>>>> On sending, however, I understood that JGroups keeps the buffer
>>>>>>> with the offset and length in the NakReceivingWindow exactly as it
>>>>>>> got it from Infinispan, without any trimming, until it receives a
>>>>>>> STABLE message from all the other nodes in the cluster.
>>>>>>
>>>>>>
>>>>>> Ah, ok. I think we should really do what we said before JBW, namely
>>>>>> have an interactive debugging session, to clear this up.
>>>>>>
>>>>>> --
>>>>>> Bela Ban
>>>>>> Lead JGroups / Clustering Team
>>>>>> JBoss
>>>>>> _______________________________________________
>>>>>> infinispan-dev mailing list
>>>>>> infinispan-dev(a)lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>>
>>>>> --
>>>>> Galder Zamarreño
>>>>> Sr. Software Engineer
>>>>> Infinispan, JBoss Cache
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>
> --
> Manik Surtani
> manik(a)jboss.org
> twitter.com/maniksurtani
>
> Lead, Infinispan
> http://www.infinispan.org
>