On May 24, 2011, at 1:08 PM, Dan Berindei wrote:
On Tue, May 24, 2011 at 11:57 AM, Sanne Grinovero
<sanne.grinovero(a)gmail.com> wrote:
> 2011/5/24 Galder Zamarreño <galder(a)redhat.com>:
>> Guys,
>>
>> Some interesting discussions here, keep them coming! Let me summarise what I
>> submitted yesterday as pull req for
>> https://issues.jboss.org/browse/ISPN-1102
>>
>> - I don't think users can really provide such accurate predictions of the
>> object sizes because, first, Java does not give you an easy way of figuring out
>> how much memory your object takes up and, second, most people don't have such
>> knowledge. What I think could be more interesting is potentially having a buffer
>> predictor that predicts sizes per type, so rather than calculating the next buffer
>> size taking all objects into account, do that per object type. To make this
>> possible in the future, I'm gonna add the object to be marshalled as a parameter to
>> https://github.com/infinispan/infinispan/pull/338/files#diff-2 - This enhancement
>> allows for your suggestions on externalizers providing an estimated size to be
>> implemented, but I'm not keen on that.
>>
>> - For a solution to ISPN-1102, I've gone for a simpler adaptive buffer size
>> algorithm that Netty uses for determining the receiver buffer size. The use cases
>> are different but I liked the simplicity of the algorithm, since calculating the
>> next buffer size is an O(1) op and the size can grow or shrink very easily. I agree
>> that it might not be as exact as reservoir sampling+percentile, but at least it's
>> cheaper to compute and it resolves the immediate problem of senders keeping too
>> much memory for sent buffers before STABLE comes around.
>>
>> - Next step would be to go and test this and compare it with what Bela/Dan were
>> seeing (+1 to another interactive debugging session), and if we are still not too
>> happy about the memory consumption, maybe we can look into providing a different
>> implementation for BufferSizePredictor that uses reservoir sampling.
>>
>> - Finally, I think once ISPN-1102 is in, we should make the BufferSizePredictor
>> implementation configurable programmatically and via XML - I'll create a separate
>> JIRA for this.
>
> great wrap up, +1 on all points.
> BTW I definitely don't expect every user to be able to figure out the
> proper size, just that some of them might want (need?) to provide
> hints.
>
Looks great Galder, although I could use some comments on how the
possible buffer sizes are chosen in your algorithm :-)
I'll ping you on IRC.
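
To make the question concrete, here is a minimal sketch of a table-driven
adaptive predictor in the spirit of the Netty-style approach mentioned above.
It is not the code from the pull request: the class name, the power-of-two size
table, the "grow by 4 steps, shrink by 1 step" policy and the two-observation
shrink rule are all assumptions made for illustration. The point is only that
both recording an observed size and predicting the next one are O(1).

```java
// Illustrative sketch only; names and constants are hypothetical.
final class AdaptiveSizeSketch {
    // Assumed size table: powers of two from 64 bytes up to 1 MiB.
    private static final int[] SIZES = new int[15];
    static {
        for (int i = 0; i < SIZES.length; i++) {
            SIZES[i] = 64 << i;
        }
    }

    private int index;             // current position in the size table
    private boolean shrinkPending; // true after one "too small" observation

    AdaptiveSizeSketch(int initialIndex) {
        // Clamp the starting index into the table bounds.
        this.index = Math.max(0, Math.min(initialIndex, SIZES.length - 1));
    }

    /** Buffer size to allocate for the next marshalling call. */
    int nextSize() {
        return SIZES[index];
    }

    /** Feed back the number of bytes actually written. */
    void recordSize(int actual) {
        if (actual > SIZES[index]) {
            // Grow quickly: jump several table entries at once.
            index = Math.min(index + 4, SIZES.length - 1);
            shrinkPending = false;
        } else if (index > 0 && actual <= SIZES[index - 1]) {
            // Shrink slowly: only after two consecutive small observations.
            if (shrinkPending) {
                index--;
                shrinkPending = false;
            } else {
                shrinkPending = true;
            }
        } else {
            shrinkPending = false;
        }
    }
}
```

With this policy a single large object bumps the prediction up immediately,
while a single small object does not shrink it; the second small one in a row
does.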
I guess we were thinking of different things with the externalizer
extension. I was imagining something like an ObjectOutput
implementation that doesn't really write anything but instead it just
records the size of the object that would be written. That way the
size estimate would always be accurate, but of course the performance
wouldn't be very good for complex object graphs.
Still, I'd like to play with something like this to see if we can
estimate the memory usage of the cache and base the eviction on the
(estimated) memory usage instead of a fixed number of entries. It
seems to me that's the first question people ask when they start
using Infinispan.
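
A rough sketch of the "write nothing, just count" idea, using plain JDK
serialization as a stand-in. This is an assumption for illustration only:
Infinispan's own marshaller and externalizers would produce different sizes,
and the class names below are hypothetical.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Sink that discards every byte and only keeps a running total.
final class CountingOutputStream extends OutputStream {
    private long count;

    @Override
    public void write(int b) {
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) {
        count += len;
    }

    long count() {
        return count;
    }
}

final class SizeEstimator {
    private SizeEstimator() {
    }

    /** Number of bytes JDK serialization would write for obj. */
    static long serializedSize(Serializable obj) {
        CountingOutputStream counter = new CountingOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(counter)) {
            out.writeObject(obj);
            out.flush(); // push buffered bytes through to the counter
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.count();
    }
}
```

No byte[] is ever allocated for the payload, but the whole object graph is
still walked, which is where the cost for complex graphs comes from.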
Sure, this is something we have considered in the past, and a cache that stores everything
as binary is the easiest of the use cases to provide this type of calculation.
In the case where store-as-binary is off, doing this is more complicated because even if
you can marshall things at some point (i.e. at replication time), the space taken by the
object in memory and the space taken by its binary form are different.
Cheers
Dan
> Cheers,
> Sanne
>
>>
>> Cheers,
>>
>> On May 24, 2011, at 8:12 AM, Bela Ban wrote:
>>
>>>
>>>
>>> On 5/23/11 11:09 PM, Dan Berindei wrote:
>>>
>>>>> No need to expose the ExposedByteArrayOutputStream, a byte[] buffer,
>>>>> offset and length will do it, and we already use this today.
>>>>>
>>>>>
>>>>>> In case the value is not stored in binary form, the expected life of
>>>>>> the stream is very short anyway, after being pushed directly to
>>>>>> network buffers we don't need it anymore... couldn't we pass the
>>>>>> non-truncated stream directly to JGroups without this final size
>>>>>> adjustment?
>>>>>
>>>>
>>>> The problem is that byte[] first has to be copied to another buffer
>>>> together with the rest of the ReplicableCommand before getting to
>>>> JGroups. AFAIK in JGroups you must have 1 buffer for each message.
>>>
>>>
>>> If you use ExposedByteArrayOutputStream, you should have access to the
>>> underlying buffer, so you don't need to copy it.
>>>
>>>
>>>>> You do that, yes.
>>>>>
>>>>> However, afair, the issue is not on the *sending*, but on the
>>>>> *receiving* side. That's where the larger-than-needed buffer sticks
>>>>> around. On the sending side, as you mentioned, Infinispan passes a
>>>>> buffer/offset/length to JGroups and JGroups passes this right on to the
>>>>> network layer, which copies that data into a buffer.
>>>>>
>>>>
>>>> I don't think so... on the receiving side the buffer size is
>>>> controlled exclusively by JGroups, the unmarshaller doesn't create any
>>>> buffers. The only buffers on the receiving side are those created by
>>>> JGroups, and JGroups knows the message size before creating the buffer
>>>> so it doesn't have to worry about predicting buffer sizes.
>>>>
>>>> On sending however I understood that JGroups keeps the buffer with the
>>>> offset and length in the NakReceiverWindow exactly as it got it from
>>>> Infinispan, without any trimming, until it receives a STABLE message
>>>> from all the other nodes in the cluster.
>>>
>>>
>>> Ah, ok. I think we should really do what we said before JBW, namely have
>>> an interactive debugging session, to clear this up.
>>>
>>> --
>>> Bela Ban
>>> Lead JGroups / Clustering Team
>>> JBoss
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev(a)lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>> --
>> Galder Zamarreño
>> Sr. Software Engineer
>> Infinispan, JBoss Cache
>>
>
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache