[infinispan-dev] L1OnRehash Discussion

Sanne Grinovero sanne at infinispan.org
Wed Feb 5 09:43:34 EST 2014


I'm all for simplification, assuming that this will deliver better
reliability and easier maintenance, but let's not forget that some
entries might actually be large.

Saving a couple of transfers might be pointless complexity for our
usual small-key tests, but maybe it's an interesting feature when you
store gigabytes per value.
Also, performance "hiccups" are not desirable even in small-key
scenarios: an often-read key should stay where it is rather than
needing an occasional RPC.

I haven't looked into the details of your problem, so if you think
it's too complex I'm not against ditching this, I'm just trying to
make sure we evaluate the full picture.

I think you made a great point when noting that the entry remaining
in place might actually never get a hit - and so be pointless - but
shouldn't that be a decision the eviction strategy is able to handle?

Cheers,
Sanne


On 5 February 2014 13:19, William Burns <mudokonman at gmail.com> wrote:
> On Tue, Feb 4, 2014 at 6:04 AM, Dan Berindei <dan.berindei at gmail.com> wrote:
>>
>>
>>
>> On Tue, Feb 4, 2014 at 10:07 AM, Galder Zamarreño <galder at redhat.com> wrote:
>>>
>>>
>>> On 28 Jan 2014, at 15:29, William Burns <mudokonman at gmail.com> wrote:
>>>
>>> > Hello everyone,
>>> >
>>> > I wanted to discuss what I would call the dubious benefit of L1OnRehash,
>>> > especially when weighed against the complexity it brings.
>>> >
>>> > L1OnRehash is used to retain a value by moving a previously owned
>>> > value into the L1 cache when a rehash occurs and this node no longer
>>> > owns that value.  Also, any current L1 values are removed when a rehash
>>> > occurs.  Therefore it can only save a single remote get, and only for
>>> > the few keys affected by a rehash.
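>>> >
>>> > For anyone who wants to see the knob in question, it's roughly this in
>>> > the programmatic config (builder method names from memory, so treat it
>>> > as a sketch rather than the exact API):
>>> >
>>> > import org.infinispan.configuration.cache.CacheMode;
>>> > import org.infinispan.configuration.cache.Configuration;
>>> > import org.infinispan.configuration.cache.ConfigurationBuilder;
>>> >
>>> > Configuration cfg = new ConfigurationBuilder()
>>> >       .clustering().cacheMode(CacheMode.DIST_SYNC)
>>> >       // keep entries this node stops owning as L1 entries after a rehash
>>> >       .l1().enable().onRehash(true)
>>> >       .build();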
>>> >
>>> > This by itself is fine; however, L1OnRehash has many edge cases to
>>> > guarantee consistency, as can be seen from
>>> > https://issues.jboss.org/browse/ISPN-3838.  This can get quite
>>> > complicated for a feature that gives marginal performance increases
>>> > (especially given that the retained value may never have been read
>>> > recently - at least normal L1 usage guarantees that it was).
>>> >
>>> > My first suggestion is instead to deprecate the L1OnRehash
>>> > configuration option and to remove this logic.
>>>
>>> +1
>>
>>
>> +1 from me as well
>>
>>>
>>>
>>> > My second suggestion is a new implementation of L1OnRehash that is
>>> > always enabled when the L1 threshold is configured to 0.  For those not
>>> > familiar, the L1 threshold controls whether invalidations are broadcast
>>> > instead of being sent as individual messages.  A value of 0 means to
>>> > always broadcast.  This would allow for some things that we can't
>>> > currently do:
>>> >
>>> > 1. L1 values would never have to be invalidated on a rehash event
>>> > (guaranteeing local reads under rehash)
>>> > 2. L1 requestors would not have to be tracked any longer
>>> >
>>> > However, every write would be required to send an invalidation, which
>>> > could slow write performance in additional cases (since we currently
>>> > only send invalidations when requestors are found).  The difference
>>> > would be lessened with UDP, which is the transport I would assume
>>> > someone would use when configuring the L1 threshold to 0.
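>>> >
>>> > In configuration terms, the mode I'm describing is what you get today
>>> > with something like this (again a sketch from memory of the builder
>>> > API, not necessarily the exact method names):
>>> >
>>> > ConfigurationBuilder builder = new ConfigurationBuilder();
>>> > builder.clustering().cacheMode(CacheMode.DIST_SYNC)
>>> >        .l1().enable()
>>> >        // 0 = always broadcast invalidations instead of unicasting
>>> >        // them to individually tracked requestors
>>> >        .invalidationThreshold(0);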
>>>
>>> Sounds good to me, but I think you could go even beyond this and maybe
>>> get rid of the threshold configuration option too?
>>>
>>> If the transport is UDP and multicast is configured, invalidations are
>>> broadcast (and the two benefits you mention apply).
>>> If UDP with unicast or TCP is used, track requestors and send the
>>> invalidations as unicasts.
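>>>
>>> Something along these lines (hypothetical decision logic, not existing
>>> code; I'm assuming the Transport interface exposes something like
>>> isMulticastCapable(), and rpcManager is the cache's injected RpcManager):
>>>
>>> boolean multicast = rpcManager.getTransport().isMulticastCapable();
>>> // multicast transport: broadcast invalidations, no requestor tracking
>>> // unicast-only transport: track requestors and unicast the invalidations
>>> boolean trackRequestors = !multicast;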
>>>
>>> Do we really need to expose these configuration options to the user?
>>
>>
>> I think the idea was that even with UDP, sending 2 unicasts and waiting for
>> only 2 responses may be faster than sending a multicast and waiting for 10
>> responses. However, I'm not sure that's the case if we send 1 unicast
>> invalidation from each owner instead of a single multicast invalidation from
>> the primary owner/originator [1]. Maybe each owner could return a list of
>> requestors and the originator could do the invalidation at the end...
>
> I totally agree, since we currently have to send invalidations from the
> primary owner and all backup owners to guarantee consistency if we
> have a response from a backup owner [2].  By moving to this route we
> only ever have to send a single multicast invalidation instead of N
> unicast invalidations.  However, this also brings up another change
> where we only L1-cache the primary owner's response [3] :) Actually that
> would tilt the performance discussion the other way.  It makes me think
> deprecating the current L1OnRehash and adding primary-owner L1 caching
> should come first, and then we can reevaluate whether the new L1OnRehash
> support is even needed.
>
> The originator firing the invalidations is interesting, but I don't
> think it is feasible.  With async transport this is not doable at all.
> Also, if the originator goes down after the value is persisted, we will
> still have invalid L1 values cached.  The latter could be fixed with
> transactions, but non-tx caches would still be broken.
>
>>
>> One tangible benefit of having the setting is that we can run the test suite
>> with TCP only, and still cover every path in L1Manager. If we removed it
>> completely, it would still be possible to change the toggle in L1ManagerImpl
>> via reflection, but it would be a little hacky.
>>
>>>
>>> > What do you guys think?  I am assuming that no one minds the removal
>>> > of the L1OnRehash that we have currently (if you do, let me know).  I am
>>> > quite curious what others think about the changes for an L1 threshold
>>> > value of 0 - maybe this configuration value is never used?
>>> >
>>
>>
>> Since we don't give any guidance as to what a good threshold value would be,
>> I doubt many people use it.
>>
>> My alternative proposal would be to replace the
>> invalidationThreshold=-1|0|>0 setting with a traceRequestors=true|false
>> setting.
>> 1. If traceRequestors == false, don't keep track of requestors, only send
>> the invalidation from the originator, and enable l1OnRehash.
>>     This means we can keep the entries that are in L1 after a rehash as
>> well.
>> 2. If traceRequestors == true, track requestors, send unicast/multicast
>> invalidations depending on the transport, and disable l1OnRehash.
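>>
>> As configuration it would look something like this (purely hypothetical,
>> traceRequestors doesn't exist as a builder method today):
>>
>> // proposed setting, not existing API
>> ConfigurationBuilder builder = new ConfigurationBuilder();
>> builder.clustering().cacheMode(CacheMode.DIST_SYNC)
>>        .l1().enable()
>>        // false: originator-only invalidations, L1 entries survive a rehash
>>        .traceRequestors(false);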
>
> I have to admit I am struggling with whether we even need this
> configuration option at all, or whether we should just enable requestor
> tracking based solely on the transport configuration.  I do like the
> option though, especially if we find out that not tracking requestors is
> faster.  The default value, though, would be based on whether the
> transport allows multicast or not.
>
>>
>>
>> [1] https://issues.jboss.org/browse/ISPN-186
>
> [2] https://issues.jboss.org/browse/ISPN-3648
> [3] https://issues.jboss.org/browse/ISPN-3684
>
>>
>> Cheers
>> Dan
>>
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


