[infinispan-dev] DeltaAware: different local/remote behaviour

Dan Berindei dan.berindei at gmail.com
Thu Dec 4 03:41:30 EST 2014


Hi Radim

I'm afraid the DeltaAware javadoc is quite clear:

> * Implementations of DeltaAware automatically gain the ability to perform fine-grained replication in Infinispan,
> * since Infinispan's data container is able to detect these types and only serialize and transport Deltas around
> * the network rather than the entire, serialized object.

So deltas are only used for minimizing the replication cost, not for
local operations.

> * Using DeltaAware makes sense if your custom object is large in size and often only sees small portions of the
> * object being updated in a transaction.  Implementations would need to be able to track these changes during the
> * course of a transaction though, to be able to produce a {@link Delta} instance, so this too is a consideration
> * for implementations.

Meaning put(K, DeltaAware) expects the originator to have the previous
value, not just a Delta. And that's why AtomicHashMap requires
transactions.
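
To make the local/remote difference concrete, here is a minimal sketch of
it. The Delta and DeltaAware interfaces below are simplified stand-ins that
only mirror the shape of the Infinispan ones, and Counter, applyRemote and
applyLocal are hypothetical names invented for the illustration; none of
this is the actual data container code.

```java
import java.util.HashMap;
import java.util.Map;

class DeltaAwareSketch {
    // Simplified stand-ins (assumption): same shape as the Infinispan
    // interfaces, but defined locally so the sketch is self-contained.
    interface Delta { DeltaAware merge(DeltaAware previous); }
    interface DeltaAware { Delta delta(); void commit(); }

    // Hypothetical counter that tracks the increments made since the last commit.
    static final class Counter implements DeltaAware {
        private long value;
        private long pending;

        Counter(long value) { this.value = value; }
        void add(long n) { value += n; pending += n; }
        long value() { return value; }

        @Override public Delta delta() {
            final long increment = pending;
            // The delta only knows the increment, not the full value.
            return previous -> {
                Counter target = previous instanceof Counter ? (Counter) previous : new Counter(0);
                target.value += increment;
                return target;
            };
        }

        @Override public void commit() { pending = 0; }
    }

    // Models the remote path: only the delta travels and is merged into the stored value.
    static void applyRemote(Map<String, DeltaAware> container, String key, DeltaAware written) {
        container.put(key, written.delta().merge(container.get(key)));
    }

    // Models the local path: the written object is stored as-is.
    static void applyLocal(Map<String, DeltaAware> container, String key, DeltaAware written) {
        container.put(key, written);
    }

    public static void main(String[] args) {
        Map<String, DeltaAware> remote = new HashMap<>();
        Map<String, DeltaAware> local = new HashMap<>();
        remote.put("k", new Counter(10));
        local.put("k", new Counter(10));

        // Intended usage: the originator holds the full previous value.
        Counter full = new Counter(10);
        full.add(1);
        applyRemote(remote, "k", full);
        applyLocal(local, "k", full);
        full.commit();
        System.out.println(((Counter) remote.get("k")).value()); // 11
        System.out.println(((Counter) local.get("k")).value());  // 11

        // "Fake" usage: the object carries only a relative change.
        Counter fake = new Counter(0);
        fake.add(1);
        applyRemote(remote, "k", fake);
        applyLocal(local, "k", fake);
        System.out.println(((Counter) remote.get("k")).value()); // 12
        System.out.println(((Counter) local.get("k")).value());  // 1
    }
}
```

With the full previous value both paths agree; with the fake object the
local path simply overwrites the stored counter.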

Further comments inline.

On Tue, Dec 2, 2014 at 6:17 PM, Radim Vansa <rvansa at redhat.com> wrote:
> Hi Erik,
>
> it's great to get community (users') feedback on API :)
>
> Comments inline
>
> On 12/02/2014 04:04 PM, Erik Salter wrote:
>> Hi Radim,
>>
>> We may be doing something similar.  I was implementing something along the
>> lines of a queue of operations that resolve into a single value. This
>> implementation uses Total Order and CRDTs.  I also want a changelog to
>> send to the backups.
>>
>> I already use DeltaAware quite liberally in my production environment.
>> I've always looked at it as an implementation detail if the originator ==
>> primary owner.  While this does make for some inefficiencies, like
>> increased memory utilization (I have a lot of keys for very large
>> objects), it's worth it to me from a simplicity standpoint.
>
> Yes, and that's what I'd like to do :) When designing the object, I was
> expecting that all updates will be in the delta-way, and therefore I
> report that it's not this way and I have to adapt the code in a hackish way.
>

I think you already have a hack in your code when you create a fake
DeltaAware when you only have a Delta :)

Your scenario is useful, indeed our Map/Reduce implementation uses it,
but it's not what DeltaAware was intended for. I'd rather add a new
API (like JCache's EntryProcessor) instead of "fixing" DeltaAware to
do things it wasn't intended for.
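
As a rough illustration of what such an API could look like, here is a
sketch of getAndIncrement() written as an entry processor. The interfaces
are hypothetical and only loosely modelled on JCache's EntryProcessor;
ToyCache is a single-node stand-in backed by a ConcurrentHashMap, not an
Infinispan cache, and nothing here is a proposed signature.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class EntryProcessorSketch {
    // Hypothetical interfaces (assumption), loosely modelled on JCache.
    interface MutableEntry<V> { V getValue(); void setValue(V value); }
    interface EntryProcessor<V, R> { R process(MutableEntry<V> entry); }

    // Simple mutable box, used both as the entry view and as a result holder.
    static final class Holder<V> implements MutableEntry<V> {
        V value;
        Holder(V value) { this.value = value; }
        @Override public V getValue() { return value; }
        @Override public void setValue(V value) { this.value = value; }
    }

    // Toy single-node cache: the processor runs where the entry lives,
    // so the caller never needs the previous value up front.
    static final class ToyCache<K, V> {
        private final Map<K, V> data = new ConcurrentHashMap<>();

        <R> R invoke(K key, EntryProcessor<V, R> processor) {
            Holder<R> result = new Holder<>(null);
            // compute() runs the function atomically for this key.
            data.compute(key, (k, current) -> {
                Holder<V> entry = new Holder<>(current);
                result.value = processor.process(entry);
                return entry.value;
            });
            return result.value;
        }

        V get(K key) { return data.get(key); }
    }

    // Increments the stored value and returns the previous one (0 if absent).
    static final EntryProcessor<Long, Long> GET_AND_INCREMENT = entry -> {
        long previous = entry.getValue() == null ? 0L : entry.getValue();
        entry.setValue(previous + 1);
        return previous;
    };

    public static void main(String[] args) {
        ToyCache<String, Long> cache = new ToyCache<>();
        System.out.println(cache.invoke("counter", GET_AND_INCREMENT)); // 0
        System.out.println(cache.invoke("counter", GET_AND_INCREMENT)); // 1
        System.out.println(cache.get("counter"));                       // 2
    }
}
```

The point of the shape is that the operation, not the value, is what
travels, and the return value is whatever the processor computes rather
than something squeezed through put().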

>>
>> I always use DeltaAware with SKIP_REMOTE_LOOKUP.
>
> I see. But in some use cases you'd want a condensed report of the
> result of applying the delta.
>

Again, this would be a hack: the put operation should return the
previous value, not an arbitrary value.

>>
>> The real fun with DeltaAware are the cases where a backup receives a
>> DeltaAware instance and the key isn't in its data container.  It will
>> issue a remote get to pull the complete context before applying the delta.
>>   During state transfer, this will lead to increased thread utilization on
>> the joining nodes.  I have a use case where I must restart half my cluster
>> while there are 100K DeltaAware keys being written at a high data rate.
>> With numOwners == 2, there are 3 nodes in the union CH.  A new backup will
>> issue 2 remote GetKeyValueCommands.
>
> Hmm, that does not sound very convenient, but I don't see what else could
> be done when the delta-updated entry is not in place yet.
>
>> I have a hack to stagger the gets to
>> reduce bandwidth, but if we're rethinking the implementation this should
>> be an additional consideration.
>
> Nobody said we're rethinking this - I was just providing the feedback
> from my POV after first starting to play with DeltaAware.
>

I've created ISPN-5042 [1] to keep track of this, but we might
implement ISPN-825 [2] sooner.

[1] https://issues.jboss.org/browse/ISPN-5042
[2] https://issues.jboss.org/browse/ISPN-825

> Radim
>
>>
>> Regards,
>>
>> Erik
>>
>>
>> On 12/2/14, 9:33 AM, "Radim Vansa" <rvansa at redhat.com> wrote:
>>
>>> Hi,
>>>
>>> I was trying to implement effective atomic counters [1] in Infinispan
>>> using the DeltaAware interface, but trying to use DeltaAware I've
>>> spotted an unexpected behaviour; I wanted to have a Delta for
>>> getAndIncrement() method, that would simply increment the value without
>>> knowing the previous value ahead, and return this previous value.
>>> Therefore, I was inserting a fake DeltaAware object into the cache that
>>> generates this relative Delta.
>>>
>>> This works as long as the originator != primary owner, as the delta is
>>> generated during marshalling. However, if I store that object locally,
>>> the fake object is not used to generate the delta and reapply it on
>>> current instance in data container, but it is stored directly.
>>>
>>> Is such a difference in local/remote behaviour a bug or a feature? (this
>>> is the main question in this mail)
>>>
>>> It seems to me that there are two reasons to use deltas: reducing the size
>>> of RPCs and reducing their total number. So the design should optimize both.
>>>
>>> I have other doubts about the usefulness of the DeltaAware interface,
>>> tracked in ISPN-5035 [2] - while it reduces bandwidth from originator to
>>> primary owner, the response from primary owner to originator carries the
>>> full value. I also find it quite inconvenient that only PutKeyValueCommand
>>> somehow works with deltas, but ReplaceCommand does not.
>>>
>>> I've also noticed that the backup carries the full value [3], not quite
>>> a good idea when we're trying to reduce bandwidth.
>>>
>>> Generally, I think that an EntryProcessor-like interface would be more
>>> useful than DeltaAware.
>>>
>>> Radim
>>>
>>> [1] https://github.com/rvansa/infinispan/tree/t_objects
>>> [2] https://issues.jboss.org/browse/ISPN-5035
>>> [3] https://issues.jboss.org/browse/ISPN-5037
>>>
>>> --
>>> Radim Vansa <rvansa at redhat.com>
>>> JBoss DataGrid QA
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
> --
> Radim Vansa <rvansa at redhat.com>
> JBoss DataGrid QA
