[infinispan-dev] Asynchronous cache's "void put()" call expectations changed from 6.0.0 to 6.0.1/7.0

Radim Vansa rvansa at redhat.com
Mon Mar 2 07:11:47 EST 2015


On 03/02/2015 12:08 PM, Dan Berindei wrote:
> The AsyncCache interface already has an xxxAsync method for every
> write operation in the Cache interface. Those operations really behave
> just like the sync operations, except that they return a NotifyingFuture -
> basically what you're asking for.

But the implementation is sub-optimal now: it merely delegates the sync
operation to a thread pool, which requires a context switch. This will go
away with the reactive-Infinispan design discussed in Berlin, and
NotifyingFuture will be replaced with the standard CompletableFuture. I
thought this was already agreed upon.
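
To make the point concrete, here is a rough sketch of that delegation (a
simplified illustration, not the actual Infinispan code; only the shape of
the problem matters):

import java.util.concurrent.*;

// Conceptually, today's putAsync merely re-submits the blocking put to a
// thread pool, so every "async" call still pays for a context switch and
// still occupies a pool thread for the duration of the sync operation.
class DelegatingAsyncPut<K, V> {
    private final ExecutorService asyncExecutor = Executors.newCachedThreadPool();
    private final ConcurrentMap<K, V> store = new ConcurrentHashMap<K, V>();

    Future<V> putAsync(final K key, final V value) {
        return asyncExecutor.submit(new Callable<V>() {
            public V call() {
                return store.put(key, value); // the sync path, off-thread
            }
        });
    }
}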

>
> The FORCE_ASYNCHRONOUS flag is different, because you have no way to
> find out if the operation was successful. In fact, it behaves as if
> the cache was REPL_ASYNC. This can indeed cause some confusion, as the
> async operation can break consistency guarantees for concurrent sync
> operations. I'm all for removing REPL_ASYNC and the FORCE_ASYNCHRONOUS
> flag, but I think users will complain about their removal because
> putAsync is slower and they don't care/know about potential
> consistency problems.

Slower? You mean that you can achieve lower throughput, right? When you
declare that you don't need any ack (you should then get a Future that
throws an exception on get(), etc.), you don't have to send the
confirmation from the primary to the originator. Do you also omit
confirmations from backup to primary in *_ASYNC mode? I have a vague idea
that this could work by relying on JGroups' primary -> backup ordering of
messages, but during a topology change this could get messed up (leaving
the cache inconsistent until the entry is overwritten). If users insist on
a fast and dirty cache, the configuration option shouldn't be <sync .../>
|| <async ... /> but rather <inconsistency allowed="true||false"/> - this
would make the behaviour more explicit.
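
To illustrate how that would look next to today's configuration, a minimal
sketch (ConfigurationBuilder and CacheMode are existing Infinispan API; the
inconsistency option is only the hypothetical replacement proposed above):

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

class ModeSketch {
    // today: consistency is an implicit side effect of a static
    // sync/async cache mode
    static Configuration replSync() {
        return new ConfigurationBuilder()
            .clustering().cacheMode(CacheMode.REPL_SYNC).build();
    }

    static Configuration replAsync() { // the "fast and dirty" variant
        return new ConfigurationBuilder()
            .clustering().cacheMode(CacheMode.REPL_ASYNC).build();
    }

    // proposed instead (hypothetical, no such option exists): one mode,
    // plus an explicit <inconsistency allowed="true|false"/> switch
}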

Radim

>
> There's also the question of putForExternalRead, which is currently
> defined to always replicate the value asynchronously. If REPL_ASYNC is
> removed, it doesn't make sense to keep the async replication code only
> for putForExternalRead, so it would have to behave like putAsync
> (maybe with an extra CACHE_MODE_LOCAL put).
>
> Cheers
> Dan
>
>
> On Mon, Mar 2, 2015 at 12:20 PM, Radim Vansa <rvansa at redhat.com> wrote:
>> I second Galder's opinion that sync/async should be a matter of API
>> rather than of configuration, because you have to deal with it in code.
>> Therefore, when planning the behaviour of any operation, I would suggest
>> making it possible to mix sync and async operations (currently forceable
>> through a flag) with the proper semantics - different message paths
>> would complicate this a lot (e.g. the guarantee that eventually all
>> owners share the same value).
>>
>> The most natural semantics would IMO be that an async operation returns
>> without any guarantee of *when* it will be applied (or even *if*, since
>> it can silently fail with an error), with an optional way to find out
>> that it has been applied (as soon as we add the Java 8 API). And the
>> effect should be the same as for the sync version. It's a subject for
>> discussion whether a retry should be done upon topology change (but it
>> needs to be transparent to the user anyway).
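>>
>> A hypothetical Java 8 shape for these semantics (CompletableFuture is
>> plain JDK API; the interface below is an assumption for illustration,
>> not existing Infinispan API):
>>
>> import java.util.concurrent.CompletableFuture;
>>
>> // Hypothetical: the write returns immediately, is applied with the
>> // same effect as the sync version at some later point, and the caller
>> // *may* observe when/whether it completed - including silent failures.
>> interface AsyncWrites<K, V> {
>>     CompletableFuture<V> putAsync(K key, V value);
>> }
>>
>> // Usage: fire, and optionally observe completion or failure, e.g.
>> // asyncWrites.putAsync(k, v)
>> //            .whenComplete((prev, err) -> { /* err != null => failed */ });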
>>
>> My 2c
>>
>> Radim
>>
>> On 03/02/2015 09:55 AM, Dan Berindei wrote:
>>> Reviving the thread...
>>>
>>> Paul, this is a little more complicated than just the return value of
>>> the put. In Infinispan prior to 6.0.1, this would have always worked
>>> with a replicated asynchronous cache:
>>>
>>> cache.put(k, v); // return value ignored
>>> v1 = cache.get(k);
>>> assertEquals(v, v1);
>>>
>>> The behaviour of Infinispan prior to 6.0.1 was to write the value to
>>> the local cache synchronously, and only replicate the value to the
>>> other owners asynchronously.
>>> Of course, if the cache was distributed and the originator was not an
>>> owner, the assertion could have failed anyway, because the remote get
>>> could be executed before the remote put.
>>> The get(k) is not guaranteed to return the value inserted by the
>>> put(k, v), but it did work most of the time.
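>>>
>>> Roughly, that race looks like this (a sketch of the pre-6.0.1
>>> distributed case with a non-owner originator):
>>>
>>>   originator                      owner of k
>>>   ----------                      ----------
>>>   put(k, v)   --async repl-->     (not yet applied)
>>>   get(k)      --remote get-->     returns old value / null => assert fails
>>>                                   applies put(k, v)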
>>>
>>> Since 6.0.1, the originator doesn't update the local value
>>> synchronously any more, unless it is the primary owner. If it is a
>>> backup owner, it will send the update asynchronously to the primary
>>> owner, which will then update the backup owners (including the
>>> originator) synchronously, while holding the lock. This was supposed
>>> to change in 5.3.0, as replicated mode was already using lock
>>> delegation in 5.3.0, but EntryWrappingInterceptor had an outdated
>>> check that meant the originator did both the synchronous update and
>>> the asynchronous one. The fix in 6.0.1 was just to remove the
>>> synchronous update.
>>>
>>> Personally I think the pre-6.0.1 behaviour was a bit dangerous,
>>> because the get would return the updated value *almost* all of the
>>> time. So there was a chance that tests would work, and the app would
>>> only fail in production. putIfAbsent will also behave more like an
>>> atomic operation with 6.0.1+, at least as long as the primary owner of
>>> the key doesn't change.
>>>
>>> That's not to say the current behaviour is perfect. I'm open to
>>> changing REPL_ASYNC back to updating the local data container first
>>> (if owner) and then sending the write command asynchronously to all
>>> the members in parallel, like we did in 5.2.0 and before.
>>>
>>> We could also change DIST_ASYNC to behave the same, because the
>>> current behaviour is a little too close to what putAsync does in
>>> DIST_SYNC mode, and people using async replication should be ok with
>>> nodes applying modifications in a different order (it happens on
>>> primary owner change anyway).
>>>
>>> However, I'm not sure what we should do about atomic operations like
>>> replace(key, oldValue, newValue). I would suggest throwing an
>>> UnsupportedOperationException, because the operation is very unlikely
>>> to be atomic. It's true that PFER is also specified as behaving the
>>> same as putIfAbsent, but the PFER check is going to be on the
>>> originator only, so we can just remove that bit from the Javadoc.
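>>>
>>> For reference, the contract in question (the plain
>>> java.util.concurrent.ConcurrentMap signature):
>>>
>>> // Atomic compare-and-set: the comparison and the write must appear as
>>> // a single atomic step, and the boolean must reflect whether the swap
>>> // actually happened - which an async write path cannot reliably report.
>>> boolean replace(K key, V oldValue, V newValue);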
>>>
>>> Another option would be to change only PFER to behave as it did in
>>> 5.3.0 and 6.0.0. We could say PFER performs two put operations, one
>>> that's local-only and one that's replicated but asynchronous (updating
>>> the value on the originator a second time). Or we could say PFER
>>> always updates the value on the local node only...
>>>
>>> Cheers
>>> Dan
>>>
>>> On Tue, Sep 2, 2014 at 3:19 PM, Paul Ferraro <paul.ferraro at redhat.com> wrote:
>>>
>>>      ----- Original Message -----
>>>      > From: "Galder Zamarreño" <galder at redhat.com>
>>>      > To: "infinispan -Dev List" <infinispan-dev at lists.jboss.org>,
>>>      > "Paul Ferraro" <paul.ferraro at redhat.com>
>>>      > Sent: Monday, September 1, 2014 11:08:45 AM
>>>      > Subject: Asynchronous cache's "void put()" call expectations
>>>      > changed from 6.0.0 to 6.0.1/7.0
>>>      >
>>>      > Hi all,
>>>      >
>>>      > @Paul, this might be important for WF if using async repl caches
>>>      > (the same, I think, applies to distributed async caches too)
>>>
>>>      Luckily, Dan warned me of this behavioral change well in advance.
>>>      Any time we need a reliable return value from a Cache.put(...), we
>>>      use Flag.FORCE_SYNCHRONOUS so that the same code works for sync
>>>      and async caches alike.
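>>>
>>>      A minimal sketch of that pattern (withFlags() and
>>>      Flag.FORCE_SYNCHRONOUS are existing Infinispan API, from
>>>      org.infinispan.context.Flag; cache/key/value are placeholders):
>>>
>>>      // Force this single write to complete synchronously, regardless
>>>      // of whether the cache is configured as sync or async, so its
>>>      // return value is reliable again.
>>>      Object previous = cache.getAdvancedCache()
>>>                             .withFlags(Flag.FORCE_SYNCHRONOUS)
>>>                             .put(key, value);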
>>>
>>>
>>>      > Today I've been trying to upgrade the Infinispan version in
>>>      > Hibernate master from 6.0.0.Final to 7.0.0.Beta1. Overall, it's
>>>      > all worked fine, but there's one test that has started failing.
>>>      >
>>>      > Essentially, this is a clustered test for a repl async cache
>>>      > (w/ cluster cache loader) where a non-owner cache node does a
>>>      > put() and immediately, on the same cache, calls a get(). The
>>>      > test is failing because the get() does not see the effects of
>>>      > the put(), even though both operations are called on the same
>>>      > cache instance.
>>>      >
>>>      > According to Dan, this should have been happening since [1] was
>>>      > implemented, but it really started happening with [2], when lock
>>>      > delegation was enabled for replicated caches
>>>      > (EntryWrappingInterceptor.isUsingLockDelegation is now true,
>>>      > whereas in 6.0.0 it was false).
>>>      >
>>>      > I'm not sure we ever set expectations in this regard, but this
>>>      > is clearly a big change in terms of when "void put()" completes
>>>      > for async repl caches. I'm not sure how we should handle it, but
>>>      > it definitely needs some discussion, and the documentation and
>>>      > javadoc should be adjusted if needed. Can we do something
>>>      > differently?
>>>      >
>>>      > Independent of how we resolve this, it is once again the result
>>>      > of trying to shoehorn async behaviour into sync APIs. Any async
>>>      > cache (DIST, INV, REPL) should really be accessed exclusively
>>>      > via the AsyncCache API, where the call returns quickly and you
>>>      > can use the future, and any listener attached to it (a bit a la
>>>      > Java 8's CompletableFuture.map lambda calls), as the way to
>>>      > signal that the operation has completed. Then you have an API
>>>      > and a cache mode that make sense and are consistent with how
>>>      > async APIs work.
>>>      >
>>>      > Right now, a repl async cache's "void put()" call is not very
>>>      > well defined. Does it return once the message has been put on
>>>      > the network? What impact does it have on the local cache
>>>      > contents?
>>>      >
>>>      > Also, a very big problem with this change of behaviour is that,
>>>      > left as is, it forces users to code differently against the
>>>      > same "void put()" API depending on the configuration (async vs.
>>>      > sync). As the issue above clearly shows, this is very
>>>      > confusing. It is a lot more logical IMO - and I already sent an
>>>      > email on this very topic [3] back in January - that whether a
>>>      > cache is sync or async should be based purely on the API used,
>>>      > forgetting about the static configuration flag that marks the
>>>      > cache as async or sync.
>>>
>>>      I would agree with this last statement. Consistent semantics are
>>>      a good thing. If you do change this, however, just let me know
>>>      well in advance.
>>>
>>>      > Cheers,
>>>      >
>>>      > [1] https://issues.jboss.org/browse/ISPN-2772
>>>      > [2] https://issues.jboss.org/browse/ISPN-3354
>>>      > [3] http://lists.jboss.org/pipermail/infinispan-dev/2014-January/014448.html
>>>      > --
>>>      > Galder Zamarreño
>>>      > galder at redhat.com
>>>      > twitter.com/galderz
>>>
>>
>> --
>> Radim Vansa <rvansa at redhat.com>
>> JBoss Performance Team
>>


-- 
Radim Vansa <rvansa at redhat.com>
JBoss Performance Team


