[infinispan-dev] Asynchronous cache's "void put()" call expectations changed from 6.0.0 to 6.0.1/7.0

Mon Mar 2 05:20:48 EST 2015

I second Galder's opinion that sync/async should be rather a matter of 
API than configuration, because you have to deal with that in code. 
Therefore, when planning a behaviour of any operation, I would suggest 
that it should be possible to host sync and async operations (currently 
forcable through a flag) with the proper semantics - different paths of 
messages would complicate this a lot (e.g. the guarantee that eventually 
all owners share the same value).

The most natural semantics would be IMO that async operations return 
without any guarantee *when* this operation will be applied (and even 
*if* since it can silently fail with error), with optional way to find 
out that it has been applied (as soon as we add the Java 8 API). And the 
effect should be the same as for sync version. It's a subject for 
discussion whether the retry should be done upon topology change (but it 
needs to be transparent to user anyway).

My 2c

Radim

On 03/02/2015 09:55 AM, Dan Berindei wrote:
> Reviving the thread...
>
> Paul, this is a little more complicated than just the return value of 
> the put. In Infinispan prior to 6.0.1, this would have always worked 
> with a replicated asynchronous cache:
>
> cache.put(k, v); // return value ignored
> v1 = cache.get(k);
> assertEquals(v, v1);
>
> The behaviour of Infinispan prior to 6.0.1 was to write the value to 
> the local cache synchronously, and only replicate the value to the 
> other owners asynchronously.
> Of course, if the cache was distributed and the originator was not an 
> owner, the assertion could have failed anyway, because the remote get 
> could be executed before the remote put.
> The get(k) is not guaranteed to return the value inserted by the 
> put(k, v), but it did work most of the time.
>
> Since 6.0.1, the originator doesn't update the local value 
> synchronously any more, unless it is the primary owner. If it is a 
> backup owner, it will send the update asynchronously to the primary 
> owner, which will then update the backup owners (including the 
> originator) synchronously, while holding the lock. This was supposed 
> to change in 5.3.0, as replicated mode was already using lock 
> delegation in 5.3.0, but EntryWrappingInterceptor had an outdated 
> check that meant the originator did both the synchronous update and 
> the asynchronous one. The fix in 6.0.1 was just to remove the 
> synchronous update.
>
> Personally I think the pre-6.0.1 behaviour was a bit dangerous, 
> because the get would return the updated value *almost* all of the 
> time. So there was a chance that tests would work, and the app would 
> only fail in production. putIfAbsent will also behave more like an 
> atomic operation with 6.0.1+, at least as long as the primary owner of 
> the key doesn't change.
>
> That's not to say the current behaviour is perfect. I'm open to 
> changing REPL_ASYNC back to updating the local data container first 
> (if owner) and then sending the write command asynchronously to all 
> the members in parallel, like we did in 5.2.0 and before.
>
> We could also change DIST_ASYNC to behave the same, because the 
> current behaviour is a little too close to what putAsync does in 
> DIST_SYNC mode, and people using async replication should be ok with 
> nodes applying modifications in a different order (it happens on 
> primary owner change anyway).
>
> However, I'm not sure what we should do for atomic operations like 
> replace(key, oldValue, newValues). I would suggest throwing a 
> UnsupportedOperationException, because the operation is very unlikely 
> to be atomic. It's true that PFER is also specified as behaving the 
> same as putIfAbsent, but the PFER check is going to be on the 
> originator only so we can just remove that bit from the Javadoc.
>
> Another option would be to change PFER only to behave like in 5.3.0 
> and 6.0.0. We could say PFER performs two put operations, one that's 
> local-only and one that's replicated but asynchronous (updating the 
> value on the originator a second time). Or we could say PFER always 
> updates the value on the local node only...
>
> Cheers
> Dan
>
> On Tue, Sep 2, 2014 at 3:19 PM, Paul Ferraro <paul.ferraro at redhat.com 
> <mailto:paul.ferraro at redhat.com>> wrote:
>
>     ----- Original Message -----
>     > From: "Galder Zamarreño" <galder at redhat.com
>     <mailto:galder at redhat.com>>
>     > To: "infinispan -Dev List" <infinispan-dev at lists.jboss.org
>     <mailto:infinispan-dev at lists.jboss.org>>, "Paul Ferraro"
>     <paul.ferraro at redhat.com <mailto:paul.ferraro at redhat.com>>
>     > Sent: Monday, September 1, 2014 11:08:45 AM
>     > Subject: Asynchronous cache's "void put()" call expectations
>     changed from 6.0.0 to 6.0.1/7.0
>     >
>     > Hi all,
>     >
>     > @Paul, this might be important for WF if using async repl caches
>     (the same I
>     > think applies to distributed async caches too)
>
>     Luckily, Dan warned me that of this behavioral change well in advance.
>     Any time we need a reliable return value from a Cache.put(...), we
>     use Flag.FORCE_SYNCHRONOUS so that the same code will work for
>     sync and async caches alike.
>
>
>     > Today I’ve been trying to upgrade Infinispan version in
>     Hibernate master from
>     > 6.0.0.Final to 7.0.0.Beta1. Overall, it’s all worked fine but
>     there’s one
>     > test that has started failing.
>     >
>     > Essentialy, this is a clustered test for a repl async cache (w/
>     cluster cache
>     > loader) where a non-owner cache node does put() and immediately,
>     on the same
>     > cache, it calls a get(). The test is failing because the get()
>     does not see
>     > the effects of the put(), even if both operations are called on
>     the same
>     > cache instance.
>     >
>     > According to Dan, this should have been happening since [1] was
>     implemented,
>     > but it’s really started happening since [2] when lock delegation
>     was enabled
>     > for replicated caches
>     (EntryWrappingInterceptor.isUsingLockDelegation is now
>     > true whereas in 6.0.0 it was false).
>     >
>     > Not sure we set expectations in this regard, but clearly it’s
>     big change in
>     > terms of expectations on when “void put()” completes for async
>     repl caches.
>     > I’m not sure how we should handle this, but it definitely needs some
>     > discussion and adjuts documentation/javadoc if needed. Can we do
>     something
>     > differently?
>     >
>     > Indepent of how we resolve this, this is the result of once
>     again of trying
>     > to shoehole async behaviour into sync APIs. Any async caches
>     (DIST, INV,
>     > REPL) should really be accessed exclusively via the AsyncCache
>     API, where
>     > you can return quickly and use the future, and any listener to
>     attach to it
>     > (a bit ala Java8’s CompletableFuture.map lambda calls) as a way
>     to signal
>     > that the operation has completed, and then you have an API and
>     cache mode
>     > that make sense and is consistent with how async APIs work.
>     >
>     > Right now, when a repl async cache’s “void put()” call is not
>     very well
>     > defined. Does it return when message has been put on the
>     network? What
>     > impact does it have in the local cache contents?
>     >
>     > Also, a very big problem of the change of behaviour is that if
>     left like
>     > that, you are forcing users to code differently, using the same
>     “void put()”
>     > API depending on the configuration (whether async/sync). As
>     clearly shown by
>     > the issue above, this is very confusing. It’s a lot more logical
>     IMO, and
>     > I’ve already sent an email on this very same topic [3] back in
>     January, that
>     > whether a cache is sync or async should be based purely on the
>     API used and
>     > forget about the static configuration flag on whether the cache
>     is async or
>     > sync.
>
>     I would agree would this last statement. Consistent semantics are
>     a good thing.
>     If you do change this, however, just let me know well in advance.
>
>     > Cheers,
>     >
>     > [1] https://issues.jboss.org/browse/ISPN-2772
>     > [2] https://issues.jboss.org/browse/ISPN-3354
>     > [3]
>     http://lists.jboss.org/pipermail/infinispan-dev/2014-January/014448.html
>     > --
>     > Galder Zamarreño
>     > galder at redhat.com <mailto:galder at redhat.com>
>     > twitter.com/galderz <http://twitter.com/galderz>
>     >
>     >
>
>     _______________________________________________
>     infinispan-dev mailing list
>     infinispan-dev at lists.jboss.org <mailto:infinispan-dev at lists.jboss.org>
>     https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

-- 
Radim Vansa <rvansa at redhat.com>
JBoss Performance Team