[infinispan-dev] Asynchronous cache's "void put()" call expectations changed from 6.0.0 to 6.0.1/7.0

Mon Mar 2 03:55:47 EST 2015

Reviving the thread...

Paul, this is a little more complicated than just the return value of the
put. In Infinispan prior to 6.0.1, this would have always worked with a
replicated asynchronous cache:

cache.put(k, v); // return value ignored
v1 = cache.get(k);
assertEquals(v, v1);

The behaviour of Infinispan prior to 6.0.1 was to write the value to the
local cache synchronously, and only replicate the value to the other owners
asynchronously.
Of course, if the cache was distributed and the originator was not an
owner, the assertion could have failed anyway, because the remote get could
be executed before the remote put.
The get(k) is not guaranteed to return the value inserted by the put(k, v),
but it did work most of the time.

Since 6.0.1, the originator doesn't update the local value synchronously
any more, unless it is the primary owner. If it is a backup owner, it will
send the update asynchronously to the primary owner, which will then update
the backup owners (including the originator) synchronously, while holding
the lock. This was supposed to change in 5.3.0, as replicated mode was
already using lock delegation in 5.3.0, but EntryWrappingInterceptor had an
outdated check that meant the originator did both the synchronous update
and the asynchronous one. The fix in 6.0.1 was just to remove the
synchronous update.

Personally I think the pre-6.0.1 behaviour was a bit dangerous, because the
get would return the updated value *almost* all of the time. So there was a
chance that tests would work, and the app would only fail in production.
putIfAbsent will also behave more like an atomic operation with 6.0.1+, at
least as long as the primary owner of the key doesn't change.

That's not to say the current behaviour is perfect. I'm open to changing
REPL_ASYNC back to updating the local data container first (if owner) and
then sending the write command asynchronously to all the members in
parallel, like we did in 5.2.0 and before.

We could also change DIST_ASYNC to behave the same, because the current
behaviour is a little too close to what putAsync does in DIST_SYNC mode,
and people using async replication should be ok with nodes applying
modifications in a different order (it happens on primary owner change
anyway).

However, I'm not sure what we should do for atomic operations like
replace(key, oldValue, newValues). I would suggest throwing a
UnsupportedOperationException, because the operation is very unlikely to be
atomic. It's true that PFER is also specified as behaving the same as
putIfAbsent, but the PFER check is going to be on the originator only so we
can just remove that bit from the Javadoc.

Another option would be to change PFER only to behave like in 5.3.0 and
6.0.0. We could say PFER performs two put operations, one that's local-only
and one that's replicated but asynchronous (updating the value on the
originator a second time). Or we could say PFER always updates the value on
the local node only...

Cheers
Dan

> On Tue, Sep 2, 2014 at 3:19 PM, Paul Ferraro <paul.ferraro at redhat.com>
wrote:

> ----- Original Message -----
> > From: "Galder Zamarreño" <galder at redhat.com>
> > To: "infinispan -Dev List" <infinispan-dev at lists.jboss.org>, "Paul
> Ferraro" <paul.ferraro at redhat.com>
> > Sent: Monday, September 1, 2014 11:08:45 AM
> > Subject: Asynchronous cache's "void put()" call expectations changed
> from 6.0.0 to 6.0.1/7.0
> >
> > Hi all,
> >
> > @Paul, this might be important for WF if using async repl caches (the
> same I
> > think applies to distributed async caches too)
>
> Luckily, Dan warned me that of this behavioral change well in advance.
> Any time we need a reliable return value from a Cache.put(...), we use
> Flag.FORCE_SYNCHRONOUS so that the same code will work for sync and async
> caches alike.

> > Today I’ve been trying to upgrade Infinispan version in Hibernate master
> from
> > 6.0.0.Final to 7.0.0.Beta1. Overall, it’s all worked fine but there’s one
> > test that has started failing.
> >
> > Essentialy, this is a clustered test for a repl async cache (w/ cluster
> cache
> > loader) where a non-owner cache node does put() and immediately, on the
> same
> > cache, it calls a get(). The test is failing because the get() does not
> see
> > the effects of the put(), even if both operations are called on the same
> > cache instance.
> >
> > According to Dan, this should have been happening since [1] was
> implemented,
> > but it’s really started happening since [2] when lock delegation was
> enabled
> > for replicated caches (EntryWrappingInterceptor.isUsingLockDelegation is
> now
> > true whereas in 6.0.0 it was false).
> >
> > Not sure we set expectations in this regard, but clearly it’s big change
> in
> > terms of expectations on when “void put()” completes for async repl
> caches.
> > I’m not sure how we should handle this, but it definitely needs some
> > discussion and adjuts documentation/javadoc if needed. Can we do
> something
> > differently?
> >
> > Indepent of how we resolve this, this is the result of once again of
> trying
> > to shoehole async behaviour into sync APIs. Any async caches (DIST, INV,
> > REPL) should really be accessed exclusively via the AsyncCache API, where
> > you can return quickly and use the future, and any listener to attach to
> it
> > (a bit ala Java8’s CompletableFuture.map lambda calls) as a way to signal
> > that the operation has completed, and then you have an API and cache mode
> > that make sense and is consistent with how async APIs work.
> >
> > Right now, when a repl async cache’s “void put()” call is not very well
> > defined. Does it return when message has been put on the network? What
> > impact does it have in the local cache contents?
> >
> > Also, a very big problem of the change of behaviour is that if left like
> > that, you are forcing users to code differently, using the same “void
> put()”
> > API depending on the configuration (whether async/sync). As clearly
> shown by
> > the issue above, this is very confusing. It’s a lot more logical IMO, and
> > I’ve already sent an email on this very same topic [3] back in January,
> that
> > whether a cache is sync or async should be based purely on the API used
> and
> > forget about the static configuration flag on whether the cache is async
> or
> > sync.
>
> I would agree would this last statement.  Consistent semantics are a good
> thing.
> If you do change this, however, just let me know well in advance.
>
> > Cheers,
> >
> > [1] https://issues.jboss.org/browse/ISPN-2772
> > [2] https://issues.jboss.org/browse/ISPN-3354
> > [3]
> http://lists.jboss.org/pipermail/infinispan-dev/2014-January/014448.html
> > --
> > Galder Zamarreño
> > galder at redhat.com
> > twitter.com/galderz
> >
> >
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20150302/6f5c0f36/attachment-0001.html