Re: [infinispan-dev] REPL async semantics in the context of Hibernate 2LC

Thursday, 26 January 2017

Hi Galder,

I think that this was changed in Infinispan version 5.3 or so :) The 
reason for this is that updates even in async cache are applied in the 
same order on all owners. If you updated the local node A to X first, and 
then asynchronously updated the other node B, there could be a concurrent 
update to Y on node B, and then the cluster would likely end 
up with A having Y and B having X, without anything eventually resolving 
this. Some locking has to be involved, too, and the algorithm in 5.3 
actually did not allow the values to diverge, but caused a deadlock.
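
To make the divergence concrete, here is a minimal sketch in plain Java. It 
does not use any Infinispan API: the class and method names are hypothetical, 
the cluster is modelled as a map from node name to value, and the choice of 
node B as the ordering owner is an assumption made only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AsyncOrderingSketch {

    // Local-first: every node applies its own write immediately,
    // and the write from the other node arrives afterwards.
    static Map<String, String> localFirst() {
        Map<String, String> valueOn = new LinkedHashMap<>();
        valueOn.put("A", "X"); // A applies its own update locally
        valueOn.put("B", "Y"); // B concurrently applies its own update locally
        valueOn.put("B", "X"); // A's asynchronous update reaches B
        valueOn.put("A", "Y"); // B's asynchronous update reaches A
        return valueOn;
    }

    // Owner-ordered: a single node (here B, an arbitrary choice) decides
    // the order, and every node applies the updates in that same order.
    static Map<String, String> ownerOrdered() {
        List<String> orderChosenByOwner = List.of("Y", "X");
        Map<String, String> valueOn = new LinkedHashMap<>();
        for (String node : List.of("A", "B")) {
            for (String value : orderChosenByOwner) {
                valueOn.put(node, value);
            }
        }
        return valueOn;
    }

    public static void main(String[] args) {
        System.out.println("local-first:   " + localFirst());   // {A=Y, B=X}
        System.out.println("owner-ordered: " + ownerOrdered()); // {A=X, B=X}
    }
}
```

In the local-first case the two nodes end up with different values and no 
later message corrects them; in the owner-ordered case both nodes converge.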

In 2LC, this can be eliminated in some cases, though - e.g. if we do 
putIfAbsents with the same value, it's safe to apply the value locally 
and send the update asynchronously to the other node. For removals, it's 
safe, too. Therefore, I have recently replaced the distribution & locking 
interceptors with an 'optimized' version [1][2].
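
A rough illustration of why these cases are safe, using only java.util.Map 
(the class and method names are hypothetical and this is not the code in 
[1][2]): the result of two putIfAbsents with the same value does not depend 
on the order in which they are applied, while the result of two plain puts 
with different values does.

```java
import java.util.HashMap;
import java.util.Map;

class CommutativeUpdatesSketch {

    // Apply two plain puts to the same key, in the given order.
    static String afterPuts(String first, String second) {
        Map<String, String> cache = new HashMap<>();
        cache.put("k", first);
        cache.put("k", second);
        return cache.get("k");
    }

    // Apply two putIfAbsents to the same key, in the given order.
    static String afterPutIfAbsents(String first, String second) {
        Map<String, String> cache = new HashMap<>();
        cache.putIfAbsent("k", first);
        cache.putIfAbsent("k", second);
        return cache.get("k");
    }

    public static void main(String[] args) {
        // Plain puts: the last writer wins, so the order matters.
        System.out.println(afterPuts("X", "Y") + " vs " + afterPuts("Y", "X"));
        // putIfAbsent with the same value: the order cannot matter.
        System.out.println(afterPutIfAbsents("v", "v"));
    }
}
```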

While I am a strong adversary of the *_ASYNC modes in general, I think 
that the consistent order of updates should be preserved there. And if 
you do an async put to a dist cache, you can't be sure that a following read 
will return the value either (and repl is just a read-optimized, 
failure-resilient case of dist).

Radim

[1] 
https://github.com/hibernate/hibernate-orm/blob/master/hibernate-infinisp...
[2] 
https://github.com/hibernate/hibernate-orm/blob/master/hibernate-infinisp...

On 01/26/2017 01:24 PM, Galder Zamarreño wrote:
...
 Hi all,

 Forgive me if we've discussed this before (I vaguely remember...), but the current
async semantics always throw me off a bit, let me explain:

 I've been working on/off on a Hibernate 2LC tutorial that demonstrates how to run 2LC
on embedded, Wildfly and Spring setups, and for each of them, explains how it all works
in local vs clustered mode.

 One of the sections involves working with queries, updating an entity that's part of
the query, and seeing how that query gets re-executed from the db. When an entity is
updated, that entity's update timestamp gets updated in a cache, which in a cluster
environment is configured with repl async.

 If you have two nodes A and B, it was expected that if you updated the entity in node A,
you'd want to wait a tiny bit to run the query in node B so that the timestamp update
would propagate to node B.

 However, recent async semantics work in such a way that if you updated the entity in node A
and wanted to execute the query in node A, you still might want to add a little delay...

 The reason for that is that the logic changes based on whether the ownership of entity
type key in the update timestamp cache is in node A or node B. If the owner is node A, the
cache is updated directly by the main thread. So you can execute a query on node A
immediately after the update and it'll be fine.

 However, if the owner is node B, even if the update was done in node A, node A will only
be updated asynchronously. So, if after calling an update on node A, you do a query on
node A, in this scenario you'd get outdated results for a small period of time. [1]
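
 The two scenarios above can be sketched in plain Java. This is only a model:
 it does not call Infinispan, the class, method and key names are hypothetical,
 and the asynchronous update of node A is represented by a queue that is drained
 by hand.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class ReadAfterAsyncWriteSketch {
    // Node A's local copy of the update-timestamps cache (hypothetical model).
    private final Map<String, Long> localCopyOnA = new HashMap<>();
    // Updates that will reach node A's copy later, asynchronously.
    private final Queue<Runnable> pendingOnA = new ArrayDeque<>();

    // An update issued on node A for the given entity type.
    void updateFromA(String entityType, long timestamp, boolean aIsOwner) {
        if (aIsOwner) {
            // A owns the key: the calling thread updates A's copy directly.
            localCopyOnA.put(entityType, timestamp);
        } else {
            // B owns the key: A's own copy is only updated asynchronously.
            pendingOnA.add(() -> localCopyOnA.put(entityType, timestamp));
        }
    }

    // Simulates the asynchronous updates finally arriving on node A.
    void deliverPending() {
        Runnable update;
        while ((update = pendingOnA.poll()) != null) {
            update.run();
        }
    }

    Long readOnA(String entityType) {
        return localCopyOnA.get(entityType);
    }

    public static void main(String[] args) {
        ReadAfterAsyncWriteSketch owner = new ReadAfterAsyncWriteSketch();
        owner.updateFromA("Person", 42L, true);
        System.out.println("A owns key, read right away: " + owner.readOnA("Person"));

        ReadAfterAsyncWriteSketch nonOwner = new ReadAfterAsyncWriteSketch();
        nonOwner.updateFromA("Person", 42L, false);
        System.out.println("B owns key, read right away: " + nonOwner.readOnA("Person"));
        nonOwner.deliverPending();
        System.out.println("B owns key, after delivery:  " + nonOwner.readOnA("Person"));
    }
}
```

 In this model the read on node A sees the new timestamp immediately only when
 A owns the key; otherwise it sees the old state until the pending update lands.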

 So, my question here is: can we do anything to make this more predictable from a user's
perspective? Or is it just not worth doing it? Or is it just a side effect that we must be
aware of?

 Cheers,

 [1] https://gist.github.com/galderz/676f689884969658b01a7695f08dd7a2
 --
 Galder Zamarreño
 Infinispan, Red Hat

 _______________________________________________
 infinispan-dev mailing list
 infinispan-dev(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev 

-- 
Radim Vansa <rvansa(a)redhat.com>
JBoss Performance Team
