Re: [infinispan-dev] L1 Consistency with Sync Caches

Thursday, 27 June 2013

On Thu, Jun 27, 2013 at 4:18 PM, William Burns <mudokonman(a)gmail.com> wrote:

 First off I apologize for the length.

 There have been a few Jiras recently that have identified L1 consistency
 issues with both TX and non TX sync caches.  Async caches with L1 have
 their own issues as well, but I only wanted to talk about sync caches.

 https://issues.jboss.org/browse/ISPN-3197
 https://issues.jboss.org/browse/ISPN-2965
 https://issues.jboss.org/browse/ISPN-2990

 I have proposed a solution in
 https://github.com/infinispan/infinispan/pull/1922 which should start L1
 consistency down the right track.  There are quite a few comments on it if
 you want to look into it more, but because of that I am moving this to the
 dev mailing list.

 The key changes in the PR are the following (non-tx):

 1. Concurrent reads for a key that can retrieve a remote value are
 "corralled" into a single thread of execution for that given key.  This
 would reduce network traffic with concurrent gets for the same key.  Note
 the "corralling" only happens on a per key basis.

Get commands on owners should not be serialized. Get commands on non-owners
should not be serialized either, if the key already exists in L1. So I'd
say L1ReadSynchronizer should be L1WriteSynchronizer instead :)
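
To make the "corralling" idea concrete, here is a minimal sketch of one way to share a single remote get among concurrent readers of the same key. It is not the code from the PR: RemoteGetCorral and its get method are hypothetical names, and the remoteGet function stands in for whatever actually fetches the value from an owner. The caller is assumed to use it only on non-owners, and only when the key is not already in L1.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical helper, not an Infinispan class: at most one remote get is in
// flight per key, and later callers for that key share the same future.
final class RemoteGetCorral<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();

    CompletableFuture<V> get(K key, Function<K, CompletableFuture<V>> remoteGet) {
        CompletableFuture<V> shared = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, shared);
        if (existing != null) {
            return existing; // another caller is already fetching this key
        }
        try {
            remoteGet.apply(key).whenComplete((value, error) -> {
                // Unregister first so that a later caller starts a fresh get.
                inFlight.remove(key, shared);
                if (error != null) {
                    shared.completeExceptionally(error);
                } else {
                    shared.complete(value);
                }
            });
        } catch (RuntimeException e) {
            // The fetch could not even be started: fail the shared future.
            inFlight.remove(key, shared);
            shared.completeExceptionally(e);
        }
        return shared;
    }
}
```

Only requests for the same key share a future; gets for different keys never wait on each other.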

 2. The single thread that is doing the remote get would update the L1 if
 able (without locking) and make available the value to all the requests
 waiting on the get.

Well, L1ReadSynchronizer does prevent other threads from modifying the same
key, so we are locking the key - just not using LockManager.
It would also require StateTransferLock.acquireSharedTopologyLock() to make
sure it doesn't write an L1 entry after the node became a proper owner.

 3. Invalidations that are received would first check to see if there is a
 current remote get occurring for its keys.  If there is, it will attempt to
 cancel the L1 write(s) before it occurs.  If it cannot cancel the L1 write,
 then it must also wait on the current remote get completion and
 subsequently run the invalidation.  Note the cancellation would fail when
 the remote get was done and it is in the middle of updating the L1, so this
 would be a very small window.

I think it would be clearer to describe this as the L1 invalidation
cancelling the remote get, not the L1 update, because the actual L1 update
can't be cancelled.

We also have to remove the logic in AbstractLockingInterceptor that skips
L1 invalidation for a key if it can't acquire a lock with a 0 timeout.
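
The cancel-or-wait handshake between an invalidation and a running remote get can be sketched as a small state machine. This is only an illustration under assumed names (PendingL1Write, its states and methods are all hypothetical, not taken from the PR): the invalidation wins if it arrives while the get is still fetching, otherwise it waits for the L1 update to finish and then removes the entry.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical per-key record of a remote get that intends to update L1.
final class PendingL1Write {
    enum State { FETCHING, WRITING, CANCELLED, DONE }

    private final AtomicReference<State> state = new AtomicReference<>(State.FETCHING);
    private final CompletableFuture<Void> finished = new CompletableFuture<>();

    // Remote-get thread: the value has arrived; true means it may update L1.
    boolean tryStartWrite() {
        return state.compareAndSet(State.FETCHING, State.WRITING);
    }

    // Remote-get thread: call after the L1 update, or after skipping it.
    void finish() {
        state.compareAndSet(State.WRITING, State.DONE);
        finished.complete(null);
    }

    // Invalidation or local write: succeeds only while the get is still fetching.
    boolean tryCancel() {
        return state.compareAndSet(State.FETCHING, State.CANCELLED);
    }

    State state() {
        return state.get();
    }

    // Cancel if possible, otherwise wait for the in-progress L1 update to
    // finish, then remove the entry. Either way no stale value survives.
    static void invalidate(PendingL1Write pending, Runnable removeFromL1) {
        if (!pending.tryCancel()) {
            pending.finished.join();
        }
        removeFromL1.run();
    }
}
```

The window in which tryCancel fails is just the time between tryStartWrite and finish, which matches the "very small window" described in the quoted text.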

 4. Local writes will also do the same thing as the invalidation with
 cancelling or waiting.  Note that non tx local writes only do L1
 invalidations and don't write the value to the data container.  The reasons
 why can be found at https://issues.jboss.org/browse/ISPN-3214

I didn't know about ISPN-3214 or that non-tx writes don't write to L1, but
it sounds fair.

 5. Writes that require the previous value and don't have it in the L1
 would also do their get operations using the same "corralling" method.

The remoteGetBeforeWrites are a bit different - they don't happen on
non-owners, they only happen on writeCH-owners that didn't receive that
entry via state transfer yet. They put the value in the InvocationContext,
but they don't write it to the data container - nor do they invalidate the
L1 entry, if it exists.

 4/5 are not currently implemented in the PR.

 This approach would use no locking for non tx caches for all L1
 operations.  The synchronization point would be done through the
 "corralling" method and invalidations/writes communicating to it.

 Transactional caches would do almost the same thing as non-tx.  Note these
 changes are not done in any way yet.

 1. Gets would now update the L1 immediately after retrieving the value
 without locking, but still using the "corralling" technique that non-tx
 does.  Previously the L1 update from a get was transactional.  This
 actually would remedy issue [1]
 2. Writes currently acquire the remote lock when committing, which is why
 tx caches are able to update the L1 with the value.  Writes would do the
 same cancellation/wait method as non-tx.
 3. Writes that require the previous value and don't have it in the L1 would
 also do their get operations using the same method.

Just like for non-tx caches, I don't think these remote gets have to be
stored in L1.

 4. For tx caches, [2] would also have to be done.

 [1] - https://issues.jboss.org/browse/ISPN-2965?focusedCommentId=12779780&p...
 [2] - https://issues.jboss.org/browse/ISPN-1540

 Also rehashing is another issue, but we should be able to acquire the
 state transfer lock before updating the L1 on a get, just like when an
 entry is committed to the data container.

The same for L1 invalidations - we don't want to remove real entries from
the data container after the local node became an owner.
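
A rough sketch of that ownership check, for both L1 writes and L1 invalidations. It does not use the real StateTransferLock API: a plain ReentrantReadWriteLock stands in for the shared/exclusive topology lock, and TopologyGuardedContainer with all its methods is a hypothetical name used only for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Predicate;

// Hypothetical sketch: L1 writes and invalidations take the shared side of a
// topology lock and are skipped once the local node owns the key.
final class TopologyGuardedContainer<K, V> {
    // Stand-in for the state transfer lock (shared = read, exclusive = write).
    private final ReentrantReadWriteLock topologyLock = new ReentrantReadWriteLock();
    private final Map<K, V> dataContainer = new ConcurrentHashMap<>();
    private Predicate<K> localOwner = key -> false;

    // A topology change takes the exclusive lock while ownership is updated.
    void installTopology(Predicate<K> newLocalOwner) {
        topologyLock.writeLock().lock();
        try {
            localOwner = newLocalOwner;
        } finally {
            topologyLock.writeLock().unlock();
        }
    }

    // Regular write of an entry this node owns (e.g. received via state transfer).
    void putOwned(K key, V value) {
        dataContainer.put(key, value);
    }

    // Returns false if the node became an owner, so no L1 entry is written.
    boolean writeL1(K key, V value) {
        topologyLock.readLock().lock();
        try {
            if (localOwner.test(key)) {
                return false;
            }
            dataContainer.put(key, value);
            return true;
        } finally {
            topologyLock.readLock().unlock();
        }
    }

    // Returns false if the node became an owner, so the real entry is kept.
    boolean invalidateL1(K key) {
        topologyLock.readLock().lock();
        try {
            if (localOwner.test(key)) {
                return false;
            }
            dataContainer.remove(key);
            return true;
        } finally {
            topologyLock.readLock().unlock();
        }
    }

    V peek(K key) {
        return dataContainer.get(key);
    }
}
```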

 Any comments/concerns would be appreciated.

 Thanks,

  - Will

 _______________________________________________
 infinispan-dev mailing list
 infinispan-dev(a)lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev
