[infinispan-dev] L1 Consistency with Sync Caches

Dan Berindei dan.berindei at gmail.com
Thu Jun 27 16:40:38 EDT 2013


On Thu, Jun 27, 2013 at 4:40 PM, William Burns <mudokonman at gmail.com> wrote:

> Comments that were outstanding on the PR:
>
> @danberindei:
>
>  > +1 to move the discussion to the mailing list, could you summarize
>  > your changes (preferably for both non-tx and tx cases) and send an
>  > email to the list?
>  > And now to add some more to this already unwieldy discussion :)
>
>  >  1. Many interceptors check ownership, I don't think that would be
>  > a problem. Besides, I think creating a new L1ReadSynchronizer for
>  > every read is just as bad for performance as locking the key on every
>  > read, so you'd need that check either way.
> >
> We can't use a try-lock approach for L1 invalidation when gets lock the
> key, without possibly dropping an L1 update.
> We can't use a timed-lock approach for L1 invalidation when writes lock
> the key, as we could get into a timed deadlock when another node
> concurrently writes to the same key/stripe.
>
> I still don't see how we can get away with locking on a get.  What are
> you proposing?
>

I'm proposing something like this:

1. For get commands: acquire local lock + remote get + store in L1 +
release local lock
2. For invalidate commands: acquire local lock + remove entry from L1 +
release local lock
3. For write commands: invoke on primary owner, and the primary owner sends
an invalidation back to the originator; alternatively, skip sending the
invalidation command to the originator and instead do invoke on primary
owner + acquire local lock + remove entry from L1 + release local lock

All the lock acquisitions would use a regular timeout, not a try lock
approach.
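
In deliberately simplified Java, the get and invalidate paths would look
like the sketch below. Every name here (remoteGet, storeInL1, removeFromL1)
is a placeholder for the real remote dispatch and data container calls, and
the single lock stands in for a per-key (or striped) lock:

    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;
    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    class L1LockingSketch {
        private final Lock localLock = new ReentrantLock(); // per-key in practice
        private final long timeoutMs = 10_000;              // regular lock timeout

        // 1. get: acquire local lock + remote get + store in L1 + release
        Object get(Object key) throws Exception {
            lockOrFail(key);
            try {
                Object value = remoteGet(key);
                if (value != null) {
                    storeInL1(key, value); // serialized with invalidations below
                }
                return value;
            } finally {
                localLock.unlock();
            }
        }

        // 2. invalidate: acquire local lock + remove entry from L1 + release
        void invalidate(Object key) throws Exception {
            lockOrFail(key);
            try {
                removeFromL1(key);
            } finally {
                localLock.unlock();
            }
        }

        private void lockOrFail(Object key) throws Exception {
            // timed acquisition, not an immediate-failure tryLock
            if (!localLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new TimeoutException("Could not acquire L1 lock for " + key);
            }
        }

        // placeholders, not the actual Infinispan API
        Object remoteGet(Object key) { return null; }
        void storeInL1(Object key, Object value) { }
        void removeFromL1(Object key) { }
    }

Because the L1 store and the invalidation take the same lock, an
invalidation can't be lost between the remote get and the L1 update.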


> >   2. By default L1 invalidations are sent as multicasts, so I'm not sure
> > ISPN-3273 really matters here. BTW, I wonder if we have a check to only
> > send L1 invalidations from one node if the threshold is 0...
> I agree that is the default, but we should support the operation,
> although it doesn't matter for this discussion.  Also I am curious
> why the L1 invalidation threshold isn't set to, say, 2 by default.  It
> seems wasteful to send a multicast that all members have to process
> when only one of them would do anything about it.  Do you know why
> this is like that?
>

I suppose it's because we don't have good perf numbers for different L1
invalidation threshold values...

The problem is, we don't have a way to count all the requestors of a key in
the cluster, so it's reasonably likely that with a threshold of 2 you'd get
1 unicast invalidation from one owner + 1 multicast invalidation from the
other owner, making it less efficient than a single multicast invalidation.
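
For reference, the knob in question is invalidationThreshold in the L1
configuration; a minimal sketch with the programmatic ConfigurationBuilder
(treat the exact fluent calls as an assumption about the current API):

    import org.infinispan.configuration.cache.CacheMode;
    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;

    public class L1ThresholdConfig {
        public static void main(String[] args) {
            Configuration config = new ConfigurationBuilder()
                .clustering().cacheMode(CacheMode.DIST_SYNC)
                .l1().enable()
                // 0 (the default) always multicasts invalidations; a positive
                // value uses unicasts while the number of known requestors
                // stays at or below it
                .invalidationThreshold(2)
                .build();
            System.out.println(config.clustering().l1().invalidationThreshold());
        }
    }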

>
> >  3a. Right, for put commands we can't hold the local lock while
> > executing the remote put, or we'll have a deadlock. But I think a shorter
> > lock, held only after the remote put completed (or after the lock on the
> > primary owner was acquired, with txs) should work.
> Same point under 1
>

I don't see how we could get a deadlock if we don't hold the local lock
during the remote write invocation.
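
To be concrete, the write path from point 3 of my proposal would only take
the local lock after the remote invocation returns, e.g. as an extra method
on the locking sketch above (same placeholder names):

    // 3. write: invoke on primary owner first, with no local lock held,
    // then take the local lock only for the brief L1 invalidation
    Object put(Object key, Object value) throws Exception {
        Object previous = invokeOnPrimaryOwner(key, value); // may block remotely
        lockOrFail(key); // short, purely local critical section
        try {
            removeFromL1(key);
        } finally {
            localLock.unlock();
        }
        return previous;
    }

    Object invokeOnPrimaryOwner(Object key, Object value) { return null; } // placeholder

The remote call holds no local lock, so a remote node waiting on our local
lock can never block our remote invocation in turn.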


> >
> >  3b. We'd also have an ownership check before, so we'd only serialize
> > the get commands that need to go remotely for the same key. I think it
> > would be almost the same as your solution (although it does have one
> > disadvantage - if the key doesn't exist in the cache, all the get commands
> > will go remotely). The number of L1 writes should be very small compared to
> > the number of L1 reads anyway, otherwise it would be more efficient to get
> > the key from the owner every time.
> You are saying an optimization for owner nodes so they don't do the
> "corralling" for keys they own?  I like that.  Also I don't think it
> has that disadvantage; it only goes remote if the node isn't an owner.
>

I meant your corralling strategy means if you have 2 concurrent get
commands and one of them retrieves a null from the entry owners, the other
command will return null directly. With regular locking, the other command
wouldn't find anything in L1 and it would do another remote get.

I don't think there's any disadvantage in skipping the corralling for key
owners, in fact I think we need to skip it if the key already exists in L1,
too.
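
To make sure we're talking about the same thing, here is how I picture the
corralling with those two short-circuits, as a simplified sketch. It uses
Java 8's CompletableFuture for brevity, whereas the PR has its own
L1ReadSynchronizer, so treat this as illustrative rather than the actual
implementation:

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    class CorralledL1Reader<K, V> {
        private final ConcurrentMap<K, CompletableFuture<V>> pendingGets =
                new ConcurrentHashMap<K, CompletableFuture<V>>();

        V get(K key) throws Exception {
            if (isOwner(key)) {
                return readLocal(key); // owners never corral
            }
            V cached = readL1(key);
            if (cached != null) {
                return cached; // L1 hit, skip the corralling too
            }
            CompletableFuture<V> mine = new CompletableFuture<V>();
            CompletableFuture<V> existing = pendingGets.putIfAbsent(key, mine);
            if (existing != null) {
                // Another thread is already fetching this key: wait for it.
                // Note the downside: a null result is shared with all waiters.
                return existing.get();
            }
            try {
                V value = remoteGet(key); // single remote get for all waiters
                if (value != null) {
                    writeL1(key, value); // the step an invalidation must cancel
                }
                mine.complete(value);
                return value;
            } catch (Exception e) {
                mine.completeExceptionally(e);
                throw e;
            } finally {
                pendingGets.remove(key, mine);
            }
        }

        // placeholders for ownership checks and container/remote access
        boolean isOwner(K key) { return false; }
        V readLocal(K key) { return null; }
        V readL1(K key) { return null; }
        V remoteGet(K key) { return null; }
        void writeL1(K key, V value) { }
    }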


> >
> > It would be nice to agree on what guarantees we want to provide for L1
> > invalidation in non-tx caches; I'm not sure if we can do anything to
> > prevent this scenario:
> Actually this scenario doesn't occur with non-tx since writes don't
> update the L1 with their value, they just invalidate.  Tx caches are
> fine with this because they hold the primary owner's lock for the
> duration of the write, including the L1 update, so you can't have this
> ordering.
>

Sounds good.


> >
> > A initiates a put(k, v1) to the primary owner B
> > B performs the put(k, v1), invalidates every non-owner and returns
> > B performs another put(k, v2), invalidating every non-owner
> > A receives the result from B and puts k=v1 in its L1
>
> @pruivo:
>
> > The invalidation does not need to wait for the remote get. When you
> > receive an invalidation, you can mark the current remote get invalid. The
> > invalidation command can return immediately and the remote get can be
> > repeated. Also, it removes the key from the data container (if it exists).
> Dan hit it right on the head.  Unfortunately there is no guarantee the
> cancellation can work properly, so it is best effort; if it fails, we
> wait until we know the invalidation will be applied properly.
> > The writes can update the L1 through your L1Synchronized by adding a
> > simple method like updateL1(newValue). The blocking threads will return
> > the new value immediately and they don't need to wait for the reply.
> Non-tx cache write operations aren't safe to update the L1 with the value
> since they don't hold the owner's lock while updating the L1, which
> means you could have interleaved writes.  That is the primary reason
> I rejected ISPN-3214.  For tx caches we can't do this since the
> update has to be part of the tx, whereas the get would be updating the
> L1 outside of a transaction.
> > I see... However, I think that all the events should synchronize at some
> > point (update by remote get, update by local put and invalidation).
> I was hoping that would cover this, other than the outstanding issue
> in ISPN-2965.
>
> On Thu, Jun 27, 2013 at 9:18 AM, William Burns <mudokonman at gmail.com>
> wrote:
> > First off I apologize for the length.
> >
> > There have been a few Jiras recently that have identified L1 consistency
> > issues with both TX and non-TX sync caches.  Async caches with L1 have
> > their own issues as well, but I only wanted to talk about sync caches.
> >
> > https://issues.jboss.org/browse/ISPN-3197
> > https://issues.jboss.org/browse/ISPN-2965
> > https://issues.jboss.org/browse/ISPN-2990
> >
> > I have proposed a solution in
> > https://github.com/infinispan/infinispan/pull/1922 which should start L1
> > consistency down the right track.  There are quite a few comments on it
> > if you want to look into it more, but because of that I am moving this
> > to the dev mailing list.
> >
> > The key changes in the PR are the following (non-tx):
> >
> > 1. Concurrent reads for a key that can retrieve a remote value are
> > "corralled" into a single thread of execution for that given key.  This
> > would reduce network traffic with concurrent gets for the same key.  Note
> > the "corralling" only happens on a per key basis.
> > 2. The single thread that is doing the remote get would update the L1 if
> > able (without locking) and make the value available to all the requests
> > waiting on the get.
> > 3. Invalidations that are received would first check to see if there is
> > a current remote get occurring for its keys.  If there is, it will
> > attempt to cancel the L1 write(s) before it occurs.  If it cannot cancel
> > the L1 write, then it must also wait for the current remote get to
> > complete and subsequently run the invalidation.  Note the cancellation
> > would only fail when the remote get was done and it is in the middle of
> > updating the L1, so this would be a very small window.
> > 4. Local writes will also do the same thing as the invalidation with
> > cancelling or waiting.  Note that non-tx local writes only do L1
> > invalidations and don't write the value to the data container.  The
> > reasons why can be found at https://issues.jboss.org/browse/ISPN-3214
> > 5. Writes that require the previous value and don't have it in the L1
> > would also do their get operations using the same "corralling" method.
> >
> > 4/5 are not currently implemented in the PR.
> >
> > This approach would use no locking for non-tx caches for any L1
> > operations.  The synchronization point would be the "corralling" method
> > itself, with invalidations/writes communicating with it.
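
A rough sketch of the cancel-or-wait handshake in points 3/4 above, with
illustrative names (the PR's L1ReadSynchronizer is the real implementation):

    // One instance per in-flight corralled remote get.
    class PendingL1Write {
        private boolean cancelled;  // an invalidation/write arrived first
        private boolean writingL1;  // the reader has begun the L1 update
        private boolean done;       // the remote get fully completed

        // Reader thread: call just before storing the value in L1. Returns
        // false if an invalidation won the race; the value is still handed
        // to the waiting gets, but it must not be stored in L1.
        synchronized boolean tryStartL1Write() {
            if (cancelled) {
                return false;
            }
            writingL1 = true;
            return true;
        }

        // Reader thread: call once the remote get (and any L1 update) ends.
        synchronized void markDone() {
            done = true;
            notifyAll();
        }

        // Invalidation or local write: cancel the pending L1 update if it
        // has not started, otherwise wait for it to finish before removing
        // the entry from the data container. The only window where waiting
        // is needed is the L1 update itself, which is very small.
        synchronized void cancelOrAwait() throws InterruptedException {
            cancelled = true;
            while (writingL1 && !done) {
                wait();
            }
        }
    }

The monitor makes the cancelled/writingL1 checks atomic, so an invalidation
can never slip in between the reader's check and its L1 store.
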
> >
> > Transactional caches would do almost the same thing as non-tx.  Note
> > that these changes are not implemented yet.
> >
> > 1. Gets would now update the L1 immediately after retrieving the value
> > without locking, but still using the "corralling" technique that non-tx
> > does.  Previously the L1 update from a get was transactional.  This
> > would actually remedy issue [1].
> > 2. Writes currently acquire the remote lock when committing, which is
> > why tx caches are able to update the L1 with the value.  Writes would
> > do the same cancellation/wait method as non-tx.
> > 3. Writes that require the previous value and don't have it in the L1
> > would also do their get operations using the same method.
> > 4. For tx caches, [2] would also have to be done.
> >
> > [1] - https://issues.jboss.org/browse/ISPN-2965?focusedCommentId=12779780&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12779780
> > [2] - https://issues.jboss.org/browse/ISPN-1540
> >
> > Also rehashing is another issue, but we should be able to acquire the
> > state transfer lock before updating the L1 on a get, just like when an
> > entry is committed to the data container.
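
A sketch of that guard - the method names are modeled on Infinispan's
StateTransferLock, but treat the exact calls as an assumption:

    // Guard the L1 store on a get with the shared topology lock, so a
    // concurrent rehash can't transfer state and then miss this entry.
    void storeInL1Guarded(org.infinispan.statetransfer.StateTransferLock stLock,
                          Object key, Object value) {
        stLock.acquireSharedTopologyLock();
        try {
            storeInL1(key, value); // same placeholder as the earlier sketches
        } finally {
            stLock.releaseSharedTopologyLock();
        }
    }
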
> >
> > Any comments/concerns would be appreciated.
> >
> > Thanks,
> >
> >  - Will
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>