[infinispan-dev] L1 Consistency with Sync Caches

Dan Berindei dan.berindei at gmail.com
Fri Jun 28 09:11:32 EDT 2013


On Fri, Jun 28, 2013 at 3:41 PM, William Burns <mudokonman at gmail.com> wrote:

> On Fri, Jun 28, 2013 at 5:53 AM, Dan Berindei <dan.berindei at gmail.com>
> wrote:
> >
> >
> > On Fri, Jun 28, 2013 at 12:51 AM, William Burns <mudokonman at gmail.com>
> > wrote:
> >>
> >> On Thu, Jun 27, 2013 at 4:40 PM, Dan Berindei <dan.berindei at gmail.com>
> >> wrote:
> >> >
> >> >
> >> > On Thu, Jun 27, 2013 at 4:40 PM, William Burns <mudokonman at gmail.com>
> >> > wrote:
> >> >>
> >> >> Comments that were outstanding on PR:
> >> >>
> >> >> @danberindei:
> >> >>
> >> >>  > +1 to move the discussion to the mailing list, could you summarize
> >> >> your changes (preferably for both non-tx and tx cases) and send an
> >> >> email to the list?
> >> >>  > And now to add some more to this already unwieldy discussion :)
> >> >>
> >> >>  >  1. Many interceptors check ownership, I don't think that would be
> >> >> a problem. Besides, I think creating a new L1ReadSynchronizer for
> >> >> every read is just as bad for performance as locking the key on every
> >> >> read, so you'd need that check either way.
> >> >> >
> >> >> We can't use a try lock approach for L1 invalidation when gets are
> >> >> locked without possibly dropping an L1 update.
> >> >> We can't use a timed lock approach for L1 invalidation when writes
> >> >> lock the key, as we could get into a timed deadlock situation when
> >> >> another node concurrently writes to a value/stripe.
> >> >>
> >> >> I still don't see how we can get away with locking on a get.  What
> >> >> are you proposing?
> >> >
> >> >
> >> > I'm proposing something like this:
> >> >
> >> > 1. For get commands: acquire local lock + remote get + store in L1 +
> >> > release local lock
> >> > 2. For invalidate commands: acquire local lock + remove entry from L1 +
> >> > release local lock
> >> > 3. For write commands: invoke on primary owner, and the primary owner
> >> > sends invalidation back to originator; alternatively, skip the
> >> > invalidation command to the originator and do invoke on primary owner +
> >> > acquire local lock + remove entry from L1 + release local lock
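
For illustration only, here is a rough sketch of how points 1 and 2 of the
quoted proposal could look (a per-key local lock covering the remote get + L1
store on reads, and the L1 removal on invalidations). The class and names are
made up for the example, and a plain ConcurrentHashMap stands in for the real
L1 data container.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.ReentrantLock;

    // Sketch only: the same per-key local lock guards the remote get + L1 store
    // and the L1 removal, so the two can never interleave.
    final class L1LockingSketch {
        private final Map<Object, ReentrantLock> locks = new ConcurrentHashMap<>();
        private final Map<Object, Object> l1 = new ConcurrentHashMap<>(); // stand-in for the L1 container
        private final long timeoutMs = 10_000;                            // regular lock timeout, not tryLock(0)

        Object getWithL1(Object key, RemoteGet remote) throws Exception {
            ReentrantLock lock = lockFor(key);
            if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS))
                throw new IllegalStateException("Timed out acquiring local L1 lock for " + key);
            try {
                Object value = remote.get(key);  // RPC to the owners
                if (value != null)
                    l1.put(key, value);          // store in L1 while still holding the local lock
                return value;
            } finally {
                lock.unlock();
            }
        }

        void invalidate(Object key) throws Exception {
            ReentrantLock lock = lockFor(key);
            if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS))
                throw new IllegalStateException("Timed out acquiring local L1 lock for " + key);
            try {
                l1.remove(key);                  // remove the entry from L1
            } finally {
                lock.unlock();
            }
        }

        private ReentrantLock lockFor(Object key) {
            return locks.computeIfAbsent(key, k -> new ReentrantLock());
        }

        interface RemoteGet { Object get(Object key) throws Exception; }
    }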
> >>
> >> I do like this and think it would be simpler for non-tx.  However I
> >> suspect this will still not be as performant (just a gut feeling).
> >
> >
> > It would indeed be less performant in some cases, e.g. if we have two get
> > commands for a remote key at the same time and the key doesn't exist, my
> > proposal would request the key twice.
> >
> > The writes to L1 *should* be rare enough that the extra lock contention
> > there would be minimal. There is a chance that an invalidation will be
> > stuck waiting for a remote get, so it will again be slower than your
> > approach.
> >
> >>  However I am more than certain it will be slower when collisions
> >> occur, which is more likely with lock striping.
> >>
> >
> > I'm not that worried about lock striping - the user can always increase
> > the number of stripes to get less contention. And I hope they know better
> > than to use lock striping and transactions with multiple keys, or they'll
> > get deadlocks with or without L1 enabled.
>
> I didn't think of the deadlock problem with tx and lock striping.  I
> am wondering whether we should log a warning message so that users get
> an explicit warning?
>

Yeah, I think a warning would be nice - especially in the documentation, it
looks way too innocuous.
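
For reference, lock striping is controlled through the locking configuration.
Assuming the standard ConfigurationBuilder API, it would look roughly like the
snippet below (the concurrency level value is an arbitrary example).

    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;

    public class LockStripingExample {
        // Sketch: enable lock striping and raise the number of stripes to reduce
        // contention. With transactions touching multiple keys, striping can map
        // unrelated keys to the same stripe and produce the deadlocks mentioned
        // above.
        public static Configuration lockStripingConfig() {
            return new ConfigurationBuilder()
                  .locking()
                     .useLockStriping(true)
                     .concurrencyLevel(1024) // arbitrary example value
                  .build();
        }
    }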

>
> >>
> >> I don't think this however is feasible for tx caches.
> >>
> >> Optimistic - we don't want to have gets being blocked when the tx is
> >> being committed - that could be quite a bit of a performance hit
> >> especially if it has to do a 2 phase commit.  Updating the L1 on a
> >> remoteGetBeforeWrites could help remedy this (since the get wouldn't need
> >> to lock), however then you would be almost implementing what I have
> >> and still have locking overhead.  Maybe this would occur so
> >> infrequently it wouldn't matter though?
> >>
> >
> > Yes, I would propose removing the L1 commit in EntryWrappingInterceptor
> > as well, and moving it to the L1 interceptor - immediately after the
> > value was retrieved from the remote owner. The local lock would also be
> > released immediately.
> >
> > The part I wanted to avoid from your proposal was
> > L1ReadSync/L1ReadSynchronizer, because I found it kind of hard to
> > understand. I guess I'd find it more approachable if all the
> > synchronization (the map of L1ReadSynchronizers and the interaction with
> > it) were in a separate class, with its own tests, and the L1 interceptor
> > would only know about the high-level stuff. Like how the LockManager also
> > hides a map of locks; not like the LockManager unit tests, because they're
> > kind of lacking, but LockManager has the advantage that almost all the
> > tests touch it indirectly.
> >
>
> True, I agree - I like it better as a separate class, much better
> separation.  Maybe I can move that over and add tests and see if that
> makes it clearer?  If not we can still always use the locking
> approach.
>

+1
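
Something in this direction, perhaps: a small registry class that hides the
map of per-key synchronizers the same way LockManager hides its map of locks,
so the L1 interceptor only sees register/join/complete. The names and the use
of CompletableFuture below are just for illustration, not the PR's actual
L1ReadSynchronizer.

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    // Illustrative only - the interceptor would depend on this class and never
    // touch the underlying map directly, which also makes it easy to unit test.
    final class L1ReadRegistry {
        private final ConcurrentMap<Object, CompletableFuture<Object>> pendingReads =
                new ConcurrentHashMap<>();

        /**
         * Returns null if the caller is the first reader for the key and must
         * perform the remote get itself; otherwise returns the in-flight read
         * the caller should wait on.
         */
        CompletableFuture<Object> registerOrJoin(Object key) {
            return pendingReads.putIfAbsent(key, new CompletableFuture<>());
        }

        void complete(Object key, Object value) {
            CompletableFuture<Object> read = pendingReads.remove(key);
            if (read != null) read.complete(value);
        }

        void fail(Object key, Throwable cause) {
            CompletableFuture<Object> read = pendingReads.remove(key);
            if (read != null) read.completeExceptionally(cause);
        }
    }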


> >>
> >> Pessimistic - I haven't looked into pessimistic closely yet, but in
> >> that case I don't think it ever acquires the local locks so it would
> >> have to acquire a remote lock on a get, which would be pretty
> >> disastrous (gets would indirectly have a weak FORCE_WRITE_LOCK
> >> enabled).  Making this acquire local locks might work as well, but
> >> then it would still have the same issue as optimistic when committing - I
> >> really am not familiar enough with Pessimistic to say for sure.
> >>
> >
> > Nope, acquiring a local lock should be enough. We don't want to prevent
> > other txs from modifying the key at all, we just want to delay the
> > invalidation commands triggered by those modifications on the local node.
> > And the lock would only be held for the duration of the remote get + L1
> > update, not all the way to the commit.
> >
> Okay, cool.  I didn't know for sure.
> >>
> >> >
> >> > All the lock acquisitions would use a regular timeout, not a try lock
> >> > approach.
> >> In this case they could, yes.
> >> >
> >> >>
> >> >> >   2. By default L1 invalidations are sent as multicasts, so I'm not
> >> >> > sure
> >> >> > ISPN-3273 really matters here. BTW, I wonder if we have a check to
> >> >> > only send
> >> >> > L1 invalidations from one node if the threshold is 0...
> >> >> I agree that is the default, but we should support the operation,
> >> >> although it doesn't matter for this discussion.  Also I am curious as
> >> >> to why the L1 invalidation threshold isn't set to, say, 2 by default?
> >> >> It seems wasteful to send a multicast that all members have to process
> >> >> when only 1 would do anything about it.  Do you know why it is like
> >> >> that?
> >> >
> >> >
> >> > I suppose it's because we don't have good perf numbers for different
> >> > L1 invalidation threshold numbers...
> >> >
> >> > The problem is, we don't have a way to count all the requestors of a
> >> > key in the cluster, so it's reasonably likely that with a threshold of
> >> > 2 you'd get 1 unicast invalidation from one owner + 1 multicast
> >> > invalidation from the other owner, making it less efficient than a
> >> > single multicast invalidation.
> >>
> >> It is a nitpick anyways, and really shouldn't make that big of a
> >> difference.
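
As an aside, the number being discussed is the L1 invalidation threshold.
Assuming the usual configuration builder API, it is set roughly as below; 0,
the default, means invalidations are always multicast.

    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;

    public class L1ThresholdExample {
        // Sketch: with a threshold of 0 the owners always multicast L1
        // invalidations; a positive value makes them unicast to known requestors
        // while the requestor count stays below the threshold.
        public static Configuration l1Config() {
            return new ConfigurationBuilder()
                  .clustering()
                     .l1().enable()
                     .invalidationThreshold(0) // example value; 0 = always multicast
                  .build();
        }
    }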
> >>
> >>
> >> >
> >> >> >
> >> >> >  3a. Right, for put commands we can't hold the local lock while
> >> >> > executing the remote put, or we'll have a deadlock. But I think a
> >> >> > shorter lock, held only after the remote put completed (or after the
> >> >> > lock on the primary owner was acquired, with txs) should work.
> >> >> Same point under 1
> >> >
> >> >
> >> > I don't see how we could get a deadlock if we don't hold the local
> >> > lock during the remote write invocation.
> >>
> >> Agree.
> >>
> >> >
> >> >>
> >> >> >
> >> >> >  3b. We'd also have an ownership check before, so we'd only
> >> >> > serialize the get commands that need to go remotely for the same
> >> >> > key. I think it would be almost the same as your solution (although
> >> >> > it does have one disadvantage - if the key doesn't exist in the
> >> >> > cache, all the get commands will go remotely). The number of L1
> >> >> > writes should be very small compared to the number of L1 reads
> >> >> > anyway, otherwise it would be more efficient to get the key from the
> >> >> > owner every time.
> >> >> You are saying an optimization for owner nodes so they don't do the
> >> >> "corralling" for keys they own?  I like that.  Also I don't think it
> >> >> has that disadvantage, since it only goes remote if the node isn't an
> >> >> owner.
> >> >
> >> >
> >> > I meant that with your corralling strategy, if you have 2 concurrent
> >> > get commands and one of them retrieves a null from the entry owners,
> >> > the other command will return null directly. With regular locking, the
> >> > other command wouldn't find anything in L1 and it would do another
> >> > remote get.
> >> >
> >> > I don't think there's any disadvantage in skipping the corralling for
> >> > key owners, in fact I think we need to skip it if the key already
> >> > exists in L1, too.
> >> +1
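
In other words, a check along these lines would sit in front of the
corralling. The names are invented: a predicate stands in for the real
ownership check and a map stands in for the data container.

    import java.util.Map;
    import java.util.function.Predicate;

    final class CorrallingCheck {
        // Sketch only: a get is corralled just when it actually has to go remote.
        static boolean needsCorralling(Object key, Predicate<Object> localNodeIsOwner,
                                       Map<Object, ?> dataContainer) {
            if (localNodeIsOwner.test(key)) return false;     // owners read locally, no L1 involved
            if (dataContainer.containsKey(key)) return false; // value already present in L1
            return true;
        }
    }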
> >> >
> >> >>
> >> >> >
> >> >> > It would be nice to agree on what guarantees we want to provide for
> >> >> > L1 invalidation in non-tx caches, I'm not sure if we can do anything
> >> >> > to prevent this scenario:
> >> >> Actually this scenario doesn't occur with non-tx since writes don't
> >> >> update the L1 with their value, they just invalidate.  Tx caches are
> >> >> fine with this because they acquire the primary owner lock for the
> >> >> duration of the write including the L1 update so you can't have this
> >> >> ordering.
> >> >
> >> >
> >> > Sounds good.
> >> >
> >> >>
> >> >> >
> >> >> > A initiates a put(k, v1) to the primary owner B
> >> >> > B performs the put(k, v1), invalidates every non-owner and returns
> >> >> > B performs another put(k, v2), invalidating every non-owner
> >> >> > A receives the result from B and puts k=v1 in its L1
> >> >>
> >> >> @pruivo:
> >> >>
> >> >> > The invalidation does not need to wait for the remote get. When you
> >> >> > receive an invalidation, you can mark the current remote get
> >> >> > invalid. The invalidation command can return immediately and the
> >> >> > remote get can be repeated. Also, it removes the key from the data
> >> >> > container (if it exists).
> >> >> Dan hit it right on the head.  Unfortunately there is no guarantee the
> >> >> cancellation can work properly, so it is a best effort, and if it
> >> >> cannot cancel we wait until we know we will invalidate properly.
> >> >> > The writes can update the L1 through your L1Synchronizer by adding a
> >> >> > simple method like updateL1(newValue). The blocking threads will
> >> >> > immediately return the new value and they don't need to wait for the
> >> >> > reply.
> >> >> Non tx cache write operations aren't safe to update the L1 with their
> >> >> value since they don't acquire the owning lock while updating the L1,
> >> >> which means you could have interleaved writes.  That is the primary
> >> >> reason I rejected ISPN-3214.  For tx caches we can't do this since the
> >> >> update has to be part of the tx, whereas the get would be updating the
> >> >> L1 outside of a transaction.
> >> >> > I see... However, I think that all the events should synchronize at
> >> >> > some point (update by remote get, update by local put and
> >> >> > invalidation).
> >> >> I was hoping that would cover this.  Other than the outstanding issue
> >> >> in ISPN-2965.
> >> >>
> >> >> On Thu, Jun 27, 2013 at 9:18 AM, William Burns <mudokonman at gmail.com>
> >> >> wrote:
> >> >> > First off I apologize for the length.
> >> >> >
> >> >> > There have been a few Jiras recently that have identified L1
> >> >> > consistency
> >> >> > issues with both TX and non TX sync caches.  Async caches with L1
> >> >> > have
> >> >> > their
> >> >> > own issues as well, but I only wanted to talk about sync caches.
> >> >> >
> >> >> > https://issues.jboss.org/browse/ISPN-3197
> >> >> > https://issues.jboss.org/browse/ISPN-2965
> >> >> > https://issues.jboss.org/browse/ISPN-2990
> >> >> >
> >> >> > I have proposed a solution in
> >> >> > https://github.com/infinispan/infinispan/pull/1922 which should
> >> >> > start L1 consistency down the right track.  There are quite a few
> >> >> > comments on it if you want to look into it more, but because of that
> >> >> > I am moving this to the dev mailing list.
> >> >> >
> >> >> > The key changes in the PR are the following (non-tx):
> >> >> >
> >> >> > 1. Concurrent reads for a key that can retrieve a remote value are
> >> >> > "corralled" into a single thread of execution for that given key.
> >> >> > This
> >> >> > would reduce network traffic with concurrent gets for the same key.
> >> >> > Note
> >> >> > the "corralling" only happens on a per key basis.
> >> >> > 2. The single thread that is doing the remote get would update the
> >> >> > L1 if able (without locking) and make the value available to all the
> >> >> > requests waiting on the get.
> >> >> > 3. Invalidations that are received would first check to see if there
> >> >> > is a current remote get occurring for its keys.  If there is, it
> >> >> > will attempt to cancel the L1 write(s) before it occurs.  If it
> >> >> > cannot cancel the L1 write, then it must also wait on the current
> >> >> > remote get's completion and subsequently run the invalidation.  Note
> >> >> > the cancellation would only fail when the remote get was done and it
> >> >> > is in the middle of updating the L1, so this would be a very small
> >> >> > window.
> >> >> > 4. Local writes will also do the same thing as the invalidation,
> >> >> > with cancelling or waiting.  Note that non tx local writes only do
> >> >> > L1 invalidations and don't write the value to the data container.
> >> >> > The reasons why can be found at
> >> >> > https://issues.jboss.org/browse/ISPN-3214
> >> >> > 5. Writes that require the previous value and don't have it in the
> >> >> > L1 would also do their get operations using the same "corralling"
> >> >> > method.
> >> >> >
> >> >> > 4/5 are not currently implemented in PR.
> >> >> >
> >> >> > This approach would use no locking for non tx caches for all L1
> >> >> > operations.  The synchronization would be done through the
> >> >> > "corralling" method, with invalidations/writes communicating with
> >> >> > it.
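
For illustration, a rough and much simplified sketch of how points 1-3 above
fit together: one pending remote get per key, and an invalidation first tries
to cancel the pending L1 write, only waiting for the remote get if it loses
that race. All names are invented and this glosses over details of the actual
PR.

    import java.util.concurrent.Callable;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.CompletionException;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    final class CorrallingSketch {
        private final ConcurrentMap<Object, PendingRead> pending = new ConcurrentHashMap<>();
        private final ConcurrentMap<Object, Object> l1 = new ConcurrentHashMap<>();

        Object get(Object key, Callable<Object> remoteGet) throws Exception {
            PendingRead mine = new PendingRead();
            PendingRead existing = pending.putIfAbsent(key, mine);
            if (existing != null)
                return existing.result.get();         // corralled: wait for the single in-flight get

            try {
                Object value = remoteGet.call();      // the one thread that goes remote
                if (value != null && !mine.cancelled) // skip the L1 write if an invalidation cancelled it
                    l1.put(key, value);
                mine.result.complete(value);
                return value;
            } catch (Exception e) {
                mine.result.completeExceptionally(e);
                throw e;
            } finally {
                pending.remove(key, mine);
            }
        }

        void invalidate(Object key) {
            PendingRead inFlight = pending.get(key);
            if (inFlight != null) {
                inFlight.cancelled = true;            // best effort: try to cancel the L1 write
                try {
                    inFlight.result.join();           // too late to cancel? wait for the get to finish
                } catch (CompletionException ignored) {
                    // the remote get failed, so nothing was written to L1
                }
            }
            l1.remove(key);                           // then remove the (possibly stale) entry
        }

        private static final class PendingRead {
            final CompletableFuture<Object> result = new CompletableFuture<>();
            volatile boolean cancelled;
        }
    }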
> >> >> >
> >> >> > Transactional caches would do almost the same thing as non-tx.
> >> >> > Note these changes are not done in any way yet.
> >> >> >
> >> >> > 1. Gets would now update the L1 immediately after retrieving the
> >> >> > value, without locking, but still using the "corralling" technique
> >> >> > that non-tx does.  Previously the L1 update from a get was
> >> >> > transactional.  This actually would remedy issue [1].
> >> >> > 2. Writes currently acquire the remote lock when committing, which
> >> >> > is why tx caches are able to update the L1 with the value.  Writes
> >> >> > would use the same cancellation/wait method as non-tx.
> >> >> > 3. Writes that require the previous value and don't have it in the
> >> >> > L1 would also do their get operations using the same method.
> >> >> > 4. For tx caches, [2] would also have to be done.
> >> >> >
> >> >> > [1] - https://issues.jboss.org/browse/ISPN-2965?focusedCommentId=12779780&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12779780
> >> >> > [2] - https://issues.jboss.org/browse/ISPN-1540
> >> >> >
> >> >> > Also rehashing is another issue, but we should be able to acquire
> >> >> > the state transfer lock before updating the L1 on a get, just like
> >> >> > when an entry is committed to the data container.
> >> >> >
> >> >> > Any comments/concerns would be appreciated.
> >> >> >
> >> >> > Thanks,
> >> >> >
> >> >> >  - Will
> >> >>