[infinispan-dev] Per-entry lock container

Wed Oct 17 13:19:55 EDT 2012

On Oct 17, 2012, at 11:44 AM, Dan Berindei <dan.berindei at gmail.com> wrote:

> On Wed, Oct 17, 2012 at 6:46 PM, Jason Greene <jason.greene at redhat.com> wrote:
> 
> On Oct 17, 2012, at 8:18 AM, Dan Berindei <dan.berindei at gmail.com> wrote:
> 
> >
> > On Wed, Oct 17, 2012 at 3:51 PM, Manik Surtani <manik at jboss.org> wrote:
> > The problem is that I have 2 code paths:
> >
> > 1.  Acquiring a lock.
> > 1.1. Check CHM for lock
> > 1.2. If lock == null, create new lock, do a putIfAbsent on CHM
> > 1.3. acquire the lock
> > 1.4. Check (in a loop) if the lock we have acquired is the same as the lock in the CHM, if not, release and retry.
> >
> >
> > I assume the retry goes back to 1.2?
> >
> >
> > 2.  Releasing the lock.
> > 2.1.  Get lock from CHM
> > 2.2.  Release lock.
> > 2.3.  Remove lock from CHM
> >
> >
> > I believe you need to remove the lock from the CHM before you release the lock, otherwise a thread can acquire the lock before you've removed it. The release should look something like this:
> >
> > 2.1 Get the lock from CHM
> > 2.2. Check if the current thread is the owner, if not throw exception
> > 2.3. Remove lock from CHM
> > 2.4. Release lock
> 
> Exactly.
> 
> >
> > It's not very fair, because threads that try to lock this key after we have removed the lock from the CHM have an advantage compared to threads that have been waiting for a while on our lock and now have to acquire the lock, unlock, and try again to lock on a new key.
> 
> Well the locks already aren't fair right? Also aren't the only cases of scenario 2 key removal and rollback of a put? So we would be talking insert/remove competition on the same key.
> 
> 
> The key lock's lifecycle is not tied to the entry's lifecycle, instead it lives just for the duration of a transaction. As soon as the tx is committed/rolled back, the lock is removed, and any tx that was waiting on the lock will have to create its own lock.

Ah ok so it is a very big problem then. I agree then you need reference counting. You want that lock to stay around for as long as you have contention.

>  
> > This putIfAbsent+lock+unlock loop may hurt performance in high contention cases as well, and using a reference counting scheme would avoid it in most cases.
> 
> That's what I was getting at earlier with a CAS state value. The removal case you can make it so the last waiter has the ability to remove the lock. However the cost is that every lock attempt has to CAS a value, which if what I mention above is considered the "less common" case, it would likely impact performance of the "more common cases". You could make it a weak CAS, but that will be of marginal benefit.  On the putIfAbsent case though, if Manik is now using computeIfAbsent you can do a fast path lock acquire. When you create the lock, no other threads can see it, so you just do a volatile write on the lock state and owner values. This eliminates any need to spin for insert.
> 
> 
> Every lock/unlock already requires one "successful" CAS in CHM to put/remove the lock. With refcounting we'd basically replace that with a "successful" CAS to increment/decrement the reference counter in some of the cases (The AtomicInteger constructor doesn't need a CAS, so we only do the CAS if we did find the key).

Good point. You also don't need to touch it if you already hold the lock, so it only impacts waiters.
> 
>  
> >
> >
> > The problem is that we may have several competing threads in code path 1.  Imagine T1, waiting on 1.3., and T2, who owns the lock, releasing it in 2.2.  With some unfortunate timing, I have seen:
> >
> > * T1 acquires the lock (1.3), does checks (1.4) and leaves code path 1.
> > * T2 removes the lock from the CHM (2.3)
> > * T3 comes in to code path 1, sees the lock missing from the CHM, creates a new lock acquires it, etc.
> > * T1 now tries to unlock the lock it thinks it owns.  Finds a different lock instance in the CHM.  All sorts of problems by this stage.
> >
> > I have a working branch where I solved this by:
> >
> > * Replacing the putIfAbsent in 1.2 with a compute() closure, which means the null check and store of the value is atomic wrt. any other modification on the same key. (with pIA, the null check didn't seem to be atomic?!)
> > * Replacing 2.1, 2.2 and 2.3 with a computeIfPresent() to release the lock and remove.
> >
> > This seems to have solved things, because step 1.4 cannot proceed until any remove operation completes (competing for the same entry space in the map) and process 1 cannot get beyond 1.3 since the lock is still held by the releasing thread in step 2.  But I'm still testing further in case of edge cases.
> >
> >
> > Not sure about the safety of the computeIfAbsent/computeIfPresent approach, as I don't have any experience with it, but doesn't CHMV8 use unsafe operations that prevent us from using in SecurityManager scenarios?
> >
> >
> >
> > On 16 Oct 2012, at 16:33, Sanne Grinovero <sanne at infinispan.org> wrote:
> >
> >>> But still, since the process of creating, acquiring and adding the lock to the lock map needs to be atomic, and not just atomic but also safe with regards to competing threads (say, an old owner) releasing the lock and removing it from the map (also atomic), I think a concurrent map isn't good enough anymore.
> >>
> >> are you sure that adding + creating + acquiring needs to be atomic?
> >>
> >> I see several solutions by using the standard CHM; assuming you
> >> already checked for Lock existence:
> >> - create a Lock, use a putIfAbsent to store it, then *try* to acquire it
> >> - create a Lock, acquire it, putIfAbsent to store it, throw your
> >> instance away if you failed to store it and try again by starting over
> >> from _get_ to lock on an eventually created new instance.
> >>
> >> For removal of the Lock I assume you're only allowed to do that when
> >> owning it? Which is even simpler.
> >> - CHM.remove(), release the returned value;
> >> Take care that threads waiting on that removed lock don't assume they
> >> acquired it when they get this instance but go through the acquire
> >> routine again, or they might end up owning the wrong instance;
> >> basically after a succesfull acquire you'll have to check you didn't
> >> get an outdated instance.
> >>
> >> What I don't like of this is that you need to get/put on the same key
> >> multiple times, hashing the key over and over, looking up the same
> >> segment again and potentially traversing segment locks for each of
> >> these steps: the V8 solution from Jason looks like far more efficient.
> >>
> >> Sanne
> >>
> >> On 16 October 2012 11:32, Manik Surtani <manik at jboss.org> wrote:
> >>>
> >>> On 16 Oct 2012, at 11:03, Dan Berindei <dan.berindei at gmail.com> wrote:
> >>>
> >>> Manik, how about adding a reference count to the lock entry? If there is a
> >>> waiter on the lock, the reverence count will be > 0 and the owner won't
> >>> remove the key on unlock.
> >>>
> >>>
> >>> Then you have a race on reading the counter/removing.
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, Oct 16, 2012 at 3:43 AM, Manik Surtani <manik at jboss.org> wrote:
> >>>>
> >>>> Hmm, that actually might just do the trick.  Thanks!
> >>>>
> >>>> On 15 Oct 2012, at 17:46, Jason Greene <jason.greene at redhat.com> wrote:
> >>>>
> >>>> I think what you are looking for is this:
> >>>>
> >>>>
> >>>> http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166edocs/jsr166e/ConcurrentHashMapV8.html#computeIfAbsent(K,
> >>>> jsr166e.ConcurrentHashMapV8.Fun)
> >>>>
> >>>> On Oct 15, 2012, at 11:23 AM, Manik Surtani <manik at jboss.org> wrote:
> >>>>
> >>>> Guys, after investigating https://issues.jboss.org/browse/ISPN-2381 and
> >>>> https://github.com/infinispan/infinispan/pull/1382, I've discovered a pretty
> >>>> nasty race condition in our per-entry lock containers (whether
> >>>> OwnableReentrantLocks or JDK locks for non-transactional caches).
> >>>>
> >>>> The problem is that we maintain a lock map, and any given request can
> >>>> acquire a lock, if a lock doesn't exist for a given key, create the lock and
> >>>> acquire it, and when done, release the lock and remove it from the lock map.
> >>>> There's lots of room for races to occur.  The current impl uses a
> >>>> ConcurrentMap, where concurrent operations on the map are used to make sure
> >>>> locks are not overwritten.  But still, since the process of creating,
> >>>> acquiring and adding the lock to the lock map needs to be atomic, and not
> >>>> just atomic but also safe with regards to competing threads (say, an old
> >>>> owner) releasing the lock and removing it from the map (also atomic), I
> >>>> think a concurrent map isn't good enough anymore.
> >>>>
> >>>> The sledgehammer approach is to synchronise on this map for these two
> >>>> operations, but that causes all sorts of suckage.  Ideally, I'd just hold on
> >>>> to the segment lock for the duration of these operations, but these aren't
> >>>> exposed.  Extending CHM to expose methods like acquireLockAndGet() and
> >>>> unlockAndRemove() would work perfectly, but again a lot of CHM internals are
> >>>> private or package protected.
> >>>>
> >>>> So my options are: completely re-implement a CHM-like structure, like
> >>>> we've done for BCHM, or perhaps think of a new, specialised structure to
> >>>> contain locks.  In terms of contract, I just need a fast way to look up a
> >>>> value under given a key, efficient put and remove as well.  It should be
> >>>> thread-safe (naturally), and allow for an atomic operation (like "get, do
> >>>> work, put").
> >>>>
> >>>> Any interesting new data structures on peoples' minds?
> >>>>
> >>>> Cheers
> >>>> Manik
> >>>> --
> >>>> Manik Surtani
> >>>> manik at jboss.org
> >>>> twitter.com/maniksurtani
> >>>>
> >>>> Platform Architect, JBoss Data Grid
> >>>> http://red.ht/data-grid
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Manik Surtani
> >>>> manik at jboss.org
> >>>> twitter.com/maniksurtani
> >>>>
> >>>> Platform Architect, JBoss Data Grid
> >>>> http://red.ht/data-grid
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> infinispan-dev mailing list
> >>>> infinispan-dev at lists.jboss.org
> >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> >>>
> >>>
> >>> _______________________________________________
> >>> infinispan-dev mailing list
> >>> infinispan-dev at lists.jboss.org
> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> >>>
> >>>
> >>> --
> >>> Manik Surtani
> >>> manik at jboss.org
> >>> twitter.com/maniksurtani
> >>>
> >>> Platform Architect, JBoss Data Grid
> >>> http://red.ht/data-grid
> >>>
> >>>
> >>> _______________________________________________
> >>> infinispan-dev mailing list
> >>> infinispan-dev at lists.jboss.org
> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> >>
> >> _______________________________________________
> >> infinispan-dev mailing list
> >> infinispan-dev at lists.jboss.org
> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> >
> > --
> > Manik Surtani
> > manik at jboss.org
> > twitter.com/maniksurtani
> >
> > Platform Architect, JBoss Data Grid
> > http://red.ht/data-grid
> >
> >
> > _______________________________________________
> > infinispan-dev mailing list
> > infinispan-dev at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
> >
> > _______________________________________________
> > infinispan-dev mailing list
> > infinispan-dev at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> 
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev