[infinispan-dev] Semaphore vs Lock

Dan Berindei dan.berindei at gmail.com
Tue Mar 13 21:11:15 EDT 2012


On Tue, Mar 13, 2012 at 6:35 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
> On 13 March 2012 16:06, Dan Berindei <dan.berindei at gmail.com> wrote:
>> On Tue, Mar 13, 2012 at 3:57 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
>>> As it already came up during other design discussions, we should make
>>> a very clear split between a logical lock (something owned by a
>>> transaction) and an internal entry lock.
>>> A logical lock needs to be addressable uniquely, striping or
>>> semaphores are not a viable option as these are
>>>  - long lived (definitely more than 1 RPC)
>>>  - defeats any attempt from end users / intermediate patterns to avoid deadlocks
>>>  - likely a small ratio of overall keys, so should not be too
>>> expensive to store
>>>
>>
>> While the locks may not be very expensive to store, they are complex
>> objects and creating them over and over again can get quite expensive.
>> For instance, if 10 tx threads wait for a key lock, the lock will keep
>> a queue with the 10 waiting threads. When the tx that owned the key
>> unlocks it, the lock is removed from the container, and all 10 threads
>> wake up and race to create a new lock. Only one of them succeeds; the
>> others add themselves to the new lock's queue once again.
>
> Agreed but if you think of a lock as a logical attribute of the entry,
> it's essentially a boolean,
> not an object which needs pooling.
> You don't even need "expensive" AtomicBoolean instances assuming you
> have the "structure locks" in place anyway.
> (not sure about the name, I mean the ones we talk about below).
>

We need a way to wait on a lock to become available without polling,
so we do need a higher-level construct than a simple boolean
attribute.
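
To make this concrete, here's a rough sketch of the kind of per-key
lock container I mean (the names and shape are hypothetical, not our
actual classes). The ReentrantLock's internal wait queue is what parks
the waiters without polling, and the remove-on-release at the end is
where the lock churn creeps in:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.ReentrantLock;

    public class PerKeyLockContainer {
       private final ConcurrentMap<Object, ReentrantLock> locks =
             new ConcurrentHashMap<Object, ReentrantLock>();

       public boolean acquireLock(Object key, long timeout, TimeUnit unit)
             throws InterruptedException {
          ReentrantLock lock = locks.get(key);
          if (lock == null) {
             ReentrantLock newLock = new ReentrantLock();
             lock = locks.putIfAbsent(key, newLock);
             if (lock == null) lock = newLock;
          }
          // tryLock parks the thread in the lock's wait queue until the
          // owner releases or the timeout expires - no polling.
          return lock.tryLock(timeout, unit);
       }

       public void releaseLock(Object key) {
          ReentrantLock lock = locks.get(key);
          if (lock != null && lock.isHeldByCurrentThread()) {
             lock.unlock();
             // Removing idle locks keeps the container small, but the
             // emptiness check races with new acquirers, and the next
             // acquirer has to re-create the lock - the churn discussed
             // above. A real implementation needs more care here.
             if (!lock.hasQueuedThreads()) locks.remove(key);
          }
       }
    }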

>> This may not be a problem in our perf tests, because we update keys
>> randomly using a uniform distribution, but regular applications will
>> almost certainly have a non-uniform distribution, and a few keys will
>> be highly contended compared to the rest. This makes me think that
>> LIRS could be a nice fit for a LockContainer implementation.
>
> you lost me here :) We don't want to drop lock information?
>

Right, proper LIRS won't work, because LIRS will do anything to
maintain its size limit - including evicting locks while they're in
use.

Maybe we need something similar to LIRS, but one that ignores the locks
actually in use during eviction: keep a LIRS pool of unused locks,
without moving the keys from the pool to the lock container and back to
the pool all the time, so that waiters can stay in the same queue.
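
Roughly what I have in mind, as a quick hypothetical sketch - an
access-ordered LRU map stands in for LIRS, and all the names are made
up:

    import java.util.Iterator;
    import java.util.LinkedHashMap;
    import java.util.concurrent.locks.ReentrantLock;

    public class EvictingLockContainer {
       private static final int MAX_LOCKS = 1024;

       // accessOrder = true gives LRU iteration order, eldest first.
       private final LinkedHashMap<Object, ReentrantLock> locks =
             new LinkedHashMap<Object, ReentrantLock>(16, 0.75f, true);

       // Callers lock()/unlock() the returned ReentrantLock directly.
       // Release never removes it from the container, so waiters keep
       // queueing on the same lock instead of re-creating it.
       public synchronized ReentrantLock getLock(Object key) {
          ReentrantLock lock = locks.get(key);
          if (lock == null) {
             lock = new ReentrantLock();
             locks.put(key, lock);
             evictIdleLocks();
          }
          return lock;
       }

       // Enforce the size limit, but only ever evict locks that have
       // no owner and no waiters - in-use locks are ignored, unlike in
       // plain LIRS. (There's still a window between getLock() and
       // lock() where an idle lock could be evicted; a real
       // implementation would need to close it.)
       private void evictIdleLocks() {
          Iterator<ReentrantLock> it = locks.values().iterator();
          while (locks.size() > MAX_LOCKS && it.hasNext()) {
             ReentrantLock candidate = it.next();
             if (!candidate.isLocked() && !candidate.hasQueuedThreads()) {
                it.remove();
             }
          }
       }
    }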


>>> Internal entry locks are very short lived (never live across multiple
>>> RPCs) and essentially match what we have as segment locks in
>>> the CHM (which is already striped: striping == concurrency level), just
>>> that the CHM model isn't fitting our needs well: we need explicit control
>>> of these, for example for when a value is being moved from the
>>> DataContainer to a CacheLoader.
>>>
>>
>> Sanne, I'm not sure about surfacing the DataContainer's locking, as
>> ConcurrentHashMapV8 doesn't have segment locks any more and we'll
>> probably want to move our BCHM towards that model as well in the
>> future.
>>
>> For the situation you mentioned I would prefer making the CacheLoader
>> a participant in cache transactions and holding the logical lock while
>> passivating the entry.
>
> In many cases you don't have a transaction, or at least you don't need
> to agree on an eviction operation with other nodes.
> Nice idea anyway to consider moving a value from memory container to
> CacheLoader as transactional! You should open a JIRA on that.
>

Thanks, but I'm not the first one to think of it -
https://issues.jboss.org/browse/ISPN-201 describes pretty much the
same thing.
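
For the record, the shape would be something like this rough sketch -
the container, store and lock lookup are plain-map stand-ins, not the
real Infinispan interfaces:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.locks.ReentrantLock;

    public class Passivator {
       private final Map<Object, Object> dataContainer =
             new ConcurrentHashMap<Object, Object>();
       private final Map<Object, Object> cacheStore =
             new ConcurrentHashMap<Object, Object>();
       private final ConcurrentHashMap<Object, ReentrantLock> logicalLocks =
             new ConcurrentHashMap<Object, ReentrantLock>();

       // Hold the key's logical lock - the same one a transaction
       // would take - across the whole move, so no tx ever sees the
       // entry "in flight" between memory and the store.
       public void passivate(Object key) {
          ReentrantLock lock = lockFor(key);
          lock.lock();
          try {
             Object value = dataContainer.remove(key);
             if (value != null) {
                cacheStore.put(key, value);
             }
          } finally {
             lock.unlock();
          }
       }

       private ReentrantLock lockFor(Object key) {
          ReentrantLock lock = logicalLocks.get(key);
          if (lock == null) {
             ReentrantLock fresh = new ReentrantLock();
             lock = logicalLocks.putIfAbsent(key, fresh);
             if (lock == null) lock = fresh;
          }
          return lock;
       }
    }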


> Think as well about CAS operations on our entries: such in-place swaps
> can be done almost atomically - almost in the sense that some lock on
> the structure will be needed, but the value can still be swapped
> "in place" without the lock scope leaving some specific method.
> And let's extend this to MVCC CAS with vector clocks.. you definitely
> don't need a cluster-wide lock to apply changes on the entry
> container.
>

If you're thinking about non-transactional caches, the locks we
acquire are already local only, not cluster-wide.
But making our CAS cache operations rely on DataContainer CAS
operations instead of locks sounds like a neat idea.
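
For example, a conditional replace can sit directly on the container's
CAS primitive - just a hypothetical sketch, with a ConcurrentHashMap
standing in for the DataContainer:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class CasContainer {
       private final ConcurrentMap<String, Integer> container =
             new ConcurrentHashMap<String, Integer>();

       // One atomic attempt: no lock scope ever leaves this method.
       public boolean replace(String key, Integer expected, Integer update) {
          return container.replace(key, expected, update);
       }

       // Lock-free read-modify-write: retry the CAS until it wins.
       public Integer increment(String key) {
          for (;;) {
             Integer current = container.get(key);
             Integer next = (current == null) ? 1 : current + 1;
             boolean swapped = (current == null)
                   ? container.putIfAbsent(key, next) == null
                   : container.replace(key, current, next);
             if (swapped) return next;
          }
       }
    }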

>>> In conclusion, I'd not spend time arguing on small improvements of the
>>> existing design - at least it's serving well for now.
>>
>> The discussion may be a bit premature, but it's a nice change from
>> thinking about NBST ;-)
>
> I'm not saying it's premature or pointless, sorry.. I meant let's
> push this beyond little optimisations ;-)
>

I'm considering it a bit premature because there is probably other,
lower-hanging fruit to pick on the performance front. I don't think
it's pointless either.

Cheers
Dan


