[infinispan-dev] Let me understand DIST

Dan Berindei dan.berindei at gmail.com
Thu Mar 15 14:10:08 EDT 2012


Hi Pedro

On Thu, Mar 15, 2012 at 4:42 PM, Pedro Ruivo <pruivo at gsd.inesc-id.pt> wrote:
> On 3/15/12 1:36 PM, Dan Berindei wrote:
>> On Thu, Mar 15, 2012 at 10:31 AM, Bela Ban<bban at redhat.com>  wrote:
>>> On 3/12/12 3:03 PM, Dan Berindei wrote:
>>>> On Sat, Mar 10, 2012 at 7:07 PM, Bela Ban<bban at redhat.com>    wrote:
>>>>> If my last statement is correct, is it safe to assume that with DIST and
>>>>> transactional modifications, I will have a lot of TX contention /
>>>>> collisions ?
>>>>>
>>>> Not sure what you mean by a lot of TX contention - lock contention
>>>> should only depend on the dataset size, unless we use lock striping,
>>>> in which case it depends on the configured concurrency level.
>>>
>>> I meant TX rollbacks due to overlapping locks at different nodes, the
>>> stuff Pedro wrote about in his paper on total order.
>>>
>> Hmm, I thought because we sort the keys before locking it shouldn't be
>> possible to have deadlocks between prepare commands. I was assuming
>> that the Tx aborts in Pedro's tests were due to write skew check
>> failures, but I just read his message again and he mentions write skew
>> check is disabled.
>> I must be missing something...
> I think that a transaction aborts if a scenario like this occurs:
>
> 4 nodes, N1 to N4.
> N2 is the primary owner of KeyA.
> N3 is the primary owner of KeyB.
> N1 is executing the transaction Tx1, which writes to A and B.
> N4 is executing the transaction Tx2, which writes to A and B.
> Both transactions try to prepare at the same time. This scenario can
> occur (I think):
>
> N2 -> deliver(Tx1), lock(KeyA), deliver(Tx2), tryLock(KeyA) //Tx2 is
> blocked until the lock of KeyA is released
> N3 -> deliver(Tx2), lock(KeyB), deliver(Tx1), tryLock(KeyB) //Tx1 is
> blocked until the lock of KeyB is released
>
> Eventually Tx1 or Tx2 (or both) will be aborted by a timeout. Is this
> behavior correct? Am I missing something?
>

Thanks for the example, Pedro. A deadlock can indeed appear in this scenario.

I kept thinking that because we sort the keys, the Tx2 prepare won't
start locking KeyB until it has successfully acquired KeyA.
But that's not true, because the Tx2 prepare on N3 doesn't even try to
acquire KeyA (as N3 is not the primary owner). So it can go ahead and
lock KeyB instead, leading to the deadlock you described.
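
To make the interleaving concrete, here is a minimal sketch of the scenario
as a wait-for graph. It is a toy model, not Infinispan code: the class and
method names are hypothetical, and it assumes that each key is locked only on
its primary owner and that the first prepare delivered there holds the lock
until its transaction completes. Timeouts are left out, so a cycle here
corresponds to the situation that the lock timeout eventually breaks.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: no Infinispan classes are used, all names are hypothetical.
public class PrepareDeadlockSketch {

    // deliveries maps each key to the transactions whose prepares reached the
    // key's primary owner, in delivery order. The first one gets the lock and
    // every later one waits for it.
    static Map<String, Set<String>> waitForGraph(Map<String, List<String>> deliveries) {
        Map<String, Set<String>> graph = new LinkedHashMap<>();
        for (List<String> order : deliveries.values()) {
            if (order.isEmpty()) continue;
            String holder = order.get(0);
            for (String tx : order.subList(1, order.size())) {
                if (!tx.equals(holder)) {
                    graph.computeIfAbsent(tx, t -> new HashSet<>()).add(holder);
                }
            }
        }
        return graph;
    }

    // A cycle in the wait-for graph means no transaction in it can proceed.
    static boolean hasCycle(Map<String, Set<String>> graph) {
        Set<String> finished = new HashSet<>();
        for (String tx : graph.keySet()) {
            if (visit(tx, graph, new HashSet<>(), finished)) return true;
        }
        return false;
    }

    // Depth-first search; 'path' holds the transactions on the current chain.
    private static boolean visit(String tx, Map<String, Set<String>> graph,
                                 Set<String> path, Set<String> finished) {
        if (path.contains(tx)) return true;
        if (finished.contains(tx)) return false;
        path.add(tx);
        for (String next : graph.getOrDefault(tx, Collections.<String>emptySet())) {
            if (visit(next, graph, path, finished)) return true;
        }
        path.remove(tx);
        finished.add(tx);
        return false;
    }

    // orderOnN2: prepares delivered on N2 (primary owner of KeyA).
    // orderOnN3: prepares delivered on N3 (primary owner of KeyB).
    static boolean deadlocks(List<String> orderOnN2, List<String> orderOnN3) {
        Map<String, List<String>> deliveries = new LinkedHashMap<>();
        deliveries.put("KeyA", orderOnN2);
        deliveries.put("KeyB", orderOnN3);
        return hasCycle(waitForGraph(deliveries));
    }

    public static void main(String[] args) {
        // Pedro's interleaving: N2 sees Tx1 first, N3 sees Tx2 first.
        System.out.println(deadlocks(Arrays.asList("Tx1", "Tx2"),
                                     Arrays.asList("Tx2", "Tx1"))); // true
        // Both owners deliver the prepares in the same order.
        System.out.println(deadlocks(Arrays.asList("Tx1", "Tx2"),
                                     Arrays.asList("Tx1", "Tx2"))); // false
    }
}
```

In this model the cycle disappears as soon as both owners deliver the two
prepares in the same order, which is why sorting the keys inside a single
prepare is not enough: the ordering that matters is the one between nodes.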

Cheers
Dan


