On 15 Mar 2012, at 18:10, Dan Berindei wrote:

Hi Pedro

On Thu, Mar 15, 2012 at 4:42 PM, Pedro Ruivo <pruivo@gsd.inesc-id.pt> wrote:
On 3/15/12 1:36 PM, Dan Berindei wrote:
On Thu, Mar 15, 2012 at 10:31 AM, Bela Ban<bban@redhat.com> wrote:
On 3/12/12 3:03 PM, Dan Berindei wrote:
On Sat, Mar 10, 2012 at 7:07 PM, Bela Ban<bban@redhat.com> wrote:
If my last statement is correct, is it safe to assume that with DIST and
transactional modifications, I will have a lot of TX contention /
collisions?

Not sure what you mean by a lot of TX contention - lock contention
should only depend on the dataset size, unless we use lock striping,
in which case it depends on the configured concurrency level.
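
To illustrate the lock striping point, here is a minimal sketch. It assumes a hypothetical stripe function (CRC32 modulo the concurrency level); it is not Infinispan's actual striping code, and the names stripe_for and colliding_pairs are made up for the example. It only shows that unrelated keys can end up behind the same lock once there are more keys than stripes.

```python
import zlib


def stripe_for(key, concurrency_level):
    """Map a key to one of `concurrency_level` shared locks (hypothetical hash choice)."""
    return zlib.crc32(key.encode("utf-8")) % concurrency_level


def colliding_pairs(keys, concurrency_level):
    """Return pairs of distinct keys that end up behind the same striped lock."""
    by_stripe = {}
    for key in keys:
        by_stripe.setdefault(stripe_for(key, concurrency_level), []).append(key)
    return [(a, b)
            for group in by_stripe.values()
            for i, a in enumerate(group)
            for b in group[i + 1:]]


keys = ["Key%d" % i for i in range(5)]

# With per-key locks, distinct keys never contend with each other.
# With striping, 5 keys over 4 stripes means at least one pair shares a lock.
pairs_striped = colliding_pairs(keys, concurrency_level=4)
```

With per-key locks the chance of two transactions colliding falls as the dataset grows; with striping it is bounded by the number of stripes instead.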

I meant TX rollbacks due to overlapping locks at different nodes, the
stuff Pedro wrote about in his paper on total order.

Hmm, I thought because we sort the keys before locking it shouldn't be
possible to have deadlocks between prepare commands. I was assuming
that the Tx aborts in Pedro's tests were due to write skew check
failures, but I just read his message again and he mentions that the
write skew check is disabled.
I must be missing something...

I think that a transaction aborts if a scenario like this occurs:

4 nodes, N1 to N4.
N2 is the primary owner of KeyA.
N3 is the primary owner of KeyB.
N1 is executing the transaction Tx1, which writes to KeyA and KeyB.
N4 is executing the transaction Tx2, which writes to KeyA and KeyB.
Both transactions try to prepare at the same time. This scenario can
occur (I think):

N2 -> deliver(Tx1), lock(KeyA), deliver(Tx2), tryLock(KeyA)
      // Tx2 is blocked until the lock on KeyA is released
N3 -> deliver(Tx2), lock(KeyB), deliver(Tx1), tryLock(KeyB)
      // Tx1 is blocked until the lock on KeyB is released

Eventually Tx1 or Tx2 (or both) will be aborted by a timeout. Is this
behavior correct? Am I missing something?
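
The scenario above can be replayed with a small model. This is only a sketch under stated assumptions: each node locks just the keys it is primary owner of, prepares are processed in delivery order, and a blocked transaction is recorded in a waits-for graph. The names deliver, has_cycle and run are hypothetical and do not correspond to Infinispan classes.

```python
# Hypothetical model: each node only locks the keys it primarily owns.
PRIMARY_OWNER = {"KeyA": "N2", "KeyB": "N3"}
TX_KEYS = {"Tx1": ["KeyA", "KeyB"], "Tx2": ["KeyA", "KeyB"]}


def deliver(node, tx, lock_table, waits_for):
    """Process the prepare of tx on node: lock the keys owned by node, in sorted order."""
    for key in sorted(TX_KEYS[tx]):
        if PRIMARY_OWNER[key] != node:
            continue  # this node is not the primary owner, so no lock is taken here
        holder = lock_table.setdefault(key, tx)
        if holder != tx:
            # tx blocks behind the current holder of the lock
            waits_for.setdefault(tx, set()).add(holder)


def has_cycle(waits_for):
    """Depth-first search for a cycle in the waits-for graph."""
    def visit(tx, path):
        if tx in path:
            return True
        return any(visit(nxt, path | {tx}) for nxt in waits_for.get(tx, ()))
    return any(visit(tx, frozenset()) for tx in waits_for)


def run(delivery_order):
    """Replay a list of (node, tx) deliveries and return the resulting state."""
    lock_table, waits_for = {}, {}
    for node, tx in delivery_order:
        deliver(node, tx, lock_table, waits_for)
    return lock_table, waits_for


# Delivery order from the scenario: N2 sees Tx1 first, N3 sees Tx2 first.
lock_table, waits_for = run(
    [("N2", "Tx1"), ("N3", "Tx2"), ("N2", "Tx2"), ("N3", "Tx1")])
deadlocked = has_cycle(waits_for)
```

In this model Tx1 waits on Tx2 and Tx2 waits on Tx1, so only a timeout breaks the cycle. If both nodes deliver the prepares in the same order, the graph has no cycle.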

Thanks for the example Pedro, a deadlock can indeed appear in this scenario.

I kept thinking that because we sort the keys, the Tx2 prepare won't
start locking KeyB until it has successfully acquired KeyA.
But that's not true, because the Tx2 prepare on N3 doesn't even try to
acquire KeyA (as N3 is not the primary owner of KeyA). So it can go
ahead and lock KeyB instead, leading to the deadlock you described.
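
A minimal sketch of why the key ordering stops helping, assuming a simplified model where a lock table is a plain dict and acquire_in_order is a hypothetical helper (not an Infinispan API). Sorted acquisition only prevents deadlock when every key is requested from the same lock table; once each node handles just its own keys, the ordering across nodes is lost.

```python
def acquire_in_order(keys, tx, locks):
    """Lock keys in sorted order against ONE lock table.

    Returns the transaction we are blocked behind, or None if all keys
    were acquired. Keys after the first blocked one are never attempted.
    """
    for key in sorted(keys):
        holder = locks.setdefault(key, tx)
        if holder != tx:
            return holder
    return None


# Case 1: a single lock table covering both keys.
# Tx2 blocks on KeyA and therefore never touches KeyB.
locks = {}
blocked_tx1 = acquire_in_order(["KeyB", "KeyA"], "Tx1", locks)
blocked_tx2 = acquire_in_order(["KeyA", "KeyB"], "Tx2", locks)

# Case 2: one lock table per primary owner (N2 owns KeyA, N3 owns KeyB).
# Each node sees only its own key, so the sort has nothing to order.
n2_locks, n3_locks = {}, {}
acquire_in_order(["KeyA"], "Tx1", n2_locks)  # N2 delivers Tx1 first
acquire_in_order(["KeyB"], "Tx2", n3_locks)  # N3 delivers Tx2 first
tx2_waits_on = acquire_in_order(["KeyA"], "Tx2", n2_locks)
tx1_waits_on = acquire_in_order(["KeyB"], "Tx1", n3_locks)
```

In case 1 Tx2 simply waits for Tx1 to finish. In case 2 each transaction holds one lock and waits for the other, which is the deadlock from Pedro's example.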