Hi Mircea,
On 8/17/12 4:59 PM, Mircea Markus wrote:
On 17 Aug 2012, at 13:16, Sebastiano Peluso wrote:
> Hi all,
>
> I have a question about the propagation of the RollbackCommand in
> Infinispan 5.2.0 when I use the Optimistic locking scheme and the
> Distribution clustering mode.
>
> In particular I have noticed that a RollbackCommand command for a
> transaction T is propagated on a set of nodes S even if T's coordinator
> has never sent and it will never send a PrepareCommand command to nodes
> in S.
>
> Let me clarify the issue with the following example.
> Suppose you have a transaction T executing on node N0 and T writes on
> keys k0, k1, k2,...., km (m+1 keys) until it reaches the prepare phase.
> In addition, node Ni, with i=0,...,m, is ki's primary owner. If at
> prepare time, during the lock acquisition on the local node N0 (see
> visitPrepareCommand method in OptimisticLockingInterceptor class) T
> fails to acquire the lock on k0, an exception is thrown (e.g.
> TimeoutException) and T will be rolled back. In this case, when T starts
> the rollback phase, it seems to me that a RollbackCommand is
> multicast to all nodes Nj, with j=1,...,d, where k1,...,kd are the keys
> sorted before k0 during the local lock acquisition (see acquireAllLocks
> method in OptimisticLockingInterceptor), because:
>
> - shouldInvokeRemoteCommand method on the TxInvocationContext returns
> true (see BaseRpcInterceptor class);
> - getAffectedKeys on the TxInvocationContext returns the set {k1,...,
> kd} (see visitRollbackCommand in DistributionInterceptor class).
>
> Is it correct?
>
> If I'm not wrong, which is the design choice behind this implementation?
> Yes, you're right: this is correct but it can be sub-optimal.
This is indeed sub-optimal, but not incorrect.
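To make the scenario concrete, here is a minimal, self-contained sketch (not Infinispan's actual code; the lock table, the key names, and the failing key are stand-ins, with k2 playing the role of the key whose lock acquisition times out) of how the sorted local lock acquisition leaves already-locked keys in the affected-key set, so that a rollback triggered by a purely local lock failure is still multicast to those keys' primary owners:

```java
import java.util.*;

public class RollbackPropagationSketch {

    // Stand-in for the node's lock table; "k2" is assumed to be held
    // by a conflicting transaction, so locking it times out.
    static boolean tryLock(String key) {
        return !key.equals("k2");
    }

    // Keys successfully locked locally; these end up in the transaction
    // context's affected-key set.
    static final Set<String> affectedKeys = new LinkedHashSet<>();

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>(List.of("k1", "k2", "k0"));
        // Locks are acquired in a deterministic (sorted) order to avoid
        // deadlocks, as acquireAllLocks does in OptimisticLockingInterceptor.
        Collections.sort(keys); // [k0, k1, k2]
        try {
            for (String k : keys) {
                if (!tryLock(k)) {
                    throw new RuntimeException("TimeoutException on " + k);
                }
                affectedKeys.add(k);
            }
        } catch (RuntimeException e) {
            // No PrepareCommand was ever sent to a remote node, yet the
            // rollback is multicast to the owners of every affected key.
            System.out.println("RollbackCommand multicast for: " + affectedKeys);
        }
    }
}
```

Here the rollback reaches the owners of k0 and k1 even though neither node ever saw a prepare for this transaction.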
> Does this break the TOA/TOB stuff? Mind creating a JIRA for it?
I think this does not affect TOA/TOB, but the problem here is the
relationship between this issue and the solution for the bug described
in JIRA [1]. In particular, a possible solution for [1] is to register
an "out-of-order" rollback entry when a rollback message R is delivered
on a remote node N that has not yet seen the related prepare message P,
and to annihilate that entry when P arrives. Unfortunately, P may never
be delivered to N at all, precisely because R is a rollback generated by
a locally failed lock acquisition (as in the example above). This
behavior makes it hard to garbage-collect those "out-of-order" entries,
rendering that solution inefficient and therefore (in my opinion)
unfeasible.
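A minimal sketch of the bookkeeping I have in mind (all names here are hypothetical illustrations, not Infinispan APIs) shows why garbage collection breaks down: a late prepare can annihilate the entry, but a rollback caused by a local lock failure on the originator has no matching prepare, so its entry is never reclaimed:

```java
import java.util.*;
import java.util.concurrent.*;

public class OutOfOrderRollbackSketch {
    // Hypothetical bookkeeping: tx id -> timestamp of a rollback that
    // arrived before its prepare ("out-of-order" rollback entry).
    static final Map<String, Long> outOfOrderRollbacks = new ConcurrentHashMap<>();
    static final Set<String> preparedTxs = ConcurrentHashMap.newKeySet();

    static void onRollback(String txId) {
        if (!preparedTxs.contains(txId)) {
            // Prepare not seen yet: remember the rollback so a late
            // prepare can be discarded when/if it arrives.
            outOfOrderRollbacks.put(txId, System.currentTimeMillis());
        }
    }

    static void onPrepare(String txId) {
        if (outOfOrderRollbacks.remove(txId) != null) {
            return; // late prepare annihilates the pending rollback entry
        }
        preparedTxs.add(txId);
    }

    public static void main(String[] args) {
        // Case 1: rollback overtakes prepare; the late prepare cleans up.
        onRollback("tx1");
        onPrepare("tx1");

        // Case 2: rollback caused by a locally failed lock acquisition on
        // the originator -- no prepare will ever be sent to this node, so
        // the entry can never be reclaimed by a matching prepare.
        onRollback("tx2");

        System.out.println("leaked entries: " + outOfOrderRollbacks.keySet());
    }
}
```

Without an extra mechanism (e.g. timeouts, with their own false-positive risks), the tx2-style entries accumulate forever.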
Therefore, keeping this sub-optimal implementation of rollback
propagation makes the problem in [1] harder to solve.
Just a note on the possible solution you reported in [1]: waiting for
all the ack/nack messages is not enough, because a replication timeout
exception can be thrown before the prepare message reaches all the
participants.
Thank you for the reply.
Cheers,
Sebastiano
[1] https://issues.jboss.org/browse/ISPN-2081