[infinispan-dev] Why DistTxInterceptor in use with Hot Rod ?

Dan Berindei dan.berindei at gmail.com
Thu Sep 15 07:31:06 EDT 2011


On Thu, Sep 15, 2011 at 12:25 PM, Galder Zamarreño <galder at redhat.com> wrote:
>
> On Sep 14, 2011, at 11:40 AM, Dan Berindei wrote:
>
>> Going back to your original question Galder, the exception is most
>> likely thrown because of this sequence of events:
>>
>> 0. Given a cluster {A, B}, a key k and a node C joining.
>> 1. Put acquires the "transaction lock" on node A (blocking rehashing)
>> 2. Put acquires lock for key k on node A
>> 3. Rehashing starts on node B, blocking transactions
>> 4. Put tries to acquire transaction lock on node B
>>
>> Since it's impossible to finish rehashing while the put operation
>> keeps the transaction lock on node A, the best option was to kill the
>> put operation by throwing a RehashInProgressException.
>>
>> I was thinking in the context of transactions when I wrote this code
>> (see https://github.com/danberindei/infinispan/commit/6ed94d3b2e184d4a48d4e781db8d404baf5915a3,
>> this scenario became just a footnote to the generic case with multiple
>> caches), but the scenario also occurs without transactions. Actually I
>> renamed it "state transfer lock" and I moved it to a separate
>> interceptor in my ISPN-1194 branch.
>
> Right, but it shouldn't happen when transactions are off, right?
>

It will still happen with transactions off, because state transfer
acquires the state transfer lock in exclusive mode. With transactions
off it is the WriteCommands that acquire the lock in shared mode,
whereas with transactions on it is the PrepareCommands.
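
To make the shared/exclusive arrangement concrete, here is a minimal
sketch built on a plain ReentrantReadWriteLock. It is not the actual
Infinispan code: the class name, the method names, the timeouts and the
nested RehashInProgressException are all stand-ins I made up for
illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch, not the Infinispan implementation.
class StateTransferLockSketch {

    // Stand-in for the real exception; the name is borrowed for illustration only.
    public static class RehashInProgressException extends RuntimeException {
        public RehashInProgressException(String message) {
            super(message);
        }
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Commands (WriteCommands or PrepareCommands) take the lock in shared mode.
    // If state transfer holds it exclusively past the timeout, the command is killed.
    public void acquireShared(long timeoutMillis) {
        try {
            if (!lock.readLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new RehashInProgressException("state transfer in progress");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for lock", e);
        }
    }

    public void releaseShared() {
        lock.readLock().unlock();
    }

    // State transfer takes the lock in exclusive mode; returns false if
    // commands still hold it in shared mode when the timeout expires.
    public boolean tryAcquireExclusive(long timeoutMillis) {
        try {
            return lock.writeLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for lock", e);
        }
    }

    public void releaseExclusive() {
        lock.writeLock().unlock();
    }
}
```

Many commands can hold the lock in shared mode at once, but state
transfer cannot start until all of them have released it, and no new
command can start while state transfer holds it.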

If we don't acquire the state transfer lock then we run the risk of
pushing stale data to other nodes. Are you saying that without
transactions enabled this risk is acceptable?

>>
>> Maybe the locking changes in 5.1 will eliminate this scenario, but
>> otherwise we could improve the user experience by retrying the command
>> after the rehashing finishes.
>
> I'd prefer that, otherwise users are likely to code similar logic which is not nice.
>

After looking at the scenario again it became clear that step 2 is
not necessary at all, so the locking enhancements will not change
anything.

We could change DistributionInterceptor to not hold the state transfer
lock during remote calls, which will allow rehashing to proceed on the
origin. But then the ownership of the keys might change during the
call, so we'll need a retry phase anyway.
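
A retry phase along those lines could look roughly like the sketch
below. Again this is only an illustration under my own assumptions: the
invokeWithRetry helper, the awaitRehashEnd callback, the attempt limit
and the nested exception class are hypothetical and do not exist in
Infinispan.

```java
import java.util.function.Supplier;

// Hypothetical sketch of retrying a command after rehashing finishes.
class RetryAfterRehashSketch {

    // Stand-in for the real exception; the name is borrowed for illustration only.
    public static class RehashInProgressException extends RuntimeException {
        public RehashInProgressException(String message) {
            super(message);
        }
    }

    // Runs the command; if it is killed because a rehash is in progress,
    // waits for the rehash to end and tries again, up to maxAttempts times.
    public static <T> T invokeWithRetry(Supplier<T> command,
                                        Runnable awaitRehashEnd,
                                        int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RehashInProgressException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Key ownership may have changed since the last attempt, so the
                // command is expected to look up the owners again on each call.
                return command.get();
            } catch (RehashInProgressException e) {
                last = e;
                awaitRehashEnd.run();
            }
        }
        throw last;
    }
}
```

The point of doing this inside the interceptor chain is that users
would not have to write the same loop around every cache operation.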

Dan


