[infinispan-dev] DLD continued/optimizing modifications within a tx

Fri Jul 24 05:43:11 EDT 2009

Galder Zamarreno wrote:
>
>
> On 07/23/2009 10:32 PM, Mircea Markus wrote:
>> Hi,
>>
>> Right now we support DLD in the following scenarios: local caches,
>> symmetric tx that create deadlocks over two replicated caches (symmetric
>> in the sense that  A replicates to B and B replicates to A at the same
>> time).
>> In the case of distribution (replication as well, especially async one)
>> there might still be a deadlock if both A and B replicate to C, but they
>> replicate transactions that would result in deadlocks, so deadlock would
>> result in C.
>> Let's take an example, A wants to replicate Tx1 that affects (key1,
>> key2) in this sequence. B wants to replicate Tx2 that affects (key2,
>> key1) in this sequence. While replicating on C, Tx1 and Tx2 would result
>> in a deadlock (classic scenario).
>> Now a simple way of solving this (if I'm not missing something!!) is to
>> order the keys in the replicated transaction based on some criterion
>> (e.g. lexicographic) and apply them in the same sequence: Tx1(key1,key2)
>> and Tx2(key1, key2). This will avoid deadlocks ->  increase throughput.
>>      Now, at the core of this approach is the fact that the order of
>> operations in the transaction is not relevant - which does not stand in
>> current implementation. E.g. let's say we have a tx that does
>> (remove(key), put(key, 'otherVal')). If we change the order  result is
>> totally different - not good! A way to avoid this (and to also reduce
>> the amount of replication, by compacting changes by key) is to keep the
>> modifications in a Map<key,Modification>. For each key we keep only the
>> last value set within it, so if we modify the same key 1000 times within
>> a tx we only replicate last modification vs current approach where 1000
>> operations are replicated. If a key is deleted, we still keep it in the
>> map, with a marker for deletion. This way we only replicate the delta of
>> modifications, and the order of operations is no longer relevant, so
>> that we can leverage previously described deadlock detection.
>
> This is very similar to what we implemented for the coalesced async 
> cache loader actually. 
yes indeed. Would be good to use the same code in both places.
> I think it makes sense to do so but I have doubts on the ordering that 
> we can impose on the keys and how to make this order be consistent in 
> the cluster. We definitely cannot force keys to be Comparable.
yes, of course. This functionality should be optional, so a user needs 
to manually enable it. User can also specify a Comaprator, or rely on 
the keys natural order (similar to the way SorteSets works).
>
>>
>> The advantages of this would be higher throughput by reducing the chance
>> of 'asymmetric' deadlocks (A,B encounter a deadlock while replicating on
>> C) and possible reduction in the size of the transaction (if it has
>> multiple operations on the same key). The drawback is mainly the fact
>> that some additional computation needs to be made to manage the map
>> content (not a lot I reckon, instead of a List.add we'll do an map.get()
>> +  map.put(), for each modification within the transaction).
>>
>> Wdyt?
>>
>> Cheers,
>> Mircea
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>