[infinispan-dev] DLD continued/optimizing modifications within a tx

Mon Jul 27 07:44:45 EDT 2009

On 23 Jul 2009, at 21:32, Mircea Markus wrote:

> Hi,
>
> Right now we support DLD in the following scenarios: local caches,
> symmetric tx that create deadlocks over two replicated caches  
> (symmetric
> in the sense that  A replicates to B and B replicates to A at the same
> time).
> In the case of distribution (replication as well, especially async  
> one)
> there might still be a deadlock if both A and B replicate to C, but  
> they
> replicate transactions that would result in deadlocks, so deadlock  
> would
> result in C.
> Let's take an example, A wants to replicate Tx1 that affects (key1,
> key2) in this sequence. B wants to replicate Tx2 that affects (key2,
> key1) in this sequence. While replicating on C, Tx1 and Tx2 would  
> result
> in a deadlock (classic scenario).
> Now a simple way of solving this (if I'm not missing something!!) is  
> to
> order the keys in the replicated transaction based on some criterion
> (e.g. lexicographic) and apply them in the same sequence:  
> Tx1(key1,key2)
> and Tx2(key1, key2). This will avoid deadlocks -> increase throughput.

Imposing Comparable on keys is too intrusive.  And re: Galder's  
suggestion on hashcodes, hashcodes may not be the same on different  
JVM impls/OSs for JDK classes (e.g., String).

>    Now, at the core of this approach is the fact that the order of
> operations in the transaction is not relevant - which does not stand  
> in
> current implementation. E.g. let's say we have a tx that does
> (remove(key), put(key, 'otherVal')). If we change the order  result is
> totally different - not good! A way to avoid this (and to also reduce
> the amount of replication, by compacting changes by key) is to keep  
> the
> modifications in a Map<key,Modification>. For each key we keep only  
> the
> last value set within it, so if we modify the same key 1000 times  
> within
> a tx we only replicate last modification vs current approach where  
> 1000
> operations are replicated. If a key is deleted, we still keep it in  
> the
> map, with a marker for deletion. This way we only replicate the  
> delta of
> modifications, and the order of operations is no longer relevant, so
> that we can leverage previously described deadlock detection.

Isn't this expensive though?  We spoke about compacting for JBC 3, and  
the reason why we did not do this is because traversing through a  
modification list each time a new mod is encountered was considered  
unnecessarily expensive especially since there as no guarantee as to  
whether modifications to the same key is a very common thing in the  
same transaction.

Thinking about it again, it should be an O(1) problem though - if you  
maintain a hash map of keys -> Modifications as you say.  Ah, except  
where you have PutMapCommands and ClearCommands in the modification  
list... then you still need to traverse the mod list and test for  
whether a modification affects a given key.

> The advantages of this would be higher throughput by reducing the  
> chance
> of 'asymmetric' deadlocks (A,B encounter a deadlock while  
> replicating on
> C) and possible reduction in the size of the transaction (if it has
> multiple operations on the same key). The drawback is mainly the fact
> that some additional computation needs to be made to manage the map
> content (not a lot I reckon, instead of a List.add we'll do an  
> map.get()
> +  map.put(), for each modification within the transaction).
>
> Wdyt?
>
> Cheers,
> Mircea
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org