On 15 Sep 2011, at 15:41, Paolo Romano wrote:

On 9/15/11 2:51 PM, Manik Surtani wrote:

On 15 Sep 2011, at 14:44, Paolo Romano wrote:

Concerning costs: for option 2), the prepare message has to piggyback 
the version identifiers of *each* data item that needs to be write-skew 
checked, which may lead to big messages if you need to test a lot of 
data items. But the ws-check is done only on the data items that are 
both read and written within the same transaction. So I'd expect that 
normally just a few keys would need to be write-skew checked (at least 
this is the case for the vast majority of the DBMS/STM benchmarks I've 
been using so far). Therefore I would not be too concerned with this issue.
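
As a rough illustration of the point above (a sketch only: the function names and the dict-based version store are hypothetical, not Infinispan's API), the keys whose versions travel with the prepare message are just those that the transaction both read and wrote:

```python
def keys_to_ws_check(read_versions, write_set):
    """Return {key: version_read} for keys that were both read and written.

    read_versions: dict mapping key -> version observed at read time
                   (hypothetical representation).
    write_set:     iterable of keys the transaction writes.
    """
    return {k: read_versions[k] for k in write_set if k in read_versions}


def write_skew_ok(piggybacked, current_versions):
    """True if no checked key has changed since the transaction read it."""
    return all(current_versions.get(k) == v for k, v in piggybacked.items())


# The transaction read a, b, c and writes b, d: only b needs a ws-check.
read_versions = {"a": 3, "b": 7, "c": 1}
write_set = ["b", "d"]
piggybacked = keys_to_ws_check(read_versions, write_set)
```

Here only the version of "b" is piggybacked, even though four keys were touched overall.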

True, but if a vector clock is used as the underlying version scheme, then the updating node would need to send across its local clock for each data item, regardless of whether a ws-check is needed for that data item or not.  Correct?
In fact, my answer was targeting the non-eventual consistency case.

I don't know exactly what algorithm you have in mind for eventual consistency, so I may be missing something here... but if the updating node (say node i) increases its node clock (say to value v) when one of its transactions commits, then the i-th entry of the vector clock of *every* updated data item could be set to the same value, namely v. So why not send v only once?

Do you want to increase the value stored in the i-th entry of each data item updated by a committing transaction independently (i.e. data_item.VC[i]=data_item.VC[i]+1 instead of data_item.VC[i]=++Node_clock_at_i)?

I think the latter should be sufficient.  As you say, it means less data on the wire, and it also scales better, since you don't need to maintain one counter per node per data item.  Can you think of any reason why this approach might fall short?
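
To make the two alternatives concrete, here is a minimal sketch (hypothetical class and method names, with plain dicts standing in for vector clocks; this is not the actual Infinispan implementation). With the node-clock scheme, a commit on node i bumps a single counter and stamps every updated item with the same value v, so only v needs to go on the wire. With the per-item scheme, each updated key carries its own counter.

```python
class Node:
    def __init__(self, index):
        self.index = index   # position i of this node in every vector clock
        self.clock = 0       # Node_clock_at_i
        self.items = {}      # key -> vector clock, stored as {node_index: counter}

    def commit_node_clock(self, keys):
        """data_item.VC[i] = ++Node_clock_at_i: one value for all updated keys."""
        self.clock += 1
        for k in keys:
            self.items.setdefault(k, {})[self.index] = self.clock
        return self.clock    # the single value v to send

    def commit_per_item(self, keys):
        """data_item.VC[i] = data_item.VC[i] + 1: one counter per updated key."""
        sent = {}
        for k in keys:
            vc = self.items.setdefault(k, {})
            vc[self.index] = vc.get(self.index, 0) + 1
            sent[k] = vc[self.index]
        return sent          # one value per key has to be sent


# Node-clock scheme: two commits, the second touching two keys.
n = Node(index=0)
n.commit_node_clock(["x"])
v = n.commit_node_clock(["x", "y"])

# Per-item scheme: the same two commits.
m = Node(index=0)
m.commit_per_item(["x"])
sent = m.commit_per_item(["x", "y"])
```

In the first case the second commit ships one integer regardless of how many keys it updated; in the second case the payload grows with the number of updated keys.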

--
Manik Surtani
manik@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org