[infinispan-dev] New partial replication protocol providing serializability guarantees in Infinispan

Sebastiano Peluso peluso at gsd.inesc-id.pt
Sat Dec 10 06:46:43 EST 2011


Hi Manik,

before I can answer your questions properly, I would like to ask you 
the following:

  - Why do you say that the current MVCC implementation is non-genuine? 
I think that for any transaction T, only the sites that replicate the 
data items read/written by T exchange messages during the execution of T 
and in order to decide the final outcome of T. Is it correct?

  - About JTA semantics, what do you mean by the term "full"? During the 
integration of GMU, we have not changed the way a transaction was 
already managed in Infinispan (e.g. 2-PC, interaction between 
Transaction Manager and XAResource), so I think that the answer is yes. 
But since I do not have deep knowledge of the JTA specification, there 
may be some aspect that I have not considered.
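
To make the first question concrete, here is a minimal Java sketch of 
the property as I understand it: for a transaction T, the nodes allowed 
to exchange messages are exactly the owners of the keys that T read or 
wrote. The class name, the owners map and the node names are all 
hypothetical and made up for illustration; this is not Infinispan or 
GMU code, only a way to state the check precisely.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of "genuineness"; not Infinispan code. */
final class GenuinenessSketch {

    /**
     * Returns the nodes that replicate at least one key accessed by a
     * transaction: the union of the replica sets of those keys.
     *
     * @param owners   assumed mapping from key to the nodes replicating it
     * @param accessed keys read or written by the transaction
     */
    static Set<String> involvedNodes(Map<String, List<String>> owners,
                                     Set<String> accessed) {
        Set<String> nodes = new TreeSet<>();
        for (String key : accessed) {
            List<String> replicas = owners.get(key);
            if (replicas == null) {
                throw new IllegalArgumentException("no owners for key: " + key);
            }
            nodes.addAll(replicas);
        }
        return nodes;
    }

    /**
     * True if every node contacted during execution and commit of the
     * transaction replicates some key the transaction accessed.
     */
    static boolean isGenuine(Map<String, List<String>> owners,
                             Set<String> accessed,
                             Set<String> contacted) {
        return involvedNodes(owners, accessed).containsAll(contacted);
    }
}
```

For example, if k1 is owned by nodes A and B and k2 by B and C, a 
transaction touching k1 and k2 may involve only A, B and C; contacting 
any other node would violate the property.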


Thank you for your feedback!


Cheers

    Sebastiano

> This is very interesting, Paolo.  In terms of numbers of RPC, how does 
> this compare with a classic, non-genuine MVCC we currently have in 
> Infinispan?  I presume you still support full JTA semantics over GMU?
>
> Cheers
> Manik
>
>
>
>> Hi,
>>
>> within the context of the Cloud-TM project we have developed a new partial 
>> replication algorithm (corresponding to distribution mode of 
>> Infinispan) that guarantees serializability in a very scalable 
>> fashion. We have called the algorithm GMU, Genuine Multiversion 
>> Update Serializability, and we've integrated it into Infinispan (5.0.0).
>>
>> The source code is available on github:
>>
>> http://github.com/cloudtm/infinispan-5.0.0.SERIALIZABLE
>>
>> GMU's key features are:
>>
>> 1. To the best of our knowledge, GMU is the first distributed, 
>> multi-version partial replication protocol that does not rely on a 
>> single global clock to determine consistent snapshots. Instead, the 
>> protocol is guaranteed to involve only the nodes that maintain data 
>> accessed by a committing transaction T (a property known in the 
>> literature as "genuineness"). In our opinion, this property is 
>> crucial to achieving high scalability.
>>
>> 2. Read-only transactions are never aborted and do not need to be 
>> validated at commit time, making them very fast. Read-only 
>> transactions are guaranteed to observe a consistent snapshot of the 
>> data using a novel mechanism based on vector clocks. Note that in 
>> order to achieve these results we integrated into ISPN a multiversion 
>> concurrency control scheme, very similar to the one used in PostgreSQL 
>> or JVSTM, that maintains, for each key, multiple data item versions 
>> tagged with scalars.
>>
>> 3. The consistency guarantees ensured by GMU are a variant of classic 
>> 1-Copy-Serializability (1CS), and, more precisely, "extended update 
>> serializability" (EUS). You can check the attached tech. report for 
>> more details on this, but, roughly speaking, US guarantees that 
>> update transactions execute according to 1CS. Concurrent read-only 
>> transactions, instead, may observe the updates generated by two 
>> *non-conflicting* update transactions in different order.
>> In practice, we could not think of any realistic application for 
>> which the schedules admitted by US would represent an issue, which 
>> leads us to argue that US is, in practical settings, as good as 1CS, 
>> but brings the key advantage of allowing way more scalable (genuine) 
>> implementations.
>>
>> We have evaluated GMU performance using up to 20 physical machines in 
>> our in-house cluster, and in 40 VMs in the FutureGrid (and we are 
>> currently trying to use more VMs in FutureGrid to see if we can make 
>> it scale up to hundreds of machines... we'll keep you posted on 
>> this!) with the YCSB (https://github.com/brianfrankcooper/YCSB/wiki) 
>> and TPC-C benchmarks.
>>
>> Our experimental results show that in low conflict scenarios, the 
>> protocol performs as well as the existing Repeatable Read 
>> implementation... and actually, in some scenarios, even slightly 
>> better, given that GMU spares the cost of saving the values read in 
>> the transactional context, unlike the existing Repeatable Read 
>> implementation.
>>
>> In high contention scenarios, GMU does pay a higher toll in terms of 
>> aborts, but it still drastically outperforms classic non-genuine MVCC 
>> implementations as the size of the system grows. Also, we've a bunch 
>> of ideas on how to improve GMU performance in high contention 
>> scenarios... but that's another story!
>>
>> You can find the technical report at this URL:
>>
>> http://www.inesc-id.pt/ficheiros/publicacoes/7549.pdf
>>
>> Comments are more than welcome of course!
>>
>> Cheers,
>>
>>     Paolo
>>
>> -- 
>>
>> Paolo Romano, PhD
>> Coordinator of the Cloud-TM ICT FP7 Project (www.cloudtm.eu)
>> Senior Researcher @ INESC-ID (www.inesc-id.pt)
>> Invited Professor @ Instituto Superior Tecnico (www.ist.utl.pt)
>> Rua Alves Redol, 9
>> 1000-059, Lisbon Portugal
>> Tel. + 351 21 3100300
>> Fax  + 351 21 3145843
>> Webpage http://www.gsd.inesc-id.pt/~romanop
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org <mailto:infinispan-dev at lists.jboss.org>
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> --
> Manik Surtani
> manik at jboss.org <mailto:manik at jboss.org>
> twitter.com/maniksurtani <http://twitter.com/maniksurtani>
>
> Lead, Infinispan
> http://www.infinispan.org
