[infinispan-dev] New partial replication protocol providing serializability guarantees in Infinispan
Paolo Romano
romano at inesc-id.pt
Sat Dec 10 16:08:35 EST 2011
On 12/9/11 4:19 PM, Manik Surtani wrote:
> This is very interesting, Paolo. In terms of numbers of RPC, how does
> this compare with a classic, non-genuine MVCC we currently have in
> Infinispan?
Same number of round. We're using plain 2PC in this case, but we could
plug-in total-order multi-cast in future to avoid deadlocks.
> I presume you still support full JTA semantics over GMU?
I presume it too ;-)
Cheers,
Paolo
>
> Cheers
> Manik
>
> On 29 Nov 2011, at 13:11, Paolo Romano wrote:
>
>> Hi,
>>
>> within the context Cloud-TM project we have developed a new partial
>> replication algorithm (corresponding to distribution mode of
>> Infinispan) that guarantees serializability in a very scalable
>> fashion. We have called the algorithm GMU, Genuine Multiversion
>> Update Serializability, and we've integrated it into Infinispan (5.0.0).
>>
>> The source code is available on github:
>>
>> http://github.com/cloudtm/infinispan-5.0.0.SERIALIZABLE
>>
>> GMU's key features are:
>>
>> 1. unlike any other partial replication protocol we are aware of, GMU
>> is the first distributed multi-versioned based partial replication
>> protocol that does not rely on a single global clock in order to
>> determine consistent snapshots. Conversely, the protocol guarantees
>> to involve only the nodes that maintain data accessed by a committing
>> transaction T (a property that is known in literature as
>> "genuineness"). This is a property that is crucial, in our opinion,
>> to achieve high scalability.
>>
>> 2. read-only tranasctions are never aborted, and do not need to be
>> validated at commit time, making them very fast. Read-only
>> transactions are guaranteed to observe a consistent snapshot of the
>> data using a novel mechanism based on vector clocks. Note that in
>> order to achieve this results we integrated in ISPN a multiversion
>> concurrency control, very similar to the one used in PostgreSQL or
>> JVSTM, that maintains multiple data item versions tagged with scalars
>> per each key.
>>
>> 3. The consistency guarantees ensured by GMU are a variant of classic
>> 1-Copy-Serialiability (1CS), and, more precisely, "extended update
>> serializable" (EUS). You can check the tech. report in attach for
>> more details on this, but, roughly speaking, US guarantees that
>> update transactions execute according to 1CS. Concurrent read-only
>> transactions, instead, may observe the updates generated by two
>> *non-conflicting* update transactions in different order.
>> In practice, we could not think of any realistic application for
>> which the schedules admitted by US would represent an issue, which
>> leads us to argue that US is, in practical settings, as good as 1CS,
>> but brings the key advantage of allowing way more scalable (genuine)
>> implementations.
>>
>> We have evaluated GMU performance using up to 20 physical machines in
>> our in-house cluster, and in 40 VMs in the FutureGrid (and we are
>> currently trying to use more VMs in FutureGrid to see if we can make
>> it scale up to hundreds of machines... we'll keep you posted on
>> this!) with the YCSB (https://github.com/brianfrankcooper/YCSB/wiki)
>> and TPC-C benchmarks.
>>
>> Our experimental results show that in low conflict scenarios, the
>> protocol performs as good as the existing Repeatable Read
>> implementation... and actually, in some scenarios, even slightly
>> better, given that GMU spares the cost of saving the values read in
>> the transactional context, unlike the existing Repeatable Read
>> implementation.
>>
>> In high contention scenarios, GMU does pay a higher toll in terms of
>> aborts, but it still drastically outperform classic non-genuine MVCC
>> implementations as the size of the system grows. Also, we've a bunch
>> of ideas on how to improve GMU performance in high contention
>> scenarios... but that's another story!
>>
>> You find the technical report at this url:
>>
>> http://www.inesc-id.pt/ficheiros/publicacoes/7549.pdf
>>
>> Comments are more than welcome of course!
>>
>> Cheers,
>>
>> Paolo
>>
>> --
>>
>> Paolo Romano, PhD
>> Coordinator of the Cloud-TM ICT FP7 Project (www.cloudtm.eu)
>> Senior Researcher @ INESC-ID (www.inesc-id.pt)
>> Invited Professor @ Instituto Superior Tecnico (www.ist.utl.pt)
>> Rua Alves Redol, 9
>> 1000-059, Lisbon Portugal
>> Tel. + 351 21 3100300
>> Fax + 351 21 3145843
>> Webpage http://www.gsd.inesc-id.pt/~romanop
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org <mailto:infinispan-dev at lists.jboss.org>
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> --
> Manik Surtani
> manik at jboss.org <mailto:manik at jboss.org>
> twitter.com/maniksurtani <http://twitter.com/maniksurtani>
>
> Lead, Infinispan
> http://www.infinispan.org
>
>
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20111210/be9b3459/attachment.html
More information about the infinispan-dev
mailing list