Hi Manik,
in order to provide correct answers to your questions, I want to ask you
the following:
- Why do you say that the current MVCC implementation is non-genuine?
I think that for any transaction T, only the sites that replicate the
data items read/written by T exchange messages during the execution of T
and in order to decide the final outcome of T. Is it correct?
- About JTA semantics, what do you mean by the term "full"? During the
integration of GMU, we have not changed the way a transaction was
already managed in Infinispan (e.g. 2-PC, interaction between
Transaction Manager and XAResource), so I think that the answer is yes.
But since I have not a deep knowledge about JTA specification, maybe
there is some aspect that I have not considered.
Thank you for your feedback!
Cheers
Sebastiano
This is very interesting, Paolo. In terms of numbers of RPC, how
does
this compare with a classic, non-genuine MVCC we currently have in
Infinispan? I presume you still support full JTA semantics over GMU?
Cheers
Manik
> Hi,
>
> within the context Cloud-TM project we have developed a new partial
> replication algorithm (corresponding to distribution mode of
> Infinispan) that guarantees serializability in a very scalable
> fashion. We have called the algorithm GMU, Genuine Multiversion
> Update Serializability, and we've integrated it into Infinispan (5.0.0).
>
> The source code is available on github:
>
>
http://github.com/cloudtm/infinispan-5.0.0.SERIALIZABLE
>
> GMU's key features are:
>
> 1. unlike any other partial replication protocol we are aware of, GMU
> is the first distributed multi-versioned based partial replication
> protocol that does not rely on a single global clock in order to
> determine consistent snapshots. Conversely, the protocol guarantees
> to involve only the nodes that maintain data accessed by a committing
> transaction T (a property that is known in literature as
> "genuineness"). This is a property that is crucial, in our opinion,
> to achieve high scalability.
>
> 2. read-only tranasctions are never aborted, and do not need to be
> validated at commit time, making them very fast. Read-only
> transactions are guaranteed to observe a consistent snapshot of the
> data using a novel mechanism based on vector clocks. Note that in
> order to achieve this results we integrated in ISPN a multiversion
> concurrency control, very similar to the one used in PostgreSQL or
> JVSTM, that maintains multiple data item versions tagged with scalars
> per each key.
>
> 3. The consistency guarantees ensured by GMU are a variant of classic
> 1-Copy-Serialiability (1CS), and, more precisely, "extended update
> serializable" (EUS). You can check the tech. report in attach for
> more details on this, but, roughly speaking, US guarantees that
> update transactions execute according to 1CS. Concurrent read-only
> transactions, instead, may observe the updates generated by two
> *non-conflicting* update transactions in different order.
> In practice, we could not think of any realistic application for
> which the schedules admitted by US would represent an issue, which
> leads us to argue that US is, in practical settings, as good as 1CS,
> but brings the key advantage of allowing way more scalable (genuine)
> implementations.
>
> We have evaluated GMU performance using up to 20 physical machines in
> our in-house cluster, and in 40 VMs in the FutureGrid (and we are
> currently trying to use more VMs in FutureGrid to see if we can make
> it scale up to hundreds of machines... we'll keep you posted on
> this!) with the YCSB (
https://github.com/brianfrankcooper/YCSB/wiki)
> and TPC-C benchmarks.
>
> Our experimental results show that in low conflict scenarios, the
> protocol performs as good as the existing Repeatable Read
> implementation... and actually, in some scenarios, even slightly
> better, given that GMU spares the cost of saving the values read in
> the transactional context, unlike the existing Repeatable Read
> implementation.
>
> In high contention scenarios, GMU does pay a higher toll in terms of
> aborts, but it still drastically outperform classic non-genuine MVCC
> implementations as the size of the system grows. Also, we've a bunch
> of ideas on how to improve GMU performance in high contention
> scenarios... but that's another story!
>
> You find the technical report at this url:
>
>
http://www.inesc-id.pt/ficheiros/publicacoes/7549.pdf
>
> Comments are more than welcome of course!
>
> Cheers,
>
> Paolo
>
> --
>
> Paolo Romano, PhD
> Coordinator of the Cloud-TM ICT FP7 Project (
www.cloudtm.eu)
> Senior Researcher @ INESC-ID (
www.inesc-id.pt)
> Invited Professor @ Instituto Superior Tecnico (
www.ist.utl.pt)
> Rua Alves Redol, 9
> 1000-059, Lisbon Portugal
> Tel. + 351 21 3100300
> Fax + 351 21 3145843
> Webpage
http://www.gsd.inesc-id.pt/~romanop
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev(a)lists.jboss.org <mailto:infinispan-dev@lists.jboss.org>
>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
--
Manik Surtani
manik(a)jboss.org <mailto:manik@jboss.org>
twitter.com/maniksurtani <
http://twitter.com/maniksurtani>
Lead, Infinispan
http://www.infinispan.org
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev