Hi Paolo
On Tue, May 1, 2012 at 8:13 PM, Paolo Romano <romano(a)inesc-id.pt> wrote:
> Hi Dan,
> the easiest way, it seems to me, is to treat the state transfer as a
> special transaction that is TO-broadcast using the sequencer, as you
> have also been suggesting in your email.
> I guess that this way you may even get rid of the ST lock, as
> transactions that request a commit after a ST is started will be
> TO-delivered after the "ST transaction", which will:
> a) start transferring state only after having waited for the
> completion of the txs TO-delivered before the ST transaction, and
> b) prevent the thread in charge of managing TO-delivered
> transactions from processing transactions that are TO-delivered after
> the ST transaction, until the ST transaction is finished on that node.
> Let me try to clarify this by outlining a possible protocol:
> A) the coordinator TO-broadcasts the PREPARE_VIEW.
> B) when this message is TO-delivered at a node n, the thread (say thread
> t) that is responsible for managing incoming TO messages enqueues itself
> on the count-down latches associated with the transactions that are
> being applied by the TO thread pool. This ensures that every node starts
> transferring state only after having applied all the updates of the
> previously TO-delivered xacts.
> C) The state transfer is activated at node n by the same thread t
> responsible for processing incoming TO messages. This guarantees that no
> updates will be performed while the state is being transferred.
This part isn't very clear to me - aren't the tx latches necessary
exactly because there may be more than one thread processing incoming
TO messages?
> D) An ACK_VIEW is sent to the coordinator.
This is the part that I didn't like about this (otherwise very cool)
idea. Making the PREPARE_VIEW commands asynchronous + adding a new
ACK_VIEW command would mean making the cache view installation process
on the coordinator a multithreaded affair, and changing a little too
much code in that region (considering NBST touches that area as well).
> Notice that:
> - in parallel, application-level threads may request to commit new
> transactions and TO-broadcast their requests; these will, however, be
> enqueued by JGroups, as the thread "t" will not process them until it
> has finalized the ST.
Same as above, I think we'd need a tx latch for the ST transaction as
well, and all txs will need to enqueue on that latch.
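The latch scheme discussed above (steps B/C, plus a latch for the ST transaction itself that other txs enqueue on) could be sketched roughly like this. All names here are invented for illustration; this is not the actual Infinispan/JGroups code:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Rough sketch (hypothetical names): the TO-delivery thread "t" waits on
// the latches of all in-flight transactions before transferring state.
// Because t runs the transfer itself, transactions TO-delivered after the
// ST transaction are not processed until the transfer finishes.
public class StateTransferSketch {
    // one latch per transaction currently being applied by the TO thread pool
    private final List<CountDownLatch> inFlightTxLatches = new CopyOnWriteArrayList<>();
    private volatile boolean stateTransferred;

    // registers a TO-delivered transaction; the pool counts the latch
    // down once the transaction's updates have been applied
    public CountDownLatch registerTx() {
        CountDownLatch latch = new CountDownLatch(1);
        inFlightTxLatches.add(latch);
        return latch;
    }

    // invoked by thread t when PREPARE_VIEW is TO-delivered (steps B and C)
    public void onPrepareView() throws InterruptedException {
        for (CountDownLatch latch : inFlightTxLatches)
            latch.await();       // wait for previously TO-delivered txs
        transferState();         // runs on t itself: later TO messages wait
        // step D: send ACK_VIEW to the coordinator (omitted)
    }

    private void transferState() {
        stateTransferred = true; // placeholder for actually pushing state
    }

    public boolean isStateTransferred() {
        return stateTransferred;
    }
}
```

The key point the sketch tries to capture is that no separate ST lock is needed: mutual exclusion falls out of the total order plus the single TO-delivery thread.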
> - if you wanted to squeeze in an extra optimization, after sending the
> ACK_VIEW you may already let thread t conclude the ST transaction (after
> having released any count-down latch that it may have acquired in phase
> B above), without waiting for a COMMIT_VIEW. The rationale is that the
> node has finished transferring its portion of the state, and may already
> start processing the transactions that have been totally ordered after
> the ST request. I admit, however, that I have not carefully thought
> about the case of view changes being cancelled, as I don't know exactly
> if/why/how you are doing this.
When a node leaves the cluster/cache during state transfer, it means
it may not have sent all the state that it was responsible for to the
joiners. Since we have no way to verify if it actually did send all
the state, we just assume that it didn't and we cancel the state
transfer/cache view installation.
If we did not have a COMMIT_VIEW command, some nodes would notice the
node leaving and they would cancel the cache view installation, while
some nodes would not notice the leaver and install the new cache view
anyway. E.g. on a join, the joiner may not notice the leaver and may
start handling transactions, even though it did not receive all the
state (and may never receive all the state). If a second node joins,
we need all the nodes to have the same cache view installed in order
to determine who is responsible for sending each "chunk" of state to
the joiner - if we didn't have the same cache view on all nodes, then
some chunks would have two "pushing owners" and some would have none.
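To make the "two pushing owners or none" problem concrete, here is a toy sketch. The name pushingOwner and the modulo scheme are invented for illustration (the real implementation uses the consistent hash, not this); the point is only that each node derives the pusher of a chunk deterministically from its own cache view:

```java
import java.util.List;

// Toy illustration (invented names, not the real consistent hash): each
// node derives the "pushing owner" of a state chunk from its cache view.
public class PushingOwnerSketch {
    static String pushingOwner(int chunkId, List<String> cacheView) {
        return cacheView.get(Math.floorMod(chunkId, cacheView.size()));
    }

    public static void main(String[] args) {
        List<String> agreedView = List.of("A", "B", "C");      // after D left
        List<String> staleView  = List.of("A", "B", "C", "D"); // D not yet noticed
        // a node holding the agreed view expects A to push chunk 3...
        System.out.println(pushingOwner(3, agreedView)); // A
        // ...while a node holding the stale view expects the leaver D to
        // push it, so with mixed views chunk 3 has no live pusher at all
        System.out.println(pushingOwner(3, staleView));  // D
    }
}
```

With mixed views, the joiner waits forever for a chunk nobody sends, which is exactly why the COMMIT_VIEW round is needed to make the view installation all-or-nothing.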
Cheers
Dan