On Wed, May 2, 2012 at 1:34 PM, Pedro Ruivo <pruivo(a)gsd.inesc-id.pt> wrote:
hi,
On 5/2/12 11:29 AM, Dan Berindei wrote:
> Hi guys
>
> We're getting closer...
>
> On Wed, May 2, 2012 at 10:53 AM, Pedro Ruivo<pruivo(a)gsd.inesc-id.pt> wrote:
>> Hi Dan,
>>
>> comment inline :)
>>
>> Cheers,
>> Pedro
>>
>> On 5/2/12 8:36 AM, Dan Berindei wrote:
>>> Hi Paolo
>>>
>>> On Tue, May 1, 2012 at 8:13 PM, Paolo Romano<romano(a)inesc-id.pt>
wrote:
>>>> Hi Dan,
>>>>
>>>> the easiest way, it seems to me, is to treat the state transfer as a
>>>> special transaction that is TO-broadcast using the sequencer, as you
>>>> have also been suggesting in your email.
>>>>
>>>> I guess that this way you may even get rid of the ST lock, as
>>>> transactions that request a commit after a ST is started will be
>>>> TO-delivered after the "ST transaction", which will:
>>>> a) start transferring state only after having waited for the
>>>> completion of the txs TO-delivered before the ST transaction, and
>>>> b) prevent the thread in charge of managing TO-delivered
>>>> transactions from processing transactions that are TO-delivered after
>>>> the ST transaction, until the ST transaction is finished on that node.
>>>>
>>>> Let me try to clarify this by outlining a possible protocol:
>>>> A) the coordinator TO-broadcasts the PREPARE_VIEW
>>>> B) when this message is TO-delivered at a node n, the thread (say thread
>>>> t) that is responsible for managing incoming TO messages enqueues itself
>>>> on the count-down latches associated with the transactions that are
>>>> being applied by the TO thread pool. This ensures that every node starts
>>>> transferring state only after having applied all the updates of the
>>>> previously TO-delivered txs.
>>>> C) The state transfer is activated at node n by the same thread t
>>>> responsible for processing incoming TO messages. This guarantees that no
>>>> updates will be performed while the state is being transferred.
>>> This part isn't very clear to me - aren't the tx latches necessary
>>> exactly because there may be more than one thread processing incoming
>>> TO messages?
>> The tx latches are used when write skew check is enabled and Infinispan
>> is executing distributed transactions (via XA Resource). But only one
>> thread delivers the messages, and it is only in the TotalOrderInterceptor
>> that it puts the transaction in a thread pool.
>>
> Yeah, obviously you're both right... Mircea explained this to me some
> time ago but I completely forgot about it.
>
>> For example:
>>
>> thread T: delivers the PrepareCommand with Tx, invoking handle() in the
>> CommandAwareRPCDispatcher.
>> T: when Tx arrives at the TotalOrderInterceptor, it uses the latches to
>> ensure the delivery order, and then puts Tx in the thread pool
>> another thread T': starts processing Tx
>> T: returns from handle() and picks up the next transaction
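The delivery flow Pedro describes above can be sketched roughly in Java. This is a hypothetical illustration, not Infinispan's actual code: the class and method names are invented, and a chain of CountDownLatch instances stands in for the real per-transaction latches.

```java
import java.util.concurrent.*;

// Hypothetical sketch: a single TO delivery thread hands each prepare off
// to a worker pool, chaining per-transaction latches so the writes are
// applied in delivery order while the delivery thread moves on.
public class TotalOrderDeliverySketch {

    private final ExecutorService workers = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true); // demo only: let the JVM exit
        return t;
    });

    // latch of the most recently delivered tx; counted down once applied
    private CountDownLatch previousTxApplied = new CountDownLatch(0);

    // Called only by the single TO delivery thread, in total order.
    public CountDownLatch deliverPrepare(Runnable applyTx) {
        final CountDownLatch prev = previousTxApplied;
        final CountDownLatch mine = new CountDownLatch(1);
        previousTxApplied = mine;
        workers.submit(() -> {
            try {
                prev.await();     // wait until the previous tx is applied
                applyTx.run();    // validate/apply this prepare
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                mine.countDown(); // unblock the next tx (or a PREPARE_VIEW)
            }
        });
        return mine;              // delivery thread returns immediately
    }
}
```

Note how the delivery thread never blocks on a prepare here; it only parks a task in the pool and picks up the next message, matching the T / T' interleaving in the example.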
>>
>> What Paolo is suggesting is:
>> 1) sending PrepareView through Sequencer, in synchronous mode
>>
>> thread T: delivers PrepareView, invoking handle()
>> T: PrepareView asks for the latches of the previous transactions
>> T: PrepareView waits on all of them //this blocks thread T, so no
>> transactions are delivered in this phase
>> T: the same thread executes the state transfer, pushing the state //the
>> ApplyState messages should be sent with the OOB flag if possible
>> T: when finished, it returns from handle() //JGroups will pick up the
>> return value and send it back to the coordinator as a response. Paolo
>> named this ACK_VIEW
> Ok, I thought you were using the in-VM communication between the user
> thread and the TO thread because of some limitation of SEQUENCER, but
> I realize now it was just an optimization to avoid receiving responses
> from all the nodes. I think I can change JGroupsTransport to not mark
> sync messages as OOB when totalOrder is also set to true (without
> breaking anything else), so this should indeed work without any
> changes to state transfer itself.
>
Agree :)
>> T: picks up the next transaction
>>
>> When JGroups receives all the responses, it will unblock the synchronous
>> remote invocation and then the coordinator sends the Commit/Rollback View
>>
>> In my opinion, this should work.
>>
> I think this will properly block prepare commands, but not necessarily
> commit/rollback commands for txs that have already been prepared -
> since they could be OOB.
>
> Am I missing something again?
>
Yes, that's the point. We don't want to block the commits or rollbacks,
otherwise the previously delivered prepares (transactions) will never
finish. And we want them to finish before we start sending the state.
Right?
Yeah, I saw that once I looked closer at the code of
ParallelTotalOrderManager. I guess I'm still thinking in terms of
non-TO transactions; I was expecting the commit command to write
something to the data container :)
This means we don't really have an option about the state transfer
lock - we can't acquire it for commits on the originator, like we do
without TO, or the remote prepares already in progress would never
finish. I guess I got lucky, because I never saw this happening in the
test suite :)
One question I haven't been able to figure out is about the unpaired
prepares and commits that arrive at a joiner before the PREPARE_VIEW
command. (Unpaired because the joiner didn't receive the corresponding
commit/prepare.) I think this could even happen for commit commands
that arrive after PREPARE_VIEW. What happens to those?
Off-topic, in non-TO synchronous caches we have a problem when
queueing is enabled in the OOB thread pool: "active" prepare commands
waiting for queued commit commands, leading to a deadlock. I was
thinking you may have the same problem with your thread pool, but it
looks like you avoid the issue because the commits use JGroups'
regular/OOB thread pool instead of the TO one. I wonder if we could
make our 2PC code use two separate thread pools for prepare and commit
as well...
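The two-pool idea can be demonstrated with a small sketch (class and method names invented for illustration): if a prepare occupies a worker thread while waiting for its commit, putting commits in a separate pool guarantees the commit still gets a thread to run on.

```java
import java.util.concurrent.*;

// Hypothetical sketch of the two-pool idea: prepares and commits go to
// separate pools, so a commit can never be queued behind the very prepare
// that is blocked waiting for it.
public class TwoPoolDispatcher {

    private static ExecutorService pool(String name) {
        return Executors.newFixedThreadPool(2, r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true); // demo only: let the JVM exit
            return t;
        });
    }

    private final ExecutorService preparePool = pool("prepare-pool");
    private final ExecutorService commitPool = pool("commit-pool");

    // Route each command by type; a prepare blocked in preparePool cannot
    // starve a commit, because commits are served by commitPool.
    public Future<?> dispatch(boolean isCommit, Runnable command) {
        return (isCommit ? commitPool : preparePool).submit(command);
    }
}
```

With a single shared pool of size one the same scenario deadlocks: the queued commit never runs because the prepare holds the only thread, which is exactly the "active prepare waiting for a queued commit" situation described above.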
Cheers
Dan