You're talking about the case where P applies the PUT, and sends an ACK
back to O, but the async updates to the Bs are received by only a subset
(or none) of the Bs, and then P crashes.
As I was referring to the non-transactional case, wouldn't this be
fine? Or do we want the *non-transactional* case to be an atomic update
of P and all Bs? IMO, the latter should be done as part of a TX, not for
the non-transactional case.
So I think we need to come up with a concise definition of what the
transactional versus non-transactional semantics are.
But even if we go with a design where O waits for ACKs from *all* Bs, we
can still end up with inconsistencies; e.g. when not all Bs received the
updates. O will fail the PUT, but the question is what do we do in such
a case? Re-submit the PUT?
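
To make the two failure cases above concrete, here is a rough sketch in plain Java. It uses no Infinispan or JGroups APIs; Node, putAckFromPrimary, putAckFromAll and the "delivered" set are names made up for illustration, and message loss is simulated simply by listing which Bs receive the update.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class AckLossScenario {

    /** Hypothetical owner node (P or B) holding a plain key/value map. */
    static final class Node {
        final String name;
        final Map<String, String> data = new HashMap<>();
        Node(String name) { this.name = name; }
    }

    /** P applies the PUT and ACKs; only the Bs named in 'delivered' get the async update. */
    static boolean putAckFromPrimary(Node p, List<Node> backups, Set<String> delivered,
                                     String key, String value) {
        p.data.put(key, value);
        for (Node b : backups) {
            if (delivered.contains(b.name)) b.data.put(key, value);
        }
        return true; // O sees success regardless of what reached the Bs
    }

    /** Variant: O reports success only if *all* Bs acknowledged the update. */
    static boolean putAckFromAll(Node p, List<Node> backups, Set<String> delivered,
                                 String key, String value) {
        p.data.put(key, value);
        int acks = 0;
        for (Node b : backups) {
            if (delivered.contains(b.name)) { b.data.put(key, value); acks++; }
        }
        return acks == backups.size();
    }

    /** Distinct values for 'key' on the nodes that survive a crash of P. */
    static Set<String> survivingValues(List<Node> survivors, String key) {
        Set<String> values = new HashSet<>();
        for (Node n : survivors) values.add(n.data.getOrDefault(key, "<absent>"));
        return values;
    }
}
```

With an empty "delivered" set, putAckFromPrimary returns true while no surviving B holds the value once P is gone. With only one of two Bs in "delivered", putAckFromAll returns false, yet the surviving Bs still disagree on the key.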
On 27/11/15 11:12, Radim Vansa wrote:
The update needs to be applied to *all* owners before the call returns
on O. With your strategy, P could apply the update and send the ACK, but
the async backup updates would not be delivered to the Bs; so an ACKed
update would get completely lost.
I'm not saying that these async Bs are impossible, but not in the basic
case: for the default configuration, we need to keep the guarantees.
Radim
On 11/27/2015 10:34 AM, Bela Ban wrote:
> Adding to what Radim wrote (below), would the following make sense
> (conditions: non-TX, P != O && O != B)?
>
> The lock we acquire on P is actually used to establish an ordering for
> updates to the Bs. So this is very similar to SEQUENCER, except that we
> have a sequencer (P) *per key*.
>
> Phase 1
> -------
> - O sends a PUT(x) message to P
>
> Phase 2
> -------
> - P adds PUT(x) to a queue and returns (freeing the up-thread)
> - A thread dequeues PUT(x) and sends an (async) UPDATE message to all Bs
> (possible optimization: send updates to the same key sets as batches)
> - PUT(x) is applied locally and an ACK is sent back to O
>
> O times out and throws an exception if it doesn't receive the ack from P.
>
> This would reduce the current 4 phases (for the above conditions) to 2,
> plus the added latency of processing PUT(x) in the queue. However, we'd
> get rid of the put-while-holding-the-lock issue.
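
The two phases quoted above could be sketched roughly as follows. This is plain java.util.concurrent code, not Infinispan or JGroups; Primary, Backup and originatorPut are hypothetical names, and a direct method call stands in for the async, FIFO-ordered UPDATE message that UNICAST3 or NAKACK2 would carry. The single-threaded executor plays the role of the queue plus the dequeuing thread, so ordering is established without holding a lock across the backup updates.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class PerKeySequencerSketch {

    /** Hypothetical backup owner (B): records updates in arrival order. */
    static final class Backup {
        final Map<String, String> data = new ConcurrentHashMap<>();
        final List<String> applied = new CopyOnWriteArrayList<>();

        void update(String key, String value) {
            data.put(key, value);
            applied.add(key + "=" + value);
        }
    }

    /** Hypothetical primary owner (P): one queue thread orders all updates. */
    static final class Primary {
        final Map<String, String> data = new ConcurrentHashMap<>();
        private final List<Backup> backups;
        // One daemon thread draining a FIFO queue ("a thread dequeues PUT(x)").
        private final ExecutorService queue = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "p-queue");
            t.setDaemon(true);
            return t;
        });

        Primary(List<Backup> backups) { this.backups = backups; }

        /** Enqueues the PUT and returns at once; the future completes when the ACK would be sent. */
        CompletableFuture<Void> put(String key, String value) {
            return CompletableFuture.runAsync(() -> {
                // Assumption: a direct call replaces the async UPDATE message to each B.
                for (Backup b : backups) b.update(key, value);
                data.put(key, value); // apply locally; completing the future is the ACK
            }, queue);
        }
    }

    /** Originator (O): waits for P's ACK and throws if it does not arrive in time. */
    static void originatorPut(Primary p, String key, String value, long timeoutMs) {
        try {
            p.put(key, value).get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            throw new IllegalStateException("no ACK from P for " + key, e);
        } catch (Exception e) {
            throw new IllegalStateException("PUT failed for " + key, e);
        }
    }

    public static void main(String[] args) {
        Backup b1 = new Backup();
        Backup b2 = new Backup();
        Primary p = new Primary(Arrays.asList(b1, b2));
        originatorPut(p, "x", "1", 1000);
        originatorPut(p, "x", "2", 1000);
        System.out.println(b1.applied + " " + b2.applied);
    }
}
```

The sketch uses one queue for all keys; a real per-key sequencer would need a queue (or ordering domain) per key or per key set, which is left out here.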
>
> P's updates to the Bs are FIFO ordered, therefore all we need to do is
> send the update down into UNICAST3 (or NAKACK2, if we use multicasts)
> which guarantees ordering. Subsequent updates are ordered according to
> send order. The updates are guaranteed to be retransmitted as long as P
> is alive.
>
> If P crashes before returning the ack to O, or while updating the Bs,
> then O will time out and throw an exception. And, yes, there can be
> inconsistencies, but we're talking about the non-TX case. Perhaps O
> could resubmit PUT(x) to the new P.
>
> I don't know how this behaves wrt rebalancing: are we flushing pending
> updates before installing the new CH?
>
> Thoughts?
>
>
>> I think that the source of optimization is that once the primary decides to
>> back up the operation, it can forget about it and unlock the entry. So,
>> we don't need any ACK from primary unless it's an exception/noop
>> notification (as with conditional ops). If primary waited for ACK from
>> backup, we wouldn't save anything.
>
--
Bela Ban, JGroups lead (http://www.jgroups.org)