The update needs to be applied to *all* owners before the call returns
on O. With your strategy, P could apply the update and send the ACK, but
the async backup updates might never be delivered to the Bs; an ACKed
update could then be lost completely.
I'm not saying that these async Bs are impossible, just not in the basic
case: for the default configuration, we need to keep the guarantees.
Radim
On 11/27/2015 10:34 AM, Bela Ban wrote:
Adding to what Radim wrote (below), would the following make sense
(conditions: non-TX, P != O && O != B)?
The lock we acquire on P is actually used to establish an ordering for
updates to the Bs. So this is very similar to SEQUENCER, except that we
have a sequencer (P) *per key*.
Phase 1
-------
- O sends a PUT(x) message to P
Phase 2
-------
- P adds PUT(x) to a queue and returns (freeing the up-thread)
- A thread dequeues PUT(x) and sends an (async) UPDATE message to all Bs
(possible optimization: send updates to the same key sets as batches)
- PUT(x) is applied locally and an ACK is sent back to O
O times out and throws an exception if it doesn't receive the ack from P.
This would reduce the current 4 phases (for the above conditions) to 2,
plus the added latency of processing PUT(x) in the queue. However, we'd
get rid of the put-while-holding-the-lock issue.
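To make the two phases concrete, here is a rough sketch of what P could
look like. Everything in it is made up for illustration (PrimarySketch,
Backup, Put and the method names are not Infinispan or JGroups APIs). It
also simplifies two things: a single worker thread orders the updates for
all keys rather than per key, and the "async UPDATE" to a B is a plain
method call, so it shows the ordering only, not network delivery or
failure handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a queue-based primary owner (P); not Infinispan code.
class PrimarySketch {

    // A queued PUT plus the future that stands in for the ACK sent back to O.
    static final class Put {
        final String key;
        final String value;
        final CompletableFuture<Void> ack = new CompletableFuture<>();
        Put(String key, String value) { this.key = key; this.value = value; }
    }

    // Stand-in for a backup owner (B); a real B would receive a message.
    static class Backup {
        private final List<String> applied = new ArrayList<>();
        synchronized void update(String key, String value) { applied.add(key + "=" + value); }
        synchronized List<String> snapshot() { return new ArrayList<>(applied); }
    }

    private final BlockingQueue<Put> queue = new LinkedBlockingQueue<>();
    private final Map<String, String> store = new ConcurrentHashMap<>();
    private final List<Backup> backups;

    PrimarySketch(List<Backup> backups) {
        this.backups = backups;
        Thread worker = new Thread(this::drain, "primary-worker");
        worker.setDaemon(true);
        worker.start();
    }

    // Arrival of PUT(x) on P: enqueue and return at once, freeing the up-thread.
    CompletableFuture<Void> put(String key, String value) {
        Put p = new Put(key, value);
        queue.add(p);
        return p.ack;
    }

    String get(String key) { return store.get(key); }

    // Phase 2: one consumer, so dequeue order is the order the Bs see.
    private void drain() {
        try {
            while (true) {
                Put p = queue.take();
                for (Backup b : backups) {
                    b.update(p.key, p.value);   // UPDATE to each B (direct call here)
                }
                store.put(p.key, p.value);      // apply locally
                p.ack.complete(null);           // ACK back to O
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        Backup b = new Backup();
        PrimarySketch p = new PrimarySketch(List.of(b));
        p.put("x", "1");
        p.put("x", "2").get(1, TimeUnit.SECONDS); // O waits for the ACK with a timeout
        System.out.println(b.snapshot());         // prints [x=1, x=2]
    }
}
```

No lock is held across the send to the Bs; the queue plus the single
consumer provide the ordering that the lock on P provides today.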
P's updates to the Bs are FIFO ordered, therefore all we need to do is
send the update down into UNICAST3 (or NAKACK2, if we use multicasts)
which guarantees ordering. Subsequent updates are ordered according to
send order. The updates are guaranteed to be retransmitted as long as P
is alive.
If P crashes before returning the ack to O, or while updating the Bs,
then O will time out and throw an exception. And, yes, there can be
inconsistencies, but we're talking about the non-TX case. Perhaps O
could resubmit PUT(x) to the new P.
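The timeout-and-resubmit idea on O might look roughly like this. Again a
sketch with invented names (OriginatorSketch, putWithResubmit,
AckTimeoutException); the send function is a placeholder for whatever
actually ships PUT(x) to a primary and completes a future when the ACK
arrives. It does not deal with the inconsistencies mentioned above, e.g.
an update that the old P already pushed to some of the Bs before crashing.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical sketch of the originator (O) side; not Infinispan code.
class OriginatorSketch {

    // Thrown when no candidate primary ACKs within the timeout.
    static class AckTimeoutException extends RuntimeException {
        AckTimeoutException(String msg) { super(msg); }
    }

    // Tries each candidate primary in order (current P first, then the new P)
    // and returns the name of the one that ACKed.
    static String putWithResubmit(List<String> primaries,
                                  Function<String, CompletableFuture<Void>> send,
                                  long timeoutMs) {
        for (String primary : primaries) {
            try {
                send.apply(primary).get(timeoutMs, TimeUnit.MILLISECONDS);
                return primary;                       // ACK received
            } catch (TimeoutException e) {
                // No ACK in time: assume this P is gone and try the next one.
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        throw new AckTimeoutException("no ACK from any primary");
    }

    // Fake transport: P1 has crashed and never ACKs, P2 ACKs immediately.
    static CompletableFuture<Void> fakeSend(String primary) {
        if (primary.equals("P1")) {
            return new CompletableFuture<>();         // never completed
        }
        return CompletableFuture.completedFuture(null);
    }

    public static void main(String[] args) {
        String acked = putWithResubmit(List.of("P1", "P2"), OriginatorSketch::fakeSend, 50);
        System.out.println("ACKed by " + acked);      // prints ACKed by P2
    }
}
```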
I don't know how this behaves wrt rebalancing: are we flushing pending
updates before installing the new CH?
Thoughts?
> I think that the source of optimization is that once primary decides to
> backup the operation, he can forget about it and unlock the entry. So,
> we don't need any ACK from primary unless it's an exception/noop
> notification (as with conditional ops). If primary waited for ACK from
> backup, we wouldn't save anything.
--
Radim Vansa <rvansa(a)redhat.com>
JBoss Performance Team