[infinispan-dev] The "Triangle" pattern for reducing Put latency

Bela Ban bban at redhat.com
Fri Nov 27 04:34:23 EST 2015


Adding to what Radim wrote (quoted below), would the following make sense 
(conditions: non-TX, P != O && O != B, where O is the originator, P the 
primary owner and the Bs the backup owners)?

The lock we acquire on P is actually used to establish an ordering for 
updates to the Bs. So this is very similar to SEQUENCER, except that we 
have a sequencer (P) *per key*.
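
To make the sequencer-per-key analogy concrete, here's a minimal sketch 
(plain Java, hypothetical, not Infinispan code): P hands out a 
monotonically increasing sequence number per key, so updates to the same 
key are totally ordered while updates to different keys don't contend.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: the primary (P) acts as a sequencer per key.
public class PerKeySequencer {
    private final Map<Object, AtomicLong> seqnos = new ConcurrentHashMap<>();

    // Returns the next sequence number for the given key; updates to the
    // same key are totally ordered, different keys proceed independently.
    public long nextSeqno(Object key) {
        return seqnos.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }
}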

Phase 1
-------
- O sends a PUT(x) message to P

Phase 2
-------
- P adds PUT(x) to a queue and returns (freeing the up-thread)
- A thread dequeues PUT(x) and sends an (async) UPDATE message to all Bs
   (possible optimization: batch updates that target the same key set)
- PUT(x) is applied locally and an ACK is sent back to O

O times out and throws an exception if it doesn't receive the ack from P.
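
As a sketch of phase 2 on P (hypothetical Java, not Infinispan code; the 
Transport and Update types are assumptions for illustration): the 
up-thread only enqueues, a dedicated thread drains the queue in FIFO 
order, sends the async UPDATE to all Bs, applies the put locally and 
acks back to O.

import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of phase 2 on the primary (P).
public class PrimaryPutHandler implements Runnable {

    interface Transport {                        // assumed transport abstraction
        void sendAsync(String dest, Update u);   // fire-and-forget send to a B
        void ack(String originator, Update u);   // ACK back to O
    }

    record Update(String originator, Object key, Object value) {}

    private final BlockingQueue<Update> queue = new LinkedBlockingQueue<>();
    private final List<String> backups;          // the Bs for this key's segment
    private final Transport transport;

    public PrimaryPutHandler(List<String> backups, Transport transport) {
        this.backups = backups;
        this.transport = transport;
    }

    // Called by the up-thread: enqueue PUT(x) and return immediately.
    public void onPut(Update u) {
        queue.add(u);
    }

    // Run by a dedicated thread: drain the queue in FIFO order.
    public void run() {
        try {
            while (true) {
                Update u = queue.take();
                for (String b : backups)              // async UPDATE to all Bs
                    transport.sendAsync(b, u);
                applyLocally(u);                      // apply PUT(x) on P
                transport.ack(u.originator(), u);     // ACK back to O
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void applyLocally(Update u) { /* write to the local data container */ }
}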

This would reduce the current 4 phases (for the above conditions) to 2, 
plus the added latency of processing PUT(x) in the queue. However, we'd 
get rid of the put-while-holding-the-lock issue.

P's updates to the Bs are FIFO-ordered, so all we need to do is send 
them down to UNICAST3 (or NAKACK2, if we use multicasts), which 
guarantees that ordering: subsequent updates are delivered in send 
order, and they are guaranteed to be retransmitted as long as P is 
alive.
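
To illustrate that last point (a JGroups-3.x-style sketch, not the 
actual Infinispan backup-write path): P just hands each serialized 
update to the channel, per backup, in send order; UNICAST3 in the 
protocol stack takes care of FIFO delivery and retransmission.

import org.jgroups.Address;
import org.jgroups.JChannel;
import org.jgroups.Message;

// Sketch: no application-level ordering or retransmission is needed on P;
// send order equals delivery order for a given (P, B) pair.
public class BackupUpdater {
    private final JChannel channel;

    public BackupUpdater(JChannel channel) {
        this.channel = channel;
    }

    public void sendUpdate(Address backup, byte[] serializedUpdate) throws Exception {
        channel.send(new Message(backup, serializedUpdate));
    }
}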

If P crashes before returning the ack to O, or while updating the Bs, 
then O will time out and throw an exception. And, yes, there can be 
inconsistencies, but we're talking about the non-TX case. Perhaps O 
could resubmit PUT(x) to the new P.
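
If we did resubmit, the originator-side logic could look roughly like 
this (hypothetical sketch; locatePrimary() and sendPut() are assumed 
helpers, not Infinispan APIs). Re-applying the update after a false 
timeout is accepted here, since we're in the non-TX case anyway.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical originator (O) side: on timeout, look up the current
// primary again and resubmit PUT(x) once.
public class OriginatorResubmit {

    interface Cluster {
        String locatePrimary(Object key);    // current P for the key
        CompletableFuture<Void> sendPut(String primary, Object key, Object value);
    }

    private final Cluster cluster;
    private final long timeoutMs;

    public OriginatorResubmit(Cluster cluster, long timeoutMs) {
        this.cluster = cluster;
        this.timeoutMs = timeoutMs;
    }

    public void put(Object key, Object value) throws Exception {
        try {
            cluster.sendPut(cluster.locatePrimary(key), key, value)
                   .get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // P may have crashed before acking: resubmit to the (new) primary.
            cluster.sendPut(cluster.locatePrimary(key), key, value)
                   .get(timeoutMs, TimeUnit.MILLISECONDS);
        }
    }
}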

I don't know how this behaves wrt rebalancing: are we flushing pending 
updates before installing the new CH (consistent hash)?

Thoughts?


> I think that the source of optimization is that once primary decides to
> backup the operation, he can forget about it and unlock the entry. So,
> we don't need any ACK from primary unless it's an exception/noop
> notification (as with conditional ops). If primary waited for ACK from
> backup, we wouldn't save anything.


-- 
Bela Ban, JGroups lead (http://www.jgroups.org)


