On 29/07/14 23:35, Sanne Grinovero wrote:
> The strategy I've proposed is only to be applied to the communication
> from the primary owner to its backups: the value to be written is well
> known, as it's the primary owner which defines it unilaterally (for
> example, if there is an atomic replacement to be computed), and there
> is no need for extra RPCs, as the sequence is not related to a group
> of changes but to the specific entry only.
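A minimal sketch of what a backup owner could do with such per-entry sequences (all names here are hypothetical, not actual Infinispan API): apply an update only when its sequence number is the next expected one for that key, buffering out-of-order arrivals, with no locks and no extra RPCs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: per-entry ordering on a backup owner.
// The primary owner assigns an increasing seqno per key; the backup
// applies updates in seqno order, buffering any that arrive early.
class BackupEntrySequencer {
    private final Map<String, Long> nextSeq = new HashMap<>();  // next expected seqno per key
    private final Map<String, TreeMap<Long, String>> pending = new HashMap<>(); // buffered out-of-order updates
    private final Map<String, String> store = new HashMap<>();  // the backup's data container

    // Called for every replicated write (key, value, seqno assigned by the primary).
    void onReplicatedWrite(String key, String value, long seq) {
        long expected = nextSeq.getOrDefault(key, 0L);
        if (seq < expected) {
            return; // stale or duplicate: already applied
        }
        TreeMap<Long, String> buf = pending.computeIfAbsent(key, k -> new TreeMap<>());
        buf.put(seq, value);
        // Drain every buffered update that is now in order.
        while (!buf.isEmpty() && buf.firstKey() == expected) {
            store.put(key, buf.pollFirstEntry().getValue());
            expected++;
        }
        nextSeq.put(key, expected);
    }

    String get(String key) { return store.get(key); }
}
```

Note that this only orders the writes of a single key; it deliberately says nothing about cross-key transaction isolation.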
How would this work with TXs involving multiple keys on different
primary owners? Each owner replicates with seqnos to the backup owners,
so changes to single keys are received in order, but do we (need to)
guarantee that TX consistency is preserved? In other words, do we
preserve isolation: are all changes of a TX observed at the same logical
time, across multiple backup owners?
> There is no such thing as a need for consensus across owners, nor a
> need for a central source of sequences.
>
> Also, I don't see this as an alternative to TOA; rather, I expect the
> two to work nicely together: when TOA is enabled you could trust the
> originating sequence source rather than generate a per-entry sequence,
> and in neither case do you need to actually use a Lock.
>
> I haven't thought through how the sequences would need to interact (if
> they need to at all), but they seem complementary, resolving different
> aspects, and both benefit from the same cleanup and basic structure.
>>> Another aspect is that the "user thread" on the primary owner needs
>>> to wait (at least until we improve further) and only proceed after
>>> ACK from backup nodes, but this is better modelled through a state
>>> machine. (Also discussed in Farnborough.)
>>
>> To be clear, I don't think keeping the user thread on the originator
>> blocked until we have the write confirmations from all the backups is
>> a problem - a sync operation has to block, and it also serves to
>> rate-limit user operations.
> There are better ways to rate-limit than to make all operations slow;
> we don't need to block a thread, we need to react to the reply from
> the backup owners.
Agreed. I think Dan mentioned it as a side effect.

You still have an inherent rate limit in the outgoing packet queues:
if these fill up, then (and only then) it's good to introduce some back
pressure.
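One way to "react to the reply" without parking a thread (a sketch under my own naming, not Infinispan code) is to have the network receiver thread complete a future when the last backup ACK arrives; sync callers can still block on it, but nothing inside the interceptor stack has to wait:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: instead of blocking the user thread until all
// backup ACKs arrive, complete a future from the receiver thread when
// the last ACK comes in. Callers that want sync semantics can join();
// everyone else chains a callback.
class BackupAckCollector {
    private final AtomicInteger remaining;
    private final CompletableFuture<Void> done = new CompletableFuture<>();

    BackupAckCollector(int numBackups) {
        this.remaining = new AtomicInteger(numBackups);
    }

    // Invoked once per backup ACK, from the network receiver thread.
    void onAck() {
        if (remaining.decrementAndGet() == 0) {
            done.complete(null); // last ACK: release waiters / run callbacks
        }
    }

    CompletableFuture<Void> future() { return done; }
}
```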
>> The problem appears when the originator is not the primary owner, and
>> the thread blocking for backup ACKs is from the remote-executor pool
>> (or OOB, when the remote-executor pool is exhausted).
> Not following. I guess this is out of scope now that I clarified the
> proposed solution is only to be applied between primary and backups?
>>> It's also conceptually linked to:
>>> - https://issues.jboss.org/browse/ISPN-1599
>>> As you need to separate the locks of entries from the effective
>>> user-facing lock, at least to implement transactions on top of this
>>> model.
>>
>> I think we fixed ISPN-1599 when we changed passivation to use
>> DataContainer.compute(). WDYT Pedro, is there anything else you'd
>> like to do in the scope of ISPN-1599?
>>
>>> I expect this to improve performance in a very significant way, but
>>> it's getting embarrassing that it's still not done; at the next
>>> face-to-face meeting we should also reserve some time for
>>> retrospective sessions.
>>
>> Implementing the state machine-based interceptor stack may give us a
>> performance boost, but I'm much more certain that it's a very
>> complex, high-risk task... and we don't have a stable test suite
>> yet :)
> Cleaning up and removing some complexity, such as
> TooManyExecutorsException, might help to get it stable and keep it
> there :)
>
> BTW it was quite stable for me until you changed the JGroups UDP
> default configuration.
>
> Sanne
--
Bela Ban, JGroups lead (http://www.jgroups.org)