On 29/07/14 23:35, Sanne Grinovero wrote:
> The strategy I've proposed is only to be applied to the communication
> from the primary owner to its backups: the value to be written is well
> known, as it's the primary owner which defines it unilaterally (for
> example, if there is an atomic replacement to be computed), and there
> is no need for extra RPCs, as the sequence is not related to a group
> of changes but to the specific entry only.
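A minimal sketch of what a backup owner could do with such per-entry sequences (all names here are hypothetical, not actual Infinispan API): apply an update only when its sequence number is the next expected one for that key, buffering out-of-order arrivals, with no locks and no extra RPCs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: per-entry ordering on a backup owner.
// The primary owner assigns an increasing seqno per key; the backup
// applies updates in seqno order, buffering any that arrive early.
class BackupEntrySequencer {
    private final Map<String, Long> nextSeq = new HashMap<>();  // next expected seqno per key
    private final Map<String, TreeMap<Long, String>> pending = new HashMap<>(); // buffered out-of-order updates
    private final Map<String, String> store = new HashMap<>();  // the backup's data container

    // Called for every replicated write (key, value, seqno assigned by the primary).
    void onReplicatedWrite(String key, String value, long seq) {
        long expected = nextSeq.getOrDefault(key, 0L);
        if (seq < expected) {
            return; // stale or duplicate: already applied
        }
        TreeMap<Long, String> buf = pending.computeIfAbsent(key, k -> new TreeMap<>());
        buf.put(seq, value);
        // Drain every buffered update that is now in order.
        while (!buf.isEmpty() && buf.firstKey() == expected) {
            store.put(key, buf.pollFirstEntry().getValue());
            expected++;
        }
        nextSeq.put(key, expected);
    }

    String get(String key) { return store.get(key); }
}
```

Note that this only orders the writes of a single key; it deliberately says nothing about cross-key transaction isolation.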
How would this work with TXs involving multiple keys on different
primary owners? Each owner replicates with seqnos to the backup owners,
so changes to single keys are received in order, but do we (need to)
guarantee that TX consistency is preserved? In other words, do we
preserve isolation: are all changes of a TX observed at the same logical
time, across multiple backup owners?
> There is no such thing as a need for consensus across owners, nor a
> need for a central source of sequences.
>
> Also, I don't see this as an alternative to TOA; rather, I expect the
> two to work nicely together: when TOA is enabled you could trust the
> originating sequence source rather than generate a per-entry sequence,
> and in neither case do you need to actually use a Lock.
>
> I haven't thought through how the sequences would need to interact (if
> they need to at all), but they seem complementary, resolving different
> aspects, and both benefit from the same cleanup and basic structure.
>>> Another aspect is that the "user thread" on the primary owner needs
>>> to wait (at least until we improve further) and only proceed after
>>> ACK from backup nodes, but this is better modelled through a state
>>> machine. (Also discussed in Farnborough.)
>>
>> To be clear, I don't think keeping the user thread on the originator
>> blocked until we have the write confirmations from all the backups is
>> a problem - a sync operation has to block, and it also serves to
>> rate-limit user operations.
> There are better ways to rate-limit than to make all operations slow;
> we don't need to block a thread, we need to react to the reply from
> the backup owners.
Agreed. I think Dan mentioned it as a side effect.

You still have an inherent rate limit in the outgoing packet queues:
if these fill up, then (and only then) it's good to introduce some back
pressure.
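One way to "react to the reply" without parking a thread (a sketch under my own naming, not Infinispan code) is to have the network receiver thread complete a future when the last backup ACK arrives; sync callers can still block on it, but nothing inside the interceptor stack has to wait:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: instead of blocking the user thread until all
// backup ACKs arrive, complete a future from the receiver thread when
// the last ACK comes in. Callers that want sync semantics can join();
// everyone else chains a callback.
class BackupAckCollector {
    private final AtomicInteger remaining;
    private final CompletableFuture<Void> done = new CompletableFuture<>();

    BackupAckCollector(int numBackups) {
        this.remaining = new AtomicInteger(numBackups);
    }

    // Invoked once per backup ACK, from the network receiver thread.
    void onAck() {
        if (remaining.decrementAndGet() == 0) {
            done.complete(null); // last ACK: release waiters / run callbacks
        }
    }

    CompletableFuture<Void> future() { return done; }
}
```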
>> The problem appears when the originator is not the primary owner, and
>> the thread blocking for backup ACKs is from the remote-executor pool
>> (or OOB, when the remote-executor pool is exhausted).
> Not following. I guess this is out of scope now that I clarified the
> proposed solution is only to be applied between primary and backups?
>>> It's also conceptually linked to:
>>> - https://issues.jboss.org/browse/ISPN-1599
>>> As you need to separate the locks of entries from the effective
>>> user-facing lock, at least to implement transactions on top of this
>>> model.
>>
>> I think we fixed ISPN-1599 when we changed passivation to use
>> DataContainer.compute(). WDYT Pedro, is there anything else you'd
>> like to do in the scope of ISPN-1599?
>>
>>> I expect this to improve performance in a very significant way, but
>>> it's getting embarrassing that it's still not done; at the next
>>> face-to-face meeting we should also reserve some time for
>>> retrospective sessions.
>>
>> Implementing the state machine-based interceptor stack may give us a
>> performance boost, but I'm much more certain that it's a very
>> complex, high-risk task... and we don't have a stable test suite
>> yet :)
> Cleaning up and removing some complexity, such as
> TooManyExecutorsException, might help to get it stable and keep it
> there :)
>
> BTW it was quite stable for me until you changed the JGroups UDP
> default configuration.
>
> Sanne
--
Bela Ban, JGroups lead (http://www.jgroups.org)