[infinispan-dev] Triangle and ISPN-3918

Dan Berindei dan.berindei at gmail.com
Mon Nov 14 11:06:20 EST 2016


On Mon, Nov 14, 2016 at 4:49 PM, Pedro Ruivo <pedro at infinispan.org> wrote:
>
>
> On 14-11-2016 14:26, Radim Vansa wrote:
>> Hi,
>>
>> I was thinking about ISPN-3918 [1] and I've realized that while this
>> happens in current implementation only rarely during state transfer,
>> with Triangle v4 this could happen more often.
>>
>> Conditional command is always executed on primary owner, and so far
>> during the execution of conditional command (incl. replication to
>> backup-owners) the other commands to the same key were blocking in the
>> locking layer. Triangle v4 removes this blocking, and if in thread T1
>> you do:
>>
>> T1: replace(key, A, B)
>>
>> and in second thread T2
>>
>> T2: replace(key, A, C)
>> T2: get(key)
>>
>> the T2.replace can now fail before the T1.replace (successful) is
>> replicated to backup owner. When T2 is, by chance, the backup owner, the
>> T2.replace completes with false, and the T2.get will be served locally
>> and will still return A.
>>
>> We should decide if this is an issue, and either close ISPN-3918 (not a
>> bug) or think about triangle routing of unsuccessful commands.
>
> well... I think we could send the unsuccessful ack in FIFO order(*1). In
> this way, it would force the backup owner to process the T1 operation
> before processing the ack. get() will then return the correct value.
>
> *1 or send only in FIFO when the backup owner is the originator and the
> command is unsuccessful.
> *1 or merge the ack command + backup-write command and send them in FIFO
>

Merging sounds like it would send too much extra stuff to the
originator. Sending only the ack command to the originator when it's
also a backup owner (and making it FIFO) sounds much better :)

OTOH, having T2 run on the backup owner guarantees that get() will
look up the key locally, but there is a chance of the same stale read
when T2 runs on any non-owner, if its get() is served by the backup
owner. So I don't think making the ack command FIFO would really solve
the problem.
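
To make the interleaving concrete, here is a minimal sketch of the
scenario, assuming a toy model in which two maps stand in for the
primary and backup owners and a queue stands in for the backup writes
still in flight. All names here (FifoAckSketch, replace, run) are made
up for illustration and are not Infinispan APIs. It covers only the
case where the originator is the backup owner.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model only: none of these names exist in Infinispan.
class FifoAckSketch {
    final Map<String, String> primary = new HashMap<>();
    final Map<String, String> backup = new HashMap<>();
    // Backup writes sent by the primary but not yet applied on the backup.
    final Queue<Runnable> toBackup = new ArrayDeque<>();

    // Conditional replace, always decided on the primary owner.
    boolean replace(String key, String expected, String update, boolean fifoAck) {
        if (!expected.equals(primary.get(key))) {
            // With a FIFO ack, the failure ack is ordered behind the pending
            // backup writes, so the backup applies them before it sees the ack.
            if (fifoAck) {
                while (!toBackup.isEmpty()) toBackup.poll().run();
            }
            return false;
        }
        primary.put(key, update);
        toBackup.add(() -> backup.put(key, update));
        return true;
    }

    // Runs T1.replace, T2.replace, then T2.get served locally on the backup.
    static String run(boolean fifoAck) {
        FifoAckSketch c = new FifoAckSketch();
        c.primary.put("key", "A");
        c.backup.put("key", "A");
        c.replace("key", "A", "B", fifoAck); // T1: succeeds on the primary
        c.replace("key", "A", "C", fifoAck); // T2: fails, primary already has B
        return c.backup.get("key");          // T2: local get on the backup owner
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // unordered ack: prints A
        System.out.println(run(true));  // FIFO ack: prints B
    }
}
```

In this model the FIFO ack only helps because the read and the ack
arrive at the same node.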

The more I think about it, the more it seems this bug is just another
example of distributed caches not having session consistency [1]. The
fact is that distributed caches allow read operations to return the
values of concurrent writes in any order, and having one of those
reads be also a write muddies the water a bit, but doesn't really
change anything. (Except, of course, that the implementation in master
preserves the order most of the time.)

I vote to close ISPN-3918, but I would like to open another issue (or
reuse this one) to add a "force consistent read"
operation/flag/configuration that would force the cache to read the
value from the primary owner. We've been talking about this a lot, and
at the very least we need to have the option so we know whether users
actually choose it.
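
As a rough illustration of what such an option would mean, here is a
minimal sketch, again assuming a toy two-map model rather than the real
cache. The ReadMode enum and its FORCE_PRIMARY value are hypothetical
names invented for this example; no such flag is claimed to exist.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: ReadMode and FORCE_PRIMARY are hypothetical names.
class ForcedPrimaryReadSketch {
    enum ReadMode { ANY_OWNER, FORCE_PRIMARY }

    final Map<String, String> primary = new HashMap<>();
    final Map<String, String> backup = new HashMap<>();

    // ANY_OWNER is modelled as the worst case: the read lands on the backup.
    String get(String key, ReadMode mode) {
        Map<String, String> source = (mode == ReadMode.FORCE_PRIMARY) ? primary : backup;
        return source.get(key);
    }

    // State right after T1.replace(key, A, B) succeeded on the primary,
    // while the backup write is still in flight.
    static String[] demo() {
        ForcedPrimaryReadSketch c = new ForcedPrimaryReadSketch();
        c.primary.put("key", "B");
        c.backup.put("key", "A");
        return new String[] {
            c.get("key", ReadMode.ANY_OWNER),
            c.get("key", ReadMode.FORCE_PRIMARY)
        };
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println(r[0] + " " + r[1]); // prints: A B
    }
}
```

The cost is the obvious one: in this model every forced read goes to a
single node, which is why it makes sense as an opt-in rather than the
default.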

[1]: https://github.com/infinispan/infinispan/wiki/Consistency-guarantees-in-Infinispan#41-non-transactional

