[infinispan-dev] Atomic operations and transactions

Dan Berindei dan.berindei at gmail.com
Tue Jul 5 09:49:48 EDT 2011


On Tue, Jul 5, 2011 at 4:04 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
> 2011/7/5 Dan Berindei <dan.berindei at gmail.com>:
>> On Tue, Jul 5, 2011 at 1:39 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
>>> 2011/7/5 Dan Berindei <dan.berindei at gmail.com>:
>>>> Here is a contrived example:
>>>>
>>>> 1. Start tx Tx1
>>>> 2. cache.get("k") -> "v0"
>>>> 3. cache.replace("k", "v0", "v1")
>>>> 4. cache.get("k") -> ??
>>>>
>>>> With repeatable read and suspend/resume around atomic operations, I
>>>> believe operation 4 would return "v0", and that would be very
>>>> surprising for a new user.
>>>> So I'd rather require explicit suspend/resume calls to make sure
>>>> anyone who uses atomic operations in a transaction understands what
>>>> results he's going to get.
>>>
>>> The problem is that as a use case it makes no sense to use an atomic
>>> operation without evaluating the return value.
>>> So 3) should actually read like:
>>>
>>> 3. boolean done = cache.replace("k", "v0", "v1")
>>> Based on this value the application would branch in some way, so
>>> acquiring locks and waiting for each other is not enough: we can
>>> only support this if write skew checks are enabled, and the full
>>> operation must roll back in the end. That might be one option, but I
>>> really don't like making transactions likely to roll back; I'd
>>> prefer an alternative like a new flag which enforces a "fresh
>>> read", skipping the repeatable read guarantees. Of course this wouldn't
>>> work if we're not actually sending the operations to the key owners,
>>> so suspending the transaction is a much nicer approach from the user
>>> perspective. Though I agree this behaviour should be selectable.
>>>
>>
>> Ok, I'm slowly remembering your arguments... do you think the "fresh
>> read" flag should be available for all operations, or does it make
>> sense to make it an internal flag that only the atomic operations will
>> use?
>>
>> To summarize, with this example:
>> 1. Start tx
>> 2. cache.get("k") -> "v0"
>> 3. cache.replace("k", "v0", "v1")
>> 4. cache.get("k") -> ??
>> 5. Commit tx
>>
>> The options we could support are:
>> a. Tx suspend: no retries, but also no rollback for replace(), and 4)
>> will not see the updated value
>
> might work, but looks like a crazy user experience.
>

We could "support" it by letting the user suspend/resume the tx
manually. Then the only people experiencing this would be the ones who
explicitly requested it.
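
For example, this is roughly what the user would write under a). Just a
sketch using javax.transaction's TransactionManager/Transaction,
obtained here from the AdvancedCache (the names are only illustrative):

   TransactionManager tm = cache.getAdvancedCache().getTransactionManager();

   tm.begin();
   String v = cache.get("k");        // -> "v0", read inside the tx

   Transaction tx = tm.suspend();    // step outside the tx
   try {
      cache.replace("k", v, "v1");   // applied and committed immediately,
                                     // won't be rolled back with the tx
   } finally {
      tm.resume(tx);                 // back inside the tx
   }

   cache.get("k");                   // still "v0": repeatable read in the tx
   tm.commit();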

>> b. Optimistic locking + write skew check: if the key is modified by
>> another tx between 2) and 5), the entire transaction has to be redone
>
> might work as well: since people opted in for "optimistic", they should
> be prepared to experience failures.
> I'm not sure what the "+" stands for, how can you have optimistic
> locking without write skew checks?
>

I was never sure in what situations we do the write skew check, so I
thought I'd mention it to be clear.
Certainly atomic operations would never work without write skew
checks, but I don't think they're required for regular writes.
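
For reference, this is roughly the configuration I have in mind for b),
i.e. the write skew check is a separate switch on top of optimistic
locking (method names are only indicative, the exact configuration API
may look different):

   ConfigurationBuilder builder = new ConfigurationBuilder();
   builder.transaction()
          .lockingMode(LockingMode.OPTIMISTIC);    // optimistic: lock at prepare time
   builder.locking()
          .isolationLevel(IsolationLevel.REPEATABLE_READ)
          .writeSkewCheck(true);                   // fail the prepare if the value
                                                   // changed since it was first read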

>> c. Optimistic locking + write skew check + fresh read: we only have to
>> redo the tx if the key is modified by another tx between 3) and 5)
>
> in this case we're breaking the repeatable read guarantees, so we
> should clarify this very well.
>

Yes, it would definitely need to be documented that atomic operations
always use read_committed if we go this way.
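
To make that concrete, this is the behaviour I'd expect under c)
(illustrative only, the values are made up):

   tm.begin();
   cache.get("k");                   // -> "v0", kept in the tx context (repeatable read)
   // ... another tx commits "k" -> "v0b" in the meantime ...
   cache.replace("k", "v0", "v1");   // -> false: the "fresh read" compares against the
                                     //    committed "v0b", not the "v0" in the tx context
   cache.get("k");                   // -> still "v0": regular reads keep repeatable read
   tm.commit();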

>> d. Pessimistic locking: if the key is modified between 2) and 5), the
>> entire transaction has to be redone
>
> I don't understand what's pessimistic about this? To be pessimistic it
> would attempt to guarantee success by locking at 2): during the get
> operation, before returning the value.
> Also "if they key is modified" implies write skew checks, so how would
> this be different than previous proposals?
> Generally as a user if I'm opting in for a pessimistic lock the only
> exception I'm prepared to handle is a timeout, definitely not a "try
> again, the values changed".
>

Sorry, I meant if the key is modified between 2) and 3): since 3)
would not read the key again, we'd have to check the value in the
prepare phase.
But you're right that it doesn't really make sense; since we're going
to the owner to acquire the lock anyway, we might as well check the
value in one go.

>> e. Pessimistic locking + fresh read: no redo, but decreased throughput
>> because we hold the lock between 3) and 5)
>
> I assume you really mean to do explicit pessimistic locking:
> 1. Start tx
> 2. cache.lock("k");
> 3. cache.get("k") -> "v0"
> 4. cache.replace("k", "v0", "v1") /// -> throw an exception if we
> don't own the lock
> 5. cache.get("k") -> ??
> 6. Commit tx
>

No, I meant the replace call would do the locking itself and hold the
lock until the end of the tx, just like a regular put.

This option would be very similar to what we currently have, since any
write command already acquires a lock on the key. The only difference
would be that the value check would be done on the main data owner,
using the current value of the key instead of the value in the
invocation context.
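
So the flow under e) would be something like this (intended semantics,
not what the code does today):

   tm.begin();
   cache.get("k");                   // -> "v0", no lock acquired
   cache.replace("k", "v0", "v1");   // locks "k" on the main data owner and compares
                                     // against the owner's current value...
   // ... other work in the same tx, "k" stays locked ...
   tm.commit();                      // ...the lock is only released here, like a regular put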

One big gotcha is that we will only update the value in the invocation
context if the replace succeeds, so the user will have to restart the
whole transaction if they want to see the new value. Perhaps we could
warn the user if they do a get(k) after a failed replace(k, v0, v1).
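
To illustrate the gotcha (again with made-up values, assuming option e):

   tm.begin();
   cache.get("k");                   // -> "v0", stored in the invocation context
   // ... another tx commits "k" -> "v2" in the meantime ...
   boolean done = cache.replace("k", "v0", "v1");  // -> false, checked against "v2" on the owner
   cache.get("k");                   // -> still "v0" from the invocation context,
                                     //    not "v2" (and obviously not "v1")
   tm.commit();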

Dan

