[infinispan-dev] Reliability of return values

Dan Berindei dan.berindei at gmail.com
Wed May 14 03:36:55 EDT 2014


On Tue, May 13, 2014 at 6:40 PM, Radim Vansa <rvansa at redhat.com> wrote:

>  On 05/13/2014 03:58 PM, Dan Berindei wrote:
>
>
>
> On Mon, May 12, 2014 at 1:54 PM, Radim Vansa <rvansa at redhat.com> wrote:
>
>> @Dan: It's absolutely correct to do the further writes in order to make
>> the cache consistent, I am not arguing against that. You've fixed the
>> outcome (state of cache) well. My point was that we should let the user
>> know that the value he gets is not 100% correct when we already know
>> that - and given the API, the only option to do that seems to me as
>> throwing an exception.
>>
>
>  The problem, as I see it, is that users also expect methods that throw
> an exception to *not* modify the cache.
> So we would break some of the users' expectations anyway.
>
>
> When the response from primary owner does not arrive soon, we throw
> timeout exception and the cache is modified anyway, isn't it?
> If we throw ~ReturnValueUnreliableException, the user has at least some
> chance to react. Currently, for code requiring 100% reliable value, you
> can't do anything but ignore the return value, even for CAS operations.
>
>
Yes, but we don't expect the user to handle a TimeoutException in any
meaningful way. Instead, we expect the user to choose his hardware and
configuration to avoid timeouts, if he cares about consistency. How could
you handle an exception that tells you "I may have written the value you
asked me to in the cache, or maybe not. Either way, you will never know
what the previous value was. Muahahaha!" in an application that cares about
consistency?

But the proposed ReturnValueUnreliableException can't be avoided by the
user, it has to be handled every time the cluster membership changes. So it
would be more like WriteSkewException than TimeoutException. And when we
throw a WriteSkewException, we don't write anything to the cache.

Remember, most users do not care about the previous value at all - that's
the reason why JCache and our HotRod client don't return the previous value
by default. Those that do care about the previous value, use the
conditional write operations, and those already work (well, except for the
scenario below). So you would force everyone to handle an exception that
they don't care about.

It would make sense to throw an exception if we didn't return the previous
value by default, and the user requested the return value explicitly. But
we do return the value by default, so I don't think it would be a good idea
for us.


>
>
>>
>> @Sanne: I was not suggesting that for now - sure, value versioning is (I
>> hope) on the roadmap. But that's more complicated, I though just about
>> making an adjustment to the current implementation.
>>
>
>
> Actually, just keeping a history of values would not fix the the return
> value in all cases.
>
>  When retrying a put on the new primary owner, the primary owner would
> still have to compare our value with the latest value, and return the
> previous value if they are equal. So we could have something like this:
>
>  A is the originator, B is the primary owner, k = v0
> A -> B: put(k, v1)
> B dies before writing v, C is now primary owner
> D -> C: put(k, v1) // another put operation from D, with the same value
> C -> D: null
> A -> C: retry_put(k, v1)
> C -> A: v0 // C assumes A is overwriting its own value, so it's returning
> the previous one
>
>  To fix that, we'd need a unique version generated by the originator -
> kind of like a transaction id ;)
>
>
> Is it such a problem to associate unique ID with each write? History
> implementation seems to me like the more complicated part.
>

I also think maintaining a version history would be quite complicated, and
it also would make it harder for users to estimate their cache's memory
usage. That's why I was trying to show that it's not a panacea.



>   And to fix the HotRod use case, the HotRod client would have to be the
> one generating the version.
>
>
> I agree.
>
> Radim
>
>
>
>  Cheers
> Dan
>
>
>
>
>> Radim
>>
>> On 05/12/2014 12:02 PM, Sanne Grinovero wrote:
>> > I don't think we are in a position to decide what is a reasonable
>> > compromise; we can do better.
>> > For example - as Radim suggested - it might seem reasonable to have
>> > the older value around for a little while. We'll need a little bit of
>> > history of values and tombstones anyway for many other reasons.
>> >
>> >
>> > Sanne
>> >
>> > On 12 May 2014 09:37, Dan Berindei <dan.berindei at gmail.com> wrote:
>> >> Radim, I would contend that the first and foremost guarantee that put()
>> >> makes is to leave the cache in a consistent state. So we can't just
>> throw an
>> >> exception and give up, leaving k=v on one owner and k=null on another.
>> >>
>> >> Secondly, put(k, v) being atomic means that it either succeeds, it
>> writes
>> >> k=v in the cache, and it returns the previous value, or it doesn't
>> succeed,
>> >> and it doesn't write k=v in the cache. Returning the wrong previous
>> value is
>> >> bad, but leaving k=v in the cache is just as bad, even if the all the
>> owners
>> >> have the same value.
>> >>
>> >> And last, we can't have one node seeing k=null, then k=v, then k=null
>> again,
>> >> when the only write we did on the cache was a put(k, v). So trying to
>> undo
>> >> the write would not help.
>> >>
>> >> In the end, we have to make a compromise, and I think returning the
>> wrong
>> >> value in some of the cases is a reasonable compromise. Of course, we
>> should
>> >> document that :)
>> >>
>> >> I also believe ISPN-2956 could be fixed so that HotRod behaves just
>> like
>> >> embedded mode after the ISPN-3422 fix, by adding a RETRY flag to the
>> HotRod
>> >> protocol and to the cache itself.
>> >>
>> >> Incidentally, transactional caches have a similar problem when the
>> >> originator leaves the cluster: ISPN-3421 [1]
>> >> And we can't handle transactional caches any better than
>> non-transactional
>> >> caches until we expose transactions to the HotRod client.
>> >>
>> >> [1] https://issues.jboss.org/browse/ISPN-2956
>> >>
>> >> Cheers
>> >> Dan
>> >>
>> >>
>> >>
>> >>
>> >> On Mon, May 12, 2014 at 10:21 AM, Radim Vansa <rvansa at redhat.com>
>> wrote:
>> >>> Hi,
>> >>>
>> >>> recently I've stumbled upon one already expected behaviour (one
>> instance
>> >>> is [1]), but which did not got much attention.
>> >>>
>> >>> In non-tx cache, when the primary owner fails after the request has
>> been
>> >>> replicated to backup owner, the request is retried in the new
>> topology.
>> >>> Then, the operation is executed on the new primary (the previous
>> >>> backup). The outcome has been already fixed in [2], but the return
>> value
>> >>> may be wrong. For example, when we do a put, the return value for the
>> >>> second attempt will be the currently inserted value (although the
>> entry
>> >>> was just created). Same situation may happen for other operations.
>> >>>
>> >>> Currently, it's not possible to return the correct value (because it
>> has
>> >>> already been overwritten and we don't keep a history of values), but
>> >>> shouldn't we rather throw an exception if we were not able to fulfil
>> the
>> >>> API contract?
>> >>>
>> >>> Radim
>> >>>
>> >>> [1] https://issues.jboss.org/browse/ISPN-2956
>> >>> [2] https://issues.jboss.org/browse/ISPN-3422
>> >>>
>> >>> --
>> >>> Radim Vansa <rvansa at redhat.com>
>> >>> JBoss DataGrid QA
>> >>>
>> >>> _______________________________________________
>> >>> infinispan-dev mailing list
>> >>> infinispan-dev at lists.jboss.org
>> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >>
>> >>
>> >> _______________________________________________
>> >> infinispan-dev mailing list
>> >> infinispan-dev at lists.jboss.org
>> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> > _______________________________________________
>> > infinispan-dev mailing list
>> > infinispan-dev at lists.jboss.org
>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>>
>> --
>> Radim Vansa <rvansa at redhat.com>
>> JBoss DataGrid QA
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>
>
>
> _______________________________________________
> infinispan-dev mailing listinfinispan-dev at lists.jboss.orghttps://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
>
> --
> Radim Vansa <rvansa at redhat.com> <rvansa at redhat.com>
> JBoss DataGrid QA
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/infinispan-dev/attachments/20140514/67d139ca/attachment-0001.html 


More information about the infinispan-dev mailing list