Radim, I would contend that the first and foremost guarantee that put() makes is to leave the cache in a consistent state. So we can't just throw an exception and give up, leaving k=v on one owner and k=null on another.

Secondly, put(k, v) being atomic means that it either succeeds, in which case it writes k=v in the cache and returns the previous value, or it fails, in which case it doesn't write k=v in the cache. Returning the wrong previous value is bad, but throwing an exception and leaving k=v in the cache is just as bad, even if all the owners have the same value.

And last, we can't have one node seeing k=null, then k=v, then k=null again, when the only write we did on the cache was a put(k, v). So trying to undo the write would not help.
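
To make the undo problem concrete, here is a rough sketch with a plain Java map standing in for a single owner. The class and method names are hypothetical, and the "rollback" is just a remove(), which is only my assumption about what an undo would look like:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: one map plays the role of one owner, not Infinispan code.
class UndoHistorySketch {
    // Returns the values a reader on this owner would observe over time
    // if a failed put(k, v) were rolled back after it had been applied.
    static List<String> observedHistory() {
        Map<String, String> owner = new HashMap<>();
        List<String> seen = new ArrayList<>();

        seen.add(String.valueOf(owner.get("k"))); // before the put: null
        owner.put("k", "v");                      // the write reaches this owner
        seen.add(String.valueOf(owner.get("k"))); // a reader now sees v
        owner.remove("k");                        // hypothetical undo after the failure
        seen.add(String.valueOf(owner.get("k"))); // the reader sees null again

        return seen;
    }

    public static void main(String[] args) {
        // The only write issued by the application was put(k, v).
        System.out.println(observedHistory()); // prints [null, v, null]
    }
}
```

A reader that happens to look at the right moment sees a value appear and then vanish, although no remove was ever requested by the application.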

In the end, we have to make a compromise, and I think returning the wrong value in some of the cases is a reasonable compromise. Of course, we should document that :)

I also believe ISPN-2956 could be fixed so that HotRod behaves just like embedded mode after the ISPN-3422 fix, by adding a RETRY flag to the HotRod protocol and to the cache itself.

Incidentally, transactional caches have a similar problem when the originator leaves the cluster: ISPN-3421 [1]
And we can't handle transactional caches any better than non-transactional caches until we expose transactions to the HotRod client.

[1] https://issues.jboss.org/browse/ISPN-3421

Cheers
Dan

On Mon, May 12, 2014 at 10:21 AM, Radim Vansa <rvansa@redhat.com> wrote:
Hi,

recently I've stumbled upon a behaviour that is already known (one
instance is [1]) but has not got much attention.

In a non-tx cache, when the primary owner fails after the request has
been replicated to the backup owner, the request is retried in the new
topology. The operation is then executed on the new primary (the
previous backup). The outcome has already been fixed in [2], but the
return value may be wrong. For example, when we do a put, the return
value of the second attempt will be the value that was just inserted
(although the entry was only just created). The same situation may
happen for other operations.
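
A minimal sketch of the scenario above, using plain Java maps in place of the owners' data containers. The class and method names are made up for illustration and are not Infinispan API; the "crash" is simulated by simply discarding the primary's state:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a non-tx put with two owners; not Infinispan code.
class RetriedPutSketch {
    // Simulates put(key, value) on a previously empty cache where the primary
    // crashes after replicating to the backup but before replying, so the
    // originator retries in the new topology. Returns what the caller gets back.
    static String putWithPrimaryFailure(String key, String value) {
        Map<String, String> primary = new HashMap<>();
        Map<String, String> backup = new HashMap<>();

        // First attempt: the primary applies the write and replicates it.
        String correctPrevious = primary.put(key, value); // null, entry is new
        backup.put(key, value);

        // The primary crashes here; correctPrevious never reaches the caller.
        primary.clear();

        // Retry: the old backup is now the primary. It already holds key=value,
        // so the "previous value" it reports is the value we just wrote.
        return backup.put(key, value);
    }

    public static void main(String[] args) {
        String returned = putWithPrimaryFailure("k", "v");
        // The entry was just created, so the contract says null; we get "v".
        System.out.println("returned previous value: " + returned);
    }
}
```

The cache ends up in the right state (k=v on the surviving owner), but the caller is told the previous value was v rather than null.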

Currently, it's not possible to return the correct value (because it has
already been overwritten and we don't keep a history of values), but
shouldn't we rather throw an exception if we were not able to fulfil the
API contract?

Radim

[1] https://issues.jboss.org/browse/ISPN-2956
[2] https://issues.jboss.org/browse/ISPN-3422

--
Radim Vansa <rvansa@redhat.com>
JBoss DataGrid QA
