Thinking about this a bit more (and implementing tests, etc. for this),
I think even for the async case we need to do the remote get first
(and as a side effect this would provide reliable return values). The
reason is that not doing this causes txs to behave very weirdly, and
they would need a lot of hacks to behave cleanly without the eager get.
Take this example (assuming dist-async):
1. tx.begin
2. putIfAbsent(k) // k exists elsewhere
3. get(k) // this will return the OLD val of k, since the tx hasn't
completed and the owners haven't seen the WriteCommand in 2 yet!
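To make the difference concrete, here is a rough toy model (not Infinispan code). The class EagerGetSketch, the Tx type and the owners map are all made up for illustration: owners stands in for the state held on the remote owners, and writes only reach it on commit. With eagerGet off, putIfAbsent cannot report an outcome and the tx cannot see its own pending write. With eagerGet on, the value is fetched into the tx context first, so the retval and later reads are consistent.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EagerGetSketch {
    // Hypothetical stand-in for the data held by the remote owners of each key.
    static final Map<String, String> owners = new HashMap<>();

    /** Toy transaction: writes are only applied to the owners on commit. */
    static class Tx {
        final boolean eagerGet;
        // Values this tx has fetched or written so far (null means "known absent").
        final Map<String, String> context = new HashMap<>();
        final List<String[]> pending = new ArrayList<>();

        Tx(boolean eagerGet) {
            this.eagerGet = eagerGet;
        }

        String putIfAbsent(String k, String v) {
            if (!eagerGet) {
                pending.add(new String[] {k, v});
                return null; // outcome unknown until the owners run the command
            }
            String existing = lookup(k); // remote get before the write
            if (existing == null) {
                context.put(k, v);
                pending.add(new String[] {k, v});
            }
            return existing;
        }

        String get(String k) {
            // Without the eager get, the read goes straight to the owners,
            // which have not seen this tx's writes yet.
            return eagerGet ? lookup(k) : owners.get(k);
        }

        private String lookup(String k) {
            if (!context.containsKey(k)) {
                context.put(k, owners.get(k)); // simulated remote get
            }
            return context.get(k);
        }

        void commit() {
            for (String[] w : pending) {
                owners.putIfAbsent(w[0], w[1]);
            }
        }
    }

    public static void main(String[] args) {
        owners.put("k", "old"); // k already exists elsewhere

        Tx lazy = new Tx(false);
        System.out.println("lazy  putIfAbsent(k) -> " + lazy.putIfAbsent("k", "new"));
        System.out.println("lazy  putIfAbsent(j) -> " + lazy.putIfAbsent("j", "new"));
        System.out.println("lazy  get(j)         -> " + lazy.get("j"));

        Tx eager = new Tx(true);
        System.out.println("eager putIfAbsent(k) -> " + eager.putIfAbsent("k", "new"));
        System.out.println("eager putIfAbsent(j) -> " + eager.putIfAbsent("j", "new"));
        System.out.println("eager get(j)         -> " + eager.get("j"));
    }
}
```

In the lazy case the tx reads back null for j even though it has just written it. In the eager case it reads back its own write, and putIfAbsent(k) reports the existing value.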
We could hack this to make a record of commands that will be executed
later, but in the case of conditional writes (like putIfAbsent) we
don't know if they will succeed. So we could do a get first as well,
but in that case we may as well stick with a get-before-write
approach and thereby provide reliable retvals.
The actual commit would still be 1-phase and async though.
WDYT?
Also, for consistency, I suggest the same for non-tx writes. This
has the added benefit of removing the weird unsafe flag, etc.
Ok, so it means doing a remote get before a put, but the cost of this
is mitigated because a) it is unicast to a small set of servers, b)
the RPC call returns as soon as we get the first valid response, and c)
with MVCC, the get is very quick: no locking is needed on the remote end.
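As a sketch of point b), something along these lines would return as soon as the first owner answers. This is only an illustration built on java.util.concurrent, not the actual RPC layer: FirstResponseGet and firstValid are made-up names, each Supplier stands in for a unicast get to one owner, and "valid" is assumed here to mean "did not fail".

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class FirstResponseGet {
    /**
     * Asks every owner in parallel and completes with the first response
     * that did not fail. Fails only if all owners fail.
     */
    static <V> CompletableFuture<V> firstValid(List<Supplier<V>> owners) {
        CompletableFuture<V> result = new CompletableFuture<>();
        if (owners.isEmpty()) {
            result.completeExceptionally(new IllegalArgumentException("no owners"));
            return result;
        }
        AtomicInteger failures = new AtomicInteger();
        for (Supplier<V> owner : owners) {
            CompletableFuture.supplyAsync(owner).whenComplete((value, error) -> {
                if (error == null) {
                    result.complete(value); // later responses are ignored
                } else if (failures.incrementAndGet() == owners.size()) {
                    result.completeExceptionally(error);
                }
            });
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical owners: one is down, the others hold the same value.
        List<Supplier<String>> owners = List.of(
                () -> { throw new IllegalStateException("owner down"); },
                () -> "v1",
                () -> "v1");
        System.out.println(firstValid(owners).join());
    }
}
```

The caller only waits for the fastest healthy owner, which is why the extra get costs less than a full round trip to every owner.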
Comments?
On 21 Apr 2009, at 18:24, Manik Surtani wrote:
On 21 Apr 2009, at 18:18, Mircea Markus wrote:
> Manik Surtani wrote:
>>
>> On 21 Apr 2009, at 18:01, Mircea Markus wrote:
>>
>>> yes, indeed. that's what cache.retrieve("key1", "key2",
>>> "keyn"...); would do, fetch all the remote values at once
>>> (multiple keys mapped to one node will result in one aggregated
>>> get).
>>>> Sounds pretty clunky though ...
>>> Might be. I think it is easy to grasp though, and can have
>>> significant benefits for clients that know all the key set they
>>> will manipulate in one session.
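For reference, the aggregation described above (multiple keys mapped to one node resulting in one get) could be sketched roughly as follows. This is purely illustrative: RetrieveGrouping, ownerOf and groupByOwner are made-up names, and the hash-modulo placement is an assumption standing in for whatever the real key-to-node mapping is.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RetrieveGrouping {
    // Hypothetical placement rule: hash of the key modulo the number of nodes.
    static int ownerOf(String key, int numNodes) {
        return Math.floorMod(key.hashCode(), numNodes);
    }

    /** Groups keys by owning node so that each node receives a single aggregated get. */
    static Map<Integer, List<String>> groupByOwner(int numNodes, String... keys) {
        Map<Integer, List<String>> batches = new TreeMap<>();
        for (String key : keys) {
            batches.computeIfAbsent(ownerOf(key, numNodes), n -> new ArrayList<>()).add(key);
        }
        return batches;
    }

    public static void main(String[] args) {
        // One entry per node that owns at least one of the keys.
        groupByOwner(2, "key1", "key2", "keyn").forEach(
                (node, keys) -> System.out.println("node " + node + " <- get" + keys));
    }
}
```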
>>
>> But the keys retrieved could still be wiped out.
>>
>> 1. start tx
>> 2. retrieve(k1, k2, k3)
>> 3. // go make coffee; other processes changing stuff, which
>> removes keys from the L1, negating the effect of step 2
> Isn't that exactly what happens now with read mvcc entries being
> held in context? This won't break either read_committed or
> repeatable_read.
Not quite. The return value is calculated atomically when the
command is performed, even though the old value is cached in
context. E.g., locally,
1. tx.begin
2. read K
3. // go make coffee
4. replace K. This command is atomic and the retval is extracted
from the datacontainer as this command is perform()'ed. So what
this invocation returns is accurate regardless of interleaving
writes between steps 2 & 4.
5. ...
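The local behaviour in steps 2 to 4 can be illustrated with a plain ConcurrentHashMap standing in for the data container (an assumption for the sketch; AtomicReplaceSketch and the txContext map are made-up names, not the real classes). The value cached at read time goes stale, but the retval of replace is taken from the container at the moment the command runs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AtomicReplaceSketch {
    /** Returns {value cached by the earlier read, retval of the later replace}. */
    static String[] replaceAfterInterleavedWrite() {
        ConcurrentHashMap<String, String> dataContainer = new ConcurrentHashMap<>();
        dataContainer.put("K", "v0");

        // Step 2: the tx reads K and keeps the value in its context.
        Map<String, String> txContext = new HashMap<>();
        txContext.put("K", dataContainer.get("K"));

        // Step 3: while we make coffee, another writer changes K.
        dataContainer.put("K", "v1");

        // Step 4: replace is atomic; its retval comes from the container
        // at execution time, not from the value cached in step 2.
        String retval = dataContainer.replace("K", "v2");

        return new String[] {txContext.get("K"), retval};
    }

    public static void main(String[] args) {
        String[] r = replaceAfterInterleavedWrite();
        System.out.println("cached read    = " + r[0]); // v0
        System.out.println("replace retval = " + r[1]); // v1
    }
}
```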
>> 4. replace(k1, v1) // will return incorrect retval. Or will need
>> to do a remote get again at this point
>> 5. end tx
>>
>> --
>> Manik Surtani
>> manik(a)jboss.org
>> Lead, Infinispan
>> Lead, JBoss Cache
>> http://www.infinispan.org
>> http://www.jbosscache.org
>>
>
_______________________________________________
infinispan-dev mailing list
infinispan-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev