[infinispan-dev] API behaviour when using DIST
Mircea Markus
mircea.markus at jboss.com
Mon Apr 27 16:41:39 EDT 2009
Manik Surtani wrote:
> Ok, so this is my final, final solution. :-)
>
> Both sync and async are supported in DIST. DIST will always do a get
> before a write, and this call will be synchronous. (the actual write
> command will be executed remotely either sync or async based on cache
> mode).
>
> This will ensure proper return values as well as proper behaviour of
> conditional invocations such as putIfAbsent, replace, etc.
>
> I will still hang on to the <unsafe unreliableReturnValues /> option
> though, which will apply to both sync and async mode, which will allow
> for skipping the get-before-write, but only where this is unnecessary
> for accurate functioning of a method.
What about also allowing this on a per-invocation basis, through a Flag?
>
> I.e., even if we set unreliableReturnValues to true, a
> get-before-write will *still* be performed in the following case:
> performing a conditional command (putIfAbsent, replace x 2,
> conditional remove) within a tx.
>
> This provides total correctness with regards to all method operations
> in all cases, and only optionally breaking the return value contract
> and nothing else.
+1
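The rule agreed above can be sketched as a small decision function. This is only an illustration under stated assumptions: GetBeforeWritePolicy, WriteKind and needsRemoteGet are hypothetical names, not Infinispan API, and the boolean parameter stands in for the <unsafe unreliableReturnValues /> option.

```java
final class GetBeforeWritePolicy {

    /** Hypothetical classification of write commands (illustrative names only). */
    enum WriteKind {
        PUT, REMOVE, PUT_IF_ABSENT, REPLACE, REPLACE_IF_EQUALS, REMOVE_IF_EQUALS;

        /** Conditional commands: putIfAbsent, both replace variants, conditional remove. */
        boolean isConditional() {
            return this != PUT && this != REMOVE;
        }
    }

    /**
     * Decides whether a synchronous remote get must precede the write.
     * Assumption: this mirrors the rule described in the thread, nothing more.
     */
    static boolean needsRemoteGet(WriteKind kind, boolean inTx, boolean unreliableReturnValues) {
        if (!unreliableReturnValues) {
            return true; // default: always get before write, in sync and async mode
        }
        // With unreliable return values the get is skipped, except for a
        // conditional command inside a tx, where it is still required.
        return kind.isConditional() && inTx;
    }

    public static void main(String[] args) {
        for (WriteKind kind : WriteKind.values()) {
            System.out.println(kind + " in tx, unreliableReturnValues=true -> "
                    + needsRemoteGet(kind, true, true));
        }
    }
}
```

A per-invocation Flag, as suggested above, could feed the same function by overriding the unreliableReturnValues argument for a single call.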
>
> On 24 Apr 2009, at 10:35, Manik Surtani wrote:
>
>>
>> On 24 Apr 2009, at 06:52, Bela Ban wrote:
>>
>>> Whatever you do, I think quick'n'dirty fire&forget calls are
>>> important: a user might want to push data into the grid at a very
>>> high rate and does not care about return values (for now).
>>>
>>> Maybe an additional API ?
>>>
>>> void putAsync(key, value);
>>>
>>>
>>> Or a batch API:
>>>
>>> void putBatch(Map<K,V> data);
>>
>> Well, putAll(Map<K, V> data) behaves like this since it returns void...
>>
>>
>>>
>>>
>>> ?
>>>
>>>
>>>
>>> Manik Surtani wrote:
>>>> Thinking about this a bit more (and implementing tests, etc for
>>>> this) I think even for the async case we need to do the remote get
>>>> first (and as a side-effect this would provide reliable return
>>>> values). The reason is that not doing this causes txs to behave
>>>> very weirdly, and it would need a lot of hacks to behave cleanly
>>>> without doing the eager get.
>>>>
>>>> Take this example (assuming dist-async)
>>>>
>>>> 1. tx.begin
>>>> 2. putIfAbsent(k) // k exists elsewhere
>>>> 3. get(k) // this will return the OLD val of k, since the tx hasn't
>>>> completed and the owners haven't seen the WriteCommand in 2 yet!
>>>>
>>>> We could hack this to make a record of commands that will be
>>>> executed later, but in the case of conditional writes (like
>>>> putIfAbsent) we don't know if they will succeed. So we could do a
>>>> get first as well, but in that case we may as well stick with
>>>> a get-before-write approach and thereby provide reliable retvals.
>>>>
>>>> The actual commit would still be 1-phase and async though.
>>>>
>>>> WDYT?
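A minimal sketch of the eager-get idea from the quoted message, assuming a plain Map stands in for the remote owners and a per-tx context holds fetched and written entries. EagerGetTx and all of its members are hypothetical names for illustration, not Infinispan classes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class EagerGetTx {
    private final Map<String, String> owner;                      // stand-in for the remote owners
    private final Map<String, String> context = new HashMap<>();  // tx-local view of entries
    private final Set<String> dirty = new HashSet<>();            // keys written in this tx

    EagerGetTx(Map<String, String> owner) {
        this.owner = owner;
    }

    /** Fetches the value from the owner once (the eager get) and caches it in the tx context. */
    private String lookup(String key) {
        if (!context.containsKey(key)) {
            context.put(key, owner.get(key));
        }
        return context.get(key);
    }

    /** The condition is evaluated against the fetched value, so the outcome is known locally. */
    String putIfAbsent(String key, String value) {
        String existing = lookup(key);
        if (existing == null) {
            context.put(key, value);
            dirty.add(key);
        }
        return existing;
    }

    /** Reads inside the tx see the tx context, including writes not yet sent to the owner. */
    String get(String key) {
        return lookup(key);
    }

    /** Pushes the writes to the owner in one step; a real commit could do this asynchronously. */
    void commit() {
        for (String key : dirty) {
            owner.put(key, context.get(key));
        }
        dirty.clear();
    }

    public static void main(String[] args) {
        Map<String, String> owner = new HashMap<>();
        owner.put("k", "old");
        EagerGetTx tx = new EagerGetTx(owner);
        System.out.println(tx.putIfAbsent("k", "new")); // k exists elsewhere, so the write is skipped
        System.out.println(tx.get("k"));
        tx.commit();
    }
}
```

Because the condition is resolved when the command is issued, no record of "maybe successful" commands is needed for later reads in the same tx.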
>>>>
>>>> Also, for consistency, I suggest the same for non-tx-writes. This
>>>> then has the added benefit of removing the weird unsafe flag,
>>>> etc. Ok, so it means doing a remote get before a put, but the cost
>>>> of this is mitigated because a) it is unicast to a small set of
>>>> servers b) the RPC call returns as soon as we get the first valid
>>>> response and c) with MVCC, the get is very quick - no locking
>>>> needed on the remote end.
>>>>
>>>> Comments?
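Point b) above, returning as soon as the first valid response arrives, can be sketched with the standard ExecutorService.invokeAny call. This is an illustration only: FirstResponseGet and getFromOwners are hypothetical names, each Callable stands in for a unicast get to one owner, and "valid" is assumed to mean "completed without throwing".

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class FirstResponseGet {

    /**
     * Asks every owner in parallel and returns the first successful answer.
     * The list must be non-empty; if every owner fails, ExecutionException is thrown.
     */
    static <V> V getFromOwners(List<Callable<V>> owners)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(owners.size());
        try {
            // invokeAny returns the result of one task that completed successfully
            // and cancels the tasks that have not completed yet.
            return pool.invokeAny(owners);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        Callable<String> unreachableOwner = () -> {
            throw new IllegalStateException("owner unreachable");
        };
        Callable<String> healthyOwner = () -> "v1";
        System.out.println(getFromOwners(List.of(unreachableOwner, healthyOwner)));
    }
}
```

The caller waits only for the fastest healthy owner, which is why the extra get adds less latency than a full round of responses would.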
>>>>
>>>>
>>>> On 21 Apr 2009, at 18:24, Manik Surtani wrote:
>>>>
>>>>>
>>>>> On 21 Apr 2009, at 18:18, Mircea Markus wrote:
>>>>>
>>>>>> Manik Surtani wrote:
>>>>>>>
>>>>>>> On 21 Apr 2009, at 18:01, Mircea Markus wrote:
>>>>>>>
>>>>>>>> yes, indeed. that's what cache.retrieve("key1", "key2",
>>>>>>>> "keyn"...); would do, fetch all the remote values at once
>>>>>>>> (multiple keys mapped to one node will result in one aggregated
>>>>>>>> get).
>>>>>>>>> Sounds pretty clunky though ...
>>>>>>>> Might be. I think it is easy to grasp though, and can have
>>>>>>>> significant benefits for clients that know all the key set they
>>>>>>>> will manipulate in one session.
>>>>>>>
>>>>>>> But the keys retrieved could still be wiped out.
>>>>>>>
>>>>>>> 1. start tx
>>>>>>> 2. retrieve(k1, k2, k3)
>>>>>>> 3. // go make coffee; other processes changing stuff, which
>>>>>>> removes keys from the L1, negating the effect of step 2
>>>>>> Isn't that exactly what happens now with read mvcc entries being
>>>>>> held in context? This won't break either read_committed or
>>>>>> repeatable_read.
>>>>>
>>>>> Not quite. The return value is calculated atomically when the
>>>>> command is performed, even though the old value is cached in
>>>>> context. E.g., locally,
>>>>>
>>>>> 1. tx.begin
>>>>> 2. read K
>>>>> 3. // go make coffee
>>>>> 4. replace K. This command is atomic and the retval is extracted
>>>>> from the datacontainer as this command is perform()'ed. So what
>>>>> this invocation returns is accurate regardless of interleaving
>>>>> writes between steps 2 and 4
>>>>> 5. ...
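The local case in the steps above can be illustrated with java.util.concurrent.ConcurrentMap, whose replace(key, value) computes its return value atomically at the moment it runs. This is a sketch by analogy: the map only stands in for the data container, and AtomicReplaceDemo is a hypothetical name.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class AtomicReplaceDemo {

    /** Returns { value read in step 2, value returned by replace in step 4 }. */
    static String[] run() {
        ConcurrentMap<String, String> dataContainer = new ConcurrentHashMap<>();
        dataContainer.put("K", "v0");

        String readInTx = dataContainer.get("K");            // step 2: read K
        dataContainer.put("K", "v1");                        // step 3: an interleaving write
        String previous = dataContainer.replace("K", "v2");  // step 4: retval taken at execution time

        return new String[] { readInTx, previous };
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println("read in step 2:      " + result[0]);
        System.out.println("returned by replace: " + result[1]);
    }
}
```

The value returned by replace reflects the interleaving write, not the value cached by the earlier read, which is the behaviour a retrieve() into L1 cannot guarantee.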
>>>>>
>>>>>>> 4. replace(k1, v1) // will return incorrect retval. Or will
>>>>>>> need to do a remote get again at this point
>>>>>>> 5. end tx
>>>>>>>
>>>>>>> --
>>>>>>> Manik Surtani
>>>>>>> manik at jboss.org
>>>>>>> Lead, Infinispan
>>>>>>> Lead, JBoss Cache
>>>>>>> http://www.infinispan.org
>>>>>>> http://www.jbosscache.org
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> infinispan-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>
>>>
>>> --
>>> Bela Ban
>>> Lead JGroups / Clustering Team
>>> JBoss - a division of Red Hat
>>>
>>
>