[infinispan-dev] API behaviour when using DIST

Mircea Markus mircea.markus at jboss.com
Mon Apr 27 16:41:39 EDT 2009


Manik Surtani wrote:
> Ok, so this is my final, final solution.  :-)
>
> Both sync and async are supported in DIST.  DIST will always do a get 
> before a write, and this call will be synchronous.  (the actual write 
> command will be executed remotely either sync or async based on cache 
> mode).
>
> This will ensure proper return values as well as proper behaviour of 
> conditional invocations such as putIfAbsent, replace, etc.
>
> I will still hang on to the <unsafe unreliableReturnValues /> option 
> though, which will apply to both sync and async mode, which will allow 
> for skipping the get-before-write, but only where this is unnecessary 
> for accurate functioning of a method.
What about also allowing this on a per-invocation basis, through a Flag?
>
> I.e., even if we set unreliableReturnValues to true, a 
> get-before-write will *still* be performed in the following case: 
> performing a conditional command (putIfAbsent, replace x 2, 
> conditional remove) within a tx.
>
> This provides total correctness with regards to all method operations 
> in all cases, and only optionally breaking the return value contract 
> and nothing else.
+1
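
To check that I read the rule correctly, here is a minimal sketch of the decision in plain Java. The class, enum and method names are made up for illustration and are not Infinispan API; only the rule itself (always get before write, unless unreliableReturnValues is set and the call is not a conditional command inside a tx) comes from the description above.

```java
import java.util.EnumSet;

/** Hypothetical model of the get-before-write rule; not Infinispan code. */
final class GetBeforeWritePolicy {
    /** Write operations; the last four are the conditional ones named in the thread. */
    enum Op { PUT, REMOVE, PUT_IF_ABSENT, REPLACE, REPLACE_IF_EQUALS, REMOVE_IF_EQUALS }

    private static final EnumSet<Op> CONDITIONAL = EnumSet.of(
            Op.PUT_IF_ABSENT, Op.REPLACE, Op.REPLACE_IF_EQUALS, Op.REMOVE_IF_EQUALS);

    /**
     * Returns true when a synchronous remote get must precede the write.
     * unreliableReturnValues stands in for the <unsafe unreliableReturnValues />
     * option (assumption: modelled here as a plain boolean).
     */
    static boolean needsRemoteGet(Op op, boolean inTx, boolean unreliableReturnValues) {
        if (!unreliableReturnValues) {
            return true; // default: always get before write
        }
        // Option set: skip the get, except for conditional commands within a tx.
        return inTx && CONDITIONAL.contains(op);
    }
}
```

A per-invocation Flag could feed the same boolean, so a single call could opt out of the get without changing the cache-wide setting.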
>
> On 24 Apr 2009, at 10:35, Manik Surtani wrote:
>
>>
>> On 24 Apr 2009, at 06:52, Bela Ban wrote:
>>
>>> Whatever you do, I think quick'n'dirty fire&forget calls are 
>>> important: a user might want to push data into the grid at a very 
>>> high rate and does not care about return values (for now).
>>>
>>> Maybe an additional API ?
>>>
>>> void putAsync(key, value);
>>>
>>>
>>> Or a batch API:
>>>
>>> void putBatch(Map<K,V> data);
>>
>> Well, putAll(Map<K, V> data) behaves like this since it returns void...
>>
>>
>>>
>>>
>>> ?
>>>
>>>
>>>
>>> Manik Surtani wrote:
>>>> Thinking about this a bit more (and implementing tests, etc for 
>>>> this) I think even for the async case we need to do the remote get 
>>>> first (and as a side-effect this would provide reliable return 
>>>> values).  The reason is that not doing this causes txs to behave 
>>>> very weirdly, and they would need a lot of hacks to behave cleanly 
>>>> without doing the eager get.
>>>>
>>>> Take this example (assuming dist-async)
>>>>
>>>> 1.  tx.begin
>>>> 2.  putIfAbsent(k) // k exists elsewhere
>>>> 3.  get(k) // this will return the OLD val of k, since the tx hasn't 
>>>> completed and the owners haven't seen the WriteCommand in 2 yet!
>>>>
>>>> We could hack this to make a record of commands that will be 
>>>> executed later, but in the case of conditional writes (like 
>>>> putIfAbsent) we don't know if they will succeed.  So we could do a 
>>>> get first as well, but in this case then we may as well stick with 
>>>> a get-before-write approach and thereby provide reliable retvals.
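
A rough sketch of the get-before-write approach inside a tx, in plain Java. TxSketch and its fields are hypothetical stand-ins (a HashMap plays the remote owners), so this only illustrates the idea and is not how the real interceptors are written.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tx-local view that does an eager get before a conditional write. */
final class TxSketch {
    private final Map<String, String> owners;                          // stands in for the remote owners
    private final Map<String, String> txView = new HashMap<>();        // values read or written in this tx
    private final Map<String, String> pendingWrites = new HashMap<>(); // writes to send at commit

    TxSketch(Map<String, String> owners) {
        this.owners = owners;
    }

    /** Reads from the tx view first, falling back to a (simulated) remote get. */
    String get(String key) {
        return txView.computeIfAbsent(key, owners::get);
    }

    /** The eager get lets the condition be evaluated locally, so the retval is reliable. */
    String putIfAbsent(String key, String value) {
        String current = get(key); // the synchronous get-before-write
        if (current == null) {
            txView.put(key, value);
            pendingWrites.put(key, value);
        }
        return current;
    }

    /** Simplified commit: push the recorded writes to the owners in one step. */
    void commit() {
        owners.putAll(pendingWrites);
    }
}
```

With k already present on the owners, putIfAbsent returns the existing value and a later get(k) in the same tx agrees with it, without any guessing about whether the conditional write will succeed.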
>>>>
>>>> The actual commit would still be 1-phase and async though.
>>>>
>>>> WDYT?
>>>>
>>>> Also, for consistency, I suggest the same for non-tx writes.  This 
>>>> then has the added benefit of removing the weird unsafe flag, 
>>>> etc.  Ok, so it means doing a remote get before a put, but the cost 
>>>> of this is mitigated because a) it is unicast to a small set of 
>>>> servers b) the RPC call returns as soon as we get the first valid 
>>>> response and c) with MVCC, the get is very quick - no locking 
>>>> needed on the remote end.
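
Point b) can be illustrated with java.util.concurrent: ExecutorService.invokeAny returns the result of the first task that completes successfully and cancels the rest. This is only an analogy, assuming each owner is modelled as a Callable; FirstResponseGet and getFromAnyOwner are hypothetical names, and the actual RPC layer is not shown.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical illustration of "return as soon as the first valid response arrives". */
final class FirstResponseGet {
    /** Asks every owner in parallel and returns the first successful answer. */
    static <V> V getFromAnyOwner(List<Callable<V>> owners) {
        ExecutorService pool = Executors.newFixedThreadPool(owners.size());
        try {
            // invokeAny skips tasks that throw and cancels the ones still running.
            return pool.invokeAny(owners);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for an owner", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("no owner returned a valid response", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A slow or failing owner therefore does not delay the caller as long as one owner answers.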
>>>>
>>>> Comments?
>>>>
>>>>
>>>> On 21 Apr 2009, at 18:24, Manik Surtani wrote:
>>>>
>>>>>
>>>>> On 21 Apr 2009, at 18:18, Mircea Markus wrote:
>>>>>
>>>>>> Manik Surtani wrote:
>>>>>>>
>>>>>>> On 21 Apr 2009, at 18:01, Mircea Markus wrote:
>>>>>>>
>>>>>>>> yes, indeed. that's what cache.retrieve("key1", "key2", 
>>>>>>>> "keyn"...); would do, fetch all the remote values at once 
>>>>>>>> (multiple keys mapped to one node will result in one aggregated 
>>>>>>>> get).
>>>>>>>>> Sounds pretty clunky though ...
>>>>>>>> Might be. I think it is easy to grasp though, and can have 
>>>>>>>> significant benefits for clients that know the whole key set they 
>>>>>>>> will manipulate in one session.
>>>>>>>
>>>>>>> But the keys retrieved could still be wiped out.
>>>>>>>
>>>>>>> 1.  start tx
>>>>>>> 2.  retrieve(k1, k2, k3)
>>>>>>> 3.  // go make coffee; other processes changing stuff, which 
>>>>>>> removes keys from the L1, negating the effect of step 2
>>>>>> Isn't that exactly what happens now with read mvcc entries being 
>>>>>> held in context? This won't break either read_committed or 
>>>>>> repeatable_read.
>>>>>
>>>>> Not quite.  The return value is calculated atomically when the 
>>>>> command is performed, even though the old value is cached in 
>>>>> context.  E.g.,  locally,
>>>>>
>>>>> 1.  tx.begin
>>>>> 2.  read K
>>>>> 3.  // go make coffee
>>>>> 4.  replace K.  This command is atomic and the retval is extracted 
>>>>> from the datacontainer as this command is perform()'ed.  So what 
>>>>> this invocation returns is accurate regardless of interleaving 
>>>>> writes between step 2 & 4
>>>>> 5. ...
>>>>>
>>>>>>> 4.  replace(k1, v1) // will return incorrect retval.  Or will 
>>>>>>> need to do a remote get again at this point
>>>>>>> 5.  end tx
>>>>>>>
>>>>>>> -- 
>>>>>>> Manik Surtani
>>>>>>> manik at jboss.org
>>>>>>> Lead, Infinispan
>>>>>>> Lead, JBoss Cache
>>>>>>> http://www.infinispan.org
>>>>>>> http://www.jbosscache.org
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> infinispan-dev at lists.jboss.org
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>> -- 
>>> Bela Ban
>>> Lead JGroups / Clustering Team
>>> JBoss - a division of Red Hat
>>>
>>



