On 6 Apr 2009, at 17:06, Mircea Markus wrote:
Manik Surtani wrote:
>
> On 6 Apr 2009, at 15:26, Mircea Markus wrote:
>
>> Manik Surtani wrote:
>>>
>>> On 3 Apr 2009, at 20:33, Mircea Markus wrote:
>>>
>>>> Hi,
>>>>
>>>> Here are two optimizations that can be implemented in our 2PC
>>>> model:
>>>> 1) if there are only two members in the cluster, use a 1PC (or
>>>> if you only replicate to one buddy, like in buddy replication).
>>>> If the 1st phase fails remotely, then also rollback locally.
>>>> This would reduce one network roundtrip.
>>>
>>> Interesting. I assume with BR you mean DIST where a key is
>>> mapped on to 1 other peer - Infinispan won't support BR as in
>>> JBC. ;-)
>>>
>>> While this is an interesting thought, it does raise the potential
>>> for race conditions - since this decision will have to be taken
>>> in the TxInterceptor in the beforeCompletion phase of a
>>> transaction, and by the time the call gets to the interceptor for
>>> replication, the topology may have changed such that you need to
>>> replicate to 2 instead of 1 other peer. Which would mean a 2PC
>>> again. So it does need some thought.
>> Good point about concurrency. Even so, this is a valid optimization
>> and I think it is worth thinking about.
>
> Yes, of course. I just pointed out one of the challenges that came
> to mind.
>
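To make the race described above concrete, here is a minimal sketch of
one way to guard the 1PC shortcut: pick the protocol from a topology
snapshot in beforeCompletion, then re-check at replication time and fall
back to 2PC if the view changed. All names here (TopologySnapshot,
viewId, choose, revalidate) are hypothetical and are not taken from the
Infinispan or JBoss Cache code base.

```java
class CommitProtocolSketch {
    enum Protocol { ONE_PHASE, TWO_PHASE }

    /** Hypothetical immutable view of the cluster at one instant. */
    static final class TopologySnapshot {
        final long viewId;          // assumed to change on every membership change
        final int remotePeerCount;  // peers this transaction must replicate to

        TopologySnapshot(long viewId, int remotePeerCount) {
            this.viewId = viewId;
            this.remotePeerCount = remotePeerCount;
        }
    }

    /** Decision taken in beforeCompletion: 1PC only with exactly one remote peer. */
    static Protocol choose(TopologySnapshot atBeforeCompletion) {
        return atBeforeCompletion.remotePeerCount == 1
                ? Protocol.ONE_PHASE
                : Protocol.TWO_PHASE;
    }

    /**
     * Re-check at replication time. If 1PC was chosen but the topology
     * has changed since, fall back to 2PC.
     */
    static Protocol revalidate(Protocol chosen,
                               TopologySnapshot atBeforeCompletion,
                               TopologySnapshot atReplication) {
        boolean changed = atBeforeCompletion.viewId != atReplication.viewId
                || atReplication.remotePeerCount != 1;
        if (chosen == Protocol.ONE_PHASE && changed) {
            return Protocol.TWO_PHASE;
        }
        return chosen;
    }
}
```

This only narrows the window; the topology could still change after the
re-check, so a real implementation would need something stronger.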
>>>> 2) when asked to prepare, a participant might return a value
>>>> indicating that no changes were made (read-only participant), so
>>>> this one won't need a commit message, saving a roundtrip.
>>>
>>> No, prepares only contain modifications. Read commands don't
>>> get added to a prepare,
>> I know that :)
>>> and if a prepare doesn't contain any writes, it isn't broadcast,
>>> economising on the network call.
>> e.g. a bunch of remove() operations on keys that do not exist
>> would result in a "read only" participant. There are other ops that
>> might not modify the remote node, e.g. putIfAbsent, replace(k,v)
>
> Yes, but we need to be sure the behaviour of all of these commands
> is the same cluster-wide. And it may not be. E.g., a remove()
> may be a no-op on one node due to eviction, but on the neighbour it
> may actually remove something. Same with pIA() and replace().
The optimization I described refers to an individual node. If a node
responds with "READ_ONLY" to a prepare message, then we won't have
to send a commit/rollback message to that node only, regardless of
the way other nodes responded.

Ah ok, I see what you mean. So instead of responding to a prepare with
a boolean success flag, respond with a status code. Yeah, makes
a lot of sense.
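
As a rough sketch of the status-code idea (not existing code): the
prepare response becomes an enum instead of a boolean, and the
coordinator skips the second-phase message for any node that answered
READ_ONLY. The enum values other than READ_ONLY, and the names
PrepareStatus, shouldCommit and secondPhaseTargets, are assumptions made
for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class PrepareStatusSketch {
    /** Hypothetical replacement for the boolean prepare response. */
    enum PrepareStatus { PREPARED, READ_ONLY, FAILED }

    /** The transaction commits only if no participant reported a failure. */
    static boolean shouldCommit(Map<String, PrepareStatus> responses) {
        return !responses.containsValue(PrepareStatus.FAILED);
    }

    /**
     * Nodes that still need a commit or rollback message. A node that
     * answered READ_ONLY is skipped, whatever the other nodes answered.
     */
    static List<String> secondPhaseTargets(Map<String, PrepareStatus> responses) {
        List<String> targets = new ArrayList<>();
        for (Map.Entry<String, PrepareStatus> e : responses.entrySet()) {
            if (e.getValue() != PrepareStatus.READ_ONLY) {
                targets.add(e.getKey());
            }
        }
        return targets;
    }
}
```

The decision is per node, so it does not depend on remove(), pIA() or
replace() behaving the same way cluster-wide.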
Cheers
--
Manik Surtani
manik(a)jboss.org
Lead, JBoss Cache
http://www.jbosscache.org