[infinispan-dev] XAResource.isSameRM
Mark Little
mlittle at redhat.com
Fri Jan 7 06:51:10 EST 2011
+1
On 6 Jan 2011, at 19:01, Jonathan Halliday wrote:
> On 01/06/2011 05:43 PM, Mircea Markus wrote:
>>
>> On 6 Jan 2011, at 14:45, Jonathan Halliday wrote:
>>
>>> On 01/06/2011 02:29 PM, Mircea Markus wrote:
>>>
>>>> At the moment the *only way to transactionally* access a
>>>> node is by collocating the client and the server in the same
>>>> VM.
>>>
>>> So the scope of the transaction is limited to data
>>> residing in that local node? What if I want a single
>>> transaction to span the local node and data in a remote node?
>> That's possible. You just always have to interact with the
>> local node, which will acquire the remote locks on behalf
>> of your transaction.
>
> ok, so the cluster intelligence is in the local node rather
> than the client, not that there is any significant
> distinction for now as they are co-located.
>
>> a) node goes down before TM issued prepare
>> - when TM resurrects and calls XAResource.recover it
>> receives the given XID, realises that there's a heuristic
>> decision (because it didn't call prepare) and takes some
>> action (rolls back other participants, notifies sys admin?).
>
> That's not a heuristic decision. A RM is perfectly entitled
> to throw away any tx state up until prepare. Under the
> presumed abort doctrine it simply throws an error from
> prepare and the tx aborts cleanly. Recovery is not involved
> - it applies only to tx that have reached prepare.
>
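A minimal sketch of the presumed abort point above, assuming a purely in-memory RM. The class PresumedAbortRm, its method names and the use of plain String transaction ids instead of Xid are all hypothetical; only XAException and XAResource.XA_OK come from javax.transaction.xa. Choosing XAER_NOTA as the error code is also an assumption, any error from prepare makes the TM abort.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;

// Hypothetical in-memory resource manager; tx ids are plain strings for brevity.
class PresumedAbortRm {
    // Volatile working state: may legitimately be lost any time before prepare.
    private final Map<String, Map<String, String>> active = new HashMap<>();
    // Stand-in for a durable log of prepared transactions (survives the simulated crash).
    private final Set<String> preparedLog = new HashSet<>();

    void write(String txId, String key, String value) {
        active.computeIfAbsent(txId, id -> new HashMap<>()).put(key, value);
    }

    int prepare(String txId) throws XAException {
        if (!active.containsKey(txId)) {
            // State was thrown away: fail the prepare so the TM aborts the tx.
            throw new XAException(XAException.XAER_NOTA);
        }
        // A real RM would also have to make the changes themselves durable here.
        preparedLog.add(txId);
        return XAResource.XA_OK;
    }

    // Simulates a node crash: only the volatile state is lost.
    void simulateCrash() {
        active.clear();
    }

    // Only transactions that reached prepare are ever reported to recovery.
    Set<String> recover() {
        return new HashSet<>(preparedLog);
    }
}
```

A tx that never reached prepare fails at prepare after the crash and never shows up in recover(), so no heuristic is involved.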
>> b) node goes down after TM issues prepare
>> - when TM issues a commit it receives an XAException
>> (perhaps XA_HEURRB) and again it is aware of the heuristic
>> outcome
>
> Returning cleanly from a prepare is a promise by the RM to
> successfully apply any subsequent commit. You're not in a
> position to make such a promise unless your state is fault
> tolerant, as a node crash would otherwise leave you with
> inconsistent state.
>
> It's not as simple as saying you'd rollback - what if you
> prepare, get told to commit, apply remote changes (step
> 4.2.1), then crash before applying local changes (4.2.2)?
> You can't report that as a rollback - you applied some of
> the updates. You have to include the tx in the recovery list
> as heuristic hazard (unless NodeA will transparently
> repopulate with the committed data, in which case you can
> mask the failure or report heuristic commit), but how to
> even detect the heuristic at recovery time? NodeA has no
> persistent record of the tx and NodeB thinks it completed
> cleanly and has cleaned up its tx record to avoid leaking.
> Where is the data that tells you you've got a problem?
>
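The detection problem described above can be reproduced in a toy simulation. This is only a sketch: SketchNode and PartialCommitDemo are hypothetical names, nodes keep no persistent state, and the two steps stand in for 'apply remote changes' (4.2.1) and 'apply local changes' (4.2.2).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical cache node: data plus the ids of transactions it still remembers.
class SketchNode {
    final Map<String, String> data = new HashMap<>();
    final Set<String> txRecords = new HashSet<>();

    void apply(String txId, String key, String value) {
        data.put(key, value);
        txRecords.remove(txId); // tx finished cleanly here, record is cleaned up
    }

    // Nothing is persistent, so a crash loses both data and tx records.
    void crash() {
        data.clear();
        txRecords.clear();
    }
}

class PartialCommitDemo {
    // Returns the tx ids that a recovery scan could still find in the cluster.
    static Set<String> run(SketchNode nodeA, SketchNode nodeB) {
        String tx = "tx1";
        nodeA.txRecords.add(tx);
        nodeB.txRecords.add(tx);
        nodeB.apply(tx, "k", "new"); // remote changes applied on NodeB
        nodeA.crash();               // NodeA dies before applying local changes
        Set<String> recoverable = new HashSet<>(nodeA.txRecords);
        recoverable.addAll(nodeB.txRecords);
        return recoverable;
    }
}
```

After run(), NodeB holds the new value, NodeA holds nothing, and the returned set is empty: no surviving record says that tx1 was only partially applied.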
> Or have a more sophisticated scenario where there is an
> additional NodeC, thus requiring multiple 'apply remote
> changes' calls. Are those atomic across the cluster? If
> there is a possibility that NodeB will apply the update but
> NodeC won't, or NodeA will crash after issuing a call to B
> but before C, you can wind up with inconsistent state in the
> surviving B and C. Alternatively, what if A survives but C
> crashes whilst applying changes that B has already
> successfully applied? That's not necessarily a recovery
> situation as far as the TM is concerned, but it may be from
> your perspective as you'll need to detect and (ideally) fix
> or (as a last resort) report the inconsistent data.
>
> A lot of your behaviour is going to depend on what it means
> for a node to recover after a crash. If it simply comes up
> empty and expects to be repopulated from an external source,
> as with a normal cache, then your relation to the XAResource
> of that external source is critical. On the other hand if
> your cluster node is itself fault tolerant through
> replication, then you need to think carefully about how the
> RM functionality ties into that replication - basically the
> tx state information is not local to the node where the
> XAResource resides, but must be replicated in the same
> manner as the other data in that node and that replication
> must be synchronous at certain state transitions in the tx
> lifecycle - it's logging recovery information through RPC
> rather than disk write. Really interesting things are going
> to happen if a single transaction spans a mix of cached
> copies of data stored persistently in an XA database and
> data for which Infinispan is the definitive, fault tolerant
> repository.
>
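The idea of logging recovery information through RPC rather than disk write could look roughly like the sketch below. BackupNode, ReplicatedPrepareLog and storePrepared are hypothetical names, the "RPC" is a plain method call, and the XA_RBCOMMFAIL error code is an assumed choice; this is not how Infinispan does it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;

// Hypothetical backup node that stores prepare records on behalf of a peer.
class BackupNode {
    final Set<String> preparedRecords = new HashSet<>();
    boolean reachable = true;

    // Stand-in for a synchronous RPC; fails if the backup cannot be reached.
    void storePrepared(String txId) {
        if (!reachable) throw new IllegalStateException("backup unreachable");
        preparedRecords.add(txId);
    }
}

class ReplicatedPrepareLog {
    private final List<BackupNode> backups;

    ReplicatedPrepareLog(List<BackupNode> backups) {
        this.backups = new ArrayList<>(backups);
    }

    // Returns XA_OK only once every backup holds the prepare record.
    int prepare(String txId) throws XAException {
        try {
            for (BackupNode b : backups) b.storePrepared(txId);
        } catch (IllegalStateException e) {
            // The record is not fault tolerant, so the commit promise cannot be made.
            for (BackupNode b : backups) {
                if (b.reachable) b.preparedRecords.remove(txId);
            }
            throw new XAException(XAException.XA_RBCOMMFAIL);
        }
        return XAResource.XA_OK;
    }

    // After the owning node crashes, in-doubt transactions are read back from the backups.
    Set<String> recover() {
        Set<String> inDoubt = new HashSet<>();
        for (BackupNode b : backups) inDoubt.addAll(b.preparedRecords);
        return inDoubt;
    }
}
```

The replication is synchronous at the prepare transition only; a fuller version would need the same treatment at commit and when forgetting the tx.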
> To make a cluster appear to the outside world as a single
> logical entity for transaction purposes, you're pretty much
> going to wind up doing interposition. That means you're
> implementing not only an RM but substantial chunks of what
> amounts to a TM too. Have fun.
>
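To illustrate what interposition means here, a rough sketch of a subordinate coordinator that looks like one participant to the parent TM and fans the calls out to the cluster nodes. ClusterParticipant, RecordingParticipant and InterposedCoordinator are hypothetical names, and the sketch leaves out exactly the hard parts: logging its own decision and handling failures during commit.

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;

// Hypothetical per-node participant as seen by the interposed coordinator.
interface ClusterParticipant {
    boolean prepare(String txId); // true = votes to commit
    void commit(String txId);
    void rollback(String txId);
}

// Test double that records the calls it receives and votes as configured.
class RecordingParticipant implements ClusterParticipant {
    final boolean vote;
    final List<String> calls = new ArrayList<>();

    RecordingParticipant(boolean vote) { this.vote = vote; }

    public boolean prepare(String txId) { calls.add("prepare"); return vote; }
    public void commit(String txId) { calls.add("commit"); }
    public void rollback(String txId) { calls.add("rollback"); }
}

// Appears to the parent TM as a single resource, but coordinates many nodes.
class InterposedCoordinator {
    private final List<ClusterParticipant> nodes;

    InterposedCoordinator(List<? extends ClusterParticipant> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    int prepare(String txId) throws XAException {
        for (ClusterParticipant n : nodes) {
            if (!n.prepare(txId)) {
                // One "no" vote aborts the whole branch on every node.
                for (ClusterParticipant m : nodes) m.rollback(txId);
                throw new XAException(XAException.XA_RBROLLBACK);
            }
        }
        return XAResource.XA_OK;
    }

    void commit(String txId) {
        for (ClusterParticipant n : nodes) n.commit(txId);
    }
}
```

The prepare/commit fan-out is a small two-phase commit of its own, which is the sense in which the RM ends up containing chunks of a TM.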
> Jonathan.
>
> --
> ------------------------------------------------------------
> Registered Address: Red Hat UK Ltd, Amberley Place, 107-111
> Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom.
> Registered in UK and Wales under Company Registration No.
> 3798903 Directors: Michael Cunningham (USA), Charlie Peters
> (USA), Matt Parsons (USA) and Brendan Lane (Ireland)
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
---
Mark Little
mlittle at redhat.com
JBoss, by Red Hat