[infinispan-dev] XAResource.isSameRM

Mark Little mlittle at redhat.com
Fri Jan 7 06:51:10 EST 2011


+1


On 6 Jan 2011, at 19:01, Jonathan Halliday wrote:

> On 01/06/2011 05:43 PM, Mircea Markus wrote:
>> 
>> On 6 Jan 2011, at 14:45, Jonathan Halliday wrote:
>> 
>>> On 01/06/2011 02:29 PM, Mircea Markus wrote:
>>> 
>>>> At the moment the *only way to transactionally* access a
>>>> node is by collocating the client and the server in the same
>>>> VM.
>>> 
>>> So the scope of the transaction is limited to data
>>> residing in that local node? What if I want a single
>>> transaction to span the local node and data in a remote node?
>> That's possible. You just always have to interact with the
>> local node, which will acquire the remote locks on behalf
>> of your transaction.
> 
> ok, so the cluster intelligence is in the local node rather 
> than the client, not that there is any significant 
> distinction for now as they are co-located.
> 
>> a) node goes down before TM issues prepare
>> - when TM resurrects and calls XAResource.recover it
>> receives the given XID, realises that there's a heuristic
>> decision (because it didn't call prepare) and takes some
>> action (rolls back other participants, notifies sys admin?).
> 
> That's not a heuristic decision. A RM is perfectly entitled 
> to throw away any tx state up until prepare. Under the 
> presumed abort doctrine it simply throws an error from 
> prepare and the tx aborts cleanly. Recovery is not involved 
> - it applies only to tx that have reached prepare.
> 
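
[A minimal sketch of the presumed-abort point above, under stated
assumptions: VolatileRm is a hypothetical toy class, not Infinispan
code, a String stands in for a real Xid, and XAER_NOTA is just one
plausible error code for a branch the RM no longer knows. It only
shows that state lost before prepare leads to a failed prepare, and
that the recovery scan lists nothing but prepared branches.]

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;

/** Toy resource manager (hypothetical); a String stands in for a real Xid. */
class VolatileRm {
    enum State { ACTIVE, PREPARED }

    private final Map<String, State> txTable = new LinkedHashMap<>();

    void start(String xid) {
        txTable.put(xid, State.ACTIVE);
    }

    /** Models a failure that loses all work that has not yet been prepared. */
    void discardUnprepared() {
        txTable.values().removeIf(s -> s == State.ACTIVE);
    }

    int prepare(String xid) throws XAException {
        if (txTable.get(xid) != State.ACTIVE) {
            // No usable state for this branch: refuse to prepare so the TM aborts.
            // XAER_NOTA (unknown xid) is this sketch's choice of error code.
            throw new XAException(XAException.XAER_NOTA);
        }
        txTable.put(xid, State.PREPARED);
        return XAResource.XA_OK;
    }

    /** Recovery scan: only branches that reached prepare are reported. */
    List<String> recover() {
        List<String> prepared = new ArrayList<>();
        for (Map.Entry<String, State> e : txTable.entrySet()) {
            if (e.getValue() == State.PREPARED) {
                prepared.add(e.getKey());
            }
        }
        return prepared;
    }
}

class VolatileRmDemo {
    public static void main(String[] args) {
        VolatileRm rm = new VolatileRm();
        rm.start("tx1");
        rm.discardUnprepared(); // state lost before the TM calls prepare
        try {
            rm.prepare("tx1");
        } catch (XAException e) {
            System.out.println("prepare failed, errorCode=" + e.errorCode);
        }
        System.out.println("in-doubt branches: " + rm.recover());
    }
}
```

[Note that the toy keeps PREPARED entries across the simulated
failure. Making that true for a real node is exactly the problem
discussed in the rest of this message.]
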
>> b) node goes down after TM issues prepare
>> - when TM issues a commit it receives an XAException
>> (perhaps XA_HEURRB) and again it is aware of the heuristic
>> outcome
> 
> Returning cleanly from a prepare is a promise by the RM to 
> successfully apply any subsequent commit. You're not in a 
> position to make such a promise unless your state is fault 
> tolerant, as a node crash would otherwise leave you with 
> inconsistent state.
> 
> It's not as simple as saying you'd roll back - what if you 
> prepare, get told to commit, apply remote changes (step 
> 4.2.1), then crash before applying local changes (4.2.2)? 
> You can't report that as a rollback - you applied some of 
> the updates. You have to include the tx in the recovery list 
> as heuristic hazard (unless NodeA will transparently 
> repopulate with the committed data, in which case you can 
> mask the failure or report heuristic commit), but how to 
> even detect the heuristic at recovery time? NodeA has no 
> persistent record of the tx and NodeB thinks it completed 
> cleanly and has cleaned up its tx record to avoid leaking. 
> Where is the data that tells you you've got a problem?
> 
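
[A rough sketch of how a partially applied commit might be mapped to
XA outcome codes, assuming you somehow know what each node did.
CommitOutcome, Applied and CLEAN_COMMIT are hypothetical names made up
for this illustration; only the XAException constants are real. A
crashed node with no record of the tx is modelled as UNKNOWN, which is
why the 4.2.1/4.2.2 scenario above comes out as heuristic hazard.]

```java
import java.util.List;
import javax.transaction.xa.XAException;

/** Hypothetical helper: maps what is known about each node to an XA outcome code. */
class CommitOutcome {
    enum Applied { YES, NO, UNKNOWN }

    /** Sketch-local marker for "every node applied the commit"; not an XA constant. */
    static final int CLEAN_COMMIT = 0;

    static int classify(List<Applied> perNode) {
        boolean anyYes = perNode.contains(Applied.YES);
        boolean anyNo = perNode.contains(Applied.NO);
        boolean anyUnknown = perNode.contains(Applied.UNKNOWN);

        if (anyUnknown) {
            return XAException.XA_HEURHAZ; // outcome on some node cannot be determined
        }
        if (anyYes && anyNo) {
            return XAException.XA_HEURMIX; // some updates applied, some not
        }
        if (anyNo) {
            return XAException.XA_HEURRB;  // told to commit, but nothing was applied
        }
        return CLEAN_COMMIT;
    }
}

class CommitOutcomeDemo {
    public static void main(String[] args) {
        // NodeB applied the remote changes, NodeA crashed before applying its own.
        int code = CommitOutcome.classify(
                List.of(CommitOutcome.Applied.YES, CommitOutcome.Applied.UNKNOWN));
        System.out.println(code == XAException.XA_HEURHAZ);
    }
}
```

[The classification is the easy part. As the paragraph above asks,
the open question is where the per-node information would come from
once NodeA has lost its state and NodeB has cleaned up its tx record.]
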
> Or have a more sophisticated scenario where there is an 
> additional NodeC, thus requiring multiple 'apply remote 
> changes' calls. Are those atomic across the cluster? If 
> there is a possibility that NodeB will apply the update but 
> NodeC won't, or NodeA will crash after issuing a call to B 
> but before C, you can wind up with inconsistent state in the 
> surviving B and C. Alternatively, what if A survives but C 
> crashes whilst applying changes that B has already 
> successfully applied? That's not necessarily a recovery 
> situation as far as the TM is concerned, but it may be from 
> your perspective as you'll need to detect and (ideally) fix 
> or (as a last resort) report the inconsistent data.
> 
> A lot of your behaviour is going to depend on what it means 
> for a node to recover after a crash. If it simply comes up 
> empty and expects to be repopulated from an external source, 
> as with a normal cache, then your relation to the XAResource 
> of that external source is critical. On the other hand if 
> your cluster node is itself fault tolerant through 
> replication, then you need to think carefully about how the 
> RM functionality ties into that replication - basically the 
> tx state information is not local to the node where the 
> XAResource resides, but must be replicated in the same 
> manner as the other data in that node and that replication 
> must be synchronous at certain state transitions in the tx 
> lifecycle - it's logging recovery information through RPC 
> rather than disk write. Really interesting things are going 
> to happen if a single transaction spans a mix of cached 
> copies of data stored persistently in an XA database and 
> data for which Infinispan is the definitive, fault 
> tolerant repository.
> 
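
[A sketch of "logging recovery information through RPC rather than
disk write", under heavy assumptions: BackupNode and ReplicatedPrepare
are hypothetical, the "RPC" is a plain method call, and a String
stands in for a Xid. The point illustrated is only the ordering:
prepare returns XA_OK after every backup has synchronously stored the
prepared record, and otherwise refuses to make the promise.]

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;

/** Hypothetical backup node holding copies of tx records; stands in for an RPC target. */
class BackupNode {
    final Set<String> preparedXids = new TreeSet<>();
    boolean reachable = true;

    /** Pretend synchronous RPC: returns only once the record is stored, or fails. */
    void storePrepared(String xid) {
        if (!reachable) {
            throw new IllegalStateException("backup unreachable");
        }
        preparedXids.add(xid);
    }

    void forget(String xid) {
        preparedXids.remove(xid);
    }
}

class ReplicatedPrepare {
    private final List<BackupNode> backups;

    ReplicatedPrepare(List<BackupNode> backups) {
        this.backups = backups;
    }

    int prepare(String xid) throws XAException {
        List<BackupNode> written = new ArrayList<>();
        for (BackupNode b : backups) {
            try {
                b.storePrepared(xid); // synchronous at this state transition
                written.add(b);
            } catch (IllegalStateException e) {
                // The record is not fault tolerant, so do not promise a commit.
                for (BackupNode w : written) {
                    w.forget(xid); // best-effort cleanup of partial records
                }
                throw new XAException(XAException.XA_RBCOMMFAIL);
            }
        }
        return XAResource.XA_OK;
    }

    /** After the owning node crashes, a survivor can answer the recovery scan. */
    static List<String> recoverFrom(BackupNode survivor) {
        return new ArrayList<>(survivor.preparedXids);
    }
}
```

[For example, with two reachable backups, prepare("tx1") returns
XA_OK and recoverFrom on either backup lists tx1. With one backup
unreachable, prepare throws and no backup keeps the record.]
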
> To make a cluster appear to the outside world as a single 
> logical entity for transaction purposes, you're pretty much 
> going to wind up doing interposition. That means you're 
> implementing not only an RM but substantial chunks of what 
> amounts to a TM too. Have fun.
> 
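
[A very reduced sketch of what interposition means here, with all
names hypothetical (ClusterNode, RecordingNode, InterposedCoordinator)
and a String in place of a Xid: the TM sees one participant, which
fans prepare and commit out to the cluster nodes and votes on their
behalf. It deliberately omits logging and recovery, which is where the
TM-like work described above actually lies.]

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;

/** Hypothetical per-node participant API; in a real cluster these would be remote calls. */
interface ClusterNode {
    boolean prepare(String xid);
    void commit(String xid);
    void rollback(String xid);
}

/** In-memory stand-in used only to exercise the sketch. */
class RecordingNode implements ClusterNode {
    final List<String> calls = new ArrayList<>();
    private final boolean voteYes;

    RecordingNode(boolean voteYes) {
        this.voteYes = voteYes;
    }

    public boolean prepare(String xid) {
        calls.add("prepare");
        return voteYes;
    }

    public void commit(String xid) {
        calls.add("commit");
    }

    public void rollback(String xid) {
        calls.add("rollback");
    }
}

/** Presents the whole cluster to the TM as a single logical participant. */
class InterposedCoordinator {
    private final List<? extends ClusterNode> nodes;

    InterposedCoordinator(List<? extends ClusterNode> nodes) {
        this.nodes = nodes;
    }

    int prepare(String xid) throws XAException {
        for (ClusterNode n : nodes) {
            if (!n.prepare(xid)) {
                // One "no" vote aborts the branch everywhere (rollback assumed idempotent).
                for (ClusterNode m : nodes) {
                    m.rollback(xid);
                }
                throw new XAException(XAException.XA_RBROLLBACK);
            }
        }
        return XAResource.XA_OK;
    }

    void commit(String xid) {
        for (ClusterNode n : nodes) {
            n.commit(xid);
        }
    }
}
```

[The commit loop is not atomic across nodes, so it has exactly the
B-applied-but-C-did-not problem raised earlier unless the coordinator
keeps its own fault-tolerant record of the decision.]
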
> Jonathan.
> 
> -- 
> ------------------------------------------------------------
> Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 
> Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom.
> Registered in UK and Wales under Company Registration No. 
> 3798903  Directors: Michael Cunningham (USA), Charlie Peters 
> (USA), Matt Parsons (USA) and Brendan Lane (Ireland)

---
Mark Little
mlittle at redhat.com

JBoss, by Red Hat






