[infinispan-dev] XAResource.isSameRM

Wed Jan 5 09:51:32 EST 2011

FYI, a discussion I have with Jonathan around recovery support from TM 

On 5 Jan 2011, at 14:43, Jonathan Halliday wrote:
> On 01/05/2011 02:18 PM, Mircea Markus wrote:
> 
>> I don't know how the TM recovery process picks up the XAResource instance on which to call XAResource.recover, but I imagine it expects this method to return all the prepared(or heuristic completed) transactions from the _whole transaction branch_, i.e. from the entire cluster.
> 
> all from the logical RM, which you happen to implement as a cluster, yes.
> 
>> I'm asking this  because right now there's no way for a node to know all the prepared transaction in the entire cluster. This is doable but would involve an broadcast to query the cluster, which might be costly (time and bandwidth).
> 
> right. not to mention it should, strictly speaking, block or fail if any node is unreachable, which kinda sucks from an availability perspective.
> 
> Keep in mind the global list of known in-doubt tx is a point in time snapshot anyhow - in an active system it's out of date as soon as another running tx is prepared. So, you can serve one that's slightly stale without too much risk. Not that periodic broadcast of the in-doubt list between nodes is necessarily better than doing it on-demand in response to a recovery call, but at least it's O(1) rather than O(number of clients calling recover).  The (mild) problem we've seen in the past is where a large cluster of app server nodes i.e. tx managers, is started more or less simultaniously, the RMs get a storm of recovery requests every two minutes. If the impl of that is expensive and not cached it can cause performance spikes in the RM.
> 
> That said, keep in mind a recovery pass is (in the default config) only every two minutes and run on a background thread. It's not something you want to worry about performance optimization of in the initial implementation. Do it right rather than fast. Optimize it only when users scream.
> 
>> On the other hand I imagine this call is performed asynchronously and doesn't impact TM performance in managing ongoing transactions, so it might not be that bad after all.
> 
> correct
> 
>> Another approach would be to to consider each node as an transaction branch.
> 
> you mean a logically separate resource manager, yes. You are basically talking about not doing interposition in the driver/server but rather relying on the transaction manager to handle multiple resources. It may make your implementation simpler but probably less performant on the critical path (transaction commit) vs. recovery.
> 
> 
>> The advantage here is that recovery can be easily implemented, as the TM recovery would ask all the nodes that were registered for prepared transactions
> 
> 'registered'? you plan of having the config on each transaction manager contain a list of all cluster nodes? That's not admin friendly.
> 
>> , and no cluster broadcast would be required when XAResource.recover is called.
> 
> the broadcast is automatic. Maintaining the list of known nodes in a config file is not. No contest.
> 
>> Considering this approach, do you see any drawbacks compared with the other one?
>> E.g. each node being a branch might involve multiple RPC between remote TM and XAResource on each node (v.s. one in prev example).
> 
> yeah, as mentioned above its a non-interposed model rather than one where your driver/server is doing interposition. Favour the one that makes the commit path fast, even though it makes recovery an utter pain.
> 
> Jonathan.
> 
> -
> ------------------------------------------------------------
> Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom.
> Registered in UK and Wales under Company Registration No. 3798903  Directors: Michael Cunningham (USA), Charlie Peters (USA), Matt Parsons (USA) and Brendan Lane (Ireland)