David Lloyd [
http://community.jboss.org/people/dmlloyd] created the discussion
"Re: Remoting Transport Transaction Inflow Design Discussion"
To view the discussion, visit:
http://community.jboss.org/message/621525#621525
--------------------------------------------------------------
Jonathan Halliday wrote:
>> [remote UserTransaction] ... behaves in an intuitive fashion only for a very
limited, albeit common, set of use cases. For more complex scenarios its inherent
limitations manifest in ways that can be confusing to users.
> or "some complex scenarios"
yup, I can give you some of them too:
1) The 'client' is actually another AS instance, either of the same or earlier
vintage, doing JNDI lookup of UserTransaction against a remote AS7.
2) The client wants to talk to two remote AS instances in the same tx.
3) The client is an environment that has its own UserTransaction implementation. This is
actually just a more general version of case 1), but one in which you can't use tricks
like patching the client-side lookup to return your actual UserTransaction instead of the
remote proxy.
4) You want to support load balancing or failover for the client-server connection.
Okay so these basically correspond to the same scenarios which have already been
outlined. As far as I know there's no need (in terms of existing functionality or
explicit requirement) to support #4 during mid-transaction though.
Jonathan Halliday wrote:
>> JCA inflow was either designed for propagation to leaf nodes only, or incredibly
badly thought out.
> or "badly thought out"
yup, although it's really pretty obvious: The JCA inflow API uses an Xid as a poor
man's transaction propagation context. Xids were designed only for control flow
between a transaction manager and a resource manager, not for use in multi-level trees.
The JCA has no provision for allowing subordinates to create new branches in the global
transaction. For that it would have to pass in a mask of free bits in the bqual array as
well as the Xid to the subordinate. Indeed the JCA expressly prohibits the container
handling the inflow from altering the Xid. It has to remain immutable because without any
knowledge of which bits can safely be mutated, the container can't guarantee to
generate unique Xids, a property which is required by the spec.
I didn't find this
in the JCA spec (there was a bit about RMs not altering an XID's data bits in transit, but
this is not the same thing), but I see your point about XID generation in a hierarchical
system (it'd be fine as long as there are no cycles and you could just patch on stuff
to the end of the branch ID, but that's not technically very robust, and could violate
the XID "format" if there is one). I'm curious to know how other vendors
solve this problem with EIS transaction inflow. I could see a workaround in which
additional XAResources are enlisted to the root controller by propagating them back *up*
the chain, but this is back into custom SPI territory which I'd just as soon stay out
of.
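To make the bqual concern concrete, here's a minimal sketch of what "patching stuff onto the end of the branch ID" would look like. SimpleXid and deriveBranch are hypothetical names for illustration only, not a proposed implementation:

```java
import java.util.Arrays;
import javax.transaction.xa.Xid;

// Hypothetical sketch (not shipped code): the only way a subordinate could
// mint its own branch from an inflowed Xid is to mutate the branch
// qualifier, e.g. by appending bytes to it. The container has no knowledge
// of which bqual bits are free, so any such mutation risks colliding with
// branches the parent mints, and must also stay within the XA spec's
// 64-byte bqual limit.
class SimpleXid implements Xid {
    private final int formatId;
    private final byte[] gtrid;
    private final byte[] bqual;

    SimpleXid(int formatId, byte[] gtrid, byte[] bqual) {
        this.formatId = formatId;
        this.gtrid = gtrid.clone();
        this.bqual = bqual.clone();
    }

    @Override public int getFormatId() { return formatId; }
    @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
    @Override public byte[] getBranchQualifier() { return bqual.clone(); }

    // "Patching stuff onto the end of the branch ID": same gtrid, extended
    // bqual. Nothing guarantees the result is unique system-wide, which is
    // the spec property at risk.
    SimpleXid deriveBranch(byte branchCounter) {
        if (bqual.length + 1 > Xid.MAXBQUALSIZE) {
            throw new IllegalStateException("bqual would exceed the XA limit");
        }
        byte[] newBqual = Arrays.copyOf(bqual, bqual.length + 1);
        newBqual[bqual.length] = branchCounter;
        return new SimpleXid(formatId, gtrid, newBqual);
    }
}
```

Note that the derived Xid deliberately keeps the parent's gtrid and format id; only the bqual changes, which is exactly the part JCA says must not be touched.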
Alternatively the subordinate TM could simply generate a new global transaction ID for
+its+ subordinate resources. It'd technically be a lie but it'd cleanly solve
this problem at least as far as transaction completion goes - recovery semantics might be
hard to work out though.
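That alternative could be sketched as a mapping kept by the subordinate TM. All names here are hypothetical, and recovery, the hard part, is deliberately ignored:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a subordinate TM maps each inflowed (parent) Xid to a
// fresh global transaction id of its own, under which it is free to create
// branches for its local resource managers. When the parent drives
// completion of the inflowed Xid, the subordinate completes its locally
// generated transaction. For real recovery the mapping itself would have to
// be logged durably, which is where the idea gets difficult.
class SubordinateIdMapper {
    private final Map<String, String> inflowedToLocal = new ConcurrentHashMap<>();

    // Return (creating if needed) the local global transaction id used for
    // this node's own branches, keyed by the inflowed Xid.
    String localGtridFor(String inflowedXidKey) {
        return inflowedToLocal.computeIfAbsent(inflowedXidKey,
                k -> UUID.randomUUID().toString());
    }

    // Called when the parent drives completion (commit/rollback/forget) of
    // the inflowed Xid; returns the local id to complete, or null if unknown.
    String remove(String inflowedXidKey) {
        return inflowedToLocal.remove(inflowedXidKey);
    }
}
```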
Jonathan Halliday wrote:
> or "not capable enough"
The XA spec expects that each resource manager (give or take its XAResource's
isSameRM implementation) gets its own branch i.e. unique Xid. With inflowed Xids you
can't generate new Xids to meet that expectation, you have to use the inflowed one
verbatim. That causes problems with the state machine for the XA protocol lifecycle, as
it's tied to the Xid. For example, if the inflowed tx is used to connect to two
resource managers, you can't recover from crashes cleanly as the recovery mechanism is
tracking state on the assumption that the Xid belongs to at most one RM and once it has
cleaned that one up it's done. Actually on further thought even an upper limit of one
is optimistic - the Xid contains the node Id of the originating parent and that parent may
connect to the same resource manager, in which case it's going to incorrectly manage
the lifecycle because it can't distinguish the XAResource representing the subordinate
tx from the one representing the RM as they have the same Xid. That last case is an
artifact of our implementation rather than the spec though.
Again I can't find
this in the spec. It clearly says that an XID is used to identify the incoming
transaction, but nothing says that the subordinate cannot in turn generate different XIDs
for its own resources.
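The branch-per-RM expectation boils down to a decision the TM makes at enlistment time, which can be sketched as a toy (the BiPredicate stands in for XAResource.isSameRM(); this is not actual enlistment code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical sketch of the branch-per-RM rule: at enlistment the TM asks
// whether the newcomer fronts a resource manager it has already seen. Only
// a genuinely new RM gets a new branch Xid minted for it. With an inflowed
// Xid that must be used verbatim, the "mint a new branch" step is the part
// that is impossible.
class BranchAllocator<R> {
    private final BiPredicate<R, R> sameRm; // stand-in for XAResource.isSameRM()
    private final List<R> knownRms = new ArrayList<>();

    BranchAllocator(BiPredicate<R, R> sameRm) {
        this.sameRm = sameRm;
    }

    // Returns true if this enlistment requires minting a fresh branch Xid.
    boolean needsNewBranch(R candidate) {
        for (R known : knownRms) {
            if (sameRm.test(known, candidate)) {
                return false; // same RM: join the existing branch
            }
        }
        knownRms.add(candidate);
        return true; // new RM: a fresh branch Xid would be minted here
    }
}
```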
As for your latter point though, recall that we're dealing with a strictly
hierarchical relationship here: even if the same transaction recursively flows into a
node into which it had already flowed, it doesn't really have to treat it as another
branch of the same transaction, even if it were possible to do so. It's a departure
from CORBA-style distribution in that every inflow can be a new level in the transaction
hierarchy even if it passes through the same node (which you would not normally do in a
hierarchical relationship, by definition, because resources could then be accessed from
two wholly different XIDs even if they are logically a part of the same transaction). If
true distribution is desired, there's always JTS, after all. That's what this is
- you trade away the functionality you don't want anyway when you're in a
client/server environment, and in return you get much simpler semantics (and in turn, less
overhead) and the benefits of the optimized transport. Choices are good.
Jonathan Halliday wrote:
> or "unintuitive behavior"
yup, I can give you one for that too - the afterCompletions run relative to the commit in
the local node where they are registered, which may actually be before the commit in
another node and not correctly reflect heuristics outcomes or be suitable for triggering
subsequent steps in a process that depend on running after commits in the other nodes.
Likewise beforeCompletions run relative to the prepare in the local node, thus may run
after a prepare in another node. In the best case that's merely inefficient, in the
worst case, where resource managers are shared, it causes a flush of cached data to occur
after a prepare, which will fail. If that's not complicated enough for you, take the
inflowed transaction context and make a transactional call back to the originating parent
server. fun, fun.
I wouldn't be worried about the Synchronization stuff in a multi-tier environment -
especially if we disallow resource sharing (i.e. treat each node's access to a
resource as separate), which seems prudent given my above thoughts about unorthodox XID
handling. In my experience, the use cases for the kind of boss/subordinate cascading
which we are talking about would generally not rely on that ability (resource sharing)
anyway. And if you're not sharing resources, then looking at the synchronization
issues, you'll see that their semantics probably only matter relative to what the
local node can see anyway. I think this lack of capability is fair if it saves us
implementation effort.
That isn't to say that we couldn't invent some great new SPI which does this all
much better. Given unlimited (or less limited) resources, this would be fine by me.
Furthermore since all of this XATerminator/XAResource stuff is implementation details, we
could do it one way now and change to a different, more feature-rich solution later on.
Maybe at the same time we can tackle the XID deficiency in the JCA spec somehow.
Jonathan Halliday wrote:
> You're basically saying that an MDB can never access more than one resource.
That's a major problem in and of itself.
Not at all. MDBs don't normally run in inflowed transactions. The server hosting the
MDB container starts a top level transaction, enlists the JMS as a resource manager and
additionally enlists any resource managers the MDB calls e.g. a database. It's a flat
structure, not a hierarchic one.
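That flat structure can be illustrated with the ids alone (a hypothetical sketch, not container code): one global transaction id, with the JMS provider and the database each assigned a sibling branch qualifier rather than a level in a hierarchy:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the flat enlistment an MDB container produces: a
// single top-level global transaction, with each resource manager (JMS,
// database, ...) assigned its own branch qualifier. The branches are
// siblings under one coordinator; there is no parent/child nesting.
class FlatTransaction {
    private final byte[] gtrid;
    private int branchCounter = 0;
    private final List<byte[]> branchQualifiers = new ArrayList<>();

    FlatTransaction(String globalId) {
        this.gtrid = globalId.getBytes(StandardCharsets.UTF_8);
    }

    // Enlist a resource manager: mint a fresh sibling branch under the same
    // global transaction id.
    byte[] enlist(String resourceManagerName) {
        byte[] bqual = (resourceManagerName + "-" + (++branchCounter))
                .getBytes(StandardCharsets.UTF_8);
        branchQualifiers.add(bqual);
        return bqual;
    }

    int branchCount() { return branchQualifiers.size(); }
    byte[] globalId() { return gtrid.clone(); }
}
```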
The purpose is to execute Work in the context of a
transaction controlled by an outside party, and delivering messages as part of an imported
transaction is allowed and described in the spec as one of the three models (with respect
to transactions) in which messages may be delivered.
In any case, if that API was intended for flat execution then yeah it's an utter
failure of an SPI. If it's intended for hierarchical execution then it's only a
moderate failure (due to the XID problem), one that's actually workable in practice
(in my opinion). Without resources to control, after all, there's not a lot of point
to transactional inflow.
Jonathan Halliday wrote:
> Finally "unacceptable to ship a solution that may require manual transaction
cleanup" - you should know that any two-phase transaction system may require manual
transaction cleanup; that's the nature of two-phase transactions.
sure, but they are the small number of outcomes that result from one or more of the
players not behaving in accordance with the spec e.g. resource managers making autonomous
outcome decisions. We don't automatically do anything about those because we simply
can't - that's the point at which the spec basically says 'give up, throw a
heuristic and let a human deal with the mess'. You're talking about the much more
numerous expected failure cases that can be handled automatically under the spec.
Indeed exactly the kinds of run of the mill system failures a distributed transaction
protocol is designed to protect a user against. Intentionally shipping a non spec
compliant XAResource implementation that will result in a support case for many of those
common failures is borderline business suicide, see above.
The whole idea is
predicated on complying with the XAResource contract; we would not intentionally ship a
non-spec-compliant XAResource implementation.
Jonathan Halliday wrote:
> I'm pretty sure that if someone unplugs the ethernet cable of the transaction
coordinator after prepare but before commit, there's going to have to be some manual
cleanup.
Really? Got a test case for that? Other than the one a certain competitor wrote and we
soundly refuted as FUD? Because I've got an extensive test suite that shows no such
outcomes. Well, except for MS SQL Server and MySQL, neither of which is fully XA compliant
at present. Ensuring clean transaction completion in crash situations is exactly what the
transaction manager is for after all.
Okay, great. What I was trying to get across
with those requirement items is that we're only going to implement the contracts, and
we're not implementing any special recovery semantics beyond what the contracts
specify and what the TM does for us. If the TM can handle every crash scenario ever, all
the better.
--------------------------------------------------------------