Jonathan Halliday [http://community.jboss.org/people/jhalliday] created the discussion
"Re: Remoting Transport Transaction Inflow Design Discussion"
To view the discussion, visit: http://community.jboss.org/message/621512#621512
--------------------------------------------------------------
> I'm pretty sure that e.g. support will tell you it is
> unacceptable to ship a solution that may require manual transaction cleanup. We've had
> a small number of corner cases in JTA that suffered that limitation and eliminating them
> and the support load they generate has been a high priority for the transaction
> development work. Intentionally introducing new ones is definitely in the category of Bad
> Ideas.
> can you be more specific than "Bad Idea"
Sure, explaining the chain of reasoning behind that one is easy:
Red Hat ships products and offers support on them. That support is fixed price based on
SLA, not proportional to the number of tickets filed. On the other hand, support costs
scale as a function of the number of issues reported. Thus the less support work we have
to do, the lower our cost and the higher our profit. Intentionally building and shipping
something we know is going to increase support load is contra-survival and therefore a Bad
Idea.
> [remote UserTransaction] ... behaves in an intuitive fashion only
> for a very limited, albeit common, set of use cases. For more complex scenarios its
> inherent limitations manifest in ways that can be confusing to users.
> or "some complex scenarios"
yup, I can give you some of them too:
1) The 'client' is actually another AS instance, either of the same or earlier
vintage, doing JNDI lookup of UserTransaction against a remote AS7.
2) The client wants to talk to two remote AS instances in the same tx.
3) The client is an environment that has its own UserTransaction implementation. This is
actually just a more general version of case 1), but in which you can't use tricks
like patching the client-side lookup to return your actual UserTransaction instead of the
remote proxy.
4) You want to support load balancing or failover for the client-server connection.
> JCA inflow was either designed for propagation to leaf nodes
> only, or incredibly badly thought out.
> or "badly thought out"
yup, although it's really pretty obvious: The JCA inflow API uses an Xid as a poor
man's transaction propagation context. Xids were designed only for control flow
between a transaction manager and a resource manager, not for use in multi-level trees.
The JCA has no provision for allowing subordinates to create new branches in the global
transaction. For that it would have to pass in a mask of free bits in the bqual array as
well as the Xid to the subordinate. Indeed the JCA expressly prohibits the container
handling the inflow from altering the Xid. It has to remain immutable because, without any
knowledge of which bits can safely be mutated, the container can't guarantee to
generate unique Xids, a property which is required by the spec.
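To make that concrete, here's a minimal sketch of what a subordinate would *like* to do but can't under JCA. The `SimpleXid` class and `newBranch` helper are hypothetical illustrations, not anything from the spec or our implementation; only the `javax.transaction.xa.Xid` interface (which ships with the JDK) is real. Note that even this naive "append a byte" scheme can collide with or overflow the parent's bqual encoding - which is exactly the problem.

```java
import javax.transaction.xa.Xid;
import java.util.Arrays;

public class InflowXidSketch {
    // A minimal Xid: format id, global transaction id, branch qualifier.
    static final class SimpleXid implements Xid {
        private final int formatId;
        private final byte[] gtrid;
        private final byte[] bqual;
        SimpleXid(int formatId, byte[] gtrid, byte[] bqual) {
            this.formatId = formatId;
            this.gtrid = gtrid.clone();
            this.bqual = bqual.clone();
        }
        public int getFormatId() { return formatId; }
        public byte[] getGlobalTransactionId() { return gtrid.clone(); }
        public byte[] getBranchQualifier() { return bqual.clone(); }
    }

    // To give a second resource manager its own branch, a subordinate would
    // have to mutate the branch qualifier - which JCA forbids, and which is
    // unsafe anyway without knowing which bqual bits the parent left free.
    static Xid newBranch(Xid parent, byte branchSuffix) {
        byte[] bqual = parent.getBranchQualifier();
        byte[] mutated = Arrays.copyOf(bqual, bqual.length + 1);
        mutated[bqual.length] = branchSuffix; // may collide with the parent's own encoding
        return new SimpleXid(parent.getFormatId(),
                             parent.getGlobalTransactionId(), mutated);
    }

    public static void main(String[] args) {
        Xid inflowed = new SimpleXid(0x1234, new byte[]{1, 2, 3}, new byte[]{9});
        Xid branch = newBranch(inflowed, (byte) 1);
        // Same global transaction, different branch: exactly what the XA
        // spec wants per-RM, and exactly what JCA inflow prohibits.
        System.out.println(Arrays.equals(inflowed.getGlobalTransactionId(),
                                         branch.getGlobalTransactionId()));
        System.out.println(Arrays.equals(inflowed.getBranchQualifier(),
                                         branch.getBranchQualifier()));
    }
}
```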
> or "not capable enough"
The XA spec expects that each resource manager (give or take its XAResource's isSameRM
implementation) gets its own branch, i.e. a unique Xid. With inflowed Xids you can't
generate new Xids to meet that expectation; you have to use the inflowed one verbatim.
That causes problems with the state machine for the XA protocol lifecycle, as it's
tied to the Xid. For example, if the inflowed tx is used to connect to two resource
managers, you can't recover from crashes cleanly as the recovery mechanism is tracking
state on the assumption that the Xid belongs to at most one RM and once it has cleaned
that one up it's done. Actually on further thought even an upper limit of one is
optimistic - the Xid contains the node Id of the originating parent and that parent may
connect to the same resource manager, in which case it's going to incorrectly manage
the lifecycle because it can't distinguish the XAResource representing the subordinate
tx from the one representing the RM as they have the same Xid. That last case is an
artifact of our implementation rather than the spec though.
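The recovery problem can be sketched in a few lines. This is a toy model, not our actual object store: the point is just that any bookkeeping keyed on the Xid, built on the spec's one-Xid-one-RM assumption, silently loses the second resource manager's prepared branch.

```java
import java.util.HashMap;
import java.util.Map;

public class RecoveryKeySketch {
    // Recovery state keyed on the Xid (here its string form), on the spec's
    // assumption that an Xid belongs to at most one resource manager.
    static final Map<String, String> preparedLog = new HashMap<>();

    static void recordPrepare(String xid, String resourceManager) {
        preparedLog.put(xid, resourceManager); // a second RM silently replaces the first
    }

    public static void main(String[] args) {
        String inflowedXid = "gtrid:1-2-3/bqual:9";
        recordPrepare(inflowedXid, "rm-A"); // first RM prepares
        recordPrepare(inflowedXid, "rm-B"); // second RM overwrites rm-A's entry
        // After a crash, recovery resolves this Xid once and considers
        // itself done - rm-A's prepared branch is never completed.
        System.out.println(preparedLog.size());
        System.out.println(preparedLog.get(inflowedXid));
    }
}
```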
> or "unintuitive behavior"
yup, I can give you one for that too - the afterCompletions run relative to the commit in
the local node where they are registered, which may actually be before the commit in
another node, so they don't correctly reflect heuristic outcomes, nor are they suitable for
triggering subsequent steps in a process that depend on running after commits in the other
nodes. Likewise beforeCompletions run relative to the prepare in the local node, and thus
may run after a prepare in another node. In the best case that's merely inefficient; in the
worst case, where resource managers are shared, it causes a flush of cached data to occur
after a prepare, which will fail. If that's not complicated enough for you, take the
inflowed transaction context and make a transactional call back to the originating parent
server. Fun, fun.
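The beforeCompletion hazard looks like this in miniature. The `Synchronization` interface below is a stand-in for the JTA one (javax.transaction.Synchronization is not bundled in the JDK), and the node names and event log are purely illustrative:

```java
import java.util.ArrayList;
import java.util.List;

public class CompletionOrderSketch {
    interface Synchronization {
        void beforeCompletion();
        void afterCompletion(int status);
    }

    static final List<String> events = new ArrayList<>();

    // beforeCompletion fires just before the *local* node's prepare,
    // regardless of what other nodes have already done.
    static void prepareNode(String node, Synchronization sync) {
        if (sync != null) sync.beforeCompletion();
        events.add("prepare:" + node);
    }

    public static void main(String[] args) {
        Synchronization cacheFlush = new Synchronization() {
            public void beforeCompletion() { events.add("flush-cache:nodeB"); }
            public void afterCompletion(int status) { events.add("after:nodeB"); }
        };
        // The parent prepares first, then the subordinate - so the
        // subordinate's cache flush lands AFTER nodeA's prepare. If the two
        // nodes share a resource manager, that late flush will fail.
        prepareNode("nodeA", null);
        prepareNode("nodeB", cacheFlush);
        System.out.println(events); // flush-cache:nodeB comes after prepare:nodeA
    }
}
```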
> Also - only one resource for inflowed transactions? How is that not
> a serious deficiency in our implementation?
It's a deficiency in the JCA spec, see above. The spec assumes the inflowed container
is a leaf node, i.e. an RM, not a subordinate coordinator. There are some hacky things we can
potentially do to work around that limitation in the spec without outright breaking
compliance. They were on my list of things to do in the transactions upstream, but I seem
to be a bit busy with AS integration issues instead :-)
> You're basically saying that an MDB can never access more than
> one resource. That's a major problem in and of itself.
Not at all. MDBs don't normally run in inflowed transactions. The server hosting the
MDB container starts a top-level transaction, enlists the JMS provider as a resource manager
and additionally enlists any resource managers the MDB calls, e.g. a database. It's a flat
structure, not a hierarchic one.
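A rough sketch of that flat structure, assuming nothing beyond what's described above - the `TopLevelTx` class and its string-labelled "branches" are invented for illustration, not a real TM API:

```java
import java.util.ArrayList;
import java.util.List;

public class FlatMdbTxSketch {
    static final class TopLevelTx {
        private final List<String> branches = new ArrayList<>();
        private int nextBranch = 1;
        // Each enlisted RM gets its own branch - possible here because this
        // node *originated* the Xid and so controls the branch qualifier.
        void enlist(String resourceManager) {
            branches.add(resourceManager + "@branch-" + nextBranch++);
        }
        List<String> branches() { return branches; }
    }

    public static void main(String[] args) {
        TopLevelTx tx = new TopLevelTx();      // started by the MDB container's server
        tx.enlist("jms");                      // the provider delivering to the MDB
        tx.enlist("database");                 // any RM the MDB then calls
        // Siblings under one coordinator - flat, not a parent/subordinate tree.
        System.out.println(tx.branches());
    }
}
```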
> Finally "unacceptable to ship a solution that may require manual
> transaction cleanup" - you should know that any two-phase transaction system may
> require manual transaction cleanup; that's the nature of two-phase transactions.
sure, but those are the small number of outcomes that result from one or more of the
players not behaving in accordance with the spec, e.g. resource managers making autonomous
outcome decisions. We don't automatically do anything about those because we simply
can't - that's the point at which the spec basically says 'give up, throw a
heuristic and let a human deal with the mess'. You're talking about the much more
numerous expected failure cases that can be handled automatically under the spec.
Indeed exactly the kinds of run of the mill system failures a distributed transaction
protocol is designed to protect a user against. Intentionally shipping a non spec
compliant XAResource implementation that will result in a support case for many of those
common failures is borderline business suicide, see above.
> I'm pretty sure that if someone unplugs the ethernet cable of the
> transaction coordinator after prepare but before commit, there's going to have to be
> some manual cleanup.
Really? Got a test case for that? Other than the one a certain competitor wrote and we
soundly refuted as FUD? Because I've got an extensive test suite that shows no such
outcomes. Well, except for MS SQL Server and MySQL, neither of which is fully XA compliant
at present. Ensuring clean transaction completion in crash situations is exactly what the
transaction manager is for after all.
--------------------------------------------------------------