JBoss Community

Re: Remoting Transport Transaction Inflow Design Discussion

created by David Lloyd in JBoss Transactions Development

Mark Little wrote:

"I mean that I haven't addressed the issue of transaction timeout control."

What issues? The timeout is controlled by the coordinator, not the client. Or by "control" do you mean setTimeout calls?

Exactly.  It's just a question of how these get propagated, which seems somewhat outside of the core of this solution.  It's only mentioned because it's in the SPIs.

Mark Little wrote:

"Keeping in mind that this is nowhere near the only process of this complexity to be tested - and no, don't trot out "it's more complex than you think" unless you want to enumerate specific cases (which will probably then be appropriated into additional tests) - I think we'd follow the same approach we'd follow for testing other things.  We'd unit test the protocol of course, and test to ensure that the implementation matches the specification, and verify that the protocol handlers on either "end" forward to the proper APIs."

Go take a look at the QA tests for JBossTS. You'll see that a sh*t load of them are covering recovery. And then take a look at XTS and REST-AT. You'll see that a sh*t load of them are covering recovery. Want to take a wild stab in the dark why that might be the case ;-)? Yes, it's complex. It's got to be fault tolerant, so we have to test all of the cases. There are no edge-cases with transactions: it either works or it fails. Unit tests aren't sufficient for this.

Well, it's always good to have a set of existing projects to draw test scenarios from.  But otherwise I don't think this is directly relevant to the discussion, unless you're saying "we must test these 200 different scenarios before I let you type 'git commit'".  We need high-quality, detailed tests for every subsystem.  Having thoroughly tested transactions doesn't do us a lot of good if, for example, our JPA implementation or HornetQ or something is writing corrupt data.  I mean everything needs thorough testing.  The fact that these other projects have lots of tests covering recovery doesn't mean that all of those tests are necessary, and on the other hand, there may be many scenarios unaccounted for in these tests as well.  AS is riddled with highly complex systems that need detailed testing.

If we use an SPI with a documented contract, it is not unreasonable to expect that contract to be met by its implementation.  If the contract is not met by the implementation, yeah that's a bug, but saying that it's the responsibility of every project consuming that SPI to verify that its implementation(s) meet the SPI contract is crazy.  Yeah we may introduce a test here or there to catch regression in the target project, but even this is not strictly necessary as the target project should be doing this!

In this particular case (solution 2, that is), we're specifying an implementation of XAResource, a transport for it, and an endpoint which controls the XATerminator; this says to me that our tests can be limited in scope to testing this mechanism from end to end.  As I said, if we have other projects we can draw recovery scenarios from, that's fine, and we will do so.  I don't know what else to tell you.
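
To make the shape of that concrete, here's a rough client-side sketch; TransportClient and its method names are made up purely for illustration and are not part of any agreed design:

import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

// Client-side resource which forwards each XA verb over the Remoting
// connection; the peer's handler would apply the same verb to the imported
// transaction via its XATerminator.
final class RemoteTransactionXAResource implements XAResource {

    // Hypothetical abstraction over the wire protocol (illustrative only).
    interface TransportClient {
        int prepare(Xid xid) throws XAException;
        void commit(Xid xid, boolean onePhase) throws XAException;
        void rollback(Xid xid) throws XAException;
        void forget(Xid xid) throws XAException;
        Xid[] recover(int flag) throws XAException;
    }

    private final TransportClient transport;

    RemoteTransactionXAResource(final TransportClient transport) {
        this.transport = transport;
    }

    public int prepare(final Xid xid) throws XAException { return transport.prepare(xid); }
    public void commit(final Xid xid, final boolean onePhase) throws XAException { transport.commit(xid, onePhase); }
    public void rollback(final Xid xid) throws XAException { transport.rollback(xid); }
    public void forget(final Xid xid) throws XAException { transport.forget(xid); }
    public Xid[] recover(final int flag) throws XAException { return transport.recover(flag); }

    // Thread association is carried by the invocation itself in this model.
    public void start(final Xid xid, final int flags) { }
    public void end(final Xid xid, final int flags) { }

    public boolean isSameRM(final XAResource other) { return other instanceof RemoteTransactionXAResource; }
    public int getTransactionTimeout() { return 0; }
    public boolean setTransactionTimeout(final int seconds) { return false; }
}

An end-to-end test would then drive the local TM through prepare/commit/rollback and assert that the matching XATerminator calls arrive on the other side, including the recover() path after a simulated failure.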

Mark Little wrote:

"That's just another way of saying we don't have any special, magical auto-recovery "stuff" that isn't provided by the transaction coordinator (which might well have some magical auto-recovery "stuff").  There might be a better way to express that."

Let me try and rephrase and let me know if I get it wrong: you assume that existing recovery approaches are sufficient for this and nothing new will need to be invented?

Yes, that is my assumption, as we are using existing propagation mechanisms.

Mark Little wrote:

"In case 1, the client has no TM and it uses a remote UserTransaction interface to directly control the remote TM.  In case 2, the client is using the local TM to control transactions, and is treating the remote TM as an enrolled resource into the current transaction."

Yeah, so it's interposition. Like I said, these are two different scenarios.

"Case 1 cannot be made to work when a local TM is present without adding some notion in the EE layer to determine whether it should use the local UserTransaction or the remote one.  This is possible but is a possibly significant amount of work."

How significant? If we're putting all options on the table then this needs to be there too.

The problem is that we'd need some way to control which kind of UserTransaction is pulled from JNDI and thus injected into EE components.  This can depend on what the user intends to do with it; thus we'd need to isolate many use cases, figure out what level this should be done at (deployment? component? server-wide?), and do some analysis to determine where and how the remote server connection(s) should be specified and how to associate the two.  We're basically choosing between TMs on a per-operation basis.  This type of configuration is unprecedented as far as I know; I think the analysis would take as long as the implementation, if not longer.  Because it is not known exactly how this should look, I can't say how much effort this is going to be other than "lots".
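
Just to illustrate the kind of choice involved, here's a trivial sketch; the remote JNDI name is purely hypothetical, while java:comp/UserTransaction is the standard local binding:

import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.transaction.UserTransaction;

final class UserTransactionSelector {

    // Returns either the standard local binding or a (hypothetical) remote
    // one; the hard part is deciding where that boolean comes from
    // (deployment descriptor? component? server-wide setting?).
    static UserTransaction lookup(final boolean useRemote) throws NamingException {
        final String name = useRemote
                ? "java:jboss/RemoteUserTransaction"  // illustrative name only
                : "java:comp/UserTransaction";        // standard local binding
        return (UserTransaction) new InitialContext().lookup(name);
    }
}

The lookup itself is trivial; the open question is entirely about where that decision is configured and how it is scoped.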

Mark Little wrote:

"Theoretically each successive "step" will treat the TM of the subsequent "step" as a participating resource.  As to D calling A, that will only work if the TM is clever enough to figure out what's happening (I don't see why it wouldn't as the Xid should, well, identify the transaction so A should recognize its own; but that's why we're having this discussion)."

Please go take a look at what we have to do for interposition in JTS. And it's not because JTS is more complex than it needs to be: interposition is a fundamental concept within distributed transactions and the problems, optimisations, recovery semantics etc. are there no matter what object model or distribution approach you use. Take a look at XTS too, for instance.

Yeah, but keep in mind that we're dealing with a strict hierarchy here; there are no peers.  The transaction isn't so much "distributed" as it is "controlled": the caller always dominates the callee.  Thus if D calls A, the behavior I'd expect would be that A treats the imported work as a different or subordinate transaction; it need not have any direct knowledge that the two are related, since the D→A relationship is controlled by D, the C→D relationship is controlled by C, and so on.  If the D→A outcome is in doubt then it's up to D to resolve that branch, not A.  But that's just my ignoramus opinion.
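
As a toy illustration of the Xid point: a node can at least detect that an inflowed branch carries the global transaction id of a transaction it already knows about, whether or not it chooses to act on that rather than simply treating the work as subordinate:

import java.util.Arrays;
import javax.transaction.xa.Xid;

final class XidUtil {

    // True when both branches belong to the same global transaction,
    // i.e. they share the format id and the global transaction id.
    static boolean sameGlobalTransaction(final Xid a, final Xid b) {
        return a.getFormatId() == b.getFormatId()
                && Arrays.equals(a.getGlobalTransactionId(), b.getGlobalTransactionId());
    }
}

In the model described above, A wouldn't even need to make that check; D owns the D→A branch and is the one responsible for resolving it.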

When it comes to reality, this situation is extremely unlikely to occur even in the weirdest situations I've ever heard of.  The reason is that if you've got two nodes invoking on each other, it is highly likely that they are within the same "tier", which greatly increases the likelihood that they could simply run JTS and be done.

Here's what I consider to be a likely, real-world scenario:

Host A runs a thin client which uses the "solution 1" mechanism to control the transaction when it talks to Host B.

Host B runs a "front" tier which is isolated by a firewall.  This tier has one or more local transactional databases or caches, and a local TM.  The services running on B also perform EJB invocations on Host C.

Host C is the "rear" tier, separated from B by one or more layers of firewall and maybe even a public network.  B talks to C via Remoting, using "solution 2" to propagate transactions to it, in the client/server style of invocation.

Host C participates in a peer-to-peer relationship with other services on Hosts D, E, and F in the same tier, using Remoting or IIOP but relying on JTS to coordinate the transaction at this level, since C, D, E, and F all mutually execute operations on one another (and possibly each consume local resources) in a distributed object graph style of invocation.

Note you can substitute A and B with an EIS and everything should be exactly the same (except that recovery processes would be performed by the EIS rather than by B's TM).

Everything I understand about transaction processing (which is definitely at least as much as a "joe user") says that there's no reason this shouldn't "just work".  And we should be able to utilize existing transaction recovery mechanisms as well.
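
For instance, a recovery pass over this transport would look no different from a recovery pass over any other XAResource; roughly like the sketch below, where the rollback decision is only a placeholder (the real recovery manager consults its transaction log to decide commit versus rollback):

import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

final class RecoveryScanSketch {

    // Ask the remote side for its in-doubt branches and resolve them,
    // just as the recovery manager already does for a database resource.
    static void resolve(final XAResource remote) throws XAException {
        final Xid[] inDoubt = remote.recover(XAResource.TMSTARTRSCAN | XAResource.TMENDRSCAN);
        for (final Xid xid : inDoubt) {
            remote.rollback(xid); // placeholder decision for illustration only
        }
    }
}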
