Sanne,
One question that I didn't ask before, when the Hibernate
Synchronization.afterCompletion(int) is called (with rolledback TX
status), how does Hibernate know if the transaction was rolled back from
the transaction reaper thread? I tried checking the thread name in
WildFly integration code but that is wrong, as the thread name may be a
transaction manager communications thread that is invoked on behalf of a
remote (JVM) reaper thread.
Currently, are checking thread ids which does not work for the
distributed case involving multiple jvms. The current approach also
does not work for remote invocations that use the same JTA transaction
(as the below test case simulates).
Scott
On 08/07/2014 11:02 AM, Sanne Grinovero wrote:
Steve,
we had a conversation about this on IRC, I think we came to a
hopefully
nice
solution
.
[15:18] <sannegrinovero> andrigtmiller: I still think you're locking out
the TXM, not making it possible to legitimately timeout queries..
[15:19] <sannegrinovero> (but I don't have the full picture.. not really
my area)
[15:19] <smarlow> that is a transaction manager concern but they already
handled calling other synchronization call backs. currently, Arjuna
holds a lock on the collection of synchronizations at that time, so
there should be no conflict
[15:20] <smarlow> I asked about your concern yesterday to make sure we
wouldn't deadlock
[15:20] <sannegrinovero> I wasn't actually thinking about deadlock, I
just don't see what you're aiming at with the single lock on the Session
[15:22] <smarlow> to avoid calling Hibernate session.clear from a
background thread while the application thread is also doing a Hibernate
session invocation. which can lead to random exceptions
[15:22] <smarlow> I'm aiming at solving the concurrency concern ^
[15:22] <sannegrinovero> But I guess you'd need to talk with sebersole,
this lock thing you're proposing comes out of the blue for me :)
[15:22] <sannegrinovero> yes I get that
[15:23] <sannegrinovero> but you should allow the session.clear() to
proceed, not lock it out
[15:23] <smarlow> sure, I'm just trying to talk to who ever listens and
have feedback. Before I get anyway near code changes, Sebersole needs
to be on board (he isn't yet)
[15:24] <sannegrinovero> I'm neither :) willing to discuss for sake of
interest, but can't replace sebersole on this I think :)
[15:24] <smarlow> I posted on as7 dev mailing list a few years ago and
the only feedback that I got was that only Hibernate was not handling
concurrency but there could be others
[15:25] <sannegrinovero> what is the effect you expect to have users see
when the background thread kills the current session?
[15:25] <smarlow> I cross posted as well
[15:25] <smarlow> users shouldn't get NPE errors or unexpected exceptions
[15:26] <sannegrinovero> sure, but what kind of effect would you propose?
[15:26] <sannegrinovero> I guess a different exception with a better
error message?
[15:26] <smarlow> my goal is ensuring that only one thread is invoking
the Hibernate session at a time
[15:27] <sannegrinovero> No that's not true, as you're saying that you
want to allow the TXM to rollback the transaction
[15:27] <smarlow> well, want is a strong word :)
[15:28] <smarlow> IBM/Oracle/JBoss all do that
[15:28] <sannegrinovero> and that's an implementation detail what you
just explained :)
[15:28] <sannegrinovero> what I'm asking is what you expect the user to
experience when this needs to happen
[15:29] <smarlow> when the transaction is rolled back, depending on
where the application code is in the Hibernate session invocation, a
JDBC error might occur or something related to that. The goal is to
avoid mutating the Hibernate session from two different threads concurrently
[15:30] <sannegrinovero> you're not answering my question, that's an
implementation detail :)
[15:30] <smarlow> if we can avoid violating concurrency of the hibernate
session, I think we will be more robust.
[15:31] <sannegrinovero> The goal of the TXM timeout, is to kill stuff
which is taking too long.. not allowing it to finish in a safe lock.
andrigtmiller am I understanding the basics correctly?
[15:31] <smarlow> currently, what the user will experience is poor. The
scope of turning that into a more pleasant experience is a good question
but involves many moving and separate parts
[15:32] <sannegrinovero> so you actually _need_ to allow concurrent
access, to kill and rollback the operations which are being done
[15:32] <smarlow> the goal of TXM timeout handling in JBoss/Oracle/IBM
is to also handle deadlocks that might not otherwise be recovered from
[15:32] <smarlow> the concurrent access happens at the resource level
[15:33] <smarlow> from the background, with components that handle
concurrency. Hibernate doesn't
[15:33] <sannegrinovero> that's just one part of the things
[15:33] <sannegrinovero> and we can definitely have Hibernate handle
some concurrent events
[15:33] <smarlow> Hibernate sessions are not supposed to handle
concurrency, so its a design flaw
[15:33] <sannegrinovero> the TXM shouldn't invoke clear() (which is
public API) but invoke a specific method which could provide the needed
semantics
[15:33] <smarlow> its not some events, its many events
[15:34] <sannegrinovero> why do you say it's a design flaw? it's not,
for the "normal" public API usage
[15:34] <smarlow> the TXM, calls the Hibernate
Synchronization.afterCompletion(int) and Hibernate detaches entities
[15:34] <smarlow> and Hibernate does other cleaning up as well
[15:35] <smarlow> its a design flaw as its not handled in our system
[15:35] <sannegrinovero> I don't feel I'm making progress in the
conversation if you keep repeating the implementation details, we can
fix that as we please
[15:35] <sannegrinovero> but you have to explain the expected user
experience
[15:36] <smarlow> you mean if we didn't have any constraints of any of
the existing specs that we implement?
[15:36] <sannegrinovero> yes, just explain what you think it should do
please, that would help me understand, and I think I can propose you
something more concrete
[15:37] <smarlow> if we ignore XA, JTA + JPA, I can answer the question
better and will attempt that
[15:37] <sannegrinovero> the problem is the user sees an NPE right?
[15:38] <smarlow> could be an NPE or any unexpected exception that comes
from using a thread-unsafe component from multiple threads
[15:38] <sannegrinovero> so the basic question is what do you think the
user should "see".. I guess another exception, say
HibernateRolledBackException("The Transaction Manager decided we run out
of time")
[15:38] <sannegrinovero> sure. would you like the above proposal ^
[15:39] <sannegrinovero> Because implementing that is easy :)
[15:41] <sannegrinovero> ? If you're busy I'm happy to talk later, I am
3h late with my lunch
[15:41] <smarlow> in a perfect world, 1) the user registers an
application event listener that tells that the transaction timed out.
2) whether the application thread catches a signal that the
transaction is about to be rolled back. 3) The application thread then
catches a signal that the transaction was rolled back
[15:41] <sannegrinovero> that doesn't explain what you expect the client
code to experience.
[15:41] <smarlow> 4) the application thread then catches a signal that
Synchronization call backs are going to happen to clean up after the
roll back
[15:42] <sannegrinovero> say the user was incoking a "Query.list()" ..
what do we return?
[15:42] <smarlow> what could they expect? They could be in the middle
of code that doesn't use the Hibernate session or could be in code that does
[15:43] <sannegrinovero> Exactly my point, so we'd return an exception
like I proposed above right?
[15:43] <smarlow> if you say so but no one knows how to do that
[15:43] <smarlow> I mean, I'm not sure what returns that
[15:44] <smarlow> we had tried doing what you suggest but it was too
performance expensive and covered too few cases
[15:44] <smarlow> but sure, throwing an EJB rolledback exception is ideal
[15:45] <sannegrinovero> you can do it in an efficient way, but yes I
agree you'd have to patch several areas of code.
[15:45] <sannegrinovero> I'd do it incrementally though, start with the
Session and Query APIs, see if people like it
[15:45] <smarlow> our previous attempt that failed, added a pre-check at
the front of every Hibernate session call, tail end and middle
[15:46] <smarlow> but that didn't work because it didn't handle remote
transactions or distributed transactions
[15:46] <sannegrinovero> the EntityManager implementation has some kind
of "exception translator"
[15:46] <smarlow> and performance suffered
[15:46] <sannegrinovero> all you need to do is catch exceptions, and
re-throw the better explanation
[15:46] <smarlow> so, we have less of the checking today and still don't
handle remote transactions and distributed transactions
[15:47] <sannegrinovero> the TXM just needs to flag a volatile field in
the Session to inject the error it wants us to tell
[15:47] <sannegrinovero> so on the optimal path you just have a volatile
field read operation, which is a zero cost essentially
[15:47] <sannegrinovero> (optimal path I mean the case in which there
are no errors)
[15:48] <smarlow> then we need to poll for that flag in the
start/middle/end (or so) if every Hibernate method
[15:48] <sannegrinovero> no
[15:48] <smarlow> or maybe in the finally clause of every method
[15:48] <sannegrinovero> you just need to catch exceptions
[15:48] <sannegrinovero> and we happen to already do that, because you
need to translate all Hibernate specific exceptions to JPA specific ones
[15:49] <sannegrinovero> so it's actually a very simple patch with zero
overhead I think
[15:49] <smarlow> at worse, we catch exceptions (including NPE) and
notice the flag and then eat the cause exception?
[15:49] <sannegrinovero> +1
[15:49] <smarlow> if we show the cause, its confusing
[15:49] <sannegrinovero> you don't need the finally method either though
[15:49] <smarlow> if we eat it, its confusing
[15:49] <sannegrinovero> right, don't show the cause.
[15:49] <smarlow> I hate when exceptions are eaten
[15:50] <sannegrinovero> well.. the TXM should provide you a sensible
error message, like "aborted because of ... ", and we take that as
explanation.
[15:50] <smarlow> but only if that flag is set
[15:50] <sannegrinovero> of course!
[15:50] <smarlow> sounds like a worthwhile idea, thanks
[15:51] <sannegrinovero> np, I'm very happy if it works :)
[15:51] <sannegrinovero> the exception translation is already done
somewhere in the EM project
[15:51] <sannegrinovero> I don't remember the details, but I'm sure
you're more familiar with it
[15:51] <sannegrinovero> just make sure you look that up
[15:52] <sannegrinovero> it catches all exceptions from Hibernate to
wrap them in something else, to satisfy the spec requirements
-- Sanne