On 08/07/2014 01:12 PM, Steve Ebersole wrote:
We only check whether the thread is different from the
"application
thread". As I have mentioned to you about a bazillion times now (give
or take ;) this is all a best effort. As you well know, we simply do
not have all the pertinent information we need to know what the nature
of the various threads in play are.
I just figured out that synchronizing the Hibernate session would not be
enough, as the Hibernate session could potentially be called by the
application thread to mutate the session state (after all entities are
detached).
I thought we had a valid suggestion, of changing the transaction manager
to set rollback only instead of rolling the transaction back from the
background but that doesn't seem to be happening.
Not an easy problem.
On Thu, Aug 7, 2014 at 10:33 AM, Scott Marlow <smarlow(a)redhat.com
<mailto:smarlow@redhat.com>> wrote:
Sanne,
One question that I didn't ask before, when the Hibernate
Synchronization.__afterCompletion(int) is called (with rolledback TX
status), how does Hibernate know if the transaction was rolled back
from the transaction reaper thread? I tried checking the thread
name in WildFly integration code but that is wrong, as the thread
name may be a transaction manager communications thread that is
invoked on behalf of a remote (JVM) reaper thread.
Currently, are checking thread ids which does not work for the
distributed case involving multiple jvms. The current approach also
does not work for remote invocations that use the same JTA
transaction (as the below test case simulates).
Scott
On 08/07/2014 11:02 AM, Sanne Grinovero wrote:
Steve,
we had a conversation about this on IRC, I think we came to a
hopefully
nice
solution
.
[15:18] <sannegrinovero> andrigtmiller: I still think you're
locking out
the TXM, not making it possible to legitimately timeout queries..
[15:19] <sannegrinovero> (but I don't have the full picture..
not really
my area)
[15:19] <smarlow> that is a transaction manager concern but they
already
handled calling other synchronization call backs. currently, Arjuna
holds a lock on the collection of synchronizations at that time, so
there should be no conflict
[15:20] <smarlow> I asked about your concern yesterday to make
sure we
wouldn't deadlock
[15:20] <sannegrinovero> I wasn't actually thinking about
deadlock, I
just don't see what you're aiming at with the single lock on the
Session
[15:22] <smarlow> to avoid calling Hibernate session.clear from a
background thread while the application thread is also doing a
Hibernate
session invocation. which can lead to random exceptions
[15:22] <smarlow> I'm aiming at solving the concurrency concern ^
[15:22] <sannegrinovero> But I guess you'd need to talk with
sebersole,
this lock thing you're proposing comes out of the blue for me :)
[15:22] <sannegrinovero> yes I get that
[15:23] <sannegrinovero> but you should allow the session.clear() to
proceed, not lock it out
[15:23] <smarlow> sure, I'm just trying to talk to who ever
listens and
have feedback. Before I get anyway near code changes, Sebersole
needs
to be on board (he isn't yet)
[15:24] <sannegrinovero> I'm neither :) willing to discuss for
sake of
interest, but can't replace sebersole on this I think :)
[15:24] <smarlow> I posted on as7 dev mailing list a few years
ago and
the only feedback that I got was that only Hibernate was not
handling
concurrency but there could be others
[15:25] <sannegrinovero> what is the effect you expect to have
users see
when the background thread kills the current session?
[15:25] <smarlow> I cross posted as well
[15:25] <smarlow> users shouldn't get NPE errors or unexpected
exceptions
[15:26] <sannegrinovero> sure, but what kind of effect would you
propose?
[15:26] <sannegrinovero> I guess a different exception with a better
error message?
[15:26] <smarlow> my goal is ensuring that only one thread is
invoking
the Hibernate session at a time
[15:27] <sannegrinovero> No that's not true, as you're saying
that you
want to allow the TXM to rollback the transaction
[15:27] <smarlow> well, want is a strong word :)
[15:28] <smarlow> IBM/Oracle/JBoss all do that
[15:28] <sannegrinovero> and that's an implementation detail
what you
just explained :)
[15:28] <sannegrinovero> what I'm asking is what you expect the
user to
experience when this needs to happen
[15:29] <smarlow> when the transaction is rolled back, depending on
where the application code is in the Hibernate session invocation, a
JDBC error might occur or something related to that. The goal is to
avoid mutating the Hibernate session from two different threads
concurrently
[15:30] <sannegrinovero> you're not answering my question, that's
an
implementation detail :)
[15:30] <smarlow> if we can avoid violating concurrency of the
hibernate
session, I think we will be more robust.
[15:31] <sannegrinovero> The goal of the TXM timeout, is to kill
stuff
which is taking too long.. not allowing it to finish in a safe lock.
andrigtmiller am I understanding the basics correctly?
[15:31] <smarlow> currently, what the user will experience is
poor. The
scope of turning that into a more pleasant experience is a good
question
but involves many moving and separate parts
[15:32] <sannegrinovero> so you actually _need_ to allow concurrent
access, to kill and rollback the operations which are being done
[15:32] <smarlow> the goal of TXM timeout handling in
JBoss/Oracle/IBM
is to also handle deadlocks that might not otherwise be
recovered from
[15:32] <smarlow> the concurrent access happens at the resource
level
[15:33] <smarlow> from the background, with components that handle
concurrency. Hibernate doesn't
[15:33] <sannegrinovero> that's just one part of the things
[15:33] <sannegrinovero> and we can definitely have Hibernate handle
some concurrent events
[15:33] <smarlow> Hibernate sessions are not supposed to handle
concurrency, so its a design flaw
[15:33] <sannegrinovero> the TXM shouldn't invoke clear() (which is
public API) but invoke a specific method which could provide the
needed
semantics
[15:33] <smarlow> its not some events, its many events
[15:34] <sannegrinovero> why do you say it's a design flaw? it's
not,
for the "normal" public API usage
[15:34] <smarlow> the TXM, calls the Hibernate
Synchronization.__afterCompletion(int) and Hibernate detaches
entities
[15:34] <smarlow> and Hibernate does other cleaning up as well
[15:35] <smarlow> its a design flaw as its not handled in our system
[15:35] <sannegrinovero> I don't feel I'm making progress in the
conversation if you keep repeating the implementation details,
we can
fix that as we please
[15:35] <sannegrinovero> but you have to explain the expected user
experience
[15:36] <smarlow> you mean if we didn't have any constraints of
any of
the existing specs that we implement?
[15:36] <sannegrinovero> yes, just explain what you think it
should do
please, that would help me understand, and I think I can propose you
something more concrete
[15:37] <smarlow> if we ignore XA, JTA + JPA, I can answer the
question
better and will attempt that
[15:37] <sannegrinovero> the problem is the user sees an NPE right?
[15:38] <smarlow> could be an NPE or any unexpected exception
that comes
from using a thread-unsafe component from multiple threads
[15:38] <sannegrinovero> so the basic question is what do you
think the
user should "see".. I guess another exception, say
HibernateRolledBackException("__The Transaction Manager decided
we run out
of time")
[15:38] <sannegrinovero> sure. would you like the above proposal ^
[15:39] <sannegrinovero> Because implementing that is easy :)
[15:41] <sannegrinovero> ? If you're busy I'm happy to talk
later, I am
3h late with my lunch
[15:41] <smarlow> in a perfect world, 1) the user registers an
application event listener that tells that the transaction timed
out.
2) whether the application thread catches a signal that the
transaction is about to be rolled back. 3) The application
thread then
catches a signal that the transaction was rolled back
[15:41] <sannegrinovero> that doesn't explain what you expect
the client
code to experience.
[15:41] <smarlow> 4) the application thread then catches a
signal that
Synchronization call backs are going to happen to clean up after the
roll back
[15:42] <sannegrinovero> say the user was incoking a
"Query.list()" ..
what do we return?
[15:42] <smarlow> what could they expect? They could be in the
middle
of code that doesn't use the Hibernate session or could be in
code that does
[15:43] <sannegrinovero> Exactly my point, so we'd return an
exception
like I proposed above right?
[15:43] <smarlow> if you say so but no one knows how to do that
[15:43] <smarlow> I mean, I'm not sure what returns that
[15:44] <smarlow> we had tried doing what you suggest but it was too
performance expensive and covered too few cases
[15:44] <smarlow> but sure, throwing an EJB rolledback exception
is ideal
[15:45] <sannegrinovero> you can do it in an efficient way, but
yes I
agree you'd have to patch several areas of code.
[15:45] <sannegrinovero> I'd do it incrementally though, start
with the
Session and Query APIs, see if people like it
[15:45] <smarlow> our previous attempt that failed, added a
pre-check at
the front of every Hibernate session call, tail end and middle
[15:46] <smarlow> but that didn't work because it didn't handle
remote
transactions or distributed transactions
[15:46] <sannegrinovero> the EntityManager implementation has
some kind
of "exception translator"
[15:46] <smarlow> and performance suffered
[15:46] <sannegrinovero> all you need to do is catch exceptions, and
re-throw the better explanation
[15:46] <smarlow> so, we have less of the checking today and
still don't
handle remote transactions and distributed transactions
[15:47] <sannegrinovero> the TXM just needs to flag a volatile
field in
the Session to inject the error it wants us to tell
[15:47] <sannegrinovero> so on the optimal path you just have a
volatile
field read operation, which is a zero cost essentially
[15:47] <sannegrinovero> (optimal path I mean the case in which
there
are no errors)
[15:48] <smarlow> then we need to poll for that flag in the
start/middle/end (or so) if every Hibernate method
[15:48] <sannegrinovero> no
[15:48] <smarlow> or maybe in the finally clause of every method
[15:48] <sannegrinovero> you just need to catch exceptions
[15:48] <sannegrinovero> and we happen to already do that,
because you
need to translate all Hibernate specific exceptions to JPA
specific ones
[15:49] <sannegrinovero> so it's actually a very simple patch
with zero
overhead I think
[15:49] <smarlow> at worse, we catch exceptions (including NPE) and
notice the flag and then eat the cause exception?
[15:49] <sannegrinovero> +1
[15:49] <smarlow> if we show the cause, its confusing
[15:49] <sannegrinovero> you don't need the finally method
either though
[15:49] <smarlow> if we eat it, its confusing
[15:49] <sannegrinovero> right, don't show the cause.
[15:49] <smarlow> I hate when exceptions are eaten
[15:50] <sannegrinovero> well.. the TXM should provide you a
sensible
error message, like "aborted because of ... ", and we take that as
explanation.
[15:50] <smarlow> but only if that flag is set
[15:50] <sannegrinovero> of course!
[15:50] <smarlow> sounds like a worthwhile idea, thanks
[15:51] <sannegrinovero> np, I'm very happy if it works :)
[15:51] <sannegrinovero> the exception translation is already done
somewhere in the EM project
[15:51] <sannegrinovero> I don't remember the details, but I'm
sure
you're more familiar with it
[15:51] <sannegrinovero> just make sure you look that up
[15:52] <sannegrinovero> it catches all exceptions from Hibernate to
wrap them in something else, to satisfy the spec requirements
-- Sanne