[ http://jira.jboss.com/jira/browse/HIBERNATE-41?page=comments#action_12340066 ]
Yegor Yenikyeyev commented on HIBERNATE-41:
-------------------------------------------
Well, I really don't mean to dictate this point of view. My proposal described above is based
upon three thoughts:
(1) There is no need to put newly loaded data into L2 as long as it is available in
L1/session. I assume this normally works "within the transaction" and for
PROPAGATION_REQUIRES_NEW in combination with READ_UNCOMMITTED. This makes me think that
putting into L2 *during a transaction* is not 100% necessary. Do you see any other major
reasons why it needs to be stored in L2 before a commit is invoked?
(2) Second, and the most important for me: when we put into L2, it keeps a write lock for
that key until the unlock interceptor is called (which won't happen until the transaction is
done). With pessimistic locks and READ_UNCOMMITTED it's painful b/c I can't obtain
the same instance of the Round object in a PROPAGATION_REQUIRES_NEW method which I call from
within the transaction. As soon as I call this method and it loads Round with FOR_UPGRADE
mode, it tries to put the newly loaded instance into L2 again. But the write lock is still there!
With optimistic locks you could probably ignore the exception caused by the existing write lock
and continue without updating the L2 instance, but it's impossible with pessimistic locks.
Do you see any other way I can load the same object instance in both transactions
when tx1 calls tx2?
(3) Finally, we really want better performance for our system from L2 (and I suppose
everyone does 8-). Why don't we improve the concurrency of the ORM layer by decreasing
the total time a write lock is held? Don't you think that using tx commit interceptor(s) for
putting object(s) into the L2 cache could improve concurrency?
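To make the proposal in (1) and (3) concrete, here is a minimal sketch of the idea, not Hibernate's or TreeCache's actual API. The class DeferredL2Puts and its methods onLoad, afterCommit and afterRollback are hypothetical names, and a plain map stands in for the L2 cache. Loaded entities are only remembered during the transaction and are published to L2 in one short step from a commit callback, so no L2 write lock would be held while the transaction runs.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a plain map stands in for the L2 cache (TreeCache). */
class DeferredL2Puts {
    private final Map<String, Object> l2 = new HashMap<>();
    // Entities loaded in the current transaction, not yet published to L2.
    private final Map<String, Object> pending = new HashMap<>();

    /** Called when an entity is loaded: remember it, do not touch L2. */
    void onLoad(String key, Object value) {
        pending.put(key, value);
    }

    /** Commit callback: publish everything to L2 in one short step. */
    void afterCommit() {
        l2.putAll(pending);
        pending.clear();
    }

    /** Rollback callback: nothing reached L2, so just drop the pending entries. */
    void afterRollback() {
        pending.clear();
    }

    boolean inL2(String key) {
        return l2.containsKey(key);
    }

    public static void main(String[] args) {
        DeferredL2Puts cache = new DeferredL2Puts();
        cache.onLoad("Round#1", "round state");
        System.out.println(cache.inL2("Round#1")); // false: still inside the transaction
        cache.afterCommit();
        System.out.println(cache.inL2("Round#1")); // true: published at commit
    }
}
```

In a real integration the two callbacks would presumably be driven by the transaction manager's completion notification; that wiring is left out here.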
Also I would like to ask you this: do you think that the deadlock I'm getting is
something you guys wanted to add to Hibernate "by design"? I'm just trying
to figure out if this is a *real* problem b/c for now I do not see why anybody could say
that the program flow we have is illegal, makes no sense, or contains a mistake. If
it's really a problem in my code then I would like to know what the expected flow for
that task is. Please advise.
Thanks!
Usage of load() in PROPAGATION_REQUIRES_NEW within
PROPAGATION_REQUIRED method causes a deadlock
------------------------------------------------------------------------------------------------
Key: HIBERNATE-41
URL:
http://jira.jboss.com/jira/browse/HIBERNATE-41
Project: Hibernate
Issue Type: Bug
Environment: SUSE 10.1, kernel 2.6.15 SPM, JDK 1.5.0_06, PostgreSQL 8.0.1
Reporter: Yegor Yenikyeyev
Assigned To: Steve Ebersole
Priority: Blocker
Attachments: nloptc.zip
It seems we discovered unpredictable Hibernate and/or TreeCache behavior after
upgrading JBossCache from 1.2.4SP2 to 1.3.0SP2. I can confirm that the
same problem appears with 1.4.0CR2. I do not think that it's a Hibernate-only or
TreeCache-only issue, but I do think it's some kind of integration issue or
misunderstanding of how TreeCache transaction isolation is implemented. According to Manik
Surtani, JBossCache v1.2.4 had an issue with its READ_COMMITTED implementation (JBCACHE-218),
and since it was fixed, any usage of PROPAGATION_REQUIRES_NEW within a PROPAGATION_REQUIRED
method causes a deadlock if both methods load the same instance of an entity.
Our application works in a clustered environment and we use JBossCache as the L2 cache solution
for Hibernate 3.1.3 (I checked this with 3.2.0CR2 as well). Our settings for JBossCache
are REPL_SYNC and READ_COMMITTED, and our target business object methods (f1 and f2) have
PROPAGATION_REQUIRED and PROPAGATION_REQUIRES_NEW respectively. Our JDBC driver is 3.0 compliant.
Our object hierarchy is as follows: Occasion contains a link to Round and Round contains a link to
Tournament. Round is NOT configured as a "lazy" field in the Occasion mapping b/c we
always need to have it initialized.
Here is in short what we try to do in our application:
(1) Transaction1: Call the f1 (PROPAGATION_REQUIRED) method of a business object, which
causes Occasion1 to be loaded via a cacheable query. After that Hibernate initializes
the Occasion1.round field and loads Round1.
(2) Transaction1: Hibernate puts loaded Occasion1 and Round1 in L2 cache.
(3) Transaction1: TreeCache creates com/companyname/Occasion/com.companyname.Occasion#1
region and obtains WriteLock (WL1)
(4) Transaction1: TreeCache creates com/companyname/Round/com.companyname.Round#1 region
and obtains WriteLock (WL2)
(5) Transaction1: Do some business logic stuff
(6) Transaction1: We expect the current transaction to be long and we want to change the status
of Occasion1 in the DB very quickly. At this point we need an exclusive lock on the appropriate
row in the DB table to change the status and commit it. In order to do this we call f2
(PROPAGATION_REQUIRES_NEW), which is supposed to be a REALLY short transaction that releases
the lock on the DB row as fast as possible.
(7) Transaction2: Transaction1 is SUSPENDED at this point. We call HibernateTemplate (we use
Spring as well) to load Occasion1 for update with the LockMode.UPGRADE flag and get an exclusive
lock.
(8) Transaction2: Hibernate does NOT check for an instance of Occasion1 in L2 cache ( I
suppose it's b/c we obviously do want to lock it for update )
(9) Transaction2: Hibernate does check for an instance of Round1 in L2 cache and it calls
get() on TreeCache to obtain com/companyname/Round/com.companyname.Round#1
(10) Transaction2: At this point 1.3.0SP2 tries to obtain ReadLock for
com/companyname/Round/com.companyname.Round#1 and it can't b/c there is a WL for that
node in suspended Transaction1 !!! It can't obtain ReadLock for Round#1 anyhow!
(11) Transaction2: Stuck waiting for WL2 to be released in TreeCache, but it can't be
released because Transaction1 is suspended and waits for Transaction2 to finish.
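The core of steps (4), (7) and (10) can be reproduced with a small stand-alone simulation. This is only a sketch under stated assumptions: TxNodeLock is a hypothetical, heavily simplified stand-in for a TreeCache node lock (it does not use JBossCache classes and does not track readers), and the only property it models is that the lock is owned by a transaction rather than by a thread. A read timeout of 200 ms is used so the demo terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical node lock whose owner is a transaction, not a thread. */
class TxNodeLock {
    private final ReentrantLock monitor = new ReentrantLock();
    private final Condition released = monitor.newCondition();
    private Object writer; // transaction holding the write lock, or null

    /** Takes the write lock for tx if it is free or already owned by tx. */
    boolean tryWrite(Object tx) {
        monitor.lock();
        try {
            if (writer == null || writer == tx) {
                writer = tx;
                return true;
            }
            return false;
        } finally {
            monitor.unlock();
        }
    }

    /** Waits up to timeoutMillis; fails while a different tx holds the write lock. */
    boolean tryRead(Object tx, long timeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        monitor.lock();
        try {
            while (writer != null && writer != tx) {
                if (nanos <= 0) {
                    return false;
                }
                nanos = released.awaitNanos(nanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            monitor.unlock();
        }
    }

    /** Releases the write lock if tx owns it and wakes up waiting readers. */
    void releaseWrite(Object tx) {
        monitor.lock();
        try {
            if (writer == tx) {
                writer = null;
                released.signalAll();
            }
        } finally {
            monitor.unlock();
        }
    }
}

class SuspendedTxDeadlockDemo {
    /** Returns true if Transaction2 could read Round#1 while Transaction1 is suspended. */
    static boolean tx2CanReadWhileTx1Suspended() {
        TxNodeLock roundNode = new TxNodeLock();
        Object tx1 = new Object();
        Object tx2 = new Object();
        roundNode.tryWrite(tx1);                     // step (4): tx1 put Round1, holds WL2
        // step (7): tx1 is suspended; the same thread now runs tx2
        boolean read = roundNode.tryRead(tx2, 200);  // step (10): tx2 asks for a read lock
        roundNode.releaseWrite(tx1);                 // in reality only happens when tx1 ends
        return read;
    }

    public static void main(String[] args) {
        System.out.println("tx2 read succeeded: " + tx2CanReadWhileTx1Suspended());
        // prints "tx2 read succeeded: false"
    }
}
```

Because Transaction1 cannot finish until the call into Transaction2 returns, the timeout is the only way out in this model; without one, the wait would never end.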
Obviously this situation is ridiculous - a legal sequence of operations causes a deadlock
on TreeCache. We do not expect com/companyname/Round/com.companyname.Round#1 to be visible
in Transaction2 b/c we use READ_COMMITTED, but WL2 must not affect Transaction2 in this way.
If TreeCache prevents other transactions from reading
com/companyname/Round/com.companyname.Round#1, it must not tell other transactions that the
node exists, in order to keep READ_COMMITTED behavior consistent. For now it is simply preventing
everybody from using PROPAGATION_REQUIRES_NEW.
The described scenario works with 1.2.4SP2 without a problem. I have serious doubts
that the READ_COMMITTED strategy is really implemented in v1.2.4, but at least the behavior is
more consistent compared to v1.3.0. As far as I understand, this is a result of the JBCACHE-218
bugfix.
We tried to change PROPAGATION_REQUIRES_NEW to PROPAGATION_NESTED and take advantage of
nested transactions. We assumed that com/companyname/Round/com.companyname.Round#1 would be
available in a nested Transaction2 from Transaction1. But PROPAGATION_NESTED isn't
supported by the current JBossTransaction implementation (see line 209 in TxManager.java from
4.0.4.GA).
We could change isolation to READ_UNCOMMITTED, but that's simply impossible in many other
places of our application.
We could use a trick and load Occasion1 with Round1 in a separate Transaction0 before
starting Transaction1, but we HAVE to use the LRU policy. That is why there is no way for us
to make sure that eviction won't happen between Transaction0 and Transaction1. If it
happened, we would be in the same situation as described above.
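The eviction risk of the Transaction0 trick can be illustrated with a tiny LRU map. This is a sketch only: LruRegion is a hypothetical class built on java.util.LinkedHashMap, not the JBossCache LRU eviction policy, and the capacity of 2 nodes is chosen just to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU region: evicts the least recently used entry above maxNodes. */
class LruRegion<K, V> extends LinkedHashMap<K, V> {
    private final int maxNodes;

    LruRegion(int maxNodes) {
        super(16, 0.75f, true); // true = iterate in access order, oldest first
        this.maxNodes = maxNodes;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxNodes;
    }
}

class PreloadEvictionDemo {
    /** Returns true if Round#1, preloaded in Transaction0, is still cached afterwards. */
    static boolean roundSurvives(int otherLoads) {
        LruRegion<String, String> cache = new LruRegion<>(2);
        cache.put("Round#1", "preloaded in Transaction0");
        // Unrelated loads by other transactions before Transaction1 starts.
        for (int i = 0; i < otherLoads; i++) {
            cache.put("Other#" + i, "x");
        }
        return cache.containsKey("Round#1");
    }

    public static void main(String[] args) {
        System.out.println(roundSurvives(1)); // true: still within capacity
        System.out.println(roundSurvives(2)); // false: Round#1 was evicted
    }
}
```

Once the preloaded node is gone, Transaction1 has to load and put Round1 itself, which brings back the write lock from step (4).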
Finally, we could stop using Transaction2, but our application is intended to handle a large
amount of traffic, and since Transaction1 takes up to 3sec (compared to 50ms for
Transaction2) we might get up to 700-1000 transactions queued waiting for the table row lock
to be released, and we just can't allow this.
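A rough back-of-the-envelope check of these numbers: the number of transactions that pile up behind one lock holder is approximately the arrival rate multiplied by the lock hold time. The arrival rate of 300 transactions per second below is a hypothetical figure, picked only because it lands inside the 700-1000 range quoted above, and the sketch assumes that all arriving transactions contend for the same row.

```java
class RowLockQueueEstimate {
    /** Transactions arriving while a single holder keeps the row lock. */
    static long waiting(long arrivalsPerSecond, long lockHoldMillis) {
        return arrivalsPerSecond * lockHoldMillis / 1000;
    }

    public static void main(String[] args) {
        long rate = 300; // hypothetical arrivals per second, not a measured value
        System.out.println(waiting(rate, 3000)); // 900 waiting behind a 3 s Transaction1
        System.out.println(waiting(rate, 50));   // 15 waiting behind a 50 ms Transaction2
    }
}
```

Under that assumption, holding the row lock for 50 ms instead of 3 s shrinks the queue by a factor of 60.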
From what I see in the Hibernate TreeCache sources, I have no idea how to avoid the
situation described above. One of my developers told me that it's probably possible to
put stuff into the L2 cache on transaction commit, which would decrease WL time and resolve the
issue with the deadlock. Honestly, I'm seriously concerned about how it applies to existing
Hibernate. I think small issues, like the performance cost of loading the same object more than
once during one transaction, can be resolved by using the L1 cache or JDBC driver abilities. But
I guess there is plenty of work to make this work for cacheable queries.
Another option I see is to do a trick and put values for Round1 and Occasion1 into a new
region for Transaction2 if we know that Transaction1 is suspended and owns WLs for various
nodes. I really do not like this way b/c in fact it's not pure pessimistic locking.
But the issue described above is a worse price to pay for a "pure" READ_COMMITTED strategy.
In fact it is a showstopper, assuming there is no way to use PROPAGATION_NESTED.