[
http://opensource.atlassian.com/projects/hibernate/browse/HHH-5608?page=c...
]
mikhailfranco edited comment on HHH-5608 at 9/30/10 10:13 AM:
--------------------------------------------------------------
It is not a pure batch process, it's a combination of streaming in event data to be
stored, and various derived calculations. At the moment, the operations are interleaved
through callbacks, but I suppose they could be separated in some way.
My current approach is to try using detach() to evict individual entities from the session
cache after they are created, but it involves touching a lot of code.
How can I create a stateless session through JPA ?
I assume I could cast the EntityManager.getDelegate() back to a Hibernate Session if I
need to, then call clear().
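For reference, a minimal sketch of the periodic flush-and-clear pattern. persist(), flush() and clear() all exist on the standard JPA EntityManager, so clear() should not need a cast to the Hibernate Session. To keep the sketch compilable without a JPA provider, it is written against a hypothetical stand-in interface, PersistenceOps, that mirrors those three methods. BatchWriter and batchSize are also invented names for illustration.

```java
import java.util.List;

// Hypothetical stand-in for the three javax.persistence.EntityManager
// methods used below, so the sketch compiles without a JPA provider.
interface PersistenceOps {
    void persist(Object entity);
    void flush();
    void clear();
}

final class BatchWriter {

    /** Persists all items, flushing and clearing the context every batchSize items. */
    static void persistInBatches(PersistenceOps em, List<?> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int pending = 0;
        for (Object item : items) {
            em.persist(item);
            if (++pending == batchSize) {
                em.flush();   // push pending inserts to the database
                em.clear();   // detach everything so the session cache stays small
                pending = 0;
            }
        }
        if (pending > 0) {    // handle the final partial batch
            em.flush();
            em.clear();
        }
    }
}
```

Note that clear() detaches every managed object, including any that callbacks still hold, so this only fits if the streaming and derived-calculation steps can tolerate detached instances.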
I still think the session cache should be bounded in some way, but I realize this would
detach application objects at some arbitrary time, behind the application's back, and
cause spurious 'detached object' errors, which would force the application to
merge() all over the place. Which brings us back to everyone's favorite JPA issue: why
does JPA throw a 'detached object' error and not just call merge itself ? Because
merge creates a whole new object ... etc. etc.
Performance problem with cascadeOnFlush
----------------------------------------
Key: HHH-5608
URL: http://opensource.atlassian.com/projects/hibernate/browse/HHH-5608
Project: Hibernate Core
Issue Type: Bug
Components: core
Affects Versions: 3.5.2
Environment: Hibernate 3.5.2 JPA 2.0
HSQLDB 2.0.0 and PostgreSQL 8.4
Windows XP 2002
JDK 1.6.0_11
Reporter: mikhailfranco
Performance of adding entities to the database degrades as O(n^2) or worse.
The problem seems to be in AbstractFlushingEventListener.
The method prepareEntityFlushes() calls cascadeOnFlush() for every object in the session
cache on every flush, so with n flushes over a cache that grows to n entities,
cascadeOnFlush ends up being called on the order of n^2 times.
The cascadeOnFlush method seems to be very slow, even when there is nothing to do, i.e.
no actual cascading. It appears to create a new Cascade object and do lots of work in
cascade() for every invocation.
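To make the quadratic growth concrete, here is a toy cost model in Java. It is not Hibernate code. It assumes only that each flush visits every entity currently held in the session cache, as described above. The class name FlushCostModel, its parameters, and the input numbers are made up for illustration.

```java
// Toy model only: counts how often a per-entity check would run, assuming
// every flush walks every entity currently held in the session cache.
final class FlushCostModel {

    /**
     * @param flushes          number of flushes (for example, one per commit)
     * @param entitiesPerFlush new entities persisted before each flush
     * @param clearEvery       clear the cache after this many flushes; 0 means never
     * @return total number of simulated per-entity cascade checks
     */
    static long cascadeChecks(int flushes, int entitiesPerFlush, int clearEvery) {
        long checks = 0;
        long cached = 0;
        for (int i = 1; i <= flushes; i++) {
            cached += entitiesPerFlush;  // new entities join the cache
            checks += cached;            // the flush visits the whole cache
            if (clearEvery > 0 && i % clearEvery == 0) {
                cached = 0;              // stands in for clearing the session
            }
        }
        return checks;
    }

    public static void main(String[] args) {
        System.out.println("never cleared:    " + cascadeChecks(1000, 1, 0));
        System.out.println("cleared every 50: " + cascadeChecks(1000, 1, 50));
    }
}
```

With these illustrative inputs, 1000 flushes of one new entity each cost 500500 checks if the cache is never cleared, and 25500 if it is cleared every 50 flushes.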
Also see the description given in this thread:
http://www.mail-archive.com/nhusers@googlegroups.com/msg14727.html
> I think there is a problem with our mapping or something with how
> we're using NHibernate. Whenever we've done performance testing we've
> always noticed that NH takes up a large amount of time and usually
> it's a ridiculous amount of time based on how little data should be
> saved.
>
> I started doing some profiling to try to see what we're doing wrong
> and I noticed during one test that there were 50 calls to
> PrepareEntityFlushes which is an acceptable amount of calls but then I
> noticed it resulted in 584,441 calls to CascadeOnFlush.
>
> So my question is does this number seem a bit excessive
> or is this normal?
Clearly this is excessive!
He goes on to say:
> I've made another interesting discovery. The problem was actually
> caused by the fact we were calling a Flush() after every SaveOrUpdate
> (silly way to try to make sure we had an Id -- we're handling things
> differently now). By switching to AutoFlush we were able to take
> something that took an operation that took 2 hours (and spent +90% of
> the time in NH code) and lower it to 4 minutes.
In my case, I do not call flush() explicitly, but add lots of entities in transactions, and
each commit triggers a flush. The profiler shows 800 calls to commit, and 85,000 calls to
cascadeOnFlush. There are no cascade relationships declared for the entities being persisted.
Not sure what the solution could be, probably some combination of:
- making cascade check more efficient, don't create objects,
truncate cascade quickly if there is no work to do, etc.
- limit the size of the cache, use bounded cache, maybe LRU algorithm, etc.
- give some kind of diagnostic warning if a large cache is configured,
with lots of cascade relationships in the ORM
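As a rough illustration of the bounded-cache idea, here is a size-limited LRU map built on java.util.LinkedHashMap. This is a hypothetical sketch, not how Hibernate's session cache is implemented, and BoundedLruCache and maxEntries are invented names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a size-bounded LRU map, not Hibernate's session cache.
final class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        super(16, 0.75f, true);  // accessOrder = true orders entries by last access
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked after each insertion; returning true evicts the
        // least recently used entry.
        return size() > maxEntries;
    }
}
```

In a real session, each eviction would silently detach an entity, so the application would need to handle detached instances or be notified when an eviction happens.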