Re: [hibernate-dev] Multiple-object batched inserts
by Brett Wooldridge
Yeah, I read through the whole discussion. Pity about the Map - would have
been much more efficient than sorting - but I'm glad there is a solution
nonetheless.
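
For anyone following along, here is a rough picture of what "sorting" could mean here. This is only a sketch under my own assumptions, not the HHH-1 patch: the class name InsertGroupingSketch and the method groupByEntity are made up, entities are reduced to plain name strings, and associations between entities (which the patch accounts for, per the message quoted below) are ignored entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: NOT the HHH-1 patch. It ignores associations between
// entities, which a real solution has to respect.
class InsertGroupingSketch {

    // Reorders entity names so that equal names become adjacent. Groups appear
    // in order of first appearance; order inside each group is preserved.
    static List<String> groupByEntity(List<String> insertionOrder) {
        final Map<String, Integer> firstSeen = new HashMap<>();
        for (String name : insertionOrder) {
            firstSeen.putIfAbsent(name, firstSeen.size());
        }
        List<String> grouped = new ArrayList<>(insertionOrder);
        // List.sort is stable, so entries with the same rank keep their order.
        grouped.sort((a, b) -> Integer.compare(firstSeen.get(a), firstSeen.get(b)));
        return grouped;
    }

    public static void main(String[] args) {
        List<String> interleaved = Arrays.asList("Foo", "Bar", "Foo", "Bar", "Foo", "Bar");
        System.out.println(groupByEntity(interleaved)); // [Foo, Foo, Foo, Bar, Bar, Bar]
    }
}
```

Once like inserts are adjacent, a batcher that only remembers the last SQL no longer flaps. The cost is the sort itself, which is the efficiency point made above.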
-Brett
On 8/23/07 7:36 AM, "Steve Ebersole" <steve(a)hibernate.org> wrote:
> Wanted to point out that a Map keyed by SQL is not sufficient, because that
> approach would not properly account for self-referential associations (the
> same SQL).
>
> The patch provided for HHH-1 is well thought out in terms of accounting for
> all association scenarios. It does not, of course, handle multi-table
> structures which require a much different type of solution given how
> Hibernate groups the DML for these behind the persisters (I know the general
> approach I will use to solve this one now, btw).
>
> On Thursday 23 August 2007 01:32:02 am Max Rydahl Andersen wrote:
>> http://opensource.atlassian.com/projects/hibernate/browse/HHH-1
>>
>> /max
>>
>>> I thought this question was a little too technical for the general list,
>>> so I thought I'd ask it here. Is there a technical reason why Hibernate
>>> does not or cannot support the batching of multiple (different) object
>>> insertions within the same session?
>>>
>>> I have code that is semantically equivalent to:
>>>
>>>     Session session = sessionFactory.getCurrentSession();
>>>     for (int i = 0; i < 1000; i++) {
>>>         Foo foo = new Foo();
>>>         session.save(foo);
>>>
>>>         Bar bar = new Bar();
>>>         session.save(bar);
>>>     }
>>>     session.flush();
>>>
>>> If Foo or Bar is not in the equation, batching occurs as desired.
>>> However, if both Foo and Bar are inserted during the session in
>>> alternating fashion the result is that in the ActionQueue the 'insertion
>>> list' is interleaved with Foo's and Bar's. This is natural enough it
>>> seems. However, when ActionQueue.executeActions() is called the
>>> following occurs...
>>>
>>> 1) executeActions() iterates through the insertion list calling execute()
>>> on each Executable in the list
>>> 2) the Executable.execute() method calls insert() on the associated
>>> persister, in this case a SingleTableEntityPersister. It seems to realize
>>> that it CAN batch these inserts and so ...
>>> 3) insert() calls the BatchingBatcher.prepareBatchStatement(sql) which is
>>> actually a method on AbstractBatcher. This is where things go "awry"...
>>>
>>> The prepareBatchStatement(sql) does this thing where it compares the SQL
>>> statement that was provided with the _last SQL that it executed_ and if
>>> they match it re-uses the _last PreparedStatement_ which it also hung
>>> onto, otherwise it closes the existing PreparedStatement and creates a
>>> new one -- effectively ending the current "batch".
>>>
>>> In this case, because the objects are interleaved in the insertion list
>>> the result is that the prepareBatchStatement() is called with:
>>>
>>> insert into foo (a, b, c) values (?, ?, ?)
>>> insert into bar (x, y, z) values (?, ?, ?)
>>> insert into foo (a, b, c) values (?, ?, ?)
>>> insert into bar (x, y, z) values (?, ?, ?)
>>> insert into foo (a, b, c) values (?, ?, ?)
>>> insert into bar (x, y, z) values (?, ?, ?)
>>> ...
>>>
>>> The result is that each subsequent statement terminates the statement
>>> prepared before it, preventing batching from occurring. Well, it logs a
>>> debug message saying "Executing batch size: 1" for each statement.
>>>
>>> So my question is this ... is there any reason not to use a Map<sql,
>>> PreparedStatement> construct in the BatchingBatcher such that the
>>> appropriate PreparedStatement can be located and re-used based on the
>>> incoming SQL rather than only retaining the _last_ SQL statement which is
>>> guaranteed to "flap" in the case of alternating (or multiple) objects?
>>>
>>> If there is no technical reason why this shouldn't be done or wouldn't
>>> work, I will be happy to make the change and submit a patch. [Can you
>>> tell how bad I need batching in this scenario?]
>>>
>>> Thanks,
>>> Brett Wooldridge
>>> The ZipTie Project
>>> www.ziptie.org
>>>
>>>
>>> _______________________________________________
>>> hibernate-dev mailing list
>>> hibernate-dev(a)lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
Re: [hibernate-dev] Multiple-object batched inserts
by Brett Wooldridge
Thanks for that.
-Brett
On 8/23/07 1:32 AM, "Max Rydahl Andersen" <max.andersen(a)redhat.com> wrote:
>
> http://opensource.atlassian.com/projects/hibernate/browse/HHH-1
>
> /max
>
>> I thought this question was a little too technical for the general list, so I
>> thought I'd ask it here. Is there a technical reason why Hibernate does not
>> or cannot support the batching of multiple (different) object insertions
>> within the same session?
>>
>> I have code that is semantically equivalent to:
>>
>>     Session session = sessionFactory.getCurrentSession();
>>     for (int i = 0; i < 1000; i++) {
>>         Foo foo = new Foo();
>>         session.save(foo);
>>
>>         Bar bar = new Bar();
>>         session.save(bar);
>>     }
>>     session.flush();
>>
>> If Foo or Bar is not in the equation, batching occurs as desired. However,
>> if both Foo and Bar are inserted during the session in alternating fashion
>> the result is that in the ActionQueue the 'insertion list' is interleaved
>> with Foo's and Bar's. This is natural enough it seems. However, when
>> ActionQueue.executeActions() is called the following occurs...
>>
>> 1) executeActions() iterates through the insertion list calling execute() on
>> each Executable in the list
>> 2) the Executable.execute() method calls insert() on the associated
>> persister, in this case a SingleTableEntityPersister. It seems to realize
>> that it CAN batch these inserts and so ...
>> 3) insert() calls the BatchingBatcher.prepareBatchStatement(sql) which is
>> actually a method on AbstractBatcher. This is where things go "awry"...
>>
>> The prepareBatchStatement(sql) does this thing where it compares the SQL
>> statement that was provided with the _last SQL that it executed_ and if they
>> match it re-uses the _last PreparedStatement_ which it also hung onto,
>> otherwise it closes the existing PreparedStatement and creates a new one --
>> effectively ending the current "batch".
>>
>> In this case, because the objects are interleaved in the insertion list the
>> result is that the prepareBatchStatement() is called with:
>>
>> insert into foo (a, b, c) values (?, ?, ?)
>> insert into bar (x, y, z) values (?, ?, ?)
>> insert into foo (a, b, c) values (?, ?, ?)
>> insert into bar (x, y, z) values (?, ?, ?)
>> insert into foo (a, b, c) values (?, ?, ?)
>> insert into bar (x, y, z) values (?, ?, ?)
>> ...
>>
>> The result is that each subsequent statement terminates the statement
>> prepared before it, preventing batching from occurring. Well, it logs a
>> debug message saying "Executing batch size: 1" for each statement.
>>
>> So my question is this ... is there any reason not to use a Map<sql,
>> PreparedStatement> construct in the BatchingBatcher such that the
>> appropriate PreparedStatement can be located and re-used based on the
>> incoming SQL rather than only retaining the _last_ SQL statement which is
>> guaranteed to "flap" in the case of alternating (or multiple) objects?
>>
>> If there is no technical reason why this shouldn't be done or wouldn't work,
>> I will be happy to make the change and submit a patch. [Can you tell how
>> bad I need batching in this scenario?]
>>
>> Thanks,
>> Brett Wooldridge
>> The ZipTie Project
>> www.ziptie.org
Multiple-object batched inserts
by Brett Wooldridge
I thought this question was a little too technical for the general list, so I
thought I'd ask it here. Is there a technical reason why Hibernate does not
or cannot support the batching of multiple (different) object insertions
within the same session?

I have code that is semantically equivalent to:

    Session session = sessionFactory.getCurrentSession();
    for (int i = 0; i < 1000; i++) {
        Foo foo = new Foo();
        session.save(foo);

        Bar bar = new Bar();
        session.save(bar);
    }
    session.flush();

If Foo or Bar is not in the equation, batching occurs as desired. However,
if both Foo and Bar are inserted during the session in alternating fashion
the result is that in the ActionQueue the 'insertion list' is interleaved
with Foo's and Bar's. This is natural enough it seems. However, when
ActionQueue.executeActions() is called the following occurs...

1) executeActions() iterates through the insertion list calling execute() on
each Executable in the list
2) the Executable.execute() method calls insert() on the associated
persister, in this case a SingleTableEntityPersister. It seems to realize
that it CAN batch these inserts and so ...
3) insert() calls the BatchingBatcher.prepareBatchStatement(sql) which is
actually a method on AbstractBatcher. This is where things go "awry"...

The prepareBatchStatement(sql) does this thing where it compares the SQL
statement that was provided with the _last SQL that it executed_ and if they
match it re-uses the _last PreparedStatement_ which it also hung onto,
otherwise it closes the existing PreparedStatement and creates a new one --
effectively ending the current "batch".

In this case, because the objects are interleaved in the insertion list the
result is that the prepareBatchStatement() is called with:

insert into foo (a, b, c) values (?, ?, ?)
insert into bar (x, y, z) values (?, ?, ?)
insert into foo (a, b, c) values (?, ?, ?)
insert into bar (x, y, z) values (?, ?, ?)
insert into foo (a, b, c) values (?, ?, ?)
insert into bar (x, y, z) values (?, ?, ?)
...

The result is that each subsequent statement terminates the statement
prepared before it, preventing batching from occurring. Well, it logs a
debug message saying "Executing batch size: 1" for each statement.
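
To make the flapping concrete, here is a toy model of a batcher that only remembers the last SQL it saw. It is a sketch of the behaviour described above, not Hibernate's AbstractBatcher: the class LastSqlBatcherSketch and its methods are made up for illustration, and a row counter stands in for the real PreparedStatement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: mimics "compare with the last SQL, otherwise end the batch".
// It uses no Hibernate or JDBC classes.
class LastSqlBatcherSketch {
    private String lastSql;   // SQL of the statement currently being batched
    private int pending;      // rows queued on that statement
    private final List<Integer> executedBatchSizes = new ArrayList<>();

    // Called once per insert, in insertion-list order.
    void addToBatch(String sql) {
        if (!sql.equals(lastSql)) {
            flush();          // different SQL: the current batch ends here
            lastSql = sql;
        }
        pending++;
    }

    // "Executes" whatever is queued and records how many rows were in the batch.
    void flush() {
        if (pending > 0) {
            executedBatchSizes.add(pending);
            pending = 0;
        }
    }

    // Feeds the SQL strings through the batcher and returns the batch sizes.
    static List<Integer> run(List<String> sqlInOrder) {
        LastSqlBatcherSketch batcher = new LastSqlBatcherSketch();
        for (String sql : sqlInOrder) {
            batcher.addToBatch(sql);
        }
        batcher.flush();
        return batcher.executedBatchSizes;
    }

    public static void main(String[] args) {
        String foo = "insert into foo (a, b, c) values (?, ?, ?)";
        String bar = "insert into bar (x, y, z) values (?, ?, ?)";
        System.out.println(run(Arrays.asList(foo, bar, foo, bar, foo, bar))); // [1, 1, 1, 1, 1, 1]
        System.out.println(run(Arrays.asList(foo, foo, foo, bar, bar, bar))); // [3, 3]
    }
}
```

The interleaved input yields six batches of one row each, while the same six inserts grouped by type yield two batches of three.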

So my question is this ... is there any reason not to use a Map<sql,
PreparedStatement> construct in the BatchingBatcher such that the
appropriate PreparedStatement can be located and re-used based on the
incoming SQL rather than only retaining the _last_ SQL statement which is
guaranteed to "flap" in the case of alternating (or multiple) objects?
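
Roughly what I have in mind, as a sketch only: the class SqlKeyedBatcherSketch and its methods are hypothetical, and a per-SQL row counter stands in for the PreparedStatement a real batcher would keep in the map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Map idea; not existing Hibernate code.
class SqlKeyedBatcherSketch {
    // One pending-row counter per distinct SQL string. A real version would
    // map each SQL string to its open PreparedStatement instead.
    private final Map<String, Integer> pendingBySql = new LinkedHashMap<>();

    // Queues one row on the batch that belongs to this SQL string.
    void addToBatch(String sql) {
        pendingBySql.merge(sql, 1, Integer::sum);
    }

    // "Executes" every open batch and returns the batch sizes, in the order
    // in which each SQL string was first seen.
    List<Integer> flushAll() {
        List<Integer> sizes = new ArrayList<>(pendingBySql.values());
        pendingBySql.clear();
        return sizes;
    }

    public static void main(String[] args) {
        String foo = "insert into foo (a, b, c) values (?, ?, ?)";
        String bar = "insert into bar (x, y, z) values (?, ?, ?)";
        SqlKeyedBatcherSketch batcher = new SqlKeyedBatcherSketch();
        for (String sql : Arrays.asList(foo, bar, foo, bar, foo, bar)) {
            batcher.addToBatch(sql);
        }
        System.out.println(batcher.flushAll()); // [3, 3]
    }
}
```

Note that grouping purely by SQL string means rows are executed in a different order than they appear in the insertion list, so this only works where that order does not matter.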

If there is no technical reason why this shouldn't be done or wouldn't work,
I will be happy to make the change and submit a patch. [Can you tell how
bad I need batching in this scenario?]

Thanks,
Brett Wooldridge
The ZipTie Project
www.ziptie.org