[hibernate-dev] multiLoad support

Sanne Grinovero sanne at hibernate.org
Tue Jul 26 11:53:45 EDT 2016


On 26 July 2016 at 16:44, Steve Ebersole <steve at hibernate.org> wrote:
> So what I ended up doing is:
>
> Add new MultiIdentifierLoadAccess#enableReturnOfDeletedEntities option
> (default == false)
> Add new MultiIdentifierLoadAccess#enableOrderedReturn option (default ==
> true)
> Added Javadoc clarification to MultiIdentifierLoadAccess#multiLoad to see
> the Javadocs for the above 2 options.

Thanks Steve, looks good!
So since you now have an option in which order might matter, I guess
you won't change the argument type from List to Collection?

> I am not super stoked about the way the ordering happens, at the moment.
> For now I essentially:
>
> Perform X number of batch loads
> After all the batch loads, come back and build the result List based on the
> incoming id positions.
>
> I did it this way to also handle de-duplication which is a related, reported
> bug.

Fair enough, it has to work clearly and correctly first. People will
have to come back on this if they have concrete data on how it
performs, that's usually a better approach than trying to predict or
make it unnecessarily complex.

-- Sanne

>
> On Mon, Jul 25, 2016 at 3:01 PM Sanne Grinovero <sanne at hibernate.org> wrote:
>>
>> On 25 July 2016 at 19:55, Steve Ebersole <steve at hibernate.org> wrote:
>> > See inline...
>> >
>> >
>> > On Mon, Jul 25, 2016 at 1:06 PM Sanne Grinovero <sanne at hibernate.org>
>> > wrote:
>> >>
>> >> some comments inline:
>> >>
>> >> On 25 July 2016 at 18:27, Steve Ebersole <steve at hibernate.org> wrote:
>> >> > I wanted to start a consolidated discussion about multi-load support.
>> >> > This
>> >> > relates to a few Jiras, questioning a few different aspects of its
>> >> > current
>> >> > behavior:
>> >> >
>> >> > https://hibernate.atlassian.net/browse/HHH-10984
>> >> > https://hibernate.atlassian.net/browse/HHH-10617
>> >> >
>> >> > Basically this comes down to the following questions/requests...
>> >> >
>> >> >
>> >> > ## Handling locally deleted entities
>> >> >
>> >> > First (from HHH-10984) is the idea that locally deleted entities are
>> >> > currently returned from multi-load calls.  The background here is
>> >> > that
>> >> > multi-load support was designed based on
>> >> > IdentifierLoadAccess#getReference
>> >> > (Session#load).  So its outcome follows what is done for
>> >> > #getReference
>> >> > in
>> >> > terms of behavior and results.  Now one of the behaviors of
>> >> > #getReference
>> >> > that differs from IdentifierLoadAccess#load (Session#get) is how
>> >> > locally
>> >> > deleted entities are treated: #getReference will return them, whereas
>> >> > #load
>> >> > will not.
>> >> >
>> >> > So as I see it we have 3 options:
>> >> >
>> >> >    1. Continue to expose just the one form on
>> >> > MultiIdentifierLoadAccess
>> >> > and
>> >> >    either have this filter out the locally deleted objects, or add a
>> >> > new
>> >> >    option to specify whether locally deleted objects ought to be
>> >> > returned in
>> >> >    the results.
>> >> >    2. Expose 2 distinct forms on MultiIdentifierLoadAccess:
>> >> >       1. Plural form of #getReference (current behavior, more or
>> >> > less)
>> >> >       2. Plural form of #load
>> >> >    3. Expose 3 distinct forms on MultiIdentifierLoadAccess:
>> >> >       1. Plural form of #getReference
>> >> >       2. Plural form of #load
>> >> >       3. "Hybrid" form
>> >>
>> >> Regarding locally deleted entities, I understand the underlying logic
>> >> but as a user I think I'd be surprised for it to return them.
>> >> Is there a good use case to return these?
>> >
>> >
>> > I think its just more efficient in many cases.  Meaning, in some cases
>> > filtering out the deleted ones would mean a performance overhead mainly
>> > because we'd have to ascertain whether they are deleted or not.  Which
>> > means
>> > going to check the related EntityEntry, which has an overhead going the
>> > reference to via creation of an EntityKey and then the
>> > "EntityEntryContext
>> > lookup".  So I think in the case of #getReference (allow proxy creation)
>> > it
>> > might be more of a significant overhead .
>> >
>> > If we go with the proposed "2 distinct forms" approach, the current
>> > behavior
>> > would align with "plural form of #getReference (current behavior, more
>> > or
>> > less)".  We'd just offer another solution that aligns with "plural form
>> > of
>> > #load".  However, see the caveat wrt nulls as discussed below...
>> >
>> >
>> >> > Much of the discussion on HHH-10984 revolves around the (poorly
>> >> > based,
>> >> > imo)
>> >> > assumption that because a List/array of ids is passed in and because
>> >> > a
>> >> > List
>> >> > is returned that we ought to return the results in the order
>> >> > indicated
>> >> > by
>> >> > the incoming List/array of ids.
>> >>
>> >> As discussed on chat, in lack of any explicit documentation I also
>> >> would have expected the order to be preserved, but stating otherwise
>> >> in javadoc seems good enough to me.
>> >>
>> >> +1 to change the parameter type from List to Collection as you
>> >> suggested; the return should also be changed from List so to not
>> >> suggest any ordering guarantee but since that's breaking we'll do it
>> >> later.
>> >>
>> >> > While I do not agree with the assumption there, I can see that that
>> >> > behavior might sometimes be beneficial.  Is this something we want to
>> >> > support?  There is certainly a performance overhead, and so I think
>> >> > we
>> >> > definitely do not want to support it by default.  But do we want to
>> >> > expose
>> >> > an option to allow users to request this?
>> >>
>> >> Yes I think there would be performance benefits to allow the consumer
>> >> to push the sorting guarantee to ORM:
>> >> in worst case, ORM will do it in the same inefficient way that the
>> >> consumer would need to, but there might be cases in which ORM can do
>> >> something smarter.
>> >>
>> >> In Hibernate Search we have two different situations in which we'd
>> >> load by "byMultipleIds";
>> >> in one case we'll need strict sorting, in the other we actually don't
>> >> care at all for the order.. so yes we'd benefit from an option.
>> >>
>> >> Considering that the current API takes a List as parameter, and
>> >> returns a List, and that we seem to agree that there's value in
>> >> maintaining ordering.. maybe we should keep this API as is and make it
>> >> always ordered?
>> >> One could add the variation based on Collections as the alternative
>> >> API to use when ordering is not needed, by adding it as a new API we'd
>> >> not be breaking anything and still providing both options.
>> >
>> >
>> > I'd instead opt then to keep the current API taking and returning Lists,
>> > but
>> > with a sorting option (#enabledOrderedResults) that controls whether we
>> > sort
>> > or not.
>> >
>> > Keep in mind too that this Jira also then asserts (and I'd have to agree
>> > with, if we accept ordering) that nulls should be pushed into the
>> > resulting
>> > List at the correct spot.  So the List consumer would have to handle for
>> > null elements, which would be a change in behavior as well.  Sanne -
>> > could
>> > Search, as the List consumer, deal with null elements?
>>
>> Yes, not a problem.
>>
>> >
>> > That is why I would allow to control this via the switch.  I could agree
>> > either way about the default here.


More information about the hibernate-dev mailing list