On 26 July 2016 at 16:44, Steve Ebersole <steve(a)hibernate.org> wrote:
So what I ended up doing is:
Add new MultiIdentifierLoadAccess#enableReturnOfDeletedEntities option
(default == false)
Add new MultiIdentifierLoadAccess#enableOrderedReturn option (default ==
true)
Added Javadoc clarification to MultiIdentifierLoadAccess#multiLoad to see
the Javadocs for the above 2 options.
Thanks Steve, looks good!
So since you now have an option in which order might matter, I guess
you won't change the argument type from List to Collection?
I am not super stoked about the way the ordering happens, at the
moment.
For now I essentially:
Perform X number of batch loads
After all the batch loads, come back and build the result List based on the
incoming id positions.
I did it this way to also handle de-duplication which is a related, reported
bug.
Fair enough, it has to work clearly and correctly first. People will
have to come back on this if they have concrete data on how it
performs, that's usually a better approach than trying to predict or
make it unnecessarily complex.
-- Sanne
On Mon, Jul 25, 2016 at 3:01 PM Sanne Grinovero <sanne(a)hibernate.org> wrote:
>
> On 25 July 2016 at 19:55, Steve Ebersole <steve(a)hibernate.org> wrote:
> > See inline...
> >
> >
> > On Mon, Jul 25, 2016 at 1:06 PM Sanne Grinovero <sanne(a)hibernate.org>
> > wrote:
> >>
> >> some comments inline:
> >>
> >> On 25 July 2016 at 18:27, Steve Ebersole <steve(a)hibernate.org> wrote:
> >> > I wanted to start a consolidated discussion about multi-load support.
> >> > This
> >> > relates to a few Jiras, questioning a few different aspects of its
> >> > current
> >> > behavior:
> >> >
> >> >
https://hibernate.atlassian.net/browse/HHH-10984
> >> >
https://hibernate.atlassian.net/browse/HHH-10617
> >> >
> >> > Basically this comes down to the following questions/requests...
> >> >
> >> >
> >> > ## Handling locally deleted entities
> >> >
> >> > First (from HHH-10984) is the idea that locally deleted entities are
> >> > currently returned from multi-load calls. The background here is
> >> > that
> >> > multi-load support was designed based on
> >> > IdentifierLoadAccess#getReference
> >> > (Session#load). So its outcome follows what is done for
> >> > #getReference
> >> > in
> >> > terms of behavior and results. Now one of the behaviors of
> >> > #getReference
> >> > that differs from IdentifierLoadAccess#load (Session#get) is how
> >> > locally
> >> > deleted entities are treated: #getReference will return them, whereas
> >> > #load
> >> > will not.
> >> >
> >> > So as I see it we have 3 options:
> >> >
> >> > 1. Continue to expose just the one form on
> >> > MultiIdentifierLoadAccess
> >> > and
> >> > either have this filter out the locally deleted objects, or add a
> >> > new
> >> > option to specify whether locally deleted objects ought to be
> >> > returned in
> >> > the results.
> >> > 2. Expose 2 distinct forms on MultiIdentifierLoadAccess:
> >> > 1. Plural form of #getReference (current behavior, more or
> >> > less)
> >> > 2. Plural form of #load
> >> > 3. Expose 3 distinct forms on MultiIdentifierLoadAccess:
> >> > 1. Plural form of #getReference
> >> > 2. Plural form of #load
> >> > 3. "Hybrid" form
> >>
> >> Regarding locally deleted entities, I understand the underlying logic
> >> but as a user I think I'd be surprised for it to return them.
> >> Is there a good use case to return these?
> >
> >
> > I think its just more efficient in many cases. Meaning, in some cases
> > filtering out the deleted ones would mean a performance overhead mainly
> > because we'd have to ascertain whether they are deleted or not. Which
> > means
> > going to check the related EntityEntry, which has an overhead going the
> > reference to via creation of an EntityKey and then the
> > "EntityEntryContext
> > lookup". So I think in the case of #getReference (allow proxy creation)
> > it
> > might be more of a significant overhead .
> >
> > If we go with the proposed "2 distinct forms" approach, the current
> > behavior
> > would align with "plural form of #getReference (current behavior, more
> > or
> > less)". We'd just offer another solution that aligns with "plural
form
> > of
> > #load". However, see the caveat wrt nulls as discussed below...
> >
> >
> >> > Much of the discussion on HHH-10984 revolves around the (poorly
> >> > based,
> >> > imo)
> >> > assumption that because a List/array of ids is passed in and because
> >> > a
> >> > List
> >> > is returned that we ought to return the results in the order
> >> > indicated
> >> > by
> >> > the incoming List/array of ids.
> >>
> >> As discussed on chat, in lack of any explicit documentation I also
> >> would have expected the order to be preserved, but stating otherwise
> >> in javadoc seems good enough to me.
> >>
> >> +1 to change the parameter type from List to Collection as you
> >> suggested; the return should also be changed from List so to not
> >> suggest any ordering guarantee but since that's breaking we'll do
it
> >> later.
> >>
> >> > While I do not agree with the assumption there, I can see that that
> >> > behavior might sometimes be beneficial. Is this something we want to
> >> > support? There is certainly a performance overhead, and so I think
> >> > we
> >> > definitely do not want to support it by default. But do we want to
> >> > expose
> >> > an option to allow users to request this?
> >>
> >> Yes I think there would be performance benefits to allow the consumer
> >> to push the sorting guarantee to ORM:
> >> in worst case, ORM will do it in the same inefficient way that the
> >> consumer would need to, but there might be cases in which ORM can do
> >> something smarter.
> >>
> >> In Hibernate Search we have two different situations in which we'd
> >> load by "byMultipleIds";
> >> in one case we'll need strict sorting, in the other we actually
don't
> >> care at all for the order.. so yes we'd benefit from an option.
> >>
> >> Considering that the current API takes a List as parameter, and
> >> returns a List, and that we seem to agree that there's value in
> >> maintaining ordering.. maybe we should keep this API as is and make it
> >> always ordered?
> >> One could add the variation based on Collections as the alternative
> >> API to use when ordering is not needed, by adding it as a new API we'd
> >> not be breaking anything and still providing both options.
> >
> >
> > I'd instead opt then to keep the current API taking and returning Lists,
> > but
> > with a sorting option (#enabledOrderedResults) that controls whether we
> > sort
> > or not.
> >
> > Keep in mind too that this Jira also then asserts (and I'd have to agree
> > with, if we accept ordering) that nulls should be pushed into the
> > resulting
> > List at the correct spot. So the List consumer would have to handle for
> > null elements, which would be a change in behavior as well. Sanne -
> > could
> > Search, as the List consumer, deal with null elements?
>
> Yes, not a problem.
>
> >
> > That is why I would allow to control this via the switch. I could agree
> > either way about the default here.