[hibernate-dev] [Hibernate Search] Feedback on Document Field lazy loading

Emmanuel Bernard emmanuel at hibernate.org
Sun Jun 15 12:22:22 EDT 2008


On  Jun 15, 2008, at 06:01, Hardy Ferentschik wrote:

> Hi,
>
> On Sun, 15 Jun 2008 10:24:55 +0200, Sanne Grinovero
> <sanne.grinovero at gmail.com> wrote:
>
>> what is your goal? performance?

Here is the full story. While writing Hibernate Search in Action I  
explained how a database lookup by id is not fundamentally different  
than Lucene loading a document. Other factors such as cache and I/O  
speed make the difference.
But reality is slightly different. To load an object we need to load
the document, which today means loading all of its fields. This is
fine for people fully embracing the HSearch idea, as the only two
stored fields will be class and id. But for people storing data in the
Document (for projection, for highlighting, etc.) this can make a
memory / CPU difference, as we need to load all these stored fields
simply to extract class and id.

Note that even on lazy fields, Lucene is not fantastically efficient  
at "not loading" non-compressed String fields: it basically reads each  
byte.
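
To make the cost concrete, here is a minimal sketch of selective field
loading. It is a plain-Java model and not the Lucene API: the class
SelectiveLoadSketch, the load() method and the field names used below
are assumptions made for illustration only. It shows the intent of the
optimization, which is to materialize class and id and skip every other
stored field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model, not the Lucene API: a stored document is an ordered
// list of {name, value} pairs, and only the wanted names are materialized.
class SelectiveLoadSketch {

    // Walks the stored fields in order and keeps only those named in `wanted`.
    static Map<String, String> load(List<String[]> storedFields, Set<String> wanted) {
        Map<String, String> loaded = new LinkedHashMap<>();
        for (String[] field : storedFields) {
            if (wanted.contains(field[0])) {
                loaded.put(field[0], field[1]);
            }
            // Any other field is skipped: this is the work we want to avoid.
        }
        return loaded;
    }

    public static void main(String[] args) {
        // Illustrative field names and values (assumptions, not taken from the thread).
        List<String[]> stored = new ArrayList<>();
        stored.add(new String[] {"_hibernate_class", "com.acme.Book"});
        stored.add(new String[] {"id", "42"});
        stored.add(new String[] {"summary", "a long stored text used for projection"});

        Set<String> wanted = new HashSet<>(Arrays.asList("_hibernate_class", "id"));
        System.out.println(load(stored, wanted).keySet());
    }
}
```

In this model the skip is free. The point of the note above is that in
Lucene the skip itself still has a cost for non-compressed String
fields.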

>>
>
> Yeah, what's the actual goal? If the goal is performance, have you
> compared the performance of the original solution with this 'simple'
> patch? I think that to justify complicating a nice and simple API
> the improvement must be substantial.

The Query API does not change in my proposal. TwoWayFieldBridge does,
but we could imagine not enhancing the API and applying the
optimization only when nothing but TwoWayStringBridge implementations
are used.
But I agree some tests are needed.

>
>
> And so that I just understand the problem properly. There are two  
> cases to consider - projected vs. non projected queries.
>
> When using projections the proposed solution works fine. The fields  
> we want to return are explicitly specified and we return 'only'  
> EntityInfos anyway. So no problem there.

No, even projection has issues. If I project a field using a custom
TwoWayFieldBridge, I am exposed to the problem, as a pure
TwoWayFieldBridge can do whatever the hell it wants to the Document.

>
>
> The problem comes in when we try to use document field lazy loading
> in the case where we want to return managed objects, in particular
> when someone uses a TwoWayFieldBridge. Potentially this bridge can
> add arbitrary field names which are then not properly loaded at
> query time. Is this correct?

Note that we could use LazyFields as opposed to no-load fields, but
instead of reading sequentially we would read randomly in the
worst-case scenario. I am not sure I like it.

>
>
> I am not sure if I like lazy field loading in the second case. The
> proposed metadata to FieldBridge (FieldBridge.fieldNameStrategy()
> EXACT, IN_NAMESPACE, NON_SAFE) seems very artificial. If we extend
> the FieldBridge interface, why not just add a new method
> getFieldNames() which returns an array of String listing all field
> names this bridge is using?

I thought about that, but you don't always know :) Think of Maps
mapped with the key as the field name.
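
A rough sketch of how the three strategy values could drive the load
decision is below. Only the names EXACT, IN_NAMESPACE and NON_SAFE come
from the proposal; the class FieldNameStrategySketch, the mustLoad()
method, the "." separator and the exact semantics of each value are
assumptions made for illustration. A Map bridge cannot list its field
names up front, but it could promise to stay inside its own namespace.

```java
// Hypothetical sketch: the semantics of each strategy are assumed, not specified.
class FieldNameStrategySketch {

    enum FieldNameStrategy { EXACT, IN_NAMESPACE, NON_SAFE }

    // Decides whether a stored field must be loaded for a bridge that was
    // mapped on `bridgeFieldName` and declared the given strategy.
    static boolean mustLoad(FieldNameStrategy strategy,
                            String bridgeFieldName,
                            String storedFieldName) {
        switch (strategy) {
            case EXACT:
                // The bridge only ever writes the field it was mapped on.
                return storedFieldName.equals(bridgeFieldName);
            case IN_NAMESPACE:
                // The bridge may write sub-fields, e.g. one per Map key.
                return storedFieldName.equals(bridgeFieldName)
                        || storedFieldName.startsWith(bridgeFieldName + ".");
            default:
                // NON_SAFE: the bridge may write anything, so load everything.
                return true;
        }
    }

    public static void main(String[] args) {
        // A Map mapped on "attributes" with the key "color" as the field name suffix.
        System.out.println(mustLoad(FieldNameStrategy.IN_NAMESPACE, "attributes", "attributes.color"));
        System.out.println(mustLoad(FieldNameStrategy.EXACT, "attributes", "attributes.color"));
    }
}
```

With NON_SAFE we are back to loading every field, so the optimization
would only pay off when all bridges involved declare one of the other
two values.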

>
>
> --Hardy
>
>