[hibernate-dev] [HSEARCH] Proposal to change the default value of Field#norms()

Emmanuel Bernard emmanuel at hibernate.org
Wed May 9 06:26:32 EDT 2012


On 9 May 2012, at 00:43, Andrej Golovnin wrote:

> Hi Hardy,
> 
>> It depends. Maybe you want to boost any of these fields at index time.
> 
> OK, what about the document id? It is indexed with NOT_ANALYZED_NO_NORMS.
> As far as I understand (please correct me if I'm wrong) to boost fields at index time
> norms must be enabled for those fields. And because Hibernate Search uses for document ids
> NOT_ANALYZED_NO_NORMS, adding @Boost to a document id would have no effect.
> Although I'm not sure anyone really needs this, it should at least be documented.

@DocumentId is special because we need to make sure we can look it up by its exact value, so Hibernate Search takes over the responsibility for setting the right options.
Plus it's unlikely that you need to "full-text" search an id field. If you need to, you can use an extra @Field on it.

Can you open a JIRA issue to clarify this in http://docs.jboss.org/hibernate/search/4.1/reference/en-US/html_single/#basic-mapping ?
Even better, can you propose the change as a GitHub pull request? :)

> 
> It is also not clear from the Hibernate Search documentation what will happen
> if I have the following annotation on a field:
> 
> @Field(boost=@Boost(1.2f), norms = Norms.NO)
> private String description;
> 
> Would I see a warning or maybe an error message from Hibernate Search?
> Does such a combination make sense?

I don't think we raise an exception, but we should indeed do something. Do you want to open the JIRA issue and possibly provide a patch via a GitHub pull request?
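
To make the problem concrete, here is a minimal sketch of the kind of check that could flag this combination. It is not what Hibernate Search does today. The Field, Boost and Norms types below are simplified stand-ins declared locally so the snippet compiles without the library, and Product and BoostNormsCheck are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the Hibernate Search annotations (assumption:
// declared here only so the sketch compiles without the library).
enum Norms { YES, NO }

@Retention(RetentionPolicy.RUNTIME)
@interface Boost { float value(); }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Field {
    Boost boost() default @Boost(1.0f);
    Norms norms() default Norms.YES;
}

// Hypothetical entity using the combination from the question.
class Product {
    @Field(boost = @Boost(1.2f), norms = Norms.NO)
    private String description;

    @Field(boost = @Boost(1.2f))
    private String title;
}

class BoostNormsCheck {
    // Returns "Class.field" for every field that asks for an index-time
    // boost while also disabling norms.
    static List<String> findIgnoredBoosts(Class<?> type) {
        List<String> problems = new ArrayList<>();
        for (java.lang.reflect.Field f : type.getDeclaredFields()) {
            Field mapping = f.getAnnotation(Field.class);
            if (mapping == null) {
                continue; // field is not indexed
            }
            boolean boosted = mapping.boost().value() != 1.0f;
            if (boosted && mapping.norms() == Norms.NO) {
                problems.add(type.getSimpleName() + "." + f.getName());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Prints [Product.description]
        System.out.println(findIgnoredBoosts(Product.class));
    }
}
```

Whether such a finding should become a warning or an exception is exactly the open question for the JIRA issue.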

> 
> From http://www.lucidimagination.com/content/scaling-lucene-and-solr#d0e71
> norms may not be useful for short or non full text fields. I personally consider
> Enums, boolean/int/short, date, OIDs as short and non full text fields.
> And in my application those fields are majority. We have 130 indexes.
> Some of these indexes are shared by multiple entities. So I would guess
> that we have ca. 140-150 indexed entities + a lot of IndexedEmbedded.
> Of course I could do search&replace. But at some point a developer would
> forget to add "norms = Norms.NO" to a field and our application would start
> consuming more memory. And this is my dilemma. Maybe I can add an
> automated test to our build process which would force developers to add
> "norms = Norms.NO". An alternative solution would be to have a custom
> Hibernate Search version, but I don't like this idea. I will discuss with my team
> what is the best way to solve this small problem.

I think stereotypes would be best. For numbers, I encourage you to use @NumericField. I don't believe these are normed and they are for sure more optimized.
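
In the meantime, the automated build check you mention could look roughly like the sketch below. It is only an illustration under assumptions: Field and Norms are local stand-ins for the Hibernate Search types so the snippet compiles on its own, Ticket, Status and NormsPolicyCheck are hypothetical names, and the list of "short" types simply follows the one you gave (enums, booleans, numbers, dates).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;

// Local stand-ins for the Hibernate Search types (assumption).
enum Norms { YES, NO }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Field { Norms norms() default Norms.YES; }

enum Status { OPEN, CLOSED }

// Hypothetical entity mixing short values and one full-text field.
class Ticket {
    @Field(norms = Norms.NO) private Status status;
    @Field private boolean urgent;   // short value, norms not disabled
    @Field private Date created;     // short value, norms not disabled
    @Field private String summary;   // full text, norms may be useful
}

class NormsPolicyCheck {
    // Types treated as short, non full-text values by this sketch.
    static boolean isShortValue(Class<?> t) {
        return t.isEnum() || t.isPrimitive() || t == Boolean.class
                || Number.class.isAssignableFrom(t)
                || Date.class.isAssignableFrom(t);
    }

    // Returns a sorted list of "Class.field" entries for every indexed
    // short-valued field that does not declare norms = Norms.NO.
    static List<String> findViolations(Class<?>... entities) {
        List<String> violations = new ArrayList<>();
        for (Class<?> entity : entities) {
            for (java.lang.reflect.Field f : entity.getDeclaredFields()) {
                Field mapping = f.getAnnotation(Field.class);
                if (mapping != null && isShortValue(f.getType())
                        && mapping.norms() != Norms.NO) {
                    violations.add(entity.getSimpleName() + "." + f.getName());
                }
            }
        }
        Collections.sort(violations);
        return violations;
    }

    public static void main(String[] args) {
        System.out.println(findViolations(Ticket.class));
    }
}
```

A unit test in the build could fail whenever findViolations returns a non-empty list for the application's entity classes.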

> Btw, is it possible that OIDs added to the index by IndexedEmbedded have
> norms? Luke shows me this in two different indexes. Could someone
> else please verify it? If it is the case, it would be nice if Hibernate Search
> would have the same behavior for OIDs added to index either
> by DocumentId or by IndexedEmbedded, e.g. index them with
> NOT_ANALYZED_NO_NORMS.

I am not following you. What's an OID?

> 
> I understand that for you as a framework provider it is difficult to find a golden path
> which would satisfy every framework user. I think Sanne's suggestion
> with "@Stereotype - like user annotations" may help us to achieve this goal.
> 
> Btw Hibernate Search and Hibernate helped us to create a really cool
> product, so kudos to all who have contributed to these great projects!
> 
> Best regards,
> Andrej Golovnin
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/hibernate-dev



