[hibernate-dev] Search: Dynamic Document boosting

Sun May 3 06:41:26 EDT 2009

Yes, fix them for 3.2 and note the incompatibility on the wiki page.

On  Apr 30, 2009, at 19:36, Sanne Grinovero wrote:

> Thank you very much for the suggestions; maybe what I was trying is wrong.
> I'll try to use FunctionQuery; remember that I had the patch in Search
> more or less working, in case you need it one day.
>
> BTW I've just realized the Discriminator interface has a nasty
> typo in a public API method name:
>
> "getAnanyzerDefinitionName" instead of "getAnalyzerDefinitionName",
> and the typo is repeated throughout the documentation.
> Should I fix it for 3.2?
>
> 2009/4/30 Emmanuel Bernard <emmanuel at hibernate.org>:
>>
>> On  Apr 30, 2009, at 14:11, Sanne Grinovero wrote:
>>
>>> 2009/4/30 Emmanuel Bernard <emmanuel at hibernate.org>:
>>>>
>>>> On  Apr 30, 2009, at 13:01, Sanne Grinovero wrote:
>>>>
>>>>> Basically I need a function to convert a user-proposed term to a
>>>>> series of proposals of "similar" terms, but giving a higher rank to
>>>>> the terms I'd prefer the user to choose, as they are the correct
>>>>> names used in my domain.
>>>>> You can think of it as a spellchecker/dictionary (using synonyms
>>>>> too), but giving priority to a selected form of each term, known as
>>>>> the "root", or the standard form in the domain.
>>>>
>>>> Have you considered indexing the same property twice:
>>>>  - once unaltered and giving it a high boost
>>>>  - once with synonyms and giving it a low boost
>>>>
>>>> That's how I would solve your use case personally.
>>>>
>>>
>>> That would be clever if I had only two levels of boost, but that's
>>> not the case.
>>> Also, most properties are already being indexed in 5+n different
>>> fields, n being the number of supported languages for snowball
>>> (currently 15, so the next step will be to try the programmatic
>>> configuration to remove all these annotations).
>>> So adding more fields will dramatically increase their number
>>> (number of boost levels * (5 + number of enabled stemmers))
>>
>> Number of boost levels? The level of boost is defined at query time, so
>> I guess there would be the clean data and the synonym data, i.e.
>> 2 * (5 + languages), right?
>>
>>
>>>
>>> and make the index size unnecessarily large, and I'm not solving the
>>> time-fading requirement.
>>
>> Well yes, but dynamic boosting as you define it does not solve the
>> time-fading requirement either, right?
>> FunctionQuery does (or can).
>>
>>>
>>> My BI suite returns a nice float, which I'm storing in the
>>> entity itself as a property;
>>> it would make my life a lot easier if I could just use this float
>>> value as the document boost.
>>
>> I'd rather see something more generic, like the @AnalyzerDiscriminator
>> approach we've used, if you really want to set that at indexing time.
>>
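
To make the field-count arithmetic from the thread concrete, here is a minimal sketch in plain Java. It does not use Hibernate Search at all: the class and method names are hypothetical, the constant 5 and the 15 snowball languages are the figures quoted above, and the count of 4 boost levels is purely an illustrative assumption.

```java
// Hypothetical helper, not part of Hibernate Search: it only reproduces
// the field-count arithmetic quoted in the thread.
class FieldCountSketch {

    // Base number of fields per property mentioned in the thread ("5+n").
    static final int BASE_FIELDS = 5;

    // Computes variants * (5 + number of enabled stemmers), where "variants"
    // is either the number of boost levels or 2 (clean data + synonym data).
    static int fieldsPerProperty(int variants, int stemmers) {
        if (variants < 1 || stemmers < 0) {
            throw new IllegalArgumentException(
                "variants must be >= 1 and stemmers must be >= 0");
        }
        return variants * (BASE_FIELDS + stemmers);
    }

    public static void main(String[] args) {
        int stemmers = 15; // snowball languages currently supported, per the thread

        // Current situation: one copy of each field.
        System.out.println("current:        " + fieldsPerProperty(1, stemmers));
        // Clean data + synonym data, boost chosen at query time.
        System.out.println("clean+synonyms: " + fieldsPerProperty(2, stemmers));
        // One copy per boost level (4 levels is an assumed example value).
        System.out.println("4 boost levels: " + fieldsPerProperty(4, stemmers));
    }
}
```

With 15 stemmers, the two-variant layout needs 40 fields per property, while one copy per boost level grows linearly with the number of levels.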