[hibernate-issues] [Hibernate-JIRA] Commented: (HSEARCH-133) Allow overriding DefaultSimilarity for indexing and searching
John Griffin (JIRA)
noreply at atlassian.com
Sat Nov 17 13:47:21 EST 2007
[ http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_28871 ]
John Griffin commented on HSEARCH-133:
--------------------------------------
The method of Lucene scoring in the DefaultSimilarity class is:
score(q,d) = coord(q,d) * queryNorm(q) * SUM[t in q] ( tf(t in d) * idf(t)^2 * t.getBoost() * norm(t,d) )
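To make the factors concrete, here is a minimal stand-alone sketch of that product in plain Java. It does not use any Lucene classes. The class ScoreSketch and the holder TermStats are hypothetical names, the method bodies mirror what DefaultSimilarity is understood to compute by default, and the query is assumed to be a flat list of terms with no query-level boost.

```java
// Stand-alone sketch of the scoring formula; no Lucene classes are used.
class ScoreSketch {
    // Assumed defaults, mirroring what DefaultSimilarity is understood to compute.
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    /** Hypothetical holder for one query term as seen from one document. */
    static final class TermStats {
        final float freq;   // occurrences of the term in the document
        final int docFreq;  // number of documents containing the term
        final float boost;  // t.getBoost()
        final float norm;   // norm(t,d), written at indexing time

        TermStats(float freq, int docFreq, float boost, float norm) {
            this.freq = freq;
            this.docFreq = docFreq;
            this.boost = boost;
            this.norm = norm;
        }
    }

    /** coord * queryNorm * sum over the query terms, as in the formula above. */
    static float score(TermStats[] terms, int numDocs) {
        if (terms.length == 0) {
            return 0f;
        }
        float sum = 0f;
        float sumOfSquaredWeights = 0f;
        int overlap = 0;
        for (TermStats t : terms) {
            float termIdf = idf(t.docFreq, numDocs);
            sumOfSquaredWeights += (termIdf * t.boost) * (termIdf * t.boost);
            if (t.freq > 0f) {
                overlap++; // this query term matched the document
                sum += tf(t.freq) * termIdf * termIdf * t.boost * t.norm;
            }
        }
        return coord(overlap, terms.length) * queryNorm(sumOfSquaredWeights) * sum;
    }
}
```

Because norm(t,d) is written at indexing time while the other factors are computed at search time, the same overrides have to be active in both places for the scores to be meaningful.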
Use cases:
Let's say that I don't care about what the term frequency count (tf) is in a document when it's scored. I
only want to know if a certain term appeared at all. I would override the tf(float freq) abstract method to
return 1.0.
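A minimal sketch of this first case follows. DefaultTf is a hypothetical stand-in for the default behaviour (assumed here to be the square root of the frequency) so that the example compiles without Lucene; in real code the subclass would extend DefaultSimilarity instead. The guard for a zero frequency is my own addition.

```java
class BinaryTfSketch {
    /** Hypothetical stand-in for the default tf (square root of the raw frequency). */
    static class DefaultTf {
        float tf(float freq) {
            return (float) Math.sqrt(freq);
        }
    }

    /** Presence-only variant: any positive frequency counts exactly once. */
    static class PresenceOnlyTf extends DefaultTf {
        @Override
        float tf(float freq) {
            // Assumption: a frequency of zero should still contribute nothing.
            return freq > 0f ? 1.0f : 0.0f;
        }
    }
}
```

With this override a document that mentions a term nine times scores the same, on the tf factor, as one that mentions it once.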
Let's say that I want to increase a document's score by a factor of two if it contains more than one of the
terms I am querying by. I would override the coord(int overlap, int maxOverlap) to return
(overlap * 2)/maxOverlap.
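Sketched in the same stand-alone style (DefaultCoord is a hypothetical stand-in for the default behaviour, assumed to be overlap/maxOverlap), the override might look like this. Note the cast: both parameters are ints, so the expression as written above would truncate under Java integer division.

```java
class CoordSketch {
    /** Hypothetical stand-in for the default coord: fraction of query terms matched. */
    static class DefaultCoord {
        float coord(int overlap, int maxOverlap) {
            return overlap / (float) maxOverlap;
        }
    }

    /** Doubles the coordination factor relative to the default. */
    static class DoubledCoord extends DefaultCoord {
        @Override
        float coord(int overlap, int maxOverlap) {
            // Cast before dividing; (overlap * 2) / maxOverlap on ints would truncate.
            return (overlap * 2) / (float) maxOverlap;
        }
    }
}
```

As written, this doubles the factor for every matching document. Restricting the boost to documents that match more than one term would need an explicit check on overlap.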
As a final example I want to develop a 'ThresholdSimilarity' to increase the score of a document if and only
if it contains more than a certain number of query terms regardless of the repository document count. Below
that 'threshold' the scoring calculation remains the same as the default. I would override the
idf(int docFreq, int numDocs) method to perform this calculation.
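One possible reading of this, sketched without Lucene: compare docFreq against an absolute threshold that ignores numDocs, and scale the default idf above it. Everything here is an assumption made for illustration, including the stand-in DefaultIdf (assumed to be log(numDocs/(docFreq+1)) + 1), the constructor parameters, and the choice of a multiplicative boost.

```java
class ThresholdIdfSketch {
    /** Hypothetical stand-in for the default idf. */
    static class DefaultIdf {
        float idf(int docFreq, int numDocs) {
            return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
        }
    }

    /** Scales the default idf once docFreq exceeds an absolute threshold. */
    static class ThresholdIdf extends DefaultIdf {
        private final int threshold;      // hypothetical: absolute count, not relative to numDocs
        private final float boostFactor;  // hypothetical: multiplier applied above the threshold

        ThresholdIdf(int threshold, float boostFactor) {
            this.threshold = threshold;
            this.boostFactor = boostFactor;
        }

        @Override
        float idf(int docFreq, int numDocs) {
            float base = super.idf(docFreq, numDocs);
            // At or below the threshold the calculation stays the default one.
            return docFreq > threshold ? base * boostFactor : base;
        }
    }
}
```

For example, new ThresholdIdf(5, 2.0f) would leave terms found in five or fewer documents untouched and double the idf of the rest.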
> Allow overriding DefaultSimilarity for indexing and searching
> -------------------------------------------------------------
>
> Key: HSEARCH-133
> URL: http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-133
> Project: Hibernate Search
> Issue Type: New Feature
> Components: engine, query
> Affects Versions: 3.0.1
> Reporter: John Griffin
> Assignee: John Griffin
>
> Ability to override DefaultSimilarity for indexing and searching should be implemented. Access is necessary in both places because changing only one and then querying has undefined results as far as scores are concerned.