Re: [hibernate-dev] [Search] DisjunctionMaxQuery and MoreLikeThis

Friday, 21 February 2014

On 21 Feb 2014, at 01:15, Sanne Grinovero <sanne(a)hibernate.org> wrote:

...
 I still suspect that a DisMax approach would provide a better scoring
 model, but this is an implementation detail we should iterate on in a
 second phase.
 Essentially, taking the example of "albino elephants", I agree on the
 behaviour you described, but I think there are some additional aspects
 to consider when you're evaluating how a partial match "albino" scores
 against a full match "albino elephant" in a single field, rather than
 split up, or how "albino" could score less in field A than in
 field B, so even swapping the positions of terms on different fields could
 provide a less valuable match.
 Probably better explained with an example on a larger data set, but
 alas I won't be able to craft one soon. Still, it's not a blocker at
 all, as in this first phase I think we should 1) have a working
 solution 2) focus on API effectiveness. Performance and a sophisticated
 scoring system will necessarily have to follow: I'm unpacking a large
 data set to play with, and I'm pretty sure we'll have plenty of follow-up
 improvements.

If I managed to decipher you, you think that applying a dismax query at the top of MLT (i.e.
the junction between the graphs of queries related to each field) would be useful to favor
a field that gets a better score and downplay an average resemblance over several
fields?
It is a lot to anticipate that now (that we would need boolean / dismax work), because it
does impact the context sequence of the DSL, at least when the addition is complex.
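
To make the scoring difference concrete, here is a minimal sketch in plain Java. It does not use the Lucene or Hibernate Search APIs; the class name, method names and per-field scores are all hypothetical and chosen only for illustration. It assumes a boolean junction simply sums the per-field scores (ignoring coord and normalization factors), and that a dismax junction takes the best per-field score plus a tie-breaker multiple of the remaining ones.

```java
import java.util.Arrays;

// Illustrative only: hypothetical per-field scores, no Lucene dependency.
public class ScoringSketch {

    // Simplified boolean (SHOULD) junction: per-field scores are summed.
    static double booleanScore(double[] fieldScores) {
        return Arrays.stream(fieldScores).sum();
    }

    // DisMax junction: best field score plus tieBreaker times the other scores.
    static double disMaxScore(double[] fieldScores, double tieBreaker) {
        double max = Arrays.stream(fieldScores).max().orElse(0.0);
        double sum = Arrays.stream(fieldScores).sum();
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        // Document 1: strong resemblance on a single field.
        double[] strongInOneField = {0.9, 0.0};
        // Document 2: average resemblance spread over two fields.
        double[] averageInBoth = {0.5, 0.5};
        double tieBreaker = 0.1; // assumed value, for illustration

        System.out.println("boolean, strong in one field: " + booleanScore(strongInOneField));
        System.out.println("boolean, average in both:     " + booleanScore(averageInBoth));
        System.out.println("dismax,  strong in one field: " + disMaxScore(strongInOneField, tieBreaker));
        System.out.println("dismax,  average in both:     " + disMaxScore(averageInBoth, tieBreaker));
    }
}
```

With these made-up numbers the boolean junction ranks the document with average resemblance on both fields first, while the dismax junction ranks the document with one strongly matching field first, which is the effect described above.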
