[hibernate-dev] Re: [Hibernate-JIRA] Commented: (HSEARCH-115) Add a default value for indexing null value
Emmanuel Bernard
emmanuel at hibernate.org
Tue Apr 22 11:02:37 EDT 2008
Interesting. Quite heavy though.
The thing I like is that you can build an 'is null' operator on top of
this internal model. And you don't have to pray for the analyzer to do
the right thing.
The thing I dislike is that it makes the raw query quite unintuitive
and heavy (but I guess a three-state SQL query is quite verbose as
well).
If we go that path, we should add a NullQuery class that can be
combined with other *Query from Lucene and hide the complexity.
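
A minimal sketch of that idea, written in plain Java rather than against
the Lucene API so that it stands alone. Everything here is hypothetical:
the class name, the "-isNull" suffix (taken from the proposal quoted
below), and the use of a Map as a stand-in for a Document and a Predicate
as a stand-in for a Query. It only illustrates how an 'is null' operator
could hide the marker field and still combine with other conditions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Plain-Java stand-in for the null-marker model; no Lucene types are used. */
final class NullMarkerSketch {

    // Hypothetical suffix. '-' is not legal in a Java identifier,
    // so the marker cannot clash with a real property name.
    static final String NULL_SUFFIX = "-isNull";

    /** Adds "field:value" for a non-null value, or only "field-isNull:true" for null. */
    static void addField(Map<String, String> doc, String field, String value) {
        if (value == null) {
            doc.put(field + NULL_SUFFIX, "true");
        } else {
            doc.put(field, value);
            doc.put(field + NULL_SUFFIX, "false"); // optional, for query consistency
        }
    }

    /** Stand-in for an exact term query (no prefix or analyzer handling). */
    static Predicate<Map<String, String>> term(String field, String value) {
        return doc -> value.equals(doc.get(field));
    }

    /** Stand-in for the suggested NullQuery: callers never see the marker field name. */
    static Predicate<Map<String, String>> isNull(String field) {
        return term(field + NULL_SUFFIX, "true");
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        addField(doc, "value", "cat");
        addField(doc, "lang", null); // null language, as in the JIRA example

        // Roughly "+value:cat +lang IS NULL", combined like any other condition.
        Predicate<Map<String, String>> query = term("value", "cat").and(isNull("lang"));
        System.out.println(doc + " matches: " + query.test(doc));
    }
}
```

In a real implementation the two stand-ins would presumably map to a term
query on the marker field wrapped in a boolean query, but that mapping is
an assumption and not something decided in this thread.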
On Apr 21, 2008, at 20:29, Sanne Grinovero wrote:
> Hey all,
> nice discussion :-)
>
> About implementation 1:
> I like this one, but think this is what people already are doing using
> their own StringBridge when needed, as an "unsupported hack".
> In case the framework was offering me this option I would expect it to
> be really smart: "escape" somehow the value if eventually some
> non-null object is tokenized to my same marker string; I would also
> expect the framework to be keyword-aware somehow on query parsing, but
> this looks really messy.
>
> A new proposal:
> I got inspired by the "3VL" considerations described in Emmanuel's
> link to wikipedia, and think backwards compatibility is nice:
> add a "@IndexNullMarker" on the property, this will add an additional
> Field to the index for null values:
>
> @Field
> @IndexNullMarker
> String getFoo(){
>
> would add either the Field "foo:value" or "fooIsNull:true" to the
> Document, implementing real three-state logic. Eventually we could add
> "fooIsNull:false" too for query consistency.
> To resolve name clashes (imagine a getFooIsNull() property) we could use
> some character prohibited in Java identifiers:
> "foo-isNull:true"
> or escape the name with additional prefixes as needed.
>
> The Field and StringBridge API would remain as-is;
> If you prefer not to add an additional annotation, @IndexNullMarker could
> be dropped if you think adding this field is acceptable for all fields.
> We could add an additional option to existing field, but I am thinking
> of a case you don't want to index the value, only mark it if it is
> null.
> Also IMHO an additional annotation makes it more intuitive that you
> are adding an additional Field.
>
> In any case I would question the original reason for this improvement
> to index null values as "default language"; if there is a default I
> think you should better tell me which it is, or give it a value. So I
> can't really think of a good reason to have different results for
> "search-on-null" rather than "search-on-empty", but it could
> potentially help in "entity state reconstruction" use cases, when the
> entity is partially rebuilt using projection.
>
> regards,
> Sanne
>
> 2008/4/21, Emmanuel Bernard <emmanuel at hibernate.org>:
>> Hey
>> The more I think about the feature, the less I like it.
>>
>> Here is what I have written in Hibernate Search in Action
>>
>>
>> Hibernate Search, by default, does not store null attributes into the
>> index. Lucene does not have the notion of null fields, the field is
>> simply not there. Hibernate Search could offer the ability (and most
>> likely will in the future) to use a special string as a null marker to
>> still be able to search by "null".
>> But before you jump at the Hibernate Search team's throat, you need to
>> understand why they have not offered this feature so far. Null is not a
>> value per se. Null means that the data is not known (or does not make
>> sense). Therefore, searching by null as if it was a value is somewhat
>> odd. The authors are well aware that this is a raging debate, especially
>> amongst the relational model experts (see
>> http://en.wikipedia.org/wiki/Null_%28SQL%29).
>> Whenever you feel the need for searching by "null", you should ask
>> yourself if storing a special marker value in the database would make
>> more sense. If you store a special marker value in the database, a lot
>> of the "null" inconsistencies vanish. It also has the side effect of
>> being queryable in Lucene and Hibernate Search.
>>
>> So before we jump on the boat for this feature, I would like to know if
>> people think it's still a good idea to offer this feature.
>>
>> To answer your questions, the reason why I do not pass @Field but the raw
>> set of data is because the @Field.index is translated into its Lucene
>> representation: some work is done.
>> Most people will write StringBridge implementations anyway, where the
>> null handling will be taken care of transparently (by
>> String2FieldBridgeAdaptor).
>>
>> I think I like 1 or 3. Note that get should be changed as well. Three is
>> interesting indeed, rename it IndexingStrategy.
>>
>>
>> On Apr 21, 2008, at 10:07, Hardy Ferentschik wrote:
>> Hi Emmanuel,
>>
>> what's your take on this? Just adding another String parameter will work,
>> but are we not getting too many parameters into the method? Wouldn't it
>> be nicer to pass the actual @Field annotation? I think this might also
>> make things clearer for the implementor of the interface.
>>
>> I am also trying here to get a little into your head to understand your
>> ideas behind the code design - hope you don't mind ;-)
>>
>> --Hardy
>>
>>
>>
>> ------- Forwarded message -------
>> From: "Hardy Ferentschik (JIRA)" <noreply at atlassian.com>
>> To: hardy at ferentschik.de
>> Subject: [Hibernate-JIRA] Commented: (HSEARCH-115) Add a default
>> value for
>> indexing null value
>> Date: Mon, 21 Apr 2008 14:04:33 +0200
>>
>>
>> [
>> http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_30032
>> ]
>>
>> Hardy Ferentschik commented on HSEARCH-115:
>> -------------------------------------------
>>
>> Ok, here are a few suggestions:
>>
>> 1. This is the simplest way. Basically just add a new property named
>> 'indexNullAs' to @Field and @ClassBridge. Accordingly extend the
>> FieldBridge interface to set(String name, Object value, Document document,
>> Field.Store store, Field.Index index, Field.TermVector termVector,
>> Float boost, String indexNullAs).
>>
>> 2. Alternatively one could change the FieldBridge API to actually pass in
>> the Field annotation itself: set(String name, Object value, Document
>> document, Field fieldAnnotation, Float boost). This would reduce the
>> number of parameters and might actually be more transparent for users
>> implementing custom bridges. Unfortunately, one would have to introduce a
>> ClassBridge interface as well in this case. I am not sure whether it is a
>> good design choice to pass annotation instances around.
>>
>> 3. We could also change the API into something like this: set(String name,
>> Object value, Document document, IndexProperties props), where
>> IndexProperties is just a wrapper class for Field.Store, Field.Index, ...
>> The drawback is that this just increases the number of classes.
>>
>> Any comments?
>>
>> Add a default value for indexing null value
>> -------------------------------------------
>>
>> Key: HSEARCH-115
>> URL:
>> http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-115
>> Project: Hibernate Search
>> Issue Type: Improvement
>> Components: mapping
>> Reporter: Julien Brulin
>> Assignee: Hardy Ferentschik
>> Fix For: 3.1.0
>>
>>
>> Hi,
>> Null elements are not indexed by Lucene, so it's not easy to use a
>> nullable property in a Lucene query.
>> I have a TagTranslation entity in my model with a nullable property
>> language. In this case null is used as the default language for a tag
>> translation. Each translation may have many variations, like synonyms.
>> Because I can't specify a default value for null in the @Field
>> annotation like this: @Field(index=Index.UN_TOKENIZED,
>> store=Store.NO, default='null'), I can't search for a cat tag with a
>> default translation like this: +value:cat* +lang:null
>> <pre><code>
>> @Entity()
>> @Table(name="indexing_tag_trans")
>> @org.hibernate.annotations.Cache(usage=org.hibernate.annotations.CacheConcurrencyStrategy.READ_WRITE)
>> @Indexed
>> public class TagTranslation implements java.io.Serializable {
>>
>>
>> private static final long serialVersionUID = -1065316566731456110L;
>>
>> @Id
>> @GeneratedValue(strategy=GenerationType.IDENTITY)
>> @DocumentId
>> private Integer id;
>>
>> @Field(index=Index.UN_TOKENIZED, store=Store.NO)
>> private String language;
>>
>> @Field(index=Index.TOKENIZED, store=Store.YES)
>> private String value;
>>
>> @OneToMany(cascade=CascadeType.ALL, fetch=FetchType.LAZY)
>> @org.hibernate.annotations.Fetch(org.hibernate.annotations.FetchMode.SUBSELECT)
>> @JoinColumn(name="translation_id")
>> @IndexedEmbedded
>> private List<TagVariation> variations = new LinkedList<TagVariation>();
>>
>> public TagTranslation() { }
>> ...
>> </code>
>> </pre>
>> What do you think about that?
>> PS: sorry for my English, I am French.
>>
>>
>>
>> --
>> Hartmut Ferentschik
>> Ekholmsv.339 ,1, 127 45 Skärholmen, Sweden
>> Phone: +46 704 225 097 (m)
>>
>>
>> _______________________________________________
>> hibernate-dev mailing list
>> hibernate-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>>
>>
>