[hibernate-issues] [Hibernate-JIRA] Commented: (HSEARCH-960) Index.UN_TOKENIZED overrides other tokenized fields that share the same name

Wednesday, 26 October 2011

    [ http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-960?pag... ]

John-Michael Au commented on HSEARCH-960:
-----------------------------------------

Hi Sanne,

In our case, we have an ID (I apologize for the ambiguity: these are not the actual object
identifiers, as in the database table primary keys) such as '1234.56.abcd', which we
would like to be added verbatim to the index, and text such as '1234 Years Ago',
which should be tokenized. In this case, of course, a search for 1234 should match the
document with '1234 Years Ago', but not the one with '1234.56.abcd'.

I understand that the standard tokenizer will not break up '1234.56.abcd' anyway,
but it does seem that such behaviour should be explicitly defined rather than implied by
the functionality of other components. I suppose that if you look at it from the point of
view that multiple fields with the same name should essentially behave as one field, being
able to specify different indexing options for each field doesn't make much sense. On
the other hand, if you see the field as the end product of concatenating other
individual fields, it would make sense for each contributing field to be indexed as required.
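
To make the second view concrete, here is a minimal sketch in plain Java. It is a toy model only, not Hibernate Search or Lucene code: the class SharedFieldSketch and its methods are hypothetical names, and splitting on whitespace with lowercasing merely stands in for a real analyzer. Each instance represents the single shared field of one document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy model of one index field fed by several properties (hypothetical, for illustration). */
final class SharedFieldSketch {
    // Terms stored under the one shared field name, e.g. "simple_search".
    private final List<String> terms = new ArrayList<>();

    /** Adds the value verbatim as a single term (the un-tokenized case). */
    void addUntokenized(String value) {
        terms.add(value);
    }

    /** Splits on whitespace and lowercases; a simplified stand-in for an analyzer. */
    void addTokenized(String value) {
        for (String token : value.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
    }

    /** Exact match against the stored terms, roughly what a term query does. */
    boolean matches(String term) {
        return terms.contains(term);
    }

    public static void main(String[] args) {
        SharedFieldSketch idDoc = new SharedFieldSketch();
        idDoc.addUntokenized("1234.56.abcd");

        SharedFieldSketch textDoc = new SharedFieldSketch();
        textDoc.addTokenized("1234 Years Ago");

        // Models the reported behaviour: the text ends up un-tokenized too.
        SharedFieldSketch overriddenDoc = new SharedFieldSketch();
        overriddenDoc.addUntokenized("1234 Years Ago");

        System.out.println(textDoc.matches("1234"));
        System.out.println(idDoc.matches("1234"));
        System.out.println(overriddenDoc.matches("1234"));
    }
}
```

In this model, a search for 1234 finds the tokenized text but not the verbatim ID, and it stops finding the text once that value is stored un-tokenized as well, which is the behaviour we are seeing.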

...
 Index.UN_TOKENIZED overrides other tokenized fields that share the
same name
 ----------------------------------------------------------------------------

                 Key: HSEARCH-960
                 URL:
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-960
             Project: Hibernate Search
          Issue Type: Bug
          Components: mapping
    Affects Versions: 3.4.0.Final
         Environment: 3.4.0 Final
            Reporter: John-Michael Au
              Labels: annotations, bug, override, tokenized, un_tokenized
             Fix For: 3.4.2, 4.0.0.CR2

 Marking one field as un-tokenized causes all other fields with the same name to be
un-tokenized as well.
 For example:
 {code}
 @Field(name = "simple_search", index = Index.UN_TOKENIZED, store = Store.NO)
 private String string;
 @Field(name = "simple_search", index = Index.TOKENIZED, store = Store.NO)
 private String string2;
 {code}
 The resulting behaviour is that "simple_search" will be made up of un-tokenized
'string' and 'string2' values, even though 'string2' was specified
to be tokenized. 
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
