[
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-960?pag...
]
Hardy Ferentschik commented on HSEARCH-960:
-------------------------------------------
After reviewing some code and digging a little deeper, I am not so sure that we can
actually do anything about this. The problem lies not so much on the Hibernate Search
side as in the Lucene API.
On the Search side we actually keep the metadata per field and use the right option when
building the document. See _assertValuesAreIndexedWithDifferentAnalyzeSettings_ of this
[test|https://github.com/hferentschik/hibernate-search/blob/a0352df5845c18...]
The problem is that the analysis step does not occur when the _Document_ is built,
but when the document is added to the index (see e.g. _AddWorkDelegate_ -
{code}writer.addDocument( work.getDocument(), analyzer ){code}).
Analyzers work per field name, and we have our own implementation, _ScopedAnalyzer_,
which returns an analyzer depending on the field name. For a non-analyzed field we use
_PassThroughAnalyzer_. To implement the described use case our _ScopedAnalyzer_ would have
to return the _PassThroughAnalyzer_ in one case and the _StandardAnalyzer_ in the other,
for the same field name. The field name alone does not carry enough information to make
that distinction.
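To illustrate the limitation, here is a minimal sketch of a lookup keyed only by field name. It does not use the real Lucene or Hibernate Search classes: _ScopedAnalyzerSketch_, _PASS_THROUGH_ and _TOKENIZING_ are hypothetical stand-ins, and the "last registration wins" behaviour is an assumption of the sketch, not a statement about how _ScopedAnalyzer_ resolves the conflict.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of a per-field-name analyzer lookup (not the real ScopedAnalyzer).
public class ScopedAnalyzerSketch {

    // Stand-in for PassThroughAnalyzer: the whole value becomes a single token.
    public static final Function<String, List<String>> PASS_THROUGH =
            text -> List.of(text);

    // Crude stand-in for StandardAnalyzer: lower-case and split on whitespace.
    public static final Function<String, List<String>> TOKENIZING =
            text -> Arrays.asList(text.toLowerCase().split("\\s+"));

    // The only key available at analysis time is the field name.
    private final Map<String, Function<String, List<String>>> scoped = new HashMap<>();

    public void register(String fieldName, Function<String, List<String>> analyzer) {
        // A second registration for the same name replaces the first one.
        scoped.put(fieldName, analyzer);
    }

    public List<String> analyze(String fieldName, String value) {
        return scoped.getOrDefault(fieldName, TOKENIZING).apply(value);
    }

    public static void main(String[] args) {
        ScopedAnalyzerSketch analyzer = new ScopedAnalyzerSketch();
        analyzer.register("simple_search", TOKENIZING);   // as declared for 'string2'
        analyzer.register("simple_search", PASS_THROUGH); // as declared for 'string'
        // Both properties now share one analyzer, so 'string2' is not tokenized.
        System.out.println(analyzer.analyze("simple_search", "Hello World")); // prints [Hello World]
    }
}
```

Whichever registration ends up in the map, one of the two properties is analyzed with the wrong settings, because the lookup has no way to tell which property a value came from.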
I think we are better off logging a warning or throwing an exception. Or does anyone have a
better idea?
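A rough sketch of what the exception variant could look like, assuming the check runs while the mapping metadata is collected. _FieldMappingConflictSketch_, _IndexMode_ and _declare_ are hypothetical names made up for this illustration; they are not existing Hibernate Search API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mapping-time check: reject one field name with two index settings.
public class FieldMappingConflictSketch {

    // Simplified stand-in for the Index.TOKENIZED / Index.UN_TOKENIZED settings.
    public enum IndexMode { TOKENIZED, UN_TOKENIZED }

    private final Map<String, IndexMode> modeByFieldName = new HashMap<>();

    public void declare(String fieldName, IndexMode mode) {
        // putIfAbsent returns the earlier setting, or null if the name is new.
        IndexMode previous = modeByFieldName.putIfAbsent(fieldName, mode);
        if (previous != null && previous != mode) {
            throw new IllegalStateException("Field '" + fieldName
                    + "' is mapped as both " + previous + " and " + mode);
        }
    }

    public static void main(String[] args) {
        FieldMappingConflictSketch mapping = new FieldMappingConflictSketch();
        mapping.declare("simple_search", IndexMode.UN_TOKENIZED);
        try {
            mapping.declare("simple_search", IndexMode.TOKENIZED);
        } catch (IllegalStateException e) {
            // A warning-only variant would log this message instead of throwing.
            System.out.println(e.getMessage());
        }
    }
}
```

Repeating the same setting for a shared field name is still accepted here; only contradictory settings are rejected.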
Index.UN_TOKENIZED overrides other tokenized fields that share the
same name
----------------------------------------------------------------------------
Key: HSEARCH-960
URL:
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-960
Project: Hibernate Search
Issue Type: Bug
Components: mapping
Affects Versions: 3.4.0.Final
Environment: 3.4.0 Final
Reporter: John-Michael Au
Assignee: Hardy Ferentschik
Labels: annotations, bug, override, tokenized, un_tokenized
Fix For: 3.4.2, 4.0.0.CR2
Marking one field as un-tokenized causes all other fields with the same names to be
un-tokenized.
e.g.
{code}
@Field(name = "simple_search", index = Index.UN_TOKENIZED, store = Store.NO)
private String string;
@Field(name = "simple_search", index = Index.TOKENIZED, store = Store.NO)
private String string2;
{code}
The resulting behaviour is that "simple_search" will be made up of un-tokenized
'string' and 'string2' values, even though 'string2' was specified
to be tokenized.