HibernateSearchResourceLoader uses default charset for reading resources
------------------------------------------------------------------------
Key: HSEARCH-390
URL:
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-390
Project: Hibernate Search
Issue Type: Bug
Components: analyzer
Affects Versions: 3.1.1.GA
Reporter: Ivan Holub
HibernateSearchResourceLoader uses default charset for reading resources.
So stop words are not working for other languages.
@AnalyzerDef(name="ru",
tokenizer=(a)TokenizerDef(factory=StandardTokenizerFactory.class),
filters={
@TokenFilterDef(factory=StandardFilterFactory.class),
@TokenFilterDef(factory=LowerCaseFilterFactory.class),
@TokenFilterDef(factory=StopFilterFactory.class,
params=@Parameter(name="words",
value="stopwords/stopwords_ru.txt")),
@TokenFilterDef(factory=SnowballPorterFilterFactory.class,
params=@Parameter(name="language",
value="Russian"))
stopwords/stopwords_ru.txt is UTF-8 file
To fix the problem I constructed Analyzer in separate class and without using
AnalyzerDef.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://opensource.atlassian.com/projects/hibernate/secure/Administrators....
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira