[hibernate-issues] [Hibernate-JIRA] Resolved: (HSEARCH-390) Allow customization of the charset used by analyzer components
Emmanuel Bernard (JIRA)
noreply at atlassian.com
Sat Nov 6 12:00:13 EDT 2010
[ http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Emmanuel Bernard resolved HSEARCH-390.
--------------------------------------
Resolution: Fixed
> Allow customization of the charset used by analyzer components
> --------------------------------------------------------------
>
> Key: HSEARCH-390
> URL: http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-390
> Project: Hibernate Search
> Issue Type: New Feature
> Components: analyzer
> Affects Versions: 3.1.1.GA
> Reporter: Ivan Holub
> Assignee: Emmanuel Bernard
> Fix For: 3.3.0.CR1
>
>
> HibernateSearchResourceLoader uses the platform default charset when reading resources,
> so stop words do not work for languages whose stop-word files are not encoded in that charset.
> @AnalyzerDef(name = "ru",
>     tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
>     filters = {
>         @TokenFilterDef(factory = StandardFilterFactory.class),
>         @TokenFilterDef(factory = LowerCaseFilterFactory.class),
>         @TokenFilterDef(factory = StopFilterFactory.class,
>             params = @Parameter(name = "words",
>                 value = "stopwords/stopwords_ru.txt")),
>         @TokenFilterDef(factory = SnowballPorterFilterFactory.class,
>             params = @Parameter(name = "language",
>                 value = "Russian"))
>     })
> stopwords/stopwords_ru.txt is a UTF-8 file.
> As a workaround, I constructed the Analyzer in a separate class, without using @AnalyzerDef.
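
For illustration, the charset problem described above can be sketched in plain Java. This is not the code that went into Hibernate Search for this issue; the class name StopWordReader and the method readWords are hypothetical, and skipping blank lines and lines starting with "#" is an assumption about the word-list format. The only point it shows is that the charset is passed explicitly to InputStreamReader instead of being left to the platform default.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Hibernate Search or Solr.
public class StopWordReader {

    // Reads one stop word per line, decoding the bytes with the given charset.
    // Assumption: blank lines and lines starting with '#' are not stop words.
    static Set<String> readWords(InputStream in, Charset charset) throws IOException {
        Set<String> words = new LinkedHashSet<>();
        // Passing the charset explicitly avoids depending on the JVM's
        // platform default encoding, which may not be UTF-8.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty() && !word.startsWith("#")) {
                    words.add(word);
                }
            }
        }
        return words;
    }
}
```

A caller would open stopwords/stopwords_ru.txt as a stream and pass StandardCharsets.UTF_8 as the second argument. Decoding the same bytes with a single-byte default charset would yield strings that never match the Cyrillic tokens produced by the tokenizer, which is consistent with the symptom reported here.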