[hibernate-issues] [Hibernate-JIRA] Updated: (HSEARCH-390) Allow customization of the charset used by analyzer components

Saturday, 6 November 2010



     [
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-390?pag...
]

Emmanuel Bernard updated HSEARCH-390:
-------------------------------------

         Assignee: Emmanuel Bernard
    Fix Version/s:     (was: 3.3.0)
                   3.3.0.CR1
       Issue Type: New Feature  (was: Bug)
          Summary: Allow customization of the charset used by analyzer components  (was:
HibernateSearchResourceLoader uses default charset for reading resources)

Fixed with an ad-hoc parameter:
{code}@Parameter(name="resource_charset", value="UTF-8"){code}

...
 Allow customization of the charset used by analyzer components
 --------------------------------------------------------------

                 Key: HSEARCH-390
                 URL:
http://opensource.atlassian.com/projects/hibernate/browse/HSEARCH-390
             Project: Hibernate Search
          Issue Type: New Feature
          Components: analyzer
    Affects Versions: 3.1.1.GA
            Reporter: Ivan Holub
            Assignee: Emmanuel Bernard
             Fix For: 3.3.0.CR1


 HibernateSearchResourceLoader uses the platform default charset when reading resources,
 so stop words do not work for languages whose stop-word files use a different encoding.
     @AnalyzerDef(name="ru",
         tokenizer=@TokenizerDef(factory=StandardTokenizerFactory.class),
         filters={
             @TokenFilterDef(factory=StandardFilterFactory.class),
             @TokenFilterDef(factory=LowerCaseFilterFactory.class),
             @TokenFilterDef(factory=StopFilterFactory.class,
                 params=@Parameter(name="words",
                     value="stopwords/stopwords_ru.txt")),
             @TokenFilterDef(factory=SnowballPorterFilterFactory.class,
                 params=@Parameter(name="language",
                     value="Russian"))
         })
 stopwords/stopwords_ru.txt is a UTF-8 file.
 To work around the problem, I constructed the Analyzer in a separate class, without using
AnalyzerDef.
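
 The charset problem described above can be reproduced with the JDK alone. The
 following is a minimal sketch, not the actual HibernateSearchResourceLoader code:
 the class name StopWordCharsetDemo and the helper readWords are hypothetical, the
 in-memory byte array stands in for stopwords/stopwords_ru.txt, and ISO-8859-1 is
 assumed as an example of a non-UTF-8 platform default. It requires Java 7 or later.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class StopWordCharsetDemo {

    /** Reads one stop word per line, decoding the stream with the given charset. */
    static List<String> readWords(InputStream in, Charset charset) throws IOException {
        List<String> words = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Three Russian stop words, written with Unicode escapes so the
        // example does not depend on the encoding of this source file.
        String content = "\u0438\n\u043d\u0435\n\u0447\u0442\u043e\n";
        byte[] utf8File = content.getBytes(StandardCharsets.UTF_8);

        // Decoded with the charset the file was actually written in.
        List<String> correct =
                readWords(new ByteArrayInputStream(utf8File), StandardCharsets.UTF_8);

        // Decoded with a mismatched charset (assumed platform default).
        List<String> wrong =
                readWords(new ByteArrayInputStream(utf8File), StandardCharsets.ISO_8859_1);

        String stopWord = "\u043d\u0435";
        System.out.println(correct.contains(stopWord)); // prints "true"
        System.out.println(wrong.contains(stopWord));   // prints "false"
    }
}
```

 With the mismatched charset each Cyrillic letter is decoded as two unrelated
 characters, so no token produced by the tokenizer can ever match an entry in the
 stop-word list. Passing the charset explicitly removes the dependency on the
 platform default.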
-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://opensource.atlassian.com/projects/hibernate/secure/Administrators....
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
