[infinispan-issues] [JBoss JIRA] (ISPN-9494) Ickl full-text queries with wildcards are affected by upper/lower case

Gustavo Fernandes (Jira) issues at jboss.org
Fri Nov 30 04:48:00 EST 2018


    [ https://issues.jboss.org/browse/ISPN-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668490#comment-13668490 ] 

Gustavo Fernandes commented on ISPN-9494:
-----------------------------------------

The full-text wildcard query is supposed to match the terms as they are stored in the index, so depending on the analyzer, it will not ignore capitalization.

In the provided sample, if firstName is "Wolfgang" and the field is analyzed (using the standard analyzer), it will be stored in the index as "wolfgang". So wolfg*ng should match, but not W*lfgang or Wolf*ng, since no term matching those patterns exists in the index.

If the keyword analyzer were used, then firstName would be stored in the index as "Wolfgang" (the value is indexed as-is), and "Wolfg*ng", "Wolfgan*", etc. would all match, but not "wolfgang".
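
The two cases above can be illustrated with a small plain-Java simulation. This is only a sketch and does not use Lucene or Infinispan: lowercasingAnalyzer, keywordAnalyzer and wildcardMatches are hypothetical helpers that stand in for what the analyzers and the wildcard query do with a single-term value.

```java
import java.util.Locale;
import java.util.regex.Pattern;

public class WildcardCaseDemo {

    // Hypothetical stand-in for an analyzer that lower-cases terms,
    // as the standard analyzer does for a single-word value.
    static String lowercasingAnalyzer(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    // Hypothetical stand-in for the keyword analyzer: the value is indexed unchanged.
    static String keywordAnalyzer(String value) {
        return value;
    }

    // Case-sensitive wildcard match of a pattern against one indexed term.
    // '*' matches any run of characters, '?' matches exactly one character.
    static boolean wildcardMatches(String pattern, String indexedTerm) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), indexedTerm);
    }

    public static void main(String[] args) {
        String analyzed = lowercasingAnalyzer("Wolfgang"); // term as indexed with lower-casing
        String keyword = keywordAnalyzer("Wolfgang");      // term as indexed unchanged
        for (String p : new String[] {"wolfg*ng", "W*lfgang", "Wolf*ng", "wolfgang"}) {
            System.out.println(p
                    + " analyzed=" + wildcardMatches(p, analyzed)
                    + " keyword=" + wildcardMatches(p, keyword));
        }
    }
}
```

The pattern itself is never lower-cased in this sketch, which mirrors the behaviour described above: the wildcard is compared with the terms exactly as they are stored.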

In general, the wildcard query should be avoided in conjunction with full-text searches, as it has a performance drawback that causes the search engine to scan for the terms rather than performing simple lookups.
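
As a rough illustration of the lookup versus scan difference, here is a plain-Java sketch that models the term dictionary as a sorted set. It is a simplification under my own assumptions, not how the search engine is implemented: TermScanDemo, exactLookup and suffixScan are hypothetical names, and the visited counter exists only to show how many terms are touched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class TermScanDemo {

    // Exact term query: a single lookup in the sorted term dictionary.
    static boolean exactLookup(TreeSet<String> terms, String term) {
        return terms.contains(term);
    }

    // Wildcard query of the form "*suffix": there is no fixed prefix to narrow
    // the search, so every term in the dictionary is visited and tested.
    // visited[0] is incremented once per term examined.
    static List<String> suffixScan(TreeSet<String> terms, String suffix, int[] visited) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            visited[0]++;
            if (term.endsWith(suffix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(List.of("adrian", "gustavo", "wolfgang"));
        int[] visited = {0};
        System.out.println(exactLookup(terms, "wolfgang"));
        System.out.println(suffixScan(terms, "gang", visited) + " after visiting " + visited[0] + " terms");
    }
}
```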



> Ickl full-text queries with wildcards are affected by upper/lower case
> ----------------------------------------------------------------------
>
>                 Key: ISPN-9494
>                 URL: https://issues.jboss.org/browse/ISPN-9494
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Embedded Querying, Remote Querying
>    Affects Versions: 7.2.2.Final
>            Reporter: Wolf-Dieter Fink
>            Assignee: Adrian Nistor
>            Priority: Major
>
> If an attribute is annotated with 
>    @Field(index = Index.YES,store = Store.YES, analyze = Analyze.YES)
> a full-text search with wildcards can be used like this
>     from proto.Person p where p.firstName : 'wolf*ng'
> It is expected that the query will ignore capitalisation. So a search 'Wolf*' 'Wolf*ng' 'wolf*' should find the attribute value "Wolfgang" and "wolfgang".
> But only if the query is written in lower-case it will match the expectation



--
This message was sent by Atlassian Jira
(v7.12.1#712002)

