[teiid-issues] [JBoss JIRA] (TEIID-5709) Consistent character handling beyond bmp

Steven Hawkins (Jira) issues at jboss.org
Thu Apr 4 08:32:00 EDT 2019


    [ https://issues.jboss.org/browse/TEIID-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717886#comment-13717886 ] 

Steven Hawkins commented on TEIID-5709:
---------------------------------------

Most of our logic is UCS-2 based, but this is only noted with regard to Teiid string sorting behavior.  Some routines are surrogate aware, such as Clob upper/lower.
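
To illustrate why the sorting behavior matters, here is a minimal Java sketch, not Teiid's actual code, contrasting the default UTF-16 code unit ordering of String.compareTo with a code point ordering. The class name CodePointOrder and the comparator BY_CODE_POINT are hypothetical names chosen for this example.

```java
import java.util.Comparator;

class CodePointOrder {
    // Hypothetical comparator: orders strings by Unicode code point
    // instead of by individual UTF-16 code units.
    static final Comparator<String> BY_CODE_POINT = (a, b) -> {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return Integer.compare(ca, cb);
            }
            // Advance by 1 for BMP characters, by 2 for surrogate pairs.
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other: the shorter one sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    };

    public static void main(String[] args) {
        String bmp = "\uFF5E";        // U+FF5E, inside the BMP
        String supp = "\uD83D\uDE00"; // U+1F600, a surrogate pair

        // Code unit order: the high surrogate 0xD83D is less than 0xFF5E,
        // so the supplementary character sorts first.
        System.out.println(bmp.compareTo(supp) > 0);

        // Code point order: 0xFF5E is less than 0x1F600,
        // so the BMP character sorts first.
        System.out.println(BY_CODE_POINT.compare(bmp, supp) < 0);
    }
}
```

The two orderings only disagree when a supplementary character is compared against a BMP character in the range U+E000 to U+FFFF, which is why the difference is easy to miss in testing.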

Issues identified so far:

- the Teiid length function reports the UCS-2 length (same as H2), which differs from Oracle (which has explicit LENGTH_ functions) and PostgreSQL
-- similar issues exist with ascii, initCap, translate, trim
- the char type can only hold characters in the BMP.
- XMLFunctions name escaping does not properly handle surrogate pairs
- there may be issues with object names, but that requires more review (SQLStringVisitor, resolving logic, etc.).
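
The length, ascii, and char items above can be reproduced with plain Java strings. The following is a small sketch under the assumption that a code point aware variant would rely on the standard java.lang.String and java.lang.Character methods; the class SupplementaryLength and its helper methods are hypothetical and are not part of Teiid.

```java
class SupplementaryLength {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnitLength(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // First full code point of a non-empty string, in the spirit of an
    // ascii-style function that is surrogate aware.
    static int firstCodePoint(String s) {
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00"; // 'a' followed by U+1F600

        System.out.println(codeUnitLength(s));  // prints 3
        System.out.println(codePointLength(s)); // prints 2

        String supp = "\uD83D\uDE00";
        // charAt(0) yields only the high surrogate 0xD83D, while
        // codePointAt(0) yields the full code point 0x1F600.
        System.out.println(Integer.toHexString(supp.charAt(0)));
        System.out.println(Integer.toHexString(firstCodePoint(supp)));

        // A single Java char cannot represent this code point.
        System.out.println(Character.isBmpCodePoint(firstCodePoint(supp)));
    }
}
```

Functions such as translate and trim run into the same problem whenever they index or compare one char at a time, since each half of a surrogate pair is then treated as a separate character.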


> Consistent character handling beyond bmp
> ----------------------------------------
>
>                 Key: TEIID-5709
>                 URL: https://issues.jboss.org/browse/TEIID-5709
>             Project: Teiid
>          Issue Type: Bug
>          Components: Server
>            Reporter: Steven Hawkins
>            Assignee: Steven Hawkins
>            Priority: Major
>             Fix For: 12.2
>
>
> There are many places in the code that only consider each 16-bit character when we should consult the full code point.



--
This message was sent by Atlassian Jira
(v7.12.1#712002)

