[teiid-issues] [JBoss JIRA] (TEIID-5709) Consistent character handling beyond bmp
Steven Hawkins (Jira)
issues at jboss.org
Thu Apr 4 08:32:00 EDT 2019
[ https://issues.jboss.org/browse/TEIID-5709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717886#comment-13717886 ]
Steven Hawkins commented on TEIID-5709:
---------------------------------------
Most of our logic is UCS-2 based, but this is only documented with regard to Teiid string sorting behavior. Some routines are surrogate-aware, such as Clob upper/lower.
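To illustrate the difference between UCS-2 style and surrogate-aware processing, here is a minimal Java sketch. The class and method names are hypothetical and this is not the actual Teiid Clob upper/lower code; it only contrasts upper-casing one 16-bit char at a time with upper-casing one code point at a time.

```java
// Illustrative sketch only; SurrogateAwareUpperDemo is a hypothetical class, not Teiid code.
class SurrogateAwareUpperDemo {

    // UCS-2 style: upper-cases each 16-bit char on its own, so the two halves
    // of a surrogate pair pass through unchanged.
    static String upperPerChar(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(Character.toUpperCase(s.charAt(i)));
        }
        return sb.toString();
    }

    // Surrogate aware: decodes each code point before mapping it.
    // Character.toUpperCase(int) is a 1:1 mapping, so special casing
    // that changes the string length is not handled here.
    static String upperPerCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toUpperCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+10428 (DESERET SMALL LETTER LONG I), encoded as the pair D801 DC28.
        String small = "\uD801\uDC28";
        // Its upper-case form is U+10400, encoded as D801 DC00.
        String capital = "\uD801\uDC00";
        System.out.println(upperPerChar(small).equals(small));        // true: unchanged
        System.out.println(upperPerCodePoint(small).equals(capital)); // true: mapped
    }
}
```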
Issues identified so far:
- the Teiid length function reports the UCS-2 length (same as H2), which differs from Oracle (which has explicit LENGTH_ functions) and PostgreSQL
-- similar issues exist with ascii, initCap, translate, trim
- the char type can only hold characters in the BMP
- XMLFunctions name escaping does not properly handle surrogate pairs
- there may be issues with object names, but that requires more review (SQLStringVisitor, resolving logic, etc.)
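As a rough illustration of the length and char issues listed above, the following Java sketch compares the two ways of counting. The class and method names are hypothetical and do not come from the Teiid code base; the sketch only shows how a count of UTF-16 code units diverges from a count of code points, and why a single Java char cannot hold a character outside the BMP.

```java
// Illustrative sketch only; BmpLengthDemo and its methods are hypothetical, not Teiid code.
class BmpLengthDemo {

    // Length in UTF-16 code units, i.e. what a UCS-2 based length function reports.
    static int codeUnitLength(String s) {
        return s.length();
    }

    // Length in Unicode code points; a surrogate pair counts once.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // True if the code point fits in a single Java char, i.e. it lies in the BMP.
    static boolean fitsInChar(int codePoint) {
        return Character.isBmpCodePoint(codePoint);
    }

    public static void main(String[] args) {
        // "a" followed by U+1F600, which UTF-16 encodes as the pair D83D DE00.
        String s = "a\uD83D\uDE00";
        System.out.println(codeUnitLength(s));   // 3
        System.out.println(codePointLength(s));  // 2
        System.out.println(fitsInChar(0x1F600)); // false
    }
}
```

Which of the two counts a length function should return is the behavioral question; the sketch only shows that they diverge as soon as a supplementary character is present.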
> Consistent character handling beyond bmp
> ----------------------------------------
>
> Key: TEIID-5709
> URL: https://issues.jboss.org/browse/TEIID-5709
> Project: Teiid
> Issue Type: Bug
> Components: Server
> Reporter: Steven Hawkins
> Assignee: Steven Hawkins
> Priority: Major
> Fix For: 12.2
>
>
> There are many places in the code that only consider each 16-bit character individually when we should consult the full code point.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)