David Lloyd created ELY-525:
-------------------------------
Summary: Support our own Unicode normalizer
Key: ELY-525
URL: https://issues.jboss.org/browse/ELY-525
Project: WildFly Elytron
Issue Type: Feature Request
Reporter: David Lloyd
Priority: Minor
We should implement our own Unicode normalizer, because the JDK one performs poorly,
uses too much memory, and doesn't integrate well with the authentication mechanisms
that require normalization.
It would be exposed on {CodePointIterator} as a few methods:
* {decomposeCanonical}
* {decomposeCompatibility}
* {composeCanonical}
These methods could be chained in various ways to achieve the normalization forms
defined by the standard:
* NFD = {decomposeCanonical}
* NFC = {decomposeCanonical} + {composeCanonical}
* NFKD = {decomposeCompatibility}
* NFKC = {decomposeCompatibility} + {composeCanonical}
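As a reference point for the chaining above, here is a minimal sketch using the existing JDK java.text.Normalizer, which any replacement would presumably have to agree with. The class and helper names are hypothetical and exist only for illustration; the proposed {CodePointIterator} methods are not used here because they do not exist yet. The sample string combines U+FB01 (a ligature with only a compatibility decomposition) and U+00E9 (which has a canonical decomposition), so all four forms differ.

```java
import java.text.Normalizer;

// Hypothetical demo class: shows the four standard forms via the JDK normalizer.
final class NormalizationFormsDemo {
    static String nfd(String s)  { return Normalizer.normalize(s, Normalizer.Form.NFD); }
    static String nfc(String s)  { return Normalizer.normalize(s, Normalizer.Form.NFC); }
    static String nfkd(String s) { return Normalizer.normalize(s, Normalizer.Form.NFKD); }
    static String nfkc(String s) { return Normalizer.normalize(s, Normalizer.Form.NFKC); }

    public static void main(String[] args) {
        String s = "\uFB01\u00E9"; // LATIN SMALL LIGATURE FI + LATIN SMALL LETTER E WITH ACUTE

        // Canonical decomposition leaves the ligature alone and splits the accented letter.
        System.out.println(nfd(s).equals("\uFB01e\u0301"));
        // Compatibility decomposition also expands the ligature to "fi".
        System.out.println(nfkd(s).equals("fie\u0301"));
        // Composing after a decomposition yields the composed forms.
        System.out.println(nfc(nfd(s)).equals(nfc(s)));
        System.out.println(nfc(nfkd(s)).equals(nfkc(s)));
    }
}
```

The last two checks mirror the NFC and NFKC rows of the list: the composition step is the same in both cases, only the preceding decomposition differs.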
The forms and their behaviors are defined in UAX #15:
http://www.unicode.org/reports/tr15/tr15-43.html
The implementations should be lazy. If possible, they should be implemented in code
rather than data tables, possibly with one class per operation type per Unicode version,
so that only the necessary transformations are loaded and initialized. The code could
potentially be generated from tables and rules by a Maven plugin or annotation processor
(see https://github.com/jdeparser/jdeparser2 for one option).
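
To make the "lazy, in code rather than data tables" idea concrete, here is a rough sketch of what a generated canonical decomposer might look like. Everything in it is an assumption for illustration: the class name, the use of java.util.PrimitiveIterator.OfInt as a stand-in for {CodePointIterator}, and the four hand-written mappings standing in for generated code. It deliberately omits canonical reordering of combining marks and Hangul decomposition, both of which a real NFD implementation needs.

```java
import java.util.PrimitiveIterator;

// Hypothetical sketch: a lazy canonical decomposer whose mappings live in code.
final class LazyDecomposeSketch {

    // Stand-in for generated code; only a handful of real canonical mappings are listed.
    static int[] canonicalMapping(int cp) {
        switch (cp) {
            case 0x00C5: return new int[] {0x0041, 0x030A}; // A WITH RING ABOVE
            case 0x00E9: return new int[] {0x0065, 0x0301}; // E WITH ACUTE
            case 0x00F1: return new int[] {0x006E, 0x0303}; // N WITH TILDE
            case 0x212B: return new int[] {0x00C5};         // ANGSTROM SIGN, decomposed further
            default:     return null;                       // no decomposition in this sketch
        }
    }

    // Wraps a code point source; nothing is decomposed until the caller pulls a value.
    static PrimitiveIterator.OfInt decomposeCanonical(PrimitiveIterator.OfInt source) {
        return new PrimitiveIterator.OfInt() {
            private final int[] pending = new int[8]; // small stack of not-yet-emitted code points
            private int depth = 0;

            @Override
            public boolean hasNext() {
                return depth > 0 || source.hasNext();
            }

            @Override
            public int nextInt() {
                int cp = depth > 0 ? pending[--depth] : source.nextInt();
                int[] mapping;
                // Expand recursively: keep the first code point, stack the rest in reverse.
                while ((mapping = canonicalMapping(cp)) != null) {
                    for (int i = mapping.length - 1; i > 0; i--) {
                        pending[depth++] = mapping[i];
                    }
                    cp = mapping[0];
                }
                return cp;
            }
        };
    }

    // Convenience helper that drains the lazy iterator into a String.
    static String decompose(String s) {
        StringBuilder sb = new StringBuilder();
        PrimitiveIterator.OfInt it = decomposeCanonical(s.codePoints().iterator());
        while (it.hasNext()) {
            sb.appendCodePoint(it.nextInt());
        }
        return sb.toString();
    }
}
```

Because the mappings are a switch statement rather than a loaded table, a class like this costs nothing until it is first used. Adding canonical reordering would require buffering combining marks until the next starter, so the real implementation could only be lazy at that granularity.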
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)