[keycloak-dev] Support UTF-8 encoding theme message properties

Hiroyuki Wada wadahiro at gmail.com
Wed Jul 6 06:30:35 EDT 2016


On Wed, Jul 6, 2016 at 6:53 PM, Thomas Raehalme
<thomas.raehalme at aitiofinland.com> wrote:
> From the javadoc of java.util.ResourceBundle the encoding for .properties is
> still expected to be ISO-8859-1:
>
>> Constructing a PropertyResourceBundle instance from an InputStream
>> requires that the input stream be encoded in ISO-8859-1. In that case,
>> characters that cannot be represented in ISO-8859-1 encoding must be
>> represented by Unicode Escapes as defined in section 3.3 of The Java™
>> Language Specification whereas the other constructor which takes a Reader
>> does not have that limitation.

That restriction applies only to the constructor which takes an InputStream.
The same javadoc says the other constructor, which takes a Reader, does not
have that limitation, so we can use any encoding with it.
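
To illustrate the Reader-based constructor, here is a minimal sketch. The
class name Utf8BundleExample and the helper loadUtf8 are made up for this
mail and are not Keycloak code; only PropertyResourceBundle(Reader) and
InputStreamReader come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;

// Hypothetical example class, not part of Keycloak.
final class Utf8BundleExample {

    // Decodes the stream as UTF-8 and passes the Reader to the
    // PropertyResourceBundle constructor added in Java 1.6.
    static PropertyResourceBundle loadUtf8(InputStream in) throws IOException {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader);
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulates a UTF-8 encoded messages file containing a Japanese value.
        // The escapes below only keep this source file ASCII; the bytes
        // produced by getBytes are raw UTF-8, with no escapes in them.
        String expected = "\u30ed\u30b0\u30a4\u30f3";
        byte[] utf8 = ("loginTitle=" + expected + "\n").getBytes(StandardCharsets.UTF_8);

        PropertyResourceBundle bundle = loadUtf8(new ByteArrayInputStream(utf8));
        System.out.println(expected.equals(bundle.getString("loginTitle"))); // true
    }
}
```

Passing the same bytes to the InputStream constructor would decode them as
ISO-8859-1 and garble the value, which is why native2ascii is needed today.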

I understand your concern. What do you think about supporting UTF-8 with
an encoding header, as Stian suggested?
I think it would avoid confusion because the encoding is stated in the
file itself.
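
A rough sketch of how such a header could be handled, only to make the idea
concrete. The class EncodingHeaderExample, the methods detectCharset and
load, and the exact header pattern are my assumptions, not an agreed design;
a UTF-8 BOM before the header is not handled here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class, not part of Keycloak.
final class EncodingHeaderExample {

    // Assumed header format: "# encoding=<charset>" on the first line.
    private static final Pattern HEADER = Pattern.compile(
            "^#\\s*encoding\\s*=\\s*([A-Za-z0-9][A-Za-z0-9._-]*)\\s*$",
            Pattern.CASE_INSENSITIVE);

    // Returns the charset named in the header, or ISO-8859-1 when the
    // header is missing or names an unsupported charset.
    static Charset detectCharset(byte[] content) {
        int end = 0;
        while (end < content.length && content[end] != '\n' && content[end] != '\r') {
            end++;
        }
        // The header itself is ASCII, so ISO-8859-1 decodes it safely.
        String firstLine = new String(content, 0, end, StandardCharsets.ISO_8859_1);
        Matcher m = HEADER.matcher(firstLine);
        if (m.matches() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        return StandardCharsets.ISO_8859_1;
    }

    // Loads the properties with the detected charset. The header line
    // starts with '#', so Properties skips it as a comment.
    static Properties load(byte[] content) throws IOException {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(content), detectCharset(content))) {
            props.load(reader);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        String japanese = "\u3053\u3093\u306b\u3061\u306f";
        byte[] withHeader = ("# encoding=utf-8\ngreeting=" + japanese + "\n")
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(japanese.equals(load(withHeader).getProperty("greeting"))); // true

        // No header: an existing ISO-8859-1 file keeps working unchanged.
        String french = "\u00e0 bient\u00f4t";
        byte[] legacy = ("greeting=" + french + "\n").getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(french.equals(load(legacy).getProperty("greeting"))); // true
    }
}
```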

Regards,

> https://docs.oracle.com/javase/8/docs/api/java/util/PropertyResourceBundle.html
>
> I understand that Spring allows one to read .properties using any encoding,
> but you need to specify the encoding in Spring configuration external to the
> .properties file.
>
> In my opinion it would cause unnecessary confusion amongst developers to use
> any other encoding than the one defined by the official documentation for
> Properties.
>
> Best regards,
> Thomas
>
>
> On Wed, Jul 6, 2016 at 12:40 PM, Hiroyuki Wada <wadahiro at gmail.com> wrote:
>>
>> I think that was true before Java 1.6, because the Java standard library
>> (java.util.ResourceBundle and java.util.Properties) could only read
>> ISO-8859-1 encoded files.
>> Since Java 1.6, however, they can read any encoding through a Reader.
>> There are also frameworks which can read .properties in any encoding;
>> Spring Framework is one example. I think reading UTF-8 directly is very
>> useful for countries whose languages use multibyte characters.
>>
>> On Wed, Jul 6, 2016 at 3:32 PM, Thomas Raehalme
>> <thomas.raehalme at aitiofinland.com> wrote:
>> > Please correct me if I am wrong, but I have been under the impression that
>> > Java .properties files should always use encoding ISO-8859-1. Characters not
>> > present in ISO-8859-1 can be written as \uxxxx. Won't it make things
>> > confusing to developers if another encoding is used here instead?
>> >
>> >
>> > https://docs.oracle.com/javase/8/docs/api/java/util/Properties.html#load-java.io.InputStream-
>> >
>> > https://docs.oracle.com/javase/8/docs/api/java/util/Properties.html#store-java.io.OutputStream-java.lang.String-
>> >
>> > If alternate encodings are desired, how about supporting the XML format of
>> > Properties?
>> >
>> > Best regards,
>> > Thomas
>> >
>> >
>> > On Wed, Jul 6, 2016 at 9:04 AM, Stian Thorgersen <sthorger at redhat.com> wrote:
>> >>
>> >> Both iso-8859-1 and utf-8 message bundles should be able to co-exist.
>> >>
>> >> We can allow specifying the encoding in a comment on the first line like
>> >> this:
>> >>
>> >> # encoding=utf-8
>> >> key=value
>> >>
>> >> # encoding=iso-8859-1
>> >> key=value
>> >>
>> >> If the first line in the file doesn't contain the comment with the
>> >> encoding then we should default to iso-8859-1 for backwards compatibility.
>> >>
>> >> On 5 July 2016 at 09:49, Hiroyuki Wada <wadahiro at gmail.com> wrote:
>> >>>
>> >>> Thanks for your comment.
>> >>>
>> >>> > If we want to change to utf-8 we'd still need to support iso.. for
>> >>> > backwards compatibility.
>> >>>
>> >>> If we change to UTF-8, we can still read Unicode escapes like '\u00e8'.
>> >>> The incompatibility arises only when non-ASCII characters are written
>> >>> directly in the message properties, that is, the ISO-8859-1 codes
>> >>> 0xA0 - 0xFF (please refer to the codepage layout:
>> >>> https://en.wikipedia.org/wiki/ISO/IEC_8859-1#Codepage_layout )
>> >>>
>> >>> I think such non-ASCII characters might be used in French messages, for
>> >>> example 'à', so I agree we should support ISO-8859-1 for backwards
>> >>> compatibility. To support this, I think we can add a property like
>> >>> "messageEncoding" to keycloak-server.json as below. Is it a good idea?
>> >>>
>> >>>     "theme": {
>> >>>         "staticMaxAge": 2592000,
>> >>>         "cacheTemplates": true,
>> >>>         "cacheThemes": true,
>> >>>         "messageEncoding": "ISO-8859-1",
>> >>>         "folder": {
>> >>>             "dir": "${jboss.home.dir}/themes"
>> >>>         }
>> >>>     },
>> >>>
>> >>> Regards,
>> >>>
>> >>> On Mon, Jul 4, 2016 at 10:33 PM, Stian Thorgersen <sthorger at redhat.com>
>> >>> wrote:
>> >>> > We have in the past discussed this and decided to stick with ISO-8859-1.
>> >>> > That was probably not the best idea though. If we want to change to utf-8
>> >>> > we'd still need to support iso.. for backwards compatibility.
>> >>> >
>> >>> > On 4 July 2016 at 14:37, Bruno Oliveira <bruno at abstractj.org> wrote:
>> >>> >>
>> >>> >> It makes sense, maybe file a Jira associated with:
>> >>> >> https://issues.jboss.org/browse/KEYCLOAK-3259 ?
>> >>> >>
>> >>> >> On 2016-07-04, Hiroyuki Wada wrote:
>> >>> >> > Hello all,
>> >>> >> >
>> >>> >> > I am trying to translate all base theme messages into my country's
>> >>> >> > language, Japanese, and I'd like to contribute them. Before starting
>> >>> >> > that work, I'd like to make a proposal about the file encoding.
>> >>> >> >
>> >>> >> > Currently, the message files (*.properties) are loaded with ISO-8859-1
>> >>> >> > encoding. Therefore, it is necessary to convert the files with the
>> >>> >> > 'native2ascii' command beforehand. However, we can read property
>> >>> >> > files with UTF-8 encoding directly in Java 1.6 or later, because the
>> >>> >> > 'java.util.Properties#load(java.io.Reader)' method was introduced, as
>> >>> >> > described below.
>> >>> >> >
>> >>> >> > http://docs.oracle.com/javase/6/docs/api/java/util/Properties.html#load(java.io.Reader)
>> >>> >> >
>> >>> >> > So, my proposal is to support message files with UTF-8 encoding. I
>> >>> >> > believe that it's very friendly to developers and customers. In
>> >>> >> > addition, we can easily review the translated messages in the GitHub
>> >>> >> > pull request view and so on. What do you think?
>> >>> >> >
>> >>> >> > If it's ok, I'll create a JIRA issue and create a pull request.
>> >>> >> >
>> >>> >> > Regards,
>> >>> >> >
>> >>> >> > --
>> >>> >> > Hiroyuki Wada,
>> >>> >> > Developer,
>> >>> >> > Nomura Research Institute, Ltd.
>> >>> >> > _______________________________________________
>> >>> >> > keycloak-dev mailing list
>> >>> >> > keycloak-dev at lists.jboss.org
>> >>> >> > https://lists.jboss.org/mailman/listinfo/keycloak-dev
>> >>> >>
>> >>> >> --
>> >>> >>
>> >>> >> abstractj
>> >>> >> PGP: 0x84DC9914
>> >>> >
>> >>> >
>> >>
>> >>
>> >>
>> >
>> >
>> >
>
>
>
>


