Took a look at this yesterday and could confirm the problem. Seems to be isolated to MySQL as it works fine on H2 and PostgreSQL at least.

Can you create a JIRA for it please?



On 16 August 2016 at 15:38, Igor Zuk <igor.zuk@qualitytaskforce.com> wrote:

Hi Stian

 

I hope you had a good time during holiday. Your autoresponse asked me to resend you the email when you’re back, so here it is.

 

From: Igor Zuk
Sent: Wednesday, August 03, 2016 4:37 PM
To: 'stian@redhat.com' <stian@redhat.com>
Cc: keycloak-user@lists.jboss.org
Subject: RE: [keycloak-user] Keycloak user data encoding

 

Sorry for a delayed response, I had to temporarily suspend the investigation.

 

 

Yes, I can reproduce the issue anytime using Docker.

 

I set up as default as possible environment with latest MySQL:

docker run --name mysql -e MYSQL_ROOT_PASSWORD=root -e MYSQL_DATABASE=keycloak -e MYSQL_USER=keycloak -e MYSQL_PASSWORD=keycloak -P -d mysql:latest

and Keycloak in the same version as where I found the issue, 1.9.2.Final:

docker run --name keycloak -e MYSQL_DATABASE=keycloak -e MYSQL_USER=keycloak -e MYSQL_PASSWORD=keycloak -e MYSQL_PORT_3306_TCP_ADDR=192.168.99.100 -e MYSQL_PORT_3306_TCP_PORT=32779 -P -d jboss/keycloak-mysql:1.9.2.Final

 

The results were identical, special letters in names were replaced with question marks. It turned out, that Keycloak created all its tables with the DB's default encoding, latin1 (ISO-8859-1). I've checked it with a query:

SELECT character_set_name FROM information_schema.`COLUMNS` WHERE table_name = "USER_ENTITY" AND column_name = "FIRST_NAME";

 

Once again I've manually changed the encoding of a single column:

ALTER TABLE `USER_ENTITY` MODIFY `FIRST_NAME` VARCHAR(255) CHARACTER SET utf8;

It worked, but Keycloak was still putting there names with question marks, so the issue was fully reproduced.

 

 

Because it seems, that Keycloak uses the DB's default encoding, I tried changing it in MySQL. I've removed its container entirely and started it with two parameters appended to Docker run command:

--character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci

It seems, that it worked, but then Keycloak refused to start at all throwing exception with error:

Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs [Failed SQL: ALTER TABLE keycloak.REALM MODIFY CERTIFICATE VARCHAR(4000)]

It looks that Keycloak is not able to work with UTF-8 in databases at all! The full startup log is here: http://pastebin.com/VMTARqgF

 

 

Because 1.9.2.Final is quite dated, I've checked the latest available MySQL-preconfigured version, 2.0.0.Final. I've repeated all the steps and the results were identical.

 

 

The example name I'm working with is Mściwy Żółw. Only the letter 'ó' is working, that's because it's encodable in ISO-8859-1.

 

 

From: Stian Thorgersen [mailto:sthorger@redhat.com]
Sent: Friday, July 15, 2016 6:48 AM
To: Igor Zuk <igor.zuk@qualitytaskforce.com>
Cc: keycloak-user@lists.jboss.org


Subject: Re: [keycloak-user] Keycloak user data encoding

 

It's strange that no one else has reported this. We had several people report the issue with umlats, but no one else seems to have the issue with the database encoding. Maybe there's something different with your database config? Could you try with a default MySQL database installation and see if you can reproduce the issue? Also, can you give me a sample name that shows the problems.

 

I added a test for umlats to registration and account management, see https://github.com/keycloak/keycloak/pull/3036. Once it's in I'll schedule a run with CI, which tests with a range of different databases.

 

On 12 July 2016 at 16:13, Igor Zuk <igor.zuk@qualitytaskforce.com> wrote:

Thank you for a quick response.

 

I’m using 1.9.2.Final and the problem is a bit different, it’s not limited to registration screen.

 

I’m saying, that ISO-8859-1 is the default encoding, because all the text columns in USER_ENTITY table had encoding latin1. The table was created completely by Keycloak as the database was empty in the beginning. I manually switched encoding of FIRST_NAME to UTF-8 and modified it so it contained special letters. I started the user editor in Keycloak admin console and this name was displayed correctly. I added a single character to it, saved, and then the name got messed up with question marks instead of all special characters.

 

From: Stian Thorgersen [mailto:sthorger@redhat.com]
Sent: Tuesday, July 12, 2016 3:43 PM
To: Stian Thorgersen <
stian@redhat.com>
Cc: Igor Zuk <
igor.zuk@qualitytaskforce.com>; keycloak-user@lists.jboss.org
Subject: Re: [keycloak-user] Keycloak user data encoding

 

By the way this was fixed in 1.6.0.Final, see https://issues.jboss.org/browse/KEYCLOAK-1830?jql=project%20%3D%20KEYCLOAK%20AND%20text%20~%20%22encoding%22

 

Are you using an old version?

 

On 12 July 2016 at 15:37, Stian Thorgersen <sthorger@redhat.com> wrote:

Why are you saying the default encoding is ISO-8859-1? All forms are encoded as UTF-8 and all strings passed to the database should be UTF-8 encoded as well.

 

The only thing that is ISO-8859-1 is the message properties, but those are converted to UTF-8 when added to HTML pages.

 

On 12 July 2016 at 14:58, Igor Zuk <igor.zuk@qualitytaskforce.com> wrote:

Hi

 

I have an encoding problem. By default users' data fields (e.g. first name and last name) are encoded using ISO-8859-1. People from many countries can't properly create accounts as their personal data is silently messed up. How can I fix it?

·         The MySQL DB receives already damaged names. By default all columns are ISO-8859-1-encoded, but manually converting them to UTF-8 doesn't help.

·         Manual account modification from admin console has same effect.

·         Change of default server (Wildfly) encoding to UTF-8 doesn't do anything.

 

Best regards

Igor Żuk

 

_______________________________________________
keycloak-user mailing list
keycloak-user@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/keycloak-user