[keycloak-user] Keycloak user data encoding

Igor Zuk igor.zuk at qualitytaskforce.com
Wed Aug 3 10:36:46 EDT 2016


Sorry for a delayed response, I had to temporarily suspend the investigation.


Yes, I can reproduce the issue anytime using Docker.

I set up as default as possible environment with latest MySQL:
docker run --name mysql -e MYSQL_ROOT_PASSWORD=root -e MYSQL_DATABASE=keycloak -e MYSQL_USER=keycloak -e MYSQL_PASSWORD=keycloak -P -d mysql:latest
and Keycloak in the same version as where I found the issue, 1.9.2.Final:
docker run --name keycloak -e MYSQL_DATABASE=keycloak -e MYSQL_USER=keycloak -e MYSQL_PASSWORD=keycloak -e MYSQL_PORT_3306_TCP_ADDR=192.168.99.100 -e MYSQL_PORT_3306_TCP_PORT=32779 -P -d jboss/keycloak-mysql:1.9.2.Final

The results were identical, special letters in names were replaced with question marks. It turned out, that Keycloak created all its tables with the DB's default encoding, latin1 (ISO-8859-1). I've checked it with a query:
SELECT character_set_name FROM information_schema.`COLUMNS` WHERE table_name = "USER_ENTITY" AND column_name = "FIRST_NAME";

Once again I've manually changed the encoding of a single column:
ALTER TABLE `USER_ENTITY` MODIFY `FIRST_NAME` VARCHAR(255) CHARACTER SET utf8;
It worked, but Keycloak was still putting there names with question marks, so the issue was fully reproduced.


Because it seems, that Keycloak uses the DB's default encoding, I tried changing it in MySQL. I've removed its container entirely and started it with two parameters appended to Docker run command:
--character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
It seems, that it worked, but then Keycloak refused to start at all throwing exception with error:
Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs [Failed SQL: ALTER TABLE keycloak.REALM MODIFY CERTIFICATE VARCHAR(4000)]
It looks that Keycloak is not able to work with UTF-8 in databases at all! The full startup log is here: http://pastebin.com/VMTARqgF.


Because 1.9.2.Final is quite dated, I've checked the latest available MySQL-preconfigured version, 2.0.0.Final. I've repeated all the steps and the results were identical.


The example name I'm working with is Mściwy Żółw. Only the letter 'ó' is working, that's because it's encodable in ISO-8859-1.


From: Stian Thorgersen [mailto:sthorger at redhat.com]
Sent: Friday, July 15, 2016 6:48 AM
To: Igor Zuk <igor.zuk at qualitytaskforce.com>
Cc: keycloak-user at lists.jboss.org
Subject: Re: [keycloak-user] Keycloak user data encoding

It's strange that no one else has reported this. We had several people report the issue with umlats, but no one else seems to have the issue with the database encoding. Maybe there's something different with your database config? Could you try with a default MySQL database installation and see if you can reproduce the issue? Also, can you give me a sample name that shows the problems.

I added a test for umlats to registration and account management, see https://github.com/keycloak/keycloak/pull/3036. Once it's in I'll schedule a run with CI, which tests with a range of different databases.

On 12 July 2016 at 16:13, Igor Zuk <igor.zuk at qualitytaskforce.com<mailto:igor.zuk at qualitytaskforce.com>> wrote:
Thank you for a quick response.

I’m using 1.9.2.Final and the problem is a bit different, it’s not limited to registration screen.

I’m saying, that ISO-8859-1 is the default encoding, because all the text columns in USER_ENTITY table had encoding latin1. The table was created completely by Keycloak as the database was empty in the beginning. I manually switched encoding of FIRST_NAME to UTF-8 and modified it so it contained special letters. I started the user editor in Keycloak admin console and this name was displayed correctly. I added a single character to it, saved, and then the name got messed up with question marks instead of all special characters.

From: Stian Thorgersen [mailto:sthorger at redhat.com<mailto:sthorger at redhat.com>]
Sent: Tuesday, July 12, 2016 3:43 PM
To: Stian Thorgersen <stian at redhat.com<mailto:stian at redhat.com>>
Cc: Igor Zuk <igor.zuk at qualitytaskforce.com<mailto:igor.zuk at qualitytaskforce.com>>; keycloak-user at lists.jboss.org<mailto:keycloak-user at lists.jboss.org>
Subject: Re: [keycloak-user] Keycloak user data encoding

By the way this was fixed in 1.6.0.Final, see https://issues.jboss.org/browse/KEYCLOAK-1830?jql=project%20%3D%20KEYCLOAK%20AND%20text%20~%20%22encoding%22

Are you using an old version?

On 12 July 2016 at 15:37, Stian Thorgersen <sthorger at redhat.com<mailto:sthorger at redhat.com>> wrote:
Why are you saying the default encoding is ISO-8859-1? All forms are encoded as UTF-8 and all strings passed to the database should be UTF-8 encoded as well.

The only thing that is ISO-8859-1 is the message properties, but those are converted to UTF-8 when added to HTML pages.

On 12 July 2016 at 14:58, Igor Zuk <igor.zuk at qualitytaskforce.com<mailto:igor.zuk at qualitytaskforce.com>> wrote:
Hi

I have an encoding problem. By default users' data fields (e.g. first name and last name) are encoded using ISO-8859-1. People from many countries can't properly create accounts as their personal data is silently messed up. How can I fix it?

•         The MySQL DB receives already damaged names. By default all columns are ISO-8859-1-encoded, but manually converting them to UTF-8 doesn't help.

•         Manual account modification from admin console has same effect.

•         Change of default server (Wildfly) encoding to UTF-8 doesn't do anything.

Best regards
Igor Żuk

_______________________________________________
keycloak-user mailing list
keycloak-user at lists.jboss.org<mailto:keycloak-user at lists.jboss.org>
https://lists.jboss.org/mailman/listinfo/keycloak-user



-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/keycloak-user/attachments/20160803/a0ac9e93/attachment-0001.html 


More information about the keycloak-user mailing list