[keycloak-dev] Potential performance regression between 3.4.3.Final and 4.3.0.Final
Schuster Sebastian (INST-CSS/BSV-OS)
Sebastian.Schuster at bosch-si.com
Tue Sep 4 03:50:34 EDT 2018
Hi Hynek,
I did a little testing. In my setup I have a local MS SQL DB with 170.495 users and 18.663.024 user attributes. That’s in the order of our targeted production setup.
Without your change, listing users basically times out, I never get a response.
With your change not fetching the user attributes I get an answer after 150 ms.
I also tested what happens if I change the fetchtype to “select” of all one-to-many relations in UserEntity with a batch size of 20 (just an initial guess as that’s the default page size in the UI).
Then I get the list of users in 1100ms instead of timing out (after setting briefRepresentation to false).
I read a little bit in the Hibernate documentation and they wrote that fetchmode.subselect loads all entities into the session and only makes sense when nearly all entities should be loaded anyways. I’d guess that’s mostly not true for Users in Keycloak. Also not for Resources, the only other place I found a subselect relation.
I would conclude going back to fetchtype.select with a batch size of 20 is definitely superior to the subselect that’s currently used, independently of your change.
It’s just 6 lines of code change. ☺ I could prepare a pull request for that but I have to go through some nasty processes on our side so that might take a while…
Best regards,
Sebastian
Mit freundlichen Grüßen / Best regards
Dr.-Ing. Sebastian Schuster
Engineering and Support (INST/ESY1)
Bosch Software Innovations GmbH | Ullsteinstr. 128 | 12109 Berlin | GERMANY | www.bosch-si.com<http://www.bosch-si.com>
Tel. +49 30 726112-485 | Fax +49 30 726112-100 | Sebastian.Schuster at bosch-si.com<mailto:Sebastian.Schuster at bosch-si.com>
Sitz: Berlin, Registergericht: Amtsgericht Charlottenburg; HRB 148411 B
Aufsichtsratsvorsitzender: Dr.-Ing. Thorsten Lücke; Geschäftsführung: Dr. Stefan Ferber, Michael Hahn
From: Hynek Mlnarik <hmlnarik at redhat.com>
Sent: Freitag, 31. August 2018 18:38
To: Thomas Darimont <thomas.darimont at googlemail.com>
Cc: Schuster Sebastian (INST/ESY1) <Sebastian.Schuster at bosch-si.com>; Stian Thorgersen <stian at redhat.com>; keycloak-dev <keycloak-dev at lists.jboss.org>
Subject: Re: [keycloak-dev] Potential performance regression between 3.4.3.Final and 4.3.0.Final
Hi Thomas,
the issue with repeating a query should be addressed by [1]. Out of curiosity, I wonder whether you'd have some spare time to test this change in your environment?
Thanks
--Hynek
[1] https://github.com/keycloak/keycloak/pull/5525
On Thu, Aug 30, 2018 at 12:58 PM Hynek Mlnarik <hmlnarik at redhat.com<mailto:hmlnarik at redhat.com>> wrote:
Hi Sebastian, Thomas,
I agree that this query is bad performance-wise. It is caused inability of database to easily limit rows in the resultset of query that joins users and attributes (and other tables) if there might are several attributes per user while we want to return exactly n users (the LIMIT/OFFSET won't work here for parent table due to joining with child records). Hibernate thus resorts to its own filtering of the full resultset. That is one option, the other is having subsequent selects and limiting batch size. Both approaches have pros and cons, and actually the subsequent selects were changed to subselects precisely because of performance. Looks like we might need to revert back to the select approach and tweak batch size. That will require some time for optimizations.
For now, the most painful issue with user list seems resolved and thus I'm merging the PR. There will be subsequent work though on further optimizations to user / user list retrieval, that relates to both your suggestion and Thomas's comment (that relates to authz resource server store caching). Thank you both for your help!
--Hynek
On Thu, Aug 30, 2018 at 12:06 PM Schuster Sebastian (INST/ESY1) <Sebastian.Schuster at bosch-si.com<mailto:Sebastian.Schuster at bosch-si.com>> wrote:
Hi Thomas,
Your query is the same as before, it is the initial retrieval of the user entity. The query I am referring to is the one loading the user attributes after the user entity has been fetched as part of ModelToRepresentation.toRepresentation(KeycloakSession session, RealmModel realm, UserModel user). It just looks wrong to me that all users are fetched when only the attributes of a single user are needed. However, with Hynek’s change, the user attributes are no longer fetched because only a BriefRepresentation is requested, that’s why it is faster now. I still think a full representation should also be faster and not fetch all users from a realm.
Best regards,
Sebastian
Mit freundlichen Grüßen / Best regards
Dr.-Ing. Sebastian Schuster
Engineering and Support (INST/ESY1)
Bosch Software Innovations GmbH | Ullsteinstr. 128 | 12109 Berlin | GERMANY | www.bosch-si.com<http://www.bosch-si.com>
Tel. +49 30 726112-485 | Fax +49 30 726112-100 | Sebastian.Schuster at bosch-si.com<mailto:Sebastian.Schuster at bosch-si.com>
Sitz: Berlin, Registergericht: Amtsgericht Charlottenburg; HRB 148411 B
Aufsichtsratsvorsitzender: Dr.-Ing. Thorsten Lücke; Geschäftsführung: Dr. Stefan Ferber, Michael Hahn
From: Thomas Darimont <thomas.darimont at googlemail.com<mailto:thomas.darimont at googlemail.com>>
Sent: Mittwoch, 29. August 2018 22:55
To: Schuster Sebastian (INST/ESY1) <Sebastian.Schuster at bosch-si.com<mailto:Sebastian.Schuster at bosch-si.com>>
Cc: Hynek Mlnarik <hmlnarik at redhat.com<mailto:hmlnarik at redhat.com>>; Stian Thorgersen <stian at redhat.com<mailto:stian at redhat.com>>; keycloak-dev <keycloak-dev at lists.jboss.org<mailto:keycloak-dev at lists.jboss.org>>
Subject: Re: [keycloak-dev] Potential performance regression between 3.4.3.Final and 4.3.0.Final
Hello Hynek et al.,
I gave your PR 5517 a quick spin and tested the user listing in the admin-console on my Dell XPS (3.4GHz, 32g) laptop with local PostgreSQL backend.
I can confirm that the response times for loading a single page in the user-listing are much better now with your fix:
In a test realm with 10k users, loading a single page takes ~15ms compared to 8 seconds (!) before.
The new query that is now generated whilst browsing the user list looks like this:
Hibernate:
select
userentity0_.ID as ID1_75_,
userentity0_.CREATED_TIMESTAMP as CREATED_2_75_,
userentity0_.EMAIL as EMAIL3_75_,
userentity0_.EMAIL_CONSTRAINT as EMAIL_CO4_75_,
userentity0_.EMAIL_VERIFIED as EMAIL_VE5_75_,
userentity0_.ENABLED as ENABLED6_75_,
userentity0_.FEDERATION_LINK as FEDERATI7_75_,
userentity0_.FIRST_NAME as FIRST_NA8_75_,
userentity0_.LAST_NAME as LAST_NAM9_75_,
userentity0_.NOT_BEFORE as NOT_BEF10_75_,
userentity0_.REALM_ID as REALM_I11_75_,
userentity0_.SERVICE_ACCOUNT_CLIENT_LINK as SERVICE12_75_,
userentity0_.USERNAME as USERNAM13_75_
from
USER_ENTITY userentity0_
where
userentity0_.REALM_ID=?
and (
userentity0_.SERVICE_ACCOUNT_CLIENT_LINK is null
)
order by
userentity0_.USERNAME limit ? offset ?
The query looks reasonable with appropriate limits applied.
However, it seems that policies are fetched for each returned user via a dedicated query, e.g.:
select
resourcese0_.ID as ID1_64_0_,
resourcese0_.ALLOW_RS_REMOTE_MGMT as ALLOW_RS2_64_0_,
resourcese0_.POLICY_ENFORCE_MODE as POLICY_E3_64_0_
from
RESOURCE_SERVER resourcese0_
where
resourcese0_.ID=?
Would it be possible to load those in one batch?
Cheers,
Thomas
Am Mi., 29. Aug. 2018 um 17:19 Uhr schrieb Schuster Sebastian (INST/ESY1) <Sebastian.Schuster at bosch-si.com<mailto:Sebastian.Schuster at bosch-si.com>>:
Thanks Hynek, I’ll test this. I also tracked the issue down to the ModelToRepresentation.toRepresentation(KeycloakSession session, RealmModel realm, UserModel user) method. However, what I am also seeing is when the method is running, it is emitting SQL queries that are not scoped down to the user id of the user being transformed. E.g. for the user attributes of a user the following query is generated:
select
attributes0_.USER_ID as USER_ID4_72_1_,
attributes0_.ID as ID1_72_1_,
attributes0_.ID as ID1_72_0_,
attributes0_.NAME as NAME2_72_0_,
attributes0_.USER_ID as USER_ID4_72_0_,
attributes0_.VALUE as VALUE3_72_0_
from
USER_ATTRIBUTE attributes0_
where
attributes0_.USER_ID in (
select
userentity0_.ID
from
USER_ENTITY userentity0_
where
userentity0_.REALM_ID=?
and (
userentity0_.SERVICE_ACCOUNT_CLIENT_LINK is null
)
)
The subselect returns all real users from the realm. It is neither paged nor bound to a specific user id. The same happens in the above method for credentials and required actions. I am not a JPA/Hibernate expert so I might be missing something here…
Best regards,
Sebastian
Mit freundlichen Grüßen / Best regards
Dr.-Ing. Sebastian Schuster
Engineering and Support (INST/ESY1)
Bosch Software Innovations GmbH | Ullsteinstr. 128 | 12109 Berlin | GERMANY | www.bosch-si.com<http://www.bosch-si.com>
Tel. +49 30 726112-485 | Fax +49 30 726112-100 | Sebastian.Schuster at bosch-si.com<mailto:Sebastian.Schuster at bosch-si.com>
Sitz: Berlin, Registergericht: Amtsgericht Charlottenburg; HRB 148411 B
Aufsichtsratsvorsitzender: Dr.-Ing. Thorsten Lücke; Geschäftsführung: Dr. Stefan Ferber, Michael Hahn
From: Hynek Mlnarik <hmlnarik at redhat.com<mailto:hmlnarik at redhat.com>>
Sent: Mittwoch, 29. August 2018 16:35
To: Thomas Darimont <thomas.darimont at googlemail.com<mailto:thomas.darimont at googlemail.com>>
Cc: Stian Thorgersen <stian at redhat.com<mailto:stian at redhat.com>>; keycloak-dev <keycloak-dev at lists.jboss.org<mailto:keycloak-dev at lists.jboss.org>>; Schuster Sebastian (INST/ESY1) <Sebastian.Schuster at bosch-si.com<mailto:Sebastian.Schuster at bosch-si.com>>
Subject: Re: [keycloak-dev] Potential performance regression between 3.4.3.Final and 4.3.0.Final
Thanks Thomas and all,
great help with reproducing the issue at least for admin user list endpoint. Could you please test [1] that it indeed fixes the issue? It was caused obtaining too much user details when unnecessary (for user list, only few fields are necessary).
As for increased memory requirements, these might be unrelated to this fix, and we'll keep an eye on this.
--Hynek
[1] https://github.com/keycloak/keycloak/pull/5517
On Mon, Aug 27, 2018 at 4:48 PM Thomas Darimont <thomas.darimont at googlemail.com<mailto:thomas.darimont at googlemail.com>> wrote:
Hello,
FYI I created https://issues.jboss.org/browse/KEYCLOAK-8150
Cheers,
Thomas
Am Mo., 27. Aug. 2018 um 13:33 Uhr schrieb Stian Thorgersen <
sthorger at redhat.com<mailto:sthorger at redhat.com>>:
> Can you open a bug for this in JIRA please and include as much details as
> you can.
>
> On Thu, 23 Aug 2018 at 15:45, Thomas Darimont <
> thomas.darimont at googlemail.com<mailto:thomas.darimont at googlemail.com>> wrote:
>
>> Hi Sebastian,
>>
>> thanks for investigating! Unfortunately, I didn't have time yet to look
>> further into the problem, but I also observed the slow attributes handling.
>> Btw. we have around 100k users and ~700k attributes. The
>> Entity-Attribute-Value data model takes it's toll here...
>>
>> @Stian so far I've only observed this when scrolling through the user
>> overview in the admin-console, which also triggers
>> the mentioned spike in heap usage.
>>
>> Cheers,
>> Thomas
>>
>> Am Do., 23. Aug. 2018 um 15:22 Uhr schrieb Schuster Sebastian (INST/ESY1)
>> <Sebastian.Schuster at bosch-si.com<mailto:Sebastian.Schuster at bosch-si.com>>:
>>
>>> Btw. we also have 40K user attributes...
>>>
>>> Best regards,
>>> Sebastian
>>>
>>> Mit freundlichen Grüßen / Best regards
>>>
>>> Dr.-Ing. Sebastian Schuster
>>>
>>> Engineering and Support (INST/ESY1)
>>> Bosch Software Innovations GmbH | Ullsteinstr. 128 | 12109 Berlin |
>>> GERMANY | www.bosch-si.com<http://www.bosch-si.com>
>>> Tel. +49 30 726112-485 | Fax +49 30 726112-100 |
>>> Sebastian.Schuster at bosch-si.com<mailto:Sebastian.Schuster at bosch-si.com>
>>>
>>> Sitz: Berlin, Registergericht: Amtsgericht Charlottenburg; HRB 148411 B
>>> Aufsichtsratsvorsitzender: Dr.-Ing. Thorsten Lücke; Geschäftsführung:
>>> Dr. Stefan Ferber, Michael Hahn
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: keycloak-dev-bounces at lists.jboss.org<mailto:keycloak-dev-bounces at lists.jboss.org> <
>>> keycloak-dev-bounces at lists.jboss.org<mailto:keycloak-dev-bounces at lists.jboss.org>> On Behalf Of Stian Thorgersen
>>> Sent: Mittwoch, 22. August 2018 21:25
>>> To: Thomas Darimont <thomas.darimont at googlemail.com<mailto:thomas.darimont at googlemail.com>>
>>> Cc: keycloak-dev <keycloak-dev at lists.jboss.org<mailto:keycloak-dev at lists.jboss.org>>
>>> Subject: Re: [keycloak-dev] Potential performance regression between
>>> 3.4.3.Final and 4.3.0.Final
>>>
>>> That's quite worrying. Is it limited to browsing users through the admin
>>> console or are you seeing bad behaviour elsewhere? Same question applies to
>>> the heap. My best bet here is that it has that something has changed around
>>> user querying/caching.
>>>
>>> On Mon, 20 Aug 2018 at 19:18, Thomas Darimont <
>>> thomas.darimont at googlemail.com<mailto:thomas.darimont at googlemail.com>> wrote:
>>>
>>> > Hello Keycloak team,
>>> >
>>> > has anyone encountered some performance issues after upgrading 3.4.3
>>> > to 4.x (4.3.0)?
>>> >
>>> > Today I noticed a performance regression while preparing an upgrade
>>> > from Keycloak 3.4.3.Final to 4.3.0.Final in our staging environment.
>>> >
>>> > In our test environment, we have around ~100k test users stored in a
>>> > postgres-backed database. When we started the server with the new
>>> > Keycloak version, the migration went through, and everything looked
>>> > fine at first, but when we tried to browse the list of users via the
>>> > admin-console, we noticed that the CPU and memory consumption of the
>>> > server increased significantly, up to a point where Keycloak crashed
>>> with an OOME.
>>> >
>>> > All previous Keycloak versions including 3.4.3 were very modest with
>>> > their memory requirements and quite happy with ~1g heap.
>>> > However, that seems to have changed in Keycloak 4.3.0 - there we
>>> > needed at least 4g to prevent Keycloak from crashing with an OOME.
>>> > Furthermore, we noticed that the response times for browsing the
>>> > paginated user view increased significantly as well:
>>> > In Keycloak 3.4.3 the average time to load a user page is ~80ms. In
>>> > Keycloak 4.3.0 (and older versions >= 4.0.0.Beta1) the same operation
>>> > takes
>>> > ~7 seconds for a test realm with just 10k users.
>>> >
>>> > In the test realm with 100k users, the time to load a single page in
>>> > the users listing was 66 seconds for version 4.3.0, on average -
>>> > compared to quite stable 80ms in 3.4.3.
>>> >
>>> > The database query that is executed by Keycloak 4.3.0 runs in ~1.5
>>> > seconds for 100k users, so I assume the processing logic in Keycloak
>>> > is the culprit.
>>> >
>>> > The problem of long load-times can be reproduced with the Keycloak
>>> > docker images and the in-memory database. I also created a small
>>> > example project that creates some users with just a few attributes in
>>> > a docker based 3.4.3 and 4.3.0 Keycloak environment with a Postgres
>>> > database to reproduce the problem.
>>> > https://github.com/thomasdarimont/kc-user-regression-tester
>>> >
>>> > Cheers,
>>> > Thomas
>>> > _______________________________________________
>>> > keycloak-dev mailing list
>>> > keycloak-dev at lists.jboss.org<mailto:keycloak-dev at lists.jboss.org>
>>> > https://lists.jboss.org/mailman/listinfo/keycloak-dev
>>> >
>>> _______________________________________________
>>> keycloak-dev mailing list
>>> keycloak-dev at lists.jboss.org<mailto:keycloak-dev at lists.jboss.org>
>>> https://lists.jboss.org/mailman/listinfo/keycloak-dev
>>>
>>
_______________________________________________
keycloak-dev mailing list
keycloak-dev at lists.jboss.org<mailto:keycloak-dev at lists.jboss.org>
https://lists.jboss.org/mailman/listinfo/keycloak-dev
More information about the keycloak-dev
mailing list