[keycloak-dev] Potential performance regression between 3.4.3.Final and 4.3.0.Final

Hynek Mlnarik hmlnarik at redhat.com
Thu Aug 30 06:58:25 EDT 2018


Hi Sebastian, Thomas,

I agree that this query is bad performance-wise. It is caused inability of
database to easily limit rows in the resultset of query that joins users
and attributes (and other tables) if there might are several attributes per
user while we want to return exactly n users (the LIMIT/OFFSET won't work
here for parent table due to joining with child records). Hibernate thus
resorts to its own filtering of the full resultset. That is one option, the
other is having subsequent selects and limiting batch size. Both approaches
have pros and cons, and actually the subsequent selects were changed to
subselects precisely because of performance. Looks like we might need to
revert back to the select approach and tweak batch size. That will require
some time for optimizations.

For now, the most painful issue with user list seems resolved and thus I'm
merging the PR. There will be subsequent work though on further
optimizations to user / user list retrieval, that relates to both your
suggestion and Thomas's comment (that relates to authz resource server
store caching). Thank you both for your help!

--Hynek


On Thu, Aug 30, 2018 at 12:06 PM Schuster Sebastian (INST/ESY1) <
Sebastian.Schuster at bosch-si.com> wrote:

> Hi Thomas,
>
>
>
> Your query is the same as before, it is the initial retrieval of the user
> entity. The query I am referring to is the one loading the user attributes
> after the user entity has been fetched as part of
> ModelToRepresentation.toRepresentation(KeycloakSession session, RealmModel
> realm, UserModel user). It just looks wrong to me that all users are
> fetched when only the attributes of a single user are needed. However, with
> Hynek’s change, the user attributes are no longer fetched because only a
> BriefRepresentation is requested, that’s why it is faster now. I still
> think a full representation should also be faster and not fetch all users
> from a realm.
>
>
>
> Best regards,
>
> Sebastian
>
>
>
> Mit freundlichen Grüßen / Best regards
>
>
> *Dr.-Ing. Sebastian Schuster *
> Engineering and Support (INST/ESY1)
> Bosch Software Innovations GmbH | Ullsteinstr. 128 | 12109 Berlin |
> GERMANY | www.bosch-si.com
> Tel. +49 30 726112-485 | Fax +49 30 726112-100 |
> Sebastian.Schuster at bosch-si.com
>
> Sitz: Berlin, Registergericht: Amtsgericht Charlottenburg; HRB 148411 B
> Aufsichtsratsvorsitzender: Dr.-Ing. Thorsten Lücke; Geschäftsführung: Dr.
> Stefan Ferber, Michael Hahn
>
>
>
> *From:* Thomas Darimont <thomas.darimont at googlemail.com>
> *Sent:* Mittwoch, 29. August 2018 22:55
> *To:* Schuster Sebastian (INST/ESY1) <Sebastian.Schuster at bosch-si.com>
> *Cc:* Hynek Mlnarik <hmlnarik at redhat.com>; Stian Thorgersen <
> stian at redhat.com>; keycloak-dev <keycloak-dev at lists.jboss.org>
> *Subject:* Re: [keycloak-dev] Potential performance regression between
> 3.4.3.Final and 4.3.0.Final
>
>
>
> Hello Hynek et al.,
>
>
>
> I gave your PR 5517 a quick spin and tested the user listing in the
> admin-console on my Dell XPS (3.4GHz, 32g) laptop with local PostgreSQL
> backend.
>
> I can confirm that the response times for loading a single page in the
> user-listing are much better now with your fix:
>
> In a test realm with 10k users, loading a single page takes ~15ms compared
> to 8 seconds (!) before.
>
>
>
> The new query that is now generated whilst browsing the user list looks
> like this:
>
>
>
> Hibernate:
>
>     select
>
>         userentity0_.ID as ID1_75_,
>
>         userentity0_.CREATED_TIMESTAMP as CREATED_2_75_,
>
>         userentity0_.EMAIL as EMAIL3_75_,
>
>         userentity0_.EMAIL_CONSTRAINT as EMAIL_CO4_75_,
>
>         userentity0_.EMAIL_VERIFIED as EMAIL_VE5_75_,
>
>         userentity0_.ENABLED as ENABLED6_75_,
>
>         userentity0_.FEDERATION_LINK as FEDERATI7_75_,
>
>         userentity0_.FIRST_NAME as FIRST_NA8_75_,
>
>         userentity0_.LAST_NAME as LAST_NAM9_75_,
>
>         userentity0_.NOT_BEFORE as NOT_BEF10_75_,
>
>         userentity0_.REALM_ID as REALM_I11_75_,
>
>         userentity0_.SERVICE_ACCOUNT_CLIENT_LINK as SERVICE12_75_,
>
>         userentity0_.USERNAME as USERNAM13_75_
>
>     from
>
>         USER_ENTITY userentity0_
>
>     where
>
>         userentity0_.REALM_ID=?
>
>         and (
>
>             userentity0_.SERVICE_ACCOUNT_CLIENT_LINK is null
>
>         )
>
>     order by
>
>         userentity0_.USERNAME limit ? offset ?
>
>
>
> The query looks reasonable with appropriate limits applied.
>
>
>
> However, it seems that policies are fetched for each returned user via a
> dedicated query, e.g.:
>
>
>
>     select
>
>         resourcese0_.ID as ID1_64_0_,
>
>         resourcese0_.ALLOW_RS_REMOTE_MGMT as ALLOW_RS2_64_0_,
>
>         resourcese0_.POLICY_ENFORCE_MODE as POLICY_E3_64_0_
>
>     from
>
>         RESOURCE_SERVER resourcese0_
>
>     where
>
>         resourcese0_.ID=?
>
>
>
> Would it be possible to load those in one batch?
>
>
>
> Cheers,
>
> Thomas
>
>
>
> Am Mi., 29. Aug. 2018 um 17:19 Uhr schrieb Schuster Sebastian (INST/ESY1) <
> Sebastian.Schuster at bosch-si.com>:
>
> Thanks Hynek, I’ll test this. I also tracked the issue down to the
> ModelToRepresentation.toRepresentation(KeycloakSession session, RealmModel
> realm, UserModel user) method. However, what I am also seeing is when the
> method is running, it is emitting SQL queries that are not scoped down to
> the user id of the user being transformed. E.g. for the user attributes of
> a user the following query is generated:
>
>
>
> select
>
>         attributes0_.USER_ID as USER_ID4_72_1_,
>
>         attributes0_.ID as ID1_72_1_,
>
>         attributes0_.ID as ID1_72_0_,
>
>         attributes0_.NAME as NAME2_72_0_,
>
>         attributes0_.USER_ID as USER_ID4_72_0_,
>
>         attributes0_.VALUE as VALUE3_72_0_
>
>     from
>
>         USER_ATTRIBUTE attributes0_
>
>     where
>
>         attributes0_.USER_ID in (
>
>             select
>
>                 userentity0_.ID
>
>             from
>
>                 USER_ENTITY userentity0_
>
>             where
>
>                 userentity0_.REALM_ID=?
>
>                 and (
>
>                     userentity0_.SERVICE_ACCOUNT_CLIENT_LINK is null
>
>                 )
>
>         )
>
>
>
> The subselect returns all real users from the realm. It is neither paged
> nor bound to a specific user id. The same happens in the above method for
> credentials and required actions. I am not a JPA/Hibernate expert so I
> might be missing something here…
>
>
>
> Best regards,
>
> Sebastian
>
>
>
> Mit freundlichen Grüßen / Best regards
>
>
> *Dr.-Ing. Sebastian Schuster *
> Engineering and Support (INST/ESY1)
> Bosch Software Innovations GmbH | Ullsteinstr. 128 | 12109 Berlin |
> GERMANY | www.bosch-si.com
> Tel. +49 30 726112-485 | Fax +49 30 726112-100 |
> Sebastian.Schuster at bosch-si.com
>
> Sitz: Berlin, Registergericht: Amtsgericht Charlottenburg; HRB 148411 B
> Aufsichtsratsvorsitzender: Dr.-Ing. Thorsten Lücke; Geschäftsführung: Dr.
> Stefan Ferber, Michael Hahn
>
>
> *From:* Hynek Mlnarik <hmlnarik at redhat.com>
> *Sent:* Mittwoch, 29. August 2018 16:35
> *To:* Thomas Darimont <thomas.darimont at googlemail.com>
> *Cc:* Stian Thorgersen <stian at redhat.com>; keycloak-dev <
> keycloak-dev at lists.jboss.org>; Schuster Sebastian (INST/ESY1) <
> Sebastian.Schuster at bosch-si.com>
> *Subject:* Re: [keycloak-dev] Potential performance regression between
> 3.4.3.Final and 4.3.0.Final
>
>
>
> Thanks Thomas and all,
>
>
>
> great help with reproducing the issue at least for admin user list
> endpoint. Could you please test [1] that it indeed fixes the issue? It was
> caused obtaining too much user details when unnecessary (for user list,
> only few fields are necessary).
>
>
>
> As for increased memory requirements, these might be unrelated to this
> fix, and we'll keep an eye on this.
>
>
>
> --Hynek
>
>
>
> [1] https://github.com/keycloak/keycloak/pull/5517
>
>
>
>
>
>
>
> On Mon, Aug 27, 2018 at 4:48 PM Thomas Darimont <
> thomas.darimont at googlemail.com> wrote:
>
> Hello,
>
> FYI I created https://issues.jboss.org/browse/KEYCLOAK-8150
>
> Cheers,
> Thomas
>
> Am Mo., 27. Aug. 2018 um 13:33 Uhr schrieb Stian Thorgersen <
> sthorger at redhat.com>:
>
> > Can you open a bug for this in JIRA please and include as much details as
> > you can.
> >
> > On Thu, 23 Aug 2018 at 15:45, Thomas Darimont <
> > thomas.darimont at googlemail.com> wrote:
> >
> >> Hi Sebastian,
> >>
> >> thanks for investigating! Unfortunately, I didn't have time yet to look
> >> further into the problem, but I also observed the slow attributes
> handling.
> >> Btw. we have around 100k users and ~700k attributes. The
> >> Entity-Attribute-Value data model takes it's toll here...
> >>
> >> @Stian so far I've only observed this when scrolling through the user
> >> overview in the admin-console, which also triggers
> >> the mentioned spike in heap usage.
> >>
> >> Cheers,
> >> Thomas
> >>
> >> Am Do., 23. Aug. 2018 um 15:22 Uhr schrieb Schuster Sebastian
> (INST/ESY1)
> >> <Sebastian.Schuster at bosch-si.com>:
> >>
> >>> Btw. we also have 40K user attributes...
> >>>
> >>> Best regards,
> >>> Sebastian
> >>>
> >>> Mit freundlichen Grüßen / Best regards
> >>>
> >>> Dr.-Ing.  Sebastian Schuster
> >>>
> >>> Engineering and Support (INST/ESY1)
> >>> Bosch Software Innovations GmbH | Ullsteinstr. 128 | 12109 Berlin |
> >>> GERMANY | www.bosch-si.com
> >>> Tel. +49 30 726112-485 | Fax +49 30 726112-100 |
> >>> Sebastian.Schuster at bosch-si.com
> >>>
> >>> Sitz: Berlin, Registergericht: Amtsgericht Charlottenburg; HRB 148411 B
> >>> Aufsichtsratsvorsitzender: Dr.-Ing. Thorsten Lücke; Geschäftsführung:
> >>> Dr. Stefan Ferber, Michael Hahn
> >>>
> >>>
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: keycloak-dev-bounces at lists.jboss.org <
> >>> keycloak-dev-bounces at lists.jboss.org> On Behalf Of Stian Thorgersen
> >>> Sent: Mittwoch, 22. August 2018 21:25
> >>> To: Thomas Darimont <thomas.darimont at googlemail.com>
> >>> Cc: keycloak-dev <keycloak-dev at lists.jboss.org>
> >>> Subject: Re: [keycloak-dev] Potential performance regression between
> >>> 3.4.3.Final and 4.3.0.Final
> >>>
> >>> That's quite worrying. Is it limited to browsing users through the
> admin
> >>> console or are you seeing bad behaviour elsewhere? Same question
> applies to
> >>> the heap. My best bet here is that it has that something has changed
> around
> >>> user querying/caching.
> >>>
> >>> On Mon, 20 Aug 2018 at 19:18, Thomas Darimont <
> >>> thomas.darimont at googlemail.com> wrote:
> >>>
> >>> > Hello Keycloak team,
> >>> >
> >>> > has anyone encountered some performance issues after upgrading 3.4.3
> >>> > to 4.x (4.3.0)?
> >>> >
> >>> > Today I noticed a performance regression while preparing an upgrade
> >>> > from Keycloak 3.4.3.Final to 4.3.0.Final in our staging environment.
> >>> >
> >>> > In our test environment, we have around ~100k test users stored in a
> >>> > postgres-backed database. When we started the server with the new
> >>> > Keycloak version, the migration went through, and everything looked
> >>> > fine at first, but when we tried to browse the list of users via the
> >>> > admin-console, we noticed that the CPU and memory consumption of the
> >>> > server increased significantly, up to a point where Keycloak crashed
> >>> with an OOME.
> >>> >
> >>> > All previous Keycloak versions including 3.4.3 were very modest with
> >>> > their memory requirements and quite happy with ~1g heap.
> >>> > However, that seems to have changed in Keycloak 4.3.0 - there we
> >>> > needed at least 4g to prevent Keycloak from crashing with an OOME.
> >>> > Furthermore, we noticed that the response times for browsing the
> >>> > paginated user view increased significantly as well:
> >>> > In Keycloak 3.4.3 the average time to load a user page is ~80ms. In
> >>> > Keycloak 4.3.0 (and older versions >= 4.0.0.Beta1) the same operation
> >>> > takes
> >>> > ~7 seconds for a test realm with just 10k users.
> >>> >
> >>> > In the test realm with 100k users, the time to load a single page in
> >>> > the users listing was 66 seconds for version 4.3.0, on average -
> >>> > compared to quite stable 80ms in 3.4.3.
> >>> >
> >>> > The database query that is executed by Keycloak 4.3.0 runs in ~1.5
> >>> > seconds for 100k users, so I assume the processing logic in Keycloak
> >>> > is the culprit.
> >>> >
> >>> > The problem of long load-times can be reproduced with the Keycloak
> >>> > docker images and the in-memory database. I also created a small
> >>> > example project that creates some users with just a few attributes in
> >>> > a docker based 3.4.3 and 4.3.0 Keycloak environment with a Postgres
> >>> > database to reproduce the problem.
> >>> > https://github.com/thomasdarimont/kc-user-regression-tester
> >>> >
> >>> > Cheers,
> >>> > Thomas
> >>> > _______________________________________________
> >>> > keycloak-dev mailing list
> >>> > keycloak-dev at lists.jboss.org
> >>> > https://lists.jboss.org/mailman/listinfo/keycloak-dev
> >>> >
> >>> _______________________________________________
> >>> keycloak-dev mailing list
> >>> keycloak-dev at lists.jboss.org
> >>> https://lists.jboss.org/mailman/listinfo/keycloak-dev
> >>>
> >>
> _______________________________________________
> keycloak-dev mailing list
> keycloak-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/keycloak-dev
>
>


More information about the keycloak-dev mailing list