On 30/11/16 15:07, Gabriel Lavoie wrote:
Hi Stian,
I re-tested for points 4 and 5 on master (2.4.1) and they are still
both issues.
4 - The suggested fix still speed-up considerably the admin authentication
for both the admin console and the admin REST API. For the admin console,
with the fix I'm getting around 300K hit on the realmRevisions cache to
display 1000 realms (number increased with all my test on my perf database
;). Without that fix, the admin console display attempt ends up with
expired Bearer token errors after around 238M hits on the realmsRevisions
cache.
I'll submit the PR for that one right away.
Feel free to submit PR for your
composite role fixes. From the 1st look
it seems ok, but it will need a bit deeper review likely. We will see
whether to include it in next release or rather postpone.
5 - realmRevisions cache size is still hardcoded and is too small. With the
10000 elements limit (from InfinispanConnectionProvider), many operations
generate an endless loop of eviction/refresh. The 50K value I used
previously still fixes that with 1000 realms, but that should be
configurable. I guess I could just submit a PR with the same pattern as the
users cache where the revisions cache size is based on the cache size or
the default value if <= 0?
Again, feel free to submit PR. My vote is to include
this one soon.
Just note that "realms" cache doesn't have any limit by default, but
"realmRevisions" cache always must have limit. Otherwise there is risk
of memory leak. So in case that "realms" (or "users" ) cache
doesn't
have limit configured, then realmRevisions (or userRevisions) cache
should use some default limit. I guess that's what you were suggesting,
but just doublechecking.
Thanks,
Marek
Gabriel
2016-11-28 15:56 GMT-05:00 Gabriel Lavoie <glavoie(a)gmail.com>:
> Thanks for all the comments. I opened the PRs for 1-3. I will re-test on
> 2.4.0.Final a soon as possible for the other two points and report back.
>
> Gabriel
>
> 2016-11-28 10:34 GMT-05:00 Stian Thorgersen <sthorger(a)redhat.com>:
>
>> On a quick glance these changes (1, 2 and 3) seems all to make sense, but
>> there's a lot of them and we just don't have the capacity to review and
>> incorporate for 2.x (we're only weeks away from last 2.x release) so I
>> think it'll have to wait for 3.x.
>>
>> 4 and 5 have been significantly reworked so please review 2.4.0.Final and
>> see if this release fixed issues around caching.
>>
>>
>> On 17 November 2016 at 01:43, Gabriel Lavoie <glavoie(a)gmail.com> wrote:
>>
>>> Hi,
>>> following
>>>
http://lists.jboss.org/pipermail/keycloak-user/2016-October/008032.html
>>> that was discussed in the keycloak-user mailing list, I've been working
>>> in
>>> the last few weeks on fixing performance issues for Keycloak setup with a
>>> very large number of realms (tested with 500-600). As suggested by
>>> Stian, I
>>> have changes to submit as pull requests that I want to discuss here first
>>> to make sure they are acceptable.
>>>
>>> With this improvements, the following issues were solved in our setup:
>>> - Startup time reduced from more than 15 minute to under a minute.
>>> - Login time to the admin console reduced from ~5 minutes to ~20 secs
>>> when
>>> cache is being filled, to under 5 when cache is filled.
>>> - Realm creation time reduced from ~1 minute to under 10 secs.
>>>
>>> 1 - Too many autoFlush checks by Hibernate and explicit em.flush():
>>> >From what I see, the JPA entity model is very disconnected and many
>>> entities are retrieved through their ID using NamedQuery. I'm not sure
>>> why
>>> the mapping was done this way.
>>>
>>> This causes Hibernate to make a autoFlush check for dirty entities on
>>> every
>>> query, which is very CPU intensive. There are a few entities that can
>>> benefit from having a Collection/Set to their child entities and use it
>>> rather than a query to retrieve the required information. Also, with
>>> explicit mapping, I foresee multiple explicit em.flush() that could be
>>> removed when changing the data.
>>>
>>> In this changeset I've covered only what was necessary for us, but there
>>> are still a few areas that could be improved in my opinion:
>>>
>>>
https://github.com/glavoie/keycloak/commit/93c8056498e38ed07
>>> 5523f5b9b78dbde2814d783
>>
>>>
>>> 2 - Many missing index for FKs:
>>> We are using an Oracle database and indexes are not created automatically
>>> for FKs. I've added many but I'm not 100% sure if you want a new
>>> Liquibase
>>> changeset for every changes that could go in a version or only one per
>>> version. I still have one index to fix so don't take this as-is for now:
>>>
>>>
https://github.com/glavoie/keycloak/commit/7bb44b7dfeea31e66
>>> 3695218524a0da4ce48a331
>>>
>>> 3 - Using Set rather than Collection with JPA entities:
>>> When using a Collection, Hibernate will delete all rows and re-insert
>>> them
>>> in the join table to update the content. This behavior was very slow and
>>> visible with the "admin" composite role that points to roles for
all
>>> realms
>>> (through the different clients). This was slowing down a lot new realm
>>> creation. When using a Set, Hibernate does a better job at tracking
>>> changes.
>>>
>>>
https://github.com/glavoie/keycloak/commit/9eb1bf013e80395af
>>> feec09625d5ee6157a7084a
>>>
>>>
>>> 4 - "session" cache when testing for "hasRole" with a
composite role:
>>> The recursive search done KeycloakModelUtils.searchFor() can generate
>>> quite
>>> a lot of hits on the Infinispan cache when trying to test for role. With
>>> 500 realms, we saw up to 1.5 millions hits on the realmsRevision cache
>>> just
>>> to login to the admin console. I've added a in-memory cache of the
>>> retrieved composite RoleModel list in the Infinispan RoleAdapter. As I
>>> understand, every instance of a RoleAdapter is bound to a KeycloakSession
>>> that can be seen as a transaction/request. Still relying on the delegate
>>> object if the RealmAdapter is being invalidated.
>>>
>>> This is the change that reduced the most the login time to the admin
>>> console or the time to obtain a Bearer token for the admin API:
>>>
>>>
https://github.com/glavoie/keycloak/commit/0ab39670525315148
>>> 761627e7d0f808d4ef9c3c0
>>>
>>>
>>> 5 - realmRevisions cache size:
>>> In
https://issues.jboss.org/browse/KEYCLOAK-3202, the realmRevisions
>>> cache
>>> creation was moved from standalone.xml to Java code. With our current
>>> Keycloak cluster we already added the eviction policy mentioned in the
>>> ticket, but had to increase the size to 50000 to support our number of
>>> realms. Having this size hardcoded in the Java code becomes a bit
>>> inconvenient with the latest versions.
>>>
>>> I didn't really fix the code as Stian mentioned that this part is being
>>> heavily reworked. I would guess this has something to do with
>>>
http://lists.jboss.org/pipermail/keycloak-dev/2016-November/008397.html?
>>>
>>>
https://github.com/glavoie/keycloak/commit/6e3e7e8400da3440a
>>> 743941980bad3dc83be7965
>>>
>>>
>>> Thank you!
>>>
>>> Gabriel
>>>
>>> --
>>> Gabriel Lavoie
>>> glavoie(a)gmail.com
>>> _______________________________________________
>>> keycloak-dev mailing list
>>> keycloak-dev(a)lists.jboss.org
>>>
https://lists.jboss.org/mailman/listinfo/keycloak-dev
>>>
>>
>
> --
> Gabriel Lavoie
> glavoie(a)gmail.com
>