Hi all,
I've been running load tests on our application over the last few weeks,
and I'm seeing performance issues when my custom federator is enabled.
The issue does not occur when the federator is disabled.
*Configuration*:
I have a cluster of 2 Keycloak instances with a standalone DB; we've
verified that the DB isn't a bottleneck when the federator is disabled.
Both instances have a quad-core CPU and are on the same network. We've
left the memory at 512MB. The test script, the database and the API that
the federator calls are on separate machines.
*Federator*:
We have a simple custom federator that makes calls to a very performant
API, which has been tested and is OK. Additionally, we've tested with the
API stubbed out, so API performance is not the problem. The federator uses
a JAXB marshaller to create each request; this has also been tested in
isolation and performs well.
As the federator makes a lot of calls to the API (3 per login request),
I've implemented an HTTP client that uses a
PoolingHttpClientConnectionManager with 1000 connections available,
instead of the default Apache HttpClient from HttpComponents. That
hasn't improved the performance of the system at all.
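As a rough sanity check on the pool sizing above, here is a minimal sketch
of the arithmetic, based on Little's law (throughput = concurrency /
latency). The class name, the method name and the 20 ms latency are
hypothetical, chosen only for illustration; they are not measurements from
these tests. One detail that may be relevant: in Apache HttpClient 4.x,
PoolingHttpClientConnectionManager limits connections per route as well as
in total (defaults: 2 per route, 20 total), so if only the total is raised
to 1000, calls to a single API host could still be capped at 2 concurrent
connections.

```java
// Hypothetical helper, not part of Keycloak or Apache HttpClient.
final class PoolThroughputEstimate {

    /**
     * Upper bound on logins per second when each login makes callsPerLogin
     * sequential API calls and at most usableConnections calls run at once.
     * Little's law: calls/second = concurrency / latency.
     */
    static double maxLoginsPerSecond(int usableConnections,
                                     double callLatencyMillis,
                                     int callsPerLogin) {
        if (usableConnections <= 0 || callLatencyMillis <= 0 || callsPerLogin <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        double callsPerSecond = usableConnections * 1000.0 / callLatencyMillis;
        return callsPerSecond / callsPerLogin;
    }

    public static void main(String[] args) {
        double assumedLatencyMillis = 20.0; // assumption, not a measured value
        int callsPerLogin = 3;              // from the description above
        for (int connections : new int[] {2, 20, 1000}) {
            System.out.printf("%d connections -> at most %.1f logins/second%n",
                    connections,
                    maxLoginsPerSecond(connections, assumedLatencyMillis, callsPerLogin));
        }
    }
}
```

If the effective number of concurrent connections turns out to be much
smaller than 1000, a ceiling in the low tens of logins per second would be
consistent with this kind of estimate, though that would need to be
confirmed against the real API latency.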
*Tests*:
The test is a Gatling Scala script that can generate ~300 (or more)
requests/second against the direct grants login endpoint, using random
usernames from a list (all of them already registered through KC). The
script round-robins across both Keycloak instances, so each one receives
an even share of the load.
The idea is to simulate a load of 300 to 1500 concurrent users trying to
log in to our systems.
*Problem*:
If I run the tests without federation, I see very good performance, but
when I run them with the custom federation code, throughput drops from
~150 requests/second to 22 requests/second across both instances.
Memory-wise, it seems to be OK. I've never seen a memory-related error
with this configuration, and the attached VisualVM screenshot suggests
that memory is not a problem.
CPU utilisation looks very low to me; under this load I'd expect something
above 80% usage.
The method leading the CPU samples in VisualVM is Semaphore.tryAcquire().
I'm not sure yet what it is being used for; still investigating.
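For context, java.util.concurrent.Semaphore is commonly used to cap how
many threads may use a limited resource (such as a pool) at the same time,
and a sampler may attribute time that threads spend parked waiting for a
permit to tryAcquire() rather than to real CPU work. The sketch below is
only an illustration of that pattern using the JDK; BoundedResourceDemo
and its parameters are hypothetical, and it does not claim to show where
Keycloak, WildFly or the HTTP client actually take the semaphore.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: many threads compete for a few permits.
final class BoundedResourceDemo {

    static final class Result {
        final int completed;        // tasks that obtained a permit and finished
        final int peakConcurrency;  // most tasks holding a permit at once
        final long elapsedMillis;   // wall-clock time for the whole run

        Result(int completed, int peakConcurrency, long elapsedMillis) {
            this.completed = completed;
            this.peakConcurrency = peakConcurrency;
            this.elapsedMillis = elapsedMillis;
        }
    }

    static Result run(int permits, int tasks, long holdMillis) {
        Semaphore semaphore = new Semaphore(permits);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(tasks);
        long start = System.nanoTime();

        for (int i = 0; i < tasks; i++) {
            executor.execute(() -> {
                try {
                    // Threads wait here while every permit is taken.
                    if (!semaphore.tryAcquire(10, TimeUnit.SECONDS)) {
                        return; // gave up waiting for a permit
                    }
                    try {
                        peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                        Thread.sleep(holdMillis); // stands in for a remote call
                        completed.incrementAndGet();
                    } finally {
                        inFlight.decrementAndGet();
                        semaphore.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        executor.shutdown();
        try {
            executor.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Result(completed.get(), peak.get(), elapsedMillis);
    }

    public static void main(String[] args) {
        Result r = run(2, 8, 50);
        System.out.printf("completed=%d peak=%d elapsed=%dms%n",
                r.completed, r.peakConcurrency, r.elapsedMillis);
    }
}
```

In a run like this, most threads spend their time waiting inside
tryAcquire() while CPU usage stays low, which resembles the symptoms
described here; whether that is what is actually happening would need a
thread dump to confirm.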
I can see that a lot of new threads are created when the test starts,
as it sends around 60 requests/second to the direct grants login call, but
there seems to be a bottleneck at some point.
So I'm wondering if there is some configuration I'm missing on the
Keycloak side that could be affecting cluster performance when a federator
is enabled. Maybe something related to JPA connections, the Infinispan
configuration or even WildFly.
I'd really appreciate your help on this one as I'm out of ideas.
I've attached some VisualVM screenshots and the test results from my last
run today.
Sorry for the long email and please let me know if you need further
information.
Thank you in advance,
Regards,
Fab
--
*Fabricio Milone*
Developer
*Shine Consulting*