Hi all,

I've been running load tests on our application for the last few weeks, and I'm seeing performance issues when my custom federator is enabled.

The performance issue does not exist when the federator is disabled.
 
Configuration:

I have a cluster of 2 Keycloak instances with a standalone DB. We've verified that the DB isn't an issue when the federator is disabled. Both instances have a quad-core CPU and are on the same network. We've left the memory at 512MB. The test script, the database and the API that the federator connects to are all on separate machines.
 
Federator:

We have a simple custom federator that makes calls to a very performant API, which has been tested and is OK. Additionally, we've tested with the API stubbed out, so performance is not a problem there. The federator uses a JAXB marshaller to create each request; this has also been tested in isolation and performs well.

As the federator makes a lot of calls to the API (3 per login request), I've implemented an HTTP client that uses a PoolingHttpClientConnectionManager with 1000 connections available, instead of the standard Apache HttpClient from HttpComponents. That hasn't improved the performance of the system at all.
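For what it's worth, here is the back-of-the-envelope check I'd use to see whether the connection pool could be the ceiling. It is only a sketch: the class and method names are made up for illustration, the 50 ms call latency is a placeholder rather than a measured value, and it assumes each login makes its API calls one after another. One detail to keep in mind is that in Apache HttpClient 4.x, PoolingHttpClientConnectionManager limits connections per route as well as in total, and the per-route default is 2 unless setDefaultMaxPerRoute is called, so a total of 1000 does not by itself allow 1000 connections to a single API host.

```java
// Hypothetical helper, not part of Keycloak or HttpClient.
class PoolCeiling {

    /**
     * Upper bound on logins/second for one instance, assuming every login
     * makes callsPerLogin sequential API calls and each call holds one
     * pooled connection for callLatencyMs milliseconds.
     */
    static double maxLoginsPerSecond(int connections, int callsPerLogin, double callLatencyMs) {
        if (connections <= 0 || callsPerLogin <= 0 || callLatencyMs <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        // Each connection can complete (1000 / latency) calls per second.
        double callsPerSecond = connections * 1000.0 / callLatencyMs;
        return callsPerSecond / callsPerLogin;
    }

    public static void main(String[] args) {
        // 50 ms is an assumed latency, used only to compare the two pool sizes.
        System.out.printf("2 connections per route:    %.1f logins/s%n",
                maxLoginsPerSecond(2, 3, 50.0));
        System.out.printf("1000 connections per route: %.1f logins/s%n",
                maxLoginsPerSecond(1000, 3, 50.0));
    }
}
```

If the number computed for the effective per-route limit is close to the throughput seen in the tests, the pool is a plausible suspect; if it is far higher, the bottleneck is more likely somewhere else.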
 
Tests:
 
It is a Gatling Scala script that can generate around 300 (or more) requests/second against the direct grants login endpoint, using random usernames from a list (all of them already registered using KC). The script round-robins across both Keycloak instances, with an even distribution to each.
 
The idea is to simulate a load of 300 to 1500 concurrent users trying to log in to our systems.
 
Problem:

If I run the tests without federation I see very good performance, but when I run them with the custom federation code, throughput drops from ~150 requests/second to 22 requests/second using both instances.
 
 
Memory-wise, it seems to be OK. I've never seen a memory-related error with this configuration, and the attached VisualVM screenshot also suggests that memory is not a problem.
 
CPU utilisation seems very low to me; under this load I'd expect usage above 80% or so.
 
The method leading the CPU samples in VisualVM is Semaphore.tryAcquire(). I'm not quite sure what it is being used for; still investigating.
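My working assumption, which I have not verified, is that the semaphore guards a bounded pool of some kind (connections or worker threads). Combined with the low CPU usage, that would be consistent with threads waiting for a permit rather than doing work. A minimal sketch of the pattern I have in mind, using only java.util.concurrent and hypothetical class and method names:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a semaphore-guarded pool; not Keycloak code.
class BoundedPool {
    private final Semaphore permits;

    BoundedPool(int size) {
        // Fair semaphore: waiting threads get permits in arrival order.
        this.permits = new Semaphore(size, true);
    }

    /** Tries to take a slot, waiting up to timeoutMs; false means the pool stayed exhausted. */
    boolean borrow(long timeoutMs) {
        try {
            return permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    /** Returns a slot so that a waiting caller can proceed. */
    void giveBack() {
        permits.release();
    }

    int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        BoundedPool pool = new BoundedPool(2);
        System.out.println("first borrow:  " + pool.borrow(10));
        System.out.println("second borrow: " + pool.borrow(10));
        // With both slots taken, this caller waits for the timeout and then fails.
        System.out.println("third borrow:  " + pool.borrow(10));
        pool.giveBack();
        System.out.println("after release: " + pool.borrow(10));
    }
}
```

With a pool like this, every request beyond the pool size spends its time inside tryAcquire(), so the method would dominate a profiler's samples even though almost no CPU is being used.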

I can see that a lot of new threads are created when the test starts, as it generates around 60 requests/second to the direct grants login call, but there seems to be a bottleneck at some point.

So I'm wondering if there is some configuration I'm missing on the Keycloak side that could be affecting cluster performance when a federator is enabled. Maybe something related to JPA connections, the Infinispan configuration or even WildFly.

I'd really appreciate your help on this one as I'm out of ideas.

I've attached some VisualVM screenshots and the test results from my last run today.


Sorry for the long email and please let me know if you need further information.

Thank you in advance,

Regards,
Fab

--
Fabricio Milone
Developer

Shine Consulting 

30/600 Bourke Street

Melbourne VIC 3000

T: 03 8488 9939

M: 04 3200 4006


www.shinetech.com  a passion for excellence