Race condition in SPI loading on startup
by EXTERNAL Thiele Frank (TNG, INST-CSS/BSV-OS2)
Hi,
Summary: When I deployed an SPI extension (for the identity-provider-mapper, just as an example), it was not loaded because of this race condition while a debugger was attached. In most cases loading seems to work anyway; I could only reproduce the problem with a debugger.
There seems to be a race condition between org.keycloak.services.DefaultKeycloakSessionFactory.init() and org.keycloak.subsystem.server.extension.KeycloakProviderDeploymentProcessor.deploy(DeploymentPhaseContext).
Among other things, this deploy() function handles file system imports, e.g. custom extensions deployed as jars (e.g. in the folder standalone/deployments). It interacts indirectly with init(), and with the DefaultKeycloakSessionFactory class as a whole, through ProviderManagerRegistry.SINGLETON.
deploy() adds preBoot ProviderManager instances to the singleton. If these are set in time, init() takes them into account and loads them during startup, e.g. to load the custom jars.
init() itself looks up the preBoot managers and loads those that are already defined. Managers that are defined later are not loaded by init().
In the good case, a late manager is handled by deploy() itself, which hands it to the deployer (the DefaultKeycloakSessionFactory instance that runs init()). But this only works if init() has already executed ProviderManagerRegistry.SINGLETON.setDeployer(this);.
One can reproduce this by debugging with certain breakpoints:
- First here:

        public void init() {
            serverStartupTimestamp = System.currentTimeMillis();

            ProviderManager pm = new ProviderManager(KeycloakDeploymentInfo.create().services(), getClass().getClassLoader(), Config.scope().getArray("providers"));
            spis.addAll(pm.loadSpis());
            factoriesMap = loadFactories(pm);
  >>>>      for (ProviderManager manager : ProviderManagerRegistry.SINGLETON.getPreBoot()) {
                Map<Class<? extends Provider>, Map<String, ProviderFactory>> factoryMap = loadFactories(manager);

- Second here:

        public void deploy(ProviderManager pm) {
            ProviderManagerDeployer deployer = getDeployer();
            if (deployer == null) {
  >>>>          preBoot.add(pm);
            } else {
                deployer.deploy(pm);
            }

        }
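Stepping through so that init() passes the preBoot loop while deploy() is paused just before preBoot.add(pm) shows the issue: the manager is queued after init() has already read preBoot, so it is never loaded.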
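To make the interleaving concrete, here is a minimal single-threaded sketch that replays it step by step. RaceModel and all of its members are hypothetical stand-ins, not Keycloak classes, and the sketch assumes that init() registers the deployer only after it has read preBoot.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the registry and session factory state.
class RaceModel {
    final List<String> preBoot = new ArrayList<>(); // managers queued before startup
    final List<String> loaded = new ArrayList<>();  // managers whose factories were loaded
    boolean deployerSet = false;                    // stands in for getDeployer() != null

    // First half of deploy(): the null check on the deployer.
    boolean deployCheck() {
        return deployerSet;
    }

    // Second half of deploy(): act on the (possibly stale) result of the check.
    void deployAct(String pm, boolean sawDeployer) {
        if (sawDeployer) {
            loaded.add(pm);   // hot deployment through the deployer
        } else {
            preBoot.add(pm);  // queue for init() to pick up
        }
    }

    // init(): load everything queued so far, then register as deployer (assumed order).
    void init() {
        loaded.addAll(preBoot);
        deployerSet = true;
    }

    // Replays the order forced by the two breakpoints.
    static RaceModel replay() {
        RaceModel m = new RaceModel();
        boolean saw = m.deployCheck();      // deploy(): deployer is still null
        m.init();                           // init(): preBoot is still empty
        m.deployAct("custom-mapper", saw);  // deploy(): queues the manager too late
        return m;
    }
}
```

After RaceModel.replay(), the manager sits in preBoot but is missing from loaded, which matches the extension silently not being loaded.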
Idea for a fix: add synchronization, e.g. mutexes, semaphores, or compareAndSwap-based non-blocking synchronization.
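As a rough illustration of the mutex variant, the sketch below guards both the check-then-act in deploy() and the drain of preBoot with the same lock. SafeRegistry, boot() and the Consumer-based deployer are hypothetical names chosen for this example, not the Keycloak API, and a real fix would have to fit the existing classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical registry: every manager is either queued before boot() or
// passed to the deployer after it, never lost in between.
class SafeRegistry<T> {
    private final List<T> preBoot = new ArrayList<>();
    private Consumer<T> deployer; // null until boot() has run

    // The null check and the action happen under one lock.
    synchronized void deploy(T pm) {
        if (deployer == null) {
            preBoot.add(pm);
        } else {
            deployer.accept(pm);
        }
    }

    // Drains the queue and registers the deployer in the same critical
    // section, so no deploy() call can slip in between the two steps.
    synchronized void boot(Consumer<T> newDeployer) {
        for (T pm : preBoot) {
            newDeployer.accept(pm);
        }
        preBoot.clear();
        deployer = newDeployer;
    }
}
```

The trade-off is that the deployer runs while the lock is held, so a slow deployment blocks other deploy() calls. That seems acceptable at startup, but it is a design choice rather than a requirement.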
Best regards
Frank Thiele