[keycloak-dev] sync/federation requirements/ideas

Marek Posolda mposolda at redhat.com
Thu Jul 10 15:01:33 EDT 2014


On 10.7.2014 15:47, Bill Burke wrote:
>
>
> On 7/10/2014 7:41 AM, Marek Posolda wrote:
>>
>> That's why I would prefer the "sync" approach over "federation".
>
> Which is exactly what I'm proposing.  Except it allows for 
> syncs/imports into Keycloak on demand.  You already use 
> AuthenticationProvider in this manner!
Oops, I thought that with AuthProvider you didn't like the fact that only 
the LDAP users who authenticated with Keycloak are available in the 
Keycloak DB ;-)


>
> My impression of a sync model is that Keycloak works with user data in 
> local Keycloak storage.  User metadata is imported from external 
> storage (either on demand or at startup and periodically synced).  
> Keycloak can manage/modify/augment this imported user data locally.  
> The changelog is obtained from the Sync API and updates sent back to 
> the external storage.
>
>
>> In short, with the "sync" approach, users (and eventually their role
>> mappings) will be synced from the external store into Keycloak, and
>> most of the UserProvider methods will deal only with the "local"
>> provider. The only exceptions are the authentication-related methods
>> and the "update" methods, which will need to update both the local
>> user and the "external" user. But SyncProvider won't need to implement
>> methods like getUsers(), searchFor**, getUserBy** etc., as all users
>> will be available locally.
>>
>
> The problem with this approach is that new users don't get imported 
> into Keycloak until the sync gets invoked.  Syncing more than once per 
> day, or even once per week may not be feasible.  If the external 
> storage does not have a changelog, syncing would involve iterating 
> through each and every user in external storage and syncing it with 
> the Keycloak database.  A full sync could take hours.
Yes, but how else could we do it?
(a) For methods like UserProvider.getUsers() and UserProvider.searchFor***, 
we can either:
1) Retrieve users just from the "local" store
2) Federate users and merge them from both the "local" and "external" stores.

The federation approach (2) has quite bad performance issues (especially 
with pagination+sorting), so I would prefer (1). But this really 
requires a full sync from the external store into KC IMO.


(b) Then we have the methods for retrieving a single user:
getUserBy***. Here we can try to retrieve the user from the 'local' store 
and fall back to the 'external' one. In that case, the user will be synced 
on demand into the KC database (similar to the AuthenticationProvider 
approach), but then we have the issue that only those users who were 
retrieved from externalStore will be available in KC.

I think that we can either:
(1) Federate users from both stores
(2) Temporarily accept that only some users from LDAP are available in the 
KC database (those who authenticated or for whom getUserBy*** was called)
(3) Do a full sync

I don't know if you see some other solution, but I would say that if we 
want to avoid (1) and (2), we really need to do a full sync. So I would 
imagine that if someone configures an externalStore, they will also need 
to trigger a full sync to KC to have all users added to Keycloak.
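
To make option (2) concrete, here is a minimal Java sketch of the "local 
first, fall back to external, import on demand" lookup. All names 
(ExternalStore, OnDemandImportingProvider, the username -> email map) are 
hypothetical simplifications for illustration and not the actual Keycloak 
interfaces.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for an external user store such as LDAP.
interface ExternalStore {
    Optional<String> findEmailByUsername(String username);
}

// Hypothetical provider: looks in the local store first, then falls back
// to the external store and imports the user it finds there.
class OnDemandImportingProvider {
    // Simplified "local Keycloak storage": username -> email.
    private final Map<String, String> local = new HashMap<>();
    private final ExternalStore external;

    OnDemandImportingProvider(ExternalStore external) {
        this.external = external;
    }

    Optional<String> getUserByUsername(String username) {
        String email = local.get(username);
        if (email != null) {
            return Optional.of(email);
        }
        // Not known locally: ask the external store and import on a hit.
        Optional<String> imported = external.findEmailByUsername(username);
        imported.ifPresent(e -> local.put(username, e));
        return imported;
    }

    // Only users that were looked up at least once are counted here.
    int localUserCount() {
        return local.size();
    }
}
```

After a single lookup, localUserCount() reflects only the users that were 
actually requested, which is exactly the limitation described in (2): 
getUsers() or searchFor*** against the local store would miss everyone else.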
>
>
>>
>>
>>
>> In detail, I can imagine things working this way:
>>
>> * There will be a Sync SPI configured per realm, which will allow
>> syncing users from the "external store" to Keycloak.
>>
>
> +1
>
>> * It will be possible to configure when to sync users. For example, it
>> can be a full sync from LDAP at server startup or one triggered from
>> the admin console, followed by a periodic sync (triggered, for example,
>> once per day).
>>
>
> +1
>
>> * I am not sure if we need a full sync from Keycloak to the "external
>> provider", but probably not. Once a user, role or user credential is
>> updated in Keycloak, it will also be immediately synced to the external
>> provider (if the provider is not read-only).
>>
>
> If you don't implement through the UserProvider/UserModel interface 
> then you would need an Event model that the SyncProvider listens 
> to so that it can sync the user on demand.
Yep, I think it works to implement this in UserProvider (and 
RealmProvider for roles). Basically you have something like a 
"WrapperUserProvider", and once add/update/delete user is invoked, it 
performs the operation on the "local" UserProvider and then invokes the 
corresponding method on SyncProvider to sync it back into external storage.
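
A rough Java sketch of that wrapper idea follows. The interfaces are 
hypothetical and cut down to two operations (SimpleUserProvider, 
SimpleSyncProvider and the in-memory helpers are assumptions made for this 
example, not the real UserProvider or SyncProvider).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, heavily simplified stand-ins for the real interfaces.
interface SimpleUserProvider {
    void addUser(String username, String email);
    void removeUser(String username);
}

interface SimpleSyncProvider {
    boolean isReadOnly();
    void userAdded(String username, String email);
    void userRemoved(String username);
}

// Plain in-memory "local" store.
class InMemoryUserProvider implements SimpleUserProvider {
    final Map<String, String> users = new HashMap<>();
    public void addUser(String username, String email) { users.put(username, email); }
    public void removeUser(String username) { users.remove(username); }
}

// Records what would have been written to the external store.
class RecordingSyncProvider implements SimpleSyncProvider {
    final List<String> calls = new ArrayList<>();
    private final boolean readOnly;
    RecordingSyncProvider(boolean readOnly) { this.readOnly = readOnly; }
    public boolean isReadOnly() { return readOnly; }
    public void userAdded(String username, String email) { calls.add("add:" + username); }
    public void userRemoved(String username) { calls.add("remove:" + username); }
}

// Applies each change locally first, then pushes it to the external
// store right away unless that store is read-only.
class WrapperUserProvider implements SimpleUserProvider {
    private final SimpleUserProvider local;
    private final SimpleSyncProvider sync;

    WrapperUserProvider(SimpleUserProvider local, SimpleSyncProvider sync) {
        this.local = local;
        this.sync = sync;
    }

    public void addUser(String username, String email) {
        local.addUser(username, email);
        if (!sync.isReadOnly()) {
            sync.userAdded(username, email);
        }
    }

    public void removeUser(String username) {
        local.removeUser(username);
        if (!sync.isReadOnly()) {
            sync.userRemoved(username);
        }
    }
}
```

The sketch ignores transactions and failure handling: if the external write 
fails after the local one succeeded, the two stores diverge, so a real 
implementation would have to decide how to roll back or retry.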
>
>> * Which user data are synced depends on the SyncProvider implementation.
>> For example, in the case of LDAP it will be just email, first name,
>> last name and eventually role mappings.
>>
>> * SyncProvider may also support roles + role mappings from external
>> store. SyncProvider will be able to import roles into RealmProvider and
>> then particular role mappings into UserProvider.
>>
>> * Users synced from the "external provider" will have a link to this
>> provider, as they do now. If the provider is read-write, then all
>> newly created users in Keycloak will be immediately synced into the
>> "external provider", together with the credential (if the syncProvider
>> supports that particular credential).
>>
>> I would propose an interface like this to handle both sync and
>> authentication (not sure if "SyncProvider" is a good name, maybe
>> "ExternalUserProvider" instead). I can imagine something like this:
>>
>>
>> public interface SyncProvider {
>>
>
> I think your SyncProvider interface is very limiting and even 
> unusable.  For example, getAllUsers() is completely unfeasible if 
> there is a large user database.
Oops, sorry. Maybe it's not clear from the method name and javadoc that 
"getAllUsers" is intended to be used for sync *from* external storage 
*to* Keycloak. The idea is that "getAllUsers" (or 
"syncAllUsersFromExternalStore" or whatever it's named) will be called 
by the admin when they want a full sync from external storage to KC.

IMO sync from Keycloak to external storage is quite easy. We can either 
do it immediately when add/update/delete user/role is invoked, or we can 
use the event approach you proposed, with a periodic (or on-demand) sync 
of just the "dirty" entities. We already have AuditProvider, which could 
eventually be used for this.

Personally I would prefer to always sync from KC immediately, as an 
event queue doesn't always work. For example, when you have a RW LDAP and 
you register a new user in Keycloak, you need to save this user and 
their password into the external store immediately. Saving it somewhere 
in the Keycloak database and clearing it once the ChronJob is triggered 
doesn't work IMO, as it would mean storing passwords in plain text until 
they are synced into LDAP, which doesn't make much sense.

So for sync from Keycloak to externalStore, I would personally always do 
it immediately after the operation is performed on 
UserProvider/RealmProvider. It's not such a big overhead IMO.

But the main issue is sync from externalStore to Keycloak. We can easily 
add a changelog to Keycloak, but we don't have control over the external 
storage...

I don't know if LDAP servers support a changelog. Probably yes, but I 
don't think it's standardized; maybe some LDAP servers support a 
"proprietary" way of doing a changelog. I can check whether some LDAP 
servers support this and how it works with them...

But the fact is that if externalProvider doesn't support a changelog, then 
how do we sync without something like "getAllUsers" (or "syncAllUsers")? I 
know that this is quite an overhead, but I really don't see how to do a 
"partial" sync from a 3rd party provider if it doesn't support changelogs...
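
To illustrate why a full sync has to touch every user when there is no 
changelog, here is a small Java sketch of a full reconciliation pass. The 
names (FullSync, SyncResult) and the username -> email maps are hypothetical 
simplifications, and the sketch assumes every local user is linked to the 
external provider, so anything missing externally is removed locally.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical counters describing what one full sync did.
final class SyncResult {
    int created;
    int updated;
    int removed;
}

final class FullSync {
    // Both maps are username -> email; values are assumed to be non-null.
    // The local map is modified in place to mirror the external one.
    static SyncResult run(Map<String, String> external, Map<String, String> local) {
        SyncResult result = new SyncResult();

        // Pass 1: without a changelog, every external user must be visited.
        for (Map.Entry<String, String> e : external.entrySet()) {
            String previous = local.put(e.getKey(), e.getValue());
            if (previous == null) {
                result.created++;
            } else if (!previous.equals(e.getValue())) {
                result.updated++;
            }
        }

        // Pass 2: drop local users that no longer exist externally.
        // Assumption: all local users came from this external provider.
        Iterator<String> it = local.keySet().iterator();
        while (it.hasNext()) {
            if (!external.containsKey(it.next())) {
                it.remove();
                result.removed++;
            }
        }
        return result;
    }
}
```

The cost is proportional to the total number of users on both sides, no 
matter how few of them changed, which is the overhead discussed above. With 
a changelog, only the changed entries would need to be visited.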

Marek
>
>
> IMO it would be something like this:
>
> public interface ChronJob {
>
>    void invoke(KeycloakSessionFactory factory);
> }
>
> ChronJob would be scheduled to run at boot time and/or periodically by 
> the admin console.  Periodic and boot time syncs would be implemented 
> here.  The sync operation needs to have full control of when 
> transactions begin and end so updates/creates/deletes can be executed 
> in batches.
>
> On demand syncing would be done through an implementation of 
> UserProvider as I proposed earlier.  On-demand syncing is:
>
> * Importing a user on demand
> * Updating external storage on demand.
>
> We would also have an event changelog that would be implemented as, or 
> act like a persistent JMS Topic.  This event topic shouldn't be used 
> to implement on-demand sync, but rather for when the external storage 
> is read only:
>
> interface ChangeEvent {
>    enum EventType {
>        CREATE, UPDATE, DELETE
>    }
>
>
>
>    long getTimestamp();
>    EventType getEventType();
>    String getItemId();
>    String getItemType(); // UserModel, RoleModel
> }
>
> EventListeners could perform the syncs on demand.  Or, a ChronJob 
> could be something like a persistent JMS Topic subscriber and replay 
> change events.
>
> We could decide that we don't support user updates for read-only 
> stores.  Then, in that case, IMO, we don't need a changelog event queue.
>
>
>
>


