[keycloak-dev] triple abstraction?

Wed Jul 9 10:19:28 EDT 2014

Oki, so how about:

1. Recover the JPA model provider, and set that as the default
2. Retire, but keep Hybrid model around until we're done
3. Extract user sessions from ModelProvider into UserSessionProvider, and add mem/jpa/mongo implementations
4. Review what we have, and make sure everyone is happy with the approach taken for UserSessionProvider
5. Extract users and role-mappings from ModelProvider into UserProvider, and add jpa/mongo implementations
6. Again, let's review and make sure everyone is happy
7. Rename ModelProvider to RealmProvider
8. Consider refactorings to the models (such as using attributes instead of columns)

I also think we'll need a EntityManagerProvider and a MongoClientProvider to make sure we have a single connection/transaction per-request. 

For 1.0.final I think we could introduce a limitation that we'll only allow one store at a time, so we don't have to deal with multiple-transactions (or 2pc).

More comments inline

----- Original Message -----
> From: "Bill Burke" <bburke at redhat.com>
> To: "Stian Thorgersen" <stian at redhat.com>
> Cc: keycloak-dev at lists.jboss.org
> Sent: Wednesday, 9 July, 2014 2:33:48 PM
> Subject: Re: [keycloak-dev] triple abstraction?
> 
> 
> 
> On 7/9/2014 6:00 AM, Stian Thorgersen wrote:
> > Okay, this all hybrid stuff ended up taking more time and becoming more
> > complex than I hoped. There's also still work to be done, and even once
> > that's completed there'd still be more to do.
> >
> > I think the correct path forward is to:
> >
> > 1. Recover the JPA model provider
> 
> We should.  Is that hard to do?

Nopes, already have it in a branch just need to merge it ;)

> 
> > 2. Branch the Hybrid model provider
> > 3. Delete the Hybrid model provider from master
> >
> > Then we can consider what changes if any to do to the model. I'd like to
> > keep the Hybrid model provider in a branch for a while as there are some
> > things in there that may be useful.
> >
> 
> Are you sure?  After thinking about things overnight, there's a lot of
> things hybrid model solved.  What freaked me out was that I thought the
> Model API was just going to be refactored and the ModelProvider split
> into RealmProvider, UserProvider, and UserSessionProvider.

Yes - I attacked this the wrong way around, and should have done it in the way I've suggested above.

> 
> > A few things that we should consider include:
> >
> > * Can we reduce the need to update database schemas in the future? I think
> > by cleaning up the entities, and also replacing a lot of columns with
> > attributes we can. For example adding an additional config field to
> > RealmModel shouldn't require us updating the database schema
> 
> Your hybrid implementation made me realize how important this will be to
> do across the board for Realm, Apps, Users, etc.  But this should be a
> storage implementation detail, right?  And not part of the Model API?
> Please?  :)
> 
> i.e. the brute force settings could be stored as attributes, but the
> model API would still expose them as getter methods.  I hate doing
> things like  map.get("maxFailures") vs. realm.getMaxfailures().  Having
> actual methods in the model api makes refactoring much much easier too.

How about:

  int maxFailures = map.containsKey("maxFailures") ? Integer.parseInt(map.get("maxFailures)) : 0;

Just kidding, it would be insane to do that in services ;)

> 
> 
> > * If we need to update database schemas how do we do this? Export/import
> > would work for realms, but not for users I think (users would just require
> > a too big downtime if there's millions)
> 
> If we change, for instance, how user "required actions" are stored,
> wouldn't we have to write a migration script anyways?  IIRC, adding
> tables or altering a table to add a column isn't such a big deal.  What
> is a big deal is removing a column.
> 
> Maybe we should just provide migration scripts in between releases that
> require them?

If stored as plain attributes, Keycloak could rewrite them on a live server.

Migration scripts may be the only feasible way to go, but I'm worried about the complexity of writing and testing these scripts for all the different DB variations, including Mongo.

> 
> > * Do we want to store sessions in memory?
> 
> Don't we need both persistence and in-memory storage for sessions?  For
> 1.0 we're not going to test clustering configurations, but, shouldn't it
> still be possible to cluster Keycloak if the cache is shut off and
> persistence session implementation is used?

Sure, let's do both. In the future we could consider Infinispan, or something similar.

> 
> > * Cache deals with realms well, but does this work for users or do we just
> > need to make sure loading users from the database performs well?
> 
> It works for users.

My point was, if we can tweak things so we only have to retrieve users from the database on login do we even need to cache users? It would make the overhead of a distributed cache significantly smaller, as admins are not going to update realms and such frequently, but if you have a large number of users they'll be changing passwords, profiles, etc, all the time.

I'm thinking:

* Realms - are own cache with the http invalidation messages
* Sessions - Infinispan
* Users - no cache, we can retrieve the info we need from tokens and session

> 
> > * With regards to the above I was wondering if we could prevent loading
> > users from cache/db to refresh a token, or to logout a user. If a user is
> > deleted or disabled the session should be deleted, so the only reason I
> > can see to check this is if role mappings have changed, and we could add a
> > dirtyRoles flag to the session for this. Basically dirtyRoles is set to
> > true on all users sessions whenever role-mappings are updated, we would
> > then reload the user/role-mappings on refresh tokens only if dirtyRoles is
> > true.. This would result in us only having to load the user from db once
> > for a "session", and hence there shouldn't really be a need to cache it.
> 
> The cache implementation handles grantRole() by invalidating the user in
> the cache.  It doesn't handle role removal.   But it does lookup the
> role to build the role mapping list.
> 
>      @Override
>      public Set<RoleModel> getRoleMappings() {
>          if (updated != null) return updated.getRoleMappings();
>          Set<RoleModel> roles = new HashSet<RoleModel>();
>          for (String id : cached.getRoleMappings()) {
>              roles.add(cacheSession.getRoleById(id, realm));
> 
>          }
>          return roles;
>      }
> 
> 
> 
> Maybe it could be as easy as just invalidating the user and delegating
> to the persistence model if the role from the role mapping doesn't exist?
> 
> > * Do we want to be able to store realms and users separately?
> >
> 
> I don't know.  We need a sync API architecture to be able to support all
> keycloak features.  This means we need to be able to store user metadata
> for each user.
> 
> These steps should be taken IMO:
> 
> * bring back JPA model
> * Retire Hybrid....BUT...ARE YOU SURE?!!!  You usually make better
> decisions than me...
> * Split Model API and JPA/Mongo impls into RealmProvider, UserProvider,
> and UserSessionProvider.
> * Keep Hybrid in codebase until we're done with ModelProvider refactoring
> * Move Hybrid to branch
> * Change JPA implementation to be more generic.  Attribute based for
> many settings.
> * Rethink AuthenticationProvider SPI. Users really want to be able to
> delegate to their own database or LDAP storage and we need to provide
> better flexibility.
> 
> --
> Bill Burke
> JBoss, a division of Red Hat
> http://bill.burkecentral.com
>