[keycloak-dev] triple abstraction?

Wed Jul 9 11:28:55 EDT 2014

On 7/9/2014 10:19 AM, Stian Thorgersen wrote:
> Oki, so how about:
>
> 1. Recover the JPA model provider, and set that as the default
> 2. Retire, but keep Hybrid model around until we're done
> 3. Extract user sessions from ModelProvider into UserSessionProvider, and add mem/jpa/mongo implementations
> 4. Review what we have, and make sure everyone is happy with the approach taken for UserSessionProvider
> 5. Extract users and role-mappings from ModelProvider into UserProvider, and add jpa/mongo implementations
> 6. Again, let's review and make sure everyone is happy
> 7. Rename ModelProvider to RealmProvider
> 8. Consider refactorings to the models (such as using attributes instead of columns)
>

9. We need to rethink AuthenticationProvider.

I'm willing to do any or all of that work.  It overlaps with cache
changes I'd need to make, and I'm in a holding pattern anyway until all
of the above is resolved.
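
To make step 3 concrete, here is a rough sketch of what an extracted
UserSessionProvider with a "mem" implementation might look like.  Only
the name UserSessionProvider comes from the list above; the method
names and MemUserSessionProvider are made up for illustration and are
not the actual Keycloak SPI.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface: the method names are assumptions, not Keycloak's API.
interface UserSessionProvider {
    String createSession(String userId);
    Optional<String> getUserId(String sessionId);
    void removeSession(String sessionId);
}

// In-memory ("mem") variant; jpa/mongo variants would implement the same interface.
class MemUserSessionProvider implements UserSessionProvider {
    // session id -> user id
    private final Map<String, String> userBySession = new ConcurrentHashMap<>();

    @Override
    public String createSession(String userId) {
        String sessionId = UUID.randomUUID().toString();
        userBySession.put(sessionId, userId);
        return sessionId;
    }

    @Override
    public Optional<String> getUserId(String sessionId) {
        return Optional.ofNullable(userBySession.get(sessionId));
    }

    @Override
    public void removeSession(String sessionId) {
        userBySession.remove(sessionId);
    }
}
```

The point of the split is that callers only see the interface, so the
store behind it can be swapped without touching the realm or user code.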

> I also think we'll need an EntityManagerProvider and a MongoClientProvider to make sure we have a single connection/transaction per request.
>
> For 1.0.final I think we could introduce a limitation that we'll only allow one store at a time, so we don't have to deal with multiple-transactions (or 2pc).
>

We will have 2pc problems when we eventually have a distributed cache.

More comments inline.

>>
>>> * Cache deals with realms well, but does this work for users or do we just
>>> need to make sure loading users from the database performs well?
>>
>> It works for users.
>
> My point was: if we can tweak things so we only have to retrieve users from the database on login, do we even need to cache users? It would make the overhead of a distributed cache significantly smaller. Admins are not going to update realms and such frequently, but if you have a large number of users they'll be changing passwords, profiles, etc. all the time.
>

IMO, users will usually be static too.  Not as static as realm metadata,
but a given user won't change daily, or even weekly.

The only overhead of a distributed cache for users comes when a user is
updated, and that overhead is a single invalidation message telling the
other nodes to evict that user.
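
A minimal sketch of that idea, assuming users are keyed by a string id
and loaded through some function that hits the database.  The class
name, the String stand-in for UserModel, and onInvalidationMessage are
all hypothetical; how the message actually travels between nodes is
left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical per-node user cache; String stands in for UserModel.
class InvalidatingUserCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // assumed to read from the database

    InvalidatingUserCache(Function<String, String> loader) {
        this.loader = loader;
    }

    // Reads go to the cache first and only hit the loader on a miss.
    String get(String userId) {
        return cache.computeIfAbsent(userId, loader);
    }

    // Called when another node reports that this user was updated.
    // Nothing is copied over the wire; the entry is just dropped and
    // reloaded lazily on the next read.
    void onInvalidationMessage(String userId) {
        cache.remove(userId);
    }
}
```

Reads never generate cluster traffic in this scheme; only writes do,
and each write costs one small message per user.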

> I'm thinking:
>
> * Realms - our own cache with the http invalidation messages
> * Sessions - Infinispan
> * Users - no cache, we can retrieve the info we need from tokens and session
>

A user cache is very useful, even with millions of users and a much
smaller cache.  processLogin, accessCodeToToken, and refreshToken all
need UserModel, so caching an active user is not a waste, especially if
access code timeouts are much shorter than average active session
times.
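
One way to get "a much smaller cache" that still keeps active users
hot is a bounded LRU map.  This is only a sketch built on the JDK's
LinkedHashMap in access order; LruUserCache is a made-up name, the
class is not thread safe as written, and it is not how Keycloak's cache
is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: evicts the least recently used entry
// once more than maxEntries users are held.
class LruUserCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruUserCache(int maxEntries) {
        // accessOrder = true: get() moves an entry to the "most recent" end
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

A user who keeps hitting refreshToken stays in the map, while users who
logged in once and went away are the first to be evicted.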

FYI, besides checking session logout/timeout, refreshToken also exists
to get the latest and greatest scope and role mappings.

Also, IMO, we could probably cache hundreds of thousands of users,
maybe millions, on average machines.  1 million users at 1k per user is
1 gigabyte of RAM.  I'm not convinced that millions of users will be
the norm for Keycloak.
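
The back-of-the-envelope math, as a sketch.  The 1k-per-user figure is
my guess from above, not a measurement, and the helper name is made up;
real JVM object overhead would push the number up.

```java
// Hypothetical sizing helper for a rough memory estimate.
class UserCacheSizing {
    // Total bytes for a cache holding `users` entries of `bytesPerUser` each.
    // multiplyExact throws ArithmeticException on overflow instead of wrapping.
    static long estimateBytes(long users, long bytesPerUser) {
        return Math.multiplyExact(users, bytesPerUser);
    }
}
```

With 1,000,000 users at 1,000 bytes each this gives 1,000,000,000
bytes, i.e. about 1 GB; at 1,024 bytes each it is 1,024,000,000 bytes.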

-- 
Bill Burke
JBoss, a division of Red Hat
http://bill.burkecentral.com