[keycloak-user] offlineSessions data in cache vs db

Marek Posolda mposolda at redhat.com
Mon Jan 15 15:19:26 EST 2018


Hi Josh,

On 15/01/18 18:24, Josh Cain wrote:
> Thanks for taking a look at that Marek.  Really helpful.
>
> I might open something.  Our use case has very infrequent offline token
> usage (once every week to once every month), and it just doesn't make
> sense to have tokens used so infrequently sit in memory.  Any chance of
> having a DB option?
Yes, there is a chance, but everything depends on the priorities though :)

But actually, you use a cross-DC setup if I remember correctly? If so, the 
infinispan caches will be configured with a remoteStore. It's possible 
that if you enable eviction + passivation on the offlineSessions caches, 
the infrequently used session data will be "passivated" and hence removed 
from the infinispan cache on the Keycloak server side; it will just 
remain in the caches on the JDG side and will be loaded by the Keycloak 
servers only when needed. But we haven't yet tested with eviction and 
passivation enabled on infinispan caches with a remoteStore on the 
Keycloak server side.
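For illustration, in the Keycloak server's standalone-ha.xml such a setup might look roughly like this. This is an untested sketch (as said above, we haven't tried this combination): the remote-store attributes follow the shape used in the cross-DC docs, while the eviction element and its max-entries value are illustrative assumptions.

```xml
<!-- Sketch only, untested: offlineSessions cache in the "keycloak"
     cache container with eviction + passivation to the JDG remoteStore.
     max-entries is an arbitrary example value. -->
<distributed-cache name="offlineSessions" mode="SYNC" owners="1">
    <eviction strategy="LRU" max-entries="10000"/>
    <remote-store cache="offlineSessions" remote-servers="remote-cache"
                  passivation="true" fetch-state="false" purge="false"
                  preload="false" shared="true">
        <property name="rawValues">true</property>
    </remote-store>
</distributed-cache>
```

With passivation="true", an entry evicted from the local cache is expected to survive only in the remote JDG cache and be fetched back on access - which is the "save memory, pay CPU/network on access" trade-off described above.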

Another question is, how many offline sessions do you plan to have? Will 
there be a million sessions, 100K, or just a few thousand? If it's a few 
thousand, then the memory might be acceptable. Even for 100K sessions 
(considering that one session = a pair of 1 userSession + 1 clientSession), 
the memory is not more than 500 MB - and even less with more cluster 
nodes, as the sessions are then distributed among all cluster nodes.
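As a rough cross-check against the measurements quoted further down in this thread (100 MB baseline, 230 MB with 100K pairs, 350 MB with 200K pairs), the implied per-pair cost is:

```latex
\frac{350\,\mathrm{MB} - 100\,\mathrm{MB}}{200\,000\ \mathrm{pairs}}
  \approx 1.25\ \mathrm{KB/pair},
\qquad
100\,000\ \mathrm{pairs} \times 1.25\ \mathrm{KB} \approx 125\ \mathrm{MB}
```

So even if real-world sessions carry a few KB each (roles, protocolMappers, notes), 100K sessions should stay well under the 500 MB figure.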
> Last question on this (for now anyway) - Are offline sessions part of
> the RH-SSO 7.2 + JDG cross-datacenter replication support?  If the cache
> only loads those on server startup, that obviously presents a problem
> when doing something like failing over to a secondary datacenter on hot
> standby.
Yes, offline sessions are part of the RH-SSO 7.2 + cross-datacenter 
replication support. Docs are already available in the community [1].

I think that the use case with hot standby will work.

Actually, the example flow may work this way:
- You have 2 datacenters, and JDG servers are started in both datacenters.
- The first Keycloak server starts in the first datacenter. This server 
will start preloading sessions from the DB.
- Offline sessions are preloaded into the Keycloak caches and the JDG 
caches in the first datacenter, and also into the JDG in the second 
datacenter (because the JDG in the first DC automatically backs up to 
the JDG in the second DC).
- When a Keycloak server in the second DC starts, the sessions are 
preloaded from the JDG server in the second DC, not from the DB.
- At this point, offline sessions are available in both DCs.
- When either datacenter goes offline, the Keycloak servers in the other 
DC will still be able to read the content from the caches.
- When the offline datacenter is started again, it will preload the 
sessions from the JDG servers in the already-running DC.
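The cross-DC backup mentioned in the flow above is configured on the JDG side, not on Keycloak. A minimal sketch along the lines of the community docs - the site name "site2" and the SYNC strategy here are example values:

```xml
<!-- JDG (Infinispan server) on site1: back up the offlineSessions cache
     to the JDG cluster on site2. Site names and strategy are examples. -->
<replicated-cache name="offlineSessions" mode="SYNC">
    <backups>
        <backup site="site2" failure-policy="FAIL" strategy="SYNC"
                enabled="true"/>
    </backups>
</replicated-cache>
```

A mirror-image backup (site2 backing up to site1) is needed on the other JDG so that whichever DC starts first can seed the other, as in the flow above.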

The offline sessions are really preloaded from the DB just during the 
start of the first Keycloak server in the first DC (we saw much worse 
performance than when preloading from the remoteStore). But sessions 
should always be available.

If you have an opportunity to read the docs, try things out, and provide 
feedback, that would be cool.

[1] 
http://www.keycloak.org/docs/latest/server_installation/index.html#crossdc-mode

Marek
> Josh Cain
> Senior Software Applications Engineer, RHCE
> Red Hat North America
> jcain at redhat.com IRC: jcain
>
> On 01/11/2018 11:07 AM, Marek Posolda wrote:
>> On 10/01/18 22:31, Josh Cain wrote:
>>> Thanks for the response!  Seem to have missed the reply.  A follow-up
>>> question:
>>>
>>> You mentioned that the choice to store in the Infinispan cache was made
>>> for performance purposes.  I understand that this will lead to faster
>>> retrieval speeds, however storing *every* offline session in the
>>> Infinispan cache could lead to a massive memory footprint if these
>>> sessions are used widely enough, right?
>> I just tried some very basic testing. I tried to create 100K
>> userSessions where each of them has 1 clientSession - so 100K
>> userSessions + 100K clientSessions.
>>
>> With 0 offlineSessions, I saw the server consume 100 MBytes of memory.
>> With 100K sessions (100K userSessions + 100K clientSessions) it was 230
>> MBytes. With 200K sessions (200K userSessions + 200K clientSessions), it
>> was 350 MBytes.
>>
>> So every userSession+clientSession pair took around 1-2 KBytes in my
>> test. In reality, it may be more, as it depends on the amount of data
>> in the sessions (roles, protocolMappers, notes, etc.). We have an
>> existing JIRA to remove some stuff from the sessions and save it in the
>> tokens themselves, which should improve memory consumption [1].
>>
>> In a cluster environment, the memory consumption will be smaller, as
>> every cluster node will have just those sessions which it owns (the
>> default setup of the infinispan caches "offlineSessions" and
>> "offlineClientSessions" is to use a distributed cache with 1 owner).
>>
>> If some more flexibility is needed, we may add support for
>> offlineSessions to use infinispan cacheStores/cacheLoaders. This is a
>> pretty flexible SPI in infinispan 8 (which is the version we currently
>> use). With this, customers would be able to choose whether sessions
>> should be preloaded on startup or lazy loaded. There may also be some
>> additional options around passivation etc., which may be good if a
>> customer prefers to save memory rather than CPU. Feel free to create
>> another JIRA if you need this. Just not sure when it will be done...
>>
>> [1] https://issues.jboss.org/browse/KEYCLOAK-5006
>>
>> Marek
>>> Am I understanding this correctly, or are the client sessions so light
>>> the impact is negligible?
>>>
>>> Josh Cain
>>> Senior Software Applications Engineer, RHCE
>>> Red Hat North America
>>> jcain at redhat.com IRC: jcain
>>>
>>> On 01/10/2018 03:13 PM, Marek Posolda wrote:
>>>> Yes, I've replied. It seems this thread was sent to both "keycloak-dev"
>>>> and "keycloak-user", and I replied to "keycloak-dev". The answer is here:
>>>> http://lists.jboss.org/pipermail/keycloak-dev/2017-December/010249.html
>>>> .
>>>>
>>>> Marek
>>>>
>>>> On 10/01/18 19:13, Josh Cain wrote:
>>>>> Looking to do some work with offline tokens and I had similar
>>>>> questions.
>>>>>     Was there ever a response to this?
>>>>>
>>>>> Josh Cain
>>>>> Senior Software Applications Engineer, RHCE
>>>>> Red Hat North America
>>>>> jcain at redhat.com IRC: jcain
>>>>>
>>>>> On 11/21/2017 05:12 PM, Tonnis Wildeboer wrote:
>>>>>> Hello Keycloak Users,
>>>>>>
>>>>>> Ultimately, what we want to do is have three nodes in one Kubernetes
>>>>>> namespace that define a cluster. Then be able to add three more
>>>>>> nodes to
>>>>>> the cluster in a new namespace that shares the same subnet and
>>>>>> database,
>>>>>> then kill off the original three nodes, effectively migrating the
>>>>>> cluster to the new namespace and do all this without anyone being
>>>>>> logged
>>>>>> out. The namespace distinction is invisible to Keycloak, as far as
>>>>>> I can
>>>>>> tell.
>>>>>>
>>>>>> What we have tried:
>>>>>> * Start with 3 standalone-ha mode instances clustered with
>>>>>> JGroups/JDBC_PING.
>>>>>> * Set the number of cache owners for sessions to 6.
>>>>>> * Start the three new instances in the new Kubernetes namespace,
>>>>>> configured exactly the same as the first three - that is, same db,
>>>>>> same
>>>>>> number of cache owners.
>>>>>> * Kill the original three
>>>>>>
>>>>>> But it seems this caused offlineSession tokens to be expired
>>>>>> immediately.
>>>>>>
>>>>>> I found this in the online documentation
>>>>>> (http://www.keycloak.org/docs/latest/server_installation/index.html#server-cache-configuration):
>>>>>>
>>>>>>
>>>>>>
>>>>>>     > The second type of cache handles managing user sessions, offline
>>>>>> tokens, and keeping track of login failures... The data held in these
>>>>>> caches is temporary, in memory only, but is possibly replicated across
>>>>>> the cluster.
>>>>>>
>>>>>>     > The sessions, authenticationSessions, offlineSessions and
>>>>>> loginFailures caches are the only caches that may perform replication.
>>>>>> Entries are not replicated to every single node, but instead one or
>>>>>> more
>>>>>> nodes is chosen as an owner of that data. If a node is not the
>>>>>> owner of
>>>>>> a specific cache entry it queries the cluster to obtain it. What this
>>>>>> means for failover is that if all the nodes that own a piece of
>>>>>> data go
>>>>>> down, that data is lost forever. By default, Keycloak only
>>>>>> specifies one
>>>>>> owner for data. So if that one node goes down that data is lost. This
>>>>>> usually means that users will be logged out and will have to login
>>>>>> again.
>>>>>>
>>>>>> It appears, based on these documentation comments and our experience,
>>>>>> that the "source of truth" regarding offlineSessions is the data in
>>>>>> the "owner" caches, and NOT the database, as I would have expected. It
>>>>>> also seems to be the case that if a node joins the cluster (as defined
>>>>>> by JGroups/JDBC_PING), it will NOT be able to populate its
>>>>>> offlineSessions cache from the database, but must rely on replication
>>>>>> from one of the owner nodes.
>>>>>>
>>>>>> Questions:
>>>>>> 1. Is the above understanding regarding the db vs cache correct?
>>>>>> 2. If so, please explain the design/reasoning behind this behavior.
>>>>>> Otherwise, please correct my understanding.
>>>>>> 3. Is there a way to perform this simple migration without losing any
>>>>>> sessions?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> --Tonnis
>>>>>> _______________________________________________
>>>>>> keycloak-user mailing list
>>>>>> keycloak-user at lists.jboss.org
>>>>>> https://lists.jboss.org/mailman/listinfo/keycloak-user
>>>>>>
>>>>


