[keycloak-dev] Slight refactoring of InfinispanUserSessionProvider

Wed May 17 05:47:58 EDT 2017

On 16/05/17 17:04, Schuster Sebastian (INST/ESY1) wrote:
> Hi Marek,
>
> comment inline...
>
>> -----Original Message-----
>> From: Marek Posolda [mailto:mposolda at redhat.com]
>> Sent: Dienstag, 16. Mai 2017 10:41
>> To: Schuster Sebastian (INST/ESY1) <Sebastian.Schuster at bosch-si.com>;
>> keycloak-dev at lists.jboss.org
>> Subject: Re: [keycloak-dev] Slight refactoring of InfinispanUserSessionProvider
>>
>> On 16/05/17 09:38, Schuster Sebastian (INST/ESY1) wrote:
>>> Hi Marek,
>>>
>>> You mentioned in your proposed solution that " The changes may not be
>> guaranteed in same order in every DC". Wouldn't that be a problem as it leads to
>> inconsistent data in different data centers?
>> I was thinking about that and ATM I can't see the scenario when it is a problem with
>> respect to our business logic related to userSession model.
>>
>> For example it can happen that there is request1 for add client "client1" in dc1. And
>> at the same time request2 for add "client2" in dc2.  So it may happen that request1
>> will add "client1" to the userSession and send the message to dc2 for add "client1"
>> . At the same time request2 will add "client2" to the userSession in dc2 and send
>> the message to dc1 for add "client1" . After some time, both DCs will receive the
>> message from the other DC and both will update their client list to be ["client1",
>> "client2"] .
> Agreed, if having intermediate inconsistent states does not hurt anybody else, doing this
> in an eventually-consistent style should be fine. I found some indications the Infinispan
> guys want to add eventual consistency anyways, maybe it is worthwile to talk to them?
> See http://infinispan.org/docs/stable/glossary/glossary.html#basically_available_soft_state_eventually_consistent_base
> and https://issues.jboss.org/browse/ISPN-999 (and adore the issue number).
Just a note that we can be always consistent in "simple" cluster. The 
eventual inconsistency can happen just in the cross-datacenter 
environment. Doing everything synchronously can be a big challenge for 
the performance though as the network connection between data-centers 
can be slow and blocking user requests until the message is transferred 
to the other DC will be performance blocker.
>
>> Important is, that there won't be lost update (which can easily happen now).
>>
>> There are various corner cases (eg. DC will receive the message about update of
>> userSession, which was already logged-out/removed in this DC etc), but all of
>> those can be handled IMO.
> To be honest, I am a bit afraid of them. Concurrency related corner cases are like the end
> boss and really hard to get right, typically. :)
It won't be so easy, but it's doable. Hopefully :) For example when you 
receive the message for update the session, which was already removed 
locally, then you just ignore the message. Or updating 
"lastSessionRefresh" to the earlier time then the current value of 
session will be ignored too. etc.

Marek
>
> Best regards.
> Sebastian
>
>> One special thing is code-to-token request as it can happen very quickly after the
>> userSession is created. And it can be processed on different DC then the one
>> when userSession was created. So my current thinking is to not lookup
>> userSession in code-to-token request at all, but instead rely on the information in
>> the code, which will itself be JWT and will encapsulate all the informations to be
>> put into the token itself. I will likely send another email about that..
>>
>> I think that for the case when you want to ensure very consistent behaviour, you
>> will be able to configure the channel between datacenters to be SYNC instead of
>> ASYNC. However that will have performance penalty.
>>
>> Marek
>>> Best regards,
>>> Sebastian
>>>
>>> P.S.: If I recall correctly, Infinispan does not support a "serializable" isolation
>> level preventing lost updates, so even if Keycloak used the transaction API, it
>> would not help...
>>> Mit freundlichen Grüßen / Best regards
>>>
>>>    Sebastian Schuster
>>>
>>> Engineering and Support (INST/ESY1)
>>> Bosch Software Innovations GmbH | Schöneberger Ufer 89-91 | 10785
>>> Berlin | GERMANY | www.bosch-si.com Tel. +49 30 726112-485 | Fax +49
>>> 30 726112-100 | Sebastian.Schuster at bosch-si.com
>>>
>>> Sitz: Berlin, Registergericht: Amtsgericht Charlottenburg; HRB 148411
>>> B
>>> Geschäftsführung: Dr.-Ing. Rainer Kallenbach, Michael Hahn
>>>
>>>
>>> -----Original Message-----
>>> From: keycloak-dev-bounces at lists.jboss.org
>>> [mailto:keycloak-dev-bounces at lists.jboss.org] On Behalf Of Marek
>>> Posolda
>>> Sent: Montag, 15. Mai 2017 18:25
>>> To: keycloak-dev at lists.jboss.org
>>> Subject: [keycloak-dev] Slight refactoring of
>>> InfinispanUserSessionProvider
>>>
>>> In today's call with Stian and Hynek, we discussed the possible refactoring of
>> InfinispanUserSessionProvider to be reliable in cluster/cross-dc environment.
>>> Current implementation has tightly couple between the model
>>> (UserSessionAdapter) and underlying infinispan entity. For example when you
>> call "userSessionModel.setLastSessionRefresh(123)" it directly delegates this call
>> to infinispan entity and calls "entity.setLastSessionRefresh(123)" .
>>> Some issues with this approach:
>>>
>>> 1) Possibility of lost updates in cluster - It can happen that there
>>> are more concurrent requests updating same userSession. There is a fix
>>> by Bill [1], which adds the thread-safety to UserSessionEntity and
>>> fixes the issue in local environment. However in cluster with more
>>> nodes, it can still happen that there is "lost update". I've created
>>> the JIRA [2] for this with details. In shortcut, what can happen is,
>>> that there is
>>> request1 processed on node1 and request2 processed on node2 in parallel.
>>> Both requests read the sessionEntity and then both perform concurrent update
>> and commits their transaction. Then the latter update wins and overwrites the first
>> update as the state when it read the entity didn't yet contain the first update.
>>> 2) Rollback doesn't work correctly. Basically when we do this:
>>>
>>> {code}
>>> userSession.setLastSessionRefresh(123);
>>> session.transaction().rollback();
>>> {code}
>>>
>>> the keycloak transaction should be rolled-back and session in infinispan cache
>> shouldn't be updated to lastSessionRefresh 123. However in current
>> implementation, rollback doesn't work and entity is always updated in local
>> environment and may be updated in cluster too (in case that calling node is the
>> owner of session entity). That's because we are not using infinispan transaction
>> API and calling "entity.setLastSessionRefresh(123)" directly updates the entity
>> instance referenced by infinispan storage.
>>> 3) Cross-dc support with good performance
>>>
>>> Ideal is to fix all issues and ensure consistency, but have the same/similar
>> performance at the same time and not introduce additional uneccessary locking.
>>>
>>> Possible solution:
>>>
>>> - Every change to UserSessionModel won't delegate directly to infinispan
>> sessionEntity as it's now. Instead it will track the change through
>> changelogs/events, which are saved locally in our infinispan transaction (eg.
>> AddClientToSessionEvent, LastSessionRefreshUpdateEvent).
>>> - During Keycloak transaction commit are changelogs applied to sessionEntity. It
>> may use atomic operation "cache.replace(key, oldValue, newValue)" and the
>> pattern like this:
>>> {code}
>>> boolean replaced = false;
>>>
>>> while (!replaced) {
>>>        SessionEntity oldSession = cache.get("123");
>>>        // Will update session entity based on the changes tracked in infinispan
>> transaction
>>>        SessionEntity updatedSession = applyChanges(oldSession);
>>>        replaced = cache.replace("123", oldSession, updatedSession); }
>>> {code}
>>>
>>> - In cross-dc environment are changes sent to another datacenter. That will
>> receive changes (eg. AddClientToSessionEvent) and apply changes locally in the
>> datacenter and update the sessionEntity in the infinispan. It shouldn't be a problem
>> that those events are sent between datacenters through the asynchronous channel.
>> The changes may not be guaranteed in same order in every DC, but all of them will
>> be triggered and there won't be lost updates. So the cache "sessions" won't be
>> directly replicated through the cross-dc, but instead the changes will be sent
>> between datacenters.
>>> What do you think? Can you see some issues with this approach?
>>>
>>> Thanks,
>>> Marek
>>>
>>> [1] https://issues.jboss.org/browse/KEYCLOAK-4821
>>> [2] https://issues.jboss.org/browse/KEYCLOAK-4898
>>>
>>> _______________________________________________
>>> keycloak-dev mailing list
>>> keycloak-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/keycloak-dev