[keycloak-dev] Support concurrent startup by more cluster nodes

Marek Posolda mposolda at redhat.com
Tue Mar 8 03:51:28 EST 2016


On 08/03/16 09:41, Stian Thorgersen wrote:
> I actually think the chance of someone killing it during upgrade is 
> relatively high. It could be that they forgot to include the bind address or 
> used the wrong server config. It could be that migration takes longer than 
> they expect. We shouldn't require users to manually unlock.
>
> The lock should be done in association with the transaction. JPA 
> provides pessimistic locks so you can do:
>
> DatabaseLockEntity lock = em.find(DatabaseLockEntity.class, "lock", 
> LockModeType.PESSIMISTIC_WRITE);
Ok, I will take a look at this possibility and check whether it works 
reliably with all databases. It will take some time though...
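
Just to be sure we're talking about the same approach, a minimal sketch of how 
it could look (a sketch only, assuming a hypothetical DatabaseLockEntity mapped 
to a table with a single row whose ID is "lock"):

    import javax.persistence.*;

    @Entity
    class DatabaseLockEntity {
        @Id
        String id;   // single row with id "lock"
    }

    class StartupLockSketch {

        // Runs the startup work (migration, realm import, ...) while holding a
        // pessimistic row lock, so only one node can do it at a time.
        static void runWithLock(EntityManagerFactory emf, Runnable startupWork) {
            EntityManager em = emf.createEntityManager();
            try {
                EntityTransaction tx = em.getTransaction();
                tx.begin();
                try {
                    // Blocks until no other transaction holds the row lock
                    DatabaseLockEntity lock = em.find(DatabaseLockEntity.class, "lock",
                            LockModeType.PESSIMISTIC_WRITE);
                    startupWork.run();   // runs inside the same transaction
                    tx.commit();         // commit releases the lock
                } catch (RuntimeException e) {
                    if (tx.isActive()) {
                        tx.rollback();   // rollback also releases the lock; nothing committed
                    }
                    throw e;
                }
            } finally {
                em.close();
            }
        }
    }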

Marek
>
> That will work for all databases (except Mongo of course). If the 
> process dies, the transaction will time out and it's safe to run again 
> at that point because no changes would have been committed to the DB.
>
>
>
> On 8 March 2016 at 09:22, Marek Posolda <mposolda at redhat.com> wrote:
>
>     On 08/03/16 06:48, Stian Thorgersen wrote:
>>     What about obtaining a database lock on a table/column? That
>>     would automatically be freed if the transaction dies.
>     You mean something like "give me a lock on table XY until the end of
>     the transaction"? I doubt there is a universal solution for
>     something like this that will reliably work with all the databases
>     we need to support :/ Otherwise I guess Liquibase would
>     already be using it too?
>
>     Currently it works so that the lock is obtained by updating a
>     column in the database, something like "UPDATE
>     DATABASECHANGELOGLOCK SET LOCKED=true WHERE ID=1".
>     Note there is always a single record in this table with ID=1.
>     Something similar is done for Mongo too.
>
>     The lock is released in a "finally" block if something fails. The
>     only way the DB can remain locked is if someone forcefully
>     kills the process (e.g. "kill -9", in which case finally blocks are
>     not called) or if the network connection between the server and the DB is
>     lost. The chance of this is very low IMO, and we have the option to
>     manually recover from it.
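>
>     For illustration, the current approach looks roughly like this (a
>     simplified sketch, not the actual code - it assumes an open
>     java.sql.Connection and the DATABASECHANGELOGLOCK table mentioned above):
>
>         boolean acquired = false;
>         try {
>             // Try to flip the single lock row; succeeds only if nobody else holds it
>             try (PreparedStatement ps = connection.prepareStatement(
>                     "UPDATE DATABASECHANGELOGLOCK SET LOCKED = TRUE WHERE ID = 1 AND LOCKED = FALSE")) {
>                 acquired = (ps.executeUpdate() == 1);
>             }
>             if (acquired) {
>                 // ... run migration / realm import ...
>             }
>         } finally {
>             if (acquired) {
>                 // Released here even on failure - but not if the JVM is killed with "kill -9"
>                 try (PreparedStatement ps = connection.prepareStatement(
>                         "UPDATE DATABASECHANGELOGLOCK SET LOCKED = FALSE WHERE ID = 1")) {
>                     ps.executeUpdate();
>                 }
>             }
>         }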
>
>     Marek
>
>>
>>     -1 To having a timeout, I agree it's dangerous and could leave
>>     the DB inconsistent so we shouldn't do it
>>
>>     On 7 March 2016 at 21:59, Marek Posolda <mposolda at redhat.com> wrote:
>>
>>         Then the record in the DB will remain locked and needs to be
>>         fixed manually. This is actually the same behaviour as Liquibase.
>>         The possibilities to recover from this state are:
>>         - Run Keycloak with the system property
>>         "-Dkeycloak.dblock.forceUnlock=true" (see the example command
>>         after this list). Keycloak will then release the existing lock
>>         at startup and acquire a new lock. A warning is written to
>>         server.log that this property should be used carefully, just
>>         to repair the DB.
>>         - Manually delete the lock record from the DATABASECHANGELOGLOCK
>>         table (or the "dblock" collection in Mongo).
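>>
>>         For example, when starting with the standard standalone.sh script
>>         (the exact command depends on how you launch the server), the
>>         property would be passed like this:
>>
>>             bin/standalone.sh -Dkeycloak.dblock.forceUnlock=true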
>>
>>         The other possibility is that after the timeout, node2 will
>>         assume the current lock has timed out and will forcefully
>>         release the existing lock and replace it with its own lock. However,
>>         I didn't do it this way as it's potentially dangerous -
>>         there is some chance that 2 nodes run migration or import at
>>         the same time and the DB will end up in an inconsistent state. Or is
>>         that an acceptable risk?
>>
>>         Marek
>>
>>
>>
>>         On 07/03/16 19:50, Stian Thorgersen wrote:
>>>         900 seconds is probably ok, but what happens if the node
>>>         holding the lock dies?
>>>
>>>         On 7 March 2016 at 11:03, Marek Posolda <mposolda at redhat.com> wrote:
>>>
>>>             Sent a PR with added support for $subject:
>>>             https://github.com/keycloak/keycloak/pull/2332 .
>>>
>>>             Few details:
>>>             A few details:
>>>             - Added DBLockProvider, which handles acquiring and
>>>             releasing the DB lock.
>>>             When node1 holds the lock, cluster node2 needs to wait
>>>             until node1 releases it.
>>>
>>>             - The lock is acquired at startup for migrating the
>>>             model (both model-specific and generic migration),
>>>             importing realms and adding the initial admin user. So
>>>             these steps are always done by just one node at a time
>>>             (see the sketch after this list).
>>>
>>>             - The lock is implemented at the DB level, so it works even
>>>             if the infinispan cluster is not correctly configured. For
>>>             JPA, I've added an implementation that reuses the Liquibase
>>>             DB locking, with a fix for the bug that prevented the
>>>             builtin Liquibase lock from working correctly. I've added
>>>             an implementation for Mongo too.
>>>
>>>             - Added DBLockTest, which simulates 20 threads racing
>>>             to acquire the lock concurrently. It passes with all
>>>             databases.
>>>
>>>             - The default timeout for acquiring the lock is 900 seconds
>>>             and the interval for rechecking the lock is 2 seconds. So if
>>>             node2 is not able to acquire the lock within 900 seconds, it
>>>             fails to start. It's possible to change this in
>>>             keycloak-server.json. Is 900 seconds too much? I was thinking
>>>             about the case when there is some large realm file being
>>>             imported at startup.
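>>>
>>>             At startup, the usage then looks roughly like this (a
>>>             simplified sketch only - the helper method names are just
>>>             illustrative, the real code is in the PR):
>>>
>>>                 DBLockProvider dbLock = session.getProvider(DBLockProvider.class);
>>>                 dbLock.waitForLock();      // blocks until the lock is free, up to the configured timeout
>>>                 try {
>>>                     migrateModel();        // model-specific and generic migration
>>>                     importRealms();        // realms imported at startup
>>>                     addInitialAdminUser();
>>>                 } finally {
>>>                     dbLock.releaseLock();  // released even if a startup task fails
>>>                 }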
>>>
>>>             Marek
>>>
>>>
>>
>>
>
>
