Then the record in the DB will remain locked and needs to be fixed manually. This is actually the same behaviour as Liquibase. The options for repairing this state are:
- Run Keycloak with the system property "-Dkeycloak.dblock.forceUnlock=true". Keycloak will then release the existing lock at startup and acquire a new lock. A warning is written to server.log that this property should be used carefully, just to repair the DB.
- Manually delete the lock record from the DATABASECHANGELOGLOCK table (or the "dblock" collection in Mongo).
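
To make the first option concrete, here is a minimal sketch of the startup logic. It is not the code from the PR: InMemoryLockRecord, acquireAtStartup and the node names are hypothetical stand-ins for the real lock record and provider, and only the property name keycloak.dblock.forceUnlock comes from the description above.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ForceUnlockSketch {

    /** Hypothetical in-memory stand-in for the single lock record in the DB. */
    public static final class InMemoryLockRecord {
        private final AtomicReference<String> holder = new AtomicReference<>();

        /** Takes the lock only if nobody holds it. */
        public boolean tryAcquire(String node) {
            return holder.compareAndSet(null, node);
        }

        /** Clears the lock regardless of who holds it. */
        public void forceRelease() {
            holder.set(null);
        }

        public String holder() {
            return holder.get();
        }
    }

    /**
     * Startup step (assumed shape): optionally clear a stale lock, then try
     * to take it. Returns true if this node now holds the lock.
     */
    public static boolean acquireAtStartup(InMemoryLockRecord record, String node, boolean forceUnlock) {
        if (forceUnlock && record.holder() != null) {
            System.err.println("WARN: forcing release of lock held by " + record.holder()
                    + "; use this only to repair the DB");
            record.forceRelease();
        }
        return record.tryAcquire(node);
    }

    public static void main(String[] args) {
        InMemoryLockRecord record = new InMemoryLockRecord();
        record.tryAcquire("node1"); // node1 takes the lock, then "dies" without releasing it

        // Boolean.getBoolean reads the system property set with -Dkeycloak.dblock.forceUnlock=true
        boolean forceUnlock = Boolean.getBoolean("keycloak.dblock.forceUnlock");
        boolean acquired = acquireAtStartup(record, "node2", forceUnlock);
        System.out.println("forceUnlock=" + forceUnlock + ", node2 acquired=" + acquired);
    }
}
```

Run without the property, node2 does not get the lock; run with -Dkeycloak.dblock.forceUnlock=true, the stale lock is cleared first and node2 acquires it.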

The other possibility is that after the timeout, node2 assumes the current lock has timed out, forcefully releases the existing lock and replaces it with its own lock. However, I didn't do it this way as it's potentially dangerous: there is some chance that 2 nodes run the migration or import at the same time and the DB ends up in an inconsistent state. Or is that an acceptable risk?

Marek


On 07/03/16 19:50, Stian Thorgersen wrote:
900 seconds is probably ok, but what happens if the node holding the lock dies?

On 7 March 2016 at 11:03, Marek Posolda <mposolda@redhat.com> wrote:
Sent a PR with added support for $subject:
https://github.com/keycloak/keycloak/pull/2332

A few details:
- Added DBLockProvider, which handles acquiring and releasing the DB lock.
While node1 holds the lock, cluster node2 needs to wait until node1
releases it.

- The lock is acquired at startup for migrating the model (both
model-specific and generic migration), importing realms and adding the
initial admin user. So these are always done by just one node at a time.

- The lock is implemented at the DB level, so it works even if the
infinispan cluster is not correctly configured. For JPA, I've added an
implementation that reuses the Liquibase DB locking, with a fix for the
bug that prevented the built-in Liquibase lock from working correctly.
I've added an implementation for Mongo too.

- Added DBLockTest, which simulates 20 threads racing to acquire the lock
concurrently. It passes with all databases.

- The default timeout for acquiring the lock is 900 seconds and the lock
recheck interval is 2 seconds. So if node2 is not able to acquire the
lock within 900 seconds, it fails to start. Both values can be changed in
keycloak-server.json. Is 900 seconds too much? I was thinking about the
case where some large realm file is being imported at startup.
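
The wait-and-recheck behaviour and the kind of race that DBLockTest exercises can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the PR: FakeDbLock, waitForLock and race are hypothetical names, the lock lives in memory rather than in a database, and main uses short timings instead of the 900 second / 2 second defaults.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class LockWaitSketch {

    /** Hypothetical in-memory stand-in for the lock record in the DB. */
    public static final class FakeDbLock {
        private final AtomicBoolean locked = new AtomicBoolean(false);

        public boolean tryLock() {
            return locked.compareAndSet(false, true);
        }

        public void unlock() {
            locked.set(false);
        }
    }

    /**
     * Polls the lock every recheckMillis until it is acquired or
     * timeoutMillis has elapsed. Returns false on timeout, which in the
     * scenario above would mean the node fails to start.
     */
    public static boolean waitForLock(FakeDbLock lock, long timeoutMillis, long recheckMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (lock.tryLock()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(recheckMillis);
        }
    }

    /**
     * Lets the given number of threads race for one lock and returns the
     * largest number of threads seen inside the critical section at once.
     */
    public static int race(int threads, long timeoutMillis, long recheckMillis)
            throws InterruptedException {
        FakeDbLock lock = new FakeDbLock();
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxInside = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(threads);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    if (waitForLock(lock, timeoutMillis, recheckMillis)) {
                        try {
                            // Stands in for migration / import / admin user creation.
                            maxInside.accumulateAndGet(inside.incrementAndGet(), Math::max);
                            Thread.sleep(1);
                            inside.decrementAndGet();
                        } finally {
                            lock.unlock();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();
        pool.shutdown();
        return maxInside.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 20 threads as in the test description; timings shortened for the demo.
        int max = race(20, 10_000, 2);
        System.out.println("max threads holding the lock at once: " + max);
    }
}
```

With a correct lock, the reported maximum stays at one however the threads interleave. A node that never gets the lock within the timeout simply gets false back, which corresponds to the startup failure described above.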

Marek
_______________________________________________
keycloak-dev mailing list
keycloak-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/keycloak-dev