Hello Keycloak developers,
The current Keycloak server docker image comes with some default
configuration (listed below) for the supported databases (mysql, postgres,
mariadb), which work quite well so far.
However, in HA database production environments, we encountered some issues
with the default configuration. We were able to mitigate those issues with
the following general, and in our case PostgreSQL specific, settings.
I'd like to propose adding those settings to the "change-database.cli"
scripts of the default Keycloak docker images.
We faced the following problems:
1. Connections are not always validated before use.
If a database node is not available or has any issues, the connections in
the connection-pool don't reflect this immediately, although the
`check-valid-connection-sql` is set.
2. No timeout configured for database queries.
Some database queries in Keycloak could run quite slowly (due to
performance bugs, inefficient queries etc.). Sometimes this freezes the
admin-console and if the operation is retried in different tabs, could use
up all connections from the pool.
Eventually, the query runs into a timeout, but only after a driver specific
amount of time.
3. Creation of new connections could introduce an unnecessary long delay
(1-5s.
We managed to mitigate those issues by using the following additional
settings in production - and haven't seen any problems with database
connections since.
This mitigates problem 1: This ensures that a connection is "really" tested
before use.
# Configure datasource to connection before use
/subsystem=datasources/data-source=KeycloakDS/:write-attribute(name=validate-on-match,value=${env.DB_VALIDATE_ON_MATCH:true})
This mitigates problem 2: since we can now explicitly control how long a
query can run at max. --> better transparency.
# Configure datasource to use explicit query timeout in seconds
/subsystem=datasources/data-source=KeycloakDS/:write-attribute(name=query-timeout,value=${env.DB_QUERY_TIMEOUT:60})
This mitigates problem 3:
To reduce the time for connection reuse we also use the following setting
to use the next valid connection first, instead of immediately trying to
create a new one.
# Configure datasource to try all other connections before failing
/subsystem=datasources/data-source=KeycloakDS/:write-attribute(name=use-fast-fail,value=${env.DB_USE_CAST_FAIL:false})
In combination with PostgreSQL we also configured the following, which
helped to make the database error handling more robust:
/subsystem=datasources/data-source=KeycloakDS/:write-attribute(name=valid-connection-checker-class-name,value=org.jboss.jca.adapters.jdbc.extensions.postgres.PostgreSQLValidConnectionChecker)
/subsystem=datasources/data-source=KeycloakDS/:write-attribute(name=exception-sorter-class-name,value=org.jboss.jca.adapters.jdbc.extensions.postgres.PostgreSQLExceptionSorter)
This improves the PostgreSQL database error handling in Keycloak.
See:
https://access.redhat.com/documentation/en-us/red_hat_jboss_enterprise_ap...
Note that this document contains useful configurations for other database
vendors as well.
For reference, this is the current postgres change-database.cli script:
https://github.com/jboss-dockerfiles/keycloak/blob/master/server/tools/cl...
```
/subsystem=datasources/data-source=KeycloakDS: remove()
/subsystem=datasources/data-source=KeycloakDS:
add(jndi-name=java:jboss/datasources/KeycloakDS,enabled=true,use-java-context=true,use-ccm=true,
connection-url=jdbc:postgresql://${env.DB_ADDR:postgres}:${env.DB_PORT:5432}/${env.DB_DATABASE:keycloak}${env.JDBC_PARAMS:},
driver-name=postgresql)
/subsystem=datasources/data-source=KeycloakDS:
write-attribute(name=user-name, value=${env.DB_USER:keycloak})
/subsystem=datasources/data-source=KeycloakDS:
write-attribute(name=password, value=${env.DB_PASSWORD:password})
/subsystem=datasources/data-source=KeycloakDS:
write-attribute(name=check-valid-connection-sql, value="SELECT 1")
/subsystem=datasources/data-source=KeycloakDS:
write-attribute(name=background-validation, value=true)
/subsystem=datasources/data-source=KeycloakDS:
write-attribute(name=background-validation-millis, value=60000)
/subsystem=datasources/data-source=KeycloakDS:
write-attribute(name=flush-strategy, value=IdleConnections)
/subsystem=datasources/jdbc-driver=postgresql:add(driver-name=postgresql,
driver-module-name=org.postgresql.jdbc,
driver-xa-datasource-class-name=org.postgresql.xa.PGXADataSource)
```
What do you think about adding the proposed settings?
Cheers,
Thomas