Hello Alistair,
On Fri, Oct 4, 2019 at 4:20 PM Doswald Alistair <alistair.doswald(a)elca.ch>
wrote:
Hello,
We're running into some serious errors when running Keycloak on a
multi-site cluster with MariaDB as our multi-master database. We have a
setup similar to
https://www.keycloak.org/docs/latest/server_installation/index.html#cross...,
with Keycloak 7.0.0 and MariaDB 10.1.37. Each site writes to its own
database cluster, and we expected MariaDB to handle the replication
and transactions correctly.
It works well, until we get the following types of errors on the database,
and then everything crashes:
2019-10-03 14:09:46 140205469263616 [ERROR] Slave SQL: Could not execute
Delete_rows_v1 event on table cloudtrust-int-keycloak.EVENT_ENTITY; Can't
find record in 'EVENT_ENTITY', Error_code: 1032; handler error
HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 883,
Internal MariaDB error code: 1032
See MDEV-15405 <https://jira.mariadb.org/browse/MDEV-15405>. Could you
retry with MariaDB 10.3.5+ and check whether the issue is still there?
If the MariaDB upgrade doesn't help, I would retry with "showSql" enabled
(start Keycloak with "-Dkeycloak.connectionsJpa.showSql=true"),
reproduce the issue, and try to isolate the SQL statement or set of SQL
statements that leads to this state. After repeating the scenario / crash
a couple of times, it may be possible to identify that set.
Once that set of SQL statements is identified, the question is whether:
- this is another MariaDB bug (hitting the same error message & error
code) triggered by those SQL statements (thus something to be fixed on
the MariaDB side), or
- this is a serialization issue of some kind (i.e. it happens only
sometimes, because the SQL slave failed to ...), in which case the
circumstances would need to be identified.
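To narrow down the statements once "showSql" is enabled, something along these lines could help. This is only a rough sketch: it assumes the logged SQL text appears on a single line together with the table name, which depends on your logging setup. The function name statements_touching and its parameters are made up for illustration; the table name EVENT_ENTITY and the delete operation are taken from the error in the log.

```python
import re
from typing import Iterable, List


def statements_touching(
    lines: Iterable[str],
    table: str = "EVENT_ENTITY",
    verb: str = "delete",
) -> List[str]:
    """Return the log lines in which `verb` is followed by `table`.

    Assumption: each SQL statement is logged on one line. Matching is
    case-insensitive, so "DELETE FROM event_entity" is found as well.
    """
    pattern = re.compile(
        rf"\b{re.escape(verb)}\b.*\b{re.escape(table)}\b",
        re.IGNORECASE,
    )
    return [line.rstrip("\n") for line in lines if pattern.search(line)]
```

It could be used by opening the server log (for example a hypothetical path such as "server.log") and passing the file object as `lines`. Comparing the hits from a few crash runs may show which delete on EVENT_ENTITY precedes the failure.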
2019-10-03 14:09:46 140205469263616 [Warning] WSREP: RBR event 2
Delete_rows_v1 apply warning: 120, 591931
2019-10-03 14:09:46 140205469263616 [Warning] WSREP: Failed to apply app
buffer: seqno: 591931, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 4th time
2019-10-03 14:09:46 140205469263616 [ERROR] Slave SQL: Could not execute
Delete_rows_v1 event on table cloudtrust-int-keycloak.EVENT_ENTITY; Can't
find record in 'EVENT_ENTITY', Error_code: 1032; handler error
HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 883,
Internal MariaDB error code: 1032
2019-10-03 14:09:46 140205469263616 [Warning] WSREP: RBR event 2
Delete_rows_v1 apply warning: 120, 591931
2019-10-03 14:09:46 140205469263616 [ERROR] WSREP: Failed to apply trx:
source: 4f98589f-e5bd-11e9-9eb9-12b92fd5aeef version: 3 local: 0 state:
APPLYING flags: 1 conn_id: 395 trx_id: 991166 seqnos (l: 18625, g: 591931,
s: 591930, d: 584704, ts: 31567167461519)
2019-10-03 14:09:46 140205469263616 [ERROR] WSREP: Failed to apply trx
591931 4 times
2019-10-03 14:09:46 140205469263616 [ERROR] WSREP: Node consistency
compromized, aborting...
.....................
From our analysis, it seems that a transaction could not be
replayed, which caused the database to shut down to protect consistency.
Were you able to identify in which part of the code this transaction
deadlock happens, and after which action / steps? Or is it enough to
start Keycloak with that setup, and it happens every time after a while?
Did you try different Keycloak / MariaDB versions?
This seems to happen with race conditions from multiple writes. Looking
into it, we found the following passage in
https://galeracluster.com/library/kb/trouble/multi-master-conflicts.html:
"When two transactions are conflicting, the later of the two
is rolled back by the cluster. The client application registers this
rollback as a deadlock error. Ideally, the client application should retry
the deadlocked transaction. However, not all client applications have this
logic built in."
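For illustration, the client-side retry that the quoted passage describes could look roughly like the sketch below. It is a generic pattern, not Keycloak code. DatabaseError and run_with_retry are hypothetical names standing in for whatever exception and helper your driver or application provides. The one fixed point is that MariaDB reports such a rollback as error 1213 (ER_LOCK_DEADLOCK, SQLSTATE 40001).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# MariaDB reports a deadlock / cluster conflict rollback to the client
# as error 1213 (ER_LOCK_DEADLOCK, SQLSTATE 40001).
DEADLOCK_ERRNO = 1213


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception with a server error code."""

    def __init__(self, errno: int, message: str = "") -> None:
        super().__init__(message)
        self.errno = errno


def run_with_retry(
    transaction: Callable[[], T],
    attempts: int = 3,
    backoff_seconds: float = 0.05,
) -> T:
    """Run `transaction`, retrying only when it fails with a deadlock error.

    `transaction` is assumed to start, execute, and commit one complete
    unit of work, so that calling it again is safe after a rollback.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            # Re-raise anything that is not a deadlock, or the last failure.
            if exc.errno != DEADLOCK_ERRNO or attempt == attempts:
                raise
            # Simple linear backoff before the next attempt.
            time.sleep(backoff_seconds * attempt)
    raise AssertionError("unreachable")
```

Note that this only covers rollbacks reported to the client. The error 1032 in the log above is raised while a node applies a replicated transaction, so a client retry on its own may not address it.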
Does anyone else have a similar setup? If yes, have you encountered this
problem? Is there a known resolution?
Best regards,
Alistair Doswald
Thank you && Regards, Jan
--
Jan iankko Lieskovsky / Keycloak / RH-SSO Team
_______________________________________________
keycloak-user mailing list
keycloak-user(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/keycloak-user