On Tue, Jun 21, 2016 at 2:21 PM, John Dennis <jdennis@redhat.com> wrote:

On 06/21/2016 05:27 AM, Marko Strukelj wrote:

On Tue, Jun 21, 2016 at 2:07 AM, John Dennis <jdennis@redhat.com> wrote:

On 06/20/2016 06:13 AM, Marko Strukelj wrote:

The first error means that there are existing tables in the local H2
database (under standalone/data there are keycloak.* files).

It looks like the logic determined that they belong to some previous db
schema version and tried to upgrade the schema to the latest version, but
the schema in place unexpectedly already contains tables it wasn't
supposed to contain.

I suppose that could happen if the upgrade process is interrupted by
restarting the server?

Since you are using the default H2 database I assume you don't care
about any existing data. The solution for you then is to stop the
server, delete the database (rm standalone/data/keycloak.*), and start
the server again.

Thank you Marko, I've got a few more questions for you.

These errors occur during automated installation and configuration
via ansible.

One of the operations performed is invoking bin/add-user-keycloak to
add the admin user. I seem to recall add-user-keycloak operates on
static files which are read during start up. Could the use of
add-user-keycloak trigger the schema errors seen in the log?

No.

This is a brand new install so why would there be an upgrade process
running?

One possibility would be that the packaged installation already contains
standalone/data/keycloak.* files, which it shouldn't - maybe it's a
custom packaging you put together?
Another possibility is that the error does not happen on the first start,
but on a subsequent start, after the server is forcefully restarted.

The ansible scripts do restart the server. Starting the server is
done via bin/standalone.sh but stopping the server is performed by
systemd sending a SIGTERM, waiting and then sending a SIGKILL (or so
I believe). Does the upgrade process gracefully handle SIGTERM such
that it continues to run until complete and then exit?

This is the most likely culprit. I don't think the upgrade procedure will
complete properly once the java process is told to shut down, no matter
what signal is used.
The shutdown should only be performed after the server has reached the
started state.

That's when you see an entry in the log similar to:

11:26:41,410 INFO [org.jboss.as] (Controller Boot
Thread) WFLYSRV0025: Keycloak 1.9.7.Final (WildFly Core 2.0.10.Final)
started in 10219ms - Started 416 of 782 services (526 services are lazy,
passive or on-demand)

From reading the log (I've attached a copy from another run) I think the server is stopping before it has properly initialized. I'm not familiar with all your log messages, but I base this conclusion on the fact that "stopping" messages appear in the log just after the database errors, followed by this "stop" message:

[org.jboss.as] (MSC service thread 1-4) WFLYSRV0050: rh-sso 7.0.0.GA (WildFly Core 2.1.2.Final-redhat-1) stopped in 2100ms

The "start" message

[org.jboss.modules] (main) JBoss Modules version 1.5.1.Final-redhat-1

occurs next followed by more database error messages and then finally the "ready to run" message you cite:

[org.jboss.as] (Controller Boot Thread) WFLYSRV0026: rh-sso 7.0.0.GA (WildFly Core 2.1.2.Final-redhat-1) started (with errors) in 12451ms - Started 473 of 839 services (2 services failed or missing dependencies, 583 services are lazy, passive or on-demand)
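To make the sequence above checkable from a script, here is a minimal Python sketch that scans log lines for the three message IDs quoted in this thread (WFLYSRV0025, WFLYSRV0026, WFLYSRV0050) and reports the state implied by the last one seen. The function name and the returned state strings are made up for illustration; the meaning attached to each ID is only what the quoted log lines themselves show.

```python
import re

# Message IDs copied from the log lines quoted in this thread.
# The state names on the right are illustrative labels, not Keycloak terms.
STATES = {
    "WFLYSRV0025": "started",              # "... started in 10219ms"
    "WFLYSRV0026": "started-with-errors",  # "... started (with errors) in ..."
    "WFLYSRV0050": "stopped",              # "... stopped in 2100ms"
}

_ID = re.compile(r"\b(" + "|".join(STATES) + r")\b")


def boot_state(log_lines):
    """Return the state implied by the last start/stop message, or None."""
    state = None
    for line in log_lines:
        match = _ID.search(line)
        if match:
            state = STATES[match.group(1)]
    return state
```

A restart step could refuse to stop the server until boot_state() returns "started", and treat "started-with-errors" as a failed install.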

Question:

Does the Keycloak team have a working script to start and stop the service on RHEL? When we first started working with Keycloak we were told no, and that we would need to cobble together something by calling the bin/standalone.sh script.

The "wait for full initialization" problem is not new to us with daemons. It's come up a number of times with IPA and other daemons we work with. The way we've dealt with it is to have the service scripts that start and stop services write to one of the primary sockets and conclude the service is in fact up only when they get a valid response back (handling timeouts of course). Systemd came along later and might have some support for socket detection; I'll investigate that option.
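For illustration, the polling approach described above could look roughly like this in Python. This is only a sketch: the function name, the default timeout and the polling interval are arbitrary choices, and a successful TCP connect only proves the listener is bound. A real check would also send a request and validate the response.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll until a TCP connect to (host, port) succeeds or timeout expires.

    Returns True once a connection is accepted, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Connection succeeded: something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: the service is not up yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A start script would call something like wait_for_port("localhost", 8080) after launching bin/standalone.sh and only report success when it returns True; the host and port here are assumptions and depend on how the server is configured.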

We really need a script that can start and stop the service reliably without errors.

--
John