Martin Choma closed WFWIP-207.
------------------------------
Verified in
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/jboss-eap-7-tech-preview/eap-operator:jb-eap-7.3-operator-rhel8-containers-candidate-33923-20190926082049
UX: Force removal of Operator upon delete - do not hang due to finalizers
-------------------------------------------------------------------------
Key: WFWIP-207
URL: https://issues.jboss.org/browse/WFWIP-207
Project: WildFly WIP
Issue Type: Bug
Components: OpenShift
Reporter: Petr Kremensky
Assignee: Ondrej Chaloupka
Priority: Blocker
Labels: operator
We ran into yet another use case where finalizers prevent users from deleting the project:
the delete operation hangs.
pods:
{noformat}
$ oc get all
NAME                                    READY   STATUS             RESTARTS   AGE
pod/simple-jaxrs-operator-0             0/1     ImagePullBackOff   0          9m11s
pod/simple-jaxrs-operator-1             0/1     ImagePullBackOff   0          9m11s
pod/wildfly-operator-686846d6fb-db9sj   1/1     Running
$ oc delete wildflyserver simple-jaxrs-operator
wildflyserver.wildfly.org "simple-jaxrs-operator" deleted
... hangs forever
{noformat}
operator log:
{noformat}
{"level":"info","ts":1569308322.2926116,"logger":"controller_wildflyserver","msg":"Reconciling WildFlyServer","Request.Namespace":"pkremens-namespace","Request.Name":"simple-jaxrs-operator"}
{"level":"info","ts":1569308322.2927597,"logger":"controller_wildflyserver","msg":"WildflyServer is marked for deletion. Waiting for finalizers to clean the workspace","Request.Namespace":"pkremens-namespace","Request.Name":"simple-jaxrs-operator"}
{"level":"info","ts":1569308322.2929516,"logger":"controller_wildflyserver","msg":"Transaction recovery scaledown processing","Request.Namespace":"pkremens-namespace","Request.Name":"simple-jaxrs-operator","Pod Name":"simple-jaxrs-operator-0","IP Address":"10.128.0.227","Pod State":"SCALING_DOWN_RECOVERY_INVESTIGATION","Pod Phase":"Pending"}
{"level":"info","ts":1569308322.2931426,"logger":"controller_wildflyserver","msg":"Transaction recovery scaledown processing","Request.Namespace":"pkremens-namespace","Request.Name":"simple-jaxrs-operator","Pod Name":"simple-jaxrs-operator-1","IP Address":"10.128.0.226","Pod State":"SCALING_DOWN_RECOVERY_INVESTIGATION","Pod Phase":"Pending"}
{"level":"error","ts":1569308322.294659,"logger":"kubebuilder.controller","msg":"Reconciler error","controller":"wildflyserver-controller","request":"pkremens-namespace/simple-jaxrs-operator","error":"Finalizer processing: failed transaction recovery for WildflyServer pkremens-namespace:simple-jaxrs-operator name Error: Found 2 errors:\n [[Pod 'simple-jaxrs-operator-0' / 'simple-jaxrs-operator' is in pending phase Pending. It will be hopefully started in a while. Transaction recovery needs the pod being fully started to be capable to mark it as clean for the scale down.]], [[Pod 'simple-jaxrs-operator-1' / 'simple-jaxrs-operator' is in pending phase Pending. It will be hopefully started in a while. Transaction recovery needs the pod being fully started to be capable to mark it as clean for the scale down.]],","stacktrace":"github.com/go-logr/zapr.(*zap...
{noformat}
This is a trade-off between safety and usability, but we believe that these issues (the
delete command hanging due to EAP7-1192) could be a serious usability problem for users.
*actual*
* scale down can require manual user interaction forced by finalizers
* delete can hang, requiring manual user interaction (delete deployment object, remove
finalizer from operator CR, run delete again)
*expected*
* scale down can require manual user interaction forced by finalizers
* delete should never hang; it should be treated like pulling the plug (rm -rf). Users who
need a graceful shutdown should do a proper scale down to 0 prior to the project
deletion - this should be properly documented
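For reference, the two paths discussed above (graceful scale down to 0 first, or unblocking a hanging delete by removing the finalizer from the CR) could be scripted roughly as follows. This is only a sketch, not the operator's documented procedure. It assumes the WildFlyServer CR exposes its replica count as spec.replicas, the function names are made up for illustration, and it skips the "delete deployment object" step because this report does not name the object. It also requests the delete before clearing the finalizers, so that the operator cannot add the finalizer back in between.

```shell
#!/bin/sh
# Sketch only. OC defaults to the real client; set OC=echo to preview the commands.
OC="${OC:-oc}"

# Graceful path: scale the WildFlyServer down to 0 so the finalizers can finish
# transaction recovery before anything is deleted.
# Assumption: the CR exposes the replica count as spec.replicas.
scale_down_to_zero() {
  $OC patch wildflyserver "$1" --type=merge -p '{"spec":{"replicas":0}}'
}

# Forced path ("pulling the plug"): request the delete without waiting, then
# clear the finalizers so the pending delete can complete.
# Any unfinished transactions on the pods are abandoned.
force_delete() {
  $OC delete wildflyserver "$1" --wait=false
  $OC patch wildflyserver "$1" --type=merge -p '{"metadata":{"finalizers":null}}'
}
```

With the resource from this report, the forced path would be `force_delete simple-jaxrs-operator`; running the script with `OC=echo` only prints the commands instead of executing them.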