[jboss-jira] [JBoss JIRA] (WFWIP-205) tx recovery intermittently fails after jvm crash

Wed Sep 18 03:57:00 EDT 2019

    [ https://issues.jboss.org/browse/WFWIP-205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785737#comment-13785737 ] 

Ondrej Chaloupka commented on WFWIP-205:
----------------------------------------

[~simkam] I see. There may be a testsuite misconfiguration that I forgot about (my fault). With help from [~tomekadamski] I tracked down that it could help to change the {{standalone-openshift.xml}} configuration to use the {{headless}} service instead of the {{loadbalancer}} service:
{{<remote-destination host="tx-server-1.tx-server-loadbalancer" port="8080"/>}} => {{<remote-destination host="tx-server-1.tx-server-headless" port="8080"/>}}
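
For context, here is a minimal sketch of where that element usually sits in {{standalone-openshift.xml}}, assuming the standard WildFly socket-binding layout. The binding name {{tx-server-1-binding}} and the group attributes are placeholders, not taken from the testsuite. The likely reason the change helps: for StatefulSet pods, Kubernetes publishes the stable per-pod DNS name {{<pod>.<service>}} only through a headless service, so {{tx-server-1.tx-server-headless}} should always resolve to pod tx-server-1.
{noformat}
<socket-binding-group name="standard-sockets" default-interface="public">
    <!-- other socket bindings omitted -->
    <!-- "tx-server-1-binding" is a hypothetical name used only for illustration -->
    <outbound-socket-binding name="tx-server-1-binding">
        <!-- headless service: the host name addresses one specific StatefulSet pod -->
        <remote-destination host="tx-server-1.tx-server-headless" port="8080"/>
    </outbound-socket-binding>
</socket-binding-group>
{noformat}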

> tx recovery intermittently fails after jvm crash
> ------------------------------------------------
>
>                 Key: WFWIP-205
>                 URL: https://issues.jboss.org/browse/WFWIP-205
>             Project: WildFly WIP
>          Issue Type: Bug
>          Components: OpenShift
>         Environment: image: 
> {noformat}
> docker-registry.engineering.redhat.com/ochaloup/wildfly18-snapshot:190909-d4ddf04cc2-wfcore-10.0.0.Beta7-SNAPSHOT
> {noformat}
> operator: 
> {noformat}
> docker-registry.engineering.redhat.com/jbossqe-eap/wildfly-operator:EAP7-1192-txn-recovery-issue70
> {noformat}
> (operator built from https://github.com/ochaloup/wildfly-operator/tree/issue70-statefulset-headless-service, head 8925e7f64b6fc02b4694da63d93c0a8ce03a566d)
>            Reporter: Martin Simka
>            Assignee: Ondrej Chaloupka
>            Priority: Blocker
>         Attachments: tx-client-0.log, tx-server-0.log, tx-server-1.log, wildfly-operator-668fd79fb5-8chs8.log
>
>
> While testing tx recovery in OpenShift, I see that recovery after a JVM crash intermittently fails.
> Scenario:
> *ejb client* (app tx-client, pod tx-client-0):
> * EJB business method
>   ** lookup remote EJB 
>   ** enlist XA resource 1 to transaction
>   ** enlist XA resource 2 to transaction
>   ** call remote EJB
> *ejb server* (app tx-server, pod tx-server-0):
> * EJB business method
>   ** enlist XA resource 1 to transaction
>   ** enlist XA resource 2 to transaction
> XA resource 2 on the ejb server crashes the JVM during the commit phase.
> The test waits until the crashed pod is restarted, forces periodic recovery twice, and then checks that the transaction log store is empty. It is not empty.
> Logs from the client and server pods are attached.
> It seems the failure can be partially mitigated by clearing the OpenShift namespace before the test ({{oc delete all --all}}), but that only makes it less frequent.

--
This message was sent by Atlassian Jira
(v7.13.5#713005)