[jboss-jira] [JBoss JIRA] (WFWIP-218) server scale down keeps data in client's data/ejb-xa-recovery and transactions on client aren't commited

Wed Oct 2 08:50:00 EDT 2019

     [ https://issues.jboss.org/browse/WFWIP-218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tomasz Adamski updated WFWIP-218:
---------------------------------
    Comment: was deleted

(was: The problem in this issue is as follows:
1. the transaction log on the server is cleared correctly
2. the server's stateful-set is scaled down (the pod referenced by the client disappears)
3. the client checks the in-doubt resource but cannot connect to the server

In an OpenShift environment with wildfly-operator, we can be sure that if the server's stateful-set was scaled down correctly then its logs must have been cleared. As a result, the client's records can be discarded. This is not true in general (bare metal), where the server may go down and still have records in its log.

In the context of the above, I would propose:
1. workaround - provide the customer with a manual procedure (if a connection error to the server's pod occurs but the pod was scaled down correctly, remove the log records). This is not elegant, but it is an emergency measure and I don't expect it to be used often.
2. solution - when operating within OpenShift, if the client cannot connect to one of the server's pods, the client checks with the OpenShift API whether the server's pod was scaled down. If it was, the record can be discarded.

I would suggest downgrading this issue, given the workaround, and working on the target solution later (which would require OpenShift integration and has to be researched).)
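
The decision rule behind the proposed solution can be sketched roughly as follows. This is only an illustration under stated assumptions: {{InDoubtRecord}}, {{RecoveryRecordPruner}} and the {{scaledDownCleanly}} predicate are hypothetical names, not WildFly or OpenShift types, and the predicate merely stands in for whatever OpenShift API query would confirm a clean scale down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical client-side view of one record in data/ejb-xa-recovery. */
final class InDoubtRecord {
    final String serverPod;        // pod the record refers to, e.g. "tx-server-0"
    final boolean serverReachable; // outcome of the last connection attempt

    InDoubtRecord(String serverPod, boolean serverReachable) {
        this.serverPod = serverPod;
        this.serverReachable = serverReachable;
    }
}

final class RecoveryRecordPruner {

    /**
     * Returns the records that must be kept. A record is discarded only when
     * the server pod is unreachable AND the (assumed) platform check confirms
     * that the pod was scaled down cleanly, so its own log is known to be empty.
     */
    static List<InDoubtRecord> keep(List<InDoubtRecord> records,
                                    Predicate<String> scaledDownCleanly) {
        List<InDoubtRecord> kept = new ArrayList<>();
        for (InDoubtRecord r : records) {
            boolean discardable =
                    !r.serverReachable && scaledDownCleanly.test(r.serverPod);
            if (!discardable) {
                kept.add(r);
            }
        }
        return kept;
    }
}
```

On bare metal no such confirmation exists, so the predicate would always return false and every record would be kept, which matches the caveat above.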

> server scale down keeps data in client's data/ejb-xa-recovery and transactions on client aren't commited
> --------------------------------------------------------------------------------------------------------
>
>                 Key: WFWIP-218
>                 URL: https://issues.jboss.org/browse/WFWIP-218
>             Project: WildFly WIP
>          Issue Type: Bug
>          Components: OpenShift
>            Reporter: Martin Simka
>            Assignee: Ondrej Chaloupka
>            Priority: Blocker
>
> this follows up on WFWIP-206
> While testing tx recovery in OpenShift I see that the scale down of a pod that has an in-doubt transaction on it isn't successful.
> Scenario:
> *ejb client* (app tx-client, pod tx-client-0):
> * EJB business method
>   ** lookup remote EJB 
>   ** enlist XA resource 1 to transaction
>   ** enlist XA resource 2 to transaction
>   ** call remote EJB
> *ejb server* (app tx-server, pod tx-server-0):
> * EJB business method
>   ** enlist XA resource 1 to transaction
>   ** enlist XA resource 2 to transaction
> *testTxStatelessServerSecondCommitThrowRmFail*
> ejb server XA resource 2 fails with {{XAException(XAException.XAER_RMFAIL)}}
> Then the test calls scale down (size from 1 to 0) on the tx-server pod. The server scale down completes, but sometimes some records are left in {{<JBOSS_HOME>/standalone/data/ejb-xa-recovery}} on tx-client and the transactions on the client aren't committed.
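
For readers unfamiliar with the failure being injected, a test resource of the kind described might look roughly like the sketch below. The real test class is not shown in this issue, so {{RmFailOnCommitXAResource}} is a hypothetical name, and the assumption (suggested by the test name) is that the failure is raised from {{commit}}. Only the standard {{javax.transaction.xa}} types are used.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical test resource: prepares fine, then fails every commit. */
final class RmFailOnCommitXAResource implements XAResource {
    private int timeout;
    int commitAttempts; // how many times the transaction manager tried to commit

    @Override public void start(Xid xid, int flags) { }
    @Override public void end(Xid xid, int flags) { }

    // Vote to commit, so the transaction reaches the second phase.
    @Override public int prepare(Xid xid) { return XA_OK; }

    // XAER_RMFAIL signals that the resource manager is unavailable, which
    // leaves the branch in doubt for periodic recovery to finish later.
    @Override public void commit(Xid xid, boolean onePhase) throws XAException {
        commitAttempts++;
        throw new XAException(XAException.XAER_RMFAIL);
    }

    @Override public void rollback(Xid xid) { }
    @Override public void forget(Xid xid) { }
    @Override public Xid[] recover(int flag) { return new Xid[0]; }
    @Override public boolean isSameRM(XAResource other) { return other == this; }
    @Override public int getTransactionTimeout() { return timeout; }
    @Override public boolean setTransactionTimeout(int seconds) {
        timeout = seconds;
        return true;
    }
}
```

A resource that always fails on commit would never let recovery finish; the actual test presumably fails only once, which this sketch does not model.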

--
This message was sent by Atlassian Jira
(v7.13.8#713008)