[JBoss JIRA] (WFWIP-218) server scale down keeps data in client's data/ejb-xa-recovery and transactions on client aren't commited

Wednesday, 2 October 2019

    [ https://issues.jboss.org/browse/WFWIP-218?page=com.atlassian.jira.plugin.... ]

Tomasz Adamski commented on WFWIP-218:
--------------------------------------

The problem in this issue is as follows:
1. the transaction log on the server is cleared correctly
2. the server's stateful-set is scaled down (the pod referenced by the client disappears)
3. the client checks the in-doubt resource but cannot connect to the server

In an OpenShift environment with wildfly-operator, we can be sure that if the server's
stateful-set was scaled down correctly then its logs must have been cleared. As a result,
the client's records can be discarded. This is not true in general (bare-metal), where the
server may go down and still have records in the log.

In the context of the above, I would propose:
1. workaround - provide the customer with a manual procedure (if a connection error to the
server's pod occurs but the pod was scaled down correctly, remove the log records). This
is not elegant, but it is an emergency procedure and I don't expect it to be used often.
2. solution - if, when operating within OpenShift, the client cannot connect to one of the
server's pods, the client checks with the OpenShift API whether the server's pod was
scaled down. If it was, the record can be discarded.
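
The decision step of the proposed solution could look roughly like the sketch below. It is only an illustration, not the planned implementation. It relies on the StatefulSet convention that pods are named <name>-<ordinal> and that scaling down removes the highest ordinals first. The class ScaleDownCheck and its methods are hypothetical names, and the replica count is assumed to have been read from the OpenShift API by code that is not shown here.

```java
import java.util.OptionalInt;

/** Hypothetical helper; not part of WildFly or the wildfly-operator. */
final class ScaleDownCheck {

    private ScaleDownCheck() {
    }

    /** Extracts the ordinal from a StatefulSet pod name such as "tx-server-0". */
    static OptionalInt podOrdinal(String podName) {
        int dash = podName.lastIndexOf('-');
        if (dash < 0 || dash == podName.length() - 1) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(podName.substring(dash + 1)));
        } catch (NumberFormatException e) {
            // The name does not end in a number, so it is not a StatefulSet pod name.
            return OptionalInt.empty();
        }
    }

    /**
     * Returns true when the pod's ordinal is no longer covered by the
     * StatefulSet's replica count, i.e. the pod has been scaled away.
     * currentReplicas is assumed to come from the OpenShift API.
     */
    static boolean mayDiscardRecord(String podName, int currentReplicas) {
        OptionalInt ordinal = podOrdinal(podName);
        return ordinal.isPresent() && ordinal.getAsInt() >= currentReplicas;
    }
}
```

For the scenario in this issue, mayDiscardRecord("tx-server-0", 0) is true after the scale down from 1 to 0, and false while the replica count is still 1. A lower replica count alone does not prove that the scale down completed correctly, so a real check would also need to confirm that the operator finished its recovery clean-up.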

I would suggest downgrading this issue with the provided workaround and working on the
target solution later (it would require OpenShift integration and has to be researched).

...
 server scale down keeps data in client's data/ejb-xa-recovery and
transactions on client aren't commited

--------------------------------------------------------------------------------------------------------

                 Key: WFWIP-218
                 URL: https://issues.jboss.org/browse/WFWIP-218
             Project: WildFly WIP
          Issue Type: Bug
          Components: OpenShift
            Reporter: Martin Simka
            Assignee: Ondrej Chaloupka
            Priority: Blocker

 This follows up on WFWIP-206.
 While testing tx recovery in OpenShift I see that scale down of a pod that has an
in-doubt transaction on it isn't successful.
 Scenario:
 *ejb client* (app tx-client, pod tx-client-0):
 * EJB business method
   ** lookup remote EJB 
   ** enlist XA resource 1 to transaction
   ** enlist XA resource 2 to transaction
   ** call remote EJB
 *ejb server* (app tx-server, pod tx-server-0):
 * EJB business method
   ** enlist XA resource 1 to transaction
   ** enlist XA resource 2 to transaction
 *testTxStatelessServerSecondCommitThrowRmFail*
 ejb server XA resource 2 fails with {{XAException(XAException.XAER_RMFAIL)}}
 Then the test calls scale down (size from 1 to 0) on the tx-server pod. Server scale down
completes, but sometimes there are some records left in
{{<JBOSS_HOME>/standalone/data/ejb-xa-recovery}} on tx-client and transactions on the
client aren't committed.
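
 For illustration, an XA resource that fails in the way described above can be sketched with the javax.transaction.xa types from the JDK. The class name RmFailOnCommitXAResource is hypothetical and this is not the resource used by the actual test. In particular it keeps no persistent state, so recover() never reports an in-doubt Xid.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

/** Hypothetical test double: prepares successfully, then fails on commit. */
final class RmFailOnCommitXAResource implements XAResource {

    private int timeoutSeconds;

    @Override
    public void start(Xid xid, int flags) {
        // Nothing to do: this sketch does no real work.
    }

    @Override
    public void end(Xid xid, int flags) {
    }

    @Override
    public int prepare(Xid xid) {
        // Vote to commit so that the transaction reaches the second phase.
        return XA_OK;
    }

    @Override
    public void commit(Xid xid, boolean onePhase) throws XAException {
        // Simulate the resource manager becoming unavailable during commit.
        throw new XAException(XAException.XAER_RMFAIL);
    }

    @Override
    public void rollback(Xid xid) {
    }

    @Override
    public void forget(Xid xid) {
    }

    @Override
    public Xid[] recover(int flag) {
        // A real test resource would return the Xids it still holds in doubt.
        return new Xid[0];
    }

    @Override
    public boolean isSameRM(XAResource other) {
        return other == this;
    }

    @Override
    public int getTransactionTimeout() {
        return timeoutSeconds;
    }

    @Override
    public boolean setTransactionTimeout(int seconds) {
        timeoutSeconds = seconds;
        return true;
    }
}
```

 Enlisting such a resource as XA resource 2 on the server leaves the transaction in-doubt after the first commit attempt, which is the state the scale down then has to recover from.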

--
This message was sent by Atlassian Jira
(v7.13.8#713008)
