Tomasz Adamski edited comment on WFLY-11750 at 2/25/19 3:32 PM:
----------------------------------------------------------------
[~pferraro] But if I understand correctly, HA features and correct transaction recovery are
independent.
Let's suppose we are on bare metal and have a cluster of n nodes.
At some point, one of the nodes is invoked by the client, takes part in a transaction
with two-phase commit, and fails after the prepare stage.
As a result, the client has a record in its permanent object store and tries to finish the
transaction. As the record contains the IP address of the failed node, the client will wait
until the server is up again to finish the transaction.
That node will have a new cluster identity. My understanding is that this has no effect on
transaction recovery, which will work fine.
On the other hand, OpenShift does not guarantee IP node identity - only DNS identity -
so IMO the cluster will work fine but recovery won't.
But the key thing here, IMO, is the way remoting gets node information. When clustering is
involved, it gets raw IP addresses, which won't work on OpenShift, and my thought
was that the simplest solution would be to use DNS addresses as the base for cluster
operation.
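For reference, a WildFly interface can already be bound by hostname rather than by a literal IP; a minimal standalone.xml sketch, assuming an OpenShift-style service DNS name (the hostname below is hypothetical, not from this issue):

```xml
<!-- Sketch (hypothetical hostname): binding the public interface by DNS name.
     WildFly resolves the name at bind time; the cluster topology that remoting
     sends to clients still carries the resolved raw IP, which is the problem
     described above. -->
<interfaces>
    <interface name="public">
        <inet-address value="${jboss.bind.address:node-b.myapp.svc.cluster.local}"/>
    </interface>
</interfaces>
```

So binding by hostname alone does not help the client: what would need to change is the address propagated in the topology itself.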
Allow cluster to use DNS addresses instead of IP addresses
----------------------------------------------------------
Key: WFLY-11750
URL: https://issues.jboss.org/browse/WFLY-11750
Project: WildFly
Issue Type: Feature Request
Components: Clustering
Affects Versions: 16.0.0.Beta1
Reporter: Tomasz Adamski
Assignee: Paul Ferraro
Priority: Major
We would need a configuration that allows the cluster to use DNS addresses
instead of IP addresses. The reason is that OpenShift guarantees node identity under the
DNS address, not under the IP address.
Sample scenario that may currently fail when applications are deployed in OpenShift:
A (application)
B (clustered application)
1. A makes a transactional invocation on B
2. as a result of the discovery process, A obtains the cluster topology from B and uses one
of the obtained IP addresses for the connection
3. as the invocation is transactional, object-store records are written to A's
persistent object store; those records are based on the data obtained from the cluster
=> the subordinate node is identified by the IP address from point 2
4. node B fails
5. OpenShift restarts node B on another IP address
6. A attempts recovery and persistently fails
OTOH, OpenShift guarantees node identity under the DNS address. As a result, at point 5 the
node is guaranteed to restart under its established DNS address, so if the cluster used this
address instead of physical addresses, the scenario above would finish with A being able to
recover the transaction.
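The failure at point 6 can be illustrated with a client-side sketch in the legacy jboss-ejb-client.properties style (hostname and port below are hypothetical): even when the initial connection is configured by DNS name, the topology obtained at point 2 substitutes raw IP addresses, and those are what go stale at point 5:

```properties
# Sketch (hypothetical names): initial EJB client connection configured by DNS name.
# Discovery (point 2) still returns raw IPs from the cluster topology, and those IPs
# end up in the recovery records that become unreachable after the restart (point 5).
remote.connections=default
remote.connection.default.host=node-b.myapp.svc.cluster.local
remote.connection.default.port=8080
```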
--
This message was sent by Atlassian Jira
(v7.12.1#712002)