[JBoss JIRA] (WFLY-12624) Return hostname instead of IP address when generating default client mapping
by Richard Achmatowicz (Jira)
[ https://issues.jboss.org/browse/WFLY-12624?page=com.atlassian.jira.plugin... ]
Richard Achmatowicz updated WFLY-12624:
---------------------------------------
Description:
When an EJB client application interacts with a clustered WildFly deployment, it receives topology updates from cluster nodes describing the membership of the cluster.
For each node in the cluster, a set of one or more client mappings is provided to indicate how the client may connect to the node, if it hasn't already. If the node is connected to a single network, there will be one client mapping; if the node is multi-homed and connected to two networks, there will be two client mappings, etc. Client mappings specify three things: a CIDR representation of the network the client may be on, a destination hostname or IP address and a destination port.
Client mappings may be generated by default (if none are provided in the server profile) or may be specified by the user via client mappings defined in the socket binding of the Remoting connector. For example:
{noformat}
<socket-binding name="remoting" port="1099">
<client-mapping source-network="135.121.1.0/24" destination-address="135.121.1.29" destination-port="1099"/>
</socket-binding>
{noformat}
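As a rough illustration of how the three parts of a client mapping fit together, the sketch below picks the mapping whose source network contains a given client address. It is not WildFly or EJB client code: the `ClientMapping` class, the `select` method and the IPv4-only CIDR check are assumptions made for this example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.List;
import java.util.Optional;

public class ClientMappingSketch {

    /** Hypothetical holder for the three parts of a client mapping. */
    static final class ClientMapping {
        final String sourceNetwork;      // CIDR notation, e.g. "135.121.1.0/24"
        final String destinationAddress; // hostname or IP address
        final int destinationPort;

        ClientMapping(String sourceNetwork, String destinationAddress, int destinationPort) {
            this.sourceNetwork = sourceNetwork;
            this.destinationAddress = destinationAddress;
            this.destinationPort = destinationPort;
        }

        /** True if the given IPv4 literal lies inside sourceNetwork (IPv4 only). */
        boolean matches(String clientAddress) {
            String[] parts = sourceNetwork.split("/");
            int prefixLength = Integer.parseInt(parts[1]);
            // A shift by 32 is a no-op in Java, so a /0 prefix is handled separately.
            int mask = prefixLength == 0 ? 0 : -1 << (32 - prefixLength);
            return (toInt(parts[0]) & mask) == (toInt(clientAddress) & mask);
        }
    }

    /** Converts an IPv4 literal such as "135.121.1.29" to a 32-bit integer. */
    static int toInt(String ipv4Literal) {
        try {
            byte[] b = InetAddress.getByName(ipv4Literal).getAddress();
            return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16) | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + ipv4Literal, e);
        }
    }

    /** Returns the first mapping whose source network contains the client address. */
    static Optional<ClientMapping> select(List<ClientMapping> mappings, String clientAddress) {
        for (ClientMapping m : mappings) {
            if (m.matches(clientAddress)) {
                return Optional.of(m);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<ClientMapping> mappings =
                List.of(new ClientMapping("135.121.1.0/24", "135.121.1.29", 1099));
        select(mappings, "135.121.1.77").ifPresent(m ->
                System.out.println(m.destinationAddress + ":" + m.destinationPort));
    }
}
```

A multi-homed node would simply contribute one `ClientMapping` per network to the list.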
When the client mapping information is received by the EJB client application, it is added to the discovered node registry (DNR) in the Discovery component of the EJB client. The DNR holds all known information, received from nodes in one or more clusters, about the nodes with which the client can interact.
When an invocation is attempted, the Discovery component uses this information to generate a set of ServiceURLs which represent candidate targets (i.e. servers containing the deployment and module the client is invoking on) for the invocation. The Discovery component uses "an algorithm" to take the information in the DNR (and other places) and convert that information to a corresponding set of ServiceURLs representing available targets. The Discovery component will then select one such ServiceURL and return this as the target for the invocation. For example, in the above case, the service URL will look something like:
{noformat}
service:ejb.jboss:remote://135.121.1.29:1099;cluster=ejb;node=node1;ejb-module=my-foo-app/my-bar-module;source-ip=135.121.1.0/24
{noformat}
This service URL describes a server with logical name "node1" which:
* is a member of a cluster called "ejb"
* has the EJB module "my-foo-app/my-bar-module" and all the beans that it contains deployed
* can be connected to by the URL "remote://135.121.1.29:1099" as long as you are on network "135.121.1.0/24"
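To make the layout of such a service URL concrete, here is a minimal sketch that assembles the example string from its parts. It is plain string formatting, not the Discovery component's own API; the class name, method name and parameter names are all made up for this illustration.

```java
public class ServiceUrlSketch {

    /** Assembles a service URL string in the shape shown in the example above. */
    static String format(String host, int port, String cluster, String node,
                         String module, String sourceNetwork) {
        return "service:ejb.jboss:remote://" + host + ":" + port
                + ";cluster=" + cluster
                + ";node=" + node
                + ";ejb-module=" + module
                + ";source-ip=" + sourceNetwork;
    }

    public static void main(String[] args) {
        System.out.println(format("135.121.1.29", 1099, "ejb", "node1",
                "my-foo-app/my-bar-module", "135.121.1.0/24"));
    }
}
```

Note that the host is just one string component of the result, so two URLs for the same node differ as soon as one carries a hostname and the other an IP address.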
Discovery obtains node information used in the algorithm from various sources: client mappings associated with cluster nodes, as described above, as well as Remoting endpoints associated with established connections to nodes. These pieces of information describe at a minimum a host and a port.
The problem is that "the algorithm" used in Discovery to compute the set of ServiceURLs treats hostnames and IP addresses as simple strings. So "localhost" and "127.0.0.1" are treated as different hosts, even though they refer to the same host. If a mix of hostnames and IP addresses is received for the same node, an incomplete or incorrect set of ServiceURLs is generated, which in turn leads to spurious Discovery failures.
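The mismatch can be illustrated with a small sketch that contrasts plain string comparison with a comparison of resolved addresses. This is only a demonstration of the problem, not the Discovery code and not necessarily the fix adopted for this issue; the class and method names are hypothetical, and resolving names on the client has its own costs (DNS lookups, and names that may not resolve on the client's network).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class HostEqualitySketch {

    /** Compares hosts the way a purely string-based algorithm would. */
    static boolean sameAsStrings(String a, String b) {
        return a.equals(b);
    }

    /** Compares hosts by resolving both and checking for a shared address. */
    static boolean sameHost(String a, String b) {
        try {
            Set<InetAddress> addressesOfA =
                    new HashSet<>(Arrays.asList(InetAddress.getAllByName(a)));
            for (InetAddress address : InetAddress.getAllByName(b)) {
                if (addressesOfA.contains(address)) {
                    return true;
                }
            }
            return false;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("cannot resolve host", e);
        }
    }

    public static void main(String[] args) {
        // String comparison sees two different hosts.
        System.out.println(sameAsStrings("localhost", "127.0.0.1"));
        // The resolved comparison depends on the local resolver configuration.
        System.out.println(sameHost("localhost", "127.0.0.1"));
    }
}
```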
was:
When an EJB client application interacts with a clustered WildFly deployment, it receives topology updates from cluster nodes describing the membership of the cluster.
For each node in the cluster, a set of one or more client mappings is provided to indicate how the client may connect to the node, if it hasn't already. If the node is connected to a single network, there will be one client mapping; if the node is multi-homed and connected to two networks, there will be two client mappings, etc. Client mappings specify three things: a CIDR representation of the network the client may be on, a destination hostname or IP address and a destination port.
Client mappings may be generated by default (if none are provided in the server profile) or may be specified by the user via client mappings defined in the socket binding of the Remoting connector. For example:
{noformat}
<socket-binding name="remoting" port="1099">
<client-mapping source-network="135.121.1.29/16" destination-address="135.121.1.29" destination-port="1099"/>
</socket-binding>
{noformat}
When the client mapping information is received by the EJB client application, it is added to the discovered node registry (DNR) in the Discovery component of the EJB client. The DNR holds all known information, received from nodes in one or more clusters, about the nodes with which the client can interact.
When an invocation is attempted, the Discovery component uses this information to generate a set of ServiceURLs which represent candidate targets (i.e. servers containing the deployment and module the client is invoking on) for the invocation. The Discovery component uses "an algorithm" to take the information in the DNR (and other places) and convert that information to a corresponding set of ServiceURLs representing available targets. The Discovery component will then select one such ServiceURL and return this as the target for the invocation.
Discovery obtains node information used in the algorithm from various sources: client mappings associated with cluster nodes, as described above, as well as Remoting endpoints associated with established connections to nodes. These pieces of information describe at a minimum a host and a port.
The problem is that "the algorithm" used in Discovery to compute the set of ServiceURLs treats hostnames and IP addresses as simple strings. So "localhost" and "127.0.0.1" are treated as different hosts, even though they refer to the same host. If a mix of hostnames and IP addresses is received for the same node, an incomplete or incorrect set of ServiceURLs is generated, which in turn leads to spurious Discovery failures.
> Return hostname instead of IP address when generating default client mapping
> -----------------------------------------------------------------------------
>
> Key: WFLY-12624
> URL: https://issues.jboss.org/browse/WFLY-12624
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 18.0.0.Beta1
> Environment: An EJB client application interacting with a cluster of WildFly server nodes
> Reporter: Richard Achmatowicz
> Assignee: Richard Achmatowicz
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
6 years, 6 months
[JBoss JIRA] (JGRP-2396) increasing networkdata, cpu and heap
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2396?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-2396:
--------------------------------
TransferQueueBundler has been in use for a *long* time in many deployments, so I'm skeptical this is the issue. In any case, I need a stack trace when there's high CPU to see what's going on, and where things go south.
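For readers wondering how such a stack trace might be captured, one option is a thread dump of the server JVM taken with `jstack <pid>` while CPU is high. The same idea can be sketched in-process with the standard `java.lang.management` API, as below. This is an illustrative sketch, not something requested in this form on the issue; the class and method names are hypothetical, and it only inspects the JVM it runs in.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class BusyThreadDump {

    /** Returns the name, CPU time and stack trace of the thread with the most CPU time. */
    static String dumpBusiestThread() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported()) {
            return "thread CPU time is not supported on this JVM";
        }
        long busiestId = -1;
        long busiestCpuNanos = -1;
        for (long id : threads.getAllThreadIds()) {
            // Returns -1 if the thread has died or CPU time measurement is disabled.
            long cpuNanos = threads.getThreadCpuTime(id);
            if (cpuNanos > busiestCpuNanos) {
                busiestCpuNanos = cpuNanos;
                busiestId = id;
            }
        }
        if (busiestId < 0) {
            return "no thread CPU times available";
        }
        ThreadInfo info = threads.getThreadInfo(busiestId, Integer.MAX_VALUE);
        if (info == null) {
            return "busiest thread terminated before it could be inspected";
        }
        StringBuilder sb = new StringBuilder();
        sb.append(info.getThreadName()).append(" cpu(ns)=").append(busiestCpuNanos).append('\n');
        for (StackTraceElement frame : info.getStackTrace()) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpBusiestThread());
    }
}
```

Taking several dumps a few seconds apart makes it easier to see whether the same bundler frames keep showing up.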
> increasing networkdata, cpu and heap
> ------------------------------------
>
> Key: JGRP-2396
> URL: https://issues.jboss.org/browse/JGRP-2396
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.0.19
> Reporter: Rob van der Boom
> Assignee: Bela Ban
> Priority: Major
>
> Hey,
> we have a Keycloak (SSO) setup, version 7.0.1, running in Kubernetes on AWS.
> It's built on WildFly 17, Infinispan 9.4 and JGroups 4.0.19.
> We have 3 pods running in standalone-ha with the cache set up for distribution (across all 3 nodes, so equivalent to replication).
> ISSUE:
> We see slow growth in network statistics, heap and CPU, while the number of (cached) sessions in Keycloak remains almost stable.
> The CPU growth is caused by the TQ bundler process, which explains the network data growth. It looks like this is also causing a memory leak.
> Every 5 days we have to restart the pods, and then everything resets to a very low level, including the heap, even though all sessions are still valid and cached.
> The only issue I could find that may be related to this is:
> https://issues.jboss.org/browse/JGRP-2382?jql=project%20%3D%20JGRP%20AND%...
> Could this be the same issue, and does it also cause increasing network and CPU usage (since that is why we have to restart; the heap has plenty of space left)?
> And if so, how will this issue proceed, since for us it is a major issue?
> We already had this issue in Keycloak 5 (WildFly 15); that's why we upgraded to the latest available version.
6 years, 6 months
[JBoss JIRA] (JGRP-2396) increasing networkdata, cpu and heap
by Rob van der Boom (Jira)
[ https://issues.jboss.org/browse/JGRP-2396?page=com.atlassian.jira.plugin.... ]
Rob van der Boom edited comment on JGRP-2396 at 11/8/19 8:31 AM:
-----------------------------------------------------------------
True, many components are involved.
I will deliver more details next week (heap dump etc.).
It's more a matter of tracing down what it can and cannot be. Of course I am not sure; the only things I know so far:
- There are no known Keycloak-related issues so far, and it looks related to Infinispan cache replication between the nodes. The issue only shows up when there are many cached sessions (>300.000), but it is NOT related to activity, since it keeps increasing by the same amount even at night when there is almost no traffic on the site.
- It is only the TQ bundler taking up more and more CPU, not other tasks (unless the heap grows towards max). Since we doubled the heap, GC doesn't grow above 1% CPU while the TQ bundler is already at 30% and quickly climbs higher. Network data transfer (even when there is almost no traffic on the site) grows as fast as the CPU, to levels we cannot explain.
- Zero errors occur anywhere that could explain the issues.
So I will try to hand over more details, thanks in advance.
was (Author: robvanderboom):
True, many components are involved.
I will deliver more details next week (heap dump etc.).
It's more a matter of tracing down what it can and cannot be. Of course I am not sure; the only things I know so far:
- There are no known Keycloak-related issues so far, and it looks related to Infinispan cache replication between the nodes. The issue only shows up when there are many cached sessions (>300.000), but it is NOT related to activity, since it keeps increasing by the same amount even at night when there is almost no traffic on the site.
- It is only the TQ bundler taking up more and more CPU, not other tasks (unless the heap grows towards max). Since we doubled the heap, GC doesn't grow above 1% CPU while the TQ bundler is already at 30% and quickly climbs higher. Network data transfer (even when there is almost no traffic on the site) grows as fast as the CPU, to levels we cannot explain.
- Zero errors occur anywhere that could explain the issues.
So I will try to hand over more details.
6 years, 6 months
[JBoss JIRA] (JGRP-2396) increasing networkdata, cpu and heap
by Rob van der Boom (Jira)
[ https://issues.jboss.org/browse/JGRP-2396?page=com.atlassian.jira.plugin.... ]
Rob van der Boom edited comment on JGRP-2396 at 11/8/19 8:31 AM:
-----------------------------------------------------------------
True, many components are involved.
I will deliver more details next week (heap dump etc.).
It's more a matter of tracing down what it can and cannot be. Of course I am not sure; the only things I know so far:
- There are no known Keycloak-related issues so far, and it looks related to Infinispan cache replication between the nodes. The issue only shows up when there are many cached sessions (>300.000), but it is NOT related to activity, since it keeps increasing by the same amount even at night when there is almost no traffic on the site.
- It is only the TQ bundler taking up more and more CPU, not other tasks (unless the heap grows towards max). Since we doubled the heap, GC doesn't grow above 1% CPU while the TQ bundler is already at 30% and quickly climbs higher. Network data transfer (even when there is almost no traffic on the site) grows as fast as the CPU, to levels we cannot explain.
- Zero errors occur anywhere that could explain the issues.
So I will try to hand over more details.
was (Author: robvanderboom):
True, many components are involved.
I will deliver more details next week (heap dump etc.).
It's more a matter of tracing down what it can and cannot be. Of course I am not sure; the only things I know so far:
- There are no known Keycloak-related issues so far, and it looks related to Infinispan cache replication between the nodes. The issue only shows up when there are many cached sessions (>300.000), but it is NOT related to activity, since it keeps increasing by the same amount even at night when there is almost no traffic on the site.
- It is only the TQ bundler taking up more and more CPU, not other tasks (unless the heap grows towards max). Since we doubled the heap, GC doesn't grow above 1% CPU while the TQ bundler is already at 30% and quickly climbs higher. Network data transfer (even when there is almost no traffic on the site) grows as fast as the CPU, to levels we cannot explain.
- 0 errors occur anywhere that could explain the issues.
So I will try to hand over more details.
6 years, 6 months
[JBoss JIRA] (JGRP-2396) increasing networkdata, cpu and heap
by Rob van der Boom (Jira)
[ https://issues.jboss.org/browse/JGRP-2396?page=com.atlassian.jira.plugin.... ]
Rob van der Boom commented on JGRP-2396:
----------------------------------------
True, many components are involved.
I will deliver more details next week (heap dump etc.).
It's more a matter of tracing down what it can and cannot be. Of course I am not sure; the only things I know so far:
- There are no known Keycloak-related issues so far, and it looks related to Infinispan cache replication between the nodes. The issue only shows up when there are many cached sessions (>300.000), but it is NOT related to activity, since it keeps increasing by the same amount even at night when there is almost no traffic on the site.
- It is only the TQ bundler taking up more and more CPU, not other tasks (unless the heap grows towards max). Since we doubled the heap, GC doesn't grow above 1% CPU while the TQ bundler is already at 30% and quickly climbs higher. Network data transfer (even when there is almost no traffic on the site) grows as fast as the CPU, to levels we cannot explain.
- 0 errors occur anywhere that could explain the issues.
So I will try to hand over more details.
6 years, 6 months
[JBoss JIRA] (SWSQE-1002) Update OCP provisioning jobs to allow choose flavor
by Filip Brychta (Jira)
Filip Brychta created SWSQE-1002:
------------------------------------
Summary: Update OCP provisioning jobs to allow choose flavor
Key: SWSQE-1002
URL: https://issues.jboss.org/browse/SWSQE-1002
Project: Kiali QE
Issue Type: QE Task
Reporter: Filip Brychta
Assignee: Filip Brychta
It seems that the general aggregate flavors are very slow and are causing problems for OCP clusters. We need to update the Jenkins jobs to allow choosing CI flavors, which should be faster but less reliable.
6 years, 6 months
[JBoss JIRA] (SWSQE-1001) OCP clusters in PSI are very flaky
by Filip Brychta (Jira)
Filip Brychta created SWSQE-1001:
------------------------------------
Summary: OCP clusters in PSI are very flaky
Key: SWSQE-1001
URL: https://issues.jboss.org/browse/SWSQE-1001
Project: Kiali QE
Issue Type: QE Task
Reporter: Filip Brychta
Assignee: Filip Brychta
Both OCP 4.1 and OCP 4.2 clusters created in PSI are very slow. The following alerts are visible from time to time:
OCP 4.1 on ocp-master flavor
The API server has a 99th percentile latency of 1.98 seconds for PUT clusterrolebindings.
The API server has a 99th percentile latency of 3.780000000000002 seconds for PUT poddisruptionbudgets.
The API server has a 99th percentile latency of 2.309999999999997 seconds for GET namespaces.
The API server has a 99th percentile latency of 5.2000000000000055 seconds for GET deployments.
The API server has a 99th percentile latency of 7.6 seconds for GET openshiftapiservers.
KubeAPIErrorsHigh
API server is returning errors for 22.222222222222218% of requests for POST pods.
Also, Jaeger tests are failing.
6 years, 6 months