[https://issues.jboss.org/browse/WFLY-5822?page=com.atlassian.jira.plugin....]
Richard Achmatowicz edited comment on WFLY-5822 at 1/11/16 8:18 PM:
--------------------------------------------------------------------
With regard to remote accesses of keys:
7.0.0 jobs:
- run with the Byteman rule RetrieveFromRemoteSource which checks for invocations of the
method BaseDistributionInterceptor.retrieveFromRemoteSource
- output for this rule appears in the job logs on varying cluster nodes (in other words,
on one run perf18 has to retrieve a key remotely, on another run perf21 does)
- because this is a stress test and no failures are triggered, unless an ISPN rebalance
occurs for some reason other than failure, the node that does not own the key locally
has to make that extra remote invocation on every access to that key
6.4.0 jobs:
- run with the Byteman rule RetrieveFromRemoteSource which checks for invocations of the
method BaseDistributionInterceptor.retrieveFromRemoteSource
- no output found for this rule in the job output for any cluster node
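For reference, a Byteman rule of this kind can be written as below. This is a sketch only: the exact rule used in the jobs is not shown in this issue, and the interceptor's package name and the trace message are assumptions, not taken from the test harness.

```
RULE RetrieveFromRemoteSource
CLASS org.infinispan.interceptors.distribution.BaseDistributionInterceptor
METHOD retrieveFromRemoteSource
AT ENTRY
IF true
DO traceln("retrieveFromRemoteSource invoked on " + $0)
ENDRULE
```

Any hit for this rule in a node's job output indicates that the node had to fetch a cache entry from a remote owner rather than reading it locally.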
These "extra" remote accesses are triggered while resetting the
weakAffinity, after a successful invocation and before the invocation result is returned
to the client. The server sets the weak affinity of the session by calling
MethodInvocationMessageHandler.getWeakAffinity(sessionID). This invocation passes in turn
through DistributableCache.getWeakAffinity(..) and InfinispanBeanManager.getWeakAffinity(...)
until it looks up the primary owner of the session via
InfinispanBeanManager.locatePrimaryOwner(sessionID), which returns a Node object
identifying the member of the Group that should be used for weak affinity.
However, the Node needs to be translated into an Address / String representation, so the
code calls into the Registry (a distributed cache) to look up the String name of the
host. Unfortunately, there is no guarantee that the Registry cache entry for a Node
perf18 is resident on perf18 itself - and so a call to retrieveFromRemoteSource has to be
made to fetch the cache entry from the other node(s).
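The mismatch is easy to see in miniature: a distributed cache assigns ownership by hashing the key over the cluster members, independent of which member the key happens to name. The following self-contained sketch (hypothetical node names and a plain hashCode-based placement, not Infinispan's actual consistent hash) shows that a Registry entry keyed by a given node frequently lands on a different node.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: why the Registry entry for node X need not be stored on node X.
// Placement here is a simple hash-mod over the member list; Infinispan's
// real consistent hash differs, but the ownership/key-name independence
// it illustrates is the same.
public class RegistryOwnerDemo {
    static String primaryOwner(String key, List<String> members) {
        // Ownership is derived from the key's hash, not from the key's meaning.
        int idx = Math.floorMod(key.hashCode(), members.size());
        return members.get(idx);
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("perf17", "perf18", "perf20", "perf21");
        for (String node : members) {
            String owner = primaryOwner(node, members);
            System.out.println("Registry entry for " + node + " owned by " + owner
                    + (owner.equals(node) ? "" : "  <- remote retrieval needed"));
        }
    }
}
```

Whenever the owner differs from the node doing the lookup, the read goes through retrieveFromRemoteSource, which matches the Byteman hits seen only on some nodes per run.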
I'm not certain that this is the cause of the performance problem, but it seems to be
the reason why the extra remote call is made.
Clustering performance regression in ejbremote-dist-sync scenario
------------------------------------------------------------------
Key: WFLY-5822
URL:
https://issues.jboss.org/browse/WFLY-5822
Project: WildFly
Issue Type: Bug
Components: Clustering, EJB
Affects Versions: 10.0.0.CR5
Reporter: Michal Vinkler
Assignee: Richard Achmatowicz
Priority: Critical
Compared to EAP 6, all SYNC scenarios show the same or better performance except for this
one - why?
Compare these results:
stress-ejbremote-dist-sync
7.0.0.ER2:
[throughput|http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-7x-str...]
6.4.0.GA:
[throughput|http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-str...]
---------------------------------------
Just for comparison: the ejbremote REPL_SYNC scenario, on the other hand, *performs well*:
stress-ejbremote-repl-sync
7.0.0.ER2:
[throughput|http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-7x-str...]
6.4.0.GA:
[throughput|http://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-str...]
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)