[jboss-jira] [JBoss JIRA] (AS7-4862) Remote EJB cluster view propagation or SFSB load balancing broken

Tue Jun 5 09:41:18 EDT 2012

    [ https://issues.jboss.org/browse/AS7-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698841#comment-12698841 ] 

jaikiran pai commented on AS7-4862:
-----------------------------------

Rado, like I said, this isn't a bug. If multiple nodes can handle the same bean and the client application expects node selection to follow its own requirements, then the application is expected to configure a deployment node selector for the initial load balancing of the beans. The EJB client API has no knowledge of which node to prefer. For now, the only available implementation of a deployment node selector in the EJB client API is the random node selector.

{quote}
The defaults and default implementation logic MUST be reasonable. We cannot expect users to go on and change every default to get the product working.
{quote}
What's reasonable? Which part of the product isn't working? Like I have said in some JIRAs and forum threads, the EJB client API identifies an EJB target by the appname/modulename/distinctname/beanname combination. If multiple nodes can handle it, then it uses a deployment node selector to pick one node. The only implementation of that selector currently is the random node selector. The EJB client API cannot really come up with a different reasonable implementation, since it's the application that decides what's reasonable. For example, how do you define a round robin implementation when the number of nodes keeps changing? Each application might deal with that differently.
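
As an illustration of the kind of application-specific selector meant here, below is a minimal sketch of one possible round robin policy. The {{NodeSelector}} interface is a hypothetical local stand-in, declared only so the sketch compiles on its own; its method shape is assumed to mirror the EJB client API's deployment node selector contract (an array of eligible node names plus the app/module/distinct names), and the class names are invented for the example. Rotating over a sorted copy of whatever nodes are currently eligible is just one way to deal with a changing node set, not the only reasonable one.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the EJB client API's selector contract (assumed shape),
// declared locally so that this sketch is self-contained.
interface NodeSelector {
    String selectNode(String[] eligibleNodes, String appName, String moduleName, String distinctName);
}

// One possible round robin policy: rotate over the nodes that are eligible right now.
final class RoundRobinNodeSelector implements NodeSelector {
    private final AtomicInteger counter = new AtomicInteger();

    @Override
    public String selectNode(String[] eligibleNodes, String appName, String moduleName, String distinctName) {
        if (eligibleNodes == null || eligibleNodes.length == 0) {
            throw new IllegalArgumentException("no eligible nodes");
        }
        // Sort a copy so the rotation order does not depend on the order the caller supplies.
        String[] nodes = eligibleNodes.clone();
        Arrays.sort(nodes);
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), nodes.length);
        return nodes[index];
    }
}

final class SelectorDemo {
    public static void main(String[] args) {
        NodeSelector selector = new RoundRobinNodeSelector();
        String[] nodes = {"perf21", "perf18", "perf20", "perf19"};
        // Each call returns the next node in sorted order, wrapping around.
        for (int i = 0; i < 8; i++) {
            System.out.println(selector.selectNode(nodes, "app", "module", ""));
        }
    }
}
```

Note that when the eligible set shrinks or grows, this sketch simply continues counting against the new array length, so the rotation may skip or repeat a node around the time of the change. Whether that is acceptable is exactly the kind of decision that belongs to the application.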

{quote}
 we have a test for that in the upstream testsuite but only on 2 nodes, so even the balancing might be really broken.
{quote}
It's not broken. Which test is this? The tests that I'm aware of do not check any round robin or other similar load balancing techniques.

{quote}
The latest job shows the distribution problem in CR1 build as well https://hudson.qa.jboss.com/hudson/job/eap-6x-failover-ejb-ejbremote-undeploy-dist-async/2/
{quote}
Why would the results be any different if the same random deployment node selector was used?

I'm assigning this back, since this isn't broken and isn't a bug. If there's a requirement to provide a different implementation of the deployment node selector, then please create a feature request with details of the exact requirements for that selector.

> Remote EJB cluster view propagation or SFSB load balancing broken
> -----------------------------------------------------------------
>
>                 Key: AS7-4862
>                 URL: https://issues.jboss.org/browse/AS7-4862
>             Project: Application Server 7
>          Issue Type: Bug
>          Components: Clustering, EJB
>    Affects Versions: 7.1.2.Final (EAP)
>            Reporter: Radoslav Husar
>            Assignee: jaikiran pai
>             Fix For: 7.1.3.Final (EAP)
>
>
> Looking at the load-balancing again, it still seems wrong. Checking the logs, the servers were started *and the cluster was formed* ~10 seconds prior to starting the client, and load-balancing was roughly: 
> * node perf18: ~50% of sessions
> * node perf19: 0
> * node perf20: 0
> * node perf21: ~50% of sessions
> I let all the sessions be created over one minute (not all at once).
> The nodes were also specified in the properties.
> {noformat}
> node perf18
> [JBossINF] 07:01:53,991 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (CacheService lifecycle - 1) ISPN000094: Received new cluster view: [perf19/ejb|3] [perf19/ejb, perf21/ejb, perf20/ejb, perf18/ejb]
> node perf19
> [JBossINF] 07:01:53,781 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-19,null) ISPN000094: Received new cluster view: [perf19/ejb|3] [perf19/ejb, perf21/ejb, perf20/ejb, perf18/ejb]
> node perf20
> [JBossINF] 07:01:53,781 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-14,null) ISPN000094: Received new cluster view: [perf19/ejb|3] [perf19/ejb, perf21/ejb, perf20/ejb, perf18/ejb]
> node perf21
> [JBossINF] 07:01:53,780 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-12,null) ISPN000094: Received new cluster view: [perf19/ejb|3] [perf19/ejb, perf21/ejb, perf20/ejb, perf18/ejb]
> controller node
> 2012/05/22 07:02:02:932 EDT [DEBUG][TestController] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - Setting thread count: 2000, was: 0
> 2012/05/22 07:02:53:731 EDT [DEBUG][TestController] HOST perf17.mw.lab.eng.bos.redhat.com:rootProcess:c - All runners (2000) started in 50799 milliseconds
> {noformat}
