Hi,
we get intermittent NameNotFoundExceptions when a client performs a lookup against our
JBoss 4.0.2 cluster. The client uses the HA-JNDI ports. When it uses the non-HA JNDI ports,
we don't see the exceptions. At the same time, another client can perform lookups without
any problems. Once the client gets the first exception, a large number of subsequent
lookups fail too. Sometimes, after an unpredictable amount of time, the client's lookups
succeed again; sometimes they do not.
Has anybody seen this too, and is there a known solution? Or, put another way, is it OK to
make a non-HA JNDI call to obtain clustered session facades?
Here are the details:
We have a JBoss 4.0.2 cluster with three nodes. The cluster has its own partition name and
multicast address.
The clients are a number of servers running a clustered web application. For simplicity we
can assume that there are only two client machines. The cluster nodes and client machines
are in the same subnet and see each other's multicasts and broadcasts.
We stripped down the code on the clients and built a servlet which contains the following
code snippet:
|
| Hashtable env = new Hashtable();
| env.put("java.naming.factory.initial", "org.jnp.interfaces.NamingContextFactory");
| env.put("java.naming.factory.url.pkgs", "org.jboss.naming:org.jnp.interfaces");
| env.put("java.naming.provider.url", "jnp://server1:<ha-port>,jnp://server2:<ha-port>,jnp://server3:<ha-port>");
|
| Context ctx = new InitialContext(env);
| Object obj = ctx.lookup(<existing path>);
| <call create on home interface after narrowing etc.>
|
All variables are local, so there is no singleton sharing data between servlet calls.
We use non-standard port numbers, so no default can affect the behaviour. The port number
is the same on all cluster nodes. However, there is another server in the subnet which uses
the default ports. When the client fails, we get a NameNotFoundException on the lookup line.
We experimented with the property that switches off auto-discovery (jnp.disableDiscovery)
and with setting a connection timeout (jnp.timeout), but nothing changed.
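For reference, this is roughly how we set those two properties. It is only a sketch: the class name HaJndiEnv, the port 11100 and the timeout of 5000 ms are placeholders made up for this post, not our real values, and the code only builds the environment without opening a connection.

```java
import java.util.Hashtable;

// Sketch only: builds the JNDI environment, it does not contact any server.
class HaJndiEnv {

    // Joins the hosts into a comma-separated jnp:// provider URL.
    static String providerUrl(String[] hosts, int port) {
        StringBuilder sb = new StringBuilder();
        for (String host : hosts) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append("jnp://").append(host).append(':').append(port);
        }
        return sb.toString();
    }

    // Same settings as in the servlet, plus the two properties we tried.
    static Hashtable<String, String> build(String[] hosts, int port, int timeoutMillis) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "org.jnp.interfaces.NamingContextFactory");
        env.put("java.naming.factory.url.pkgs", "org.jboss.naming:org.jnp.interfaces");
        env.put("java.naming.provider.url", providerUrl(hosts, port));
        // Suppress the multicast auto-discovery fallback.
        env.put("jnp.disableDiscovery", "true");
        // Connection timeout in milliseconds.
        env.put("jnp.timeout", String.valueOf(timeoutMillis));
        return env;
    }

    public static void main(String[] args) {
        // Placeholder port and timeout, see the note above.
        String[] hosts = {"server1", "server2", "server3"};
        Hashtable<String, String> env = build(hosts, 11100, 5000);
        System.out.println(env.get("java.naming.provider.url"));
    }
}
```

The resulting table is what gets passed to new InitialContext(env) in the servlet.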
We called the servlet on two machines of the web client cluster. Both used the
HA-JNDI ports of the three JBoss nodes. After a few seconds one of the web clients got the
exception mentioned above, and after that many subsequent calls got the same exception. We
ran tcpdump and saw that the network traffic on the JNDI ports stopped after the
exception. At the same time the other web client was still able to perform lookups.
Sometimes the web client with exceptions went back to normal, but not every time. We used a
small script on a third machine to call the two servlets. When we inserted a sleep of
1 second between two calls to one web client node, we saw the behaviour described above.
When we ran the script without the sleep, both web clients got exceptions after a few
seconds and mostly did not go back to normal.
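The test loop can be sketched as follows. This is not our actual script, only a Java illustration of the same idea: repeat a call, optionally sleep between attempts, and count how many calls fail in a row. The names LookupProbe, Result and run are invented for this post, and the call in main is a stand-in that simulates failures instead of contacting a servlet.

```java
import java.util.concurrent.Callable;
import javax.naming.NameNotFoundException;

// Sketch of a probe loop: repeats a call and records failure streaks.
class LookupProbe {

    // Counters for one probe run.
    static final class Result {
        int ok;
        int failed;
        int longestFailureRun;
    }

    // Invokes 'call' up to 'attempts' times, sleeping 'sleepMillis' between attempts.
    static Result run(Callable<?> call, int attempts, long sleepMillis) {
        Result r = new Result();
        int currentRun = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                call.call();
                r.ok++;
                currentRun = 0;
            } catch (Exception e) {
                // Any exception counts as a failed lookup.
                r.failed++;
                currentRun++;
                r.longestFailureRun = Math.max(r.longestFailureRun, currentRun);
            }
            if (sleepMillis > 0 && i < attempts - 1) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag and stop probing early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return r;
    }

    public static void main(String[] args) {
        // Stand-in for the servlet call: attempts 3 to 5 fail, the rest succeed.
        final int[] n = {0};
        Result r = run(() -> {
            n[0]++;
            if (n[0] >= 3 && n[0] <= 5) {
                throw new NameNotFoundException("simulated");
            }
            return null;
        }, 8, 0);
        System.out.println("ok=" + r.ok + " failed=" + r.failed
                + " longestFailureRun=" + r.longestFailureRun);
    }
}
```

In a real run the Callable would issue the HTTP request to one of the servlets, with sleepMillis set to 1000 for the first scenario and 0 for the second.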
The interesting point is that we can't reproduce the problem when we make the
JNDI calls from a development machine in a different network. In that case the JBoss
cluster answered all JNDI lookups without any exceptions.
Another interesting point is that the exceptions on the web clients went away when we
changed the port in the provider URL to the non-HA JNDI port. This port is a non-standard
one too.
We would like to use the HA-JNDI ports again. Is there a known solution for this problem?
Thanks for all your help!
Bernd
View the original post :
http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3959551#...