[JBoss JIRA] Created: (JGRP-346) Connection objects are removed from the ConnectionTable, but remain active on the system and eventually consume available system resources.
by Stuart Jensen (JIRA)
Connection objects are removed from the ConnectionTable, but remain active on the system and eventually consume available system resources.
-------------------------------------------------------------------------------------------------------------------------------------------
Key: JGRP-346
URL: http://jira.jboss.com/jira/browse/JGRP-346
Project: JGroups
Issue Type: Bug
Affects Versions: 2.3 SP1
Environment: SUSE Linux 9
Reporter: Stuart Jensen
Assigned To: Bela Ban
To duplicate the issue:
1) Create a four member cluster using the following configuration: (two member clusters exhibit the problem as well, just not as exaggerated)
TCP(start_port=7801):
TCPPING(initial_hosts=<ip addresses go here>;port_range=3;timeout=3500;num_initial_members=3;up_thread=true;down_thread=true):
MERGE2(min_interval=5000;max_interval=10000):
FD(shun=true;timeout=2500;max_tries=5;up_thread=true;down_thread=true):
VERIFY_SUSPECT(timeout=2000;down_thread=false;up_thread=false):
pbcast.NAKACK(down_thread=true;up_thread=true;gc_lag=100;retransmit_timeout=3000):
pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):
pbcast.GMS(join_timeout=5000;join_retry_timeout=3500;shun=true;print_local_addr=true;down_thread=true;up_thread=true)
2) I was running JGroups in a Tomcat servlet application. Start up the cluster. To determine the number of threads on Linux I executed the following commands:
ps -ef | grep tomcat
echo "" > catalina.out
kill -QUIT <pid from ps command above>
grep ".Sender \[" catalina.out | wc -l
You get the process id of Tomcat using the ps command. Then clear the content of the catalina.out file. The kill command causes the threads to be printed into the catalina.out file. Then the grep searches for and counts all of the "ConnectionTable.Connnection.Sender" threads that are currently active on the system.
3) Pick one of the cluster member boxes and pull the network cable out of the box such that all communication with the other three members is terminated.
4) After one or two minutes, replace the network cable.
5) Repeat the steps to determine the number of threads currently active on the system.
6) Repeat steps 3 through 5, each time watching the number of threads. Each iteration will cause more and more threads to be orphaned on the system. It seems to grow exponentially, after about 4 iterations we have around 300-400 Sender threads. The Receiver threads will be orphaned also in similar numbers.
After investigating the issue, I came up with the following "fix" which cleared the problem up.
In the file ConnectionTable.java there is a method called retainAll(). It appears that this method is called by the TCP protocol when a view change occurs. This method removes Connnections from the "Connection Pool" (member variable conns) but does not destroy them. We initially thought the reaper thread may clean them up, but since the Connection objects are actually removed from the Connection Pool, the reaper does not help the situation. As we watched our connections we noticed that the Connections orphaned by this routine were the ones filling up the system's set of threads. So, we added code to call destroy() on all of the Connection objects that retainAll() removes from the Connection Pool. The "diff" is provided below. Note that we did our change in the JGroups 2.3 SP1 file ConnectionTable.java. Scott Marlow did this diff for me, the same change, but applied to the BasicConnectionTable from the 2.4 source set.
Index: BasicConnectionTable.java
===================================================================
RCS file: /cvsroot/javagroups/JGroups/src/org/jgroups/blocks/BasicConnectionTable.java,v
retrieving revision 1.8
diff -r1.8 BasicConnectionTable.java
22,26c22
< import java.util.Map;
< import java.util.Iterator;
< import java.util.HashMap;
< import java.util.Vector;
< import java.util.Collection;
---
> import java.util.*;
263c259,289
< conns.keySet().retainAll(c);
---
> // conns.keySet().retainAll(c);
> ArrayList alConnsToDestroy = new ArrayList();
> synchronized(conns)
> {
> HashMap copy=new HashMap(conns);
> conns.keySet().retainAll(c);
> Set ks = copy.keySet();
> Iterator iter = ks.iterator();
> while (iter.hasNext())
> {
> Object oKey = iter.next();
> if (null == conns.get(oKey))
> { // This connection NOT in the resultant connection set
> Connection conn = (Connection)copy.get(oKey);
> if (null != conn)
> { // Destroy this connection
> alConnsToDestroy.add(conn);
> }
> }
> }
> }
> // All of the connections that were not retained must be destroyed
> // so that their resources are cleaned up.
> for (int a=0; a<alConnsToDestroy.size(); a++)
> {
> Connection conn = (Connection)alConnsToDestroy.get(a);
> if(log.isTraceEnabled())
> log.trace("Destroy this orphaned connection: " + conn);
> conn.destroy();
> }
>
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.jboss.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months
[JBoss JIRA] Created: (JBCLUSTER-186) Implementations of Invoker should implement equals as an equality check rather than relying on Object.equals, this is important for cluster fail-over support
by Scott Marlow (JIRA)
Implementations of Invoker should implement equals as an equality check rather than relying on Object.equals, this is important for cluster fail-over support
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Key: JBCLUSTER-186
URL: http://jira.jboss.com/jira/browse/JBCLUSTER-186
Project: JBoss Clustering
Issue Type: Bug
Security Level: Public (Everyone can see)
Environment: JBoss as 4.0.4, although this seems to happen in 4.5 + 5.x as well.
Reporter: Scott Marlow
Assigned To: Scott Marlow
Priority: Minor
Part of how JRMPInvokerProxyHA handles fail-over includes removing the reference to the node that left the cluster. However, the dead node is not removed as an equality check is not implemented by certain Invoker implementations.
The relevant code in JRMPInvokerProxyHA is
protected void removeDeadTarget(Object target)
{
if (this.familyClusterInfo != null)
this.familyClusterInfo.removeDeadTarget (target);
}
The code in familyClusterInfo is:
public ArrayList removeDeadTarget(Object target)
{
synchronized (this)
{
ArrayList tmp = (ArrayList) targets.clone();
tmp.remove (target);
this.targets = tmp;
this.isViewMembersInSyncWithViewId = false;
}
return this.targets;
}
Since, we didn't include an equals test in many of the different Invoker implementations, the above "tmp.remove(target)" operation fails. The reason for the failure is due to the "targets" ArrayList changing on every invocation (to reflect the current cluster server membership list), a new "targets" is created (so of course comparing references later will not work.)
A similar problem occurs with the EJB2 load balancers after a cluster membership changes.
I think that these issues will be solved by implementing an equals test in the different invokers that can handle equality testing.
PooledInvokerProxy should implement equals based on ServerAddress.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.jboss.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months
[JBoss JIRA] Created: (JGRP-371) TCP_NIO with SSL
by Bela Ban (JIRA)
TCP_NIO with SSL
----------------
Key: JGRP-371
URL: http://jira.jboss.com/jira/browse/JGRP-371
Project: JGroups
Issue Type: Feature Request
Affects Versions: 2.4
Reporter: Bela Ban
Assigned To: Bela Ban
Fix For: 2.5
Attachments: ssl-nio.jar
>From Hal Hildebrand:
Attached are the sources to allow a new protocol stack which uses SSL over
NIO. This protocol stack element provides security and authentication
(using client side authentication) for a JGroups TCP stack using NIO.
This required two minor modifications in the ConnectionTableNIO class.
These modifications allow one to subclass to create a connection table which
uses SSL for the connections. Finally, there is a new protocol stack
element, SSL_NIO, which one can add to a stack to make use of it.
Regardless of whether this makes it into the codeline of JGroups, it would
be nice to have the changes to ConnectionTableNIO make it into the mainline,
as I currently have to overwrite the original class to easily implement this
- the last thing I want to do is fork ConnectionTableNIO ;) I'd rather just
subclass it. The mods are simple and innocuous (marked with "HSH").
Right now, the SSL_NIO needs to be configured with an SSLSocketFactory. I
didn't bother with integrating with the normal JGroups mechanism using
properties from the configuration because I consider it inherently insecure
to ensconce my passwords in configuration files. But the changes to enable
this are straight forward. Currently, to configure the factory for the
protocol layer, do something like the following before connecting your
channel:
// Construct your Jchannel
JChannel jchannel = ...
// Access your protocol stack
ProtocolStack protocolStack = jchannel.getProtocolStack();
// Retrieve the SSL_NIO protocol layer
SSL_NIO protocol = (SSL_NIO) protocolStack.findProtocol("SSL_NIO");
// Create your SSLSocketFactory
SSLSocketFactory socketFactory = ....
// Set up the protocol
protocol. SetSocketFactory(socketFactory);
// Connect your channel
jchannel.connnect("my-group");
Cheers.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.jboss.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 4 months
[JBoss JIRA] Created: (JBAS-5900) jars are not loaded from the lib directory of a sar in JBoss AS 5
by Matt Wringe (JIRA)
jars are not loaded from the lib directory of a sar in JBoss AS 5
-----------------------------------------------------------------
Key: JBAS-5900
URL: https://jira.jboss.org/jira/browse/JBAS-5900
Project: JBoss Application Server
Issue Type: Bug
Security Level: Public (Everyone can see)
Components: ClassLoading
Affects Versions: JBossAS-5.0.0.CR2
Reporter: Matt Wringe
Assignee: Scott M Stark
Attachments: testsar.tar.gz
In JBoss AS 5, any jars placed in the lib directory of a sar are not loaded and results in a class not found exception. The classes are loaded if the jar is placed in the root directory of the sar.
On previous versions of JBoss AS, the classes would be loaded from either location. This prevents sars that would have worked on JBoss AS 4 to fail on JBoss AS 5.
Steps to reproduce:
1) take zip attached to this report.
2) deploy test-in-lib.sar to JBoss AS 5. The sar will not deploy and will fail with a classNotFoundException
3) deploy test-in-lib.sar to JBoss AS 4, the sar will deploy without issue.
test-in-root.sar will work on both versions of JBoss. The only difference between these sars is the location of the jar.
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 5 months