[jboss-jira] [JBoss JIRA] Resolved: (JGRP-244) long connect time in application that frequently closes and opens new channel

Bela Ban (JIRA) jira-events at jboss.com
Fri Aug 11 08:52:13 EDT 2006


     [ http://jira.jboss.com/jira/browse/JGRP-244?page=all ]

Bela Ban resolved JGRP-244.
---------------------------

    Resolution: Done

> long connect time in application that frequently closes and opens new channel
> -----------------------------------------------------------------------------
>
>                 Key: JGRP-244
>                 URL: http://jira.jboss.com/jira/browse/JGRP-244
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.2.8, 2.2.9, 2.3, 2.2.9.1, 2.2.9.2
>         Environment: SLES 9
>            Reporter: Bruce Schuchardt
>         Assigned To: Bela Ban
>             Fix For: 2.4
>
>
> I have an application with one long-lived JGroups member and four other processes that often close their channel and create a new one.  Most of the time the new connections are established in a dozen milliseconds or so, but sometimes they take over 20 seconds.  The apps use TCPGOSSIP with multicast turned off.
> I turned on tracing and saw that the coordinator's UNICAST was having trouble: it received a message from a departed member, held on to it, and dispatched it later when the departed member's address was reused by a new channel.
> 	a) A member left the view, and UNICAST removed its connection for that member and added the member to previous_members.
> 	b) Another message then arrived from the departed member, and UNICAST created a new connection for it.  The message had seqno 4, so it was put in the AckReceiverWindow and not passed up.
> 	c) A few seconds later, a process created a new channel, which got the same ID as the member the coordinator's UNICAST had just dealt with.
> 	d) The new channel sent three UNICAST messages to the coordinator.  On the third message, the coordinator's UNICAST removed #3 and the old #4 from the window and passed them both up.
> 	e) The new channel sent message #4, a JOIN_REQ, and UNICAST discarded it.
> The new channel eventually goes through discovery again and gets into the group, but the retry adds quite a bit to channel startup time, and I'm a little worried that there might be a case where a much higher seqno gets trapped in the receiver window like this.
> I fixed this for my needs by changing UNICAST.handleDataReceived to reject a message from a member in previous_members if its seqno is higher than the default initial seqno, but I suppose that might wreak havoc with some other algorithms:
>     private void handleDataReceived(Object sender, long seqno, Message msg) {
>         if(trace)
>             log.trace(new StringBuffer().append(local_addr).append(" <-- DATA(").append(sender).append(": #").append(seqno));
>         if(previous_members.contains(sender)) {
>             // we don't want to see messages from departed members
>             if(seqno > DEFAULT_FIRST_SEQNO) {
>                 if(trace)
>                     log.trace("discarding message " + seqno + " from previous member " + sender);
>                 return;
>             }
>             if(trace)
>                 log.trace("removed " + sender + " from previous_members as we received a message from it");
>             previous_members.removeElement(sender);
>         }
>         Entry    entry;
>         synchronized(connections) {
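
The sequence a) through e) above, and the effect of the reporter's guard, can be replayed with a small stand-alone model. This is only a sketch under stated assumptions, not the JGroups code: SimpleReceiverWindow, SimpleUnicastReceiver, memberLeft, replay and the rejectStale flag are hypothetical names made up for illustration, the initial seqno is assumed to be 1, and the window is reduced to in-order delivery with no acks or retransmission.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical, much-simplified per-sender receive window (not the JGroups AckReceiverWindow). */
class SimpleReceiverWindow {
    private final TreeMap<Long, String> pending = new TreeMap<>();
    private long nextToDeliver;

    SimpleReceiverWindow(long firstSeqno) { nextToDeliver = firstSeqno; }

    /** Buffers the message and returns every payload that is now deliverable in seqno order. */
    List<String> add(long seqno, String payload) {
        List<String> delivered = new ArrayList<>();
        if (seqno < nextToDeliver || pending.containsKey(seqno))
            return delivered; // seqno already delivered or already buffered: drop it
        pending.put(seqno, payload);
        while (pending.containsKey(nextToDeliver))
            delivered.add(pending.remove(nextToDeliver++));
        return delivered;
    }
}

/** Hypothetical model of the coordinator's receive path described in the report. */
class SimpleUnicastReceiver {
    static final long DEFAULT_FIRST_SEQNO = 1; // assumption: initial seqno is 1

    private final Set<String> previousMembers = new HashSet<>();
    private final Map<String, SimpleReceiverWindow> connections = new HashMap<>();
    private final boolean rejectStale; // true = apply the guard proposed in the report
    final List<String> passedUp = new ArrayList<>();

    SimpleUnicastReceiver(boolean rejectStale) { this.rejectStale = rejectStale; }

    /** Step a): the member leaves the view; its connection is dropped and it is remembered. */
    void memberLeft(String sender) {
        connections.remove(sender);
        previousMembers.add(sender);
    }

    void handleDataReceived(String sender, long seqno, String payload) {
        if (previousMembers.contains(sender)) {
            // The guard: a departed member's address may only restart at the first seqno.
            if (rejectStale && seqno > DEFAULT_FIRST_SEQNO)
                return;
            previousMembers.remove(sender);
        }
        SimpleReceiverWindow win = connections.computeIfAbsent(
                sender, s -> new SimpleReceiverWindow(DEFAULT_FIRST_SEQNO));
        passedUp.addAll(win.add(seqno, payload));
    }

    /** Replays steps a) to e) and returns what was passed up, in order. */
    static List<String> replay(boolean rejectStale) {
        SimpleUnicastReceiver coord = new SimpleUnicastReceiver(rejectStale);
        coord.memberLeft("A");                        // a)
        coord.handleDataReceived("A", 4, "old-4");    // b) late message from the departed member
        coord.handleDataReceived("A", 1, "new-1");    // c), d) new channel reuses address "A"
        coord.handleDataReceived("A", 2, "new-2");
        coord.handleDataReceived("A", 3, "new-3");
        coord.handleDataReceived("A", 4, "JOIN_REQ"); // e)
        return coord.passedUp;
    }
}
```

In this model, replay(false) passes up new-1, new-2, new-3 and then the stale old-4, and the JOIN_REQ is dropped because seqno 4 has already been consumed. replay(true) drops old-4 on arrival and passes up the JOIN_REQ instead. Note that the guard only filters seqnos above the initial one, so a late message from the departed member that happens to carry the first seqno would still get through.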

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.jboss.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira