[jboss-jira] [JBoss JIRA] Commented: (JGRP-1200) GossipRouter: lookup fails when we have multiple GossipRouters and any of them throws an exception

Bela Ban (JIRA) jira-events at lists.jboss.org
Tue May 4 10:41:05 EDT 2010


    [ https://jira.jboss.org/jira/browse/JGRP-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12528876#action_12528876 ] 

Bela Ban commented on JGRP-1200:
--------------------------------

I believe the underlying cause may be the fact that the reconnector in TCPGOSSIP always tries to reconnect *all* stubs, not just the ones which are currently disconnected. We could fix this by checking in RouterStub.connect() whether we're already connected and, if so, simply returning from connect().
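
A minimal sketch of the guard described above, assuming a stub that tracks its own connection state. StubSketch and its connectAttempts counter are hypothetical stand-ins for illustration, not the actual RouterStub code:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for RouterStub; not the real JGroups class.
class StubSketch {
    private boolean connected;
    // Counts how often a real connection attempt was made (illustration only).
    final AtomicInteger connectAttempts = new AtomicInteger();

    synchronized boolean isConnected() {
        return connected;
    }

    synchronized void connect() {
        if (connected) {
            return; // already connected: a repeated connect() is a no-op
        }
        connectAttempts.incrementAndGet(); // a real stub would open its socket here
        connected = true;
    }

    synchronized void disconnect() {
        connected = false;
    }
}
```

With this guard in place, a reconnector that blindly calls connect() on every stub no longer disturbs the ones that are already connected.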

However, there is another problem: when, out of {A, B, C}, B and C are disconnected and the reconnector then successfully connects to B, the reconnector gets stopped and won't even try to connect to C!

I think we should change both the reconnector and the connection checker to only reconnect to disconnected stubs (reconnector) and to only check the connection health (connection checker) for connected stubs.

I propose that both reconnector and connection checker have a list of stubs that we can remove from or add to. When the list is empty, the task is stopped.
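
The proposal above could look roughly like the following sketch. StubListTask, runOnce() and the Predicate-based "attempt" callback are assumptions made for illustration; the real reconnector and connection checker would be driven by a timer and work on RouterStub instances:

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical task holding the stubs it still has to work on.
// The task counts as running only while that set is non-empty.
class StubListTask<T> {
    private final Set<T> stubs = new LinkedHashSet<>();
    // Returns true when the task is done with a stub (e.g. reconnect succeeded).
    private final Predicate<T> attempt;

    StubListTask(Predicate<T> attempt) {
        this.attempt = attempt;
    }

    synchronized void add(T stub) {
        stubs.add(stub);
    }

    synchronized void remove(T stub) {
        stubs.remove(stub);
    }

    synchronized int pending() {
        return stubs.size();
    }

    // An empty list means the task is stopped.
    synchronized boolean isRunning() {
        return !stubs.isEmpty();
    }

    // One pass over the list; a timer would call this while isRunning() is true.
    synchronized void runOnce() {
        stubs.removeIf(attempt); // drop every stub the attempt succeeded for
    }
}
```

In the {A, B, C} case, a reconnector built this way would hold B and C, drop B once it reconnects, and keep running until C has been reconnected as well.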

This issue is causing problems in my JBossWorld demo, so I need to fix it this week.



> GossipRouter: lookup fails when we have multiple GossipRouters and any of them throws an exception
> --------------------------------------------------------------------------------------------------
>
>                 Key: JGRP-1200
>                 URL: https://jira.jboss.org/jira/browse/JGRP-1200
>             Project: JGroups
>          Issue Type: Bug
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 2.10
>
>
> Example: <TCPGOSSIP initial_hosts="A[12001],B[12001]" />
> If GossipRouter on host A is not running, but B *is* running, then the discovery will return with an empty list because connecting to A threw an exception.
> SOLUTION: parallelize connections to A and B and ignore exception for non-available hosts
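
The SOLUTION line in the issue description could be sketched as follows, assuming each GossipRouter lookup is wrapped in a Callable that returns the members it knows about. ParallelLookup and lookupAll are hypothetical names, not part of JGroups:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: query all routers in parallel, skip the ones that fail.
class ParallelLookup {
    static <T> List<T> lookupAll(List<? extends Callable<List<T>>> routers) {
        List<T> members = new ArrayList<>();
        if (routers.isEmpty()) {
            return members; // nothing to ask; also avoids a zero-sized pool
        }
        ExecutorService pool = Executors.newFixedThreadPool(routers.size());
        try {
            // invokeAll() blocks until every lookup has finished or failed.
            for (Future<List<T>> result : pool.invokeAll(routers)) {
                try {
                    members.addAll(result.get());
                } catch (ExecutionException e) {
                    // This router is unavailable: ignore it and use the others.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag set
        } finally {
            pool.shutdown();
        }
        return members;
    }
}
```

With initial_hosts="A[12001],B[12001]" and A down, the lookup against A fails on its own while the result from B is still returned.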


        

