[
https://jira.jboss.org/jira/browse/JGRP-844?page=com.atlassian.jira.plugi...
]
Brian Stansberry commented on JGRP-844:
---------------------------------------
AIUI, this assumes a homogeneous deployment of channels on top of the shared discovery,
which may not be true in general, and definitely isn't true during transient periods such as startup.
Scenario:
Two nodes, A and B, with two services 1 and 2 that open channels on top of a shared
discovery. A and B are starting/deploying services concurrently. Order of start is:
1) A1
2) B1
3) B2
4) A2
The problem, I think, is in step 3: B2's GMS will think A is the coordinator
and try to send JOIN msgs to A2, which doesn't exist at that point.
That specific scenario could be recoverable, i.e. A2 will eventually start, at which point
B2's JOIN retries will succeed. But if service 2 wasn't deployed at all on A,
there would be no step 4). In that case I'd think there'd need to be some
mechanism by which GMS could eventually force a real discovery, bypassing the cached
data.
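A minimal sketch of such a fallback, to make the idea concrete. Everything here is hypothetical
and not JGroups API: the class JoinWithFallback, the attempt limit, and the two callbacks
(sendJoin stands in for a JOIN request that may go unanswered, realDiscovery for a full discovery
round that ignores the cache). It assumes the joiner becomes coordinator itself when a real
discovery finds no other member of its group.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names exist in JGroups.
final class JoinWithFallback {

    // Returns the coordinator of the group the caller ends up in,
    // or null if a freshly discovered coordinator did not answer (caller should retry).
    static String join(String self,
                       String cachedCoord,
                       int maxCachedAttempts,
                       Predicate<String> sendJoin,
                       Supplier<String> realDiscovery) {
        // Phase 1: trust the shared cache, but only for a bounded number of JOIN attempts.
        for (int i = 0; i < maxCachedAttempts && cachedCoord != null; i++) {
            if (sendJoin.test(cachedCoord)) {
                return cachedCoord;
            }
        }
        // Phase 2: the cached coordinator never answered (e.g. A2 is not deployed),
        // so bypass the cache and run a real discovery for this group.
        String fresh = realDiscovery.get();
        if (fresh == null) {
            return self; // assumption: nobody else is in the group, so we coordinate it
        }
        return sendJoin.test(fresh) ? fresh : null;
    }
}
```

In the scenario above, B2 would call join("B", "A", 3, ...), see three unanswered JOINs to A, run a
real discovery that finds no member of service 2's group, and become coordinator of that group.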
Discovery: make it a singleton with a shared transport
------------------------------------------------------
Key: JGRP-844
URL: https://jira.jboss.org/jira/browse/JGRP-844
Project: JGroups
Issue Type: Feature Request
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 2.8
When we have a shared transport and 5 channels on top of it, then every channel will run
the discovery protocol. On the first node in a cluster, where no other member responds, each
channel waits for the full timeout, so this will take 5 * Discovery.timeout ms.
Now, if the 5 channels didn't just share the transport, but also the discovery
protocol, then only the first channel to start would have to wait for Discovery.timeout
ms. It would then cache the results of that discovery and, when view changes are received,
replace the contents of the cache with view information.
The remaining 4 channels would then not even need to run the discovery phase, but the
discovery protocol would simply use the current view to return the coordinator. This means
that instead of 5 * timeout, we have 1 * timeout!
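A rough sketch of the caching idea, under stated assumptions. SharedDiscoveryCache and its methods
are hypothetical names, not JGroups API; members are plain strings, the discovery round is passed
in as a callback, and the coordinator is assumed to be the first member of the cached list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: a discovery cache shared by all channels on one transport.
final class SharedDiscoveryCache {
    private List<String> members = null; // null until the first discovery round has completed
    private int discoveryRounds = 0;     // how many real discovery rounds were run

    // Called by each channel when it connects. Only the first caller runs
    // (and waits for) a discovery round; later callers read the cache.
    synchronized String findCoordinator(Supplier<List<String>> discovery) {
        if (members == null) {
            members = new ArrayList<>(discovery.get());
            discoveryRounds++;
        }
        // Assumption: the coordinator is the first member in the list.
        return members.isEmpty() ? null : members.get(0);
    }

    // Called when a view change is received: replace the cache with the view's members.
    synchronized void viewChanged(List<String> view) {
        members = new ArrayList<>(view);
    }

    synchronized int discoveryRounds() {
        return discoveryRounds;
    }
}
```

With 5 channels calling findCoordinator, discoveryRounds() stays at 1, which is the
1 * timeout instead of 5 * timeout described above.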