[jboss-jira] [JBoss JIRA] (JGRP-1826) Discovery: file-based discovery protocols should not send discovery requests

Tue Apr 22 05:24:33 EDT 2014

    [ https://issues.jboss.org/browse/JGRP-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12962942#comment-12962942 ] 

Bela Ban commented on JGRP-1826:
--------------------------------

The discovery should not have to read N files in a bucket (e.g. 1000 files for 1000 nodes), as this is very slow: the REST API to cloud storage needs to issue 1000 HTTP requests, parse the responses, etc.
First of all, we need to see if those HTTP requests use connection pooling and how many of those requests are sent in parallel.
Secondly, we should terminate the loop as soon as we've received a coordinator response.
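
A minimal sketch of the two points above (bounded parallelism and stopping at the first coordinator), not the actual JGroups code. {{NodeInfo}}, {{EarlyExitDiscovery}} and {{readUntilCoord}} are hypothetical names, and each {{Callable}} stands in for one HTTP request that fetches and parses a single file:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical holder for what one discovery file contains; not a JGroups class.
final class NodeInfo {
    final String logicalName;
    final String physicalAddr;
    final boolean coord;

    NodeInfo(String logicalName, String physicalAddr, boolean coord) {
        this.logicalName = logicalName;
        this.physicalAddr = physicalAddr;
        this.coord = coord;
    }
}

final class EarlyExitDiscovery {
    /**
     * Runs the fetchers on a pool of maxParallel threads and returns the entries
     * read up to and including the first coordinator entry. Reads that are still
     * outstanding at that point are cancelled.
     */
    static List<NodeInfo> readUntilCoord(List<Callable<NodeInfo>> fetchers, int maxParallel) {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        ExecutorCompletionService<NodeInfo> done = new ExecutorCompletionService<>(pool);
        List<Future<NodeInfo>> pending = new ArrayList<>();
        List<NodeInfo> seen = new ArrayList<>();
        try {
            for (Callable<NodeInfo> fetcher : fetchers)
                pending.add(done.submit(fetcher));
            for (int i = 0; i < pending.size(); i++) {
                try {
                    NodeInfo info = done.take().get(); // next read to complete
                    if (info == null)
                        continue;
                    seen.add(info);
                    if (info.coord)
                        break; // coordinator found: terminate the loop
                } catch (ExecutionException e) {
                    // a failed read is skipped; the other files may still answer
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        } finally {
            for (Future<NodeInfo> f : pending)
                f.cancel(true); // no effect on reads that already completed
            pool.shutdownNow();
        }
        return seen;
    }
}
```

Note that stopping early has the same drawback as reading only the coordinator file: entries that were not read yet do not end up in the local caches.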

Alternatively, we could create a special file ending in *.node.coord*, and only read the coord node(s). However, this would not get us the information (logical address to physical address mapping) of the other nodes...
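
As a rough illustration of the alternative, assuming the bucket listing is already available as a list of file names, picking the coordinator files is a plain suffix filter. {{CoordFileFilter}} and {{coordFiles}} are made-up names; only the {{.node.coord}} suffix comes from the proposal above:

```java
import java.util.ArrayList;
import java.util.List;

final class CoordFileFilter {
    // Suffix from the proposal above; everything else here is hypothetical.
    static final String COORD_SUFFIX = ".node.coord";

    /** Returns only the names that mark a coordinator, in listing order. */
    static List<String> coordFiles(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name != null && name.endsWith(COORD_SUFFIX))
                result.add(name);
        }
        return result;
    }
}
```

Only the files returned here would then be fetched, which is exactly why the mappings of the non-coordinator nodes would be missing.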

> Discovery: file-based discovery protocols should not send discovery requests
> ----------------------------------------------------------------------------
>
>                 Key: JGRP-1826
>                 URL: https://issues.jboss.org/browse/JGRP-1826
>             Project: JGroups
>          Issue Type: Enhancement
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 3.5
>
>
> When a node stores its information in a directory ({{FILE_PING}}, {{S3_PING}} or {{GOOGLE_PING}}), then we can optimize discovery by implementing a few things:
> * After reading all files, we send each node (represented by a file) a discovery request. That node processes the request and sends back a discovery response. This is unneeded traffic, especially with large clusters. Instead:
> ** Read all files and add the information read from the files into the local caches (logical_addr_cache, UUID cache etc). This is the same as processing discovery responses from all members
> * Determine the coordinators directly from the file information. Perhaps we could even create a special file which contains information about the coordinator.
> ** This would prevent partitions from happening when starting up a large number of nodes: as long as that special file exists, nobody else will take ownership of it. When the coord leaves or crashes, we atomically replace the special file
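
A sketch of the special coordinator file for the {{FILE_PING}} case only, assuming a local or shared file system where create-if-absent and rename are atomic; cloud stores behind {{S3_PING}} or {{GOOGLE_PING}} may not offer the same guarantees. {{CoordFile}}, {{tryClaim}} and {{replace}} are hypothetical names, not JGroups API:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

final class CoordFile {
    /**
     * Tries to take ownership of the coordinator file. Returns false if the file
     * already exists, i.e. somebody else is coordinator.
     */
    static boolean tryClaim(Path marker, String owner) {
        try {
            // CREATE_NEW makes the existence check and the creation one atomic step.
            // The content itself is written afterwards, so a reader may briefly see
            // an empty file.
            Files.write(marker, owner.getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Replaces the coordinator file after the coord left or crashed: the new
     * content goes to a temp file in the same directory, which is then renamed
     * over the old file. Whether an atomic move replaces an existing target is
     * implementation specific; on POSIX file systems it does.
     */
    static void replace(Path marker, String newOwner) {
        try {
            Path dir = marker.toAbsolutePath().getParent();
            Path tmp = Files.createTempFile(dir, "coord", ".tmp");
            Files.write(tmp, newOwner.getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, marker, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With many nodes starting at once, only the first {{tryClaim}} succeeds; all others see an existing file and read the coordinator from it instead of electing themselves.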
