[wildfly-dev] Inter Host Controller group communication mesh

Brian Stansberry brian.stansberry at redhat.com
Mon Apr 18 11:04:57 EDT 2016


As an FYI, copying Bela Ban, who I stupidly forgot to copy on the first 
post. Sebastian kindly copied him on the other main branch of the thread.

Bela, tl;dr on this branch is it mostly discusses concerns about N^2 TCP 
connections in a possibly very large cluster. Whether the JGroups 
cluster would need to get very large depends on what use cases we used 
it to solve.
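
To put rough numbers on the N^2 concern, here is a back-of-the-envelope sketch. It assumes exactly one TCP connection per pair of group members, which is a simplification; the real count depends on how the transport is configured. The function names are made up for illustration and are not from the POC.

```python
def mesh_connections(n: int) -> int:
    """Distinct pairwise connections in a full mesh of n members."""
    return n * (n - 1) // 2


def tree_connections(n: int) -> int:
    """Connections when one master HC connects to each of the other n - 1 HCs."""
    return max(n - 1, 0)


# Compare the current tree topology with a full mesh (servers not counted).
for n in (5, 100, 1000):
    print(n, tree_connections(n), mesh_connections(n))
# prints:
# 5 4 10
# 100 99 4950
# 1000 999 499500
```

Each member of the mesh holds n - 1 connections itself, so the per-host cost grows linearly even though the total grows quadratically.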

On 4/11/16 4:20 PM, Brian Stansberry wrote:
> On 4/11/16 3:43 PM, Ken Wills wrote:
>>
>>
>> On Mon, Apr 11, 2016 at 11:57 AM, Brian Stansberry
>> <brian.stansberry at redhat.com> wrote:
>>
>>      Just an FYI: I spent a couple days and worked up a POC[1] of creating a
>>      JGroups-based reliable group communication mesh over the sockets our
>>      Host Controllers use for intra-domain management communications.
>>
>>
>> Nice! I've been thinking about the mechanics of this a bit recently, but
>> I hadn't gotten to any sort of transport details, this looks interesting.
>>
>>      Currently those sockets are used to form a tree of connections; master
>>      HC to slave HCs and then HCs to their servers. Slave HCs don't talk to
>>      each other. That kind of topology works fine for our current use cases,
>>      but not for other use cases, where a full communication mesh is more
>>      appropriate.
>>
>>      2 use cases led me to explore this:
>>
>>      1) A longstanding request to have automatic failover of the master HC to
>>      a backup. There are different ways to do this, but group communication
>>      based leader election is a possible solution. My preference, really.
>>
>>
>> I'd come to the same conclusion of it being an election. A deterministic
>> election algorithm, perhaps allowing the configuration to supply some
>> sort of weighted value to influence the election on each node, perhaps
>> analogous to how the master browser smb election works (version + weight
>> + etc).
>
> Yep.
>
> For sure the master must be running the latest version.
>
>>
>>
>>      2) https://issues.jboss.org/browse/WFLY-1066, which has led to various
>>      design alternatives, one of which is a distributed cache of topology
>>      information, available via each HC. See [2] for some of that discussion.
>>
>>      I don't know if this kind of communication is a good idea, or if it's
>>      the right solution to either of these use cases. Lots of things need
>>      careful thought!! But I figured it was worth some time to experiment.
>>      And it worked in at least a basic POC way, hence this FYI.
>>
>>
>> Not knowing a lot about jgroups .. for very large domains is the mesh
>> NxN in size?
>
> Yes.
>
>> For thousands of nodes would this become a problem,
>
> It's one concern I have, yes. There are large JGroups clusters, but they
> may be based on the UDP multicast transport JGroups offers.
>
>> or would
>> a mechanism to segment into local groups perhaps, with only certain
>> nodes participating in the mesh and being eligible for election?
>
>
> For sure we'd have something in the host.xml that controls whether a
> particular HC joins the group.
>
> I don't think this is a big problem for the DC election use case, as you
> don't need a large number of HCs in the group. You'd have a few
> "potential" DCs that could join the group, and the remaining slaves
> don't need to.
>
> For use cases where you want slave HCs to be in the cluster though, it's
> a concern. The distributed topology cache thing may or may not need
> that. It needs a few HCs to provide HA, but those could be the same ones
> that are "potential" DCs. But if only a few are in the group, the
> servers need to be told how to reach those HCs. Chicken and egg, as the
> point of the topology cache is to provide that kind of data to servers!
> If a server's own HC is required to be a part of the group though, that
> helps cut through the chicken/egg problem.
>
>
>> Ken
>>
>
>
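
For what it's worth, the deterministic election idea discussed above (latest version required, then a configured weight) could be sketched roughly like this. This is only an illustration, not what the POC does; the Candidate fields, the weight setting, and the name-based tie-break are all assumptions of mine.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str                  # hypothetical host name, used as the final tie-break
    version: Tuple[int, ...]   # e.g. (10, 1, 0)
    weight: int                # hypothetical per-host configured preference


def elect(candidates: Iterable[Candidate]) -> Candidate:
    """Pick a master the same way on every node, given the same membership."""
    members = list(candidates)
    if not members:
        raise ValueError("no candidates in the group")
    # Only hosts running the latest version present in the group are eligible.
    latest = max(c.version for c in members)
    eligible = [c for c in members if c.version == latest]
    # Among those, prefer the highest weight; break ties by smallest name.
    top = max(c.weight for c in eligible)
    return min((c for c in eligible if c.weight == top), key=lambda c: c.name)


hosts = [
    Candidate("hc-a", (10, 0, 0), 50),
    Candidate("hc-b", (10, 1, 0), 10),
    Candidate("hc-c", (10, 1, 0), 10),
]
print(elect(hosts).name)  # prints "hc-b"
```

Because the result depends only on the set of members and not on the order in which they are seen, every HC that agrees on the group membership arrives at the same master without extra coordination.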


-- 
Brian Stansberry
Senior Principal Software Engineer
JBoss by Red Hat

