[wildfly-dev] Inter Host Controller group communication mesh

Sebastian Laskawiec slaskawi at redhat.com
Tue Apr 12 05:29:54 EDT 2016


Adding Bela to the thread...

The POC looks really nice to me. I could try to take it from here and finish
the WFLY-1066 implementation to see how everything works together.

The only thing that comes to mind is whether we should add capability and
server group information to it. I think most subsystems would be
interested in that.

On Mon, Apr 11, 2016 at 6:57 PM, Brian Stansberry <
brian.stansberry at redhat.com> wrote:

> Just an FYI: I spent a couple days and worked up a POC[1] of creating a
> JGroups-based reliable group communication mesh over the sockets our
> Host Controllers use for intra-domain management communications.
>
> Currently those sockets are used to form a tree of connections; master
> HC to slave HCs and then HCs to their servers. Slave HCs don't talk to
> each other. That kind of topology works fine for our current use cases,
> but not for other use cases, where a full communication mesh is more
> appropriate.
>
> Two use cases led me to explore this:
>
> 1) A longstanding request to have automatic failover of the master HC to
> a backup. There are different ways to do this, but group communication
> based leader election is a possible solution. My preference, really.
>
> 2) https://issues.jboss.org/browse/WFLY-1066, which has led to various
> design alternatives, one of which is a distributed cache of topology
> information, available via each HC. See [2] for some of that discussion.
>
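As a rough illustration of use case 1, here is a minimal sketch of view-based
leader election in plain Java, with no JGroups dependency. It assumes every HC
sees the same ordered membership list and treats the oldest surviving member as
master, which mirrors the way JGroups treats the first member of the current
view as coordinator. The class name and host names are hypothetical and are not
taken from the POC.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical illustration only; not code from the POC. */
final class ViewLeaderSketch {
    // Members in join order; every member is assumed to see the same list.
    private final List<String> view = new ArrayList<>();

    void join(String hostController) {
        if (!view.contains(hostController)) {
            view.add(hostController);
        }
    }

    void leave(String hostController) {
        view.remove(hostController);
    }

    /** The oldest surviving member acts as master; empty if nobody is left. */
    Optional<String> master() {
        return view.isEmpty() ? Optional.empty() : Optional.of(view.get(0));
    }

    public static void main(String[] args) {
        ViewLeaderSketch group = new ViewLeaderSketch();
        group.join("master-hc");
        group.join("backup-hc");
        group.join("slave-hc");
        System.out.println("master: " + group.master().orElse("none"));

        // The master goes away; the next-oldest member takes over.
        group.leave("master-hc");
        System.out.println("master: " + group.master().orElse("none"));
    }
}
```

A real implementation would also have to deal with network partitions and
merging views, which this sketch ignores.
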
> I don't know if this kind of communication is a good idea, or if it's
> the right solution to either of these use cases. Lots of things need
> careful thought!! But I figured it was worth some time to experiment.
> And it worked in at least a basic POC way, hence this FYI.
>
> If you're interested in details, here are some Q&A:
>
> Q: Why JGroups?
>
> A: Because 1) I know it well 2) I trust it and 3) it's already used for
> this kind of group communications in full WildFly.
>
> Q: Why the management sockets? Why not other sockets?
>
> A: Slave HCs already need configuration for how to discover the master.
> Using the same sockets lets us reuse that discovery configuration for
> the JGroups communications as well. If we're going to use this kind of
> communication in a serious way, the configuration needs to be as easy
> as possible.
>
> Q: How does it work?
>
> A: JGroups is based on a stack of "protocols" each of which handles one
> aspect of reliable group communications. The POC creates and uses a
> standard protocol stack, except it replaces two standard protocols with
> custom ones:
>
> a) JGroups has various "Discovery" protocols which are used to find
> possible peers. I implemented one that integrates with the HC's domain
> controller discovery logic. It's basically a copy of the oft-used
> TCPPING protocol with about 10-15 lines of code changed.
>
> b) JGroups has various "Transport" protocols which are responsible for
> actually sending/receiving over the network. I created a new one of
> those that knows how to use the WF management comms stuff built on JBoss
> Remoting. JGroups provides a number of base classes to use in this
> transport area, so I was able to rely on a lot of existing functionality
> and could just focus on the details specific to this case.
>
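The "stack of protocols" idea described above can be sketched in plain Java
without any JGroups classes. This is a toy model only: each layer is just a
function applied to an outgoing message on its way down the stack, and one
layer is swapped for a custom one while the rest stay untouched. The class
name, layer names and message format are all hypothetical; the real JGroups
protocol API is considerably richer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical toy model of a layered protocol stack; not the JGroups API. */
final class StackSketch {
    // Insertion order is top of stack first. Overwriting an existing key in a
    // LinkedHashMap keeps its original position.
    private final Map<String, UnaryOperator<String>> layers = new LinkedHashMap<>();

    StackSketch add(String name, UnaryOperator<String> onSend) {
        layers.put(name, onSend);
        return this;
    }

    /** Swaps one layer for another implementation at the same position. */
    StackSketch replace(String name, UnaryOperator<String> onSend) {
        if (!layers.containsKey(name)) {
            throw new IllegalArgumentException("no such layer: " + name);
        }
        layers.put(name, onSend);
        return this;
    }

    /** Passes the payload down through every layer, top to bottom. */
    String send(String payload) {
        String message = payload;
        for (UnaryOperator<String> layer : layers.values()) {
            message = layer.apply(message);
        }
        return message;
    }

    public static void main(String[] args) {
        StackSketch stack = new StackSketch()
                .add("reliability", m -> "seq=1|" + m)   // stand-in for retransmission bookkeeping
                .add("transport", m -> "tcp(" + m + ")"); // stand-in for a standard transport
        System.out.println(stack.send("hello"));

        // Replace only the transport; the upper layers are unchanged.
        stack.replace("transport", m -> "mgmt(" + m + ")");
        System.out.println(stack.send("hello"));
    }
}
```

The point of the layering is that a new transport only has to handle moving
bytes; everything above it keeps working as before.
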
> Q: What have you done using the POC?
>
> A: I created a master HC and a slave on my laptop and saw them form a
> cluster and exchange messages. Typical stuff like starting and stopping
> the HCs worked. I see no reason why having multiple slaves wouldn't have
> worked too; I just didn't do it.
>
> Q: What's next?
>
> A: Nothing really. We have a couple concrete use cases we're looking to
> solve. We need to figure out the best solution for those use cases. If
> this kind of thing is useful in that, great. If not, it was a fun POC.
>
> [1]
>
> https://github.com/wildfly/wildfly-core/compare/master...bstansberry:jgroups-dc
> . See the commit message on the single commit to learn a bit more.
>
> [2] https://developer.jboss.org/wiki/ADomainManagedServiceRegistry
>
> --
> Brian Stansberry
> Senior Principal Software Engineer
> JBoss by Red Hat
> _______________________________________________
> wildfly-dev mailing list
> wildfly-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/wildfly-dev
>