[jbosscache-dev] Re: [jgroups-dev] Shared transport in AS

Thu Jan 24 11:57:06 EST 2008

Bela Ban wrote:
> 
> 
> Brian Stansberry wrote:
>> Been playing with using the 2.6 branch version of the shared transport
>> in JBoss AS. Unfortunately, the dead-simple integration approach I was
>> thinking of is not valid. (See
>> http://jira.jboss.com/jira/browse/JBAS-5167). I've got other ideas I'm
>> playing with; I'm sure I can come up with something; won't get into
>> details of the problem here.
> 
> Is this a JBoss or JGroups issue ?
> 

Really more of a JBC issue. The way JBC works, you either give it a full 
protocol stack config, or inject a JChannelFactory and a stack name. In 
the AS we're not interested in the former. If you configure for the 
latter, JBC internally gets its channel by calling 
factory.createMultiplexerChannel(...).  That's no good because we don't 
want a multiplexer channel; we want a shared transport channel created 
via factory.createChannel(stack_name).

The AS has the same problem, but I control the code so I can work around 
it more easily.  JBC has its own release cycle, dependency mgmt, etc.

The JBAS-5167 approach is no good -- it tries to use the MC to create 
the channel (using factory.createChannel(stack_name)) and then inject it 
into the target service.  No good because that injection only happens 
once. If you stop() and re-start() the service, the injected channel 
gets closed; when the re-start() tries to call connect() on it again, 
that fails.

The AS already uses a custom JChannelFactory subclass. It already 
overrides createMultiplexerChannel(). As a temporary workaround, I've 
added logic so it checks the requested protocol stack's config to see if 
singleton_name is set; if it is it delegates to 
createChannel(stack_name), if not it calls 
super.createMultiplexerChannel().  Basically I changed the semantics of 
the method to create a shared transport channel if possible, and a 
multiplexer channel if not.

That seems to work for now, well enough to allow continued experimentation.
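
For illustration, here is a rough sketch of that delegation logic. It is not the actual AS code and does not use the real org.jgroups classes: SketchChannel, SketchChannelFactory, SharedTransportAwareFactory and registerStack() are hypothetical stand-ins, and the stack config is modelled as a simple property map. Only the decision itself (singleton_name set, so use createChannel(stack_name), otherwise fall back to the multiplexer) comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in channel type; NOT the real org.jgroups.Channel.
class SketchChannel {
    final String kind;       // "shared-transport" or "multiplexer"
    final String stackName;

    SketchChannel(String kind, String stackName) {
        this.kind = kind;
        this.stackName = stackName;
    }
}

// Hypothetical base factory modelling the two creation paths discussed above.
class SketchChannelFactory {
    // stack name -> stack config properties (e.g. "singleton_name" -> "tunnel")
    protected final Map<String, Map<String, String>> stackConfigs = new HashMap<>();

    void registerStack(String stackName, Map<String, String> props) {
        stackConfigs.put(stackName, new HashMap<>(props));
    }

    SketchChannel createChannel(String stackName) {
        return new SketchChannel("shared-transport", stackName);
    }

    SketchChannel createMultiplexerChannel(String stackName, String id) {
        return new SketchChannel("multiplexer", stackName);
    }
}

// Subclass that changes the semantics of createMultiplexerChannel():
// shared transport channel if the stack allows it, multiplexer channel if not.
class SharedTransportAwareFactory extends SketchChannelFactory {
    @Override
    SketchChannel createMultiplexerChannel(String stackName, String id) {
        Map<String, String> props = stackConfigs.get(stackName);
        String singletonName = (props == null) ? null : props.get("singleton_name");
        if (singletonName != null && !singletonName.trim().isEmpty()) {
            // singleton_name is set: hand out a shared transport channel
            return createChannel(stackName);
        }
        // no singleton_name: keep the old multiplexer behavior
        return super.createMultiplexerChannel(stackName, id);
    }
}
```

The point of putting the check in the factory is that callers such as JBC keep calling createMultiplexerChannel() unchanged and never need to know which kind of channel they got.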

>> But, aside from a few failures related to
>> the flaw in JBAS-5167, it looks like the AS testsuite ran well w/ shared
>> transport channels used instead of multiplexed ones. :)
>>
>> There are a few things I saw that IMHO need improvement before we can
>> switch the AS to this:
>>
>> 1) TP.members -- used to send to all members via unicast by TCP (or by
>> UDP if multicast is disabled). AIUI this should probably be a
>> Map<String, List<Address>> so separate memberships can be maintained.
> 
> Why ? If we have a shared channel C1 and apps A1 (cluster name="A1"), A2 
> (name="A2") and A3 (name="A3") on top of it, then all 3 apps have the 
> *same* JGroups address, e.g. 192.168.2.5:5000. If we have another shared 
> channel C2 in a separate JVM, and apps A1, A2, A3 on top, with address 
> 192.168.2.5:6000, then C1.A1's view is
> {192.168.2.5:5000, 192.168.2.5:6000}, same for all apps.
> 
> So when C1.A1 multicasts a message, then it will conceptually send it to 
> 192.168.2.5:5000 ("A1") and 192.168.2.5:6000 ("A1"). So its view has 2 
> members. A2 and A3 will not receive the multicast, as the cluster name 
> ("A1") is used to demultiplex the message to the right protocol above 
> the shared channel.
> 

True, but I'm talking about multiple unicast, not multicast. And assume 
on node 2 that app A1 is not deployed. So now you're expending resources 
sending messages to peers who will just drop them. Can't be helped and 
is not very costly with multicast, but tracking the members by group 
name allows you to avoid it for multiple unicast.
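
To make the idea concrete, a minimal sketch of per-group member tracking follows. It is only a model: PerClusterMembers, setMembers() and unicastTargets() are made-up names, addresses are plain strings rather than JGroups Address objects, and nothing here reflects how TP actually stores its members.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the transport's membership bookkeeping,
// keyed by cluster (group) name instead of one flat member list.
class PerClusterMembers {
    private final Map<String, List<String>> membersByCluster = new HashMap<>();

    // Replace the membership of one cluster, e.g. when its view changes.
    synchronized void setMembers(String clusterName, List<String> members) {
        membersByCluster.put(clusterName, new ArrayList<>(members));
    }

    // Destinations for a "send to all members via unicast" in one cluster.
    // Peers that do not run this cluster are never returned.
    synchronized List<String> unicastTargets(String clusterName) {
        List<String> members = membersByCluster.get(clusterName);
        if (members == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(members);
    }
}
```

With A1 deployed only on 192.168.2.5:5000 and A2 on both nodes, unicastTargets("A1") would contain just the first address, so nothing for A1 gets sent to 192.168.2.5:6000 only to be dropped there.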

>> 2) MPING. We'd discussed having MPING drop GET_MBRS_RSP messages that
>> are not associated with the channel's group name. This allows a single
>> MPING config (i.e. mcast_addr, mcast_port) in a stack to be instantiated
>> multiple times, for different channels. AFAICT, the way it is now, 2
>> channels created from the same stack will see each others' ping requests
>> and respond, with the irrelevant responses from the wrong channel being
>> treated as valid.
> 
> No, the invalid ping request should get discarded, this is analogous to 
> a *non-shared stack*:
> - draw -props ./mping.xml -groupname X
> - draw -props ./mping.xml -groupname Y
> 
> Both instances use the *same* MPING config, but will discard each 
> other's discovery requests as they belong to different clusters.
> 
> So, in your example, you must have different stacks. Or use the same 
> stack, but change MPING's mcast_addr and mcast_port.
> 
> I don't want to change this as it is *exactly* the same behavior as in 
> the non-shared stack !
> 

OK, it sounds like you are saying this already works the way I want.  I 
admit I didn't test it. :(  I just looked at the MPING code and didn't 
see anything where it would discard a message from the wrong group (e.g. 
in MPING.run(), MPING.up() or Discovery.up()).
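
For clarity, this is roughly the kind of check I was looking for, written as a standalone sketch. DiscoveryMsg and DiscoveryFilter are hypothetical types invented for the example; they are not the real JGroups discovery classes, and the sketch says nothing about where (or whether) the existing code performs such a check.

```java
// Hypothetical model of a discovery request/response; not a real JGroups class.
class DiscoveryMsg {
    final String clusterName;  // group name the sender belongs to
    final String sender;       // sender address, as a plain string

    DiscoveryMsg(String clusterName, String sender) {
        this.clusterName = clusterName;
        this.sender = sender;
    }
}

// Accepts only discovery traffic that belongs to the local channel's group.
class DiscoveryFilter {
    private final String localCluster;

    DiscoveryFilter(String localCluster) {
        if (localCluster == null) {
            throw new IllegalArgumentException("localCluster must not be null");
        }
        this.localCluster = localCluster;
    }

    // true if the message is for our group; anything else should be dropped
    boolean accept(DiscoveryMsg msg) {
        return msg != null && localCluster.equals(msg.clusterName);
    }
}
```

In terms of the draw example above, the instance started with -groupname X would accept messages tagged "X" and drop those tagged "Y", even though both share the same mcast_addr and mcast_port.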

>> That's a pretty small list. I really like this stuff; excellent work, 
>> guys!!
> 
> Thanks !
> 
>> Something else that seems *much* less a show-stopper but is still a 
>> concern:
>>
>> The shared transports are stored in a VM-singleton map --
>> ProtocolStack.shared_transports. Can these instead be stored as an
>> instance field in JChannelFactory? 
> 
> But then you wouldn't be able to share transports between JChannels, and 
> this would force you to use JChannelFactory... I'd like folks to be able 
> to simply create a new JChannel(). If I do this, the variable has to be 
> on the ProtocolStack, can't be on JChannelFactory.
> 
>> This makes the scope of sharing
>> controllable. Perhaps this could be a flag on the JChannelFactory; use
>> the VM-singleton map by default but use a factory-scoped map if a flag
>> is set.
> 
> Can you send me a short program that shows what you want to do ? I 
> assume you don't use JChannelFactory.createMultiplexerChannel() right ? 
> Do you use JCF.createChannel() ?
> 

I'll play with this a bit; if it looks like a serious problem I'll send 
you something.

>> This bit me in one unit test where I deliberately create 2
>> JChannelFactory instances to simulate two cluster nodes. Each factory
>> creates two channels (were multiplexed, now shared TP). Then I do some
>> manipulations. This test fails because when I create the channels from
>> the 2nd factory, Configurator.startProtocolStack barfs:
>>
>> Caused by: java.lang.IllegalStateException: cluster 'tunnel'
>> is already connected to singleton transport: [tunnel, dummy-1201128933100]
>> at org.jgroups.stack.Configurator.startProtocolStack(Configurator.java:88)
>> at org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:316)
>> at org.jgroups.JChannel.startStack(JChannel.java:1442)
>> ... 28 more
>>
>> With the channels stored in a static map, there's no clean way to
>> simulate two separate environments. I imagine this kind of scenario
>> would be common in test cases; I believe JBC has some similar kinds of
>> tests.
> 
> I'll take a look
> 
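
In the meantime, here is a small sketch of the scoping difference, assuming a flag on the factory as suggested above. It is a model only: TransportRegistry and ScopedFactory are hypothetical classes, the registry just records which cluster names are connected to which singleton transport name, and the exception text imitates the one in the trace rather than reproducing the real Configurator logic.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry: singleton transport name -> cluster names connected to it.
class TransportRegistry {
    private final Map<String, Set<String>> clustersByTransport = new HashMap<>();

    synchronized void connect(String singletonName, String clusterName) {
        Set<String> clusters =
            clustersByTransport.computeIfAbsent(singletonName, k -> new HashSet<>());
        if (!clusters.add(clusterName)) {
            throw new IllegalStateException("cluster '" + clusterName
                + "' is already connected to singleton transport: " + singletonName);
        }
    }
}

// Hypothetical factory with a scoping flag: VM-wide registry by default,
// a private registry per factory instance if factoryScoped is true.
class ScopedFactory {
    private static final TransportRegistry VM_WIDE = new TransportRegistry();
    private final TransportRegistry registry;

    ScopedFactory(boolean factoryScoped) {
        this.registry = factoryScoped ? new TransportRegistry() : VM_WIDE;
    }

    void connect(String singletonName, String clusterName) {
        registry.connect(singletonName, clusterName);
    }
}
```

With the default VM-wide registry, a second factory connecting the same cluster name to the same singleton transport fails, which is what my unit test runs into. With the flag set, each factory gets its own registry, so two factories in one VM can stand in for two separate nodes.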

-- 
Brian Stansberry
Lead, AS Clustering
JBoss, a division of Red Hat
brian.stansberry at redhat.com