[jbosscache-dev] Re: Non Blocking State Transfer Status (& Integration with JGroups)

Tue Jan 6 18:38:25 EST 2009

Jason T. Greene wrote:
> Jason T. Greene wrote:
>> Brian Stansberry wrote:
>>> Jason T. Greene wrote:
>>>> Brian Stansberry wrote:
>>>>> Jason T. Greene wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> I wanted to summarize my initial research into NBST. The planned 
>>>>>> design (outlined in the wiki: 
>>>>>> http://www.jboss.org/community/docs/DOC-10275) only needs to block 
>>>>>> transactional activity once, at the end of the process when 
>>>>>> sending the tx log. Unfortunately it appears that flush and 
>>>>>> partial flush can not be used for this, since the application 
>>>>>> needs the ability to send state (tx log) during the flush. I.e. 
>>>>>> transactions need to be paused by only 2 nodes, while they transfer 
>>>>>> state. This however is not a big deal because we can just do this 
>>>>>> in JBoss Cache using a normal RPC message that flips a gate.
>>>>>>
>>>>>> In addition, the state transfer and streaming state transfer 
>>>>>> facilities in jgroups can not be used (since they are designed 
>>>>>> around blocking the entire group). This means JBoss Cache needs to 
>>>>>> stream state itself. Ideally this would be a separate 
>>>>>> point-to-point connection, since we don't want to pollute 
>>>>>> multicast traffic with potentially huge volumes of noise. 
>>>>>> Currently jgroups does not yet support a streaming API like this:
>>>>>> https://jira.jboss.org/jira/browse/JGRP-653
>>>>>>
>>>>>> IMO this leaves us with 3 options:
>>>>>>
>>>>>> 1. Wait on JGRP-653 (upping its priority), also add requirements 
>>>>>> for a p2p connection.
>>>>>> 2. Implement our own p2p connection using tcp (probably using xnio).
>>>>>> 3. Somehow enhance state transfer / partial flush to meet our needs
>>>>>>
>>>>>> Option 1 seems to be a useful feature for other applications, 
>>>>>> although we need feedback from Bela and Vladimir about that.
>>>>>>
>>>>>> Option 2 would give us more flexibility in the implementation, 
>>>>>> however care has to be taken to ensure that communication can only 
>>>>>> happen between group members (for security reasons), and that the 
>>>>>> network address configurations are somehow reused.
>>>>>>
>>>>>> Option 3 I am less fond of, since we would likely end up adding a 
>>>>>> bunch of JBoss Cache specific code to JGroups that no one else 
>>>>>> would use.
>>>>>>
>>>>>
>>>>> Option 2 makes me nervous. Two separate communication frameworks, 
>>>>> added dependencies, opening new sockets etc. Sounds like 
>>>>> integration hassles for sure.
>>>>>
>>>>
>>>> Yes there are definitely integration hassles that make this option 
>>>> less desirable than the first.
>>>>
>>>> From a dependency perspective, we are already using non-jgroups p2p 
>>>> with TcpCacheServer (currently Java sockets based), although I 
>>>> believe Manik was evaluating xnio for it since it would simplify 
>>>> development. While it is an added dep for JBC, it will eventually be 
>>>> part of AS, since Remoting 3 depends on it.
>>>
>>> Yeah, that's part of the concern. Two otherwise independent projects 
>>> using an underlying library for a critical function and both have to 
>>> play nice in the AS.  In my self-centered viewpoint TcpCacheServer 
>>> isn't a critical function since I don't use it. ;) (Mostly kidding 
>>> here; I recognize that AS users may use it so it would need to work 
>>> in the AS.)
>>>
>>> BTW, is a *JGroups* streaming API necessary here?  The old AS Farm 
>>> service passed arbitrary sized files by sending byte[] chunks via 
>>> RpcDispatcher calls. Worked fine. That's not quite what JBC would 
>>> need, since FarmService read a chunk from a FileInputStream and 
>>> passed it to JGroups; you'd want an OutputStream impl that would pass 
>>> a chunk to JGroups when its internal buffer reached size X.
>>
>> There could be an abstraction in JBC that does this, however we are 
>> still missing the piece that says this message needs to go across a 
>> separate p2p tcp connection. Otherwise state transfer traffic will 
>> compete with live standard operations. This problem is amplified when 
>> we have multiple state transfers going on.
>>
>> I suppose we could dynamically create a new JGroups channel for this, 
>> but it seems like a lot of overhead for a single p2p connection.
>>
> 
> Thinking again, building this over JGroups would probably be better with 
> two static channels. One would be TCP + MPING, the other whatever the 
> user defines.
> 

I don't understand why two channels are needed. You can easily unicast 
over the regular JGroups channel. Is your concern that the state 
transfer messages between the sender and the recipient will get held up 
by FC or UNICAST or something due to the regular traffic? I don't see 
why that would be a big problem. The recipient, until it completes the 
state transfer, isn't going to be doing anything (e.g. holding locks) 
that will cause significant delays in processing state transfer 
messages.  (Also, I may be wrong but I believe the UNICAST and NAKACK 
locks are separate, so a thread carrying a unicast state transfer 
message from node A will not block on recipient B because a multicast 
replication message from A is being handled by another thread.)
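
To make the chunking idea from earlier in the thread concrete, here is a rough, untested-against-JGroups sketch of an OutputStream that hands off a chunk whenever its internal buffer reaches size X. ChunkSender and ChunkingOutputStream are names I made up for illustration; in practice the sender would presumably wrap a unicast RpcDispatcher call to the node receiving state, but that wiring is not shown here.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

/** Hypothetical callback; assumed to wrap a unicast send to the state recipient. */
interface ChunkSender {
    void send(byte[] chunk) throws IOException;
}

/** Buffers writes and passes a chunk to the sender each time the buffer fills. */
class ChunkingOutputStream extends OutputStream {
    private final byte[] buf;
    private final ChunkSender sender;
    private int count; // number of valid bytes currently in buf

    ChunkingOutputStream(ChunkSender sender, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.sender = sender;
        this.buf = new byte[chunkSize];
    }

    @Override
    public void write(int b) throws IOException {
        buf[count++] = (byte) b;
        if (count == buf.length) {
            flush();
        }
    }

    @Override
    public void write(byte[] src, int off, int len) throws IOException {
        while (len > 0) {
            // Copy as much as fits, then ship the chunk if the buffer is full.
            int n = Math.min(len, buf.length - count);
            System.arraycopy(src, off, buf, count, n);
            count += n;
            off += n;
            len -= n;
            if (count == buf.length) {
                flush();
            }
        }
    }

    /** Sends whatever is buffered (possibly a short final chunk). */
    @Override
    public void flush() throws IOException {
        if (count > 0) {
            sender.send(Arrays.copyOf(buf, count));
            count = 0;
        }
    }

    @Override
    public void close() throws IOException {
        flush();
    }
}
```

The state writer would just marshal into this stream and close it at the end; the receiving side would need to reassemble chunks in order, which this sketch leaves out.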

Also, going back to your very first paragraph "flush and
partial flush can not be used for this, since the application
needs the ability to send state (tx log) during the flush" -- I believe 
unicast messages are not blocked by FLUSH. Although depending on that 
seems like a hack.
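
For comparison, the "normal RPC message that flips a gate" alternative from the first message could be as simple as the sketch below. TxGate is a hypothetical name, not an existing JBoss Cache class, and I am assuming the gate should also wait for in-flight transactions to drain before the tx log is sent; the wiki design may differ on that point.

```java
/** Hypothetical gate that pauses new transactions while the tx log is sent. */
class TxGate {
    private boolean closed; // true while transactions must wait
    private int active;     // transactions currently inside the gate

    /** Called by a transaction before it starts; blocks while the gate is closed. */
    synchronized void enter() throws InterruptedException {
        while (closed) {
            wait();
        }
        active++;
    }

    /** Called by a transaction when it commits or rolls back. */
    synchronized void exit() {
        active--;
        notifyAll();
    }

    /**
     * Called from the (assumed) RPC handler on the two nodes involved:
     * stops new transactions and waits for in-flight ones to finish.
     */
    synchronized void close() throws InterruptedException {
        closed = true;
        while (active > 0) {
            wait();
        }
    }

    /** Called once the tx log has been transferred; releases waiting transactions. */
    synchronized void open() {
        closed = false;
        notifyAll();
    }

    synchronized boolean isClosed() {
        return closed;
    }
}
```

Only the state provider and the joining node would flip their gates, so the rest of the group keeps processing transactions, which is the property FLUSH does not give us.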

-- 
Brian Stansberry
Lead, AS Clustering
JBoss, a division of Red Hat
brian.stansberry at redhat.com