[jbosscache-dev] Re: Non Blocking State Transfer Status (& Integration with JGroups)
Jason T. Greene
jason.greene at redhat.com
Tue Jan 6 19:10:53 EST 2009
Brian Stansberry wrote:
> Jason T. Greene wrote:
>> Jason T. Greene wrote:
>>> Brian Stansberry wrote:
>>>> Jason T. Greene wrote:
>>>>> Brian Stansberry wrote:
>>>>>> Jason T. Greene wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> I wanted to summarize my initial research into NBST. The planned
>>>>>>> design (outlined in the wiki:
>>>>>>> http://www.jboss.org/community/docs/DOC-10275) only needs to
>>>>>>> block transactional activity once, at the end of the process when
>>>>>>> sending the tx log. Unfortunately it appears that flush and
>>>>>>> partial flush cannot be used for this, since the application
>>>>>>> needs the ability to send state (the tx log) during the flush.
>>>>>>> I.e. transactions need to be paused by only the 2 nodes involved,
>>>>>>> while they transfer state. This, however, is not a big deal,
>>>>>>> because we can just do this in JBoss Cache using a normal RPC
>>>>>>> message that flips a gate.
>>>>>>>
>>>>>>> In addition, the state transfer and streaming state transfer
>>>>>>> facilities in JGroups cannot be used (since they are designed
>>>>>>> around blocking the entire group). This means JBoss Cache needs
>>>>>>> to stream state itself. Ideally this would be a separate
>>>>>>> point-to-point connection, since we don't want to pollute
>>>>>>> multicast traffic with potentially huge volumes of noise.
>>>>>>> Currently JGroups does not yet support a streaming API like this:
>>>>>>> https://jira.jboss.org/jira/browse/JGRP-653
>>>>>>>
>>>>>>> IMO this leaves us with 3 options:
>>>>>>>
>>>>>>> 1. Wait on JGRP-653 (upping its priority), and add requirements
>>>>>>> for a p2p connection.
>>>>>>> 2. Implement our own p2p connection using TCP (probably via XNIO).
>>>>>>> 3. Somehow enhance state transfer / partial flush to meet our needs
>>>>>>>
>>>>>>> Option 1 seems like it would be a useful feature for other
>>>>>>> applications, although we need feedback from Bela and Vladimir
>>>>>>> on that.
>>>>>>>
>>>>>>> Option 2 would give us more flexibility in the implementation;
>>>>>>> however, care has to be taken to ensure that communication can
>>>>>>> only happen between group members (for security reasons), and
>>>>>>> that the network address configurations are somehow reused.
>>>>>>>
>>>>>>> Option 3 I am less fond of, since we would likely end up adding
>>>>>>> a bunch of JBoss Cache-specific code to JGroups that no one else
>>>>>>> would use.
>>>>>>>
>>>>>>
>>>>>> Option 2 makes me nervous. Two separate communication frameworks,
>>>>>> added dependencies, opening new sockets etc. Sounds like
>>>>>> integration hassles for sure.
>>>>>>
>>>>>
>>>>> Yes there are definitely integration hassles that make this option
>>>>> less desirable than the first.
>>>>>
>>>>> From a dependency perspective, we are already using non-JGroups
>>>>> p2p with TCPCacheServer (currently Java sockets based), although I
>>>>> believe Manik was evaluating XNIO for it, since it would simplify
>>>>> development. While it is an added dep for JBC, it will eventually
>>>>> be part of AS, since Remoting 3 depends on it.
>>>>
>>>> Yeah, that's part of the concern. Two otherwise independent projects
>>>> using an underlying library for a critical function and both have to
>>>> play nice in the AS. In my self-centered viewpoint TcpCacheServer
>>>> isn't a critical function since I don't use it. ;) (Mostly kidding
>>>> here; I recognize that AS users may use it so it would need to work
>>>> in the AS.)
>>>>
>>>> BTW, is a *JGroups* streaming API necessary here? The old AS Farm
>>>> service passed arbitrary-sized files by sending byte[] chunks via
>>>> RpcDispatcher calls. Worked fine. That's not quite what JBC would
>>>> need, since FarmService read a chunk from a FileInputStream and
>>>> passed it to JGroups; you'd want an OutputStream impl that would
>>>> pass a chunk to JGroups when its internal buffer reached size X.
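[Editor's note: the OutputStream idea quoted above could be sketched roughly as below. This is a hypothetical illustration, not JBC or JGroups code; the class name ChunkingOutputStream is invented, and a plain callback stands in for the RpcDispatcher call that would actually ship each chunk.]

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: an OutputStream that hands fixed-size byte[]
// chunks to a sender callback -- in practice an RpcDispatcher call --
// whenever its internal buffer reaches size X.
public class ChunkingOutputStream extends OutputStream {
    private final byte[] buf;
    private int pos;
    private final Consumer<byte[]> chunkSender; // stand-in for the RPC call

    public ChunkingOutputStream(int chunkSize, Consumer<byte[]> chunkSender) {
        this.buf = new byte[chunkSize];
        this.chunkSender = chunkSender;
    }

    @Override
    public void write(int b) {
        buf[pos++] = (byte) b;
        if (pos == buf.length) {
            flushChunk(); // buffer reached size X: ship a chunk
        }
    }

    @Override
    public void flush() { flushChunk(); } // push out any partial final chunk

    @Override
    public void close() { flushChunk(); }

    private void flushChunk() {
        if (pos > 0) {
            chunkSender.accept(Arrays.copyOf(buf, pos));
            pos = 0;
        }
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> sent = new ArrayList<>();
        try (OutputStream out = new ChunkingOutputStream(4, sent::add)) {
            out.write("0123456789".getBytes());
        }
        System.out.println(sent.size()); // prints 3 (chunks of 4 + 4 + 2 bytes)
    }
}
```

A state provider could then serialize its state straight into such a stream, with each chunk arriving at the receiver as an ordinary unicast message.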
>>>
>>> There could be an abstraction in JBC that does this; however, we are
>>> still missing the piece that says this message needs to go across a
>>> separate p2p TCP connection. Otherwise state transfer traffic will
>>> compete with live standard operations. This problem is amplified when
>>> we have multiple state transfers going on.
>>>
>>> I suppose we could dynamically create a new JGroups channel for this,
>>> but it seems like a lot of overhead for a single p2p connection.
>>>
>>
>> Thinking again, building this over JGroups would probably be better,
>> with two static channels: one would be TCP + MPING, the other whatever
>> the user defines.
>>
>
> I don't understand why two channels are needed. You can easily unicast
> over the regular JGroups channel. Is your concern that the state
> transfer messages between the sender and the recipient will get held up
> by FC or UNICAST or something due to the regular traffic? I don't see
> why that would be a big problem. The recipient, until it completes the
> state transfer, isn't going to be doing anything (e.g. holding locks)
> that would cause significant delays in processing state transfer
> messages. (Also, I may be wrong but I believe the UNICAST and NAKACK
> locks are separate, so a thread carrying a unicast state transfer
> message from node A will not block on recipient B because a multicast
> replication message from A is being handled by another thread.)
>
I am not too worried about the locking, although that is a factor, but
more about a multicast configuration, where you have state transfer
traffic for N transfers hitting the multicast address. Even if everyone
discards the traffic, it still has to be processed in Java land, and it
will be a lot of traffic.
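[Editor's note: the "RPC message that flips a gate" mentioned at the top of the thread could be as simple as a read-write lock that transactions share and the state-transfer RPC closes. A hypothetical sketch, not actual JBC code; the class and method names are invented.]

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: transactions hold the read side of a read-write
// lock; the state-transfer RPC "flips the gate" by taking the write side,
// which waits for in-flight transactions to drain and blocks new ones --
// on just the two nodes involved in the transfer, not the whole group.
public class TransactionGate {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

    // Called at the start of each transaction; blocks while the gate is closed.
    public void enter() { lock.readLock().lock(); }

    // Called when the transaction finishes.
    public void exit() { lock.readLock().unlock(); }

    // Invoked by the gate-flipping RPC before sending the tx log.
    public void close() { lock.writeLock().lock(); }

    // Reopens the gate once the tx log has been shipped.
    public void open() { lock.writeLock().unlock(); }

    public boolean isClosed() { return lock.isWriteLocked(); }

    public static void main(String[] args) {
        TransactionGate gate = new TransactionGate();
        gate.enter();                        // a transaction is in flight
        gate.exit();                         // ...and completes
        gate.close();                        // new transactions would now block
        System.out.println(gate.isClosed()); // prints true
        gate.open();
    }
}
```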
--
Jason T. Greene
JBoss, a division of Red Hat