[jboss-dev-forums] [Design of Messaging on JBoss (Messaging/JBoss)] - Re: Messaging and Remoting

Wed Oct 18 07:15:12 EDT 2006

I originally started this discussion (in the thread Ovidiu pointed out) because, after working deeply with remoting for some time, I started to wonder whether it is the right fit for our project.

Before I go any further, I would like to make clear that I am fully aware that Tom and Ron have spent a lot of time working with us, and for that we are really grateful. I am also not criticising the remoting architecture per se. Rather, I am saying that our requirements are somewhat different from those of the "classic" user of remoting and that, perhaps, remoting is not _currently_ the right tool for the job.

That is not to say it can't be the right tool for the job at some time in the future, but as Ovidiu and others have mentioned, we have extremely tight deadlines, strong competition in the form of QPid (AMQP) and many eyes watching us.

At the end of the day, we will be benchmarked against QPid, ActiveMQ and others and the bottom line is, it doesn't matter how wonderful our internal architecture is, we won't be able to touch them because the benchmarks will be decided on how fast we can push bytes over sockets, and we will lose right now. Period.

Much as I hate to say it, our competition has it right when it comes to their approach to remoting, actually they all seem to pretty much do it the same way (apart from us).

Earlier I mentioned a "classic" user of remoting. I'm not sure "classic" is the right word, but what I mean by that is your "standard" RPC style user. E.g. EJB.
If EJB is using remoting, you want to do synchronous request/response. You also probably want to use serialization to transport objects, since typically you won't know their types until runtime. And you probably don't have a requirement to do 100000 invocations per sec on a single box.

This isn't the case with messaging.

In many cases with messaging we don't want RPC (in some cases we do). In fact RPC is a positive handicap. Why? 

Here is an example that came up in the forums the other day (there is also a support case from another customer with JBossMQ, which suffers from a similar problem):

The customer has a fantastic high bandwidth network, but it has very high latency (large round trip time). They would love to use their network to its capacity with our messaging product.

Currently when we deliver messages (usually one by one) from the server to the client consumer they are delivered by a remoting RPC call. This writes to a socket then waits for a response. Therefore the minimum amount of time to send one message is 2 x latency. And this is for every message (!).
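
To put rough numbers on this, here is a small sketch. The 50 ms one-way latency, the 100 Mbit/s link and the 1000-byte message size are made-up figures for illustration, not the customer's, and the calculation ignores protocol headers and TCP windowing.

```java
/** Back-of-envelope throughput limits; all figures passed in are illustrative assumptions. */
class LatencyMath {

    /** One blocking RPC per message costs at least 2 x latency, so this is the ceiling. */
    static double rpcMessagesPerSecond(long oneWayLatencyMillis) {
        return 1000.0 / (2 * oneWayLatencyMillis);
    }

    /** Writing without waiting for replies is limited by bandwidth instead of latency. */
    static double streamedMessagesPerSecond(long bandwidthBytesPerSecond, long messageBytes) {
        return (double) bandwidthBytesPerSecond / messageBytes;
    }

    public static void main(String[] args) {
        // 50 ms each way, one message per round trip
        System.out.println(rpcMessagesPerSecond(50));
        // 100 Mbit/s is 12,500,000 bytes/s; 1000-byte messages
        System.out.println(streamedMessagesPerSecond(12_500_000L, 1000));
    }
}
```

With those assumed figures the RPC style tops out at 10 messages per second no matter how fat the pipe is, while the link itself could carry on the order of 12500.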

The "correct" way to do this is to forget responses, just write to the socket and carry on. Flow control messages from the client to the server then prevent the client being overrun with messages (this is kind of analogous to how TCP flow control works - although there are some differences).
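
As a minimal sketch of that idea, assuming a simple credit scheme (one possible way to do flow control, not necessarily what we would end up with): the server writes while it holds credits and never waits for a reply, and the client tops the credits up as it consumes. The class and method names are hypothetical, and a queue stands in for the socket.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical server-side view of one consumer using credit-based flow control. */
class CreditedConsumer {
    private final Queue<String> pending = new ArrayDeque<>(); // messages waiting for credit
    private final Queue<String> wire = new ArrayDeque<>();    // stands in for the socket
    private int credits;

    CreditedConsumer(int initialCredits) {
        this.credits = initialCredits;
    }

    /** Server has a message for the consumer: write it if credit allows, never wait for a reply. */
    void deliver(String message) {
        pending.add(message);
        flush();
    }

    /** A flow control message arrived from the client granting more credit. */
    void addCredits(int n) {
        credits += n;
        flush();
    }

    /** Write as many pending messages as the current credit allows. */
    private void flush() {
        while (credits > 0 && !pending.isEmpty()) {
            wire.add(pending.remove());
            credits--;
        }
    }

    int sentCount() { return wire.size(); }

    int pendingCount() { return pending.size(); }
}
```

The point is that the only traffic from client to server is the occasional credit message, so the cost of latency is paid once per batch of credits rather than once per message.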

Moving on to serialization. JBoss Remoting is based around the idea of invocations that get serialized. We a) don't want invocations and b) in almost every case don't want serialization either (there is one case where we do want serialization). This is because we know the types of the objects being transmitted at compile time, so we can encode that information in a much more efficient way (in a byte). The overhead of serialization is just too much for us.

(Currently we have hacked it so DataOutput/InputStreams are passed into the marshaller - but this is, as I say, a hack.)
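
A rough sketch of what encoding the type in a byte could look like, using plain java.io data streams. The type ids, the two message kinds and the class name are hypothetical and only there to show the shape; this is not our actual wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical codec: one type byte up front, then fields written directly. */
class CompactCodec {
    static final byte TYPE_TEXT = 1; // made-up type ids
    static final byte TYPE_ACK = 2;

    static byte[] encodeText(String body) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(TYPE_TEXT);
            out.writeUTF(body); // 2-byte length followed by modified UTF-8
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static byte[] encodeAck(long messageId) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(TYPE_ACK);
            out.writeLong(messageId);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** The receiver switches on the type byte; no class names or reflection involved. */
    static String decode(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            byte type = in.readByte();
            switch (type) {
                case TYPE_TEXT: return "text:" + in.readUTF();
                case TYPE_ACK: return "ack:" + in.readLong();
                default: throw new IOException("unknown type " + type);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch an ack is 9 bytes on the wire (1 type byte plus an 8-byte id), with no stream header or class descriptor in front of it.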

Another issue I have is the remoting core abstraction being a unidirectional connection. This forces remoting to have to introduce the concept of callback servers to be able to handle communication from the server to the client.

To me, a bidirectional channel abstraction would have been much simpler and would have avoided a lot of complexity on our side in using the API (i.e. we wouldn't need callback servers).

This brings me to the multiplex transport.

The current multiplex transport if I remember correctly has some significant overhead as compared to the standard socket transport. It's my understanding that this is due to having to design this in terms of virtual sockets so it can work with the core remoting abstraction of a unidirectional connection.

If remoting had used a bidirectional connection this would have been much simpler IMHO.

With a bidirectional abstraction, multiplex connections are not a hard problem to solve. After all, you just need to wrap each lump of bytes written in a packet carrying the id of the logical connection and the length of the packet, then on receipt correlate the packets by id and hand them off to the right logical connection. (Actually, when we get to AMQP we _must_ implement multiplex this way - we have no option, since it is a requirement of the specification.)
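
A minimal sketch of that framing, assuming a 4-byte connection id and a 4-byte length in front of each payload. The field sizes and names are my own choice for illustration (AMQP defines its own frame layout), and an in-memory buffer stands in for the shared socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical multiplexer: many logical connections framed onto one byte stream. */
class Multiplexer {
    private final ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
    private final DataOutputStream out = new DataOutputStream(wire);

    /** Frame = connection id (4 bytes) + payload length (4 bytes) + payload. */
    void send(int connectionId, byte[] payload) {
        try {
            out.writeInt(connectionId);
            out.writeInt(payload.length);
            out.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    byte[] wireBytes() {
        return wire.toByteArray();
    }

    /** Receiving side: read frames and route each payload by id with one hash map lookup. */
    static Map<Integer, List<byte[]>> demultiplex(byte[] stream) {
        Map<Integer, List<byte[]>> byConnection = new HashMap<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream));
        try {
            // available() is fine for an in-memory buffer; a real socket reader would loop until EOF
            while (in.available() > 0) {
                int connectionId = in.readInt();
                int length = in.readInt();
                byte[] payload = new byte[length];
                in.readFully(payload);
                byConnection.computeIfAbsent(connectionId, k -> new ArrayList<>()).add(payload);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return byConnection;
    }
}
```

In this sketch the per-frame cost is 8 header bytes on the wire and one map lookup on receipt.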

I can see how this would become more difficult to implement if you only have a unidirectional remoting abstraction to work with, and have to somehow unite the two remoting connections (one in one direction and one in another) so they actually use the same connection.

If implemented using a bidirectional remoting abstraction, I don't see why there should be any major overhead over and above the socket transport. After all, the extra stuff you need to write is just the few extra bytes for the logical connection id, and in the majority of cases this wouldn't require an extra packet to be sent anyway - so there should be effectively zero overhead when it comes to the network. On the server side you would have an extra hashmap lookup to find the logical connection. This shouldn't be a significant overhead.

The fact is, that we can't afford any overhead in the multiplex transport since we'll lose in benchmarks because of this. Our competitors AFAIK do their multiplexing in a similar way to how I have described.

Wire format compatibility. Yes, we can provide our own marshaller which allows us to have some control over the wire format, although it doesn't currently give us full control, since ObjectInput/OutputStreams are used in some transports which have already written bytes to the socket even before the marshaller is called. As I mentioned before I have temporarily hacked things so this does not occur.

I am also not sure if we have full control of the wire format when using the multiplex transport - since I assume it encodes stuff in its own way on the wire that we can't override using the standard marshaller.

We should bear in mind that at some point (before too long) we will probably need to support the AMQP protocol, mainly for political reasons. The AMQP protocol defines a full wire format down to the last byte, including the wire format of the multiplexing, heartbeats etc.

(I suggest everyone skims the AMQP spec if they haven't done so already)

I don't see how we would get that to work with remoting, so we would have to do major work then _anyway_. So why not do it now? Then we kill two birds with one stone.

The bottom line is that I think remoting is currently very well suited to an EJB RPC approach but it is not currently the best tool for us.

Ideally, yes, we should be using remoting, but in the timescales available I do not think it is feasible to get remoting to implement our requirements to the level where we can use it.

I agree with Juha that we should implement our requirements in our own way now, so we can get JBoss Messaging 1.2 out on time and performing well. If we can't, then all this discussion is moot, since there will probably be no JBoss Messaging: we will have failed, and QPid will have a good argument for positioning themselves as the "enterprise messaging system".

We should then look at contributing back our custom transport into the remoting codebase, so it can be used by others. I suspect this would require changes to the remoting API - specifically in the area of bidirectional connections as I have discussed.

We certainly should not be using our custom transport in the longer term.

At the end of the day we want to build the best messaging system out there, and we don't want the transport layer to keep us from that goal. This is currently our #1 risk.

View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3979034#3979034

Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3979034