[infinispan-dev] [infinispan-internal] async processing documentation (+ nice inconsistency scenario example)

Tue Mar 19 13:32:43 EDT 2013

On Tue, Mar 19, 2013 at 5:17 PM, Manik Surtani <msurtani at redhat.com> wrote:

>
> On 19 Mar 2013, at 15:07, Sanne Grinovero <sanne at redhat.com> wrote:
>
> >
> >
> > ----- Original Message -----
> >>
> >> On 19 Mar 2013, at 12:21, Mircea Markus <mmarkus at redhat.com> wrote:
> >>
> >>> On 19 Mar 2013, at 11:05, Sanne Grinovero wrote:
> >>>> Does Marshalling really need to be performed in a separate thread
> >>>> pool?
> >>>> I think we have too many pools, too much context switching, and
> >>>> situations like this one which should be avoided.
> >>>>
> >>>> We could document it, but all these details make it very hard
> >>>> to feel comfortable with, and for this specific use case I
> >>>> wonder if there is a strong benefit: plain serial operations
> >>>> seem so much cleaner to me.
> >>> +1 for dropping it in 6.0. It isn't enabled by default and AFAIK it
> >>> has caused users more confusion than benefit.
> >>
> >> Why?  I don't agree.  If network transfer is the most expensive part
> >> of performing a write, then marshalling is the second-most
> >> expensive.  If you don't take the marshalling offline as well,
> >> you're only realising a part of the performance gains of using
> >> async.
> >
> > Of course. I didn't mean to put it on the thread of the invoker. I would
> > expect this to happen "behind the scenes" when using async, but in the
> > same thread which is managing the lower IO, so as to reduce both context
> > switching and these weird race conditions. So removing the option only.
>
> Well, using the same lower IO pool, while common sense, isn't as easy
> since it is a JGroups pool.  If we pass the marshaller itself into JGroups,
> the marshalling still happens online, with just the IO happening in a
> separate thread.  Also, JGroups allows you to register one marshaller and
> unmarshaller per channel - which doesn't work when you have a transport
> shared by multiple cache instances potentially on different class loaders.
>
> So yes, this can be done much better, but that means a fair few changes in
> JGroups such that:
>
> * Marshalling happens in the async thread (the same one that puts the
> message on the wire) rather than in the caller's thread
> * sendMessage() accepts a marshaller and unmarshaller per invocation
>
> Then we can drop this additional thread pool.
>
>
The upper-most protocol in the default stack is FRAG2, and it already needs
the serialized payload - it can't split an Object into 2 messages. Most other
protocols need at least the message size. So there's no way our payload is
going to get serialized only in the TP thread that actually puts the bytes
on the wire.
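
To illustrate why a fragmentation layer needs bytes rather than an Object,
here is a minimal sketch. It is not the FRAG2 implementation; the class
name, the fragment() helper and the fragment size are assumptions made up
for the example. The only point is that splitting works on a byte[] of
known length, so serialization has to happen before this step.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only, not JGroups code.
public class FragmentationSketch {

    // The payload has to become bytes before anything can be split.
    public static byte[] serialize(Serializable payload) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(payload);
        }
        return bytes.toByteArray();
    }

    // Splits the serialized form into chunks of at most fragSize bytes.
    public static List<byte[]> fragment(byte[] data, int fragSize) {
        List<byte[]> fragments = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += fragSize) {
            int end = Math.min(data.length, offset + fragSize);
            fragments.add(Arrays.copyOfRange(data, offset, end));
        }
        return fragments;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = serialize(new int[10_000]);
        List<byte[]> fragments = fragment(data, 8_000); // assumed fragment size
        System.out.println(data.length + " bytes in " + fragments.size() + " fragments");
    }
}
```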

In fact, I would go the other way around. Because we have multiple
marshallers, I think it would be cleaner if we used MessageDispatcher
directly and did the request/response serialization in Infinispan.

I wouldn't recommend async marshalling anyway. The user must be very
careful not to modify the value object at any time after calling
cache.put(key, value), so to me using async marshalling is just asking for
trouble.
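
A minimal sketch of that hazard, without using Infinispan at all: the
"marshaller pool" below is a plain single-thread executor standing in for
the async marshalling pool, and Account is a made-up value class. The
latch only forces the unlucky interleaving (marshalling runs after the
caller has modified the value), which in a real system would happen
nondeterministically.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; no Infinispan classes are involved.
public class AsyncMarshallingHazard {

    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        int balance;
        Account(int balance) { this.balance = balance; }
    }

    static byte[] marshal(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object unmarshal(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Returns the balance a remote node would deserialize.
    public static int replicatedBalance() throws Exception {
        ExecutorService marshallerPool = Executors.newSingleThreadExecutor();
        try {
            Account value = new Account(100);
            CountDownLatch callerMutated = new CountDownLatch(1);

            // Stand-in for cache.put(key, value) with async marshalling:
            // the call returns at once, serialization happens later.
            Future<byte[]> wire = marshallerPool.submit(() -> {
                callerMutated.await(); // force "marshal after mutation"
                return marshal(value);
            });

            // The caller reuses the object after the "put" has returned.
            value.balance = 0;
            callerMutated.countDown();

            return ((Account) unmarshal(wire.get())).balance;
        } finally {
            marshallerPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // The value was 100 when it was "put", but the remote side sees 0.
        System.out.println("remote balance: " + replicatedBalance());
    }
}
```

With synchronous marshalling the bytes would be produced before the put
returns, so the later assignment could not leak into the replicated
state.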

There are a couple of places where I think we could save an async transport
thread, but I don't think either would make a perceptible change in
performance:
* Waiting for a response from the recipients is much slower than sending
the message. If RpcManagerImpl.invokeRemotelyInFuture just sent the message
and returned the JGroups Request object, I don't think we'd need the thread
pool there.
* We could also detect if the user invoked cache.putAsync and avoid using
an extra async transport thread when the cache is async.
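
The first point can be sketched roughly as follows. This is not the
RpcManagerImpl or JGroups API: the class, the String-based command and
response types, the outbox queue and onResponse() are all assumptions for
the example, and CompletableFuture stands in for the JGroups Request
object. The idea is only that the send returns a future immediately and
the receiver thread completes it later, so no pool thread sits blocked
waiting for the response.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; names do not match real Infinispan code.
public class NonBlockingRpcSketch {

    // Requests that were sent and still await a response, keyed by request id.
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    // Stand-in for the wire: ids of the requests that were "sent".
    public final Queue<Long> outbox = new ConcurrentLinkedQueue<>();

    // Sends the command and returns at once; nobody blocks on the response.
    public CompletableFuture<String> invokeRemotelyInFuture(String command) {
        long id = nextId.incrementAndGet();
        CompletableFuture<String> response = new CompletableFuture<>();
        pending.put(id, response);
        outbox.add(id); // a real transport would put the message on the wire here
        return response;
    }

    // Called by the receiver thread when the response for a request arrives.
    public void onResponse(long id, String value) {
        CompletableFuture<String> response = pending.remove(id);
        if (response != null) {
            response.complete(value);
        }
    }

    public static void main(String[] args) {
        NonBlockingRpcSketch rpc = new NonBlockingRpcSketch();
        CompletableFuture<String> future = rpc.invokeRemotelyInFuture("PUT k v");
        System.out.println("done before response: " + future.isDone());
        rpc.onResponse(rpc.outbox.poll(), "OK");
        System.out.println("response: " + future.join());
    }
}
```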

> >
> >>
> >>> On top of that the number of pools is growing (5.3 adds another
> >>> pool in the scope of ISPN-2808).
> >>
> >> You can configure to use a single thread pool for all these tasks, if
> >> hanging on to multiple thread pools is too complex.
> >
> > I don't believe you can always do that: if you don't keep tasks isolated
> > in different pools, deadlocks could happen. So unless you can come up with
> > a nice diagram and explain which ones are safe to share, it is very
> > complex to handle.
> >
>

If queueing is disabled and the caller runs tasks when the thread pool is
full, dependencies are not a problem. If queueing is enabled... yes,
dependencies are a big problem.

But I'm pretty sure you can have dependency cycles with 2 thread pools as
well, if both have queueing enabled.
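
A small sketch of the single-pool case, using only java.util.concurrent
(the class and method names are made up for the example). An outer task
submits an inner task to the same one-thread pool and waits for it. With
an unbounded queue the inner task sits in the queue behind the thread
that is waiting for it; with a SynchronousQueue (no queueing) and
CallerRunsPolicy the inner task runs in the submitting thread instead.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only.
public class CallerRunsSketch {

    // One thread, unbounded queue: dependent tasks can wait on each other forever.
    public static ExecutorService queueingPool() {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
    }

    // One thread, no queue: when the pool is full the caller runs the task itself.
    public static ExecutorService callerRunsPool() {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>(), new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // The outer task depends on an inner task submitted to the same pool.
    public static String runDependentTasks(ExecutorService pool, long timeoutMs)
            throws Exception {
        Future<String> outer = pool.submit(() -> {
            Future<String> inner = pool.submit(() -> "inner done");
            return inner.get(); // waits while occupying the only worker thread
        });
        try {
            return outer.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "deadlock";
        } finally {
            pool.shutdownNow(); // interrupts the stuck worker, if any
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("queueing:    " + runDependentTasks(queueingPool(), 500));
        System.out.println("caller runs: " + runDependentTasks(callerRunsPool(), 500));
    }
}
```

The timeout is only a way to report the stuck case in a demo; it is not a
deadlock detector.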

> > Would be nice to have these discussions on the public mailing list.
>
> +1.  Adding infinispan-dev in cc.
>
> >
> > Sanne
> >
> >>
> >> - M
> >>
> >> --
> >> Manik Surtani
> >> manik at jboss.org
> >> twitter.com/maniksurtani
> >>
> >> Platform Architect, JBoss Data Grid
> >> http://red.ht/data-grid
> >>
> >>
> >>
> >
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>