[infinispan-dev] Is anyone else experiencing JGRP-1675

Bela Ban bban at redhat.com
Wed Oct 16 10:05:50 EDT 2013



On 10/11/13 2:40 PM, Radim Vansa wrote:
> Hi,
>
> since Infinispan moved to JGroups 3.4, we're experiencing occasional
> deadlocks in some tests - most of the threads that send anything over
> JGroups are waiting in JGroups' FlowControl.decrementCredits.


Are those real deadlocks? Meaning, the system never recovers? If it does
recover, then it's just flow control doing its job and preventing a fast
sender from overrunning a busy receiver.

For example, if the receiver is busy processing something or blocked on a
lock acquisition, then it may not be able to send back credits, so the
sender blocks until the receiver is done. Also, if the receiver's thread
pool drops the message because the pool is full, no credits will be sent,
thus blocking the sender. This is a *good thing*, or else the pool would
be exhausted even more.
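
To make the mechanism concrete, here is a minimal sketch of credit-based
flow control. It is not the JGroups implementation: the class CreditSketch
and its methods are made up for illustration, and the real MFC/UFC
protocols track credits per member and do more bookkeeping. It only shows
why a sender blocks when the receiver returns no credits.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of sender-side credits; not JGroups code.
public class CreditSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition creditsAvailable = lock.newCondition();
    private final long maxCredits;
    private long credits;

    public CreditSketch(long maxCredits) {
        this.maxCredits = maxCredits;
        this.credits = maxCredits;
    }

    // Sender side: take 'bytes' credits, waiting up to timeoutMs for the
    // receiver to replenish. Returns false if the wait timed out or the
    // thread was interrupted.
    public boolean decrement(long bytes, long timeoutMs) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (credits < bytes) {
                if (nanos <= 0)
                    return false; // still blocked: receiver sent no credits
                nanos = creditsAvailable.awaitNanos(nanos);
            }
            credits -= bytes;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }

    // Called when the receiver reports that it has processed 'bytes'.
    public void replenish(long bytes) {
        lock.lock();
        try {
            credits = Math.min(maxCredits, credits + bytes);
            creditsAvailable.signalAll();
        } finally {
            lock.unlock();
        }
    }

    public long available() {
        lock.lock();
        try {
            return credits;
        } finally {
            lock.unlock();
        }
    }
}
```

A sender that calls decrement() while the receiver is stuck simply waits;
as soon as replenish() is called, it continues. That is temporary blocking,
not a deadlock.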

We really need to make a clear distinction between these two cases: a real
deadlock and temporary blocking by flow control. Naturally, if a receiver
performs some blocking (as is done in Infinispan), the sender should stop
sending at some point, or the receiver would simply drop all messages and
cause a lot of retransmissions.

> The
> problem sometimes goes away after several seconds, but it produces some
> ugly spikes in our throughput/response time charts.

OK, good, so it's the latter: temporary blocking caused by flow control.

Flow control *can* cause some hiccups every now and then, especially if
the receiver can block while processing a message. 200K credits is pretty
low (unless you send very small messages), but the 10M mentioned here
might be too much. I'd suggest a middle ground, e.g. 2-4 MB (default).
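
As a rough back-of-the-envelope check, the sketch below computes how many
messages a sender can push to a receiver that returns no credits before
the sender blocks. The class name, the 1,000-byte message size, and the
reading of "200K", "2M" and "10M" as decimal byte counts are assumptions
for illustration only.

```java
// Hypothetical helper; not part of JGroups.
public class CreditBudget {
    // Whole messages of msgSize bytes that fit into maxCredits bytes.
    public static long messagesBeforeBlocking(long maxCredits, long msgSize) {
        if (msgSize <= 0)
            throw new IllegalArgumentException("msgSize must be positive");
        return maxCredits / msgSize;
    }

    public static void main(String[] args) {
        long msgSize = 1_000; // assumed payload size in bytes
        System.out.println(messagesBeforeBlocking(200_000, msgSize));    // 200
        System.out.println(messagesBeforeBlocking(2_000_000, msgSize));  // 2000
        System.out.println(messagesBeforeBlocking(10_000_000, msgSize)); // 10000
    }
}
```

With larger messages the budget shrinks proportionally, which is why a
small credit setting only works well for very small messages.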

With the new INTERNAL thread pool, this blocking will *not* go away,
as credits (even when sent as INTERNAL) won't get sent in the first place
until the receiver(s) process the messages...

> Originally this
> affected just some RadarGun tests but this is appearing in some
> client-server tests as well (we've recently investigated an issue where
> this appeared in a regular soak test).
> I was looking into that [1] for some time but haven't really figured out
> the cause. The workaround is to set up MFC and UFC credits high enough
> (I use 10M) and stuff works then. I was trying to reproduce that on pure
> JGroups, but unsuccessfully.
> I am not asking anyone to dig into that, but I wanted to know whether QA
> is alone experiencing that or if there are more of us.
>
> Radim
>
> [1] https://issues.jboss.org/browse/JGRP-1675
>

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)

