On 18/09/14 15:28, Dan Berindei wrote:
On Thu, Sep 18, 2014 at 3:09 PM, Bela Ban <bban(a)redhat.com> wrote:
On 18/09/14 13:03, Dan Berindei wrote:
> Thanks Pedro, this looks great.
>
> However, I don't think it's ok to treat CommitCommands/Pessimistic
> PrepareCommands as RemoteLockCommands just because they may send L1
> invalidation commands. It's true that those commands will block, but
> there's no need to wait for any other command before doing the L1
> invalidation. In fact, the non-tx writes on backup owners, which you
> consider to be non-blocking, can also send L1 invalidation commands (see
> L1NonTxInterceptor.invalidateL1).
>
> On the other hand, one of the good things that the remote executor did
> was to allow queueing lots of commands with a higher topology id, when
> one of the nodes receives the new topology much later than the others.
> We still have to consider each TopologyAffectedCommand as potentially
> blocking and put it through the remote executor.
>
> And InvalidateL1Commands are also TopologyAffectedCommands, so there's
> still a potential for deadlock when L1 is enabled and we have maxThreads
> write commands blocked sending L1 invalidations and those L1
> invalidation commands are stuck in the remote executor's queue on
> another node. And with (very) unlucky timing the remote executor might
> not even get to create maxThreads threads before the deadlock appears. I
> wonder if we could write a custom executor that checks what the first
> task in the queue is every second or so, and creates a bunch of new
> threads if the first task in the queue hasn't changed.
>
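The watchdog idea in the paragraph above could look roughly like the following. This is only a sketch under stated assumptions, not Infinispan's remote executor: `StallWatchdog`, `step` and `hardLimit` are hypothetical names, the pool is a plain `java.util.concurrent.ThreadPoolExecutor` with an unbounded queue, and "the first task hasn't changed" is approximated by seeing the same object at the head of the queue on two consecutive checks.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class StallWatchdogSketch {

    /** Hypothetical helper: grows the pool when the queue head is unchanged between two checks. */
    static final class StallWatchdog implements Runnable {
        private final ThreadPoolExecutor pool;
        private final int step;       // threads to add per detected stall
        private final int hardLimit;  // never grow beyond this many threads
        private Runnable lastHead;    // queue head seen on the previous check

        StallWatchdog(ThreadPoolExecutor pool, int step, int hardLimit) {
            this.pool = pool;
            this.step = step;
            this.hardLimit = hardLimit;
        }

        @Override
        public synchronized void run() {
            Runnable head = pool.getQueue().peek();
            // Same task object at the head twice in a row: nothing was dequeued in between.
            if (head != null && head == lastHead) {
                int target = Math.min(hardLimit, pool.getCorePoolSize() + step);
                if (target > pool.getCorePoolSize()) {
                    // Raise the maximum first: the core size may not exceed it.
                    pool.setMaximumPoolSize(Math.max(target, pool.getMaximumPoolSize()));
                    // Raising the core size starts workers for tasks that are already queued.
                    pool.setCorePoolSize(target);
                }
            }
            lastHead = head;
        }
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CompletableFuture<Void> release = new CompletableFuture<>();
        CompletableFuture<String> queued = new CompletableFuture<>();

        pool.execute(release::join);                 // occupies the only worker
        pool.execute(() -> queued.complete("ran"));  // waits in the queue behind it

        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(new StallWatchdog(pool, 2, 8), 1, 1, TimeUnit.SECONDS);

        // The second check sees the same head and adds threads, so the queued task runs.
        System.out.println("queued task: " + queued.get(5, TimeUnit.SECONDS));
        System.out.println("core pool size: " + pool.getCorePoolSize());

        release.complete(null);
        timer.shutdownNow();
        pool.shutdown();
    }
}
```

The heuristic cannot tell a deadlock from a pool that is merely busy with long-running tasks, so `hardLimit` is what keeps it from growing without bound.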
> You're right about the remote executor getting full as well, we're
> lacking any feedback mechanism to tell the sender to slow down, except
> for blocking the OOB thread.
JGroups sends credits back to the sender *after* the message has been
delivered into the application. If the application is slow in processing
the messages, or blocks for some time, then the sender will not receive
enough credits and thus also slow down, or even block.
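To make the credit mechanism concrete, here is a toy model of it. It is a simplified sketch and not JGroups' flow-control code: `CreditedLink`, `trySend` and `deliverNext` are hypothetical names, credits are counted in bytes with a `Semaphore`, and sender and receiver live in one object so the effect is easy to follow.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;
import java.util.function.Consumer;

public class CreditFlowSketch {

    /** Hypothetical one-way link: the sender spends credits, the receiver returns them after delivery. */
    static final class CreditedLink {
        private final Semaphore credits;                                      // bytes the sender may still send
        private final Queue<byte[]> inFlight = new ConcurrentLinkedQueue<>(); // sent but not yet delivered

        CreditedLink(int initialCredits) {
            this.credits = new Semaphore(initialCredits);
        }

        /** Sender side: refuses (instead of blocking) when there are not enough credits left. */
        boolean trySend(byte[] payload) {
            if (!credits.tryAcquire(payload.length)) {
                return false;
            }
            inFlight.add(payload);
            return true;
        }

        /** Receiver side: credits go back only after the application callback has returned. */
        boolean deliverNext(Consumer<byte[]> application) {
            byte[] payload = inFlight.poll();
            if (payload == null) {
                return false;
            }
            application.accept(payload);      // a slow or blocking application delays the next line
            credits.release(payload.length);
            return true;
        }

        int availableCredits() {
            return credits.availablePermits();
        }
    }

    public static void main(String[] args) {
        CreditedLink link = new CreditedLink(1000);
        System.out.println("first send accepted:  " + link.trySend(new byte[600]));
        System.out.println("second send accepted: " + link.trySend(new byte[600]));
        link.deliverNext(payload -> { /* application processes the message */ });
        System.out.println("retry accepted:       " + link.trySend(new byte[600]));
    }
}
```

In this model the second send is refused until the first message has been delivered, which is the slow-down effect described above.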
> I wonder if we could tell JGroups somehow
> to discard the message from inside MessageDispatcher.handle (e.g. throw
> a DiscardMessageException), so the sender has to retransmit it.
At this point, JGroups considers the message *delivered* (as it has
passed the UNICAST or NAKACK protocols), and it won't get resent. You
cannot discard it either, as this would be a message loss. However, if
you can tolerate loss, all is fine. E.g. if you discard a topo message
with a lower ID, I don't think any harm is done in Infinispan (?).
Whether to discard or not is Infinispan's decision.
The other thing is to block, but this can have an impact back into the
JGroups thread pools.
Right, I was hoping the message would be marked as delivered only after
Infinispan finished processing the message (i.e. when up() returns in
UNICAST/NAKACK).
No, that's not the case. JGroups is a reliable transport and works
similarly to TCP: a message is considered delivered when it leaves the
NAKACK or UNICAST protocols. Same for TCP: a read reads bytes available
from the socket, and those bytes are considered delivered by TCP.
Perhaps we could delay sending the credits instead? When we process a
message on our internal thread pool, it would be nice if we could tell
JGroups to send credits back only when we really finished processing the
message.
Not nice, as this breaks encapsulation. This stuff is supposed to be
hidden from you.
But what you can do is to block the incoming thread: only when it
returns will JGroups send credits back to the sender.
If there are a lot of requests, then at some point the internal ISPN
thread pool will have to start blocking selected threads, and possibly
start discarding selected messages IMO.
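One way to get the "block the incoming thread" behaviour with a standard pool is a bounded queue plus a rejection handler that waits for space. The sketch below assumes a plain `ThreadPoolExecutor` rather than Infinispan's actual remote executor; `BlockCallerPolicy` and `newBoundedPool` are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BlockingHandoffSketch {

    /** Hypothetical policy: park the submitting (incoming) thread until the queue has room. */
    static final class BlockCallerPolicy implements RejectedExecutionHandler {
        @Override
        public void rejectedExecution(Runnable task, ThreadPoolExecutor pool) {
            if (pool.isShutdown()) {
                throw new RejectedExecutionException("pool is shut down");
            }
            try {
                pool.getQueue().put(task);  // blocks the caller while the queue is full
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException("interrupted while waiting for queue space", e);
            }
        }
    }

    static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(threads, threads, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity), new BlockCallerPolicy());
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = newBoundedPool(1, 1);
        CompletableFuture<Void> release = new CompletableFuture<>();
        pool.execute(release::join);  // occupies the single worker
        pool.execute(() -> { });      // fills the one-slot queue

        // Stands in for the transport's incoming thread handing a request to the pool.
        Thread incoming = new Thread(() -> pool.execute(() -> { }), "incoming");
        incoming.start();
        incoming.join(200);
        System.out.println("incoming thread blocked:  " + incoming.isAlive());

        release.complete(null);       // the worker drains the queue, so put() can succeed
        incoming.join();
        System.out.println("incoming thread returned: " + !incoming.isAlive());
        pool.shutdown();
    }
}
```

For messages that can tolerate loss, the JDK's `ThreadPoolExecutor.DiscardPolicy` could be used in place of the blocking handler. Which messages qualify is the application's decision, as noted above.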
--
Bela Ban, JGroups lead (http://www.jgroups.org)