[infinispan-issues] [JBoss JIRA] (ISPN-6799) OOB thread pool fills with threads trying to send remote get responses

Sanne Grinovero (JIRA) issues at jboss.org
Mon Jun 27 08:23:00 EDT 2016


    [ https://issues.jboss.org/browse/ISPN-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257323#comment-13257323 ] 

Sanne Grinovero commented on ISPN-6799:
---------------------------------------

Rather than modelling each "need to reply" with a thread, couldn't we have a clever data structure which collects all the responses-to-be-sent?

I believe this was shared during past face-to-face meetings, so to recap the high-level concepts:

You'd have a limited number of threads periodically scanning this structure and sending the responses out in batches.
The collection would need to:
 - be ordered fairly enough that no response waits too long
 - be able to efficiently drop responses which have already waited too long
 - be organised by target address, so it can be scanned efficiently for multiple answers which need to be batched to the same address: there's a high likelihood that many responses need to go in the same "direction"

Contention, context switches (and complexity) could be minimised by doing all of this on a single thread fed by a disruptor-style queue, at most sharding by a combination of segment id and target address to obtain multiple "shared nothing" structures, with each queue maintained as a fully independent highway lane (a minimal sketch follows below).
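To make this concrete, here's a minimal sketch of such a per-target collection drained by a single scanning thread. All names (PendingResponse, ResponseBatcher, the sender hook) are made up for illustration and are not existing Infinispan or JGroups APIs:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.BiConsumer;

// Hypothetical sketch only, not an existing Infinispan API.
final class PendingResponse {
   final Object response;
   final long enqueuedNanos = System.nanoTime();
   PendingResponse(Object response) { this.response = response; }
}

final class ResponseBatcher implements Runnable {
   // One FIFO queue per target address keeps responses "packed by target".
   private final Map<Object, Queue<PendingResponse>> byTarget = new ConcurrentHashMap<>();
   private final long maxWaitNanos;  // responses older than this are dropped, not sent
   private final int maxBatchSize;
   private final BiConsumer<Object, List<Object>> sender; // hook that hands a batch to the transport

   ResponseBatcher(long maxWaitNanos, int maxBatchSize, BiConsumer<Object, List<Object>> sender) {
      this.maxWaitNanos = maxWaitNanos;
      this.maxBatchSize = maxBatchSize;
      this.sender = sender;
   }

   // Called by whichever thread computed the remote-get result; never blocks.
   void enqueue(Object target, Object response) {
      byTarget.computeIfAbsent(target, t -> new ConcurrentLinkedQueue<>())
              .add(new PendingResponse(response));
   }

   // The single scanning thread: FIFO order per target gives rough fairness,
   // expired responses are discarded, the rest go out batched per target.
   @Override
   public void run() {
      long now = System.nanoTime();
      byTarget.forEach((target, queue) -> {
         List<Object> batch = new ArrayList<>(maxBatchSize);
         PendingResponse pending;
         while (batch.size() < maxBatchSize && (pending = queue.poll()) != null) {
            if (now - pending.enqueuedNanos <= maxWaitNanos) {
               batch.add(pending.response);
            } // else: waited too long, the caller has given up on it - drop it
         }
         if (!batch.isEmpty()) {
            sender.accept(target, batch);
         }
      });
   }
}
{code}

run() would be invoked periodically, e.g. from a scheduled executor; a handful of these instances, sharded by (segment id, target address) and each owned by exactly one scanning thread, would give the "shared nothing" lanes described above.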

This would also take some pressure off the batching done at the JGroups level, since many small messages would already arrive at JGroups "packed by target" - we might even consider disabling the JGroups batching features, which would improve latency.

On top of this, I believe such a design would allow us (in the future) to enhance JGroups so that it doesn't always need to hold the entire message payload in the retransmission table, but can instead keep a hook into Infinispan's "queue of outbound messages" and re-marshal on demand. This might also require an entry "snapshot" mechanism to re-transmit fragments of the original value if it has changed, or a protocol improvement to instead retract the other fragments and re-send with the updated value.

> OOB thread pool fills with threads trying to send remote get responses
> ----------------------------------------------------------------------
>
>                 Key: ISPN-6799
>                 URL: https://issues.jboss.org/browse/ISPN-6799
>             Project: Infinispan
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 9.0.0.Alpha2, 8.2.2.Final
>            Reporter: Dan Berindei
>             Fix For: 9.0.0.Alpha3
>
>
> Note: This is a scenario that happens in the stress tests, with 4 nodes in dist mode, and 200+ threads per node doing only reads. I have not been able to reproduce it locally, even with a much lower OOB thread pool size and UFC.max_credits.
> We don't use the {{NO_FC}} flag, so threads sending both requests and responses can block in UFC/MFC. Remote gets are executed directly on the OOB thread, so when we run out of credits for one node, the OOB pool can quickly become full with threads waiting to send a remote get response to that node.
> While we can't send responses to that node, we won't send credits to it, either, as credits are only sent *after* the message has been processed by the application. That means OOB threads on all nodes will start blocking, trying to send remote get responses to us.
> This is made worse by our staggering of remote gets: as remote get responses block, the stagger timeout kicks in and we send even more remote gets, making it even harder for the system to recover.
> UFC/MFC can send a {{CREDIT_REQUEST}} message to ask for more credits. The {{REPLENISH}} messages are handled on JGroups' internal thread pool, so they are not blocked. However, a {{CREDIT_REQUEST}} can be sent at most once every {{UFC.max_block_time}} ms, so it can't be relied on to provide enough credits. With the default settings, the throughput would be {{max_credits / max_block_time == 2MB / 0.5s == 4MB/s}}, which is really small compared to regular throughput.
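For reference, a tiny illustration of the back-of-envelope bound quoted in the description above, plugging in the numbers it cites (not measured values):

{code:java}
// Illustration only: uses the defaults quoted in the issue description
// (max_credits = 2 MB, max_block_time = 0.5 s) with the stated formula.
public class CreditThroughputBound {
   public static void main(String[] args) {
      long maxCreditsBytes = 2_000_000;   // UFC.max_credits
      double maxBlockTimeSeconds = 0.5;   // UFC.max_block_time
      double bytesPerSecond = maxCreditsBytes / maxBlockTimeSeconds;
      System.out.printf("Worst-case credit-limited throughput: %.0f bytes/s (~4 MB/s)%n", bytesPerSecond);
   }
}
{code}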



--
This message was sent by Atlassian JIRA
(v6.4.11#64026)

