[jboss-jira] [JBoss JIRA] Commented: (JGRP-1043) UNICAST: high contention

Tue Sep 15 11:48:23 EDT 2009

    [ https://jira.jboss.org/jira/browse/JGRP-1043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12485894#action_12485894 ] 

Bela Ban commented on JGRP-1043:
--------------------------------

Re: Further to the issue of contention for AckReceiverWindow.msgs

The reason for ack'ing OOB messages directly, rather than later as part of the regular message processing, is that once the OOB thread returns, nothing else will send an ACK for that message. In fact, if we received a lot of OOB messages and no regular message for a while, a large number of OOB messages would remain unacked.

Re: use of processing to determine whether to call win.removeOOBMessage()

The 'processing' AtomicBoolean needs to be changed while holding the AckReceiverWindow.msgs lock, or else we can end up with messages left in the AckReceiverWindow, as follows (AckReceiverWindow.remove(AtomicBoolean)):
/**
     * We need to have the lock on 'msgs' while we're setting processing to false (if no message is available), because
     * this prevents some other thread from adding a message. Use case:
     * <ol>
     * <li>Thread 1 calls msgs.remove() --> returns null
     * <li>Thread 2 calls add()
     * <li>Thread 2 checks the CAS and returns because Thread1 hasn't yet released it
     * <li>Thread 1 releases the CAS
     * </ol>
     * The result here is that Thread 2 didn't go into the remove() processing and returned, and Thread 1 didn't see
     * the new message and therefore returned as well. Result: we have an unprocessed message in 'msgs' !
     * @param processing
     * @return
     */
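
The use case in that comment can be sketched as follows. This is only an illustration, not the JGroups code: AckWindowSketch, ReceiverSketch, receive() and the 'delivered' list are hypothetical names, and the window is reduced to a map keyed by seqno. The point it tries to show is that the "nothing to remove" decision and the reset of 'processing' happen under the same 'msgs' lock that add() takes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical, simplified stand-in for AckReceiverWindow (not the JGroups class). */
class AckWindowSketch {
    private final Map<Long, String> msgs = new HashMap<>();
    private long nextToRemove = 1;

    /** Adds a message while holding the 'msgs' lock. */
    boolean add(long seqno, String msg) {
        synchronized (msgs) {
            if (seqno < nextToRemove || msgs.containsKey(seqno)) return false;
            msgs.put(seqno, msg);
            return true;
        }
    }

    /**
     * Returns the next in-order message. If none is available, sets 'processing'
     * to false *before* releasing the lock, so no add() can slip in between the
     * null result and the release of the flag.
     */
    String remove(AtomicBoolean processing) {
        synchronized (msgs) {
            String msg = msgs.remove(nextToRemove);
            if (msg != null) {
                nextToRemove++;
                return msg;
            }
            processing.set(false);
            return null;
        }
    }

    int size() {
        synchronized (msgs) { return msgs.size(); }
    }
}

/** Hypothetical caller: add first, then try to become the single draining thread. */
class ReceiverSketch {
    final AckWindowSketch win = new AckWindowSketch();
    final AtomicBoolean processing = new AtomicBoolean(false);
    final List<String> delivered = Collections.synchronizedList(new ArrayList<String>());

    void receive(long seqno, String msg) {
        win.add(seqno, msg);
        // Another thread holds the flag; it is guaranteed to see our message,
        // because it can only release the flag inside remove(), under the lock.
        if (!processing.compareAndSet(false, true)) return;
        String m;
        while ((m = win.remove(processing)) != null) {
            delivered.add(m);
        }
    }

    public static void main(String[] args) {
        ReceiverSketch r = new ReceiverSketch();
        r.receive(2, "b"); // out of order: stays in the window, flag is released
        r.receive(1, "a"); // fills the gap: both messages are drained in order
        System.out.println(r.delivered); // prints [a, b]
    }
}
```

If remove() returned null and the caller reset the flag afterwards, outside the lock, steps 2 and 3 of the use case could run in that gap and the newly added message would stay in 'msgs'.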

Let's discuss this tomorrow

> UNICAST: high contention
> ------------------------
>
>                 Key: JGRP-1043
>                 URL: https://jira.jboss.org/jira/browse/JGRP-1043
>             Project: JGroups
>          Issue Type: Task
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 2.6.13, 2.8
>
>
> If UNICAST receives a high number of messages *and* sends a high number of messages concurrently, there will be a lot of retransmissions. The reason is contention on 'connections', with up and down messages accessing it concurrently. Up and down messages impede each other, so messages are not received in time, causing retransmissions (ACKs to be resent).
> SOLUTION:
> #1 Reduce contention and turn 'connections' from HashMap into ConcurrentHashMap
> #2 Break 'connections' into a send-table and receive-table: sends and acks access the send-table, receives the receive-table
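
A rough sketch of what #1 and #2 from the quoted description could look like together. This is not the actual UNICAST code: ConnectionTablesSketch, SenderEntry, ReceiverEntry and all method names are made up for illustration, addresses are plain strings, and it assumes Java 8 or later for computeIfAbsent() and accumulateAndGet().

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration of splitting 'connections' into two concurrent tables. */
class ConnectionTablesSketch {
    /** Per-destination send state: next seqno and messages awaiting an ACK. */
    static final class SenderEntry {
        final AtomicLong nextSeqno = new AtomicLong(1);
        final ConcurrentMap<Long, String> unacked = new ConcurrentHashMap<>();
    }

    /** Per-sender receive state, reduced here to the highest seqno seen. */
    static final class ReceiverEntry {
        final AtomicLong highestReceived = new AtomicLong(0);
    }

    // Sends and ACKs only touch sendTable; receives only touch recvTable,
    // so the down path and the up path no longer share one map or one lock.
    private final ConcurrentMap<String, SenderEntry> sendTable = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, ReceiverEntry> recvTable = new ConcurrentHashMap<>();

    /** Down path: assigns a seqno and remembers the message until it is ack'ed. */
    long send(String dest, String payload) {
        SenderEntry e = sendTable.computeIfAbsent(dest, k -> new SenderEntry());
        long seqno = e.nextSeqno.getAndIncrement();
        e.unacked.put(seqno, payload);
        return seqno;
    }

    /** An ACK from 'dest' removes the message from the send-table entry. */
    void ack(String dest, long seqno) {
        SenderEntry e = sendTable.get(dest);
        if (e != null) e.unacked.remove(seqno);
    }

    /** Up path: records a received seqno in the receive-table entry. */
    void receive(String sender, long seqno) {
        ReceiverEntry e = recvTable.computeIfAbsent(sender, k -> new ReceiverEntry());
        e.highestReceived.accumulateAndGet(seqno, Math::max);
    }

    int unackedCount(String dest) {
        SenderEntry e = sendTable.get(dest);
        return e == null ? 0 : e.unacked.size();
    }

    long highestReceived(String sender) {
        ReceiverEntry e = recvTable.get(sender);
        return e == null ? 0 : e.highestReceived.get();
    }
}
```

With this layout a burst of receives from a member does not block sends or ACK handling for the same member, since the two paths use different tables.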

-- 
This message is automatically generated by JIRA.