[jboss-jira] [JBoss JIRA] (JGRP-1402) NAKACK: too much lock contention between sending and receiving messages

Wednesday, 14 December 2011

    [ https://issues.jboss.org/browse/JGRP-1402?page=com.atlassian.jira.plugin.... ]

Bela Ban commented on JGRP-1402:
--------------------------------

The cost is:

NRW.removeMany():
- Acquires the lock and tries to remove as many messages as possible

NRW.add():
- If the message is the expected message: just add
- If the message is a missing message:
  - Add and remove from retransmitter
  - The retransmitter maintains a bitset of missing messages for each range, so this costs a hashmap operation and a bitset operation
- If the message's seqno is greater than the next expected seqno:
  - Add and add to retransmitter
  - This creates a range, a bitset and an entry in the timer

See https://issues.jboss.org/browse/JGRP-1396 for details
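
The three add() cases above can be sketched roughly as follows. This is a simplified, hypothetical model and not the actual NakReceiverWindow or Retransmitter code: the class AddCostSketch, its Outcome enum and its fields are invented for illustration, a TreeMap stands in for the retransmitter's hashmap so that a range can be found by seqno, and the timer entry is left out.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of the add() cases; not the JGroups implementation. */
class AddCostSketch {
    enum Outcome { EXPECTED, MISSING, AHEAD, DUPLICATE }

    private final Map<Long, String> received = new TreeMap<>();
    // Retransmitter stand-in: range start -> bitset of seqnos still missing in that range
    private final TreeMap<Long, BitSet> missingRanges = new TreeMap<>();
    private long highestReceived = 0;

    Outcome add(long seqno, String msg) {
        if (seqno == highestReceived + 1) {
            // Expected message: just add
            received.put(seqno, msg);
            highestReceived = seqno;
            return Outcome.EXPECTED;
        }
        if (seqno <= highestReceived) {
            // Missing message: add and remove from the retransmitter (map + bitset operation)
            if (!clearMissing(seqno))
                return Outcome.DUPLICATE;
            received.put(seqno, msg);
            return Outcome.MISSING;
        }
        // Seqno ahead of the next expected one: add, and register the gap as a new range
        long from = highestReceived + 1;
        int len = (int) (seqno - from);
        BitSet bits = new BitSet(len);
        bits.set(0, len); // every seqno in [from, seqno - 1] is missing
        missingRanges.put(from, bits);
        received.put(seqno, msg);
        highestReceived = seqno;
        return Outcome.AHEAD;
    }

    private boolean clearMissing(long seqno) {
        Map.Entry<Long, BitSet> e = missingRanges.floorEntry(seqno);
        if (e == null)
            return false;
        int idx = (int) (seqno - e.getKey());
        BitSet bits = e.getValue();
        if (!bits.get(idx))
            return false;
        bits.clear(idx);
        if (bits.isEmpty())
            missingRanges.remove(e.getKey()); // range fully received
        return true;
    }

    int pendingRanges() { return missingRanges.size(); }
}
```

In this sketch the expected-message path touches only the message map, while the other two paths also pay for the range bookkeeping, which is the extra cost listed above.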

...
 NAKACK: too much lock contention between sending and receiving messages
 -----------------------------------------------------------------------

                 Key: JGRP-1402
                 URL: https://issues.jboss.org/browse/JGRP-1402
             Project: JGroups
          Issue Type: Enhancement
            Reporter: Bela Ban
            Assignee: Bela Ban
             Fix For: 3.2

 When we have only 1 node in a cluster, sending and receiving messages creates a lot of
contention in NakReceiverWindow (NRW). To reproduce:
 - Start MPerf
 - Press '1' to send 1 million messages
 - The throughput is ca. 20-30 MB/sec, compared to 140 MB/sec when running multiple instances
of MPerf on the same box!
 In the profiler, we can see that the write lock in NRW accounts for ca. 99% of all the
blocking! Ca. half is caused by NRW.add(), the other half by NRW.removeMany().
 The reason is that, when we send a message, it is added to the NRW (add()). The incoming
thread then tries to remove as many messages as possible (removeMany()), which blocks the
sender from adding messages to the NRW. The reverse also holds: removeMany() is blocked
from accessing the NRW by the many add() calls.
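
 The contention pattern can be illustrated with a minimal sketch, assuming a window guarded by a single write lock. SingleLockWindow and its demo() helper are hypothetical names made up for this example; the real NRW is considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical single-lock window; add() and removeMany() contend on the same write lock. */
class SingleLockWindow {
    private final Lock writeLock = new ReentrantReadWriteLock().writeLock();
    private final TreeMap<Long, String> msgs = new TreeMap<>();
    private long nextToRemove = 1;

    /** Called by the sender: blocks while removeMany() holds the lock. */
    void add(long seqno, String msg) {
        writeLock.lock();
        try {
            msgs.put(seqno, msg);
        } finally {
            writeLock.unlock();
        }
    }

    /** Called by the incoming thread: removes as many in-order messages as possible. */
    List<String> removeMany() {
        writeLock.lock();
        try {
            List<String> out = new ArrayList<>();
            String m;
            while ((m = msgs.remove(nextToRemove)) != null) {
                out.add(m);
                nextToRemove++;
            }
            return out;
        } finally {
            writeLock.unlock();
        }
    }

    /** One sender thread adds n messages while the calling thread drains the window. */
    static int demo(int n) {
        SingleLockWindow win = new SingleLockWindow();
        Thread sender = new Thread(() -> {
            for (long i = 1; i <= n; i++)
                win.add(i, "m" + i);
        });
        sender.start();
        int delivered = 0;
        while (delivered < n)
            delivered += win.removeMany().size();
        try {
            sender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return delivered;
    }
}
```

 Every add() and every removeMany() serializes on writeLock here, so a busy sender and a busy incoming thread spend much of their time waiting for each other.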
 SOLUTION 1:
 - If we only have 1 member in the cluster, call removeMany() immediately after NRW.add()
on the sender. There is no need for a message to be processed by the incoming thread pool if
we're the only member in the cluster.
 - The downside is that this doesn't reduce the contention on NRW when we have more
than 1 member: the lock contention may even slow down clusters with more than 1 member!
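
 A rough sketch of SOLUTION 1, under the assumption that the sender knows the current cluster size. InlineDeliverySketch, setClusterSize() and the two lists are hypothetical stand-ins for illustration only; handedToPool merely records what would have gone to the incoming thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical sketch of SOLUTION 1; names are illustrative, not JGroups API. */
class InlineDeliverySketch {
    private final TreeMap<Long, String> window = new TreeMap<>();
    private long nextSeqno = 1;
    private long nextToDeliver = 1;
    private int clusterSize = 1;

    final List<String> delivered = new ArrayList<>();    // delivered on the sender's thread
    final List<String> handedToPool = new ArrayList<>(); // stand-in for the incoming thread pool

    synchronized void setClusterSize(int size) { clusterSize = size; }

    synchronized void send(String msg) {
        window.put(nextSeqno++, msg); // corresponds to NRW.add()
        if (clusterSize == 1) {
            // Only member: remove and deliver right away on the sender's thread
            removeMany();
        } else {
            // More than 1 member: normal path, the incoming thread pool handles the message
            handedToPool.add(msg);
        }
    }

    private void removeMany() {
        String m;
        while ((m = window.remove(nextToDeliver)) != null) {
            delivered.add(m);
            nextToDeliver++;
        }
    }
}
```

 With a single member, add and remove happen back to back on one thread, so no second thread competes for the window. With more members the sketch falls back to the normal path, which is exactly where the downside noted above applies.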
 SOLUTION 2:
 - Make NRW.add() and remove() more efficient, so that they contend less on the same lock.
 - [1] should help.
 [1] https://issues.jboss.org/browse/JGRP-1396 
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.jboss.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
