[jboss-jira] [JBoss JIRA] Resolved: (JGRP-1134) UNICAST.down(): move add to retransmitter out of the lock scope

Friday, 12 March 2010

     [ https://jira.jboss.org/jira/browse/JGRP-1134?page=com.atlassian.jira.plug... ]

Bela Ban resolved JGRP-1134.
----------------------------

    Resolution: Done

This solution seems to work: I haven't seen an issue in any of the manual, unit or
performance tests.

Plus, UNICAST2 might soon replace UNICAST, so this issue is no longer that important.

...
 UNICAST.down(): move add to retransmitter out of the lock scope
 ---------------------------------------------------------------

                 Key: JGRP-1134
                 URL: https://jira.jboss.org/jira/browse/JGRP-1134
             Project: JGroups
          Issue Type: Task
            Reporter: Bela Ban
            Assignee: Bela Ban
             Fix For: 2.10

 In UNICAST.down(), we acquire a lock per sender to which we send a message:
             entry.lock(); // threads will only sync if they access the same entry
             try {
                 seqno=entry.sent_msgs_seqno;
                 send_conn_id=entry.send_conn_id;
                 hdr=new UnicastHeader(UnicastHeader.DATA, seqno, send_conn_id, seqno == DEFAULT_FIRST_SEQNO);
                 msg.putHeader(getName(), hdr);
                 entry.sent_msgs.add(seqno, msg);  // add *including* UnicastHeader, adds to retransmitter
                 entry.sent_msgs_seqno++;
             }
             finally {
                 entry.unlock();
             }
 The call entry.sent_msgs.add() is costly: it adds the message to the hashmap, but also
to the retransmitter, which schedules a timer task etc.
 The temporary solution is to split add() into 2 parts: one adds the message to the
hashmap (fast) and the other adds it to the retransmitter (costly). The costly part is
moved outside the lock scope, for example:
             entry.lock(); // threads will only sync if they access the same entry
             try {
                 seqno=entry.sent_msgs_seqno;
                 send_conn_id=entry.send_conn_id;
                 hdr=new UnicastHeader(UnicastHeader.DATA, seqno, send_conn_id, seqno == DEFAULT_FIRST_SEQNO);
                 msg.putHeader(getName(), hdr);
                 entry.sent_msgs.addToMessages(seqno, msg);  // add *including* UnicastHeader, adds to hashmap
                 entry.sent_msgs_seqno++;
             }
             finally {
                 entry.unlock();
             }
             entry.sent_msgs.addToRetransmitter(seqno, msg);  // adds to retransmitter
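 To make the lock-scope change easier to try out in isolation, here is a minimal, self-contained sketch of the same pattern. It is not the JGroups code: SenderEntrySketch, send(), stored() and scheduled() are hypothetical names, the message is a plain String, and a concurrent queue merely stands in for the retransmitter's costly scheduling step.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for one per-destination entry; these names are not JGroups API.
class SenderEntrySketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Map<Long, String> messages = new ConcurrentHashMap<>();
    private final Queue<Long> retransmitter = new ConcurrentLinkedQueue<>();
    private long nextSeqno = 1;

    long send(String msg) {
        long seqno;
        lock.lock();
        try {
            seqno = nextSeqno++;        // seqno assignment must stay atomic per entry
            messages.put(seqno, msg);   // fast part: record the message under the lock
        } finally {
            lock.unlock();
        }
        // Assumed-costly part, now outside the lock; it uses the seqno captured above.
        retransmitter.add(seqno);
        return seqno;
    }

    int stored()    { return messages.size(); }
    int scheduled() { return retransmitter.size(); }
}
```

 Because the second step runs after unlock(), other threads sending through the same entry no longer wait for it; the price is that the two steps are no longer atomic.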
 However, the issue is what happens if the addition to the retransmitter fails (e.g. due
to an OOME): then we'd have a message gap on the receiver!
 SOLUTION:
 #1 Do the add to the retransmitter in a loop. If there's a failure, sleep a bit and
try again, increasing the sleep time on each attempt. Not very nice code, but it works and
doesn't ever *lose* a message. OK, if we get OOMEs, then something's wrong anyway, but
this covers temporary OOMEs.
 #2 If there's a failure, set a flag. Next time around, we check the flag. If it is
set, we re-add all messages in the hashmap to the retransmitter. This involves locking
the hashmap and the retransmitter, but that's OK since this case should almost never
happen anyway!
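
 The two options above could be sketched roughly as follows. This is an illustration under assumptions, not the committed fix: RetransmitRecoverySketch, addWithRetry(), addOrFlag() and resync() are hypothetical names, the retransmitter is modelled as a BiConsumer that may throw (e.g. an OutOfMemoryError), and the sleep bounds are arbitrary parameters.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.function.BiConsumer;

// Hypothetical sketch of options #1 and #2; none of these names are JGroups API.
class RetransmitRecoverySketch {
    private final Map<Long, String> messages = new ConcurrentSkipListMap<>();
    private final BiConsumer<Long, String> retransmitter; // may throw, e.g. OutOfMemoryError
    private volatile boolean resyncNeeded;

    RetransmitRecoverySketch(BiConsumer<Long, String> retransmitter) {
        this.retransmitter = retransmitter;
    }

    // Option #1: retry until the add succeeds, doubling the sleep up to maxSleepMs.
    void addWithRetry(long seqno, String msg, long initialSleepMs, long maxSleepMs) {
        long sleepMs = initialSleepMs;
        while (true) {
            try {
                retransmitter.accept(seqno, msg);
                return;
            } catch (Throwable t) {
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while retrying", ie);
                }
                sleepMs = Math.min(sleepMs * 2, maxSleepMs);
            }
        }
    }

    // Option #2: on failure set a flag; the next call re-adds everything in the map.
    void addOrFlag(long seqno, String msg) {
        messages.put(seqno, msg);          // fast part (done under the entry lock in the issue)
        if (resyncNeeded) {
            resync();                      // covers this message too, it is already in the map
            return;
        }
        try {
            retransmitter.accept(seqno, msg);
        } catch (Throwable t) {
            resyncNeeded = true;
        }
    }

    // Assumes re-adding a seqno that is already scheduled is harmless.
    private synchronized void resync() {
        try {
            for (Map.Entry<Long, String> e : messages.entrySet())
                retransmitter.accept(e.getKey(), e.getValue());
            resyncNeeded = false;
        } catch (Throwable t) {
            // leave the flag set so the next call tries again
        }
    }
}
```

 Option #1 blocks the sending thread while it retries. Option #2 returns immediately, but leaves messages without a retransmission task until the next send on that entry.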
-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
https://jira.jboss.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
