[jboss-jira] [JBoss JIRA] Created: (JGRP-1134) UNICAST.down(): move add to retransmitter out of the lock scope

Bela Ban (JIRA) jira-events at lists.jboss.org
Wed Jan 20 06:58:54 EST 2010


UNICAST.down(): move add to retransmitter out of the lock scope
---------------------------------------------------------------

                 Key: JGRP-1134
                 URL: https://jira.jboss.org/jira/browse/JGRP-1134
             Project: JGroups
          Issue Type: Task
            Reporter: Bela Ban
            Assignee: Bela Ban
             Fix For: 2.9


In UNICAST.down(), we acquire a lock per sender to which we send a message:

            entry.lock(); // threads will only sync if they access the same entry
            try {
                seqno=entry.sent_msgs_seqno;
                send_conn_id=entry.send_conn_id;
                hdr=new UnicastHeader(UnicastHeader.DATA, seqno, send_conn_id, seqno == DEFAULT_FIRST_SEQNO);
                msg.putHeader(getName(), hdr);
                entry.sent_msgs.add(seqno, msg);  // add *including* UnicastHeader, adds to retransmitter
                entry.sent_msgs_seqno++;
            }
            finally {
                entry.unlock();
            }

The call

            entry.sent_msgs.add()

is costly: it adds the message not only to the hashmap but also to the retransmitter, which schedules a timer task etc.

The temporary solution is to split add() into 2 parts: one adds the message to the hashmap (fast), the other adds it to the retransmitter (costly). The costly part is moved outside the lock scope, for example:

            entry.lock(); // threads will only sync if they access the same entry
            try {
                seqno=entry.sent_msgs_seqno;
                send_conn_id=entry.send_conn_id;
                hdr=new UnicastHeader(UnicastHeader.DATA, seqno, send_conn_id, seqno == DEFAULT_FIRST_SEQNO);
                msg.putHeader(getName(), hdr);
                entry.sent_msgs.addToMessages(seqno, msg);  // add *including* UnicastHeader, adds to hashmap
                entry.sent_msgs_seqno++;
            }
            finally {
                entry.unlock();
            }

            entry.sent_msgs.addToRetransmitter(seqno, msg);  // adds to retransmitter


However, the issue is that if the addition to the retransmitter fails (e.g. due to an OOME), the message is in the hashmap but is never retransmitted, so we'd have a message gap on the receiver!

SOLUTION:
#1 Do the add to the retransmitter in a loop. If there's a failure, sleep a bit and try again, increasing the sleep time on each attempt. Not very nice code, but it works and never *loses* a message. OK, if we get OOMEs, then something's wrong anyway, but this covers temporary OOMEs.

#2 If there's an issue, set a flag. Next time around, we check the flag. If it is set, we re-add all messages in the hashmap to the retransmitter. This involves locking the hashmap and the retransmitter, but that's OK since this case should almost never happen anyway!
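
A rough sketch of how the two options might look, to make the trade-off concrete. This is not the JGroups code: RetransmitterSketch, SentMsgsSketch, resync_needed and the sleep values are all made up for illustration, and a concurrent sorted map stands in for the explicit hashmap locking mentioned in #2. It also assumes the retransmitter tolerates a seqno being added twice.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical stand-in for the retransmitter (not the JGroups class) */
interface RetransmitterSketch {
    void add(long seqno, Object msg);
}

/** Hypothetical stand-in for entry.sent_msgs, showing options #1 and #2 */
class SentMsgsSketch {
    private final Map<Long,Object>    msgs=new ConcurrentSkipListMap<>();
    private final RetransmitterSketch retransmitter;
    private final AtomicBoolean       resync_needed=new AtomicBoolean(false);

    SentMsgsSketch(RetransmitterSketch retransmitter) {
        this.retransmitter=retransmitter;
    }

    /** Fast part, called inside the lock scope */
    void addToMessages(long seqno, Object msg) {
        msgs.put(seqno, msg);
    }

    /** Option #1: retry outside the lock until the add succeeds, backing off */
    void addToRetransmitterWithRetry(long seqno, Object msg) {
        long sleep_ms=10; // assumed initial value
        for(;;) {
            try {
                retransmitter.add(seqno, msg);
                return;
            }
            catch(Throwable t) { // includes OutOfMemoryError
                LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(sleep_ms));
                sleep_ms=Math.min(sleep_ms * 2, 1000); // increase, capped at 1s (assumed)
            }
        }
    }

    /** Option #2: on failure set a flag; the next call re-adds everything */
    void addToRetransmitterOrFlag(long seqno, Object msg) {
        if(resync_needed.compareAndSet(true, false)) {
            try {
                // msgs already contains (seqno, msg), as addToMessages() ran first
                for(Map.Entry<Long,Object> e: msgs.entrySet())
                    retransmitter.add(e.getKey(), e.getValue());
            }
            catch(Throwable t) {
                resync_needed.set(true); // still out of sync, try again next time
            }
            return;
        }
        try {
            retransmitter.add(seqno, msg);
        }
        catch(Throwable t) {
            resync_needed.set(true);
        }
    }
}
```

With #1 the sending thread blocks until the add goes through; with #2 the send returns immediately, but the message is not covered by retransmission until the next message to the same destination is sent.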

