[jboss-jira] [JBoss JIRA] Updated: (JGRP-1134) UNICAST.down(): move add to retransmitter out of the lock scope

Wed Jan 20 07:27:47 EST 2010

     [ https://jira.jboss.org/jira/browse/JGRP-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bela Ban updated JGRP-1134:
---------------------------

    Fix Version/s: 2.10
                       (was: 2.9)

Implemented solution #1. This needs to be revisited in 2.10, i.e. solution #2 should be added to it.

> UNICAST.down(): move add to retransmitter out of the lock scope
> ---------------------------------------------------------------
>
>                 Key: JGRP-1134
>                 URL: https://jira.jboss.org/jira/browse/JGRP-1134
>             Project: JGroups
>          Issue Type: Task
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 2.10
>
>
> In UNICAST.down(), we acquire a lock per sender to which we send a message:
>             entry.lock(); // threads will only sync if they access the same entry
>             try {
>                 seqno=entry.sent_msgs_seqno;
>                 send_conn_id=entry.send_conn_id;
>                 hdr=new UnicastHeader(UnicastHeader.DATA, seqno, send_conn_id, seqno == DEFAULT_FIRST_SEQNO);
>                 msg.putHeader(getName(), hdr);
>                 entry.sent_msgs.add(seqno, msg);  // add *including* UnicastHeader, adds to retransmitter
>                 entry.sent_msgs_seqno++;
>             }
>             finally {
>                 entry.unlock();
>             }
> The call entry.sent_msgs.add() is costly: it adds the message to the hashmap and also to the retransmitter, which schedules a timer task, etc.
> The temporary solution is to split add() into 2 parts: one adds the message to the hashmap (fast), the other adds it to the retransmitter (costly). The costly part is moved outside the lock scope, for example:
>             entry.lock(); // threads will only sync if they access the same entry
>             try {
>                 seqno=entry.sent_msgs_seqno;
>                 send_conn_id=entry.send_conn_id;
>                 hdr=new UnicastHeader(UnicastHeader.DATA, seqno, send_conn_id, seqno == DEFAULT_FIRST_SEQNO);
>                 msg.putHeader(getName(), hdr);
>                 entry.sent_msgs.addToMessages(seqno, msg);  // add *including* UnicastHeader, adds to hashmap
>                 entry.sent_msgs_seqno++;
>             }
>             finally {
>                 entry.unlock();
>             }
>             entry.sent_msgs.addToRetransmitter(seqno, msg);  // adds to retransmitter
> However, the issue is what happens if the addition to the retransmitter fails (e.g. due to an OOME): the message has been sent but will never be retransmitted, so if it is dropped we'd have a message gap on the receiver!
> SOLUTION:
> #1 Do the add to the retransmitter in a loop. If there's a failure, sleep a bit and try again, increasing the sleep time on each attempt. Not very nice code, but it works and doesn't ever *lose* a message. OK, if we get OOMEs, then something's wrong anyway, but this covers temporary OOMEs
> #2 If there's an issue, set a flag. Next time around, we check the flag. If it is set, we re-add all messages in the hashmap into the retransmitter. Involves locking of the hashmaps and retransmitter, but that's OK since this case should almost never happen anyway !
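
A minimal sketch of how solutions #1 and #2 could fit together, written in modern Java rather than against the 2.x code base. SentWindowSketch, addToRetransmitterWithRetry(), resyncIfNeeded() and the BiConsumer standing in for the retransmitter are hypothetical names made up for illustration, not JGroups APIs. Note one deliberate difference: solution #1 as described retries until it succeeds, whereas this sketch gives up after maxAttempts and falls back to the flag of solution #2. It also assumes that re-adding a seqno to the retransmitter is harmless.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BiConsumer;

/** Hypothetical stand-in for entry.sent_msgs; NOT the real JGroups class. */
class SentWindowSketch {
    private static final long MAX_SLEEP_MS = 1000;  // assumed cap on the back-off

    private final Map<Long, String> messages = new ConcurrentHashMap<>();  // fast part: seqno -> message
    private final BiConsumer<Long, String> retransmitter;                  // costly part, may throw (e.g. OOME)
    private final AtomicBoolean resyncNeeded = new AtomicBoolean(false);   // flag used by solution #2

    SentWindowSketch(BiConsumer<Long, String> retransmitter) {
        this.retransmitter = retransmitter;
    }

    /** Fast part, intended to be called inside the lock scope. */
    void addToMessages(long seqno, String msg) {
        messages.put(seqno, msg);
    }

    /**
     * Solution #1: add to the retransmitter in a loop, sleeping longer after each failure.
     * Returns false (and sets the resync flag) if all attempts fail or the thread is interrupted.
     */
    boolean addToRetransmitterWithRetry(long seqno, String msg, int maxAttempts, long initialSleepMs) {
        long sleep = initialSleepMs;
        for (int attempt = 1; ; attempt++) {
            try {
                retransmitter.accept(seqno, msg);
                return true;
            } catch (Throwable t) {  // Throwable, so that OutOfMemoryError is covered too
                if (attempt >= maxAttempts) {
                    resyncNeeded.set(true);
                    return false;
                }
                try {
                    Thread.sleep(sleep);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();  // preserve the interrupt status
                    resyncNeeded.set(true);
                    return false;
                }
                sleep = Math.min(sleep * 2, MAX_SLEEP_MS);  // increase the sleep time
            }
        }
    }

    /** Solution #2: if the flag is set, re-add every message in the map to the retransmitter. */
    synchronized void resyncIfNeeded() {
        if (!resyncNeeded.compareAndSet(true, false))
            return;  // common case: nothing to do
        try {
            for (Map.Entry<Long, String> e : messages.entrySet())
                retransmitter.accept(e.getKey(), e.getValue());
        } catch (Throwable t) {
            resyncNeeded.set(true);  // still failing: try again next time around
        }
    }

    boolean isResyncNeeded() {
        return resyncNeeded.get();
    }

    public static void main(String[] args) {
        SentWindowSketch w = new SentWindowSketch(
            (seqno, m) -> System.out.println("scheduled retransmission for #" + seqno));
        w.addToMessages(1L, "hello");                        // would happen under entry.lock()
        w.addToRetransmitterWithRetry(1L, "hello", 5, 10L);  // would happen after entry.unlock()
        w.resyncIfNeeded();                                  // no-op unless an earlier add gave up
    }
}
```

In down(), resyncIfNeeded() would be the "next time around" check: cheap when the flag is clear, and only walking the map in the rare case that an earlier add to the retransmitter gave up.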
