[jboss-jira] [JBoss JIRA] Updated: (JGRP-1134) UNICAST.down(): move add to retransmitter out of the lock scope
Bela Ban (JIRA)
jira-events at lists.jboss.org
Wed Jan 20 07:27:47 EST 2010
[ https://jira.jboss.org/jira/browse/JGRP-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bela Ban updated JGRP-1134:
---------------------------
Fix Version/s: 2.10
(was: 2.9)
Implemented solution #1. This needs to be revisited in 2.10, i.e. solution #2 should be added to it.
> UNICAST.down(): move add to retransmitter out of the lock scope
> ---------------------------------------------------------------
>
> Key: JGRP-1134
> URL: https://jira.jboss.org/jira/browse/JGRP-1134
> Project: JGroups
> Issue Type: Task
> Reporter: Bela Ban
> Assignee: Bela Ban
> Fix For: 2.10
>
>
> In UNICAST.down(), we acquire a lock per sender to which we send a message:
>     entry.lock(); // threads will only sync if they access the same entry
>     try {
>         seqno=entry.sent_msgs_seqno;
>         send_conn_id=entry.send_conn_id;
>         hdr=new UnicastHeader(UnicastHeader.DATA, seqno, send_conn_id, seqno == DEFAULT_FIRST_SEQNO);
>         msg.putHeader(getName(), hdr);
>         entry.sent_msgs.add(seqno, msg); // add *including* UnicastHeader, adds to retransmitter
>         entry.sent_msgs_seqno++;
>     }
>     finally {
>         entry.unlock();
>     }
> The call
>     entry.sent_msgs.add()
> is costly: it adds the message to the hashmap, but also to the retransmitter, which schedules a timer task etc.
> The temporary solution is to split add() into 2 parts: one adds the message to the hashmap (fast), the other adds it to the retransmitter (costly). The costly part is moved outside the lock scope, for example:
>     entry.lock(); // threads will only sync if they access the same entry
>     try {
>         seqno=entry.sent_msgs_seqno;
>         send_conn_id=entry.send_conn_id;
>         hdr=new UnicastHeader(UnicastHeader.DATA, seqno, send_conn_id, seqno == DEFAULT_FIRST_SEQNO);
>         msg.putHeader(getName(), hdr);
>         entry.sent_msgs.addToMessages(seqno, msg); // add *including* UnicastHeader, adds to hashmap
>         entry.sent_msgs_seqno++;
>     }
>     finally {
>         entry.unlock();
>     }
>     entry.sent_msgs.addToRetransmitter(seqno, msg); // adds to retransmitter
> However, the issue is what happens if the addition to the retransmitter fails (e.g. due to an OOME): the message would then never be retransmitted, so if it is lost we'd have a message gap on the receiver!
> SOLUTION:
> #1 Do the add to the retransmitter in a loop. If there's a failure, sleep a bit and try again, increasing the sleep time on each attempt. Not very nice code, but it works and doesn't ever *lose* a message. OK, if we get OOMEs, then something's wrong anyway, but this covers temporary OOMEs.
> #2 If there's an issue, set a flag. Next time around, we check the flag. If it is set, we re-add all messages in the hashmap to the retransmitter. This involves locking the hashmap and the retransmitter, but that's OK since this case should almost never happen anyway!
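The two solutions above could be combined roughly as sketched below. This is only an illustration under stated assumptions, not the JGroups implementation: the class SentWindowSketch, the Retransmitter interface, the String message type, the attempt limit and the 1 second sleep cap are all hypothetical. The issue describes solution #1 as an unbounded loop; the sketch bounds it and falls back to the flag of solution #2 once the attempts are used up. It also assumes that re-adding an already scheduled seqno to the retransmitter is harmless.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the sent-messages window; not the JGroups class. */
class SentWindowSketch {

    /** Hypothetical interface for the costly part (e.g. scheduling a timer task). */
    interface Retransmitter {
        void add(long seqno, String msg);
    }

    private final Map<Long, String> messages = new ConcurrentSkipListMap<>();
    private final AtomicBoolean resyncNeeded = new AtomicBoolean(false);
    private final Retransmitter retransmitter;

    SentWindowSketch(Retransmitter retransmitter) {
        this.retransmitter = retransmitter;
    }

    /** Fast part: meant to be called inside the lock scope. */
    void addToMessages(long seqno, String msg) {
        messages.put(seqno, msg);
    }

    /**
     * Solution #1: costly part, called outside the lock scope. Retries with a
     * growing sleep. Returns the number of attempts used, or -1 if it gave up
     * and set the flag for solution #2 (the bound is an assumption of this sketch).
     */
    int addToRetransmitter(long seqno, String msg, int maxAttempts, long initialSleepMs) {
        long sleepMs = initialSleepMs;
        for (int attempt = 1; ; attempt++) {
            try {
                retransmitter.add(seqno, msg);
                return attempt;
            } catch (OutOfMemoryError | RuntimeException e) {
                if (attempt >= maxAttempts) {
                    resyncNeeded.set(true); // hand over to solution #2
                    return -1;
                }
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                resyncNeeded.set(true);
                return -1;
            }
            sleepMs = Math.min(sleepMs * 2, 1000L); // increase the sleep, capped at 1s
        }
    }

    /** Solution #2: if the flag is set, re-add every stored message to the retransmitter. */
    void resyncIfNeeded() {
        if (!resyncNeeded.compareAndSet(true, false)) {
            return;
        }
        try {
            for (Map.Entry<Long, String> e : messages.entrySet()) {
                retransmitter.add(e.getKey(), e.getValue());
            }
        } catch (OutOfMemoryError | RuntimeException e) {
            resyncNeeded.set(true); // still failing: try again next time around
        }
    }

    boolean isResyncNeeded() {
        return resyncNeeded.get();
    }
}
```

In this sketch, a sender would call addToMessages() while holding the entry lock, addToRetransmitter() after releasing it, and resyncIfNeeded() at the start of the next send. The message always lands in the hashmap first, so a failed retransmitter add can be repaired later.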