[jboss-jira] [JBoss JIRA] (JGRP-1416) NAKACK / NAKACK2: shorten scope of seqno_lock

Mon Jan 16 06:44:18 EST 2012

     [ https://issues.jboss.org/browse/JGRP-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bela Ban updated JGRP-1416:
---------------------------

    Description: 
The code in NAKACK/NAKACK2.send() acquires seqno_lock like this:
{code:title=NAKACK.send()|borderStyle=solid}
seqno_lock.lock();
try {
    try { // incrementing seqno and adding the msg to sent_msgs needs to be atomic
        msg_id=seqno +1;
        msg.putHeader(this.id, NakAckHeader.createMessageHeader(msg_id));
        win.add(msg_id, msg);
        seqno=msg_id;
    }
    catch(Throwable t) {
        // throw exception
    }
}
finally {
    seqno_lock.unlock();
}
{code}

This slows concurrent sender threads down if add() takes a while. Method add() can take a while if we have many concurrent adds and removes.
The reason we use seqno_lock is to prevent gaps in the sequence numbers, e.g. if we have an exception in add().

SOLUTION:
- Assign a new msg_id by incrementing seqno *atomically* (seqno_lock is now de-scoped to only increment seqno, might replace this with an AtomicLong anyway)
- In a loop: add the message (calling add()), until add() returns successfully, then break

Example:
{code}
msg_id=seqno.incrementAndGet(); // uses an AtomicLong
while(running) { // maybe bound with a counter
  try {
      msg.adddHeader(...);
      add();
      break;
  }
  catch(Throwable t) {
      // log
  }
}
{code}

  was:
The code in NAKACK/NAKACK2.send() acquires seqno_lock like this:
{code:title=NAKACK.send()|borderStyle=solid}
        seqno_lock.lock();
        try {
            try { // incrementing seqno and adding the msg to sent_msgs needs to be atomic
                msg_id=seqno +1;
                msg.putHeader(this.id, NakAckHeader.createMessageHeader(msg_id));
                win.add(msg_id, msg);
                seqno=msg_id;
            }
            catch(Throwable t) {
                throw new RuntimeException("failure adding msg " + msg + " to the retransmit table for " + local_addr, t);
            }
        }
        finally {
            seqno_lock.unlock();
        }
{code}

This slows concurrent sender threads down if add() takes a while. Method add() can take a while if we have many concurrent adds and removes.
The reason we use seqno_lock is to prevent gaps in the sequence numbers, e.g. if we have an exception in add().

SOLUTION:
- Assign a new msg_id by incrementing seqno *atomically* (seqno_lock is now de-scoped to only increment seqno, might replace this with an AtomicLong anyway)
- In a loop: add the message (calling add()), until add() returns successfully, then break

Example:
{code}
msg_id=seqno.incrementAndGet(); // uses an AtomicLong
while(running) { // maybe bound with a counter
  try {
      msg.adddHeader(...);
      add();
      break;
  }
  catch(Throwable t) {
      // log
  }
}
{code}

> NAKACK / NAKACK2: shorten scope of seqno_lock
> ---------------------------------------------
>
>                 Key: JGRP-1416
>                 URL: https://issues.jboss.org/browse/JGRP-1416
>             Project: JGroups
>          Issue Type: Enhancement
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>            Priority: Minor
>             Fix For: 3.0.3, 3.1
>
>
> The code in NAKACK/NAKACK2.send() acquires seqno_lock like this:
> {code:title=NAKACK.send()|borderStyle=solid}
> seqno_lock.lock();
> try {
>     try { // incrementing seqno and adding the msg to sent_msgs needs to be atomic
>         msg_id=seqno +1;
>         msg.putHeader(this.id, NakAckHeader.createMessageHeader(msg_id));
>         win.add(msg_id, msg);
>         seqno=msg_id;
>     }
>     catch(Throwable t) {
>         // throw exception
>     }
> }
> finally {
>     seqno_lock.unlock();
> }
> {code}
> This slows concurrent sender threads down if add() takes a while. Method add() can take a while if we have many concurrent adds and removes.
> The reason we use seqno_lock is to prevent gaps in the sequence numbers, e.g. if we have an exception in add().
> SOLUTION:
> - Assign a new msg_id by incrementing seqno *atomically* (seqno_lock is now de-scoped to only increment seqno, might replace this with an AtomicLong anyway)
> - In a loop: add the message (calling add()), until add() returns successfully, then break
> Example:
> {code}
> msg_id=seqno.incrementAndGet(); // uses an AtomicLong
> while(running) { // maybe bound with a counter
>   try {
>       msg.adddHeader(...);
>       add();
>       break;
>   }
>   catch(Throwable t) {
>       // log
>   }
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.jboss.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira