Bela Ban resolved JGRP-180.
---------------------------
Resolution: Done
With the concurrent stack, we introduced return values for down() and up(), so now every
sender of an event can decide for itself whether to resend the event or not.
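As a rough illustration of the resolution above, the sketch below shows how a sender might use a return value (or an exception) from down() to decide whether to resend. The Layer interface, sendWithRetry() and the maxAttempts bound are hypothetical names made up for this example; they are not the JGroups API.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReturnValueSketch {
    /** Hypothetical stand-in for the protocol below the sender; not the JGroups Protocol class. */
    interface Layer {
        /** Returns a result when the event was handled; throws if it could not be. */
        Object down(String event);
    }

    /** Passes the event down and resends it (up to maxAttempts sends) when the layer below fails. */
    static Object sendWithRetry(Layer below, String event, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return below.down(event); // the sender sees the outcome directly
            } catch (RuntimeException e) {
                last = e; // the sender noticed the failure and chooses to resend
            }
        }
        throw new IllegalStateException("event not delivered: " + event, last);
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated layer that fails on the first call and succeeds afterwards.
        Layer flaky = ev -> {
            if (calls.incrementAndGet() == 1) {
                throw new RuntimeException("simulated resource problem");
            }
            return "handled " + ev;
        };
        System.out.println(sendWithRetry(flaky, "VIEW_CHANGE", 3));
    }
}
```

Whether to retry, and how often, stays with the sender in this sketch; a sender that does not care about the outcome can simply ignore the return value.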
Harden stack to prevent loss of intra-stack events
--------------------------------------------------
Key: JGRP-180
URL: https://jira.jboss.org/jira/browse/JGRP-180
Project: JGroups
Issue Type: Task
Affects Versions: 2.2.8, 2.2.9, 2.2.9.1
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 2.x
When an *intra-stack* event is lost, the consequences can be serious (*inter-stack* messages
are simply retransmitted, so their loss is not critical).
Losses are mainly caused by runtime exceptions, e.g. OutOfMemory exception (OOM), or
resource problems, and to a lesser extent by program bugs.
For example, when a user sends a message using Channel.send(), and there is an exception,
then the user will simply send the message again, possibly after fixing the cause of the
exception.
However, for events such as VIEW_CHANGE that are multicast by the GMS protocol, a loss
can be serious: in this case, the view would never be received!
The same applies to the up direction: once NAKACK has successfully delivered a message,
losing that message on its way from NAKACK to the Channel is serious (the message is
essentially lost).
So while these error situations don't occur very often, if they do occur, they have
serious consequences.
SOLUTION:
- Do nothing for user messages: Channel.send() throws an exception, and the user has to
resend the message. Note that in http://jira.jboss.com/jira/browse/JGRP-179, we made
retransmission handling atomic, i.e. if there is an exception, there will *not* be a gap
in the seqnos for NAKACK and UNICAST.
- Provide either a pass{Up/Down}Reliably() method or an Event with a RELIABLE field, such
that this event needs to be acked. The sender (e.g. GMS on a VIEW_CHANGE) sends down the
event and waits until it gets an ACK, which could be sent by the NAKACK or UNICAST
protocols, or as a last resort by the transport (TP). If the ACK is not received within M
ms, the event is resent.
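The ack-and-resend proposal in the second item could look roughly like the following sketch. It is only an illustration under stated assumptions: ReliableEvent, ack(), passDownReliably() and the maxSends bound are hypothetical and not part of JGroups; timeoutMs plays the role of M above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class ReliableEventSketch {
    /** Hypothetical event that must be acked; stands in for "an Event with a RELIABLE field". */
    static final class ReliableEvent {
        final String type;
        private final CountDownLatch acked = new CountDownLatch(1);

        ReliableEvent(String type) { this.type = type; }

        /** Called by whichever lower layer takes responsibility for the event. */
        void ack() { acked.countDown(); }

        /** Returns true if an ack arrived within timeoutMs. */
        boolean awaitAck(long timeoutMs) throws InterruptedException {
            return acked.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
    }

    /**
     * Sends the event down and waits up to timeoutMs for an ack, resending on timeout.
     * Returns the number of sends needed, or -1 if no ack arrived after maxSends sends.
     */
    static int passDownReliably(ReliableEvent evt, Consumer<ReliableEvent> below,
                                long timeoutMs, int maxSends) throws InterruptedException {
        for (int sends = 1; sends <= maxSends; sends++) {
            try {
                below.accept(evt);
            } catch (RuntimeException e) {
                // Treated like a lost event: fall through, wait, then resend.
            }
            if (evt.awaitAck(timeoutMs)) {
                return sends;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger seen = new AtomicInteger();
        // Simulated lower layer that silently drops the first send and acks the second.
        Consumer<ReliableEvent> lossy = e -> {
            if (seen.incrementAndGet() >= 2) {
                e.ack();
            }
        };
        int sends = passDownReliably(new ReliableEvent("VIEW_CHANGE"), lossy, 50, 5);
        System.out.println("sends needed: " + sends);
    }
}
```

The sketch bounds the number of resends so the sender cannot block forever; the issue text itself does not say whether resending should be bounded.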
--
This message is automatically generated by JIRA.