[jboss-jira] [JBoss JIRA] (JGRP-1317) Compress Digest and MutableDigest

Fri Sep 6 03:21:04 EDT 2013

    [ https://issues.jboss.org/browse/JGRP-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802148#comment-12802148 ] 

Bela Ban edited comment on JGRP-1317 at 9/6/13 3:20 AM:
--------------------------------------------------------

Sending the digest and associating it (at the receiving end) with a view proved to be problematic in some cases: 
* If we sent a digest with view-id V1, but the sender already installed V2, the digest would not be associated with a view
* In some cases, a view was meaningless, e.g.
** On a merge, each subgroup coordinator (e.g. X in \{X,Y,Z\}) needs to get the digest from all of its subgroup members
** Each member returns its own digest, so Y returns Y=[50:51]. However, Y's view would be \{X,Y,Z\}, so associating the digest with that view doesn't work: the result would be a digest whose seqnos for X and Z are 0.
* On a state transfer, if the membership is \{A,B,C\}, we'd return the state and digest \{A,B,C\}. However,
** If D joined meanwhile, the receiver would have a different view-id and would not be able to associate the received digest with a view.
** Even if the receiver was able to get the view, if B left, the digest for \{A,B,C\} would be associated with view \{A,C\}, which would lead to incorrect seqnos

Therefore, the digest needs to be changed to include either a reference to a view (*reference digest*) or a membership list (*full digest*).
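
To make the two flavours concrete, here is a minimal sketch of how they could differ. It is not the JGroups implementation: the class names (View, FullDigest, RefDigest), the use of String member names and a numeric view-id, and the layout of the seqno array (highest delivered, then highest received, per member) are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class DigestKinds {

    /** Hypothetical stand-in for a view: a view-id plus an ordered member list. */
    static final class View {
        final long id;
        final List<String> members;

        View(long id, String... members) {
            this.id = id;
            this.members = Collections.unmodifiableList(Arrays.asList(members.clone()));
        }
    }

    /** Full digest: ships its own membership list, so no view is needed to read it. */
    static final class FullDigest {
        final List<String> members;
        final long[] seqnos; // for member i: [2*i] = highest delivered, [2*i+1] = highest received

        FullDigest(List<String> members, long... seqnos) {
            if (seqnos.length != 2 * members.size())
                throw new IllegalArgumentException("need 2 seqnos per member");
            this.members = members;
            this.seqnos = seqnos.clone();
        }

        /** Returns {highest delivered, highest received} for the member, or null if absent. */
        long[] get(String member) {
            int i = members.indexOf(member);
            return i < 0 ? null : new long[] {seqnos[2 * i], seqnos[2 * i + 1]};
        }
    }

    /** Reference digest: ships only a view-id; members come from the referenced view. */
    static final class RefDigest {
        final long viewId;
        final long[] seqnos;

        RefDigest(long viewId, long... seqnos) {
            this.viewId = viewId;
            this.seqnos = seqnos.clone();
        }

        /** Returns the resolved digest, or null if the view does not match this digest. */
        FullDigest resolve(View view) {
            if (view == null || view.id != viewId || seqnos.length != 2 * view.members.size())
                return null;
            return new FullDigest(view.members, seqnos);
        }
    }

    public static void main(String[] args) {
        View v1 = new View(1, "A", "B", "C");
        View v2 = new View(2, "A", "C"); // B left in the meantime
        RefDigest ref = new RefDigest(1, 10, 11, 20, 22, 30, 30);
        System.out.println(ref.resolve(v1) != null); // true: view-ids match
        System.out.println(ref.resolve(v2) != null); // false: digest refers to view 1

        // Merge case from the comment: Y returns only its own entry, Y=[50:51].
        FullDigest own = new FullDigest(Collections.singletonList("Y"), 50, 51);
        System.out.println(Arrays.toString(own.get("Y"))); // [50, 51]
    }
}
```

The reference digest refuses to resolve against a view with a different view-id, which avoids the incorrect seqno mapping described above, while the full digest can describe any member subset, including a single member.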

The full digest is a bit bigger, both in memory and when marshalled over the wire. However, most digests are unmarshalled, used to add, change or remove some entries in an xmit-table, and then discarded, so this shouldn't be an issue. The only place where we keep a digest around is in the coordinator in STABLE, so this shouldn't increase memory consumption.

So when would we use full digests and when ref digests?

# JOIN: the existing members would not receive any digest with the new view. The joiner would receive the new view and the *full digest*.
# STABLE: members send the *ref digest*; as it is associated with a view-id, the coordinator discards STABLE messages which have a different view-id.
# STATE transfer: a state provider always returns the state and the *full digest*
# MERGE: the digests for the subgroup are always returned as a *full digest* (as discussed above). However, this only includes *1 member* anyway, so no harm done. The MERGE-REQs always return the *full digests* as well. However, the merge view installation includes the *ref digest* only, as the MergeView and the ref digest are shipped together, and the latter can have a reference to the former.
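
The view-id check in the STABLE case can be sketched as follows. This is only an illustration, not the STABLE protocol itself: the class StableCollector, its method names, the numeric view-id, and the rule of merging by taking the per-member minimum of the highest delivered seqnos are assumptions of this sketch.

```java
import java.util.Arrays;

public class StableCollector {
    private final long currentViewId;
    private final long[] minDelivered; // one slot per member, in view order
    private int acceptedCount;

    StableCollector(long currentViewId, int memberCount) {
        this.currentViewId = currentViewId;
        this.minDelivered = new long[memberCount];
        Arrays.fill(minDelivered, Long.MAX_VALUE);
    }

    /**
     * Handles one STABLE message carrying a ref digest.
     * Returns false if the message was discarded because it refers to a different view.
     */
    boolean onStable(long viewId, long[] highestDelivered) {
        if (viewId != currentViewId || highestDelivered.length != minDelivered.length)
            return false; // digest cannot be mapped onto the current view
        for (int i = 0; i < minDelivered.length; i++)
            minDelivered[i] = Math.min(minDelivered[i], highestDelivered[i]);
        acceptedCount++;
        return true;
    }

    /** Number of STABLE messages merged so far. */
    int accepted() {
        return acceptedCount;
    }

    /** Per-member minimum over all accepted messages (assumed merge rule). */
    long[] stableSeqnos() {
        return minDelivered.clone();
    }

    public static void main(String[] args) {
        StableCollector coord = new StableCollector(2, 3);
        System.out.println(coord.onStable(1, new long[] {9, 9, 9})); // false: old view-id
        System.out.println(coord.onStable(2, new long[] {5, 7, 9})); // true
        System.out.println(coord.onStable(2, new long[] {6, 4, 9})); // true
        System.out.println(Arrays.toString(coord.stableSeqnos()));   // [5, 4, 9]
    }
}
```

Because a ref digest carries no membership list, the view-id comparison is the only way for the coordinator to know that slot i of the array refers to the same member on both sides.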

So if we assume that STABLE messages are the most frequent (ref digest), followed by view installation to existing members (ref digest), followed by JOINs (full digest), followed by STATE transfer (full digest) and merge (full and ref digest), then the larger full digest is only used in the less frequent cases, and the cost incurred is not so bad.

> Compress Digest and MutableDigest
> ---------------------------------
>
>                 Key: JGRP-1317
>                 URL: https://issues.jboss.org/browse/JGRP-1317
>             Project: JGroups
>          Issue Type: Feature Request
>            Reporter: Bela Ban
>            Assignee: Bela Ban
>             Fix For: 3.4
>
>
> For large clusters, STABLE messages are quite large, and should be compressed to be sent over the wire. 
> STABLE messages are sent between members in the same view, so we could only send the ViewId + highest_delivered/highest_received seqnos.
> Everybody who receives a STABLE message grabs the View associated with the ViewId (should be the current view !) and creates a Digest based on the View and the long[] array.
> Further optimization:
> - Canonicalize digests: if everyone has (14) 20, 22, then we could write it once, give it an ID of (say) 1 and then only refer to 1 again if we encounter the same digest. Actually, as a matter of fact, most of the digests would be the same, so this optimization could have a big effect !
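
The canonicalization idea from the issue description could look roughly like this. It is a sketch under assumptions, not code from JGroups: the class DigestCanonicalizer, the Encoded holder, and the flat long[] layout (highest delivered, then highest received, per member) are hypothetical. Each distinct pair is stored once in a table and every member entry refers to it by a 1-based ID.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DigestCanonicalizer {

    /** Table of distinct {highest delivered, highest received} pairs plus one ID per member. */
    static final class Encoded {
        final List<long[]> table = new ArrayList<>();
        final int[] ids; // ids[i] is the 1-based index into table for member i

        Encoded(int memberCount) {
            this.ids = new int[memberCount];
        }
    }

    /** Replaces repeated pairs by a reference to the first occurrence. */
    static Encoded encode(long[] seqnos) {
        if (seqnos.length % 2 != 0)
            throw new IllegalArgumentException("need 2 seqnos per member");
        int n = seqnos.length / 2;
        Encoded out = new Encoded(n);
        Map<String, Integer> seen = new HashMap<>(); // "hd:hr" -> ID
        for (int i = 0; i < n; i++) {
            long hd = seqnos[2 * i], hr = seqnos[2 * i + 1];
            String key = hd + ":" + hr;
            Integer id = seen.get(key);
            if (id == null) {
                out.table.add(new long[] {hd, hr});
                id = out.table.size(); // IDs start at 1
                seen.put(key, id);
            }
            out.ids[i] = id;
        }
        return out;
    }

    /** Expands the table and IDs back into the flat seqno array. */
    static long[] decode(Encoded e) {
        long[] seqnos = new long[2 * e.ids.length];
        for (int i = 0; i < e.ids.length; i++) {
            long[] pair = e.table.get(e.ids[i] - 1);
            seqnos[2 * i] = pair[0];
            seqnos[2 * i + 1] = pair[1];
        }
        return seqnos;
    }

    public static void main(String[] args) {
        long[] seqnos = {20, 22, 20, 22, 20, 22, 19, 22};
        Encoded e = encode(seqnos);
        System.out.println(e.table.size());                       // 2
        System.out.println(Arrays.toString(e.ids));               // [1, 1, 1, 2]
        System.out.println(Arrays.equals(decode(e), seqnos));     // true
    }
}
```

When most members report the same pair, the table stays tiny and the per-member cost shrinks to one small ID. Whether that wins on the wire depends on how the IDs and seqnos are actually marshalled, which this sketch does not model.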
