[jboss-jira] [JBoss JIRA] (JGRP-1644) NAKACK2 violates FIFO property
Vadim Tsesko (JIRA)
jira-events at lists.jboss.org
Tue Jul 2 06:06:20 EDT 2013
[ https://issues.jboss.org/browse/JGRP-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786837#comment-12786837 ]
Vadim Tsesko commented on JGRP-1644:
------------------------------------
It seems that you are right. Our messages are rather big, so fragmentation is necessary, and at the same time many fragments are redelivered because of network flaps between datacenters.
I can test against the new version of JGroups, but I am not able to add a hand-built JAR to the project. Could you publish a Maven artifact -- something like {{jgroups-3.3-SNAPSHOT}} -- to http://search.maven.org? I can't build the Maven artifact from the {{3.3}} branch myself:
{code}
~/devel/JGroups$ mvn compile
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.jgroups:jgroups:bundle:3.3.2.Final
[WARNING] 'build.plugins.plugin.version' for org.apache.felix:maven-bundle-plugin is missing. @ line 242, column 21
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 131, column 21
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-jar-plugin is missing. @ line 233, column 21
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building JGroups 3.3.2.Final
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 0.877s
[INFO] Finished at: Tue Jul 02 14:02:26 MSK 2013
[INFO] Final Memory: 8M/127M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project jgroups: Could not resolve dependencies for project org.jgroups:jgroups:bundle:3.3.2.Final: Could not find artifact com.sun:tools:jar:1.6 at specified path /Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/../Classes/classes.jar -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
{code}
> NAKACK2 violates FIFO property
> ------------------------------
>
> Key: JGRP-1644
> URL: https://issues.jboss.org/browse/JGRP-1644
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.3.1
> Environment: Ubuntu 12.04 LTS, kernel 3.2.0-24-generic #39-Ubuntu SMP Mon May 21 16:52:17 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux, Java 1.7.0_21
> Reporter: Vadim Tsesko
> Assignee: Bela Ban
> Fix For: 3.4
>
> Attachments: TCP-NAKACK2.png, UDP-NAKACK2-NAKACK.png
>
>
> In the [documentation|http://www.jgroups.org/manual/html/protlist.html#ReliableMessageTransmission] it is stated that:
> {quote}
> NAKACK provides reliable delivery and FIFO (= First In First Out) properties for messages sent to all nodes in a cluster.
> {quote}
> and
> {quote}
> NAKACK2 was introduced in 3.1 and is a successor to NAKACK (at some point it will replace NAKACK). It has the same properties as NAKACK, but its implementation is faster and uses less memory, plus it creates fewer tasks in the timer.
> {quote}
> I have observed that sometimes multicast messages are received out of order.
> We use the following protocol stack configuration:
> {code:xml}
> <config xmlns="urn:org:jgroups"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-3.3.xsd">
> <UDP bind_addr="match-interface:$interface"
> bind_interface="$interface"
> bind_port="$unicastPort"
> ip_ttl="128"
> mcast_addr="$multicastGroup"
> mcast_port="$multicastPort"
> singleton_name="udp-transport"/>
> <PING return_entire_cache="true"
> break_on_coord_rsp="false"/>
> <MERGE3/>
> <FD_SOCK/>
> <FD_ALL/>
> <VERIFY_SUSPECT/>
> <BARRIER/>
> <pbcast.NAKACK print_stability_history_on_failed_xmit="true"/>
> <pbcast.STABLE/>
> <pbcast.GMS/>
> <MFC max_credits="8M"/>
> <FRAG2/>
> <RSVP/>
> </config>
> {code}
> As you can see, we mostly use the defaults.
> The messages are being sent from a single thread using the following code:
> {code:java}
> channel.send(new Message(null, msg));
> {code}
> Each message is between 300 KB and 4 MB in size. The message rate is 1-5 messages per second.
> We have a sequential counter inside each message being sent. Sometimes the messages are received out of order, for instance:
> {code}
> #1198
> #1199
> #1200
> #1202
> #1201
> #1203
> #1204
> {code}
> If we replace {{NAKACK2}} with {{NAKACK}}, the problem disappears -- everything works as expected (FIFO).
> If we replace the JGroups-based transport with a ZeroMQ-based transport (running over EPGM and in use for a year), everything works as expected (FIFO) -- just to let you know that there are no bugs in our message numbering logic.
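The receiver-side check described in the report (a sequential counter inside each message) could look roughly like the sketch below. It is a minimal illustration only, not the reporter's code and not part of JGroups: the class {{FifoChecker}} and its methods are hypothetical names, and the sample input is the counter sequence quoted in the report.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: flags counters that are delivered after a higher counter. */
public class FifoChecker {
    // Highest counter delivered so far; -1 means nothing has been delivered yet.
    private long lastSeen = -1;
    // Counters that arrived late (or twice), in the order they were observed.
    private final List<Long> violations = new ArrayList<>();

    /** Records one delivered counter; returns false if it is not strictly greater than all earlier ones. */
    public boolean onDelivered(long counter) {
        if (counter <= lastSeen) {
            violations.add(counter);
            return false;
        }
        lastSeen = counter;
        return true;
    }

    public List<Long> violations() {
        return violations;
    }

    public static void main(String[] args) {
        FifoChecker checker = new FifoChecker();
        // Counter sequence taken from the report.
        long[] delivered = {1198, 1199, 1200, 1202, 1201, 1203, 1204};
        for (long counter : delivered) {
            checker.onDelivered(counter);
        }
        System.out.println("late counters: " + checker.violations());
    }
}
```

For the sequence from the report this prints {{late counters: [1201]}}. The check only compares against the highest counter seen so far, so it detects reordering and duplicates but does not by itself report messages that never arrive.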