[jboss-jira] [JBoss JIRA] Commented: (JGRP-1179) Incoming PingRsp is ignored despite being sent by a Coordinator.
Renaud Devarieux (JIRA)
jira-events at lists.jboss.org
Thu Apr 1 05:36:37 EDT 2010
[ https://jira.jboss.org/jira/browse/JGRP-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12523357#action_12523357 ]
Renaud Devarieux commented on JGRP-1179:
----------------------------------------
I have not been able to reproduce this issue using 2.10. I still need to check whether the check on the logical and physical address of the GET_MBRS_RSP, which allows an existing response to be overwritten, is really the cause of the improvement, but I am confident it is.
However, I ran into other issues tied to PING/Discovery. They are closely related but perhaps worth a separate issue. Your call, Bela.
Basically, Discovery sends up to n GET_MBRS_REQ messages to discover the members. Each GET_MBRS_REQ triggers a round of GET_MBRS_RSP messages, which increase the initial_member count toward its limit in the Promise that blocks the discovery. One round of GET_MBRS_RSP may not be sufficient to discover all the members, so the second round of RSPs completes the count of the Promise. Depending on the order in which the RSPs are received, however, the Promise condition may be signalled before all the RSPs are processed, and those unprocessed RSPs may come from a Coordinator elected between the two REQs sent. => trouble.
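The ordering problem described above can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: `DiscoveryOrderSketch`, `Rsp`, `viewAtLimit` and `hasCoord` are hypothetical names, not JGroups classes or methods, and the real Discovery code uses a Promise with a lock and condition rather than a plain loop. The model assumes that a newer response from the same sender overwrites the older one, and that the waiting thread stops looking at responses as soon as the limit is reached.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DiscoveryOrderSketch {
    /** Hypothetical stand-in for a discovery response; not the JGroups PingRsp class. */
    static final class Rsp {
        final String sender;
        final boolean coord;
        Rsp(String sender, boolean coord) { this.sender = sender; this.coord = coord; }
    }

    /**
     * Processes responses in arrival order and returns what the waiting thread
     * sees at the moment `limit` distinct senders have answered. Responses that
     * arrive after that point are never examined (assumed early wake-up).
     */
    static Map<String, Boolean> viewAtLimit(List<Rsp> arrivals, int limit) {
        Map<String, Boolean> seen = new LinkedHashMap<String, Boolean>();
        for (Rsp rsp : arrivals) {
            // A newer response from the same sender overwrites the older one.
            seen.put(rsp.sender, rsp.coord);
            if (seen.size() >= limit) {
                break; // the waiter is signalled here; the rest stays unprocessed
            }
        }
        return seen;
    }

    /** True if at least one sender in the view reported itself as coordinator. */
    static boolean hasCoord(Map<String, Boolean> view) {
        return view.containsValue(Boolean.TRUE);
    }

    public static void main(String[] args) {
        // Round 1: A and B answer, neither is coordinator yet.
        Rsp a1 = new Rsp("A", false);
        Rsp b1 = new Rsp("B", false);
        // Round 2: A has become coordinator in the meantime; C and E answer too.
        Rsp a2 = new Rsp("A", true);
        Rsp c2 = new Rsp("C", false);
        Rsp e2 = new Rsp("E", false);

        List<Rsp> coordLast = Arrays.asList(a1, b1, c2, e2, a2);
        List<Rsp> coordFirst = Arrays.asList(a1, b1, a2, c2, e2);

        System.out.println("coordinator seen, late arrival:  " + hasCoord(viewAtLimit(coordLast, 4)));
        System.out.println("coordinator seen, early arrival: " + hasCoord(viewAtLimit(coordFirst, 4)));
    }
}
```

With a limit of 4, the first ordering reaches the limit on E's response and never processes A's second response, so no coordinator is seen; the second ordering processes it before the limit is reached.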
About TCPPING I am clueless; I haven't tried anything TCP with JGroups.
> Incoming PingRsp is ignored despite being sent by a Coordinator.
> ----------------------------------------------------------------
>
> Key: JGRP-1179
> URL: https://jira.jboss.org/jira/browse/JGRP-1179
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 2.6.9, 2.6.14
> Environment: Linux Red Hat Enterprise 5.0 kernel 2.6.18-8.el5 java 1.6.0_18
> Reporter: Renaud Devarieux
> Assignee: Bela Ban
> Fix For: 2.10
>
>
> I launch successively (nearly simultaneously) 5 nodes A B C D E using the same protocol stack and one channel to communicate between themselves.
> UDP(mcast_addr=231.8.8.8;mcast_port=45578):PING(num_initial_members=4):MERGE2:FD:VERIFY_SUSPECT:pbcast.NAKACK:pbcast.STABLE:FRAG2:pbcast.GMS(shun=true):pbcast.FLUSH
> As often as not, depending on the speed/rhythm between node launches, I get 2 views, i.e. {D} and {A B C E}.
> The merge occurs later, but by then it's a bit late for my application, and I don't think I should have to handle one except in case of a real power/network failure.
> I noticed that D was timing out (3000 ms) during the discovery process despite having received the 4 GET_MBRS_RSP from the other nodes. D would then decide there was no coordinator out there and become coordinator itself.
> What seems to happen is that D sends two GET_MBRS_REQ and A replies to both. At the time of the first reply, A is not yet coordinator; by the time D receives the second response, A has become coordinator, but D ignores that response and doesn't add it to its list of Responses.
> I have written a workaround in the addResponse method of Discovery.Responses. It seems to work for my case, but I am afraid it would break something else I am not aware of.
>     public void addResponse(PingRsp rsp) {
>         if(rsp == null)
>             return;
>         promise.getLock().lock();
>         try {
>             // Workaround 29/03/2010
>             int index = ping_rsps.indexOf(rsp);
>             // equivalent to does not contain.
>             if (index == -1) {
>                 ping_rsps.add(rsp);
>                 promise.getCond().signalAll();
>             } else if (rsp.isCoord()) {
>                 PingRsp pr = ping_rsps.get(index);
>
>                 // Check if the already existing element is not server
>                 if (!pr.isCoord()) {
>                     ping_rsps.set(index, rsp);
>                     promise.getCond().signalAll();
>                 }
>             }
>             /*if(!ping_rsps.contains(rsp)) {
>                 ping_rsps.add(rsp);
>                 promise.getCond().signalAll();
>             }*/ // Old JGroups code
>         }
>         finally {
>             promise.getLock().unlock();
>         }
>     }
> Regards
> Renaud
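The replace-on-coordinator logic of the quoted workaround can be exercised in isolation with a minimal sketch. Everything here is an assumption made for illustration: `UpgradeOnCoordSketch`, `Rsp`, `addIfAbsent` and `addOrUpgrade` are hypothetical names, `Rsp` is not the JGroups PingRsp class, and equality is assumed to be by sender only, which is what the described behaviour (the second response from A being dropped) suggests. Locking and signalling are left out.

```java
import java.util.ArrayList;
import java.util.List;

final class UpgradeOnCoordSketch {
    /** Hypothetical stand-in for PingRsp; equality by sender only is an assumption. */
    static final class Rsp {
        final String sender;
        final boolean coord;
        Rsp(String sender, boolean coord) { this.sender = sender; this.coord = coord; }
        @Override public boolean equals(Object o) {
            return o instanceof Rsp && ((Rsp) o).sender.equals(sender);
        }
        @Override public int hashCode() { return sender.hashCode(); }
    }

    /** Old behaviour: a second response from a known sender is dropped. */
    static void addIfAbsent(List<Rsp> rsps, Rsp rsp) {
        if (!rsps.contains(rsp)) {
            rsps.add(rsp);
        }
    }

    /** Workaround behaviour: a coordinator response replaces a non-coordinator entry. */
    static void addOrUpgrade(List<Rsp> rsps, Rsp rsp) {
        int index = rsps.indexOf(rsp);
        if (index == -1) {
            rsps.add(rsp);
        } else if (rsp.coord && !rsps.get(index).coord) {
            rsps.set(index, rsp);
        }
    }

    /** True if any stored response comes from a coordinator. */
    static boolean hasCoord(List<Rsp> rsps) {
        for (Rsp r : rsps) {
            if (r.coord) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<Rsp> oldList = new ArrayList<Rsp>();
        List<Rsp> newList = new ArrayList<Rsp>();
        // A answers twice: first as a plain member, then as coordinator.
        addIfAbsent(oldList, new Rsp("A", false));
        addIfAbsent(oldList, new Rsp("A", true));
        addOrUpgrade(newList, new Rsp("A", false));
        addOrUpgrade(newList, new Rsp("A", true));
        System.out.println("old code sees coordinator:   " + hasCoord(oldList));
        System.out.println("workaround sees coordinator: " + hasCoord(newList));
    }
}
```

In both variants the list still holds a single entry for A; only the workaround ends up with the entry that carries the coordinator flag.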