[jboss-jira] [JBoss JIRA] Created: (JGRP-746) FD: messages from members other than ping_dest causes missing-heartbeat count to be reset

Bela Ban (JIRA) jira-events at lists.jboss.org
Mon Apr 28 05:52:09 EDT 2008


FD: messages from members other than ping_dest causes missing-heartbeat count to be reset
-----------------------------------------------------------------------------------------

                 Key: JGRP-746
                 URL: http://jira.jboss.com/jira/browse/JGRP-746
             Project: JGroups
          Issue Type: Bug
            Reporter: Bela Ban
         Assigned To: Bela Ban
            Priority: Critical
             Fix For: 2.6.3, 2.7


[email from John Smith]
I'm not sure FD is behaving like it should.

I started a group with two members, then suspended one instance with kill -SIGSTOP. I expected the FD protocol to suspect the suspended JVM after a while, but it never did.

I looked at the FD code, and it seems that messages that do not come from ping_dest still reset num_tries and thus prevent the member from being suspected. Is this intended? Why would a message from self reset num_tries?

I'm using JGroups 2.6.2.

Here is the relevant part of the JGroups logs:

10:10:48,291 DEBUG [FD] sending are-you-alive msg to 192.168.128.105:47870 (own address=192.168.128.129:57685)
10:10:48,291 DEBUG [FD] heartbeat missing from 192.168.128.105:47870 (number=0)
10:11:18,293 DEBUG [FD] sending are-you-alive msg to 192.168.128.105:47870 (own address=192.168.128.129:57685)
10:11:18,293 DEBUG [FD] heartbeat missing from 192.168.128.105:47870 (number=1)
10:11:48,294 DEBUG [FD] sending are-you-alive msg to 192.168.128.105:47870 (own address=192.168.128.129:57685)
10:11:48,294 DEBUG [FD] heartbeat missing from 192.168.128.105:47870 (number=2)
10:12:18,296 DEBUG [FD] sending are-you-alive msg to 192.168.128.105:47870 (own address=192.168.128.129:57685)
10:12:18,296 DEBUG [FD] heartbeat missing from 192.168.128.105:47870 (number=3)
10:12:48,299 DEBUG [FD] sending are-you-alive msg to 192.168.128.105:47870 (own address=192.168.128.129:57685)
10:12:48,299 DEBUG [FD] heartbeat missing from 192.168.128.105:47870 (number=4)
10:12:51,265 DEBUG [FD] received msg from 192.168.128.129:57685 (counts as ack)
10:13:18,300 DEBUG [FD] sending are-you-alive msg to 192.168.128.105:47870 (own address=192.168.128.129:57685)
10:13:19,336 DEBUG [FD] received msg from 192.168.128.129:57685 (counts as ack)
10:13:45,988 DEBUG [FD] received msg from 192.168.128.129:57685 (counts as ack)
10:13:48,302 DEBUG [FD] sending are-you-alive msg to 192.168.128.105:47870 (own address=192.168.128.129:57685)
10:14:18,303 DEBUG [FD] sending are-you-alive msg to 192.168.128.105:47870 (own address=192.168.128.129:57685)
10:14:18,303 DEBUG [FD] heartbeat missing from 192.168.128.105:47870 (number=0)
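
The pattern in the log above (the count climbs to 4, a message from the node's own address "counts as ack", and the count starts again at 0) can be reproduced with a small toy model. The following is only a sketch under stated assumptions, not the JGroups FD source: the class HeartbeatCounter, its methods, the member names "A" and "B", and the max-tries value of 6 are all hypothetical. It contrasts resetting the counter on any incoming message with resetting it only when the sender is ping_dest, which is one plausible guard and not necessarily the fix that was applied.

```java
import java.util.Objects;

/** Toy model of a heartbeat failure detector. NOT the actual JGroups FD code. */
final class HeartbeatCounter {
    private final String pingDest;      // member being monitored (hypothetical name)
    private final int maxTries;         // misses tolerated before suspecting (assumed value)
    private final boolean onlyPingDest; // true: only traffic from pingDest counts as an ack
    private int numTries;               // consecutive missed heartbeats so far

    HeartbeatCounter(String pingDest, int maxTries, boolean onlyPingDest) {
        this.pingDest = pingDest;
        this.maxTries = maxTries;
        this.onlyPingDest = onlyPingDest;
    }

    /** Called for every incoming message, whoever sent it. */
    void onMessage(String sender) {
        // Reported behaviour: any sender resets the counter (onlyPingDest == false).
        // Guarded behaviour: only a message from pingDest resets it.
        if (!onlyPingDest || Objects.equals(sender, pingDest)) {
            numTries = 0;
        }
    }

    /** Called when a heartbeat interval passes without an ack; returns true once suspected. */
    boolean onHeartbeatMissed() {
        numTries++;
        return numTries >= maxTries;
    }

    int numTries() {
        return numTries;
    }

    /** Replays the scenario: 5 misses, one message from self ("A"), then 5 more misses. */
    static boolean replay(HeartbeatCounter fd) {
        boolean suspected = false;
        for (int i = 0; i < 5; i++) {
            suspected |= fd.onHeartbeatMissed();
        }
        fd.onMessage("A"); // message from the local member, not from pingDest "B"
        for (int i = 0; i < 5; i++) {
            suspected |= fd.onHeartbeatMissed();
        }
        return suspected;
    }

    public static void main(String[] args) {
        System.out.println("reset on any message, suspected: "
                + replay(new HeartbeatCounter("B", 6, false)));
        System.out.println("reset on ping_dest only, suspected: "
                + replay(new HeartbeatCounter("B", 6, true)));
    }
}
```

In this model, regular traffic from any live member (including the local one) keeps the unguarded counter below the threshold indefinitely, so a suspended ping_dest is never suspected. With the guard, the same message sequence lets the count reach the threshold.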

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.jboss.com/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


