[jboss-jira] [JBoss JIRA] Commented: (JGRP-1161) Repeated ERROR logging from MPING PingSenderTask on Solaris

Sat Mar 13 11:38:37 EST 2010

    [ https://jira.jboss.org/jira/browse/JGRP-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12519722#action_12519722 ] 

Bela Ban commented on JGRP-1161:
--------------------------------

The problem is that the PingSender task is scheduled with a fixed delay and never stops on its own: with timeout=3000 and num_ping_requests=2, it runs every 1500 ms (timeout / num_ping_requests), i.e. at time 0, 1500, 3000, 4500, 6000 and so on. The stop() method is then called by Discovery.findInitialMembers(), and because stop() interrupts the task, we see the InterruptedException.

Solution #1: remove the warning, so we don't see it in the log
Solution #2: only run the task num_ping_requests times. After that, have the task cancel itself, so before the next execution, it is removed from the timer
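
A minimal sketch of what solution #2 could look like, assuming a plain java.util.concurrent.ScheduledExecutorService in place of the JGroups timer. The class SelfCancellingPinger, its maxRuns field (standing in for num_ping_requests) and the empty sendDiscoveryRequest() are hypothetical names for illustration, not the actual Discovery code:

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for Discovery$PingSenderTask; not the JGroups implementation.
final class SelfCancellingPinger implements Runnable {
    private final int maxRuns;                       // plays the role of num_ping_requests
    private final AtomicInteger sent = new AtomicInteger();
    private ScheduledFuture<?> future;               // guarded by "this"

    SelfCancellingPinger(int maxRuns) {
        this.maxRuns = maxRuns;
    }

    // Holding the lock while scheduling ensures cancelSelf() never sees a null
    // future, even if the first run starts before scheduleWithFixedDelay() returns.
    synchronized void start(ScheduledExecutorService timer, long intervalMs) {
        future = timer.scheduleWithFixedDelay(this, 0, intervalMs, TimeUnit.MILLISECONDS);
    }

    @Override
    public void run() {
        if (sent.get() >= maxRuns) {                 // defensive: nothing left to send
            cancelSelf();
            return;
        }
        sendDiscoveryRequest();
        if (sent.incrementAndGet() >= maxRuns) {
            cancelSelf();                            // last request sent: remove ourselves
        }
    }

    private synchronized void cancelSelf() {
        // false = do not interrupt; we are on the task's own thread and about to return
        future.cancel(false);
    }

    int sent() {
        return sent.get();
    }

    private void sendDiscoveryRequest() {
        // placeholder for the real multicast send
    }
}
```

Because the task cancels itself after the last request, a later stop() finds nothing running and has nothing to interrupt, provided the last execution finishes before the timeout.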

Also, if timeout=3000, then execution #3 is at T=3000. It could happen that the 3rd execution and stop() coincide, and we'd still see the exception. The execution times should therefore be changed to be well within the timeout, e.g. 0, 1000, 2000.
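
One way to get execution times like 0, 1000, 2000 is to space the runs at timeout / count, starting at 0, so that the last run starts one full interval before the timeout. The sketch below only illustrates that arithmetic; PingSchedule, executionTimes and the count parameter are hypothetical names, and how count relates to num_ping_requests is left open here:

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class PingSchedule {
    // Returns `count` start times in ms, evenly spaced from 0.
    // The last one is (count - 1) * (timeoutMs / count), which is always below timeoutMs.
    static long[] executionTimes(long timeoutMs, int count) {
        if (timeoutMs <= 0 || count <= 0) {
            throw new IllegalArgumentException("timeoutMs and count must be positive");
        }
        long interval = timeoutMs / count;
        long[] times = new long[count];
        for (int i = 0; i < count; i++) {
            times[i] = i * interval;
        }
        return times;
    }

    public static void main(String[] args) {
        // prints [0, 1000, 2000]
        System.out.println(Arrays.toString(executionTimes(3000, 3)));
    }
}
```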

> Repeated ERROR logging from MPING PingSenderTask on Solaris
> -----------------------------------------------------------
>
>                 Key: JGRP-1161
>                 URL: https://jira.jboss.org/jira/browse/JGRP-1161
>             Project: JGroups
>          Issue Type: Bug
>    Affects Versions: 2.6.10
>         Environment: Solaris server (not sure what release), JBoss 5.1.0.GA
>            Reporter: Brian Stansberry
>            Assignee: Bela Ban
>             Fix For: 2.10
>
>         Attachments: probe.txt
>
>
> AS 5.1.0.GA user (see forum thread) reports repeated logging of java.io.InterruptedIOException from MPING, coming from the task thread that's running Discovery$PingSenderTask. The thread is interrupted during the send of the discovery packet. From the pattern of logging (repeated, and after initial cluster formation) my assumption is the task is triggered by MERGE2. My assumption is also that the thread is interrupted because Discovery.findInitialMembers() has timed out waiting for a response, and has called PingSenderTask.stop(), which ultimately interrupts the task thread. The MPING discovery timeout is 2 secs; the unusual thing here is that the discovery packet hasn't been sent within 2 seconds.
> There are really two aspects to this JIRA:
> 1) See if there is any problem with the way JGroups is scheduling the task, or configuring the multicast socket etc. It looks OK to me, and quite possibly this is due to an environmental issue on the user's system. Would be good to understand though, as this is the second report I've had of this general problem.
> 2) Change MPING.sendMcastDiscoveryRequest to handle InterruptedIOException or let it propagate. I see PingSenderTask.run() is coded to handle InterruptedIOException by logging a WARN, but MPING.sendMcastDiscoveryRequest doesn't let it propagate and instead logs it as an ERROR with a stack trace.
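
A rough sketch of the "let it propagate" option from point 2, under stated assumptions: the sending method rethrows InterruptedIOException instead of logging it as an ERROR, and the caller treats it as an expected interruption. PacketSender, Outcome, sendDiscoveryRequest and runOnce are hypothetical stand-ins, and System.err takes the place of the real logger; this is not the actual MPING or PingSenderTask code:

```java
import java.io.IOException;
import java.io.InterruptedIOException;

// Hypothetical stand-ins for MPING.sendMcastDiscoveryRequest and PingSenderTask.run().
final class DiscoverySendSketch {
    interface PacketSender {
        void send() throws IOException;
    }

    enum Outcome { SENT, INTERRUPTED, FAILED }

    // Returns true if the packet was sent. Genuine I/O failures are logged as errors;
    // an interruption is passed up to the caller rather than logged here.
    static boolean sendDiscoveryRequest(PacketSender sender) throws InterruptedIOException {
        try {
            sender.send();
            return true;
        } catch (InterruptedIOException e) {
            throw e;                                 // let the task decide how to report it
        } catch (IOException e) {
            System.err.println("ERROR: failed sending discovery request: " + e);
            return false;
        }
    }

    // The task treats an interrupted send as an expected stop and logs it at WARN level.
    static Outcome runOnce(PacketSender sender) {
        try {
            return sendDiscoveryRequest(sender) ? Outcome.SENT : Outcome.FAILED;
        } catch (InterruptedIOException e) {
            System.err.println("WARN: discovery request interrupted: " + e.getMessage());
            return Outcome.INTERRUPTED;
        }
    }
}
```

The catch order matters: InterruptedIOException is a subclass of IOException, so it has to be caught first or it would fall into the generic error branch.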

-- 
This message is automatically generated by JIRA.