[jboss-jira] [JBoss JIRA] Updated: (JBAS-4287) run.sh can consume 100% single CPU resources on Solaris

Ian Springer (JIRA) jira-events at lists.jboss.org
Fri Jun 27 17:04:32 EDT 2008


     [ http://jira.jboss.com/jira/browse/JBAS-4287?page=all ]

Ian Springer updated JBAS-4287:
-------------------------------

    Attachment: JBAS-4287.patch

The attached patch (JBAS-4287.patch) should fix the infinite loop on Solaris, as well as on systems where /bin/sh is linked to dash (e.g. Ubuntu).

The new wait-for-exit code looks like this:

      # Wait until the background process exits
      WAIT_STATUS=128
      while [ "$WAIT_STATUS" -ge 128 ]; do
         wait $JBOSS_PID 2>/dev/null
         WAIT_STATUS=$?
         # A status above 128 represents a received signal (128 + signal number)
         if [ "${WAIT_STATUS}" -gt 128 ]; then
            SIGNAL=`expr ${WAIT_STATUS} - 128`
            SIGNAL_NAME=`kill -l ${SIGNAL}`
            echo "*** JBossAS process (${JBOSS_PID}) received ${SIGNAL_NAME} signal. ***" >&2
         fi
      done
      # A status below 127 is the exit status of the JBossAS process itself;
      # 127 means the process no longer exists, so report 0
      if [ "${WAIT_STATUS}" -lt 127 ]; then
         JBOSS_STATUS=$WAIT_STATUS
      else
         JBOSS_STATUS=0
      fi
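
As a rough illustration of how the loop interprets the value returned by wait, the status ranges can be spelled out in a small helper. This is only a sketch and not part of the patch; the function name classify_wait_status is made up for the example.

```shell
# classify_wait_status is a hypothetical helper (not in JBAS-4287.patch).
# It prints how the patched loop would treat a given wait status.
classify_wait_status() {
    STATUS=$1
    if [ "$STATUS" -gt 128 ]; then
        # 128 + N: a signal with number N was received, so the loop waits again
        echo "signal `expr $STATUS - 128`, wait again"
    elif [ "$STATUS" -eq 128 ]; then
        # exactly 128: the loop waits again without printing a signal message
        echo "wait again"
    elif [ "$STATUS" -eq 127 ]; then
        # 127: the process no longer exists; JBOSS_STATUS is set to 0
        echo "process gone, report 0"
    else
        # below 127: taken as the exit status of the JBossAS process
        echo "exited, report $STATUS"
    fi
}
```

For example, `classify_wait_status 143` prints `signal 15, wait again`, since 143 - 128 = 15.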


This new while loop handles the different behaviours of the wait command across shells:

1) bash on Linux: when the bash process running run.sh receives a termination signal, the 1st call to wait returns an exit status >128 that represents the signal received, and the 2nd call to wait returns the exit status of the JBossAS process itself (typically 0).

2) dash on Linux: same as above. NOTE: as an earlier comment in this issue described, dash's wait will keep returning 0 indefinitely, even after the JBossAS process has exited (it should return 127 once the process no longer exists), unless a sleep is added to the loop. This is no longer an issue, since the loop's condition has been changed so that the loop terminates on any wait status <128.

3) bash on Cygwin: when the bash process running run.sh receives a termination signal, the 1st call to wait returns an exit status >128 that represents the signal *the bash process* received (I believe this is a bug in Cygwin bash), the 2nd call to wait returns an exit status >128 that represents the signal *the JBossAS process* received, and the 3rd call to wait returns the exit status of the JBossAS process itself (typically 0).

4) sh on Solaris: the 1st call to wait returns the exit status of the JBossAS process itself (typically 0).
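
The four cases above can be replayed without starting a server by feeding the loop a fixed sequence of statuses in place of real wait calls. This is a minimal sketch, assuming 143 (128 + 15, i.e. SIGTERM) as the signal status; simulate_wait_loop is a hypothetical name, and the sequences are taken from the descriptions above rather than captured from real runs.

```shell
# simulate_wait_loop is a hypothetical helper (not in JBAS-4287.patch).
# Each argument stands in for the status returned by one call to wait.
simulate_wait_loop() {
    WAIT_STATUS=128
    CALLS=0
    while [ "$WAIT_STATUS" -ge 128 ]; do
        if [ $# -eq 0 ]; then
            # The simulated sequence never produced a status below 128
            echo "ran out of simulated statuses" >&2
            return 1
        fi
        WAIT_STATUS=$1          # replaces: wait $JBOSS_PID; WAIT_STATUS=$?
        shift
        CALLS=`expr $CALLS + 1`
    done
    # Same final mapping as the patched run.sh
    if [ "$WAIT_STATUS" -lt 127 ]; then
        JBOSS_STATUS=$WAIT_STATUS
    else
        JBOSS_STATUS=0
    fi
    echo "calls=$CALLS status=$JBOSS_STATUS"
}
```

With these assumptions, `simulate_wait_loop 143 0` (cases 1 and 2) prints `calls=2 status=0`, `simulate_wait_loop 143 143 0` (case 3) prints `calls=3 status=0`, and `simulate_wait_loop 0` (case 4) prints `calls=1 status=0`. In each case the loop ends as soon as a status below 128 appears.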

I've tested all of the above environments. It would be good if someone else could test on HP-UX and AIX.



> run.sh can consume 100% single CPU resources on Solaris
> -------------------------------------------------------
>
>                 Key: JBAS-4287
>                 URL: http://jira.jboss.com/jira/browse/JBAS-4287
>             Project: JBoss Application Server
>          Issue Type: Bug
>      Security Level: Public(Everyone can see) 
>          Components: Other
>    Affects Versions: JBossAS-4.0.5.GA
>         Environment: Solaris on SPARC (version 9 & 10)
>            Reporter: Quenten Alick
>         Assigned To: Dimitris Andreadis
>             Fix For: JBossAS-4.2.3.GA
>
>         Attachments: JBAS-4287.patch, run.sh, run.sh.patch
>
>
> When shutting down JBoss, the run.sh script remains alive and consumes 100% of a single CPU's resources. run.sh then needs to be killed off.
> To trigger the bug, you need to:
> 1) set LAUNCH_JBOSS_IN_BACKGROUND
> 2) start the JBoss server normally
> 3) once it has started, stop the server
> At this point the JVM stops, but the run.sh script keeps running and consumes 100% of a single CPU's resources.
> The problem seems to be this bit of the script, plus the fact that the script's shebang is #!/bin/sh:
> while [ "$WAIT_STATUS" -ne 127 ]; do
>          JBOSS_STATUS=$WAIT_STATUS
>          wait $JBOSS_PID 2>/dev/null
>          WAIT_STATUS=$?
> done 
> On Solaris, /bin/sh is the *real* Bourne shell, and its wait builtin returns 0 (not 127) if the PID passed as an argument doesn't exist. The man page for wait states that this is the correct behaviour. Therefore wait returns 0 and the while loop continues forever, burning up CPU resources (until you kill it with one of the signals that is not trapped).
> Here's a link to the patch that may have introduced the bug.
> http://jira.jboss.com/jira/browse/JBAS-3748 
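
To check how a given /bin/sh behaves in the situation the report describes, one can wait twice on a child that has already exited and print the second status. This is only a sketch, and probe_wait_status is a made-up name. Going by the report, Solaris /bin/sh would be expected to print 0 here, whereas the original loop only terminated on 127; the result on other shells may differ.

```shell
# probe_wait_status is a hypothetical helper for checking a shell's wait builtin.
probe_wait_status() {
    true &                          # short-lived child that exits with status 0
    PROBE_PID=$!
    wait $PROBE_PID 2>/dev/null     # first wait collects the child's exit status
    wait $PROBE_PID 2>/dev/null     # second wait targets a child that is already gone
    echo $?                         # print the status of the second wait
}

probe_wait_status
```

Running this under the same interpreter as run.sh (for example `/bin/sh probe.sh`, where probe.sh is a file containing the sketch) shows which value the loop condition has to cope with.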


        


