I have implemented the following, but still need to test some corner cases:
- PM starts up and listens on a socket on a port (PPM) for connections from the processes it manages.
- PM starts SM, passing in PPM, PM’s host address and ‘ServerManager’ as the name
- SM opens a socket on a different port (PSM) which listens for connections from the Server processes and from the Domain Controller (DC).
- SM initiates communication with PM, by connecting to port PPM. The first command it sends is 'CONNECTED ServerManager', which helps PM associate the socket with the correct ManagedProcess.
- For each Server configured in SM:
- SM tells PM to add Server
- SM tells PM to start Server process
- PM launches the Server process, passing in PPM, PM_ADDRESS, PSM, SM_ADDRESS and the SERVER_NAME
- Server initiates communication with PM, by connecting to port PPM. The first command it sends is 'CONNECTED <SERVER_NAME>', which helps PM associate the socket with the correct ManagedProcess.
- Server starts listening for commands on the PM socket.
- Server initiates communication with SM, by connecting to port PSM. The first command it sends is 'CONNECTED <SERVER_NAME>', which helps SM associate the socket with the correct Server proxy.
- Server sends the ‘SERVER_AVAILABLE’ command on the SM socket
- Server starts listening for commands on the SM socket
- SM sends the ‘START_SERVER serverConfig’ message to the server via the Server’s socket
- Server parses the serverConfig, starts up and sends to SM either
- ‘SERVER_STARTED’ if successful.
- ‘SERVER_START_FAILED’ if failed
- SM tells PM to stop the process
- If the Server has auto-restart=true and the number of retries is < respawn_policy_max, SM repeats 2.3.2 and 2.3.3. The respawn policy should be configurable (https://jira.jboss.org/browse/JBAS-8390)
- Otherwise SM tells PM to remove the process
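To illustrate the respawn decision, here is a minimal Java sketch. It assumes the stop/respawn/remove steps apply after ‘SERVER_START_FAILED’. The names RespawnSketch, RespawnPolicy, Action and afterStartFailed are made up for this example and are not taken from the actual implementation. The policy is reduced to an auto-restart flag and a retry limit; a configurable policy as discussed in JBAS-8390 may well look different.

```java
/** Hypothetical sketch of SM's decision after a failed server start. */
final class RespawnSketch {

    /** What SM asks PM to do once the failed process has been stopped. */
    enum Action { RESTART, REMOVE }

    /** Assumed policy shape: an auto-restart flag plus a retry limit. */
    static final class RespawnPolicy {
        final boolean autoRestart;
        final int maxRetries; // stands in for respawn_policy_max

        RespawnPolicy(boolean autoRestart, int maxRetries) {
            this.autoRestart = autoRestart;
            this.maxRetries = maxRetries;
        }
    }

    /** Restart only while auto-restart is on and the retry budget is not used up. */
    static Action afterStartFailed(RespawnPolicy policy, int retriesSoFar) {
        if (policy.autoRestart && retriesSoFar < policy.maxRetries) {
            return Action.RESTART;
        }
        return Action.REMOVE;
    }

    public static void main(String[] args) {
        RespawnPolicy policy = new RespawnPolicy(true, 3);
        for (int retries = 0; retries <= 3; retries++) {
            System.out.println("retries=" + retries + " -> " + afterStartFailed(policy, retries));
        }
    }
}
```

Keeping the decision in a pure function like this makes the retry corner cases easy to test without starting any processes.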
- While a ManagedProcess is registered as started in PM
- Processes are connected to the PM socket
- The ManagedProcess monitors whether the process is still alive (with a thread doing Process.waitFor())
- If a Server process goes down, PM stops the process and sends ‘DOWN <SERVER_NAME>’ to SM on the PM-SM connection.
- SM respawns the server process according to the rules in 2.3.3.2.
- If the ServerManager process goes down, PM respawns it as in 6.3
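The liveness monitoring could be sketched roughly as follows. This is only an illustration: ProcessMonitorSketch, monitor and the onDown callback are hypothetical names, and the callback stands in for whatever PM really does when a process exits (stopping the process and writing ‘DOWN <SERVER_NAME>’ to the SM connection). Only Process.waitFor() comes from the description above.

```java
import java.util.function.Consumer;

/** Hypothetical sketch of a ManagedProcess watching its child with Process.waitFor(). */
final class ProcessMonitorSketch {

    /**
     * Starts a daemon thread that blocks until the process exits and then
     * hands a "DOWN <name>" message to the callback.
     */
    static Thread monitor(String name, Process process, Consumer<String> onDown) {
        Thread watcher = new Thread(() -> {
            try {
                process.waitFor(); // blocks until the child process terminates
                onDown.accept("DOWN " + name);
            } catch (InterruptedException e) {
                // Monitoring was cancelled, e.g. during an orderly shutdown.
                Thread.currentThread().interrupt();
            }
        }, "monitor-" + name);
        watcher.setDaemon(true); // do not keep the JVM alive just for monitoring
        watcher.start();
        return watcher;
    }
}
```

A caller would pass the Process returned by ProcessBuilder.start() together with a callback that forwards the message to SM. Interrupting the returned thread cancels the monitoring without triggering the callback.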
- To shut down a server
- SM sends ‘STOP_SERVER’ to server.
- Server closes down
- Server sends ‘SERVER_STOPPED’ to SM.
- SM tells PM to stop the Server process
- SM tells PM to remove the Server process
- Closing down everything
- The message to shut down comes from
- SM gets SHUTDOWN command from DC or management interface
- Shutdown hook in PM
- Send ‘SHUTDOWN_SERVERS’ command to SM
- For each server, do 4 to close it down
- PM sends 'SHUTDOWN' message to SM which closes down SM as in 6
- Restarting SM
- SM process is stopped by
- Message from DC
- Process is killed
- SM is down...
- SM process is started
- PM starts SM, passing in PPM, PM’s host address and ‘ServerManager’ as the name, along with the -restarted-server-manager flag.
- See 2.1
- See 2.2
- SM sends the ‘RECONNECT_SERVERS <SM_ADDRESS> <PSM>’ command to PM
- For each Server process PM sends ‘RECONNECT_SERVER_MANAGER <SM_ADDRESS> <PSM>’
- Server reconnects to SM as in 2.3.2.4
- Server sends ‘SERVER_RECONNECT_STATUS <Current_State>’ to SM
- If the server is not in the starting, started, stopping or stopped state (I added some basic state management), SM does 2.3.3
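The reconnect check in the last step might look something like the sketch below. The four states STARTING, STARTED, STOPPING and STOPPED come from the description above; the other enum values, the class and method names, and the assumption that the state travels as a single upper-case token after ‘SERVER_RECONNECT_STATUS’ are all hypothetical.

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of how SM could handle SERVER_RECONNECT_STATUS. */
final class ReconnectSketch {

    /** BOOTING, AVAILABLE and FAILED are invented here for illustration. */
    enum ServerState { BOOTING, AVAILABLE, STARTING, STARTED, STOPPING, STOPPED, FAILED }

    /** States in which SM leaves a reconnected server alone. */
    private static final Set<ServerState> NO_START_NEEDED = EnumSet.of(
            ServerState.STARTING, ServerState.STARTED,
            ServerState.STOPPING, ServerState.STOPPED);

    /** Parses "SERVER_RECONNECT_STATUS <Current_State>" (assumed wire format). */
    static ServerState parseStatus(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 2 || !"SERVER_RECONNECT_STATUS".equals(parts[0])) {
            throw new IllegalArgumentException("Unexpected command: " + line);
        }
        return ServerState.valueOf(parts[1].toUpperCase(Locale.ROOT));
    }

    /** True if SM should send START_SERVER to the reconnected server. */
    static boolean needsStart(ServerState state) {
        return !NO_START_NEEDED.contains(state);
    }

    public static void main(String[] args) {
        for (String line : new String[] {
                "SERVER_RECONNECT_STATUS STARTED",
                "SERVER_RECONNECT_STATUS AVAILABLE"}) {
            System.out.println(line + " -> needsStart=" + needsStart(parseStatus(line)));
        }
    }
}
```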