JBoss Community

ServerManager-Server(-ProcessManager) communication

reply from Brian Stansberry in JBoss AS7 Development

Agreed that the SM should be responsible for respawning servers. But I'll comment further in another post.

 

Re:

 

2.4.3.2.1 and .2 -- those seem correct. Otherwise there is a kind of zombie process. I suppose 2.4.3.2.2 (the remove) could be skipped if the SM knew it was going to restart the process.

 

3.1 I like the idea of the PM being responsible for failure detection. For one thing it has the stdio streams as a fallback to confirm server failure. Contrived example: admin restarts interface lo so all the socket connections break. But the servers are all still working and can reconnect.
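
A rough Java sketch of that distinction, to make the lo example concrete. Everything here is hypothetical (the class, the enum, the two boolean inputs); the only point is that a broken socket alone should not count as a failure while the PM can still see the child process is alive, e.g. via Process.isAlive() or because it has not hit EOF on the child's stdout.

```java
public class FailureCheck {

    /** Hypothetical verdicts the PM could reach for one managed process. */
    public enum Verdict { HEALTHY, CONNECTION_LOST_BUT_ALIVE, FAILED }

    /**
     * @param socketConnected whether the socket connection to the process is still up
     * @param processAlive    whether the PM still sees the child as running
     *                        (assumed to come from Process.isAlive() or from
     *                        the stdio streams not having reached EOF)
     */
    public static Verdict check(boolean socketConnected, boolean processAlive) {
        if (!processAlive) {
            // The process itself is gone; the socket state is irrelevant.
            return Verdict.FAILED;
        }
        // Process is running. A dropped socket only means it has to reconnect.
        return socketConnected ? Verdict.HEALTHY : Verdict.CONNECTION_LOST_BUT_ALIVE;
    }
}
```

In the lo case every server would come back as CONNECTION_LOST_BUT_ALIVE, so nothing gets respawned.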

 

3.2.2 I think the PM should restart the SM. The PM has no interface beyond whatever it reads in from the command line. And the servers expect to be managed via the SM. So if the SM is allowed to die and not restart, the entire set of processes can't be managed except via a kill command.

 

The original design was to always have a running SM. We then split out the PM as a separate process just to:

  • Allow the SM to be upgraded/patched w/o requiring Servers to shut down. The PM would be dead simple with no dependencies and thus wouldn't need to be patched.
  • Make the Servers more reliable by having the process that consumes their stdio streams as simple as possible. So bugs in the SM would not cause Servers to crash.

 

5 Besides the shutdown hook in the PM, the other reasons for closing down everything are

  • SM receives a command from the DC telling it to do so
  • If the SM exposes a remote management interface, it receives a command via that telling it to do so

 

6.1.1 I don't think this applies. The PM has no remote interface that would let it receive an instruction to restart the SM. And I don't think any internal state change in the PM would trigger such a thing.

 

6.3.1 IMO it would be restarted by the PM.

 

6.3.4 Could the differentiator be a simple param passed by the PM to the SM on the command line?
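
Something as small as this would do. The flag name "--restarted" and the class are made up for illustration; nothing like it exists in the code today.

```java
import java.util.Arrays;

public class SmArgs {

    /** Hypothetical flag the PM would append when it respawns a dead SM. */
    public static final String RESTARTED_FLAG = "--restarted";

    /** Returns true if the PM told this SM instance that it is a restart. */
    public static boolean isRestart(String[] args) {
        return Arrays.asList(args).contains(RESTARTED_FLAG);
    }
}
```

On a restart the SM would then wait for the existing Servers to reconnect instead of asking the PM to start them.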

 

Something to think about, though, is that the SM needs to get its internal state consistent with the actual state of affairs. What if it thinks there are 3 servers but actually there are 4? (That would be odd.) Or there are only two -- what triggers it to realize it never got a new connection from the 3rd? Right now the SM doesn't keep much state, but as we flesh it out it probably will (e.g. a copy of each Server's Standalone config object). Following a restart it probably shouldn't just assume its state is consistent with the Server. That can be checked easily enough by getting the result of Standalone.elementHash() from the server and comparing it to its own value.
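
A sketch of what that check could look like after a restart. Assumptions: the hash is modeled as a long (I'm not asserting what Standalone.elementHash() actually returns), servers are keyed by name, and the SM runs this once some grace period for reconnects has passed. The class and field names are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class Reconcile {

    /** Outcome of comparing the SM's view with what the Servers reported. */
    public static final class Result {
        public final Set<String> missing = new TreeSet<>();    // SM expected it, it never reconnected
        public final Set<String> unexpected = new TreeSet<>(); // reconnected, but SM does not know it
        public final Set<String> stale = new TreeSet<>();      // known, but config hash differs
    }

    /**
     * @param smView   server name -> hash of the config copy held by the SM
     * @param reported server name -> hash reported by each reconnected Server
     */
    public static Result reconcile(Map<String, Long> smView, Map<String, Long> reported) {
        Result r = new Result();
        for (Map.Entry<String, Long> e : smView.entrySet()) {
            Long actual = reported.get(e.getKey());
            if (actual == null) {
                r.missing.add(e.getKey());
            } else if (!actual.equals(e.getValue())) {
                r.stale.add(e.getKey());
            }
        }
        for (String name : reported.keySet()) {
            if (!smView.containsKey(name)) {
                r.unexpected.add(name);
            }
        }
        return r;
    }
}
```

The "thinks 3, actually 4" case shows up in unexpected, the "only two" case in missing, and a config drift in stale. What the SM does about each is a separate question.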

 

One kinda hacky thought is if the PM knows the ManagedProcess is already started, it could ignore 2.4.1 and when it receives 2.4.2 it sends an SM_RESTARTED message to the Server (instead of starting it). That tells the server to connect to the SM.
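
Roughly, in Java. I'm assuming here that 2.4.1 is the "add" message and 2.4.2 the "start" message; the enum names, the Action values and the class are hypothetical, and only SM_RESTARTED comes from the idea above.

```java
import java.util.HashSet;
import java.util.Set;

public class PmDispatch {

    /** Assumed mapping: ADD stands for 2.4.1, START for 2.4.2. */
    public enum Command { ADD, START }

    /** What the PM decides to do with an incoming command. */
    public enum Action { REGISTER, SPAWN, IGNORE, SEND_SM_RESTARTED }

    // Names of ManagedProcesses the PM has already started.
    private final Set<String> running = new HashSet<>();

    public Action handle(Command cmd, String processName) {
        boolean alreadyRunning = running.contains(processName);
        switch (cmd) {
            case ADD:
                // A restarted SM re-adding a live process: nothing to do.
                return alreadyRunning ? Action.IGNORE : Action.REGISTER;
            case START:
                if (alreadyRunning) {
                    // Don't start a second copy; tell the Server to reconnect to the SM.
                    return Action.SEND_SM_RESTARTED;
                }
                running.add(processName);
                return Action.SPAWN;
            default:
                throw new IllegalArgumentException("Unknown command: " + cmd);
        }
    }
}
```

The nice part is that the SM would not need separate startup and restart code paths; it sends the same two messages either way and the PM sorts it out.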

 

Another thing to think about with 2.4 is whether we want to support concurrent startup of servers. Also, we need some mechanism to control the order of server start across the domain. That could just be following the order of server-group elements in the domain.xml, and then the order of server elements in the host.xml.
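
The ordering rule could be as simple as the sketch below. It takes the server-group names in domain.xml order and the servers in host.xml order, both assumed to be already parsed; ServerDef and the method name are hypothetical, and putting servers with an unknown group last is my own choice, not something decided in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StartOrder {

    /** Minimal stand-in for a server element from host.xml. */
    public static final class ServerDef {
        public final String name;
        public final String group;

        public ServerDef(String name, String group) {
            this.name = name;
            this.group = group;
        }
    }

    /** Orders servers by server-group position first, then by host.xml position. */
    public static List<String> startOrder(List<String> groupsInDomainOrder,
                                          List<ServerDef> serversInHostOrder) {
        Map<String, Integer> groupRank = new HashMap<>();
        for (int i = 0; i < groupsInDomainOrder.size(); i++) {
            groupRank.put(groupsInDomainOrder.get(i), i);
        }
        List<ServerDef> sorted = new ArrayList<>(serversInHostOrder);
        // List.sort is stable, so servers in the same group keep their host.xml order.
        sorted.sort(Comparator.comparingInt(
                s -> groupRank.getOrDefault(s.group, Integer.MAX_VALUE)));
        List<String> names = new ArrayList<>();
        for (ServerDef s : sorted) {
            names.add(s.name);
        }
        return names;
    }
}
```

This only gives the order on a single host. Ordering across hosts in the domain, and whether servers within a group may start concurrently, would still need to be decided.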
