Yes, that makes sense. A subsystem failure will lead to failed deployments, but more generally it would make sense to keep the server running if it can recover from the error. In this case that means restarting the subsystem, which I don't think we have even considered?
Actually, my point in starting this thread was to clarify the current deployment mechanism and (more importantly) how it's actually supposed to work and how it is supposed to fail.
But the effects of errors from the different kinds of "startable" pieces (subsystems, applications, etc.) have to be clearly defined as well. I'll have a look and start a new thread for that.