There must be an exception thrown when the session receives the task-completed events. These events are processed in sequence by a single thread, and it may be that the exception is swallowed by that thread, so some process instances are never resumed. Two things could be done:
- implement an extended human task (HT) work item handler that logs more debug information
- try to reproduce it as a unit test
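
To illustrate the first option, here is a rough sketch of the idea in plain Java. It is not the jBPM API: `TaskCompletedEvent`, `LoggingHandler` and `failedEvents()` are hypothetical names made up for this example, and the delegate stands in for whatever your real handler does when a task completes. The point is only that the wrapper logs each event with its ids and records any exception instead of letting the processing thread swallow it silently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch only: none of these types come from jBPM, they are stand-ins.
class TaskEventLoggingSketch {

    /** Hypothetical stand-in for a "task completed" event. */
    static final class TaskCompletedEvent {
        final long workItemId;
        final long processInstanceId;

        TaskCompletedEvent(long workItemId, long processInstanceId) {
            this.workItemId = workItemId;
            this.processInstanceId = processInstanceId;
        }
    }

    /** Wraps a delegate so failures are logged with context and remembered. */
    static final class LoggingHandler implements Consumer<TaskCompletedEvent> {
        private static final Logger LOG = Logger.getLogger(LoggingHandler.class.getName());

        private final Consumer<TaskCompletedEvent> delegate;
        // Assumes events arrive on a single thread, as described above.
        private final List<TaskCompletedEvent> failed = new ArrayList<>();

        LoggingHandler(Consumer<TaskCompletedEvent> delegate) {
            this.delegate = delegate;
        }

        @Override
        public void accept(TaskCompletedEvent event) {
            LOG.fine("Completing work item " + event.workItemId
                    + " for process instance " + event.processInstanceId);
            try {
                delegate.accept(event);
            } catch (RuntimeException ex) {
                // Log with ids so the stuck process instance can be identified later.
                LOG.log(Level.SEVERE, "Failed to complete work item " + event.workItemId
                        + " for process instance " + event.processInstanceId, ex);
                failed.add(event);
            }
        }

        /** Events whose processing threw an exception, in arrival order. */
        List<TaskCompletedEvent> failedEvents() {
            return Collections.unmodifiableList(failed);
        }
    }

    public static void main(String[] args) {
        // Delegate that fails for one specific work item, to simulate the problem.
        LoggingHandler handler = new LoggingHandler(event -> {
            if (event.workItemId == 2L) {
                throw new IllegalStateException("simulated failure");
            }
        });

        handler.accept(new TaskCompletedEvent(1L, 10L));
        handler.accept(new TaskCompletedEvent(2L, 20L));
        handler.accept(new TaskCompletedEvent(3L, 30L));

        for (TaskCompletedEvent e : handler.failedEvents()) {
            System.out.println("Not resumed: process instance " + e.processInstanceId);
        }
    }
}
```

The same shape also works as a starting point for the unit test: feed the handler a sequence of events, make one of them fail, and check which process instances end up in the failed list.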
Alternatively, you could try the JMS-based transport and configure persistence for the JMS queue, so that messages are not lost if an exception occurs.
HTH