Message Title

The scheduling logic is implemented in org.elasticsearch.client.sniff.Sniffer, through its attribute private final task. Error handling is performed by this same object, which is an instance of a private class (org.elasticsearch.client.sniff.Sniffer.Task), and it just swallows exceptions and logs them. It does seem to catch exceptions properly and to schedule the next run as necessary, though.

So the object responsible for the behavior we want to change is stored in a field that we cannot change (private final), and even if we could mutate this field we couldn't provide another instance, because it's a private class.
So no, we cannot change the error handling without re-implementing most of the Sniffer.

Sanne Grinovero If you agree, I will reject this ticket. You mentioned "asking for the feature on the client"; what do you want exactly:

A way for us to plug in a custom error handler, which could either simply log some exceptions and allow rescheduling, or re-throw the more critical ones to prevent rescheduling?
Or simply to fix a problem in the Sniffer with the way it handles errors?

Globally, I'm quite comfortable with the current behavior. Exceptions will be logged at error level, which on any reasonable configuration should be enabled for all loggers, and sniffing will be attempted again whatever happens, which shouldn't hurt (sniffing is not supposed to happen every 2 nanoseconds, it's executed every second at most, and more probably every 10 seconds or so).
The only concern I have is the behavior regarding java.lang.Error. It seems that, if an Error is thrown, it will only be caught by the thread executor and will not be logged, and rescheduling will still happen, without the Error ever being re-thrown. If I had to suggest something to the Elastic team, it would be to catch any Throwable, not just any Exception, and maybe shut down the sniffer when an Error is caught.

If you confirm that it was your only concern, I'll open a ticket on https://github.com/elastic/elasticsearch/issues

Add Comment

This message was sent by Atlassian JIRA