[Hawkular-dev] [Alerts] Thoughts on the notification system

Jay Shaughnessy jshaughn at redhat.com
Fri Feb 6 11:04:42 EST 2015


In Hawkular, AlertDefinition -> Trigger.  Other comments inline...

On 2/6/2015 8:48 AM, Thomas Segismont wrote:
> On 06/02/2015 14:14, Catherine Robson wrote:
>>
>>> Thomas Segismont <mailto:tsegismo at redhat.com>
>>> February 6, 2015 at 7:17 AM
>>> Hi,
>>>
>>> I've been thinking about the alert notification system lately.
>>>
>>> What's the information needed to send a notification?
>>> How to convert the information into text when a human is the recipient?
>>> How to configure the system?
>>> How to make it fit into the Hawkular suite?
>>>
>>> 1. Contextual data
>>>
>>> A notification has contextual data. Contextual data is comprised of:
>>> - alert definition data
>>> - data involved in the trigger (for example metrics and availability
>>> values)
>> Many times there are symptoms of a problem that are seen through
>> alerts.  If there are many alerts happening around the same trigger
>> time, should we be adding pointers to other potentially related alerts
>> to help users diagnose the root cause faster?
> I think we should, yes. But I'm not sure it's related to the
> notification system. It's more about the alert system integration with
> inventory (inventory is the service which knows about resource
> relationships).

Definitely could be useful.  I agree with Heiko somewhat, in that I 
think it may fall outside of alerting and likely more into a 
presentation layer feature.

>
>
>>> 2. Notifier data
>>>
>>> 2.1 Who is the recipient?
>>>
>>> Email: address
>>> SMS: phone number
>>>
>>> Sometimes the recipient is fixed (for example when sending email to a
>>> mailing-list).
>>> Sometimes it should be picked from user information (for example when
>>> sending emails to a group of users)
>>>
>>> This information depends on the alert definition, but a default should
>>> be configurable for convenience.
>> Are we considering alert escalation?  Setting up a list that the alerts
>> go to by default, and then if they are not resolved by some configurable
>> SLA, they are then sent to another group of people to raise the
>> awareness of the problem?
> Excellent remark. We haven't discussed escalation so far. I need to
> think about how it would impact the notification system.
>
>>> 2.2 How should the message be sent?
>>>
>>> Email: SMTP address/port and credentials
>>> SMS: Web service HTTP URL and credentials
>>>
>>> The information depends on the tenant (in rare cases, on the alert
>>> definition, but let's ignore the problem for now)
>>>
>>> 3. How should the message be formatted?
>>>
>>> When a human is the recipient, information can be turned into text with
>>> a template engine (like freemarker).
>>>
>>> Information depends on the alert definition, but a default should be
>>> configurable for convenience.
>> Can we do full HTML format (maybe txt vs. HTML configurable).  It would
>> be nice to add links directly into the web console for that resource
>> that alerted so users can very quickly click through to investigate the
>> problem.
> Yes, HTML output makes sense in the case of email.
>
> In section 4, I've written an example configuration. The "mode"
> attribute tells the email notifier how it should format the email.
>
> "plaintext" -> send a plain text email
> "html" -> send an HTML email
> "plaintext+html" -> send a multipart email (plaintext part and HTML part)
>
> http://en.wikipedia.org/wiki/HTML_email#Multi-part_formats
>
> In either format, the template can be made to include a link to the
> alert details in the console.
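
To make the three "mode" values concrete, here is a minimal sketch using Python's standard email package. The function name build_email, the addresses and the console URL are made up for illustration; nothing here is existing Hawkular code.

```python
from email.message import EmailMessage


def build_email(mode, subject, sender, recipients, plaintext_body, html_body):
    """Build a message for one of the three modes from the example config.

    Hypothetical helper: the name and parameters are assumptions.
    """
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    if mode == "plaintext":
        msg.set_content(plaintext_body)
    elif mode == "html":
        msg.set_content(html_body, subtype="html")
    elif mode == "plaintext+html":
        # multipart/alternative: plain text part first, HTML part last,
        # so clients that prefer the richest format pick the HTML part.
        msg.set_content(plaintext_body)
        msg.add_alternative(html_body, subtype="html")
    else:
        raise ValueError(f"unknown mode: {mode}")
    return msg


# Example: both bodies carry a (made-up) link to the alert in the console.
link = "http://console.example.com/alerts/42"
message = build_email(
    "plaintext+html",
    "Pool soon exhausted",
    "alerts@example.com",
    ["ops@example.com"],
    f"Pool soon exhausted. Details: {link}",
    f'<p>Pool soon exhausted. <a href="{link}">Details</a></p>',
)
```

Actually sending the message (SMTP host, port, credentials) would come from the tenant-level settings discussed in 2.2 and is left out here.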

The Alert will provide only the most basic information, whatever the 
Alerting system can provide.  For the most part that will be the Trigger 
information and the triggering data.  The out-of-box notifiers will 
forward only that information.  For Hawkular/Inventory, we'll want to 
provide extended notifiers that can take that data and decorate it with 
more context.  For example, if we trigger on DataId X > 50, the default 
notifier would report just that.  But an extended notifier, one that 
knows about inventory, could query to find out which metric ID is X, and 
from there get the resource info, etc.
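
A rough sketch of the basic vs. extended notifier idea, in Python for brevity. The Alert fields, the function names and the dictionary standing in for an inventory query are all assumptions for illustration, not the actual Alerts or Inventory API.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    # Hypothetical minimal alert: trigger info plus the triggering data.
    trigger_id: str
    data_id: str
    condition: str
    value: float


def basic_message(alert):
    """Out-of-box notifier: report only what the alerting system knows."""
    return (f"Trigger {alert.trigger_id}: {alert.data_id} "
            f"{alert.condition} (value={alert.value})")


def extended_message(alert, inventory):
    """Extended notifier: decorate the basic message with inventory context.

    `inventory` is a plain dict standing in for a real inventory query.
    """
    base = basic_message(alert)
    entry = inventory.get(alert.data_id)
    if entry is None:
        # Inventory knows nothing about this data id: fall back to the basics.
        return base
    return f"{base} [metric={entry['metric']}, resource={entry['resource']}]"


# Made-up inventory content: data id X maps to a metric on a resource.
inventory = {"X": {"metric": "pool.used.percent",
                   "resource": "datasource/ExampleDS"}}
alert = Alert("t1", "X", "> 50", 72.0)
print(basic_message(alert))
print(extended_message(alert, inventory))
```

The point of the split is that the basic notifier has no dependency on inventory at all, while the extended one degrades to the basic output when the lookup finds nothing.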

>
>>> 4. Configuration
>>>
>>> Notifiers may expose REST endpoints (with standardized URIs)
>>>
>>> - default config: /emailnotifier/configuration
>>>
>>> - alert definition level config: /emailnotifier/configuration/1
>>>
>>> Example:
>>>
>>> {
>>> "to": ["paul at foobar.com", "alfred at foobar.com"],
>>> "cc": ["backoffice-mw-ops at foobar.com"],
>>> "subject": "Pool soon exhausted",
>>> "mode": "plaintext+html",
>>> "templates":
>>> [{
>>> "name": "plaintext",
>>> "uri": "/emailnotifier/configuration/1/templates/plaintext"
>>> },{
>>> "name": "html",
>>> "uri": "/emailnotifier/configuration/1/templates/html"
>>> }]
>>> }
>>>
>>> - template configs:
>>>
>>> Example:
>>>
>>> /emailnotifier/configuration/1/templates/plaintext
>>> /emailnotifier/configuration/1/templates/html
>>>
>>>
>>> I'm still not sure which component should be responsible for loading user
>>> information when a user (or a group of users) is selected as the
>>> recipient.

I'm not so sure about the REST endpoints but this is something we need 
to discuss further for sure.  We'll likely want some standard format for 
the notifier configuration, something we can easily supply 
programmatically or via a standard GUI component.  And that config 
information will need to be tagged onto the Trigger so it can be applied 
to the generated Alerts.
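
One possible shape for "default config plus per-Trigger config", sketched in Python. The key names follow Thomas's example; the function name, the trigger id and the addresses are assumptions, and this says nothing about where the data would be stored.

```python
def resolve_config(default_config, trigger_configs, trigger_id):
    """Return the notifier config to apply to alerts from one Trigger.

    Settings tagged onto the Trigger override the default; any key the
    Trigger does not set falls back to the default value.
    """
    merged = dict(default_config)
    merged.update(trigger_configs.get(trigger_id, {}))
    return merged


# Made-up data for illustration.
default_config = {
    "to": ["ops@example.com"],
    "subject": "Hawkular alert",
    "mode": "plaintext",
}
trigger_configs = {
    "pool-trigger": {"subject": "Pool soon exhausted",
                     "mode": "plaintext+html"},
}

config = resolve_config(default_config, trigger_configs, "pool-trigger")
```

A flat key/value format like this would be easy to supply programmatically and easy to render in a generic GUI form, which is the main reason to prefer it over something notifier-specific.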


>>>
>>> 5. Process
>>>
>>> - Alerts sends contextual data on the bus
>>> - Notifier picks it up
>>> - Notifier loads configuration for this alert definition or the
>>> default one
>>> - Notifier applies the template (optional)
>>> - Notifier sends email or invokes sender API

I think the entire Alert should likely be put on the bus, and configured 
notifiers subscribe to the Alerts topic and consume the Alerts for which 
they are relevant.  This is not the way it is currently set up in the 
code, but is an approach I think we should consider.   Also, I'm not 
sure about the need for template configurations, but maybe.
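
To illustrate the topic approach, here is a toy in-memory version in Python. The Bus and Notifier classes, the "alerts" topic name and the "notifiers" field on the alert are all hypothetical; the real thing would sit on the actual bus, not on a dictionary of callbacks.

```python
from collections import defaultdict


class Bus:
    """Toy in-memory stand-in for the message bus (assumption)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Every subscriber of the topic sees every message.
        for handler in list(self._subscribers[topic]):
            handler(message)


class Notifier:
    """Consumes only the alerts that are relevant to it."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def on_alert(self, alert):
        # Relevance check: the alert carries the notifier names that were
        # configured on its Trigger (hypothetical field).
        if self.name in alert.get("notifiers", []):
            self.sent.append(alert)


bus = Bus()
email = Notifier("email")
sms = Notifier("sms")
bus.subscribe("alerts", email.on_alert)
bus.subscribe("alerts", sms.on_alert)

# The entire alert goes on the bus, not just a notification request.
bus.publish("alerts", {"trigger_id": "t1", "notifiers": ["email"]})
```

With this setup the Alerts component does not need to know which notifiers exist; adding a new notifier is just another subscription.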


>>> 6. Storage
>>>
>>> There needs to be some shared storage in which to bind configuration and
>>> templates to alert definitions.

For now I think we should probably look to H2 for persistence.



