On 02/13/2015 01:32 PM, Catherine Robson wrote:
> Thomas Heute <mailto:theute@redhat.com>
> February 13, 2015 at 4:22 AM
>
> On 02/12/2015 11:25 PM, Catherine Robson wrote:
>> Hi -
>>
>> We see that alerting on response time and downtime are part of what
>> we hope to provide in the first iteration of Hawkular. We'd like to
>> get started on the designs related to alert
>> definition/configuration. I'm hoping that you can all share some of
>> the requirements around alert definitions that you think we need to
>> have for Hawkular at this point. I don't want to overdo it by
>> looking at JON - I'd like to start simple.
> +1
>> Here are the requirements for the web console as I currently am
>> thinking of them, but would like the team to comment on them and
>> add/remove requirements as you see necessary.
>>
>> Overall Alerts
>> As an administrator of a website, I would like to have all alerts
>> sent to me through e-mail.
>> As an administrator of a website, I would like to have all alerts
>> sent to me via text message.
>
> That may be implicit but:
> - As an administrator of a website, I would like to have all
> alerts listed only to the console.
Ha! Right :)
>
> We may not implement SMS right from the beginning but having 2 ways
> may be good to initiate the design. (Later we should embed Aerogear
> UPS and have a small app for push on phones, note to potential
> readers, a good student subject or contribution).
>
> 1 important thing:
> - we need some alert "profiles": if I look at my 1000s of resources
> I may want them all to follow the same profile, and when I want to
> change who receives an email I should do that in a single place (and
> not go through the 1000s of resources). There would be several profiles.
Great - this makes sense. Do we need to hand enter user information
(e-mails/phones) or can some of this information be gathered through
Keycloak potentially?
I think a "contact list" would make sense (and use Keycloak) for email
and SMS (and later for mobile push). Not necessarily in the first
iteration though.
To clarify exactly what we think a "profile" contains - please verify
below.
An alert profile is a place where users can set up alerting contact
information and rules for many resources. An alert profile contains:
* A name & description
Yes a user may choose to have various profiles depending on the gravity
of the problem, or set of machines (different people in charge) so he
needs to identify the profile easily.
* Contact information of everyone associated with this profile (auto
or manual?)
This would be explicitly listed. We may need to add "shortcuts" at some
point: for instance, if a resource has an "owner" we may want to be able
to send an email to the "owner" of the affected resource rather than a
fixed person, but that's already more advanced.
* A group of resources this profile applies to
Another alternative is that the resources are not mentioned in the
profile, and you just assign the profile when you're working in the
resources. This feels much more like an "Alert contact group" than a
profile to me in that case, so it is just a terminology change I think
to make it clearer for what to expect from this capability.
A user could be interested to check which resources are affected before
making a change, but that doesn't need to be prominent.
I took the example of changing the email "Alert contact group", but it
could be changing the acceptable response time for all servers. So an
additional point would be
* Alert conditions
So one example of an "alert profile"
- Name: "Neuchatel Datacenter Critical issue"
- Description: "blah blah"
- Condition: "Down for 10min"
- Alerts: "Email bob immediately", "SMS mary after 20min if still
down"
That profile would normally be applied to all EAP servers in the Neuchatel
datacenter. If bob gets fired, someone comes in and changes the email to
someone else in that alert profile. If the Neuchatel datacenter becomes
more critical, someone comes in and changes "Down for 10min" to "Down for
2min".
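To make the profile idea concrete, here is a minimal sketch of how it could be modelled. This is only an illustration of the example above, not a proposed Hawkular API: the class names, field names, channel strings and resource ids are all hypothetical, and it assumes the "after 20min" delay is counted from the moment the condition first fires.

```python
from dataclasses import dataclass, field


@dataclass
class Notification:
    channel: str            # "email" or "sms" (hypothetical channel names)
    recipient: str
    after_minutes: int = 0  # assumed: delay counted from when the condition fires


@dataclass
class AlertProfile:
    name: str
    description: str
    down_for_minutes: int   # the condition: "down for N minutes"
    notifications: list = field(default_factory=list)


# One shared profile, stored under a hypothetical key.
profiles = {
    "neuchatel-critical": AlertProfile(
        name="Neuchatel Datacenter Critical issue",
        description="blah blah",
        down_for_minutes=10,
        notifications=[
            Notification("email", "bob"),
            Notification("sms", "mary", after_minutes=20),
        ],
    )
}

# Resources only reference the profile by key (resource ids are made up).
resource_profile = {f"eap-{i}": "neuchatel-critical" for i in range(1000)}


def profile_for(resource_id):
    """Look up the profile currently assigned to a resource."""
    return profiles[resource_profile[resource_id]]


# Tighten the condition in a single place; every assigned resource sees it.
profiles["neuchatel-critical"].down_for_minutes = 2
```

Because each resource holds a reference rather than a copy, changing the contact or the condition is one edit instead of a walk through the 1000s of resources.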
>
> 3 potential improvements that we may want to think about in the
> design right now (or not):
> - Different addressee: Support for sending email/SMS to someone
> else but the owner
Let me see if I can expand on this use case to make sure we're all on
the same page.
Precondition: An alert fired. It was sent to person A.
Step 1: User sees the alert, and wants to "share" this alert with
person B.
Step 2: User manually enters Person B's e-mail or SMS information.
Step 2 alternate: User selects from a dropdown list of existing known
users to find Person B, and Person B's preferred contact method is
used for sending the alert.
End Goal: Alert is sent to Person B based on the method chosen above.
I really just meant that we can send the email to someone other than the
logged-in user
> - Escalamation: if resource is down for 5 min, send me (or
> someone else) an email, if still down after 30min send me a SMS
Could this use those alert profiles too?
Yes definitely.
(Sorry for that "Escalamation" typo :) I really meant escalation)
> - Multiple alerts for 1 particular event: if resource is down for
> 5 min, send me an email, send my boss an email and send the IT guy a SMS
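Escalation and multiple alerts for one event could both fall out of the same rule list. The following is a rough sketch under that assumption; the function name, the tuple layout and the recipient names are hypothetical and only mirror the examples in this thread.

```python
def due_notifications(rules, minutes_down, already_sent):
    """Return the rules whose delay has elapsed and that were not sent yet.

    rules: list of (after_minutes, channel, recipient) tuples.
    already_sent: set of rules that have already been delivered.
    """
    return [r for r in rules if r[0] <= minutes_down and r not in already_sent]


rules = [
    (5, "email", "me"),      # multiple alerts for the same event...
    (5, "email", "boss"),
    (5, "sms", "it-guy"),
    (30, "sms", "me"),       # ...and an escalation if it is still down
]

sent = set()
first = due_notifications(rules, 5, sent)    # the three 5-minute rules
sent.update(first)
later = due_notifications(rules, 30, sent)   # only the 30-minute SMS remains
```

Tracking what was already sent keeps the 5-minute notifications from being repeated when the 30-minute escalation is evaluated.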
>
>
>> Downtime
>> As an administrator of a website, I would like to configure Hawkular
>> so an alert is sent to me every time the system goes down.
>> As an administrator of a website, I would like to configure Hawkular
>> so an alert is only sent to me after the system is down for a
>> certain length of time, so I'm not alerted if there is a very minor
>> downtime event.
> +1
>>
>> Response time
>> As an administrator of a website, I would like to configure Hawkular
>> to alert me when my website's response time is slower than a
>> threshold I have set so I know there may be performance problems.
> It would have to be for some configurable period of time
Ok - so you would never want to alert if we go over *at all* for this
metric, you would only ever want to alert based on a time interval it
was above the threshold for.
Right. Unless we look at a percentile (or average), a single response
time value outside the norm doesn't mean anything; it would only
frustrate the person notified. So this needs to include some period of time.
Thomas
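As an illustration of "above the threshold for some period of time", here is a small sketch. It assumes samples arrive as (timestamp in seconds, response time in ms) pairs sorted by time; the function and parameter names are hypothetical, not an existing Hawkular interface.

```python
def sustained_breach(samples, threshold_ms, period_s):
    """True if response time has stayed above threshold_ms for at least
    period_s seconds, up to and including the most recent sample.

    samples: list of (timestamp_s, response_ms), sorted by timestamp.
    """
    breach_start = None
    for ts, value in samples:
        if value > threshold_ms:
            if breach_start is None:
                breach_start = ts   # first sample of the current breach
        else:
            breach_start = None     # back under the threshold: reset
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start >= period_s


# A single spike does not alert; a sustained slowdown does.
spike = [(0, 100), (60, 900), (120, 110)]
slow = [(0, 100), (60, 900), (120, 950), (180, 920)]
spike_alerts = sustained_breach(spike, threshold_ms=500, period_s=120)
slow_alerts = sustained_breach(slow, threshold_ms=500, period_s=120)
```

The "only alert after the system is down for a certain length of time" downtime requirement could be handled the same way, by feeding in an up/down value instead of a response time.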
>>
>> Are there any other "settings" to the alerts that we should be
>> considering at this point?
>
> At some point in the future we may want to have a warning state, but
> I don't want to overload this thread :)
>
> Thomas
>> Thanks,
>> Catherine
>>
>>
>> _______________________________________________
>> hawkular-dev mailing list
>> hawkular-dev(a)lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/hawkular-dev
--
Catherine Robson
User Experience Design
Red Hat JBoss Middleware
c: 978-944-3825