I think a "contact list" would make sense (and use Keycloak) for email and SMS (and later for mobile push). Not necessarily in the first iteration though.
Ha! Right :)
On 02/12/2015 11:25 PM, Catherine Robson wrote:
Hi -
+1
We see that alerting on response time and downtime are part of what we hope to provide in the first iteration of Hawkular. We'd like to get started on the designs related to alert definition/configuration. I'm hoping that you can all share some of the requirements around alert definitions that you think we need to have for Hawkular at this point. I don't want to overdo it by looking at JON - I'd like to start simple.
Here are the requirements for the web console as I am currently thinking of them. I would like the team to comment on them and add or remove requirements as you see necessary.
Overall Alerts
As an administrator of a website, I would like to have all alerts sent to me through e-mail.
As an administrator of a website, I would like to have all alerts sent to me via text message.
That may be implicit but:
- As an administrator of a website, I would like to have all alerts listed only to the console.
Great - this makes sense. Do we need to hand enter user information (e-mails/phones), or can some of this information potentially be gathered through Keycloak?
We may not implement SMS right from the beginning, but having 2 ways may be good to initiate the design. (Later we should embed Aerogear UPS and have a small app for push on phones. Note to potential readers: this would be a good student subject or contribution.)
1 important thing:
- We need some alert "profiles". If I look at my 1000s of resources, I may want them all to follow the same profile, and when I want to change who receives an email I should do that in a single place (and not go through the 1000s of resources). There would be several profiles.
To clarify exactly what we think a "profile" contains - please verify below.
Yes, a user may choose to have various profiles depending on the gravity of the problem or the set of machines (different people in charge), so they need to be able to identify the profile easily.
An alert profile is a place where users can set up alerting contact information and rules for many resources. An alert profile contains:
- A name & description
This would be explicitly listed. We may need to add "shortcuts" at some point: for instance, if a resource has an "owner", we may want to be able to send an email to the "owner" of the affected resource rather than to a fixed person. But that's already more advanced.
- Contact information of everyone associated with this profile (auto or manual?)
- A group of resources this profile applies to
Another alternative is that the resources are not mentioned in the profile, and you just assign the profile when you're working in the resources. In that case this feels much more like an "Alert contact group" than a profile to me, so I think it is just a terminology change to make it clearer what to expect from this capability.
So one example of an "alert profile"
- Name: "Neuchatel Datacenter Critical issue"
- Description: "blah blah"
- Condition: "Down for 10min"
- Alerts: "Email bob immediately", "SMS mary after 20min if
still down"
That profile would normally be applied to all EAP servers in the
Neuchatel datacenter. If bob gets fired, someone comes in and
changes the email to someone else in that alert profile. If the
Neuchatel datacenter becomes more critical, someone comes in and
changes "Down for 10min" to "Down for 2 min".
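To make the "change it in a single place" idea concrete, here is a minimal sketch of the Neuchatel example as a data structure. Nothing here is an existing Hawkular API: the class names, the fields, the profile key and the resource ids are all made up for illustration, and the meaning of "after 20min" (a delay counted from when the condition is met) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Notification:
    channel: str        # "email" or "sms"
    recipient: str      # contact name or address
    delay_min: int = 0  # assumed: minutes to wait after the condition is met


@dataclass
class AlertProfile:
    name: str
    description: str
    down_for_min: int   # condition: resource down for this many minutes
    notifications: list = field(default_factory=list)


# Hypothetical store: resources point at a profile by key instead of
# copying its settings.
profiles = {
    "neuchatel-critical": AlertProfile(
        name="Neuchatel Datacenter Critical issue",
        description="blah blah",
        down_for_min=10,
        notifications=[
            Notification("email", "bob"),
            Notification("sms", "mary", delay_min=20),
        ],
    )
}
resource_profile = {f"eap-{i}": "neuchatel-critical" for i in range(1000)}


def recipients_for(resource_id):
    """Return the recipients currently configured for a resource."""
    profile = profiles[resource_profile[resource_id]]
    return [n.recipient for n in profile.notifications]


# One edit to the profile is seen by every resource that uses it.
profiles["neuchatel-critical"].notifications[0].recipient = "alice"
profiles["neuchatel-critical"].down_for_min = 2
```

After the two edits at the end, `recipients_for("eap-0")` and `recipients_for("eap-999")` both return the new contact without any of the 1000 resources being touched.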
Let me see if I can expand on this use case to make sure we're all on the same page.
3 potential improvements that we may want to think about in the design right now (or not):
- Different addressee: Support for sending email/SMS to someone other than the owner
Precondition: An alert fired. It was sent to person A.
Step 1: User sees the alert, and wants to "share" this alert with person B.
Step 2: User manually enters Person B's e-mail or SMS information.
Step 2 alternate: User selects from a dropdown list of existing known users to find Person B, and Person B's preferred contact method is used for sending the alert.
End Goal: Alert is sent to Person B based on the method chosen above.
Could this use those alert profiles too?
- Escalation: if resource is down for 5 min, send me (or someone else) an email, if still down after 30min send me an SMS
- Multiple alerts for 1 particular event: if resource is down for 5 min, send me an email, send my boss an email and send the IT guy an SMS
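Escalation and multiple alerts for one event can both be seen as a list of steps, each with a delay. A rough sketch under that assumption follows; `Step`, `due_steps` and the recipient names are hypothetical and only mirror the two examples above.

```python
from typing import NamedTuple


class Step(NamedTuple):
    after_min: int   # minutes of continuous downtime before this step fires
    channel: str     # "email" or "sms"
    recipient: str


def due_steps(steps, down_min, already_sent=frozenset()):
    """Return the steps whose delay has elapsed and that were not sent yet."""
    return [
        s for s in steps
        if down_min >= s.after_min and s not in already_sent
    ]


# Escalation: email after 5 min, SMS if still down after 30 min.
escalation = [
    Step(5, "email", "me"),
    Step(30, "sms", "me"),
]

# Multiple alerts for one event: three notifications at the same delay.
fan_out = [
    Step(5, "email", "me"),
    Step(5, "email", "boss"),
    Step(5, "sms", "it-guy"),
]
```

With this shape, escalation is just steps with different delays and the "multiple alerts" case is steps sharing one delay, so a single list in the profile could cover both.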
Ok - so you would never want to alert as soon as this metric goes over the threshold *at all*; you would only ever want to alert once it has been above the threshold for some time interval.
Downtime
+1
As an administrator of a website, I would like to configure Hawkular so an alert is sent to me every time the system goes down.
As an administrator of a website, I would like to configure Hawkular so an alert is only sent to me after the system is down for a certain length of time, so I'm not alerted if there is a very minor downtime event.
It would have to be for some configurable period of time
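As a sketch of the two downtime stories with a configurable period, assuming availability arrives as timestamped up/down samples (the sample format and the function name are assumptions, not something Hawkular defines in this thread):

```python
def downtime_alert_due(samples, grace_seconds):
    """samples: list of (timestamp_seconds, is_up) ordered by time.

    Returns True when the resource has been continuously down for at
    least grace_seconds as of the last sample.
    """
    down_since = None
    for ts, is_up in samples:
        if is_up:
            down_since = None       # any "up" sample resets the outage
        elif down_since is None:
            down_since = ts         # first "down" sample of this outage
    if down_since is None:
        return False
    return samples[-1][0] - down_since >= grace_seconds
```

Setting `grace_seconds` to 0 gives the "every time the system goes down" behaviour, while a larger value skips very minor downtime events.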
Response time
As an administrator of a website, I would like to configure Hawkular to alert me when my website's response time is slower than a threshold I have set so I know there may be performance problems.
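A minimal sketch of the response-time check, assuming response times are collected as a list of millisecond samples. The `min_consecutive` parameter is an addition for illustration; it is one possible way to require that the metric stays above the threshold for a while rather than alerting on a single slow sample.

```python
def slow_response_alert(samples_ms, threshold_ms, min_consecutive=1):
    """True if the most recent min_consecutive samples all exceed threshold_ms."""
    if min_consecutive < 1 or len(samples_ms) < min_consecutive:
        return False
    return all(s > threshold_ms for s in samples_ms[-min_consecutive:])
```

With the default of 1, a single slow sample is enough to alert; raising `min_consecutive` trades alert latency for fewer false alarms.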
Are there any other "settings" to the alerts that we should be considering at this point?
At some point in the future we may want to have a warning state, but I don't want to overload this thread :)
Thomas
Thanks,
Catherine
--
Catherine Robson
User Experience Design
Red Hat JBoss Middleware
c: 978-944-3825