Matthias,
I agree with you and I say interval-based semantics is a go. I also
agree on representing point-in-time events as events with 0 duration.
Another question that bothers me is whether you also intend to support
different "consumption modes" like 'recent', 'chronicle' or 'unrestricted'
(see paragraph 4.4. in the paper I sent to you; i.e. page 11 + 12 of the
pdf).
For our use case, 'recent' context is probably best suited - however at
least 'unrestricted' should be supported too (e.g. in sliding windows,
no events can be discarded). What do you think?
My understanding is that the unrestricted mode is the default operation
mode of the engine, and as such, we get it for free. My understanding is
also that you can constrain the matching patterns to "emulate" recent and
chronicle modes. So, my suggestion is that we leave more transparent support
for "recent" and "chronicle" modes to a second phase, i.e., as soon as we
have all the basics working in the engine.
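To make the three modes concrete, here is a minimal sketch for a simple "A followed by B" sequence. It assumes the usual reading of these modes (unrestricted pairs every earlier A with each B, recent uses only the latest A, chronicle pairs the oldest unconsumed A and consumes it); the paper's definitions may differ in detail. The names `ConsumptionMode` and `SequenceMatcher` are made up for illustration and are not part of the engine.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical names; this only illustrates the modes, it is not engine code. */
enum ConsumptionMode { UNRESTRICTED, RECENT, CHRONICLE }

final class SequenceMatcher {
    /**
     * Matches the sequence "A followed by B" over a stream of event names.
     * Events whose name starts with "a" are initiators, all others are terminators.
     */
    static List<String> match(List<String> stream, ConsumptionMode mode) {
        Deque<String> pendingA = new ArrayDeque<>();
        List<String> matches = new ArrayList<>();
        for (String e : stream) {
            if (e.startsWith("a")) {
                pendingA.addLast(e);
                continue;
            }
            if (pendingA.isEmpty()) {
                continue; // a terminator without any initiator matches nothing
            }
            switch (mode) {
                case UNRESTRICTED: // every earlier A pairs with this B, nothing is consumed
                    for (String a : pendingA) {
                        matches.add(a + "+" + e);
                    }
                    break;
                case RECENT: // only the most recent A is used, and it stays available
                    matches.add(pendingA.peekLast() + "+" + e);
                    break;
                case CHRONICLE: // the oldest A is paired and then consumed
                    matches.add(pendingA.pollFirst() + "+" + e);
                    break;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> stream = Arrays.asList("a1", "a2", "b1", "b2");
        for (ConsumptionMode mode : ConsumptionMode.values()) {
            System.out.println(mode + " -> " + match(stream, mode));
        }
    }
}
```

For the stream a1, a2, b1, b2 this gives four matches in unrestricted mode and two each in recent and chronicle mode, which is why unrestricted cannot discard events inside a sliding window.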
Regarding the ability of the engine to work with non-JavaBean facts, it
depends on a feature that is on our "to do list": pluggable extractors. The
idea is that you can configure the engine to use different strategies to
obtain a value from a fact. Example:
Cheese( type == "stilton" )
The above pattern makes the engine use an extractor to read the value
of "type" from the fact. As it is today, the engine uses a "hardcoded"
extractor that knows how to read that attribute value from a JavaBean. What
we need is to implement support in the engine for a different extractor
configured (and eventually provided) by the user. So, whether Cheese is a JMS
message, an XML element, an Ontology instance, or a CSV record in a file
does not matter to the engine, as long as the extractor can "read" that
value and provide it to the engine.
So, not supported yet, but we need that too for our next major release.
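Just to illustrate the idea, a rough sketch in plain Java follows. Since the feature does not exist yet, every name here (`FieldExtractor`, `BeanExtractor`, `MapExtractor`, `ExtractorDemo`) is hypothetical and not engine API, and a `Map` stands in for the properties of a JMS message so that the example stays self-contained.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical strategy interface: how to read a named field from a fact. */
interface FieldExtractor {
    Object read(Object fact, String field);
}

/** Reads a field through a JavaBean getter, e.g. "type" -> getType(). */
final class BeanExtractor implements FieldExtractor {
    public Object read(Object fact, String field) {
        String getter = "get" + Character.toUpperCase(field.charAt(0)) + field.substring(1);
        try {
            Method m = fact.getClass().getMethod(getter);
            return m.invoke(fact);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot read field " + field, e);
        }
    }
}

/** Reads a field from a key/value fact; the Map stands in for a JMS message. */
final class MapExtractor implements FieldExtractor {
    public Object read(Object fact, String field) {
        return ((Map<?, ?>) fact).get(field);
    }
}

/** Illustrative JavaBean fact. */
class Cheese {
    private final String type;

    Cheese(String type) { this.type = type; }

    public String getType() { return type; }
}

final class ExtractorDemo {
    /** Evaluates "field == expected" with whichever extractor is plugged in. */
    static boolean matches(FieldExtractor extractor, Object fact, String field, Object expected) {
        return expected.equals(extractor.read(fact, field));
    }

    public static void main(String[] args) {
        Map<String, Object> message = new HashMap<>();
        message.put("type", "stilton");

        // The same constraint, evaluated against two different fact representations.
        System.out.println(matches(new BeanExtractor(), new Cheese("stilton"), "type", "stilton"));
        System.out.println(matches(new MapExtractor(), message, "type", "stilton"));
    }
}
```

The point of the design is that the constraint evaluation never looks at the fact type itself; it only talks to the extractor.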
[]s
Edson
2007/11/13, Groch, Matthias <matthias.groch(a)sap.com>:
Looking at it, it seems quite feasible to implement the operators and
time-windows necessary to correlate events using this semantics. The only
thing I still don't know is related to a very practical (in the sense that
it is opposed to theory) question:
* A method call in a programming language is an "atomic", point-in-time,
operation. So, inserting an event into the engine, is also a point-in-time
operation. Interval-based semantics presumes that an event has a duration,
and a duration is only available "after" an event has finished. So, since we
need a synchronized clock to allow appropriate reasoning and time-window
management in the engine, how do we implement support for that? Do you
understand my question? We can't allow the engine clock to move backwards,
and we will not be able to wait eternally for events that may never arrive,
so how do we map one semantics into the other?
Will we need to support point-in-time semantics for simple/atomic
events, and interval-based semantics only for complex events? What are your
ideas about this?
IMO, the easiest way to unify point- and interval-based
representation would be to choose interval-based semantics, and
then represent atomic events by means of an interval of duration 0, i.e.
start and end time of the interval are equal. This way we would avoid having
two different semantics, everything could be handled using intervals and the
13 relations defined by Allen.
However, for representing primitive events point-based semantics and the
three relations (<, >, =) are sufficient. Problems start when setting up
relations between primitive and complex events. One solution would be to
introduce another 5 relations between points and intervals (like it is done
in the QA approach). However, for reasons of simplicity, I'd prefer to see
the point as an interval of length 0 just for this relation, so we can use
the relations defined on interval-based semantics.
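As a minimal sketch of this unification: one interval type, with an atomic event created as an interval whose start and end are equal, and a single classifier over the 13 Allen relations. The names `Span` and `AllenRelation` are made up for illustration. One assumption to note: endpoint equality is checked first, so a point lying on an interval's boundary is reported as STARTS or FINISHES rather than MEETS or MET_BY.

```java
/** The 13 interval relations defined by Allen. */
enum AllenRelation {
    BEFORE, MEETS, OVERLAPS, STARTS, DURING, FINISHES, EQUALS,
    FINISHED_BY, CONTAINS, STARTED_BY, OVERLAPPED_BY, MET_BY, AFTER
}

/** Hypothetical event time span; a point-in-time event has start == end. */
final class Span {
    final long start;
    final long end;

    Span(long start, long end) {
        if (end < start) {
            throw new IllegalArgumentException("end before start");
        }
        this.start = start;
        this.end = end;
    }

    /** An atomic event: an interval of duration 0. */
    static Span point(long t) {
        return new Span(t, t);
    }

    /**
     * Classifies how this span relates to the other one. Shared endpoints are
     * tested first, which decides the otherwise ambiguous zero-length cases.
     */
    AllenRelation relate(Span o) {
        if (start == o.start && end == o.end) return AllenRelation.EQUALS;
        if (start == o.start) return end < o.end ? AllenRelation.STARTS : AllenRelation.STARTED_BY;
        if (end == o.end) return start > o.start ? AllenRelation.FINISHES : AllenRelation.FINISHED_BY;
        if (end < o.start) return AllenRelation.BEFORE;
        if (end == o.start) return AllenRelation.MEETS;
        if (o.end < start) return AllenRelation.AFTER;
        if (o.end == start) return AllenRelation.MET_BY;
        if (start < o.start) return end < o.end ? AllenRelation.OVERLAPS : AllenRelation.CONTAINS;
        return end < o.end ? AllenRelation.DURING : AllenRelation.OVERLAPPED_BY;
    }

    public static void main(String[] args) {
        Span complex = new Span(5, 10);
        System.out.println(Span.point(3).relate(complex));        // BEFORE
        System.out.println(Span.point(7).relate(complex));        // DURING
        System.out.println(Span.point(3).relate(Span.point(3)));  // EQUALS
        System.out.println(new Span(1, 6).relate(complex));       // OVERLAPS
    }
}
```

With this ordering of checks, two points can only relate as BEFORE, AFTER or EQUALS, and a point against a proper interval can only be BEFORE, STARTS, DURING, FINISHES or AFTER, so no separate point relations are needed.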
As pointed out by the QA guys, with interval-based semantics you can
unambiguously define qualitative relations. However, defining quantitative
relations (i.e. durations) is not straightforward. For instance, let's
say you want to check whether one composite event e2 occurs within 2 minutes
after a first one e1 occurred. There are several ways to do it, to be precise 4
(plus mixed solutions). You could compare either the start points of both
composite events, or the end points, or the start point of e1 with the end
point of e2, or the end point of e1 with the start point of e2 (or some mid
point inside both of the intervals). All of the options might be meaningful
(depending on the use case), so probably one should offer support for all of
them. Nevertheless it probably makes sense to have a default implementation,
which in our opinion is to look at the end point of e1 and the
start point of e2.
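A small sketch of the four options, with the end of e1 to the start of e2 as the default. All names (`DistanceMode`, `CompositeEvent`, `TemporalDistance`) are hypothetical, and timestamps are assumed to be milliseconds.

```java
/** The four ways to measure the distance between two composite events. */
enum DistanceMode { START_TO_START, END_TO_END, START_TO_END, END_TO_START }

/** Hypothetical composite event with start and end timestamps in milliseconds. */
final class CompositeEvent {
    final long start;
    final long end;

    CompositeEvent(long start, long end) {
        this.start = start;
        this.end = end;
    }
}

final class TemporalDistance {
    /** Proposed default: end point of e1 to start point of e2. */
    static final DistanceMode DEFAULT_MODE = DistanceMode.END_TO_START;

    /** Distance from e1 to e2 under the given mode; negative if e2's point comes first. */
    static long distance(CompositeEvent e1, CompositeEvent e2, DistanceMode mode) {
        switch (mode) {
            case START_TO_START: return e2.start - e1.start;
            case END_TO_END:     return e2.end - e1.end;
            case START_TO_END:   return e2.end - e1.start;
            default:             return e2.start - e1.end; // END_TO_START
        }
    }

    /** True if e2 occurs no earlier than e1 and at most limitMillis after it. */
    static boolean occursWithin(CompositeEvent e1, CompositeEvent e2, long limitMillis, DistanceMode mode) {
        long d = distance(e1, e2, mode);
        return d >= 0 && d <= limitMillis;
    }

    /** Same check using the default mode. */
    static boolean occursWithin(CompositeEvent e1, CompositeEvent e2, long limitMillis) {
        return occursWithin(e1, e2, limitMillis, DEFAULT_MODE);
    }

    public static void main(String[] args) {
        CompositeEvent e1 = new CompositeEvent(0, 60_000);
        CompositeEvent e2 = new CompositeEvent(90_000, 200_000);
        long twoMinutes = 120_000;
        for (DistanceMode mode : DistanceMode.values()) {
            System.out.println(mode + ": " + occursWithin(e1, e2, twoMinutes, mode));
        }
    }
}
```

In this example the same pair of events satisfies "within 2 minutes" under START_TO_START and END_TO_START but not under the other two modes, which shows why the choice has to be explicit.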
For time-window management, in our opinion the most suitable
interpretation is to only include (composite) events which have
terminated within the given boundaries. I agree that including events
already at the time they start is error-prone since there's no guarantee
that they eventually terminate (and when). Including events only if start
and end point are within the boundaries of the window bears the risk of
"losing" events starting in one window but ending in another one.
Although it's a nice assumption that events arrive in the right (i.e.
chronological) order, there should be support for delayed events. There should
be a short delay (e.g. 1 min) in which late events are still accepted and
inserted at the right point. That also means that rules which have fired
already might have to be cancelled and re-executed according to the new
arrival order. In other words, reacting to incoming events should occur
immediately; however, if late events arrive, some reevaluation must be done
(but only in a certain interval; in this example, the system state older
than 1 minute is untouchable). In order not to have to wait forever for
events (which may be delayed), one should define an upper bound after which
no more events are accepted or old events can be discarded. ILOG JRules
uses two qualifiers for that: ARRIVAL_DELAY and MATCHING_HORIZON (see
http://www.ilog.com/products/jrules/documentation/jrules66/rsoptimize/rs_...).
I think we could do it in a similar way.
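A rough sketch of such a bounded acceptance of late events follows. It is only an illustration of the idea described above and does not claim to reproduce the JRules qualifiers; `LateEventBuffer`, `Outcome` and `maxDelayMillis` are made-up names, and events are reduced to their timestamps.

```java
import java.util.ArrayList;
import java.util.List;

/** What happened to an offered event (hypothetical names). */
enum Outcome { IN_ORDER, LATE_ACCEPTED, REJECTED }

/** Keeps event timestamps ordered and accepts late events only up to a maximum delay. */
final class LateEventBuffer {
    private final long maxDelayMillis;
    private final List<Long> timeline = new ArrayList<>();
    private long clock;
    private boolean started;

    LateEventBuffer(long maxDelayMillis) {
        this.maxDelayMillis = maxDelayMillis;
    }

    /**
     * Offers an event that occurred at eventTime and arrived at arrivalTime.
     * The clock never moves backwards. Events older than clock - maxDelayMillis
     * are rejected; other late events are inserted at their chronological position.
     */
    Outcome offer(long eventTime, long arrivalTime) {
        clock = started ? Math.max(clock, arrivalTime) : arrivalTime;
        started = true;
        if (eventTime < clock - maxDelayMillis) {
            return Outcome.REJECTED; // state older than the delay is untouchable
        }
        int pos = timeline.size();
        while (pos > 0 && timeline.get(pos - 1) > eventTime) {
            pos--; // walk back to the chronological insertion point
        }
        boolean late = pos < timeline.size();
        timeline.add(pos, eventTime);
        // LATE_ACCEPTED signals that rules over the affected range need reevaluation.
        return late ? Outcome.LATE_ACCEPTED : Outcome.IN_ORDER;
    }

    /** Returns a copy of the ordered event timestamps. */
    List<Long> timeline() {
        return new ArrayList<>(timeline);
    }

    public static void main(String[] args) {
        LateEventBuffer buffer = new LateEventBuffer(60_000);
        System.out.println(buffer.offer(1_000, 1_000));
        System.out.println(buffer.offer(5_000, 5_000));
        System.out.println(buffer.offer(3_000, 6_000));
        System.out.println(buffer.offer(2_000, 100_000));
        System.out.println(buffer.timeline());
    }
}
```

A caller would treat LATE_ACCEPTED as the trigger for cancelling and re-executing rules within the affected range, and REJECTED as the cut-off beyond which nothing is reconsidered.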
Another question that bothers me is whether you also intend to support
different "consumption modes" like 'recent', 'chronicle' or
'unrestricted' (see paragraph 4.4. in the paper I sent to you; i.e. page
11 + 12 of the pdf). For our use case, 'recent' context is probably best
suited - however at least 'unrestricted' should be supported too (e.g.
in sliding windows, no events can be discarded). What do you think?
One last question (which is completely unrelated to all above): In a
listener within my sample application, I receive events as JMS messages
(API: http://java.sun.com/j2ee/sdk_1.3/techdocs/api/javax/jms/Message.html)
containing a name, a parent id, a timestamp and several additional (unknown)
parameters. As far as I know, Drools only supports Java classes relying on
the JavaBean standard of getters and setters without any parameters. I have 2
problems: First of all, the attributes and the corresponding values are
saved in a map inside the JMS message. You access a certain field of a JMS
message by specifying the field name as a parameter, e.g.
myMessage.getStringProperty(java.lang.String name), something which
apparently is not allowed in Drools (or is there a workaround?). The second
problem is that there could be different events which, besides the standard
attributes, have different attributes. In other words, at compile time I
don't know which attributes an event instance will have, which means I
cannot write a simple standard 'mapping function'. So what I currently do is
to "unwrap" the JMS message, extract all parameters and create a new myEvent
object with well-defined getters and setters. This myEvent instance is then
added to the working memory. When sending a response (which is another
event) back to my publish-subscribe-system, I have to go the opposite way,
i.e. wrapping all the information from a myEvent instance in a JMS
message. What would be nice is to directly process JMS messages, without all
the (un)wrapping. Is there a way to do that in Drools?
Cheers,
Matthias
------------------------------
*From:* ed.tirelli(a)gmail.com [mailto:ed.tirelli@gmail.com] *On Behalf Of *Edson Tirelli
*Sent:* Tuesday, November 13, 2007 1:21 AM
*To:* Groch, Matthias
*Cc:* Walzer, Karen; Mark Proctor
*Subject:* Re: Interval-based vs. Time-based semantics
Matthias,
Interesting paper! I didn't know it before. I now understand the
additional declarative power that the interval-based semantics add to the
reasoning!
Looking at it, it seems quite feasible to implement the operators and
time-windows necessary to correlate events using this semantics. The only
thing I still don't know is related to a very practical (in the sense that
it is opposed to theory) question:
* A method call in a programming language is an "atomic", point-in-time,
operation. So, inserting an event into the engine, is also a point-in-time
operation. Interval-based semantics presumes that an event has a duration,
and a duration is only available "after" an event has finished. So, since we
need a synchronized clock to allow appropriate reasoning and time-window
management in the engine, how do we implement support for that? Do you
understand my question? We can't allow the engine clock to move backwards,
and we will not be able to wait eternally for events that may never arrive,
so how do we map one semantics into the other?
Will we need to support point-in-time semantics for simple/atomic
events, and interval-based semantics only for complex events? What are your
ideas about this?
Regards,
Edson
PS: what do you think about moving our conversation to the dev list? is
there any problem for you, your thesis or your job if we get a wider
audience?
2007/11/12, Groch, Matthias <matthias.groch(a)sap.com>:
>
> <<Unified Semantics for Event Correlation Over Time and Space in Hybrid
> Network Environments.pdf>> Edson,
>
> I'm still thinking about how to represent time. In the attached paper,
> there's a counter-example proving that point-based semantics aren't
> appropriate in all cases. You can find it in paragraph 4.3, on page
> 11 of the PDF (= page 375 of the original conference proceedings).
> For this reason, I'm in favor of using interval-based semantics instead
> of point-based semantics. Meanwhile, I stumbled across another approach
> combining the aforementioned. It's called Qualitative Algebra (QA). Have
> you ever heard of that? Meiri introduced it
> (http://citeseer.ist.psu.edu/715273.html), and it was extended by Barber
> (http://citeseer.ist.psu.edu/barber00reasoning.html). It's quite
> interesting since we then could use point-based semantics for primitive
> events and interval-based semantics for complex events, and still be
> able to define relations between them. However, I have the feeling it's
> not very well-examined yet; moreover it would introduce additional
> relations and therefore make things too complex (interval-based
> semantics already deal with 13 relations; but as pointed out, we
> probably cannot get around it)...
> What do you think?
>
> Matthias
>
>
--
Edson Tirelli
Software Engineer - JBoss Rules Core Developer
Office: +55 11 3529-6000
Mobile: +55 11 9287-5646
JBoss, a division of Red Hat @
www.jboss.com