[rules-dev] Fwd: Window syntax extensions
Michael Neale
michael.neale at gmail.com
Wed Nov 28 00:06:47 EST 2007
Hi Edson.
My $0.02 AUD:
1. I choose time-window( 30 min )
2. I choose option 2: e.g. TemperatureReading( window-events(60),
window:time(1 min), temp > 50 ) from streamB
3. I choose distinct Address() from $person.address
(I guess lexically speaking, it's similar to exists, not, etc).
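
On item 1, here is a rough sketch of how the parser could treat the window parameters as an opaque String chunk and leave the units to each window handler. Everything in it is an assumption for illustration only: the class WindowParamSketch, the methods split and parseTimeMillis, and the accepted units (sec, min, hours, hh:mm:ss) are hypothetical and not existing Drools code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; none of these names come from the Drools code base. */
final class WindowParamSketch {

    // Handler name, then everything between the outer parentheses as one raw chunk.
    private static final Pattern DECL =
            Pattern.compile("\\s*([\\w:-]+)\\s*\\((.*)\\)\\s*");

    /** Parser side: returns { handlerName, rawParameters } without interpreting units. */
    static String[] split(String declaration) {
        Matcher matcher = DECL.matcher(declaration);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a window declaration: " + declaration);
        }
        return new String[] { matcher.group(1), matcher.group(2).trim() };
    }

    /** Handler side: a time window turns its own raw chunk into milliseconds. */
    static long parseTimeMillis(String raw) {
        String s = raw.trim();
        if (s.contains(":")) {
            // Composite form hh:mm:ss, e.g. 25:10:23
            String[] parts = s.split(":");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected hh:mm:ss but got: " + raw);
            }
            long hours = Long.parseLong(parts[0].trim());
            long minutes = Long.parseLong(parts[1].trim());
            long seconds = Long.parseLong(parts[2].trim());
            return ((hours * 60 + minutes) * 60 + seconds) * 1000L;
        }
        // Simple form "<amount> <unit>", e.g. 30 min
        String[] parts = s.split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected '<amount> <unit>' but got: " + raw);
        }
        long amount = Long.parseLong(parts[0]);
        switch (parts[1]) {
            case "sec":   return amount * 1000L;
            case "min":   return amount * 60_000L;
            case "hours": return amount * 3_600_000L;
            default:
                throw new IllegalArgumentException("Unknown unit: " + parts[1]);
        }
    }

    public static void main(String[] args) {
        String[] decl = split("time-window( 30 min )");
        System.out.println(decl[0] + " -> " + parseTimeMillis(decl[1]) + " ms");
    }
}
```

The point of the split is that the grammar only needs to know about a name and a pair of parentheses; adding a new unit or a new kind of window would then touch a handler rather than the DRL parser.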
On Nov 28, 2007 12:08 PM, Edson Tirelli <tirelli at post.com> wrote:
>
> Resending...
>
> ---------- Forwarded message ----------
>
> Karen and Matthias,
>
> Thanks for the suggestions. A few items that we need to solve before we
> decide on a syntax:
>
> 1. The qualification of the units: we need to be able (as a CEP engine) to
> allow the definition of arbitrary time windows, like seconds, minutes or
> hours or compositions of them. So, a few notations could be used: 30 sec, 30
> min, 30 hours (or variations of that), 25:10:23 (25 hours and 10 minutes and
> 23 seconds), etc. I would really like to avoid hard coding the unit parsing
> in the DRL parser, since the parser is one of the more delicate pieces of
> the code to change. We can avoid that if we use delimiters to declare the
> "parameters" of the window, as the parser can simply read it all as a String
> chunk and let each window handler deal with its own units. So, the
> suggestion is to use () or a similar delimiter to pass the parameters to
> the window handlers, like this:
>
> window:time( 30 min ) or time-window( 30 min ) or another variation of
> that
> window:events( 60 ) or events-window( 60 ) or another variation of that
>
> 2. The smallest "unit" in a rule definition is a Pattern, and right now, a
> Pattern looks like this:
>
> [<var-bind> : ] <Pattern-Type>( [<constraints>] ) [from <source>]
>
> <> : means must be replaced by the corresponding data
> [] : means optional
>
> Now, what we need is a way to declare window constraints for patterns. The
> accumulate example I gave you was not a good one, because accumulates have 2
> or more patterns (the inner patterns and the accumulate result pattern). You
> can even apply different window constraints for each of the patterns.
> So, thinking about a single pattern, how would you define a window for it?
>
> Given the current grammar and the concept of pattern, I see only two
> options: either we define the window constraints inside the pattern
> declaration ( i.e., near regular constraints), or we create another
> keyword to introduce them (like we have the "from" keyword).
>
> option 1: [<var-bind> : ] <Pattern-Type>( [<window-constraints> ,]
> [<constraints>] ) [from <source>]
> option 2: [<var-bind> : ] <Pattern-Type>( [<constraints>] ) [from
> <source>] [with <window-constraints>]
>
> Note: replace the "with" above with the keyword we choose.
>
> Just as a real-world example, if you look at ESPER queries, they do event
> correlation like this:
>
> select *
> from TxnEventA.win:time(30 minutes) A
> join TxnEventC.win:time(60 minutes) C on A.transactionId = C.transactionId
> join TxnEventB.win:time(30 minutes) B on B.transactionId = C.transactionId
> where C.transactionId is null
>
> One example of event aggregation is this:
>
> select min(latencyAC) as minLatencyAC, max(latencyAC) as maxLatencyAC, avg(latencyAC) as avgLatencyAC
> from CombinedEvent.win:time(30 minutes)
>
> So, you see that the time windows are defined for each pattern and they
> may be different among joined patterns.
> Accumulate may have multiple inner patterns, and so, we need to be able to
> define windows for each one.
>
> In our initial suggestions, we defined the window constraints inside
> the pattern, as this does a much better job of avoiding grammar ambiguities.
> The use of square brackets would avoid the need for new keywords,
> but if we decide that it is worth creating at least one keyword
> (like "window" for instance), we can avoid the []:
>
> TemperatureReading( window:time(30 sec), temp > 50 ) from streamA
> TemperatureReading( window:events(60), window:time(1 min), temp > 50 )
> from streamB
>
> Something closer to what you suggested would be (replace "with" with a
> chosen keyword) :
>
> TemperatureReading( temp > 50 ) from streamB with window:events(60),
> window:time(1 min)
>
> My preference is to keep that inside the pattern, but would like to hear
> from you (if the change proposed in item #1 is ok) and from Mark and
> Michael.
>
> 3. Regarding "distinct", "group by" and "order by", we want to support
> them, especially "distinct". From a syntax perspective, I would like to use a
> solution similar to the one we choose for the windows, so that we keep the
> language easy to learn, but if it is clearer to use them like
> "distinct PatternA(...)", I'm ok with that too. Unfortunately, we can't
> allow distinct directly on attributes because, unlike SQL, we need to
> declare all the patterns that make up a tuple. So, instead of saying something
> like:
>
> Person( distinct address )
>
> we need to say something like:
>
> distinct Address() from $person.address
> Address( distinct ) from $person.address
>
> Anyway, item #3 is of minor importance right now.
>
> Sorry for the long e-mail.
>
> Edson
>
> 2007/11/27, kw14 at mail.inf.tu-dresden.de <kw14 at mail.inf.tu-dresden.de>:
> >
> > Hi Mark, hi Edson,
> >
> > in the following you'll find our suggestions regarding Drools syntax
> > extensions for definition of windows and other node restrictions.
> >
> > Edson showed us an example of what you had in mind:
> > Number( avg : doubleValue ) from accumulate( TemperatureReading( [
> > window:events( 30 ), window:time( 60 sec ), distinct ], $temp :
> > temperature ), average( $temp ) )
> >
> > It calculates the average of distinct temperature readings in 60 seconds
> > or over 30 events (whichever comes first).
> >
> > Your notation has the advantage of being short, easy to use and orthogonal
> > to the current grammar.
> >
> > In contrast to the definition using square brackets, we'd prefer a more
> > explicit way of defining windows and other constraints. This increases
> > readability and makes it easier for novices to define rules. You've also
> > used a similar notation in your talk at Synasc 2007.
> >
> > So our version would look like this:
> >
> > Number( avg : doubleValue ) from accumulate within|in|over 60s or 30
> > events( distinct TemperatureReading($temp : temperature ),
> > average( $temp ))
> >
> > It allows using nested 'accumulates' while it's clear to which 'accumulate'
> > a window definition belongs. The usage of "in", "within" or "over" as a
> > keyword is up to your preference. The "distinct" keyword could be used in
> > front of either an object or an attribute definition, similar to
> > "select distinct *" vs. "select distinct fieldname" in SQL. We think it's
> > clearer to have it in front of the object/attribute it refers to.
> > For collect, a definition would look similar:
> >
> > e.g.
> > ArrayList( ) from collect over 60s ( TemperatureReading( orderBy
> > temperature > 0 , distinct id!=null ) )
> >
> > In this case, we also used an "orderBy" keyword similar to distinct.
> > However, it could also be used like a function, as is currently done with
> > aggregation functions in 'accumulate':
> >
> > ArrayList( ) from collect over 60s ( TemperatureReading(
> > $temp:temperature > 0 , distinct id!=null ), orderBy($temp) )
> >
> > For us, orderBy and groupBy only make sense with the set operator
> > 'collect'. They could, however, be used in 'accumulate' if the order of
> > values is relevant to the result of a user-defined aggregation function or
> > if different result sets are created based on a grouping attribute. Are
> > you planning to support this?
> >
> > Window definitions can also make sense for the 'exists' and 'not'
> > operators or a complete rule as already suggested by Edson.
> >
> > An alternative way of describing the windows and other constructs would be
> > the one below. However, it would require additional brackets to show the
> > scope of windows for nested accumulates and was therefore excluded from
> > our considerations.
> >
> > Number( avg : doubleValue ) from accumulate( distinct
> > TemperatureReading($temp : temperature ), average( $temp ) ) over 60s or
> > 30 events
> >
> > Okay, we are aware that these suggestions would require more extensions in
> > the Drools grammar, but we believe the resulting rules are easier to
> > understand than the ones using the square brackets. So, that's open for
> > discussion. :)
> >
> > Cheers, Karen
> >
>
>
> --
> Edson Tirelli
> JBoss Drools Core Development
> Office: +55 11 3529-6000
> Mobile: +55 11 9287-5646
> JBoss, a division of Red Hat @ www.jboss.com
--
Michael D Neale
home: www.michaelneale.net
blog: michaelneale.blogspot.com