[rules-dev] Fwd: Window syntax extensions

Mark Proctor mproctor at codehaus.org
Wed Nov 28 06:25:15 EST 2007


Michael Neale wrote:
> Hi Edson.
>
> My $0.02 AUD:
>
> 1. I choose time-window( 30 min )
We have two different types of windows: time and length. The above 
means each window type would be a fixed, separate keyword, whereas "win" 
followed by a sub-unit would indicate the type of window. "-" does not 
really indicate a sub-unit, whereas ":" and "." do.
> 2. I choose option 2: eg TemperatureReading( window-events(60), 
> window:time(1 min), temp > 50 ) from streamB
I think it would be a mistake to mix keywords and field constraints so 
freely in an almost identical syntax. I think there needs to be a clearer 
distinction between what is a behaviour attribute on a pattern and what 
is a field constraint. Behaviour attributes work on sets/groups of 
objects; field constraints work on individual objects.
> 3. I choose distinct Address() from $person.address
> (I guess lexically speaking, it's similar to exists, not, etc).
We didn't want to make distinct a keyword, as there may be other user 
implementations. For this reason we wanted it to be a text glob the 
parser can suck in, from which the factory creates the required 
implementation. This is part of the reason for the need for delimiters; 
this extra pluggability/flexibility of course has a cost in removing the 
English-like qualities.

The other option is to have a fixed and non-user-extensible grammar. We 
are still not sure of the likelihood of alternatives to distinct, 
although I can think of one other: "last". In banking, if you have a 
stream of data coming in, you want the last entry for a given ticker type. 
So distinct, last, etc. are simply ways to massage groups of values for 
that window.
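
To make the "text glob plus factory" idea concrete, here is a minimal
sketch in Python rather than in the Drools code base. All names in it
(register, create, the "distinct" and "last" builders) are hypothetical
and chosen for illustration only; the parser side is assumed to hand
over the raw chunk untouched.

```python
from typing import Callable, Dict, List

# A behaviour massages the group of events currently held by a window.
Behaviour = Callable[[List[dict]], List[dict]]

# Hypothetical registry: behaviour name -> builder taking the raw argument text.
_REGISTRY: Dict[str, Callable[[str], Behaviour]] = {}


def register(name: str, builder: Callable[[str], Behaviour]) -> None:
    _REGISTRY[name] = builder


def create(chunk: str) -> Behaviour:
    """Turn a raw chunk such as 'last( ticker )' into a behaviour.

    The chunk is only split into a name and an opaque argument string;
    interpreting the argument is left to the registered builder.
    """
    name, _, rest = chunk.strip().partition("(")
    name = name.strip()
    arg = rest.rstrip(")").strip()
    if name not in _REGISTRY:
        raise ValueError(f"unknown behaviour: {name!r}")
    return _REGISTRY[name](arg)


def _distinct(field: str) -> Behaviour:
    def apply(events: List[dict]) -> List[dict]:
        seen, kept = set(), []
        for event in events:
            if event[field] not in seen:  # keep the first event per value
                seen.add(event[field])
                kept.append(event)
        return kept
    return apply


def _last(field: str) -> Behaviour:
    def apply(events: List[dict]) -> List[dict]:
        latest = {}
        for event in events:
            latest[event[field]] = event  # later events overwrite earlier ones
        return list(latest.values())
    return apply


register("distinct", _distinct)
register("last", _last)

ticks = [
    {"ticker": "RHT", "price": 10},
    {"ticker": "IBM", "price": 100},
    {"ticker": "RHT", "price": 12},
]
last_per_ticker = create("last( ticker )")(ticks)
first_per_ticker = create("distinct( ticker )")(ticks)
```

A user-supplied behaviour would be added with another register call,
which is the pluggability the delimiters are meant to buy.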
>
> On Nov 28, 2007 12:08 PM, Edson Tirelli <tirelli at post.com 
> <mailto:tirelli at post.com>> wrote:
>
>
>        Resending...
>
>     ---------- Forwarded message ----------
>
>        Karen and Matthias,
>
>        Thanks for the suggestions. A few items that we need to solve
>     before we decide on a syntax:
>
>     1. The qualification of the units: we need to be able (as a CEP
>     engine) to allow the definition of arbitrary time windows, like
>     seconds, minutes or hours or compositions of them. So, a few
>     notations could be used: 30 sec, 30 min, 30 hours (or variations
>     of that), 25:10:23 (25 hours and 10 minutes and 23 seconds), etc.
>     I would really like to avoid hard coding the unit parsing in the
>     DRL parser, since the parser is one of the more delicate pieces of
>     the code to change. We can avoid that if we use delimiters to
>     declare the "parameters" of the window, as the parser can
>     simply read it all as a String chunk and let each window handler
>     deal with its own units. So, the suggestion is to use () or
>     something like that to provide the parameters to the window
>     handlers, in a way like that:
>
>     window:time( 30 min ) or time-window( 30 min ) or another
>     variation of that
>     window:events( 60 ) or events-window( 60 ) or another variation of
>     that
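
As an illustration of a window handler dealing with its own units, here
is a small Python sketch. It assumes the parser passes the text between
the parentheses as an uninterpreted string. The function names and the
accepted unit spellings are assumptions for this example, not part of
any Drools API.

```python
import re
from datetime import timedelta

# Assumed unit spellings; a real handler could accept more.
_UNITS = {
    "s": "seconds",
    "sec": "seconds",
    "min": "minutes",
    "minutes": "minutes",
    "hours": "hours",
}


def parse_time_window(raw: str) -> timedelta:
    """Interpret the raw parameter text of a time window.

    Accepts '<number> <unit>' (e.g. '30 min') or 'hh:mm:ss' (e.g. '25:10:23').
    """
    text = raw.strip()
    clock = re.fullmatch(r"(\d+):(\d+):(\d+)", text)
    if clock:
        hours, minutes, seconds = (int(part) for part in clock.groups())
        return timedelta(hours=hours, minutes=minutes, seconds=seconds)
    simple = re.fullmatch(r"(\d+)\s*([a-z]+)", text)
    if simple and simple.group(2) in _UNITS:
        return timedelta(**{_UNITS[simple.group(2)]: int(simple.group(1))})
    raise ValueError(f"cannot parse time window parameter: {raw!r}")


def parse_events_window(raw: str) -> int:
    """Interpret the raw parameter text of a length (events) window."""
    return int(raw.strip())


half_hour = parse_time_window(" 30 min ")
composite = parse_time_window("25:10:23")
sixty_events = parse_events_window(" 60 ")
```

The DRL parser itself would never see "min" or "25:10:23" as tokens, so
new unit notations would only touch the handler.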
>
>     2. The smallest "unit" in a rule definition is a Pattern, and
>     right now, a Pattern looks like this:
>
>     [<var-bind> : ] <Pattern-Type>( [<constraints>] ) [from <source>]
>
>     <> : means must be replaced by the corresponding data
>     [] : means optional
>
>     Now, what we need is a way to declare window constraints for
>     patterns. The accumulate example I gave you was not a good one,
>     because accumulates have 2 or more patterns (the inner patterns
>     and the accumulate result pattern). You can even apply different
>     window constraints for each of the patterns.
>     So, thinking about a single pattern, how would you define a window
>     for it?
>
>     Given the current grammar and the concept of pattern, I see only
>     two options: either we define the window constraints inside the
>     pattern declaration ( i.e., near regular constraints), or we
>     create another keyword to introduce them (like we have the "from"
>     keyword).
>
>     option 1: [<var-bind> : ] <Pattern-Type>( [<window-constraints> ,]
>     [<constraints>] ) [from <source>]
>     option 2: [<var-bind> : ] <Pattern-Type>( [<constraints>] ) [from
>     <source>] [with <window-constraints>]
>
>     Note: replace the "with" above with the keyword we chose.
>
>     Just as a real world example, if you look at ESPER queries, they
>     do events correlation like this:
>
>     select *
>       from TxnEventA.win:time(30 minutes) A
>            join TxnEventC.win:time(60 minutes) C on A.transactionId = C.transactionId
>            join TxnEventB.win:time(30 minutes) B on B.transactionId = C.transactionId
>      where C.transactionId is null
>
>
>     One example of event aggregation is this:
>
>     select min(latencyAC) as minLatencyAC, max(latencyAC) as maxLatencyAC, avg(latencyAC) as avgLatencyAC
>       from CombinedEvent.win:time(30 minutes)
>
>
>     So, you see that the time windows are defined for each pattern and
>     they may be different among joined patterns.
>     Accumulate may have multiple inner patterns, and so, we need to be
>     able to define windows for each one.
>
>     On our initial suggestions, we would define the window constraints
>     inside the pattern, as this is a lot better to avoid grammar
>     ambiguities. The use of square brackets would help to avoid the
>     need of creating new keywords, but as long as we decide that it is
>     worth to create at least one keyword (like "window" for instance),
>     we can avoid the []:
>
>     TemperatureReading( window:time(30 sec), temp > 50 ) from streamA
>     TemperatureReading( window:events(60), window:time(1 min), temp >
>     50 ) from streamB
>
>     Something closer to what you suggested would be (replace "with"
>     with a chosen keyword) :
>
>     TemperatureReading( temp > 50 ) from streamB with
>     window:events(60), window:time(1 min)
>
>     My preference is to keep that inside the pattern, but would like
>     to hear from you (if the change proposed in item #1 is ok) and
>     from Mark and Michael.
>
>     3. Regarding "distinct", "group by" and "order by", we want to
>     support that, especially "distinct". From a syntax perspective, I
>     would like to use a similar solution as we chose for the windows,
>     so that we can keep the language easier to learn, but if it is
>     best for clarity to use them like "distinct PatternA(...)", I'm ok
>     with that too. Unfortunately, we can't allow distinct directly for
>     attributes, as different from SQL, we need to declare all the
>     patterns that make a tuple. So, instead of saying something like:
>
>     Person( distinct address )
>
>     we need to say something like:
>
>     distinct Address() from $person.address
>     Address( distinct ) from $person.address
>
>     Anyway, item #3 is of minor importance right now.
>
>        Sorry for the long e-mail.
>
>         Edson
>
>     2007/11/27, kw14 at mail.inf.tu-dresden.de
>     <mailto:kw14 at mail.inf.tu-dresden.de> <kw14 at mail.inf.tu-dresden.de
>     <mailto:kw14 at mail.inf.tu-dresden.de>>:
>
>         Hi Mark, hi Edson,
>
>         in the following you'll find our suggestions regarding Drools
>         syntax
>         extensions for definition of windows and other node restrictions.
>
>         Edson showed us an example of what you had in mind:
>         Number( avg : doubleValue ) from accumulate( TemperatureReading( [
>         window:events( 30 ), window:time( 60 sec ), distinct ], $temp :
>         temperature ), average( $temp ) )
>
>         It calculates the average of distinct temperature readings in
>         60 seconds
>         or over 30 events (whatever comes first).
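
One possible reading of "60 seconds or 30 events, whichever comes
first" is that an event stays in the window only while it is both among
the last 30 events and no older than 60 seconds. The Python sketch
below follows that reading. The class name, its methods and the
assumption that timestamps arrive in non-decreasing order are all
illustrative and not taken from the thread.

```python
from collections import deque


class SlidingWindow:
    """Keeps events that satisfy both a length limit and an age limit."""

    def __init__(self, max_events: int, max_age_sec: float):
        self.max_events = max_events
        self.max_age_sec = max_age_sec
        self._events = deque()  # (timestamp_sec, value), oldest first

    def add(self, timestamp_sec: float, value: float) -> None:
        # Assumes timestamps are non-decreasing.
        self._events.append((timestamp_sec, value))
        # Length window: drop the oldest events beyond the limit.
        while len(self._events) > self.max_events:
            self._events.popleft()
        # Time window: drop events older than the age limit.
        while self._events and timestamp_sec - self._events[0][0] > self.max_age_sec:
            self._events.popleft()

    def distinct_average(self):
        """Average over the distinct values currently in the window."""
        values = {value for _, value in self._events}
        return sum(values) / len(values) if values else None


window = SlidingWindow(max_events=3, max_age_sec=60)
window.add(0, 50.0)
window.add(10, 50.0)
window.add(20, 70.0)
first_avg = window.distinct_average()   # distinct values: 50.0, 70.0

window.add(30, 90.0)                    # length limit evicts the event at t=0
second_avg = window.distinct_average()  # distinct values: 50.0, 70.0, 90.0

window.add(100, 10.0)                   # limits evict everything before t=100
third_avg = window.distinct_average()
```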
>
>         Your notation has the advantage of being short, easy to use
>         and orthogonal
>         to the current grammar.
>
>         In contrast to the definition using square brackets, we'd
>         prefer a more
>         explicit way of defining windows and other constraints. This
>         increases
>         readability and makes it easier for novices to define rules.
>         You've also
>         used a similar notation in your talk at Synasc 2007.
>
>         So our version would look like this:
>
>         Number( avg : doubleValue ) from accumulate within|in|over 60s
>         or 30
>         events( distinct TemperatureReading($temp : temperature ),
>         average( $temp
>         ))
>
>         It allows using nested 'accumulates' while it's clear to which
>         'accumulate'
>         a window definition belongs. The usage of  "in", "within" or
>         "over" as a
>         keyword is up to your preference. The "distinct" keyword could
>         be used in
>         front of either an object or an attribute definition similar
>         to
>         "select distinct *" vs. "select distinct fieldname" in SQL. We think
>         it's clearer
>         to have it in front of the object/attribute it refers to.
>         For collect, a definition would look similar:
>
>         e.g .
>         ArrayList( ) from collect over 60s ( TemperatureReading( orderBy
>         temperature > 0 , distinct id!=null ) )
>
>         In this case, we also used an "orderBy" keyword similar to
>         distinct.
>         However, it could also be used like a function as is currently
>         done with
>         aggregation functions in 'accumulate':
>
>         ArrayList( ) from collect over 60s ( TemperatureReading(
>         $temp:temperature
>         > 0 , distinct id!=null ), orderBy($temp) )
>
>         For us, orderBy and groupBy only make sense with the set operator
>         'collect'. Although, they could be used in 'accumulate', if
>         the order of
>         values is relevant to the result of a user-defined aggregation
>         function or
>         if different result sets are created based on a grouping
>         attribute. Are
>         you planning to support this?
>
>         Window definitions can also make sense for the 'exists' and 'not'
>         operators or a complete rule as already suggested by Edson.
>
>         An alternative way of describing the windows and other
>         constructs would be
>         the one below. However, it would require additional brackets
>         to show the
>         scope of windows for nested accumulates and was therefore
>         excluded from
>         our considerations.
>
>         Number( avg : doubleValue ) from accumulate( distinct
>         TemperatureReading($temp : temperature ), average( $temp ) )
>         over 60s or
>         30 events
>
>         Okay, we are aware that these suggestions would require more
>         extensions in
>         the Drools grammar, but we believe the resulting rules are
>         easier to
>         understand than the ones using the square brackets. So, that's
>         open for
>         discussion. :)
>
>         Cheers, Karen
>
>
>
>
>
>
>
>     -- 
>       Edson Tirelli
>       JBoss Drools Core Development
>       Office: +55 11 3529-6000
>       Mobile: +55 11 9287-5646
>       JBoss, a division of Red Hat @ www.jboss.com <http://www.jboss.com>
>
>     _______________________________________________
>     rules-dev mailing list
>     rules-dev at lists.jboss.org <mailto:rules-dev at lists.jboss.org>
>     https://lists.jboss.org/mailman/listinfo/rules-dev
>
>
>
>
> -- 
> Michael D Neale
> home: www.michaelneale.net <http://www.michaelneale.net>
> blog: michaelneale.blogspot.com <http://michaelneale.blogspot.com>
> ------------------------------------------------------------------------
>
> _______________________________________________
> rules-dev mailing list
> rules-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/rules-dev
>   
