[rules-dev] Fwd: Window syntax extensions
Mark Proctor
mproctor at codehaus.org
Wed Nov 28 06:25:15 EST 2007
Michael Neale wrote:
> Hi Edson.
>
> My $0.02 AUD:
>
> 1. I choose time-window( 30 min )
We have two different types of windows: time and length. The syntax above
means each window type would need its own fixed keyword, whereas "win"
followed by a sub-unit would indicate the type of window. "-" does not
really indicate a sub-unit, whereas ":" and "." do.
> 2. I choose option 2: eg TemperatureReading( window-events(60),
> window:time(1 min), temp > 50 ) from streamB
I think it would be a mistake to mix keywords and field constraints so
freely in nearly identical syntax. I think there needs to be a clearer
distinction between what is a behaviour attribute on a pattern and what
is a field constraint. Behaviour attributes work on sets/groups of
objects; field constraints work on individual objects.
> 3. I choose distinct Address() from $person.address
> (I guess lexically speaking, it's similar to exists, not, etc).
We didn't want to make distinct a keyword, as there may be other user
implementations. For this reason we wanted it to be a text glob that the
parser can suck in, with the factory creating the required implementation.
This is part of the reason for the need for delimiters; this extra
pluggability/flexibility of course has a cost in removing the English-like
qualities.
The other option is a fixed, non-user-extensible grammar. We
are still not sure how likely alternatives to distinct are,
although I can think of one other: "last". In banking, if you have a
stream of data coming in, you want the last entry for a given ticker type.
So distinct, last, etc. are simply ways to massage groups of values for
that window.
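
To make the glob-plus-factory idea concrete, here is a rough Java sketch.
All names in it (Behaviour, BehaviourFactory, Behaviours) are hypothetical
and are not existing Drools API. It assumes the parser hands over the raw
text, such as "distinct" or "last()", and that a registry picks the
implementation by name. For brevity, "last" here keeps only the final value
of the whole window rather than the last entry per ticker.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical interface: a behaviour massages the group of values in a window.
interface Behaviour<T> {
    List<T> apply(List<T> window);
}

// Hypothetical factory: maps the name found in the raw text to a builder.
final class BehaviourFactory<T> {
    private final Map<String, Function<String, Behaviour<T>>> builders = new HashMap<>();

    // Users plug in their own implementations under a name of their choice.
    void register(String name, Function<String, Behaviour<T>> builder) {
        builders.put(name, builder);
    }

    // glob is the unparsed text from the rule, e.g. "distinct" or "last()".
    Behaviour<T> create(String glob) {
        String text = glob.trim();
        int open = text.indexOf('(');
        String name = text;
        String params = "";
        if (open >= 0) {
            int close = text.lastIndexOf(')');
            if (close < open) {
                throw new IllegalArgumentException("Unbalanced parentheses: " + glob);
            }
            name = text.substring(0, open).trim();
            // Everything between the delimiters is passed on untouched.
            params = text.substring(open + 1, close).trim();
        }
        Function<String, Behaviour<T>> builder = builders.get(name);
        if (builder == null) {
            throw new IllegalArgumentException("Unknown behaviour: " + name);
        }
        return builder.apply(params);
    }
}

// Two illustrative behaviours; both ignore their parameters.
final class Behaviours {
    private Behaviours() {}

    // Keeps the first occurrence of each value, preserving arrival order.
    static <T> Behaviour<T> distinct() {
        return window -> new ArrayList<T>(new LinkedHashSet<T>(window));
    }

    // Keeps only the most recent value in the window, if any.
    static <T> Behaviour<T> last() {
        return window -> window.isEmpty()
                ? new ArrayList<T>()
                : new ArrayList<T>(window.subList(window.size() - 1, window.size()));
    }
}
```

With something like this, the grammar only needs to recognise the
delimiters; adding a new behaviour is a register() call rather than a
parser change.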
>
> On Nov 28, 2007 12:08 PM, Edson Tirelli <tirelli at post.com
> <mailto:tirelli at post.com>> wrote:
>
>
> Resending...
>
> ---------- Forwarded message ----------
>
> Karen and Matthias,
>
> Thanks for the suggestions. A few items that we need to solve
> before we decide on a syntax:
>
> 1. The qualification of the units: we need to be able (as a CEP
> engine) to allow the definition of arbitrary time windows, like
> seconds, minutes or hours or compositions of them. So, a few
> notations could be used: 30 sec, 30 min, 30 hours (or variations
> of that), 25:10:23 (25 hours and 10 minutes and 23 seconds), etc.
> I would really like to avoid hard coding the unit parsing in the
> DRL parser, since the parser is one of the more delicate pieces of
> the code to change. We can avoid that if we use delimiters to
> declare the "parameters" of the window, as the parser can
> simply read it all as a String chunk and let each window handler
> deal with its own units. So, the suggestion is to use () or
> something like that to provide the parameters to the window
> handlers, in a way like that:
>
> window:time( 30 min ) or time-window( 30 min ) or another
> variation of that
> window:events( 60 ) or events-window( 60 ) or another variation of
> that
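
As an illustration of item 1, a window handler could interpret the String
chunk on its own, along these lines. This is only a sketch under
assumptions: the class name TimeWindowParams and the accepted unit
spellings are made up for the example, and it does not range-check the
minutes and seconds in the hh:mm:ss form.

```java
import java.time.Duration;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns the raw text between the delimiters into a Duration.
final class TimeWindowParams {
    // Matches "30 min", "30sec", "60s", "30 hours", ...
    private static final Pattern AMOUNT_UNIT = Pattern.compile("(\\d+)\\s*([A-Za-z]+)");
    // Matches "25:10:23" (hours:minutes:seconds).
    private static final Pattern CLOCK = Pattern.compile("(\\d+):(\\d{1,2}):(\\d{1,2})");

    private TimeWindowParams() {}

    static Duration parse(String raw) {
        String text = raw.trim();

        Matcher clock = CLOCK.matcher(text);
        if (clock.matches()) {
            return Duration.ofHours(Long.parseLong(clock.group(1)))
                    .plusMinutes(Long.parseLong(clock.group(2)))
                    .plusSeconds(Long.parseLong(clock.group(3)));
        }

        Matcher m = AMOUNT_UNIT.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Cannot parse window size: " + raw);
        }
        long amount = Long.parseLong(m.group(1));
        String unit = m.group(2).toLowerCase(Locale.ROOT);
        switch (unit) {
            case "s": case "sec": case "second": case "seconds":
                return Duration.ofSeconds(amount);
            case "min": case "minute": case "minutes":
                return Duration.ofMinutes(amount);
            case "h": case "hour": case "hours":
                return Duration.ofHours(amount);
            default:
                throw new IllegalArgumentException("Unknown time unit: " + unit);
        }
    }
}
```

The DRL parser would then only need to capture the text inside the
parentheses; supporting a new notation means touching this handler and
nothing else.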
>
> 2. The smallest "unit" in a rule definition is a Pattern, and
> right now, a Pattern looks like this:
>
> [<var-bind> : ] <Pattern-Type>( [<constraints>] ) [from <source>]
>
> <> : means must be replaced by the corresponding data
> [] : means optional
>
> Now, what we need is a way to declare window constraints for
> patterns. The accumulate example I gave you was not a good one,
> because accumulates have 2 or more patterns (the inner patterns
> and the accumulate result pattern). You can even apply different
> window constraints for each of the patterns.
> So, thinking about a single pattern, how would you define a window
> for it?
>
> Given the current grammar and the concept of pattern, I see only
> two options: either we define the window constraints inside the
> pattern declaration ( i.e., near regular constraints), or we
> create another keyword to introduce them (like we have the "from"
> keyword).
>
> option 1: [<var-bind> : ] <Pattern-Type>( [<window-constraints> ,]
> [<constraints>] ) [from <source>]
> option 2: [<var-bind> : ] <Pattern-Type>( [<constraints>] ) [from
> <source>] [with <window-constraints>]
>
> Note: replace the "with" above with the keyword we choose.
>
> Just as a real world example, if you look at ESPER queries, they
> do events correlation like this:
>
> select *
> from TxnEventA.win:time(30 minutes) A
> join TxnEventC.win:time(60 minutes) C on A.transactionId = C.transactionId
> join TxnEventB.win:time(30 minutes) B on B.transactionId = C.transactionId
> where C.transactionId is null
>
>
> One example of event aggregation is this:
>
> select min(latencyAC) as minLatencyAC, max(latencyAC) as maxLatencyAC, avg(latencyAC) as avgLatencyAC
> from CombinedEvent.win:time(30 minutes)
>
>
> So, you see that the time windows are defined for each pattern and
> they may be different among joined patterns.
> Accumulate may have multiple inner patterns, and so, we need to be
> able to define windows for each one.
>
> In our initial suggestions, we would define the window constraints
> inside the pattern, as this is a lot better for avoiding grammar
> ambiguities. The use of square brackets would help to avoid the
> need to create new keywords, but as long as we decide that it is
> worth creating at least one keyword (like "window" for instance),
> we can avoid the []:
>
> TemperatureReading( window:time(30 sec), temp > 50 ) from streamA
> TemperatureReading( window:events(60), window:time(1 min), temp >
> 50 ) from streamB
>
> Something closer to what you suggested would be (replace "with"
> with a chosen keyword) :
>
> TemperatureReading( temp > 50 ) from streamB with
> window:events(60), window:time(1 min)
>
> My preference is to keep that inside the pattern, but would like
> to hear from you (if the change proposed in item #1 is ok) and
> from Mark and Michael.
>
> 3. Regarding "distinct", "group by" and "order by", we want to
> support them, especially "distinct". From a syntax perspective, I
> would like to use a solution similar to the one we chose for the windows,
> so that we can keep the language easier to learn, but if it is
> best for clarity to use them like "distinct PatternA(...)", I'm ok
> with that too. Unfortunately, we can't allow distinct directly on
> attributes because, unlike in SQL, we need to declare all the
> patterns that make up a tuple. So, instead of saying something like:
>
> Person( distinct address )
>
> we need to say something like:
>
> distinct Address() from $person.address
> Address( distinct ) from $person.address
>
> Anyway, item #3 is of minor importance right now.
>
> Sorry for the long e-mail.
>
> Edson
>
> 2007/11/27, kw14 at mail.inf.tu-dresden.de
> <mailto:kw14 at mail.inf.tu-dresden.de> <kw14 at mail.inf.tu-dresden.de
> <mailto:kw14 at mail.inf.tu-dresden.de>>:
>
> Hi Mark, hi Edson,
>
> in the following you'll find our suggestions regarding Drools
> syntax
> extensions for definition of windows and other node restrictions.
>
> Edson showed us an example of what you had in mind:
> Number( avg : doubleValue ) from accumulate( TemperatureReading( [
> window:events( 30 ), window:time( 60 sec ), distinct ], $temp :
> temperature ), average( $temp ) )
>
> It calculates the average of distinct temperature readings in
> 60 seconds
> or over 30 events (whatever comes first).
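
One way to read "60 seconds or 30 events, whatever comes first" is as a
sliding window bounded by both limits, where an event is dropped as soon as
it violates either one. The Java sketch below follows that reading. It is
an assumption about the semantics, not the engine's implementation: the
class name SlidingWindow is hypothetical, and timestamps are assumed to
arrive in non-decreasing order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical window bounded by both an event count and a maximum age.
final class SlidingWindow {
    private static final class Entry {
        final long timestampMillis;
        final double value;

        Entry(long timestampMillis, double value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    private final int maxEvents;
    private final long maxAgeMillis;
    private final Deque<Entry> entries = new ArrayDeque<>();

    SlidingWindow(int maxEvents, long maxAgeMillis) {
        this.maxEvents = maxEvents;
        this.maxAgeMillis = maxAgeMillis;
    }

    // Adds a reading, then evicts whatever breaks either limit.
    void add(long timestampMillis, double value) {
        entries.addLast(new Entry(timestampMillis, value));
        // Length limit: keep at most maxEvents readings.
        while (entries.size() > maxEvents) {
            entries.removeFirst();
        }
        // Time limit: drop readings older than maxAgeMillis.
        while (!entries.isEmpty()
                && timestampMillis - entries.peekFirst().timestampMillis > maxAgeMillis) {
            entries.removeFirst();
        }
    }

    // Average over the distinct values currently in the window; NaN if empty.
    double distinctAverage() {
        Set<Double> distinct = new LinkedHashSet<>();
        for (Entry e : entries) {
            distinct.add(e.value);
        }
        if (distinct.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : distinct) {
            sum += v;
        }
        return sum / distinct.size();
    }
}
```

For the example above, the window would be created as
new SlidingWindow(30, 60_000) and fed one temperature reading per event.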
>
> Your notation has the advantage of being short, easy to use
> and orthogonal
> to the current grammar.
>
> In contrast to the definition using square brackets, we'd
> prefer a more
> explicit way of defining windows and other constraints. This
> increases
> readability and makes it easier for novices to define rules.
> You've also
> used a similar notation in your talk at Synasc 2007.
>
> So our version would look like this:
>
> Number( avg : doubleValue ) from accumulate within|in|over 60s
> or 30
> events( distinct TemperatureReading($temp : temperature ),
> average( $temp
> ))
>
> It allows using nested 'accumulates' while it's clear to which
> 'accumulate'
> a window definition belongs. The usage of "in", "within" or
> "over" as a
> keyword is up to your preference. The "distinct" keyword could
> be used in
> front of either an object or an attribute definition similar
> to select
> "distinct *" vs. "select distinct fieldname" in SQL. We think
> it's clearer
> to have it in front of the object/attribute it refers to.
> For collect, a definition would look similar:
>
> e.g .
> ArrayList( ) from collect over 60s ( TemperatureReading( orderBy
> temperature > 0 , distinct id!=null ) )
>
> In this case, we also used an "orderBy" keyword similar to
> distinct.
> However, it could also be used like a function as is currently
> done with
> aggregation functions in 'accumulate':
>
> ArrayList( ) from collect over 60s ( TemperatureReading(
> $temp:temperature
> > 0 , distinct id!=null ), orderBy($temp) )
>
> For us, orderBy and groupBy only make sense with the set operator
> 'collect'. Although, they could be used in 'accumulate', if
> the order of
> values is relevant to the result of a user-defined aggregation
> function or
> if different result sets are created based on a grouping
> attribute. Are
> you planning to support this?
>
> Window definitions can also make sense for the 'exists' and 'not'
> operators or a complete rule as already suggested by Edson.
>
> An alternative way of describing the windows and other
> constructs would be
> the one below. However, it would require additional brackets
> to show the
> scope of windows for nested accumulates and was therefore
> excluded from
> our considerations.
>
> Number( avg : doubleValue ) from accumulate( distinct
> TemperatureReading($temp : temperature ), average( $temp ) )
> over 60s or
> 30 events
>
> Okay, we are aware that these suggestions would require more
> extensions in
> the Drools grammar, but we believe the resulting rules are
> easier to
> understand than the ones using the square brackets. So, that's
> open for
> discussion. :)
>
> Cheers, Karen
>
>
> --
> Edson Tirelli
> JBoss Drools Core Development
> Office: +55 11 3529-6000
> Mobile: +55 11 9287-5646
> JBoss, a division of Red Hat @ www.jboss.com <http://www.jboss.com>
>
> _______________________________________________
> rules-dev mailing list
> rules-dev at lists.jboss.org <mailto:rules-dev at lists.jboss.org>
> https://lists.jboss.org/mailman/listinfo/rules-dev
>
>
>
>
> --
> Michael D Neale
> home: www.michaelneale.net <http://www.michaelneale.net>
> blog: michaelneale.blogspot.com <http://michaelneale.blogspot.com>