[
https://issues.jboss.org/browse/JBRULES-3100?page=com.atlassian.jira.plug...
]
Richard Calmbach commented on JBRULES-3100:
-------------------------------------------
Just to clarify: I interpret "A and not(B after A)" to mean: I got A and I got
no B after A as of right now, i.e., the moment I look at the working memory, and
that's the point in time when the rule should get activated. Put another way, when no
explicit upper bound is supplied, "now" forms the implicit upper bound for the
"after" time window. To me, this is the only meaningful interpretation of
"negated unbounded after" (and correspondingly, "unbounded after", for
that matter). So, I definitely want the rule to get scheduled for activation with an
effective activation timestamp of "now" (as opposed to later or never). Whether
the rule fires or not depends strictly speaking on which other activations are on the
agenda and how conflicts get resolved. Conceivably, one of the activations could upon
firing insert a "B" with a timestamp between "A" and "now",
and then the "A and not(B after A)" rule would no longer match (at least,
that's my understanding of the rule engine's operation - correct me if I'm
missing something).
As an example of why we want "A and not(B after A)" type rules to behave this
way, look at the Broker example (broker.drl) where an "A and not(A after A)"
type construct (i.e., same event type) is used (by comparing timestamps directly) to
identify the most recent event of type A (here: StockTick). How would this be done with an
"after" construct if we didn't treat "now" as an implicit upper
bound for the time window?
This is an important point: I can already get Drools to handle "A and not(B after
A)" type rules correctly by comparing the timestamp fields directly instead of using
the "after" construct. Again, "now" forms an implicit upper bound for
the time window, just like with non-event facts, i.e., I look at what I've got right
now and base my decisions on that. Instead of treating Long.MAX_VALUE as special, one
could identify "negated unbounded after" constructs and rewrite them as
duration-less rules (just like what one would get by comparing the timestamp fields
directly) or treat them equivalently, i.e., schedule their activation for right now. In
general, the latest "bounded after" interval endpoint in the LHS should
determine the duration of the rule, with "unbounded after" treated as duration
zero.
Given that Drools already effectively schedules "A and not(B after A)" type
rules for immediate activation, it would actually break backward-compatibility to change
the behavior now. I don't see evidence that users are surprised by the current
behavior. Changing it - now that might be a surprise.
Thanks for looking into this. The Drools event processing support is really powerful, and
it's great to see it getting the attention it deserves.
Long.MAX_VALUE duration for "A and not(B after A)" type
rules causes invalid session clock time in rule RHS when running with pseudo clock
------------------------------------------------------------------------------------------------------------------------------------------
Key: JBRULES-3100
URL:
https://issues.jboss.org/browse/JBRULES-3100
Project: Drools
Issue Type: Bug
Security Level: Public(Everyone can see)
Components: drools-core, drools-core (fusion)
Affects Versions: 5.2.0.Final
Reporter: Richard Calmbach
Assignee: Edson Tirelli
Priority: Critical
Fix For: 5.4.0.Beta2
Ok, this is a subtle one. How did I run into this one? I need dynamic timers that are
started in the RHS because Drools LHS syntax does not support dynamic timers. In order to
be able to unit test my rules even with dynamic timers, I rely on the session clock (aka
TimerService) to schedule and execute these manually defined jobs. Works beautifully with
the real-time clock for all my rules, and also works with the pseudo clock for rules that
don't have a negated unbounded "after" clause. However, the pseudo clock
breaks the timers that are scheduled in the RHS of rules that have a LHS of type "A
and not(B after A)". The reason is extremely subtle, and understanding it required an
excursion into the inner workings of the rule engine, in particular its mechanism for
scheduling activations of rules with non-zero duration. (Good thing that javadoc and
sources are now available via Maven repositories for all Drools artifacts. Thank you,
Drools team and the m2eclipse "Download Artifact Sources" checkbox!)
In PseudoClockScheduler.runCallBacks(), the pseudo clock checks whether any scheduled
jobs need to be executed by comparing their trigger times to the current time
(this.timer). For every job that is due, e.g., a scheduled rule activation, the pseudo
clock saves this.timer to a local variable (savedTimer) and sets this.timer temporarily to
the trigger time of the job (keeping this.timer at that value for the duration of the job
execution, e.g., the firing of the scheduled activation). Now, for rules with a bounded
duration (e.g., "A and not(B after[0s, 10s] A"), this trigger time is a
reasonable value (current time + duration), and everything works out. However, rules of
type "A and not(B after A)" (i.e., with negated unbounded after) have a duration
value of Long.MAX_VALUE. When DurationTimer.createTrigger() constructs a new
PointInTimeTrigger object, it sets the trigger time to "current time +
duration", and with a post-epoch current time (i.e., greater than zero) and duration
== Long.MAX_VALUE, this causes wrap-around and sets the trigger time to a very large
negative number. Up to this point, I think the behavior of the code is intentional (albeit
hacky), to ensure that "negated unbounded after" type rules get activated right
away (as they should). However, this approach has unintended, buggy consequences as we
shall see. When such a scheduled rule fires, then for the duration of its RHS execution,
the session clock has an invalid value (very large negative number). The only way how you
would know that is if you queried the session clock, say because you're trying to
schedule a dynamic timer because Drools LHS syntax only supports static timers. The upshot
is that the dynamic timer job is scheduled with "very large negative number + my
trigger delay" as the trigger time, which PseudoClockScheduler.runCallBacks()
promptly interprets as a job that is due for execution, causing premature and therefore
incorrect triggering of my dynamic timer, thereby breaking my unit tests, while the
real-time clock does not exhibit this problem. And that's how I spent my Friday.
I can work around this bug by rewriting the "A and not(B after A)" construct so
that I'm explicitly comparing the timestamp fields instead of using "after".
However, the current Drools implementation is broken in more than one way: Say, you're
simulating the events at the original Woodstock (or any other events pre-dating the Unix
epoch), and you're setting the pseudo clock to a negative value so that
Date.toString() gives you the actual dates of the simulated events. As long as current
time is a non-positive value, adding duration == Long.MAX_VALUE will yield a time much,
much later than current time. This means that "A and not(B after A)" type rules
will not fire when they should because their trigger time is in a future far, far away
that will never be reached during the simulation. If I interpret this correctly, this also
affects rule activation with the real-time clock if it is set to a negative value (there
goes your time traveler market share...).
I'm not sure what the best fix for these bugs is. A real fix probably has to rework
how "negated unbounded after" type rule activations get scheduled. The duration
== Long.MAX_VALUE hack isn't going to cut it. Maybe these rules don't need
scheduling at all since their activations should be created right away? Anyway, tricky
stuff.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.jboss.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see:
http://www.atlassian.com/software/jira