Just wondering : does these problems happends because you use clocks and jobs, or is it
general to fusion 5.3 ?
Did not migrate my project on 5.3 for now.
So do you think there is a risk, knowing that I only use very basic fusion features ?
----- Mail original -----
De: "Edson Tirelli" <ed.tirelli(a)gmail.com>
À: "Rules Users List" <rules-users(a)lists.jboss.org>
Envoyé: Mercredi 9 Novembre 2011 14:53:00
Objet: Re: [rules-users] Bugs in Drools 5.3.0 break Fusion event processing
Richard,
This is great info. Yes, please open JIRA's for all 3 issues and we will make sure
this is fixed for the next release.
Thank you,
Edson
2011/11/9 Richard Calmbach < rcalmbac(a)gmail.com >
I am making extensive use of the event processing features of the Drools rule engine.
Upgrading from Drools 5.2.0.Final to Drools 5.3.0.Final broke 47 of my unit tests and also
broke my functional tests. There seem to be multiple changes in Drools 5.3.0 that cause
incorrect behavior and/or break backward compatibility. Here are the results of my
investigation so far.
Issue F1:
While tracking down the failures of my unit tests, I experimented with the Broker example,
and while probably not finding the root cause of the broken unit tests, I nonetheless came
across what clearly seems to be incorrect behavior in the DefaultTimerJobInstance.call()
method. The bug only reveals itself after all input has been processed, so it is masked by
the fact that running the Broker demo through the entire sequence in stocktickstream.dat
(1100 lines) takes a long time. In order to reveal the problem more quickly, run the demo
with only the first 14 lines in stocktickstream.dat. Do so for both the 5.2.0 Broker demo
(against Drools 5.2.0) and the 5.3.0 Broker demo (against Drools 5.3.0). The Broker
example code in both versions is identical, so only the Drools-internal code changes
matter.
When running the 5.2.0 Broker demo to the end, you get one java.text.ParseException (given
the structure of the example code, that's expected, albeit not elegant, and not the
focus of our investigation). In particular, no matter how many lines stocktickstream.dat
contains, you always get exactly one ParseException at the end.
Contrast this with running the 5.3.0 Broker demo: At the end you get N occurrences of
java.text.ParseException, where N is the number of lines in stocktickstream.dat. So for 14
lines you get 14 occurrences of ParseException. Looking at two specific methods shows us
why:
Method org.drools.examples.broker.events.EventFeeder.FeedJob.execute(JobContext):
public void execute(JobContext context) {
this.sink.receive( ((FeedContext) context).event );
if ( this.source.hasNext() ) {
((FeedContext) context).setEvent( this.source.getNext() );
this.trigger.setNextFireTime( ((FeedContext) context).getEvent().getDate() );
clock.scheduleJob( this,
context,
trigger );
}
}
Note in particular how this method already takes care of scheduling the next job execution
by updating the next fire time of the job's existing FeedTrigger instance.
Unfortunately, in Drools 5.3.0, DefaultTimerJobInstance.call() does a duplicate scheduling
of the same job:
Method org.drools.time.impl.DefaultTimerJobInstance.call():
public Void call() throws Exception {
this.trigger.nextFireTime(); // need to pop
if ( handle.isCancel() ) {
return null;
}
this.job.execute( this.ctx );
if ( handle.isCancel() ) {
return null;
}
// our triggers allow for flexible rescheduling
Date date = this.trigger.hasNextFireTime();
if ( date != null ) {
scheduler.internalSchedule( this );
}
return null;
}
So, every job is duplicated and that's why there are 2*N calls to
org.drools.examples.broker.events.StockTickPersister.load() instead of N.
I think the rescheduling inside DefaultTimerJobInstance.call() is incorrect. For one, it
breaks backward compatibility, and it is unexpected. The job should be in charge of
deciding whether there is another job to schedule or what to do. Implicitly scheduling the
next job just by updating the trigger time is a little too much magic.
Issue F2:
This is the bug that causes my unit tests to fail. I have not pinpointed the root cause,
but it seems to have to do with the event scheduling Drools does as part of its job
execution mechanism. Its symptom is a NullPointerException during insertion of an event.
What makes it so tricky is that with the out-of-the-box configuration, Drools catches such
exceptions in org.drools.time.impl.PseudoClockScheduler.runCallBacks() and passes them to
the aptly named "DoNothingSystemEventListener", which literally does nothing,
not so much as logging (the methods are empty). So you don't actually know that the
event insertion failed, you only wonder why your mock WorkingMemoryEventListener is
telling you that your expectations are not met. The stack trace (as copied from the
Eclipse Debug view, hence the unusual formatting) inside Drools is:
Date.getMillisOf(Date) line: 939
Date.compareTo(Date) line: 959
DefaultTimerJobInstance.compareTo(DefaultTimerJobInstance) line: 38
DefaultTimerJobInstance.compareTo(Object) line: 13
PriorityQueue<E>.siftUpComparable(int, E) line: 582
PriorityQueue<E>.siftUp(int, E) line: 574
PriorityQueue<E>.offer(E) line: 274
PriorityQueue<E>.add(E) line: 251
PseudoClockScheduler.internalSchedule(TimerJobInstance) line: 136
PseudoClockScheduler.scheduleJob(Job, JobContext, Trigger) line: 126
ObjectTypeNode.assertObject(InternalFactHandle, PropagationContext, InternalWorkingMemory)
line: 230
EntryPointNode.assertObject(InternalFactHandle, PropagationContext, ObjectTypeConf,
InternalWorkingMemory) line: 244
NamedEntryPoint.insert(InternalFactHandle, Object, Rule, Activation, ObjectTypeConf) line:
330
NamedEntryPoint.insert(Object, boolean, boolean, Rule, Activation) line: 291
NamedEntryPoint.insert(Object) line: 116
NamedEntryPoint.insert(Object) line: 48
<my code calling into Drools>
Here is method
org.drools.time.impl.DefaultTimerJobInstance.compareTo(DefaultTimerJobInstance):
public int compareTo(DefaultTimerJobInstance o) {
return this.trigger.hasNextFireTime().compareTo( o.getTrigger().hasNextFireTime() );
}
Essentially, this method calls java.util.Date.compareTo(Date) with a null argument, which,
as documented, causes a NullPointerException. Sometimes, this.trigger.hasNextFireTime()
already returns null, and then the NPE gets thrown in DefaultTimerJobInstance.compareTo()
itself.
I've seen different stack traces leading to this NPE, so this must be affecting
scheduling and job execution quite broadly.
Issue F3:
It is debatable whether this is a bug, but it is a backward-compatibility breaking change.
Previously, when scheduling a job with org.drools.time.TimerService.scheduleJob(Job job,
JobContext ctx, Trigger trigger) (for both real-time and pseudo clock), you could pass a
null JobContext (say, because you didn't need one), and it would work. However, in
Drools 5.3.0, this causes a NullPointerException at:
org.drools.time.impl.DefaultTimerJobFactoryManager.createTimerJobInstance(Job, JobContext,
Trigger, JobHandle, InternalSchedulerService) line: 25
I realize that if it's not in knowledge-api-<version>.jar, it's not an
official API, but the available interfaces and classes in org.drools.time.** (as used in
the Broker example) are *very* useful for test harnesses *and* for production code (for
implementing dynamic timers, for instance). So, this is more of a heads-up: If you are
suddenly getting an NPE, this might be the cause.
Please let me know whether I should create JIRA bug reports for issues F1 and F2. Also,
I'd be interested to hear whether others have run into issues with Fusion in Drools
5.3.0.
_______________________________________________
rules-users mailing list
rules-users(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users
--
Edson Tirelli
JBoss Drools Core Development
JBoss by Red Hat @
www.jboss.com
_______________________________________________
rules-users mailing list
rules-users(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users